The DoSS Toolkit is a bunch of self-paced modules to help you learn and use R.
We all know that R is a critical part of applied statistics and data science these days, but it can have a steep learning curve and be intimidating to get started with.
The Department of Statistical Sciences (DoSS) toolkit is a free series of open-source online modules, written by undergraduates, that their fellow students and the public can use to learn the essentials of R.
How to use this resource
If you have never used R before
You use this resource by running R code! This may sound intimidating if you’ve never used R before, so we’ve made a video that walks through what you need to do.
Get started by going to RStudio Cloud (https://rstudio.cloud) and creating an account. Once you’ve signed up, start a new project and copy-paste the code below to install the packages. (If you already have R and RStudio working on your local computer, you don’t have to use RStudio Cloud; you can install the packages on your local machine instead.)
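As a rough sketch of what installing a package from GitHub typically looks like in R, assuming the toolkit is distributed as an R package in a GitHub repository: the `remotes` package does the installing, and the repository path below is a hypothetical placeholder rather than the toolkit's actual location, so swap in the real path before running.

```r
# Install the 'remotes' helper package from CRAN if it is not already installed.
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}

# Hypothetical placeholder: replace with the toolkit's actual GitHub repository,
# written as "owner/repository".
toolkit_repo <- "OWNER/REPOSITORY"

# Only attempt the GitHub install once the placeholder has been replaced.
if (toolkit_repo != "OWNER/REPOSITORY") {
  remotes::install_github(toolkit_repo)
} else {
  message("Set toolkit_repo to the toolkit's GitHub repository first.")
}
```

The same lines should work in an RStudio Cloud project or in a local R session; the first install may take a few minutes while dependencies download.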
Writing R Packages, written by Matthew Wankiewicz.
Getting started with Blogdown, written by Annie Collins.
Getting started with Shiny, written by Matthew Wankiewicz.
Annie Collins is an undergraduate student in the Department of Mathematics specializing in applied mathematics and statistics with a minor in history and philosophy of science. In her free time, she focuses her efforts on student governance, promoting women’s representation in STEM, and working with data in the non-profit and charitable sector.
Haoluan Chen is an undergraduate student in the Department of Statistical Sciences specializing in applied statistics. He is interested in applying data science techniques, especially in NLP, to gain insight from data.
Isaac Ehrlich is an undergraduate student in Statistics and Cognitive Science at the University of Toronto. He enjoys using R for everything from analysing trends in his recent movie-viewing history to his past research building models of human categorization.
Mariam Walaa is an undergraduate student in the Department of Computer and Mathematical Sciences at University of Toronto Scarborough, majoring in Mathematics and minoring in GIS and statistics. Mariam enjoys learning about how to work with data such as spatial and text data to extract insights.
Marija Pejcinovska is a graduate student in the Department of Statistical Sciences. Her research is motivated by modelling challenges that arise with “complicated” data (sparse/highly biased/poor quality data), usually in the context of social or health inequalities.
Matthew Wankiewicz is an undergraduate student at the University of Toronto, majoring in Statistics and minoring in Mathematics and the History and Philosophy of Science.
Michael Chong is a graduate student in the Department of Statistical Sciences. His research aims to build statistical models for demographic estimation in contexts where high quality data are unavailable. There is almost always an active R session on his computer!
Paul Hodgetts is a Master of Information candidate concentrating in Human-Centred Data Science in the Faculty of Information, University of Toronto. He sincerely believes that Calvin and Hobbes is the greatest comic ever produced.
Rohan Alexander is an assistant professor in the Faculty of Information and the Department of Statistical Sciences. Some people convert to Catholicism upon marriage; Rohan converted to R. His greatest professional achievement is probably getting a pull request accepted into R for Data Science (it was just fixing a minor typo, but still).
Samantha-Jo Caetano is an assistant professor (teaching stream) in the Department of Statistical Sciences. She loves statistics, socializing, her family, and her dogs, not necessarily in that order.
Shirley Deng is an undergraduate student specializing in Statistics and majoring in Mathematics. Meticulous and soft-hearted, she often finds herself engrossed in new pastimes by the second under the influence of her peers. One that has remained a longtime constant is spending an excessive amount of time helping people debug their R code.
Yena Joo is an undergraduate student majoring in Economics and double minoring in Statistics and Computer Science.
We gratefully acknowledge the support of Professor Bethany White, Chair Radu Craiu, and the U of T Faculty of Arts & Science Pedagogical Innovation and Experimentation Fund.
We’d like to acknowledge the help of:
We’d like to thank Alex Cookson for his collection of datasets.
This toolkit builds on, and complements, the work of many others, including:
Hester, Jim, Gábor Csárdi, Hadley Wickham, Winston Chang, Martin Morgan and Dan Tenenbaum (2020). remotes: R Package Installation from Remote Repositories, Including ‘GitHub’. R package version 2.2.0. https://CRAN.R-project.org/package=remotes
R Core Team (2020). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.
The best way to contribute fixes, including corrections for minor typos, is to make a pull request on GitHub.
If you are interested in contributing lessons or modules, then please contact Rohan Alexander. We are particularly interested in partnering with an institution where the language of instruction is French to develop a French-language version.