Introduction to Data Wrangling using R and tidyverse
- Level(s) of Study: Short course
- Start Date(s): 26 April 2023
- Duration: Wednesday to Thursday 9.30 am - 5.30 pm
- Study Mode(s): Short course
- Campus: City Campus
On this two-day course, you will gain a comprehensive, practical introduction to data wrangling using R. In particular, we focus on tools provided by R's `tidyverse`, including `dplyr`, `tidyr`, and `purrr`. Data wrangling is the art of taking raw, messy data and formatting and cleaning it so that data analysis and visualization can be performed on it. Done poorly, it can be time-consuming, laborious, and error-prone. Fortunately, the tools provided by R's `tidyverse` allow us to do data wrangling in a fast, efficient, and high-level manner, which can have dramatic consequences for the ease and speed with which we analyse data.
This course is aimed at anyone who is involved in real-world data analysis, where the raw data is messy and complex. Data analysis of this kind is practised widely in academic scientific research, as well as throughout the public and private sectors.
Level: CPD, Advanced / Professional
The course will cover these key topics:
- Reading data into R using tools such as `readr` and `readxl`
- Wrangling with the powerful `dplyr` R package, focusing on filtering observations, selecting and modifying variables, and other major data manipulation operations
- Summarising data in `dplyr` using descriptive statistics
- Merging and joining independent data frames
- Pivoting and reshaping data using the `tidyr` R package
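To give a flavour of these topics, here is a minimal sketch of a small wrangling pipeline. The data frames `scores` and `groups`, and all of their column names, are hypothetical and invented for illustration; they are not taken from the course materials. The data is built inline with `tribble()` so the example is self-contained, whereas in practice it would typically be read from a file with `readr` or `readxl`.

```r
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical raw data: one row per participant, one column per test session.
scores <- tribble(
  ~id, ~group, ~session_1, ~session_2,
    1,    "a",         10,         14,
    2,    "a",         12,         NA,
    3,    "b",          9,         11,
    4,    "b",         15,         18
)

# Hypothetical lookup table mapping each group code to a condition label.
groups <- tribble(
  ~group, ~condition,
     "a", "control",
     "b", "treatment"
)

# Pivot from wide to long, filter out missing scores,
# and join the condition labels onto each observation.
scores_long <- scores %>%
  pivot_longer(
    cols = starts_with("session_"),
    names_to = "session",
    names_prefix = "session_",
    values_to = "score"
  ) %>%
  filter(!is.na(score)) %>%
  left_join(groups, by = "group")

# Summarise with descriptive statistics for each condition.
score_summary <- scores_long %>%
  group_by(condition) %>%
  summarise(n = n(), mean_score = mean(score), .groups = "drop")

print(score_summary)
```

Each step in the pipeline corresponds to one of the topics above: reshaping with `pivot_longer()`, filtering with `filter()`, joining with `left_join()`, and summarising with `group_by()` and `summarise()`.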
The course will take 6 contact hours per day plus two 1-hour breaks.
The sessions will be as follows:
- Session 1: 9:30am-11:30am;
- Session 2: 12:30pm-2:30pm;
- Session 3: 3:30pm-5:30pm
Tutor Profile: Mark Andrews is an Associate Professor at Nottingham Trent University whose research and teaching are focused on statistical methodology in research in the social and biological sciences. He is the author of a 2021 textbook on data science using R that is aimed at scientific researchers, and has a forthcoming textbook on statistics and data science that is aimed at undergraduates in science courses. His background is in computational cognitive science and mathematical psychology.
Other available online CPD courses in this series include:
- Introduction to Statistics using R and RStudio
- Introduction to Data Visualization with R using ggplot
- Introduction to Generalized Linear Models in R
- Introduction to Multilevel (hierarchical, or mixed effects) Models in R
- Introduction to Bayesian Data Analysis with R
Any questions? Contact email@example.com, Commercial Manager, School of Social Sciences.
The course tutor was fantastic at explaining everything, the pace was just right, and the content was exactly what I was expecting and more. I will definitely be using all of the techniques covered in the course in my own data analysis.