2019 - Pre Convention Workshop #6 - Better Data Cleaning Using R and the Tidyverse

May 30, 2019 09:00AM to May 30, 2019 04:30PM
Halifax Marriott Harbourfront Hotel (1919 Upper Water St., Halifax, NS, B3J 3J5)

Presented by:

Mark C. Adkins, Robert Cribbie

Continuing Education Credits:

6

Notes:

Attendees should bring personal laptops and pre-install R, RStudio, and the tidyverse package

Cost:

CPA Member: Early Registration ($250+tax) - Regular Registration ($300+tax)

CPA Student Affiliate: Early Registration ($190+tax) - Regular Registration ($225+tax)

Non-Member: Early Registration ($325+tax) - Regular Registration ($400+tax)

Student Non-Affiliate: Early Registration ($230+tax) - Regular Registration ($250+tax)

Please note: early registration rates apply until end of day April 30th, 2019; regular registration rates apply after April 30th, 2019.

Duration:

Full Day (9:00 – 16:30)

Target Audience:

Academic, non-academic, and graduate student researchers.

Skill/Difficulty Level:

Introductory

Workshop Description:

Data cleaning is a time-consuming and often error-prone process that every researcher will experience. It is said that 80% of data analysis time is spent cleaning and preparing the data (Dasu & Johnson, 2003), but this time can be dramatically reduced by using the right tools for the job. The workshop will begin with a foundational discussion of data structures within R and general coding practices. The focus will then shift to the more practical topics of using the tidyverse collection of packages to import and export data, inspect and manipulate specific types of data (including categorical, date, and character data), reformat data, and produce quality graphics to showcase your data. A publicly available dataset will be used as a running example throughout the workshop to provide hands-on experience with each of the techniques discussed. Periodic interactive exercises will give attendees an opportunity to work collaboratively to solve common data cleaning dilemmas. To help tie the whole workshop into a cohesive process, the concept of a data "pipeline" will be introduced as a way to conceptualize the data cleaning process. By teaching better coding practices throughout the workshop, our resulting data cleaning pipelines will be coded in a more human-readable format, making our code easier both to debug and to share when working collaboratively on projects.

Learning Outcomes:

  1. Gain a firm grasp of managing and manipulating data within R
  2. Learn to code in a cleaner, less error-prone fashion to facilitate sharing R scripts and working collaboratively
  3. Keep in step with open science practices by sharing code that other users can more easily understand