DATA1001 Monday 3pm Semester 2

Substitute

Hi! My name is Tom and I’m filling in for your regular tutor for the first two weeks of term. This page will not be updated past Week 2.

Assessment Dates

Rough assessment dates are provided below, purely to give you an idea of what is coming up. These dates should not be taken as fact. The onus is on students to check Canvas, EdStem and the Unit of Study outline.

Extra Help/Staff Contact

Any questions about course material should be asked on EdStem.

Calendar

| Week | Slides | Class Notes | Misc. | Further Learning | Assessments |
|---|---|---|---|---|---|
| Week 1 (Jul 31) | Introduction; Lab 1 | Lab_1.Rmd; Lab_1.html | Software Installation Guide. We won’t have much time in this lab to debug R installation issues. Please pop in to the drop-in sessions if you need help! | Britannica Simpson’s Paradox article; R Markdown Cheat Sheet | Evaluate quiz 1 (Aug 6) |
| Week 2 (Aug 7) | Lab 2 | Lab_2.Rmd; Lab_2.html | See the “Lab_2” html or Rmd files to the left for examples of customising histograms. | ggplot2 Cheat Sheet; Article on how to pick the right chart type | Evaluate quiz 2 (Aug 13) |
| Week 3 (Aug 14) | Lab 3 | Lab_3.Rmd; Lab_3.html | Please see the “Week 3: Notes/Clarifications” section below for clarification on quartiles in R! | | Evaluate quiz 3 (Aug 20) |
| Week 4 (Aug 21) | Lab 4 | Lab_4.Rmd; Lab_4.html | | | Project 1 Individual and Group Parts Due (Aug 25); Evaluate quiz 4 (Aug 27) |
| Week 5 (Aug 28) | Regular tutor back to campus. | | | | |

Week 3: Notes/Clarifications


In the tutorial, there was a question where we were given the following data: x <- c(1,2,3,4,5,6,7,8,9,10). When we ran summary(x), R reported the first quartile as 3.25 and the third quartile as 7.75. This seems to go against the convention taught in NSW high schools, where the first quartile would typically be 3 and the third quartile 8. Shoutout to the group who noticed the inconsistency! So what’s going on?

Firstly, a note has been added to the bottom of the “3.3 Challenge” Canvas page.

To explain it here as well: there are different ways of calculating quantiles. In fact, if you run ?quantile, you’ll see nine different methods. In practice, though, we use ggplot to create our boxplots, so we’ll stick with the output that ggplot uses.
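To see the difference for yourself, here is a minimal sketch using the same x from the tutorial. The variable names (q_default, hinges, q_by_type) are just illustrative choices, not anything from the lab files. It compares R's default quantile method with fivenum(), which for this data gives the same values as the high school "median of each half" approach.

```r
# The data from the tutorial question
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# R's default method (type = 7), which is also what summary(x) reports.
# It interpolates between data points, giving 3.25 and 7.75 here.
q_default <- quantile(x, probs = c(0.25, 0.75))
print(q_default)

# Tukey's five-number summary: min, lower hinge, median, upper hinge, max.
# For this data the hinges are 3 and 8, matching the high school convention.
hinges <- fivenum(x)[c(2, 4)]
print(hinges)

# First quartile under each of the nine methods listed in ?quantile
q_by_type <- sapply(1:9, function(t) quantile(x, probs = 0.25, type = t, names = FALSE))
print(q_by_type)
```

Neither answer is "wrong"; they are different conventions for estimating a quartile when the cut point falls between two data points.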

If you’re interested in further reading, check out the following links:

  • https://bookdown.org/dli/rguide/descriptive-statistics-for-a-vector.html#describing-distribution

  • https://stackoverflow.com/questions/40634693/lower-and-upper-quartiles-in-boxplot-in-r/40639848#40639848

  • https://stackoverflow.com/questions/70637398/calculating-the-five-number-summary-in-r-results-in-incorrect-values