DATA1001 Monday 3pm Semester 2
Substitute
Hi! My name is Tom and I’m filling in for your regular tutor for the first two weeks of term. This page will not be updated past Week 2.
Assessment Dates
Rough assessment dates are provided below, purely to give you an idea of what is coming up. Do not treat these dates as definitive. The onus is on students to check Canvas, EdStem and the Unit of Study outline.
Extra Help/Staff Contact
Questions about course material should be asked on EdStem.
Important Links
Calendar
Week  Slides  Class Notes  Misc.  Further Learning  Assessments 

Week 1 (Jul 31)  Introduction Lab 1  Lab_1.Rmd Lab_1.html  Software Installation Guide We won’t have much time in this lab to debug R installation issues. Please pop in to the drop-in sessions if you need help!  Britannica Simpson’s Paradox article R Markdown Cheat Sheet  Evaluate quiz 1 (Aug 6) 
Week 2 (Aug 7)  Lab 2  Lab_2.Rmd Lab_2.html  See the “Lab_2” html or Rmd files to the left for examples of customising histograms.  ggplot2 Cheat Sheet Article on how to pick the right chart type  Evaluate quiz 2 (Aug 13) 
Week 3 (Aug 14)  Lab 3  Lab_3.Rmd Lab_3.html  Please see the “Week 3: Notes/Clarifications” section below for clarification on quartiles in R!  –  Evaluate quiz 3 (Aug 20) 
Week 4 (Aug 21)  Lab 4  Lab_4.Rmd Lab_4.html  –  –  Project 1 Individual and Group Parts Due (Aug 25) Evaluate quiz 4 (Aug 27) 
Week 5 (Aug 28)  –  –  Regular tutor back on campus.  –  – 
Week 3: Notes/Clarifications
In the tutorial, there was a question where we were given the following data: x <- c(1,2,3,4,5,6,7,8,9,10). When we ran summary(x) on the data, R reported a first quartile of 3.25 and a third quartile of 7.75. This seems to go against the convention taught in NSW high schools, where the first quartile would typically be 3 and the third quartile 8. Shoutout to the group who noticed the inconsistency here! So what’s going on?
Firstly, a note has been added to the bottom of the “3.3 Challenge” Canvas page.
But, to explain it here: there are different ways of calculating quantiles. In fact, if you run ?quantile, you’ll see nine different ways of calculating them. In practice though, we just use ggplot to create our boxplots, so we’ll stick with the output that ggplot uses.
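To see the difference for yourself, here is a small sketch in base R using the tutorial data. It compares the default quantile method (type 7, which is what summary() reports) with two alternatives. The variable names (q_default, q_type6, hinges) are made up for this illustration, and the choice of type 6 and fivenum() as comparison points is just an example, not something from the lab.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Default method (type = 7): the same quartiles that summary(x) reports
q_default <- quantile(x, probs = c(0.25, 0.75), names = FALSE)

# A different interpolation rule (type = 6) gives different quartiles
q_type6 <- quantile(x, probs = c(0.25, 0.75), type = 6, names = FALSE)

# Tukey's hinges from fivenum(): elements 2 and 4 of the five-number summary
hinges <- fivenum(x)[c(2, 4)]

print(q_default)  # 3.25 7.75
print(q_type6)    # 2.75 8.25
print(hinges)     # 3 8
```

For this particular data set, the hinges from fivenum() happen to match the values you would get from the high school approach of taking the median of each half, while the default method interpolates between neighbouring data points.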
If you’re interested in further reading, check out the following links:

https://bookdown.org/dli/rguide/descriptivestatisticsforavector.html#describingdistribution

https://stackoverflow.com/questions/40634693/lowerandupperquartilesinboxplotinr/40639848#40639848

https://stackoverflow.com/questions/70637398/calculatingthefivenumbersummaryinrresultsinincorrectvalues