Click this link, https://classroom.github.com/a/SN5XgV-z, to create your private repository for lab #02 on GitHub.
Click “Accept this assignment” and refresh the page.
Click the link following “Your assignment repository has been created:”. This opens the repository for lab #02 on GitHub.
Click on the green CODE button, select Use HTTPS (this might already be selected by default), and click on the clipboard icon to copy the repo URL.
Log in to the RStudio Docker container.
Go to File \(\rightarrow\) New Project \(\rightarrow\) Version Control \(\rightarrow\) Git.
Copy and paste the URL of your assignment repo into the dialog box Repository URL.
Click Create Project, and the files from your GitHub repo will be displayed in the Files pane in RStudio.
Click lab02.Rmd in the lower-right “Files” pane to open the template R Markdown file.
Update the YAML header with your name and today’s date. Then, knit the document and make sure the resulting PDF file has the correct date. Stage, commit, and push your changes.
The dataset we will examine is loaded automatically with the tidyverse. It is called mpg and contains fuel economy data and other characteristics of cars, collected by the Environmental Protection Agency (EPA) and published at http://fueleconomy.gov.
To begin, familiarize yourself with the dataset by reading the documentation. Remember, you can pull up the documentation by running ?mpg in the console.
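As a minimal sketch of a first look at the data, assuming the tidyverse is installed in the container, the following attaches the packages and summarizes the variables used later in the lab. The choice of glimpse() and count() is only one way to explore the data and is not required by the lab.

```r
library(tidyverse)  # attaches ggplot2, which provides the mpg data frame

# To read the documentation, run this interactively in the console:
# ?mpg

# Structural overview: one line per column with its type and first values
glimpse(mpg)

# Distinct values of the two categorical variables used in this lab
count(mpg, class)
count(mpg, drv)
```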
All plots should follow the best visualization practices we have discussed in lecture. Plots should include an informative title, axes should be labeled, and careful consideration should be given to aesthetic choices.
In addition, code and narrative should not exceed the 80 character limit. To help police this, add a vertical line at 80 characters by clicking “Tools” \(\rightarrow\) “Global Options” \(\rightarrow\) “Code” \(\rightarrow\) “Display”, then set “Margin Column” to 80 and click “Apply”.
Your assignment should have at least three meaningful commits and all code chunks should have meaningful names.
Generate a scatterplot of city miles per gallon (cty) versus highway miles per gallon (hwy) with points colored by class.
Note that highway and city miles per gallon take only a limited number of distinct values, so some of the points are on top of each other. Using geom_jitter() or a position = argument in geom_point(), add a small amount of random variation to each point. Briefly comment on the differences between the plots you constructed in Exercises 1 and 2. What are the advantages and disadvantages of each?
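The two jittering approaches mentioned above can be sketched on a stand-in dataset. This example uses mtcars (shipped with base R) rather than mpg, and the jitter widths and object names are arbitrary illustrative choices, not values prescribed by the lab.

```r
library(ggplot2)

# cyl and gear take only a few distinct values, so many points overlap
base <- ggplot(mtcars, aes(x = cyl, y = gear))

# Option A: geom_jitter() with explicit amounts of horizontal/vertical noise
p_a <- base + geom_jitter(width = 0.2, height = 0.2)

# Option B: the same idea through the position argument of geom_point()
p_b <- base +
  geom_point(position = position_jitter(width = 0.2, height = 0.2))

p_a
p_b
```

Because the noise is random, the points land in slightly different places each time the plot is drawn; position_jitter() also accepts a seed argument if a reproducible layout is wanted.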
Examine the relationship between city and highway miles per gallon, with a separate plot for each type of drive train (drv).
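One common way to get a separate panel per group is faceting. The sketch below assumes facet_wrap() is an acceptable approach here and again uses mtcars as a stand-in, with am (transmission type) playing the role of the grouping variable.

```r
library(ggplot2)

# One scatterplot panel per value of am
p_facet <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ am)

p_facet
```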
Create side-by-side boxplots of city miles per gallon for each class. Briefly comment on what you notice.
Create a segmented bar chart with one bar per class, each bar going from 0 to 1, with the fill determined by the type of drive train (drv). What do you notice?
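A bar chart whose bars all span 0 to 1 can be produced with position = "fill" in geom_bar(), which rescales each bar to proportions. The sketch below illustrates this on the diamonds dataset from ggplot2 rather than on mpg; the variable choices and axis label are illustrative assumptions.

```r
library(ggplot2)

# Each bar is one cut; segments show the share of each clarity within that cut
p_fill <- ggplot(diamonds, aes(x = cut, fill = clarity)) +
  geom_bar(position = "fill") +
  labs(y = "Proportion")

p_fill
```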
Recreate the plot below. The functions theme_bw() and labs() will be helpful. The size of the points is 0.50.
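As a rough sketch of how theme_bw(), labs(), and a fixed point size fit together, the following builds an unrelated plot from mtcars. The title and axis labels are placeholders and are not taken from the target plot.

```r
library(ggplot2)

p_theme <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(size = 0.5) +              # fixed size, set outside aes()
  labs(
    title = "Placeholder title",        # hypothetical label text
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  ) +
  theme_bw()                            # white background with grey grid lines

p_theme
```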
Once you are fully satisfied with your lab, Knit to PDF to create a PDF document.
Before you wrap up the assignment, make sure all documents are updated on your GitHub repo. We will be checking these to make sure you have been practicing how to commit and push changes.
Remember – you must turn in a PDF file to the Gradescope page before the submission deadline for full credit.
Once your work is finalized in your GitHub repo, you will submit it to Gradescope. Your assignment must be submitted on Gradescope by the deadline to be considered “on time”.
Be sure to identify which problems are on each page using Gradescope.