library(tidyverse)
library(infer)
In this lab you will work with a dataset containing information on individuals from the Donner party. The Donner party was a group of pioneers traveling by wagon train from Missouri to California on the Oregon Trail. They were trapped in the Sierra Nevada mountains by extremely heavy snowfall during the winter of 1846-1847 and eventually ran out of food supplies. Of the 90 members of the party, only 48 survived. We will use logistic regression to model the probability of survival based on age and sex. The relevant data are contained in donner.csv. Start by reading in the data, then answer the questions below.
What is the relationship between sex and survival? Effectively visualize the relationship and summarize what you observe in a brief sentence.
What is the relationship between age and survival? Effectively visualize the relationship and summarize what you observe in a brief sentence.
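One possible approach to the two visualization questions above is sketched here: a filled bar chart for a categorical predictor and side-by-side boxplots for a numeric one. The rows in `toy` are made up for illustration and are not the Donner data, and the column names `age`, `sex`, and `survived` are assumptions about how donner.csv is laid out.

```r
library(tidyverse)

# Hypothetical stand-in rows, NOT the Donner data. The column names
# (age, sex, survived) are assumptions about donner.csv.
toy <- tibble(
  age      = c(4, 12, 23, 25, 30, 35, 40, 46, 57, 65),
  sex      = c("Female", "Male", "Female", "Male", "Male",
               "Female", "Male", "Female", "Male", "Male"),
  survived = c(1, 1, 1, 0, 0, 1, 0, 1, 0, 0)
) %>%
  mutate(outcome = if_else(survived == 1, "Survived", "Died"))

# Sex vs. survival: bars rescaled to height 1 compare proportions directly.
p_sex <- ggplot(toy, aes(x = sex, fill = outcome)) +
  geom_bar(position = "fill") +
  labs(x = "Sex", y = "Proportion", fill = "Outcome")

# Age vs. survival: side-by-side boxplots compare the two age distributions.
p_age <- ggplot(toy, aes(x = outcome, y = age)) +
  geom_boxplot() +
  labs(x = "Outcome", y = "Age (years)")

p_sex
p_age
```

Other displays (for example, density plots of age by outcome) may suit the real data better; the choice here is only one reasonable option.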
Fit a logistic regression model to predict survival based on sex and age. Do not include the interaction term. Report the model output in tidy format.
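A minimal sketch of fitting a main-effects logistic regression and tidying its output is shown here. It runs on simulated stand-in data, not the Donner data. The column names and the coefficients used for simulation are arbitrary assumptions. Note that `tidy()` comes from the broom package, which is installed with the tidyverse but not attached by `library(tidyverse)`.

```r
library(tidyverse)
library(broom)

# Simulated stand-in data, NOT the Donner data. Column names and the
# coefficients used to simulate survival are arbitrary assumptions.
set.seed(1)
toy <- tibble(
  age = round(runif(200, min = 1, max = 70)),
  sex = sample(c("Female", "Male"), size = 200, replace = TRUE)
) %>%
  mutate(survived = rbinom(
    n(), size = 1,
    prob = plogis(1.5 - 0.04 * age - 0.8 * (sex == "Male"))
  ))

# Main effects only: `age + sex` adds no interaction term
# (an interaction would be written `age * sex`).
fit <- glm(survived ~ age + sex, data = toy, family = binomial)

# One row per coefficient; estimates are on the log-odds scale.
fit_tidy <- tidy(fit)
fit_tidy
```

Because `sex` is stored as text in this sketch, R treats the alphabetically first level ("Female") as the reference, so the sex coefficient is labeled `sexMale`. The real data may be coded differently.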
Write out the logistic regression model.
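As a reminder of the general form (a template, not the fitted answer), a main-effects logistic regression with these two predictors can be written as below. The symbol \(\hat{p}\) for the predicted probability of survival and the 0/1 indicator coding of sex are notational assumptions; which sex serves as the reference level depends on how the data are coded.

\[
\log\left(\frac{\hat{p}}{1-\hat{p}}\right) = \hat{\beta}_0 + \hat{\beta}_\text{age} \times \text{age} + \hat{\beta}_\text{sex} \times \text{sex}
\]

Your written model should replace each \(\hat{\beta}\) with the corresponding estimate from your output.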
Provide an interpretation of \(e^{\hat{\beta}_0}\) in the context of the problem.
Provide an interpretation of \(e^{\hat{\beta}_\text{age}}\) in the context of the problem.
Provide an interpretation of \(e^{\hat{\beta}_\text{sex}}\) in the context of the problem.
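The three interpretations above all follow from exponentiating the fitted log-odds, which turns the additive model into a multiplicative one for the odds. In this sketch, \(\hat{p}\) denotes the predicted probability of survival and sex is assumed to be a 0/1 indicator for the non-reference level.

\[
\frac{\hat{p}}{1-\hat{p}} = e^{\hat{\beta}_0} \times \left(e^{\hat{\beta}_\text{age}}\right)^{\text{age}} \times \left(e^{\hat{\beta}_\text{sex}}\right)^{\text{sex}}
\]

Read this way, \(e^{\hat{\beta}_0}\) is the predicted odds when age is 0 and sex is at its reference level, \(e^{\hat{\beta}_\text{age}}\) is the factor by which the predicted odds are multiplied for each additional year of age with sex held constant, and \(e^{\hat{\beta}_\text{sex}}\) is the ratio of predicted odds between the two sexes at the same age.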
What is the predicted probability of survival for a 60-year-old man? For a 20-year-old man? For a female newborn?
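One way to obtain predicted probabilities for specific individuals is to pass a small data frame of new observations to `predict()` with `type = "response"`. The sketch below uses simulated stand-in data, not the Donner data. The column names, the sex labels, and the simulation coefficients are assumptions, so the numbers it prints say nothing about the real party.

```r
library(tidyverse)

# Simulated stand-in data, NOT the Donner data (names and values assumed).
set.seed(1)
toy <- tibble(
  age = round(runif(200, min = 1, max = 70)),
  sex = sample(c("Female", "Male"), size = 200, replace = TRUE)
) %>%
  mutate(survived = rbinom(
    n(), size = 1,
    prob = plogis(1.5 - 0.04 * age - 0.8 * (sex == "Male"))
  ))

fit <- glm(survived ~ age + sex, data = toy, family = binomial)

# The three individuals asked about; a newborn is taken to have age 0.
new_people <- tibble(
  age = c(60, 20, 0),
  sex = c("Male", "Male", "Female")
)

# type = "response" returns probabilities rather than log-odds.
preds <- new_people %>%
  mutate(p_hat = predict(fit, newdata = new_people, type = "response"))
preds
```

The same values can be computed by hand by plugging each age and sex into the fitted equation and applying the inverse logit, \(\hat{p} = e^{\eta} / (1 + e^{\eta})\), where \(\eta\) is the predicted log-odds.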
Create a predicted probability plot showing the effect of age and sex on survival. Comment on what you observe.
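A predicted probability plot can be sketched by predicting over a grid of ages for each sex and drawing one curve per sex. As before, this runs on simulated stand-in data, not the Donner data. The column names, the age range of the grid, and the simulation coefficients are assumptions.

```r
library(tidyverse)

# Simulated stand-in data, NOT the Donner data (names and values assumed).
set.seed(1)
toy <- tibble(
  age = round(runif(200, min = 1, max = 70)),
  sex = sample(c("Female", "Male"), size = 200, replace = TRUE)
) %>%
  mutate(survived = rbinom(
    n(), size = 1,
    prob = plogis(1.5 - 0.04 * age - 0.8 * (sex == "Male"))
  ))

fit <- glm(survived ~ age + sex, data = toy, family = binomial)

# Every combination of age (0 to 70, an assumed range) and sex.
grid <- expand_grid(age = 0:70, sex = c("Female", "Male"))
grid <- grid %>%
  mutate(p_hat = predict(fit, newdata = grid, type = "response"))

# One predicted probability curve per sex.
p_curve <- ggplot(grid, aes(x = age, y = p_hat, color = sex)) +
  geom_line() +
  labs(x = "Age (years)", y = "Predicted probability of survival",
       color = "Sex")
p_curve
```

With no interaction term, the two curves are the same logistic curve shifted horizontally, which is worth keeping in mind when commenting on the plot.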
How young or old must a female member of the Donner party be in order to have a predicted probability of survival greater than 0.75 based on your logistic regression model? Use algebra (not code) to answer.
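The algebra for the threshold question above can be set up as follows. This is a sketch in general symbols, with \(\hat{p}\) as the predicted probability of survival and sex assumed to be a 0/1 indicator, so you still need to substitute your own estimates and the value of the indicator for a female. Since the odds increase with the probability, the condition \(\hat{p} > 0.75\) is equivalent to a condition on the log-odds:

\[
\hat{p} > 0.75 \iff \log\left(\frac{\hat{p}}{1-\hat{p}}\right) > \log\left(\frac{0.75}{0.25}\right) = \log 3
\]

Substituting the fitted model and isolating the age term gives

\[
\hat{\beta}_\text{age} \times \text{age} > \log 3 - \hat{\beta}_0 - \hat{\beta}_\text{sex} \times \text{sex}
\]

Dividing both sides by \(\hat{\beta}_\text{age}\) then yields a bound on age. The inequality keeps its direction if \(\hat{\beta}_\text{age}\) is positive and flips if \(\hat{\beta}_\text{age}\) is negative, which determines whether the answer is a minimum or a maximum age.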
What are limitations of your model? Answer in a brief paragraph.
Knit to PDF to create a PDF document. Stage and commit all remaining changes, and push your work to GitHub. Make sure all files are updated on your GitHub repo.
Upload only your PDF document to Gradescope. Before you submit, mark the pages where the answer to each exercise appears. If an answer spans multiple pages, mark all of them. Associate the “Overall” section with the first page.