Lab #09: Logistic Regression

due Sun, Apr 11 11:59 PM

Goals

Getting started

Every team member should go to the course GitHub organization and locate the team's repository for this lab, whose name begins with lab09. Copy the repository URL and clone the remote repo in RStudio.

As you work on this lab, merge conflicts may arise. Refer back to Lab #05 for how to resolve them. You and your team are free to divide up the work however you think is best. However, everyone should understand all of the code in the lab’s final submission.

Packages

This lab will use the packages below.

library(tidyverse)
library(broom)

Data

The 2018 Congressional election saw a near-record number of retirements. In this lab, you will work with data from the 2018 election to investigate retirement.

The variables in this dataset are:

Four of the seats have missing values for retiring, gopseat, and age. These seats are either new or were held by a representative who died. Before beginning the exercises, use filter() to remove these rows and overwrite retirement so that no missing data remain.
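The filtering step can be sketched as below. This is only an illustration on a small made-up tibble: the object name retirement_toy, the district column, and all of the values are invented here, and only the column names retiring, gopseat, and age come from the description above.

```r
library(tidyverse)

# Toy stand-in for the lab data; every value here is invented for illustration.
retirement_toy <- tibble(
  district = c("A-01", "A-02", "B-01", "B-02"),
  retiring = c(0, 1, NA, 0),
  gopseat  = c(1, 0, NA, 1),
  age      = c(54, 71, NA, 63)
)

# Keep only rows where all three variables are observed,
# then overwrite the original object with the filtered version.
retirement_toy <- retirement_toy %>%
  filter(!is.na(retiring), !is.na(gopseat), !is.na(age))

# Confirm that no missing values remain in any column.
retirement_toy %>%
  summarise(across(everything(), ~ sum(is.na(.x))))
```

Assigning the result back to the same name is what "overwrite" means here; every later step then uses the filtered data.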

Exercises

  1. Run and display a logistic regression model with retirement as the response variable and the predictors below. Note that you may need to create new variables for this problem. Write out the model.

  2. What does each of these coefficients mean? Discuss what each means in terms of odds ratios.

  3. Create a predicted probability plot showing the effect of presidential vote for both Democrats and Republicans. Set age at 60 when creating this plot. Comment on what you observe.

  4. The current representative for North Carolina’s 4th district is Democrat David Price, who was a Duke Public Policy and Political Science Professor before being elected to Congress. In 2018, Price was 78 years old, and Hillary Clinton received 70.75% of the vote in his district in 2016. According to the model, what was the probability that Price would retire in 2018?

  5. Based on our model, did any member of Congress have over a 50% chance of retiring in 2018? Which district’s representative was given the highest probability of retirement, and what was that probability? Did they actually retire? (Hint: augment() is very useful for this problem.)
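As a rough guide to the tools these exercises rely on, the sketch below fits a logistic regression with glm(), reads the coefficients as odds ratios with tidy(), and obtains predicted probabilities with augment(). It runs on simulated data, not the lab's dataset. The variable pres_vote is a hypothetical name for a presidential vote share predictor, and the coefficients used to simulate the outcome are arbitrary assumptions. The model you fit in the lab may need different or additional variables.

```r
library(tidyverse)
library(broom)

set.seed(9)

# Simulated stand-in data; NOT the lab's dataset.
# `pres_vote` is a hypothetical predictor name, and the coefficients
# used to generate `retiring` are arbitrary.
sim <- tibble(
  age       = round(runif(400, 30, 85)),
  gopseat   = rbinom(400, 1, 0.5),
  pres_vote = runif(400, 20, 80)
) %>%
  mutate(
    log_odds = -6 + 0.06 * age + 0.5 * gopseat - 0.01 * pres_vote,
    retiring = rbinom(n(), 1, plogis(log_odds))
  )

# Logistic regression: binomial family with the default logit link.
fit <- glm(retiring ~ age + gopseat + pres_vote, data = sim, family = binomial)

# Coefficients on the log-odds scale, then exponentiated into odds ratios.
tidy(fit)
tidy(fit, exponentiate = TRUE)

# Predicted probability for a single hypothetical member.
new_member <- tibble(age = 60, gopseat = 0, pres_vote = 55)
augment(fit, newdata = new_member, type.predict = "response")

# Predicted probabilities over a grid of vote shares, holding age at 60.
grid <- expand_grid(age = 60, gopseat = c(0, 1), pres_vote = seq(20, 80, by = 1))
grid_pred <- augment(fit, newdata = grid, type.predict = "response")

ggplot(grid_pred, aes(x = pres_vote, y = .fitted, color = factor(gopseat))) +
  geom_line() +
  labs(
    x = "Presidential vote share (hypothetical variable)",
    y = "Predicted probability",
    color = "gopseat"
  )

# Fitted probabilities for every observation, largest first.
augment(fit, type.predict = "response") %>%
  arrange(desc(.fitted)) %>%
  select(age, gopseat, pres_vote, retiring, .fitted) %>%
  head(3)
```

Setting type.predict = "response" makes augment() return probabilities in .fitted. The default is the log-odds scale, which cannot be compared directly with a threshold such as 50%.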

Sources