Lab #05: Fixing merge conflicts & computing probabilities

due Sun, Feb 28 11:59 PM

Goals

Getting started

Every team member should go to the course GitHub organization and locate the team's repository for this lab; its name begins with the prefix lab05.

Merge conflicts

You may have seen this already in the course of your collaboration last week in Lab #04. When two collaborators change the same file and push it to the shared repository, git merges the two versions.

If the two versions have conflicting content on the same line, git will produce a merge conflict. Merge conflicts need to be resolved manually.

To resolve the merge conflict, decide if you want to keep only your text/code, the text/code on GitHub, or incorporate changes from both sets. Delete the conflict markers <<<<<<<, =======, >>>>>>> and make the changes you want in the final merge.
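As an illustration of how the conflict markers are laid out, here is a minimal sketch in base R. The file contents, the team names, and the commit id 1a2b3c4 are made up; in the lab you would edit the file by hand in RStudio rather than run code like this.

```r
# Hypothetical contents of lab05.Rmd after a pull that produced a conflict.
# The lines between <<<<<<< and ======= are your local version; the lines
# between ======= and >>>>>>> are the version that came from GitHub.
conflicted <- c(
  "---",
  "title: \"Lab #05\"",
  "<<<<<<< HEAD",
  "author: \"Team A\"",
  "=======",
  "author: \"Team B\"",
  ">>>>>>> 1a2b3c4",
  "output: pdf_document",
  "---"
)

# Locate the three conflict markers.
start  <- grep("^<<<<<<<", conflicted)
middle <- grep("^=======", conflicted)
end    <- grep("^>>>>>>>", conflicted)

# Keep the local version: drop the opening marker, then drop everything
# from the separator through the closing marker (the incoming version).
resolved <- conflicted[-c(start, middle:end)]
resolved
```

The resolved file contains neither markers nor the discarded version, which is the state you want before knitting and committing.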

Assign each team member a number from 1 to 4 (1 to 3 if your team has only three members). Go through the following steps carefully; they simulate a merge conflict. Completing this exercise will be part of the lab grade.

Resolving a merge conflict

Members 3 & 4: Look at the group’s repo on GitHub to ensure that the other members’ files are pushed to GitHub after every step.

Before pushing, remember to stage your changes. The stage box next to lab05.Rmd should have a check mark inside it.

Step 1: Everyone: Copy the URL and clone the remote repository in RStudio.

Step 2: Member 1: Change the team name to your team name. Knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub.

Step 3: Member 2: Change the team name to something different (not your team name). Knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub.

Member 2 should get an error on the attempted push.

Member 2 should display and read the error to the entire team, then pull and review the document with the merge conflict. A merge conflict occurred because Member 2 edited the same part of the document as Member 1. Resolve the conflict by keeping your real team name, knit to PDF, then stage, commit, and push your .Rmd and PDF to GitHub.

Step 4: Member 3 or 4: Write some narrative below the last code chunk in your lab05.Rmd file. Knit to PDF, then stage, commit and push your .Rmd and PDF to GitHub.

You should get an error. Read this error to your teammates and show them the error by sharing your screen.

Pull and share your screen with your team. You should notice that the author line in the header has been updated. Knit to PDF, then stage, commit and push your .Rmd and PDF to GitHub.

Step 5: Everyone: Pull and delete the narrative. All team members should have the same content in the .Rmd file before proceeding to the exercises.

Exercises

Packages

library(tidyverse)

Data

The data comes from a cohort study of collegiate athletes using the National Collegiate Athletic Association (NCAA) Injury Surveillance System; certified athletic trainers recorded data during the 1997–2000 academic years.

The objective of the study was to compare sex differences regarding the incidence of concussions among collegiate athletes across three seasons in various sports.

More about the study can be found in the second reference of the References section.

concussion <- read_table("http://users.stat.ufl.edu/~winner/data/concussion.dat",
                         col_names = FALSE)
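Because the file is read with col_names = FALSE, readr assigns default column names (X1, X2, and so on). The sketch below shows the general rename(new_name = old_name) pattern and a factor conversion on a made-up tibble; the names id and level are hypothetical and are not the lab's variables.

```r
library(dplyr)

# Made-up tibble with readr-style default column names (not the lab data).
toy <- tibble(X1 = c(1, 2, 3), X2 = c("low", "high", "low"))

# rename() takes new_name = old_name pairs; mutate() can overwrite a
# column with its factor version.
toy <- toy %>%
  rename(id = X1, level = X2) %>%
  mutate(level = as.factor(level))

toy
```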

Write all R code according to the style guidelines discussed in class. Be especially careful about staying within the 80-character limit. Display the resulting tibble for each exercise. Use dplyr functions where applicable.

For each probability exercise, assume we are randomly selecting individuals from this cohort study.
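When data are stored as counts per category rather than one row per individual, the probability of randomly selecting an individual with some property is the sum of the matching counts divided by the sum of all counts in the conditioning group. A minimal sketch on made-up data follows; the tibble toy and its columns group, outcome, and count are hypothetical.

```r
library(dplyr)

# Made-up aggregated data: each row is a category combination and the
# number of individuals in it (not the lab data).
toy <- tibble(
  group   = c("a", "a", "b", "b"),
  outcome = c("yes", "no", "yes", "no"),
  count   = c(5, 95, 20, 180)
)

# P(outcome = "yes" | group): within each group, the "yes" count divided
# by the group's total count.
toy_prob <- toy %>%
  group_by(group) %>%
  summarise(probability = sum(count[outcome == "yes"]) / sum(count))

toy_prob
```

Grouping first means that sum(count) is the total for the conditioning group only, which is what turns a joint count into a conditional probability.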

  1. Take a look at the variable names in concussion. Rename them with rename() so they are gender, sport, year, concussed, and count. Overwrite concussion.

  2. Convert year and concussed to be factors. Overwrite concussion.

  3. Compute the probability a male from this cohort study had a concussion. Do the same for a female. Your output should have two variables (gender and probability) and two rows.

  4. Given an athlete played soccer, what is the probability they had a concussion? Your output should have one variable (probability) and one row. How does this number compare with your results in Exercise 3?

  5. Display in a bar plot the conditional probability of a concussion given sport and gender. It should look similar to what you see below, but you may implement your own theme and design. The style features do not have to match to earn full credit here.

  6. Given an athlete had a concussion, what is the probability it was a female soccer player? Your output should have one variable (probability) and one row.
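For the bar plot in Exercise 5, one possible approach is to compute the conditional probabilities first and then pass the summary to ggplot2 with geom_col(). The sketch below uses made-up data; the tibble toy and its columns kind, group, outcome, and count are hypothetical, and the styling is only one of many acceptable choices.

```r
library(dplyr)
library(ggplot2)

# Made-up aggregated counts for two grouping variables (not the lab data).
toy <- tibble(
  kind    = rep(c("x", "y"), each = 4),
  group   = rep(c("a", "a", "b", "b"), times = 2),
  outcome = rep(c("yes", "no"), times = 4),
  count   = c(5, 95, 20, 180, 10, 90, 30, 170)
)

# P(outcome = "yes" | kind, group) for every combination of kind and group.
toy_prob <- toy %>%
  group_by(kind, group) %>%
  summarise(
    probability = sum(count[outcome == "yes"]) / sum(count),
    .groups = "drop"
  )

# One bar per combination; position = "dodge" places the groups side by side.
p <- ggplot(toy_prob, aes(x = kind, y = probability, fill = group)) +
  geom_col(position = "dodge") +
  labs(x = "Kind", y = "P(yes | kind, group)", fill = "Group")

p
```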

Upload your team’s PDF to Gradescope. Include every team member’s name in the Gradescope submission and mark the pages on which each problem appears. Associate the “Overall” section with the first page of your PDF.

Include all team members’ names with the team name in the author portion of the YAML header.

There should only be one submission per team on Gradescope.

References

“Datasets”. Users.Stat.Ufl.Edu, 2021, http://users.stat.ufl.edu/~winner/datasets.html. Accessed 20 Feb 2021.

T. Covassin, C.B. Swanik, M.L. Sachs (2003). “Sex Differences and the Incidence of Concussions Among Collegiate Athletes”, Journal of Athletic Training, Vol. 38(3), pp. 238-244.