tidyverse, infer, or base R. However, for simulation-based inference, you must use tidyverse or infer.
library(tidyverse)
library(infer)
Read "Diabetes Rates Rise in 18 States in Past Decade" along with the Survey Methods (at the bottom of the article). Use the information provided in the article to complete Exercises 1 - 4.
What was the sample size Gallup took in the 2016 - 2017 nationwide study?
Gallup states, “Nationwide, the diabetes rate rose to 11.5% in 2016-2017, up 0.7 percentage points compared with the 10.8% measured in 2008-2009 and representing a net increase of about 1.7 million U.S. adults who report having been diagnosed with the disease over that time.” What do the quantities 11.5% and 10.8% represent?
Provide and interpret a 95% confidence interval for the proportion of adult individuals that have diabetes. Use the information gathered from the 2016 - 2017 data. Hint: if we know the margin of sampling error (this is a function of the variability in the sample statistic), then we can compute the confidence interval by point estimate \(\pm\) margin of sampling error.
Provide and interpret a 95% confidence interval for the proportion of adult individuals in Alaska that have diabetes. Use the information gathered from the 2016 - 2017 data.
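The hint above amounts to simple arithmetic, sketched below. The point estimate is the 11.5% quoted from Gallup; the margin of sampling error here is an assumed placeholder, not a value from the article, so replace it with the figure reported in the Survey Methods.

```r
# Point estimate: the 11.5% nationwide rate quoted from Gallup.
point_estimate <- 0.115

# ASSUMED placeholder for illustration only; use the margin of sampling
# error reported in the article's Survey Methods instead.
margin_of_error <- 0.01

# Confidence interval = point estimate +/- margin of sampling error.
ci <- c(lower = point_estimate - margin_of_error,
        upper = point_estimate + margin_of_error)
ci
```

The same two lines of arithmetic apply to a state-level estimate once its own point estimate and margin of sampling error are substituted.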
An insurance company states that 90% of its claims are settled within 30 days. A consumer group believes that less than 90% of the claims are settled within 30 days. They selected a random sample of 75 of the company’s claims to investigate. The consumer group found that 55 of the claims were settled within 30 days. At the 0.05 significance level, test the company’s claim that 90% of its claims are settled within 30 days. Be sure to clearly outline the process you take to arrive at your conclusion. Also, give an appropriate conclusion within the context of the problem.
Perform the hypothesis test for the problem defined in the exercise above again, but this time use a different approach. For example, if you used a CLT-based approach, now use a simulation-based approach. Compare the results from the two methods.
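As a rough illustration of what a simulation-based test of a single proportion can look like with infer, the sketch below uses made-up data (40 "yes" outcomes out of 50) and a made-up null value of 0.85, not the numbers from the exercise. The names toy, outcome, obs_stat, and null_dist are hypothetical. It assumes infer 1.0.0 or later, where the generation type for a point null on a proportion is "draw"; earlier versions call it "simulate".

```r
library(tidyverse)
library(infer)

set.seed(1)

# Made-up sample: 40 of 50 outcomes are "yes" (NOT the exercise data).
toy <- tibble(outcome = c(rep("yes", 40), rep("no", 10)))

# Observed sample proportion of "yes".
obs_stat <- toy %>%
  specify(response = outcome, success = "yes") %>%
  calculate(stat = "prop")

# Simulated null distribution of the sample proportion under H0: p = 0.85.
null_dist <- toy %>%
  specify(response = outcome, success = "yes") %>%
  hypothesize(null = "point", p = 0.85) %>%
  generate(reps = 1000, type = "draw") %>%
  calculate(stat = "prop")

# One-sided p-value for Ha: p < 0.85.
p_value <- null_dist %>%
  get_p_value(obs_stat = obs_stat, direction = "less")
p_value
```

The p-value is the share of simulated proportions at or below the observed one, which is the simulation analogue of the lower-tail area used in a CLT-based test.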
For Exercises 7 and 8, use the data available at https://www.openintro.org/data/index.php?data=gifted. Read it into R with:
gifted <- read_csv("https://www.openintro.org/data/csv/gifted.csv")
You may assume this data is from a random sample of all gifted children in the city in which the data was obtained.
Use a simulation-based inference approach to compute a 98% confidence interval for the mean number of months until gifted children from the specified city can count to 10. Provide an interpretation of your interval.
Do gifted children watch more cartoons on average per week than they do educational TV? Based on the sample data, create a 99% confidence interval to answer this question.
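For the simulation-based intervals in the two exercises above, one common route is a bootstrap percentile interval. The sketch below shows the general infer pipeline on a small made-up numeric sample, not the gifted data; the names toy, value, and boot_dist are hypothetical.

```r
library(tidyverse)
library(infer)

set.seed(1)

# Made-up numeric sample (NOT the gifted data).
toy <- tibble(value = c(28, 31, 25, 34, 30, 29, 33, 27, 32, 26))

# Bootstrap distribution of the sample mean: resample rows with
# replacement 1000 times and compute the mean of each resample.
boot_dist <- toy %>%
  specify(response = value) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "mean")

# Percentile interval: the middle 98% of the bootstrap means.
ci <- boot_dist %>%
  get_confidence_interval(level = 0.98, type = "percentile")
ci
```

Changing level changes the confidence level. For a comparison of two measurements taken on the same children, the same pipeline could be run on a column of within-row differences.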
Consider a confidence interval for a population proportion. “A” and “B” use the same data. “A” creates a 90% confidence interval, and “B” creates a 95% confidence interval. “C” collects his own sample, which is half the size of the sample “A” and “B” used. “C” found that \(\hat{p} = 0.50\) from his sample data. “C” also creates a 95% confidence interval for the population proportion. Given this information, rank the widths of the intervals produced by “A”, “B”, and “C” from largest to smallest. If a ranking cannot be determined, state that. Your answer must include a justification.
During an angiogram, heart problems can be examined by threading a catheter into the heart. A medical device company claims that the mean diameter of its catheters is 2.00 mm. Quality control personnel would like to test whether the mean catheter diameter exceeds the company’s claimed value. Complications can arise if the mean diameter is substantially different from 2.00 mm. Describe some potential consequences, from the company’s perspective, of making a Type I error and of making a Type II error in this test.
Knit to PDF to create a PDF document. Stage and commit all remaining changes, and push your work to GitHub. Make sure all files are updated on your GitHub repo.
Upload only your PDF document to Gradescope. Before you submit the uploaded document, mark the page on which the answer to each exercise appears. If an answer spans multiple pages, mark all of those pages. Associate the “Overall” section with the first page.