Go to the appex-10-[GITHUB USERNAME] repo, clone it, and
start a new project in RStudio.
inferNOTE: you may need to install the infer
package. You may do so by typing install.packages("infer)
in the Console.
library(tidyverse)
library(infer)
The General Social Survey (GSS) gathers data on contemporary American society in order to understand and explain trends and constants in attitudes, behaviors, and attributes. Hundreds of trends have been tracked since 1972. In addition, since the GSS adopted questions from earlier surveys, trends can be followed for up to 70 years.
The GSS contains a standard core of demographic, behavioral, and attitudinal questions, plus topics of special interest. Among the topics covered are civil liberties, crime and violence, intergroup tolerance, morality, national spending priorities, psychological well-being, social mobility, and stress and traumatic events.
We will analyze data from the 2018 GSS, using to understand how much time US adults spend on email.
gss_email <- read_csv("data/gss_email_2018.csv") %>%
slice(150:300)
We’ll use the variable email: the number of minutes the
respondents spend on email weekly. We will begin by live-coding
together, and then you must complete the remainder of the Application
Exercise on your own!
We want to calculate a 95% confidence interval for the mean amount of time Americans spend on email weekly.
Set a seed to control R’s random sampling process.
We will use the infer package to calculate a 95%
confidence interval for the median amount of time Americans spend on
email weekly.
We start by setting a seed. We’ll use 1234 to set our seed today but you can use any value you want on assignments.
set.seed(1234)
Un-comment the lines and fill in the blanks to create a data frame containing the bootstrapped data - sample medians in our case.
boot_dist <- gss_email #%>%
Glimpse the data frame of the bootstrapped data.
### Glimpse the bootstrapped data frame
Fill in the blanks to construct the 95% bootstrap confidence interval for the median amount of time Americans spend on email weekly.
Write the interpretation for the interval calculated above.