The dataset car_prices.csv contains attributes of cars
offered for sale on cars.com in 20171.
type: Model (Accord, Maxima, Mazda6)age: Age of the used car (in years)price: Price (in thousands of dollars)mileage: Previous miles driven (in thousands of
miles)car_prices <- read.csv("data/car_prices.csv")
Consider a regression model with the response price and
a single predictor mileage.
Create a scatterplot of price and mileage.
Use appropriate functions in R to find the fitted model
and display the results in tidy format. Write out the equation of the
fitted model, then include a visualization of the linear model on the
scatterplot from Exercise 1.
Interpret the slope and intercept in the context of the problem.
What is the predicted selling price of a car with 50,000 miles?
Suppose my friend has a Honda Accord with 225,000 miles. Is it appropriate to use this model to make a prediction for the selling price? Why or why not?
Consider a regression model with the response price and
the categorical predictor type (Accord, Maxima,
Mazda6).
Use appropriate functions in R to find the fitted model
and display the results in tidy format. Write out the equation of the
fitted model.
Interpret the intercept and slope(s) in the context of the problem.
How many terms are in the model for type? Is this equal
to the number of car types in the data set? If not, briefly explain why
the number of terms for type in the model differs from the
number car types in the data set.
How does the average price of Maximas compare to the average price of Accords?
What are possible limitations of your two regression models?
The data is from the Stat2Data R package.↩︎