Data and Introduction

paintings <- read.csv("data/paris_paintings.csv", 
                            na = c("n/a", "", "NA")) %>%
  dplyr::select(Height_in, Width_in, landsALL) %>%
  na.omit()

We will be looking at data about Paris Paintings in today’s application exercise. These data were collected by Hilary Coe Cronheim and Sandra van Ginhoven (previous PhD students at Duke University Art, Art History & Visual Studies) as part of the Data Expeditions project sponsored by iiD. These data consist of the physical and artistic features of paintings featured in printed catalogues of 28 auction sales in Paris, 1764-1780. In total, these students analyzed 3,393 paintings, their prices, and descriptive details from sales catalogues over 60 variables

We’ll primarily focus on the variables:

Exercises

Exercise 0

Go onto GitHub and edit the README.md file to complete the data dictionary (Becky will show you how to do this).

Exercise 1

First, pull your changes! Right now, landsALL is coded as a numeric variable in the dataset. Modify the dataset to make landsALL a factor so R treats it as a categorical or factor variable.

Exercise 2

Create a scatterplot to visualize the relationship between width and height. Color the points based on the whether the painting has any landscape elements.

Exercise 3

Based on your scatterplot, does the relationship between Height and Width differ between paintings with landscape elements and those without? Briefly explain.

Exercise 4

Fit a model using width and whether the painting has landscape elements to predict the height. Present the model in a tidy format.

Exercise 5

Using your model from the previous exercise,

  • Interpret the intercept.
  • Interpret the coefficient (slope) of Width_in.
  • Interpret the coefficient (slope) of landsAll.

Exercise 6

Write the equation of the model for

  • paintings with no landscape elements.
  • paintings with landscape elements.

How do the slopes compare between the two models? How do the intercepts compare?

Exercise 7

Now let’s see how well the model fits the data. Obtain the \(R^2\) for your model and interpret it!