Enter your answers in the empty code chunks. Replace “# your code here” with your code.
Make sure you run this chunk before attempting any of the problems:
library(tidyverse)
2 Basics
Calculate 2 + 2:
2+2
## [1] 4
Calculate
# your code here
Calculate
# your code here
3 dplyr
Let’s work with the data set diamonds:
data(diamonds) # this will load a dataset called "diamonds"
Calculate the average price of a diamond. Use the %>% and summarise() syntax (hint: see lectures).
# your code here
Calculate the mean, median, and standard deviation of diamond prices. Use the %>% and summarise() syntax.
# your code here
Use group_by() to group diamonds by color, then use summarise() to calculate the average price and the standard deviation in price by color:
# your code here
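Hint: the group_by() and summarise() pattern looks roughly like the sketch below. It uses the built-in mtcars data rather than diamonds, and the names mpg_by_cyl, mean_mpg and sd_mpg are made up for illustration, so adapt the columns to the problem above.

```r
library(dplyr)  # attached automatically by library(tidyverse)

# mtcars ships with base R; it stands in for diamonds here
mpg_by_cyl <- mtcars %>%
  group_by(cyl) %>%            # one group per number of cylinders
  summarise(
    mean_mpg = mean(mpg),      # average fuel economy within each group
    sd_mpg   = sd(mpg)         # standard deviation within each group
  )

mpg_by_cyl
```

Each argument to summarise() becomes one column in the result, with one row per group.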
Use filter() to remove observations with a depth greater than 62, then use group_by() to group diamonds by clarity, then use summarise() to find the maximum price of a diamond by clarity:
# your code here
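Hint: filter() keeps the rows where its condition is TRUE, so to remove rows you write the condition for the rows you want to keep. Below is a minimal sketch on the built-in mtcars data, not diamonds; the 3.5 cutoff and the name max_hp_by_gear are arbitrary choices for illustration.

```r
library(dplyr)  # attached automatically by library(tidyverse)

max_hp_by_gear <- mtcars %>%
  filter(wt <= 3.5) %>%          # drops cars with wt greater than 3.5
  group_by(gear) %>%             # one group per number of gears
  summarise(max_hp = max(hp))    # largest horsepower within each group

max_hp_by_gear
```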
Use mutate() and log() to add a new variable called “log_price” to the data. Make sure you save the new variable in the dataset diamonds.
# your code here
(Hint: if I wanted to add a variable called “max_price” that calculates the max price, the code would look like this:)
diamonds = diamonds %>%
  mutate(max_price = max(price))
4 ggplot2
Continue using diamonds.
Use geom_histogram() to plot a histogram of prices:
# your code here
Use geom_density() to plot the density of log prices (the variable you added to the data frame):
# your code here
Use geom_point() to plot carats against log prices (i.e. carats on the x-axis, log prices on the y-axis):
# your code here
Same as above, but now add a regression line with geom_smooth():
# your code here
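Hint: ggplot2 layers are stacked with +, so a regression line is just one more layer on top of the points. The sketch below uses the built-in mtcars data, not diamonds. It assumes you want a straight line, which is why it passes method = "lm"; without that argument geom_smooth() picks a smoother on its own.

```r
library(ggplot2)  # attached automatically by library(tidyverse)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +                 # one point per car
  geom_smooth(method = "lm")     # fitted straight line with a confidence band

p
```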
Use stat_summary() to make a bar plot of average log price by cut:
# your code here
Same as above but change the theme to theme_classic():
# your code here
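Hint: stat_summary() computes a summary of y for each value of x and draws it with the geom you name. Below is a minimal sketch on the built-in mtcars data, not diamonds. It assumes ggplot2 3.3.0 or later, where the argument is called fun (older versions used fun.y).

```r
library(ggplot2)  # attached automatically by library(tidyverse)

p <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  stat_summary(fun = mean, geom = "bar") +  # bar height = mean mpg per group
  theme_classic()                           # swap the default grey theme

p
```

A theme is added with + like any other layer, so the last line can be dropped or replaced without touching the rest of the plot.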
5 Inference
Use lm() to estimate the model
and store the output in an object called “m1”:
# your code here
Use summary() to view the output of “m1”:
# your code here
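Hint: lm() takes a formula of the form outcome ~ predictor and a data argument. The sketch below fits an unrelated model on the built-in mtcars data; the formula mpg ~ wt and the object name fit are illustrative only and are not the model asked for above.

```r
# Regress fuel economy on weight using the built-in mtcars data
fit <- lm(mpg ~ wt, data = mtcars)

# Coefficients, standard errors, t-statistics and R-squared
summary(fit)
```

Additional predictors are added to the right-hand side with +, for example mpg ~ wt + hp.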
Use lm() to estimate the model
and store the output in an object called “m2”:
# your code here
Use summary() to view the output of “m2”:
# your code here

