- The Ames-based non-profit company OAITI provides open-source data sets. One of these data sets contains information on all house sales in Ames between 2008 and 2010. The following piece of code reads the data set into your R session. How many house sales were there between 2008 and 2010? What types of variables are we dealing with?
housing <- read.csv("https://raw.githubusercontent.com/OAITI/open-datasets/master/Housing%20Data/Ames-Housing.csv")
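
Once the file is loaded, base R functions can answer both parts of the first question. The following is a minimal sketch run on a small made-up data frame so that it works offline; the column names `SalePrice`, `YrSold` and `Neighborhood` are assumptions and may not match the real file.

```r
# Toy stand-in for the downloaded data; values and column names are made up.
toy <- data.frame(
  SalePrice    = c(208500, 181500, 223500),
  YrSold       = c(2008L, 2009L, 2010L),
  Neighborhood = c("CollgCr", "Veenker", "CollgCr"),
  stringsAsFactors = FALSE
)

n_sales   <- nrow(toy)           # one row per sale, so this counts the sales
var_types <- sapply(toy, class)  # storage class of each column
str(toy)                         # compact overview: names, types, first values
```

Applied to `housing`, the same three calls give the number of sales and a first look at which variables are numeric and which are categorical.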
- Do sales prices change over time? (Do not test for significance.) Provide a graphic that supports your statement.
- What is the relationship between sales prices and the size of the house (living area)? Make a chart and describe the relationship.
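
One possible way to approach the two charting questions above is sketched here with ggplot2 on made-up data. The column names `YrSold`, `GrLivArea` and `SalePrice` are assumptions about the real file, and boxplots per year and a scatterplot with a linear smoother are only one of several reasonable chart choices.

```r
library(ggplot2)

# Made-up sales; the column names are assumed, not confirmed by the exercise.
toy <- data.frame(
  YrSold    = c(2008, 2008, 2009, 2009, 2010, 2010),
  GrLivArea = c(900, 1500, 1100, 2000, 1300, 1700),
  SalePrice = c(120000, 210000, 150000, 280000, 170000, 230000)
)

# Prices over time: one boxplot per year of sale.
p_time <- ggplot(toy, aes(x = factor(YrSold), y = SalePrice)) +
  geom_boxplot() +
  labs(x = "Year sold", y = "Sale price")

# Price against living area: scatterplot with a straight-line fit.
p_size <- ggplot(toy, aes(x = GrLivArea, y = SalePrice)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(x = "Living area (sq ft)", y = "Sale price")
```

Printing `p_time` or `p_size` draws the corresponding chart.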
- Use dplyr functions to:
  - introduce a variable consisting of price per square foot,
  - find the average price per square foot in each of the Ames neighborhoods,
  - exclude averages that are based on fewer than 10 records,
  - reorder the remaining neighborhoods according to the mean sales prices.
- Draw a chart of the average sale prices by neighborhood and comment on it. Only consider neighborhoods with at least 10 sales.
Bonus: write the code for this question and the previous one as a single statement for 0.5 points of extra credit.
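
The summarising steps and the chart can be sketched as follows on made-up data. The column names `SalePrice`, `GrLivArea` and `Neighborhood` are assumptions about the real file, and `min_n` is a hypothetical helper set to 2 here (the exercise asks for 10) so that the toy data is not filtered away entirely.

```r
library(dplyr)
library(ggplot2)

# Made-up sales; the column names are assumed, not confirmed by the exercise.
toy <- data.frame(
  Neighborhood = rep(c("A", "B", "C"), times = c(3, 2, 1)),
  SalePrice    = c(200000, 220000, 240000, 150000, 170000, 300000),
  GrLivArea    = c(1000, 1100, 1200, 1000, 1000, 1500)
)

min_n <- 2  # the exercise uses 10; lowered only for this toy example

by_nbhd <- toy %>%
  mutate(price_sqft = SalePrice / GrLivArea) %>%   # price per square foot
  group_by(Neighborhood) %>%
  summarise(
    n_sales         = n(),
    mean_price_sqft = mean(price_sqft),
    mean_price      = mean(SalePrice)
  ) %>%
  filter(n_sales >= min_n) %>%                     # drop thin neighborhoods
  mutate(Neighborhood = reorder(Neighborhood, mean_price))

# Dot chart of mean sale price, neighborhoods ordered by that mean.
p <- ggplot(by_nbhd, aes(x = Neighborhood, y = mean_price)) +
  geom_point() +
  coord_flip()
```

Because `ggplot()` takes the data as its first argument, the pipeline can also be piped straight into it, which is one way to get both steps into a single statement.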
- Use dplyr functions to:
  - introduce a logical variable called ‘garage’ that is FALSE if the garage area is zero, and TRUE otherwise,
  - exclude all sales of houses that do not have a garage,
  - only consider 1 and 2 story houses (HouseStyle),
  - create a new variable YBCut from YearBuilt that introduces age categories by grouping the year a house was built into intervals: 1800-1850, 1850-1900, 1950-2000, 2000+ (see ?cut).
- Draw a chart of the previous data set. Draw side-by-side boxplots of the garage area by YBCut. Facet by the style of house. Describe and summarise the chart.
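
The garage steps and the faceted boxplots can be sketched as follows on made-up data. Several things here are assumptions: the column name `GarageArea`, the level names `"1Story"` and `"2Story"` for `HouseStyle`, and the break at 1950, which closes the gap between 1900 and 1950 so that `cut()` receives contiguous breaks (the exercise does not list that interval).

```r
library(dplyr)
library(ggplot2)

# Made-up houses; column names and HouseStyle codes are assumed.
toy <- data.frame(
  GarageArea = c(0, 400, 520, 300, 610, 450),
  HouseStyle = c("1Story", "1Story", "2Story", "1.5Fin", "2Story", "1Story"),
  YearBuilt  = c(1995, 1920, 2005, 1960, 1975, 2008)
)

garages <- toy %>%
  mutate(garage = GarageArea > 0) %>%                    # FALSE when area is zero
  filter(garage, HouseStyle %in% c("1Story", "2Story")) %>%
  mutate(YBCut = cut(YearBuilt,
                     breaks  = c(1800, 1850, 1900, 1950, 2000, Inf),
                     dig.lab = 4))                       # avoid labels like 1.8e+03

# Side-by-side boxplots of garage area per age category, one panel per style.
p <- ggplot(garages, aes(x = YBCut, y = GarageArea)) +
  geom_boxplot() +
  facet_wrap(~ HouseStyle)
```

By default `cut()` builds intervals that are open on the left and closed on the right, so a house built in 1950 falls into the 1900-1950 category; `right = FALSE` reverses that.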

