This question involves the Boston housing data set. To begin, load in the Boston data set. The Boston data set is part of the MASS library in R.
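A minimal sketch of the loading step, assuming the MASS package is already installed. The calls to `str()` and `summary()` are optional inspection steps added here for illustration; the question does not require them.

```r
# Load the MASS package, which ships with the Boston data set
library(MASS)

# Make the Boston data frame available under its own name
data(Boston)

# Inspect the structure: 506 observations of 14 variables
str(Boston)

# Quick look at the crime-rate column used in question 1
summary(Boston$crim)
```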
1. Imagine you are interested in studying crime (i.e., crime is your target variable). For linear regression to be appropriate for your analysis of crime (column crim), the dependent variable should be approximately normally distributed. Create a histogram of the variable crim and overlay a graph of a normal density function with the same mean and standard deviation as crim. Is the variable crim approximately normally distributed? What could you do to make it look more like a normal distribution?
2. Fit a classification model of your choice (SVM, logistic regression, decision tree) to predict whether or not the tract of land borders the Charles River. Comment on your result.
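One possible way to approach both parts is sketched below in base R. Several choices here are assumptions rather than anything the question specifies: the number of histogram bins, the log transform as the remedy for skew, logistic regression as the classifier, the use of all other columns as predictors, the 0.5 probability cutoff, and evaluating on the training data instead of a held-out set. The river indicator is the chas column (1 if the tract bounds the Charles River, 0 otherwise).

```r
library(MASS)
data(Boston)

# --- Question 1: histogram of crim with a normal density overlay ---
crim_mean <- mean(Boston$crim)
crim_sd   <- sd(Boston$crim)

# freq = FALSE puts the histogram on the density scale so the curve is comparable
hist(Boston$crim, freq = FALSE, breaks = 50,
     main = "Per capita crime rate (crim)", xlab = "crim")
curve(dnorm(x, mean = crim_mean, sd = crim_sd),
      add = TRUE, col = "red", lwd = 2)

# Assumed remedy for the right skew: a log transform (crim is strictly positive)
log_crim <- log(Boston$crim)
hist(log_crim, freq = FALSE, breaks = 30,
     main = "log(crim)", xlab = "log(crim)")
curve(dnorm(x, mean = mean(log_crim), sd = sd(log_crim)),
      add = TRUE, col = "red", lwd = 2)

# --- Question 2: logistic regression for the Charles River indicator ---
# chas ~ . uses every other column as a predictor (an illustrative choice)
fit <- glm(chas ~ ., data = Boston, family = binomial)

# Predicted probabilities on the training data, cut at 0.5 (assumed threshold)
pred_prob  <- predict(fit, type = "response")
pred_class <- as.integer(pred_prob > 0.5)

# Confusion matrix and in-sample accuracy
conf_mat <- table(predicted = pred_class, actual = Boston$chas)
accuracy <- mean(pred_class == Boston$chas)

# Accuracy of always predicting "does not border the river"
baseline <- mean(Boston$chas == 0)

print(conf_mat)
print(c(accuracy = accuracy, baseline = baseline))
```

When commenting on the result, it may help to compare `accuracy` against `baseline`: only a small share of tracts border the river, so a model can score well on accuracy while rarely predicting the positive class. The confusion matrix shows whether that is happening, and a train/test split would give a more honest estimate than the in-sample figure computed here.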

