SAS Critical Thinking Project

0 comments

1. As described above, you must upload the Real Estate Database into SAS.

2. You must perform the described analysis in Excel and submit the Spreadsheet.

3. You must perform the described analysis in SAS and submit a PDF.

4. You must complete the Word document (answers from 2 and 3 above) and submit.

SAS® Forecasting Project for Critical Thinking

This project utilizes the “Real Estate – Base” database. The purpose is twofold:

  • Build critical thinking skills needed to structure data analysis appropriately for effective decision making.
  • Analyze available data practically and skillfully in order to build an explanatory regression model.

The Real Estate – Base database includes the following variables for 101 homes (* NOTE: These variables are shown as qualitative variables within the database):

  • *Unit# (An assigned database key)
  • *Type (H = House, C = Condo/Apartment)
  • *Location (1 through 10 – voting district where located)
  • *U/S/R (Urban vs. Suburban vs. Rural location)
  • Price (The price the house ended up selling for in 2017)
  • Sq. Ft. (Heated/Cooled & Attached square footage)
  • Lot (Acres) (Acreage of property)
  • Garage (Number of attached covered and/or enclosed parking positions)
  • BRs (Number of qualified bedrooms)
  • Baths (Number of bathrooms – no tub or shower indicated as .5)
  • *Pool (No=No Access; HA=Shared Pool; AG=Above Ground; IG=In Ground)
  • Age (Age of home in rounded year at end of 2017)

At a high level, here are the steps you are going to perform:

  • Download the Excel spreadsheet with the Real Estate Data in it and create the requested Scatterplots. NOTE: It is important that the Dependent Variable (Price) is on the Y-axis and the Independent Variable is on the X-axis. The order of the two columns will dictate that.
  • Perform Regression Analysis within Excel to determine how well the prescribed Independent Variables explain changes in the Dependent Variable.
  • Upload the Real Estate dataset into SAS Studio.
  • Perform a series of Regression Analyses in SAS Studio to find a better set of explanatory variables.
  • Answer a critical thinking exercise regarding forecasting and the data set we have.

Here are the steps in detail:

  • Create the following charts in Excel using the charting tools and the indicated variables in “Real Estate – Base.xlsx” (Remember, Price is your Dependent Variable)
    • Create a new tab in the spreadsheet called “Scatterplots”. After creating each Scatterplot on the original tab, move it to the Scatterplot tab you created.
    • Create a Scatterplot using the variables Price and Sq. Ft.
    • Create a Scatterplot using the variables Price and Lot (Acres).
    • Create a Scatterplot using the variables Price and Garage.
    • Create a Scatterplot using the variables Price and BRs.
    • Create a Scatterplot using the variables Price and Baths.
    • Create a Scatterplot using the variables Price and Age.
  • What sort of relationship do you see between these variables based on the scatterplots?
    • Between Price and Sq. Ft. (Circle)?
    • Between Price and Lot (Circle)?
    • Between Price and Garage (Circle)?
    • Between Price and BRs (Circle)?
    • Between Price and Baths (Circle)?
    • Between Price and Age (Circle)?
  • In the Excel spreadsheet provided, using the Data Analysis Add-in, run a regression analysis with Price as the Dependent Variable and Lot, Garage and BRs as the Independent Variables and select to have Excel create a new tab called “Regression Model”. It is recommended that you run individual regressions with each variable alone to see how strong each R2 is.
  • Provide the following from the “Excel Model”:
    • Coefficient of Determination (R-squared) ___________________
    • Y-Intercept for the Regression Model ___________________
    • Slope value for X1 (Lot) ___________________
    • Slope value for X2 (Garage) ___________________
    • Slope value for X3 (BRs) ___________________
  • Do you think we need all three current Independent variables in our Regression model to predict changes in Price (Circle)? Yes No
  • Which variable(s) would you remove (Circle)?
  • Of the following variables in the spreadsheet, which variable would you select next to add to the model (i.e., you think it would create a stronger prediction of Price)?
  • Run a SAS Regression Model on the Real Estate – Base database using Price as the Dependent Variable (Y) and include the original Independent Variables (minus any you removed in step 6) and adding the variable you chose in step 7. Print your model output and turn it in with the assignment. (NOTE: You may have to repeat this exercise until you find a combination of variables that gives you a higher R2).
  • Provide the following from the SAS Model:
    • Coefficient of Determination (R-squared ). ________________________
    • Y-Intercept for the Regression Model ________________________
    • Slope value for each of your Independent Variables.
      • Var_______________________ ________________________
      • Var_______________________ ________________________
      • Var_______________________ ________________________
      • Var_______________________ ________________________
      • Var_______________________ ________________________
  • Did your SAS model provide a stronger Coefficient of Determination (Circle)? Yes No
  • A large real estate company is trying to use similar data plus their own sales data to forecast total sales for the coming year for each of their agents and they have pulled data from their Finance records. They are trying to assemble the best data to build a Regression model.
    • Would it make sense to use the same data as we used above in the SAS model? Why or why not?
    • Recommend two data elements you think they probably have available to help them predict sales for each of their sales people.

No relationship Weak Moderate Strong

No relationship Weak Moderate Strong

No relationship Weak Moderate Strong

No relationship Weak Moderate Strong

No relationship Weak Moderate Strong

No relationship Weak Moderate Strong

Explain: _________________________________________________________________________

_______________________________________________________________________________

_______________________________________________________________________________

Lot Size Garage BRs

Type Location U/S/R Sq. Ft. Baths Pool Age

Critical Thinking Question:

__________________________________________________________________________________

__________________________________________________________________________________

  • ______________________________________________
  • ______________________________________________

GRADING RUBRIC

Overall Score Possible = 100

Problem Area

Possible Points

Points Awarded

Did the student create the Excel tab for Scatterplots?

2

Did the student create the correct scatterplots and move them to the new tab?

3

Did the student make a selection for each type of relationship?

5

Did the student run Data Analysis on the Excel spreadsheet creating a new tab for the model output?

10

Did the student provide the correct model output values from the spreadsheet in the problem document?

10

Did the student answer Critical Thinking questions 5, 6 and 7?

20

Did the student run a regression model in SAS and provide a print out of the model output?

20

Did the student provide the correct model output values from SAS in the problem document and answer the decision problem (#10)?

10

Did the student complete all parts of the Critical Thinking problem #11?

20

Total Critical Thinking Points

100

About the Author

Follow me


{"email":"Email address invalid","url":"Website address invalid","required":"Required field missing"}