Data Science Project


Submit 2 files:
1. CSV file containing the data set in its original form
a. Name the file: LastName FirstName Original Data Set

2. MS Excel file containing your processed clean data set and data dictionary

1. Find a data set that is interesting to you that is appropriate for multiple regression.

2. The data set must contain a minimum of 1,000 records (observations, examples).

3. The data set must contain a minimum of 15 numeric variables (attributes, features).
These must be variables that are usable in your predictive model. Time variables,
street addresses, latitude, and longitude, for example, all appear to be numeric,
and could be, but generally are not usable in a regression model.

4. It is best not to use a data set that contains extensive missing data.

5. Save the data set as a .CSV file.

6. Save the data set as an MS Excel file, named as described above.

7. In the MS Excel workbook, name the worksheet containing the data “Data”.

8. In the MS Excel workbook, create a second worksheet and name it “Data Dictionary”.
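
The size and completeness requirements above can be checked before committing to a data set. The following is a minimal sketch in Python using only the standard library. The function name `summarize`, the list of tokens treated as missing, and the tiny in-memory sample are assumptions made for illustration; they are not part of the assignment. Note that the check only tells you whether a column parses as numbers. Whether a numeric-looking column (a time stamp, a latitude) is actually usable in a regression model is still a judgment you have to make yourself.

```python
import csv
import io

MIN_RECORDS = 1000   # minimum number of records required by the assignment
MIN_NUMERIC = 15     # minimum number of numeric variables required

# Assumption: these strings (case-insensitive) count as missing values.
MISSING_TOKENS = {"", "na", "n/a", "nan", "null"}


def is_number(text):
    """Return True if the text can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def summarize(csv_file):
    """Count records, missing values per column, and numeric columns."""
    reader = csv.DictReader(csv_file)
    columns = reader.fieldnames or []
    records = 0
    missing = {c: 0 for c in columns}
    non_numeric = {c: 0 for c in columns}
    for row in reader:
        records += 1
        for c in columns:
            value = (row.get(c) or "").strip()
            if value.lower() in MISSING_TOKENS:
                missing[c] += 1
            elif not is_number(value):
                non_numeric[c] += 1
    # A column counts as numeric if every non-missing value parses as a
    # number and at least one value is present.
    numeric = [c for c in columns
               if non_numeric[c] == 0 and missing[c] < records]
    return {
        "records": records,
        "numeric_columns": numeric,
        "missing_by_column": missing,
        "meets_record_minimum": records >= MIN_RECORDS,
        "meets_numeric_minimum": len(numeric) >= MIN_NUMERIC,
    }


# Hypothetical three-row sample; a real data set would be opened with
# open("your_file.csv", newline="") instead of io.StringIO.
sample = "price,sqft,city\n100,50,Austin\n200,,Dallas\n150,70,Austin\n"
report = summarize(io.StringIO(sample))
print(report)
```

A column with many missing values shows up in `missing_by_column`, which gives a quick way to judge whether the missing data is "extensive" before any cleaning is done.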

In the Data Dictionary worksheet:

1. Create three labeled columns: Variable Name, Data Type, Explanation of Variable.

2. Make sure to indicate which variable is the dependent variable and which are the independent variables.

3. List your variables and complete the Data Dictionary worksheet.
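
A first draft of the Data Dictionary can be generated from the data itself and then completed by hand. The sketch below, again standard-library Python, produces one row per variable with the three required columns. The names `build_dictionary` and `infer_type`, the "Numeric"/"Text" labels, and the choice to record the dependent or independent role at the start of the explanation are all assumptions for illustration; the assignment does not prescribe where the role is written. The output is CSV text that can be pasted or imported into the "Data Dictionary" worksheet; writing the Excel workbook directly would need a third-party library and is not shown.

```python
import csv
import io

HEADERS = ["Variable Name", "Data Type", "Explanation of Variable"]


def infer_type(values):
    """Label a column 'Numeric' if all non-empty values parse as numbers."""
    present = [v.strip() for v in values if v and v.strip()]
    if not present:
        return "Unknown"
    try:
        for v in present:
            float(v)
    except ValueError:
        return "Text"
    return "Numeric"


def build_dictionary(csv_file, dependent, explanations):
    """Return one data-dictionary row (a dict) per column of the CSV."""
    reader = csv.DictReader(csv_file)
    rows = list(reader)
    columns = reader.fieldnames or []
    if dependent not in columns:
        raise ValueError(f"dependent variable {dependent!r} not in data set")
    entries = []
    for col in columns:
        role = ("Dependent variable" if col == dependent
                else "Independent variable")
        # Placeholder text marks explanations that still need to be written.
        note = explanations.get(col, "TODO: describe this variable")
        entries.append({
            "Variable Name": col,
            "Data Type": infer_type([r.get(col) for r in rows]),
            "Explanation of Variable": f"{role}. {note}",
        })
    return entries


# Hypothetical sample data and explanations, for illustration only.
sample = "price,sqft,city\n100,50,Austin\n200,,Dallas\n"
entries = build_dictionary(
    io.StringIO(sample),
    dependent="price",
    explanations={"sqft": "Floor area in square feet"},
)

# Render the dictionary as CSV text for the Data Dictionary worksheet.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=HEADERS)
writer.writeheader()
writer.writerows(entries)
dictionary_text = buffer.getvalue()
print(dictionary_text)
```

The inferred data types are only a starting point; review each row, replace every placeholder explanation, and confirm that exactly one variable is marked as dependent.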
