
Data Science & Big Data Analysis Assignment

Guidelines

  • Share screenshots with your responses
  • Share the code and the plots
  • Include your name and ID number
  • Clearly mark each question number
  • Upload a Word document
  • Insert a cover page listing the questions attempted

HW09 – Review of Major Topics

Q1 – Classifier Performance Comparison

Q1a – Analyze the data set Social_Network_Ads.csv and create the plot with correctly labeled axes.

Q1b – Use the following classifiers:

  • Naïve Bayes
  • Logistic Regression
  • Decision Trees
  • KNN
  • Support Vector Machine
  • Random Forest

For each classifier, show:

  • The decision boundary for the training set and the test set
  • Your first name printed on all graphs
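One way to draw a decision boundary is to predict the class at every point of a fine grid and shade the result. The following is a minimal sketch of that idea, not a reference solution. It assumes scikit-learn and matplotlib are installed, uses synthetic two-feature data as a stand-in because the columns of Social_Network_Ads.csv are not listed here, and uses KNN only as an example classifier. The helper name `boundary_grid` and the title text `YourName` are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; remove this line to open plot windows
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in: two numeric features and a binary label (NOT the real CSV).
X, y = make_classification(n_samples=400, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Example classifier; any fitted scikit-learn classifier can be swapped in.
clf = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)


def boundary_grid(model, X, step=0.05, pad=1.0):
    """Predict the class at every point of a regular grid covering X."""
    xs = np.arange(X[:, 0].min() - pad, X[:, 0].max() + pad, step)
    ys = np.arange(X[:, 1].min() - pad, X[:, 1].max() + pad, step)
    xx, yy = np.meshgrid(xs, ys)
    zz = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    return xx, yy, zz


# One panel for the training set, one for the test set.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
panels = [(X_train, y_train, "training"), (X_test, y_test, "test")]
for ax, (X_part, y_part, name) in zip(axes, panels):
    xx, yy, zz = boundary_grid(clf, X_part)
    ax.contourf(xx, yy, zz, alpha=0.3)                  # shaded class regions
    ax.scatter(X_part[:, 0], X_part[:, 1], c=y_part, edgecolors="k", s=20)
    ax.set_title(f"KNN ({name} set) - YourName")        # placeholder name
    ax.set_xlabel("Feature 1 (scaled)")
    ax.set_ylabel("Feature 2 (scaled)")
fig.tight_layout()
```

To keep the figure, call `fig.savefig(...)` with a file name of your choice. Repeating the loop with a different `clf` gives the corresponding plots for the other classifiers.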

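Q1c – Compare the confusion-matrix results of the classifiers on the above data set by filling in the following table:

Classifier             | TP | TN | FP | FN | Accuracy
-----------------------+----+----+----+----+---------
Naïve Bayes            |    |    |    |    |
Logistic Regression    |    |    |    |    |
Decision Trees         |    |    |    |    |
KNN                    |    |    |    |    |
Support Vector Machine |    |    |    |    |
Random Forest          |    |    |    |    |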
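The entries of the table above can be counted directly from the true and predicted labels. The sketch below is illustrative only: the function name `confusion_counts` and the toy label lists are made up for the example, and class 1 is assumed to be the positive class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP, FN for a binary problem and derive the accuracy."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # predicted positive, actually positive
        elif pred != positive and truth != positive:
            tn += 1          # predicted negative, actually negative
        elif pred == positive:
            fp += 1          # predicted positive, actually negative
        else:
            fn += 1          # predicted negative, actually positive
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn, "Accuracy": accuracy}


# Toy labels for illustration only (not taken from the assignment data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts)  # {'TP': 3, 'TN': 3, 'FP': 1, 'FN': 1, 'Accuracy': 0.75}
```

If scikit-learn is used instead, `sklearn.metrics.confusion_matrix(y_true, y_pred).ravel()` returns the same four counts for a binary problem in the order TN, FP, FN, TP.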

Q2 – Principal Component Analysis

Using the following link, summarize how the PCA algorithm works and recreate the code for the Iris data set.

https://plot.ly/ipython-notebooks/principal-component-analysis/
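As a starting point for the summary, the usual PCA recipe is: standardize the features, compute their covariance matrix, take its eigendecomposition, sort the eigenvectors by eigenvalue, and project the data onto the leading ones. The sketch below follows that recipe with NumPy. It assumes the Iris data is loaded through scikit-learn's `load_iris`, and it is not the code from the linked notebook.

```python
import numpy as np
from sklearn.datasets import load_iris

X = load_iris().data                      # 150 samples x 4 features

# 1. Standardize each feature to zero mean and unit variance.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized features (4 x 4).
cov = np.cov(X_std, rowvar=False)

# 3. Eigendecomposition; eigh is appropriate because cov is symmetric.
eig_vals, eig_vecs = np.linalg.eigh(cov)

# 4. Sort the eigenpairs from largest to smallest eigenvalue.
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# 5. Fraction of the total variance carried by each component.
explained = eig_vals / eig_vals.sum()

# 6. Project onto the two leading principal components.
W = eig_vecs[:, :2]                       # 4 x 2 projection matrix
Z = X_std @ W                             # 150 x 2 reduced data

print("explained variance ratio:", np.round(explained, 4))
print("projected shape:", Z.shape)
```

Plotting the two columns of `Z`, colored by species, gives the familiar two-dimensional view of the data set.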

Q3 – Review the material on PCA at the following link and describe visually how PCA works (use snapshots).

http://setosa.io/ev/principal-component-analysis/

Q4 – LDA: Explain how LDA differs from PCA.

https://sebastianraschka.com/Articles/2014_python_lda.html

Q5 – Compare the accuracy of LDA and PCA as dimensionality-reduction techniques on the Wine data.
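One possible setup for this comparison is to reduce the data to two dimensions with each technique and feed the result to the same classifier. The sketch below makes several assumptions that the assignment does not state: the Wine data is taken from scikit-learn's `load_wine`, both reductions keep two components, logistic regression is the downstream classifier, and 30% of the samples are held out for testing.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)         # 13 features, 3 classes
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# PCA ignores the labels (unsupervised); LDA uses them (supervised).
reducers = {
    "PCA": PCA(n_components=2),
    "LDA": LinearDiscriminantAnalysis(n_components=2),
}

scores = {}
for name, reducer in reducers.items():
    # Same scaling and same classifier for both, so only the reduction differs.
    model = make_pipeline(StandardScaler(), reducer,
                          LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: test accuracy = {scores[name]:.3f}")
```

Because a single train/test split can favor either method by chance, repeating the comparison over several random seeds or with cross-validation gives a fairer picture.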
