Part A on Decision Trees:
- Please work through the example "Identifying risky bank loans using C5.0 decision trees" (see Ch. 5 of Machine Learning with R, Second Edition, pp. 136-149), using the same dataset, which was donated to the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml) by Hans Hofmann of the University of Hamburg. The dataset contains information on loans obtained from a credit agency in Germany.
- Follow the same five steps to get the same results as in the text.
- Give a good summary and conclusion of your findings with insight.
Part B on the Decision Rules:
- Please work through the example "Identifying poisonous mushrooms with rule learners" (Ch. 5, pp. 160-168). Use the Mushroom dataset by Jeff Schlimmer of Carnegie Mellon University. The raw dataset is freely available at the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml).
- Follow the same five steps as in the example to get similar results as in the text.
- Give a good summary and conclusion of your findings with insight.
The project should include a cover page and a references page, follow APA format, and contain at least 1000 words (excluding the title page and references page). Please use subheadings to make your assignment reader-friendly.
References:
- Lantz, B. (2015). Machine Learning with R (2nd ed.), Ch. 5, pp. 136-149.
- APA format https://owl.english.purdue.edu/owl/resource/560/01/
Note:
There is a glitch in one of the statements on page 141 used in the Week 3 assignment.
The statement below will not work, because the C5.0() function in the C50 package now accepts only a factor outcome.
credit_model <- C5.0(credit_train[-17], credit_train$default)
Change it to the statement below, which should work.
credit_model <- C5.0(credit_train[-17], factor(credit_train$default))
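To see why the factor() wrapper matters, here is a minimal sketch on made-up data. The toy_train data frame and its columns are hypothetical stand-ins for credit_train (which has 17 columns in the text), and the sketch assumes the C50 package is installed.

```r
# Minimal sketch, not the textbook's code: toy_train is a hypothetical
# stand-in for credit_train, with the outcome stored as plain text.
library(C50)  # assumes install.packages("C50") has been run

toy_train <- data.frame(
  amount  = c(1200, 5400, 800, 9100, 3000, 7600, 1500, 6800),
  months  = c(12, 48, 6, 60, 24, 36, 12, 48),
  default = c("no", "yes", "no", "yes", "no", "yes", "no", "yes"),
  stringsAsFactors = FALSE
)

# The outcome is a character vector here, which C5.0() will not accept.
is.factor(toy_train$default)   # FALSE

# Convert the outcome to a factor before passing it to C5.0().
outcome <- factor(toy_train$default)

# Column 3 is the outcome in this toy data, so it is dropped from the
# predictors, mirroring credit_train[-17] in the statement above.
toy_model <- C5.0(toy_train[-3], outcome)
```

An alternative that may be tidier is to convert the column once, for example credit_train$default <- factor(credit_train$default), so that later steps see the same type. Whether the same conversion is needed for credit_test depends on how the data was read in.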

