I’m working on a computer science multi-part question and need a reference to help me learn.
Question 1
Show that the entropy of a node never increases after splitting it into smaller successor nodes.
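For reference, the claim can be stated formally as follows (notation is mine, not from the chapter): if a parent node with $n$ records is split into children with $n_j$ records each, the goal is to show

```latex
H(\text{parent}) \;\ge\; \sum_{j} \frac{n_j}{n}\, H(\text{child}_j)
```

where $H$ is the entropy of the class distribution at a node. The standard route is to use the concavity of $H$ together with Jensen's inequality, since the parent's class distribution is the $\frac{n_j}{n}$-weighted average of the children's distributions.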
Question 2 (everything below belongs to question 2)
Compute a two-level decision tree using the greedy approach described in this chapter. Use the classification error rate as the criterion for splitting. What is the overall error rate of the induced tree?
Note: To determine the test condition at the root node, you first need to compute the error rates for attributes X, Y, and Z.
For attribute X the corresponding counts are:
| X | c1 | c2 |
|---|----|----|
| 0 | 60 | 60 |
| 1 | 40 | 40 |
For Y the corresponding counts are:
| Y | c1 | c2 |
|---|----|----|
| 0 | 40 | 60 |
| 1 | 60 | 40 |
For Z the corresponding counts are:
| Z | c1 | c2 |
|---|----|----|
| 0 | 30 | 70 |
| 1 | 70 | 30 |
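If it helps, here is a small Python sketch (my own scaffolding, not from the chapter) that computes the weighted classification error rate of each candidate root split from the counts above. Each child node predicts its majority class, so its contribution to the error is the minority count:

```python
# Class counts per attribute value, taken from the tables above.
# Format: attribute -> {value: (count of c1, count of c2)}
counts = {
    "X": {0: (60, 60), 1: (40, 40)},
    "Y": {0: (40, 60), 1: (60, 40)},
    "Z": {0: (30, 70), 1: (70, 30)},
}

def split_error(table):
    """Weighted classification error rate of splitting on this attribute."""
    total = sum(c1 + c2 for c1, c2 in table.values())
    # Each child predicts its majority class, so the minority count
    # in each child is the number of misclassified records.
    wrong = sum(min(c1, c2) for c1, c2 in table.values())
    return wrong / total

for attr, table in counts.items():
    print(attr, split_error(table))
```

Under these counts the errors come out to 0.5 for X, 0.4 for Y, and 0.3 for Z, so the greedy approach would place Z at the root; the second level is then chosen the same way on the records reaching each child.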