Consider the data in the table above (see attachment). The target variable is salary
Question
Consider the data in the table above (see attachment). The target variable is salary. Start by discretizing salary as follows:
Less than $35,000: Level 1
$35,000 to less than $45,000: Level 2
$45,000 to less than $55,000: Level 3
Above $55,000: Level 4
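The discretization above can be sketched in a few lines of Python. This is only an illustration: the names `CUTS` and `salary_level` are made up here, and placing a salary of exactly $55,000 in Level 4 is an assumption, since the question does not say where that boundary value belongs.

```python
from bisect import bisect_right

# Cut points taken from the salary bands in the question.
CUTS = [35_000, 45_000, 55_000]


def salary_level(salary: float) -> int:
    """Return the salary level (1 to 4) for a given salary.

    Each band includes its lower bound and excludes its upper bound.
    Assumption: exactly $55,000 is treated as Level 4, because the
    question leaves that boundary unspecified.
    """
    # bisect_right counts how many cut points are <= salary.
    return bisect_right(CUTS, salary) + 1


for s in (30_000, 35_000, 44_999, 50_000, 60_000):
    print(s, "-> Level", salary_level(s))
```

Applying `salary_level` to the salary column of the attached table gives the categorical target that both trees are asked to predict.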
5. Construct a classification and regression tree to classify salary based on the other variables. Do as much as you can by hand, before turning to the software.
6. Construct a C4.5 decision tree to classify salary based on the other variables. Do as much as you can by hand, before turning to the software.
7. Compare the two decision trees and discuss the benefits and drawbacks of each.
8. Generate the full set of decision rules for the CART decision tree.
9. Generate the full set of decision rules for the C4.5 decision tree.
10. Compare the two sets of decision rules and discuss the benefits and drawbacks of each.
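For the hand calculations in questions 5 and 6, it may help to see how candidate splits are scored. The sketch below uses Gini impurity, the criterion most commonly associated with CART, and the gain ratio used by C4.5. The function names are illustrative, the labels are hypothetical rather than taken from the attached table, and a given textbook may define CART's goodness-of-split measure differently.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini_reduction(parent, children):
    """Decrease in Gini impurity achieved by a split (CART-style score)."""
    n = len(parent)
    return gini(parent) - sum(len(ch) / n * gini(ch) for ch in children)


def gain_ratio(parent, children):
    """Information gain divided by split information (C4.5-style score)."""
    n = len(parent)
    gain = entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)
    split_info = -sum(len(ch) / n * log2(len(ch) / n) for ch in children)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical salary levels for four records, split into two branches.
parent = [1, 1, 2, 2]
children = [[1, 1], [2, 2]]
print(gini_reduction(parent, children))
print(gain_ratio(parent, children))
```

CART evaluates only binary splits, so `children` always has two lists there, while C4.5 may create one branch per category of an attribute. This difference is one of the points to raise in the comparison in question 7.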
Explanation / Answer
Level 1: less than $35,000
yes No (go to level 2)
in service
yes No
Gender In Staff
yes No Yes
M F Gender
Yes NO
M F
Level 2: $35,000 to less than $45,000
yes No (go to level 3)
in Service
yes No
Gender In Sales
yes No yes No
M F Gender In Staff
yes No yes
M F Gender
yes no
F M
Level 3: $45,000 to less than $55,000
yes No (go to level 4)
in service
yes No
Gender In Management
yes No yes No
F M Gender In Sales
yes no Gender
M F yes No
F M
Level 4: above $55,000
yes
In Management
Gender
yes No
F M
Combining the trees for all four levels gives the classification tree for the target variable salary.
Similarly, the regression tree and the C4.5 decision tree are formed using their respective procedures.
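Once a tree has been built, its decision rules (questions 8 and 9) are simply its root-to-leaf paths. The following is a minimal sketch of that enumeration, assuming a tree stored as nested tuples of the form `(attribute, {value: subtree})` with class labels at the leaves. The representation, the function name `extract_rules`, and the toy tree are all hypothetical and are not derived from the attached data.

```python
def extract_rules(tree, conditions=()):
    """Return one IF-THEN rule per root-to-leaf path of the tree."""
    if not isinstance(tree, tuple):
        # Leaf node: the accumulated conditions form the rule antecedent.
        antecedent = " AND ".join(conditions) or "TRUE"
        return [f"IF {antecedent} THEN salary = {tree}"]
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        # Extend the path with this branch's test and recurse.
        rules += extract_rules(subtree, conditions + (f"{attribute} = {value}",))
    return rules


# Hypothetical tree for illustration only; not the answer to the exercise.
toy_tree = ("occupation", {
    "Management": "Level 4",
    "Service": ("gender", {"F": "Level 1", "M": "Level 2"}),
})

for rule in extract_rules(toy_tree):
    print(rule)
```

Because every leaf yields exactly one rule, a CART tree with only binary splits tends to produce longer rules than a C4.5 tree with multiway splits over the same attributes, which is relevant to the comparison in question 10.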