
Consider the data in the table above (see attachment). The target variable is salary.


Question

Consider the data in the table above (see attachment). The target variable is salary. Start by discretizing salary as follows:

Less than $35,000: Level 1
$35,000 to less than $45,000: Level 2
$45,000 to less than $55,000: Level 3
Above $55,000: Level 4
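The discretization above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original question: the function name `salary_level` and the constant `CUTOFFS` are made up here, and the question does not say where a salary of exactly $55,000 belongs, so the sketch assumes it falls in Level 4.

```python
from bisect import bisect_right

# Lower bounds of Levels 2, 3, and 4, taken from the bins in the question.
CUTOFFS = [35_000, 45_000, 55_000]


def salary_level(salary: float) -> int:
    """Map a salary to its level (1 to 4).

    Assumption: a salary of exactly $55,000 is placed in Level 4,
    because the question leaves that boundary unspecified.
    """
    # bisect_right counts how many cutoffs are <= salary.
    return bisect_right(CUTOFFS, salary) + 1


# Hypothetical salaries, used only to exercise the bin boundaries.
for s in (34_999, 35_000, 44_999, 45_000, 60_000):
    print(s, "-> Level", salary_level(s))
```

Using `bisect_right` keeps each lower bound inclusive, which matches the "to less than" wording of Levels 2 and 3.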


5. Construct a classification and regression tree to classify salary based on the other variables. Do as much as you can by hand, before turning to the software.
6. Construct a C4.5 decision tree to classify salary based on the other variables. Do as much as you can by hand, before turning to the software.
7. Compare the two decision trees and discuss the benefits and drawbacks of each.
8. Generate the full set of decision rules for the CART decision tree.
9. Generate the full set of decision rules for the C4.5 decision tree.
10. Compare the two sets of decision rules and discuss the benefits and drawbacks of each.
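For the "by hand" part of tasks 5 and 6, it may help to check split scores numerically. The sketch below assumes the common textbook pairing: CART scored with the Gini index on a binary split, and C4.5 scored with the gain ratio on a multiway split. Some CART presentations use a different goodness-of-split measure, so treat the Gini choice as an assumption. All function names and the toy records are hypothetical; the real table is in the attachment and is not reproduced here.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def partition(values, labels):
    """Group the labels by the attribute value of each record."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    return list(groups.values())


def gini_reduction(values, labels):
    """Drop in Gini impurity from a split (CART-style when values are two-valued)."""
    n = len(labels)
    weighted = sum(len(g) / n * gini(g) for g in partition(values, labels))
    return gini(labels) - weighted


def gain_ratio(values, labels):
    """Information gain divided by split information (C4.5-style)."""
    n = len(labels)
    groups = partition(values, labels)
    info_gain = entropy(labels) - sum(len(g) / n * entropy(g) for g in groups)
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy records (NOT the table from the question).
levels = [1, 1, 2, 2, 3, 4]
in_service = [True, True, True, False, False, False]
occupation = ["Service", "Service", "Service", "Sales", "Management", "Management"]

print("Gini reduction, binary split on 'in service':", gini_reduction(in_service, levels))
print("Gain ratio, multiway split on occupation:", gain_ratio(occupation, levels))
```

At each node, the candidate split with the highest score would be chosen, and the same calculation repeated on each child node.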

Explanation / Answer

Level 1: salary less than $35,000?
    Yes:
        In Service?
            Yes: Gender (Yes: M, No: F)
            No: In Staff?
                Yes: Gender (Yes: M, No: F)
    No: go to Level 2

Level 2: salary $35,000 to less than $45,000?
    Yes:
        In Service?
            Yes: Gender (Yes: M, No: F)
            No: In Sales?
                Yes: Gender (Yes: M, No: F)
                No: In Staff?
                    Yes: Gender (Yes: F, No: M)
    No: go to Level 3

Level 3: salary $45,000 to less than $55,000?
    Yes:
        In Service?
            Yes: Gender (Yes: F, No: M)
            No: In Management?
                Yes: Gender (Yes: M, No: F)
                No: In Sales?
                    Gender (Yes: F, No: M)
    No: go to Level 4

Level 4: salary above $55,000?
    Yes:
        In Management?
            Gender (Yes: F, No: M)

Combining the trees for all four levels gives the classification tree for the target variable salary.

Similarly, the regression tree and the C4.5 decision tree are formed using their respective procedures.
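For tasks 8 and 9, the decision rules can be read off a finished tree by walking every root-to-leaf path. The sketch below assumes the tree is stored as nested dictionaries, which is a representation chosen here for illustration. The names `extract_rules`, `format_rule`, and `toy_tree` are hypothetical, and the leaf assignments in `toy_tree` are invented, not derived from the question's data.

```python
def extract_rules(node, conditions=()):
    """Yield one (conditions, prediction) pair per leaf of the tree."""
    if not isinstance(node, dict):
        # A leaf holds the predicted class directly.
        yield list(conditions), node
        return
    for value, child in node["branches"].items():
        step = (node["attribute"], value)
        yield from extract_rules(child, conditions + (step,))


def format_rule(conditions, prediction):
    """Render one path as an IF ... THEN ... decision rule."""
    antecedent = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
    return f"IF {antecedent} THEN salary = {prediction}"


# Hypothetical tree for illustration only; leaves are NOT from the real table.
toy_tree = {
    "attribute": "Occupation",
    "branches": {
        "Service": "Level 1",
        "Management": {
            "attribute": "Gender",
            "branches": {"Female": "Level 4", "Male": "Level 3"},
        },
    },
}

rules = [format_rule(c, p) for c, p in extract_rules(toy_tree)]
for rule in rules:
    print(rule)
```

The same walk works for a CART tree (two branches per node) and a C4.5 tree (one branch per attribute value), which makes the two rule sets easy to compare side by side.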