Decision Tree Learning
Presented by Ping Zhang, Nov. 26th, 2007
Introduction • Decision tree learning is one of the most widely used and practical methods for inductive inference • Decision tree learning is a method for approximating discrete-valued target functions, in which the learned function is represented by a decision tree • Decision tree learning is robust to noisy data and capable of learning disjunctive expressions
Decision tree representation • Decision trees classify instances by sorting them down the tree from the root to some leaf node, which provides the classification of the instance • Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values for this attribute
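The sorting-down process described above can be sketched in a few lines of Python. This is a hedged illustration, not from the slides: the tree is stored as nested dicts (internal nodes map an attribute to its branches, leaves are class labels), and the attribute names follow the classic PlayTennis example.

```python
# Illustrative decision tree (assumed example, in the style of PlayTennis).
# Internal nodes: {attribute: {value: subtree}}; leaves: class labels.
tree = {"Outlook": {
    "Sunny":    {"Humidity": {"High": "No", "Normal": "Yes"}},
    "Overcast": "Yes",
    "Rain":     {"Wind": {"Strong": "No", "Weak": "Yes"}},
}}

def classify(node, instance):
    """Sort an instance down the tree from the root to a leaf."""
    while isinstance(node, dict):
        attribute = next(iter(node))                 # attribute tested here
        node = node[attribute][instance[attribute]]  # follow matching branch
    return node                                      # leaf = classification

print(classify(tree, {"Outlook": "Sunny", "Humidity": "Normal"}))  # -> Yes
```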
When to Consider Decision Trees • Instances describable by attribute-value pairs • Target function is discrete-valued • Disjunctive hypotheses may be required • Possibly noisy training data Examples (classification problems): • Equipment or medical diagnosis • Credit risk analysis
Hypothesis Space Search by ID3 • Hypothesis space is complete: the target function is surely in there • Outputs only a single hypothesis • No backtracking: can get stuck in local minima • Statistically-based search choices: robust to noisy data • Inductive bias: "prefer the shortest tree"
From ID3 to C4.5 C4.5 made a number of improvements to ID3. Some of these are: • Handling both continuous and discrete attributes • Handling training data with missing attribute values • Handling attributes with differing costs • Pruning trees after creation
Rule Post-Pruning • Convert the tree to an equivalent set of rules • Prune each rule by removing any preconditions whose removal improves its estimated accuracy • Sort the pruned rules by their estimated accuracy, and consider them in this sequence when classifying subsequent instances • Perhaps the most frequently used method
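The precondition-dropping step can be sketched as a greedy loop. This is a hedged sketch, not the slides' algorithm in detail: a rule is a list of (attribute, value) preconditions, and accuracy is estimated on a held-out validation set (the data layout is an assumption for illustration).

```python
def accuracy(rule, label, validation):
    """Fraction of validation instances matching the rule that carry its label."""
    matches = [e for e in validation if all(e.get(a) == v for a, v in rule)]
    if not matches:
        return 0.0
    return sum(e["label"] == label for e in matches) / len(matches)

def prune_rule(rule, label, validation):
    """Greedily drop any precondition whose removal does not hurt estimated accuracy."""
    rule = list(rule)
    improved = True
    while improved:
        improved = False
        for pre in list(rule):
            candidate = [p for p in rule if p != pre]
            if accuracy(candidate, label, validation) >= accuracy(rule, label, validation):
                rule = candidate
                improved = True
                break
    return rule

# Assumed toy validation set: the Outlook=Sunny precondition is redundant here.
validation = [
    {"Outlook": "Sunny", "Humidity": "High",   "label": "No"},
    {"Outlook": "Sunny", "Humidity": "Normal", "label": "Yes"},
    {"Outlook": "Rain",  "Humidity": "Normal", "label": "Yes"},
]
rule = [("Outlook", "Sunny"), ("Humidity", "Normal")]
print(prune_rule(rule, "Yes", validation))  # -> [('Humidity', 'Normal')]
```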
Continuous-Valued Attributes • Create a discrete attribute to test the continuous one • There are two candidate thresholds • The information gain can be computed for each candidate attribute, Temperature>54 and Temperature>85, and the best can be selected (Temperature>54)
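Candidate thresholds lie midway between adjacent sorted values whose class labels differ; the one with the highest information gain is selected. A sketch using the Temperature data that produces the slide's two candidates, 54 and 85 (the six example values are the standard ones from Mitchell's text):

```python
import math

def entropy(labels):
    total = len(labels)
    counts = {l: labels.count(l) for l in set(labels)}
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def best_threshold(values, labels):
    """Try a threshold midway between each adjacent pair of sorted values
    with differing labels; return (threshold, gain) with the highest gain."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best = (None, -1.0)
    for (v1, l1), (v2, l2) in zip(pairs, pairs[1:]):
        if l1 == l2:
            continue                      # only label boundaries matter
        t = (v1 + v2) / 2
        below = [l for v, l in pairs if v <= t]
        above = [l for v, l in pairs if v > t]
        gain = base - (len(below) / len(pairs) * entropy(below)
                       + len(above) / len(pairs) * entropy(above))
        if gain > best[1]:
            best = (t, gain)
    return best

temps = [40, 48, 60, 72, 80, 90]
play  = ["No", "No", "Yes", "Yes", "Yes", "No"]
print(best_threshold(temps, play)[0])  # -> 54.0 (candidates were 54 and 85)
```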
Attributes with Many Values Problems: • If an attribute has many values, Gain will select it • Imagine using the attribute Date: it would have the highest information gain of any of the attributes, but the resulting decision tree is not useful.
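C4.5 counters this bias with the gain ratio, which divides Gain(S, A) by SplitInformation(S, A), the entropy of S with respect to the values of A itself. A sketch (the dict-per-example data format is assumed for illustration):

```python
import math

def split_information(examples, attribute):
    """SplitInformation(S, A) = -sum_i |S_i|/|S| * log2(|S_i|/|S|),
    where S_i are the subsets induced by the values of A."""
    total = len(examples)
    counts = {}
    for e in examples:
        counts[e[attribute]] = counts.get(e[attribute], 0) + 1
    return -sum(n / total * math.log2(n / total) for n in counts.values())

def gain_ratio(gain, split_info):
    """GainRatio(S, A) = Gain(S, A) / SplitInformation(S, A)."""
    return gain / split_info

# An attribute like Date, unique per example: 8 examples -> log2(8) = 3 bits
# of split information, which heavily penalizes its (spuriously high) gain.
examples = [{"Date": i} for i in range(8)]
print(split_information(examples, "Date"))  # -> 3.0
```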
Attributes with Costs • Consider medical diagnosis, where BloodTest has a cost of 150 dollars • How to learn a consistent tree with low expected cost?
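One approach discussed in Mitchell's text is to replace information gain with a cost-sensitive selection measure; two proposals from the literature are Tan and Schlimmer's Gain(S, A)^2 / Cost(A) and Nunez's (2^Gain(S, A) - 1) / (Cost(A) + 1)^w. A hedged sketch of both:

```python
def tan_measure(gain, cost):
    """Tan and Schlimmer's measure: Gain(S, A)^2 / Cost(A)."""
    return gain ** 2 / cost

def nunez_measure(gain, cost, w=1.0):
    """Nunez's measure: (2^Gain(S, A) - 1) / (Cost(A) + 1)^w,
    where w in [0, 1] controls how strongly cost is weighted."""
    return (2 ** gain - 1) / (cost + 1) ** w

# A cheap, moderately informative test can beat an expensive, informative one:
print(tan_measure(0.5, 10) > tan_measure(0.9, 150))  # -> True
```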
Conclusion Decision tree learning is • Simple to understand and interpret • Requires little data preparation • Able to handle both numerical and categorical data • Uses a white-box model • Possible to validate a model using statistical tests • Robust; performs well on large data in a short time