
Machine Learning

Machine Learning. RG Knowledge Based Systems Hans Kleine Büning 14 August 2014. Outline. Learning by Example Motivation Decision Trees ID3 Overfitting Pruning Exercise Reinforcement Learning Motivation Markov Decision Processes Q-Learning Exercise. Outline. Learning by Example


Presentation Transcript


  1. Machine Learning RG Knowledge Based Systems Hans Kleine Büning 14 August 2014

  2. Outline • Learning by Example • Motivation • Decision Trees • ID3 • Overfitting • Pruning • Exercise • Reinforcement Learning • Motivation • Markov Decision Processes • Q-Learning • Exercise

  3. Outline • Learning by Example • Motivation • Decision Trees • ID3 • Overfitting • Pruning • Exercise • Reinforcement Learning • Motivation • Markov Decision Processes • Q-Learning • Exercise

  4. Motivation • Partly inspired by human learning • Objectives: • Classify entities according to some given examples • Find structures in big databases • Gain new knowledge from the samples • Input: • Learning examples with • Assigned attributes • Assigned classes • Output: • General Classifier for the given task

  5. Classifying Training Examples • Training Example for EnjoySport • General Training Examples

  6. Attributes & Classes • Attribute: Ai • Number of different values for Ai: |Ai| • Class: Ci • Number of different classes: |C| • Premises: • n > 2 • Consistent examples (no two objects with the same attributes and different classes)

  7. Possible Solutions • Decision Trees • ID3 • C4.5 • CART • Rule Based Systems • Clustering • Neural Networks • Backpropagation • Neuroevolution

  8. Decision Trees • Idea: Classify entities using if-then rules • Example: Classifying mushrooms • Attributes: Colour, Size, Points • Classes: edible, poisonous • Resulting rules: • if (Colour = red) and (Size = small) then poisonous • if (Colour = green) then edible • …

  9. Decision Trees • Different decision trees can exist for the same task. • On average, the left tree reaches a decision earlier.

  10. How to measure tree quality? • Number of leaves? • Number of generated rules • Tree height? • Maximum rule length • External path length? • = Sum of the lengths of all paths from root to leaf • Amount of memory needed for all rules • Weighted external path length • Like external path length • Paths are weighted by the number of objects they represent
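The measures listed on this slide can be computed directly from a tree. The following is a minimal sketch, assuming a hypothetical nested-tuple representation in which a leaf is `("leaf", class_label, n_objects)` and an inner node is `("node", attribute, children)`. The representation, the function names and the toy tree are illustrative assumptions and do not come from the slides.

```python
def leaf_depths(tree, depth=0):
    """Yield (depth, n_objects) for every leaf of the tree."""
    if tree[0] == "leaf":
        yield depth, tree[2]
    else:
        for child in tree[2]:
            yield from leaf_depths(child, depth + 1)


def tree_quality(tree):
    """Compute the quality measures named on the slide."""
    leaves = list(leaf_depths(tree))
    return {
        "leaves": len(leaves),                       # = number of rules
        "height": max(d for d, _ in leaves),         # = maximum rule length
        "external_path_length": sum(d for d, _ in leaves),
        "weighted_external_path_length": sum(d * n for d, n in leaves),
    }


# Made-up toy tree: 3 objects end in the first leaf, 1 in each of the others.
toy_tree = ("node", "colour", [
    ("leaf", "edible", 3),
    ("node", "size", [
        ("leaf", "poisonous", 1),
        ("leaf", "edible", 1),
    ]),
])
quality = tree_quality(toy_tree)
```

For this toy tree the external path length is 1 + 2 + 2 = 5, while the weighted variant is 1·3 + 2·1 + 2·1 = 7, which rewards trees that classify most objects near the root.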

  11. Back to the Example

  12. Weighted External Path Length • Idea from information theory: • Given: • Text which should be compressed • Probabilities for character occurrence • Result: • Coding tree • Example: eeab • p(e) = 0.5 • p(a) = 0.25 • p(b) = 0.25 • Encoding: 110001 • Build tree according to the information content.
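The coding tree for the "eeab" example can be built with Huffman's algorithm, which is one standard way to construct such a tree; the slide does not name the construction it uses, so this choice is an assumption. The function name `huffman_code` is hypothetical, and the sketch assumes at least two symbols.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Return {symbol: bitstring} for a dict {symbol: probability}."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees into one.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


codes = huffman_code({"e": 0.5, "a": 0.25, "b": 0.25})
encoded = "".join(codes[ch] for ch in "eeab")
```

The exact bit pattern depends on how the branches are labelled with 0 and 1, so it need not equal the slide's 110001. The code lengths are the same, though: one bit for e, two bits each for a and b, and six bits for the whole text.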

  13. Entropy • Entropy = measure of the mean information content • In general: $H = -\sum_i p_i \log_2 p_i$ • Mean number of bits needed to encode each element under an optimal encoding (= mean height of the theoretically optimal encoding tree)
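As a small worked example, the Shannon entropy $H = -\sum_i p_i \log_2 p_i$ can be evaluated for the character probabilities of the "eeab" example. This is a sketch; the function name `entropy` is an illustrative choice.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; terms with p == 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# p(e) = 0.5, p(a) = 0.25, p(b) = 0.25 from the "eeab" example.
h = entropy([0.5, 0.25, 0.25])
```

Here `h` is 1.5 bits per character, so the four characters of "eeab" need 4 · 1.5 = 6 bits, which matches the length of the encoding 110001.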

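  14. Information Gain • Information gain = expected reduction of entropy due to sorting on an attribute • Conditional entropy: $H(C \mid A_k) = \sum_i p(a_i)\, H(C \mid a_i)$ • Information gain: $\mathrm{Gain}(C, A_k) = H(C) - H(C \mid A_k)$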

  15. Entropy & Decision Trees • Use conditional entropy and information gain for selecting split attributes. • Chosen split attribute: $A_k$ • Possible values for $A_k$: $a_1, \dots, a_{|A_k|}$ • $x_i$ – Number of objects with value $a_i$ for $A_k$ • $x_{i,j}$ – Number of objects with value $a_i$ for $A_k$ and class $C_j$ • Probability that one of the objects has attribute value $a_i$: $p(a_i) = x_i / n$, with $n$ the total number of objects • Probability that an object with attribute value $a_i$ has class $C_j$: $p(C_j \mid a_i) = x_{i,j} / x_i$
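The counts $x_i$ and $x_{i,j}$ are enough to compute both quantities. The sketch below assumes the usual definitions, $H(C \mid A_k) = \sum_i \frac{x_i}{n} H(C \mid a_i)$ and gain $= H(C) - H(C \mid A_k)$; the count table is made up for illustration and is not the mushroom data from the slides.

```python
import math


def entropy_from_counts(counts):
    """Entropy in bits of the distribution given by absolute counts."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def conditional_entropy(table):
    """table[i][j] = x_ij: objects with attribute value a_i and class C_j."""
    n = sum(sum(row) for row in table)
    return sum(sum(row) / n * entropy_from_counts(row)
               for row in table if sum(row) > 0)


def information_gain(table):
    """H(C) minus H(C | A_k), both derived from the same count table."""
    class_totals = [sum(col) for col in zip(*table)]
    return entropy_from_counts(class_totals) - conditional_entropy(table)


# Made-up counts: value a_1 has 2 + 2 objects, value a_2 has 4 + 0.
table = [[2, 2], [4, 0]]
h_cond = conditional_entropy(table)
gain = information_gain(table)
```

In this table the first value is maximally mixed (entropy 1) and the second is pure (entropy 0), so the conditional entropy is 0.5 · 1 + 0.5 · 0 = 0.5.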

  16. Decision Tree Construction • Choose the split attribute Ak which gives the highest information gain or, equivalently, the smallest conditional entropy H(C|Ak) • Example: colour

  17. Decision Tree Construction (2) • Analogously: • H(C|Acolour) = 0.4 • H(C|Asize)≈ 0.4562 • H(C|Apoints) = 0.4 • Choose colour or points as first split criterion • Recursively repeat this procedure

  18. Decision Tree Construction (3) • Right side is trivial: • Left side: both attributes have the same information gain
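The recursive construction described on slides 16 to 18 can be sketched as follows. This is a simplified ID3-style procedure under stated assumptions: examples are dicts of attribute values, the split attribute is the one with the smallest conditional entropy, and ties go to the first attribute in the list. The four training examples are made up for illustration; the slides' own mushroom table is not part of the transcript.

```python
import math
from collections import Counter


def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(rows, labels, attr):
    """H(C | attr): entropy of each value group, weighted by group size."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


def id3(rows, labels, attrs):
    """Return a class label (leaf) or a pair (attribute, {value: subtree})."""
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = min(attrs, key=lambda a: conditional_entropy(rows, labels, a))
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, c) for r, c in zip(rows, labels) if r[best] == value]
        branches[value] = id3([r for r, _ in subset],
                              [c for _, c in subset], rest)
    return best, branches


def classify(tree, row):
    """Follow the branches until a leaf (a plain class label) is reached."""
    while isinstance(tree, tuple):
        attr, branches = tree
        tree = branches[row[attr]]
    return tree


# Made-up training examples.
rows = [{"colour": "red", "size": "small"}, {"colour": "red", "size": "big"},
        {"colour": "green", "size": "small"}, {"colour": "green", "size": "big"}]
labels = ["poisonous", "edible", "edible", "edible"]
tree = id3(rows, labels, ["colour", "size"])
```

On this data both attributes have the same conditional entropy, so the sketch splits on colour first and then on size within the red branch. An attribute value that never occurred during training raises a `KeyError` in `classify`; a fuller implementation would fall back to a majority class.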

  19. Generalisation • The classifier should also be able to handle unknown data. • The classifying model is often called a hypothesis. • Testing generality: • Divide the samples into • Training set • Validation or test set • Learn according to the training set • Test generality according to the validation set • Error computation: • Test set X • Hypothesis h • error(X,h) – Function which is monotonically increasing in the number of examples in X wrongly classified by h
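A minimal sketch of this procedure, assuming samples are `(x, class)` pairs and taking the misclassification rate as one possible choice of error(X, h); the slide only requires that the function increases with the number of wrongly classified examples. The names `split_samples` and `always_true` and the toy data are hypothetical.

```python
import random


def split_samples(samples, validation_fraction=0.25, seed=0):
    """Shuffle a copy of the samples and return (training, validation)."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


def error(X, h):
    """Fraction of examples (x, c) in X that hypothesis h misclassifies."""
    return sum(1 for x, c in X if h(x) != c) / len(X)


# Made-up data: the class says whether the number is even.
samples = [(x, x % 2 == 0) for x in range(8)]
train, val = split_samples(samples)


def always_true(x):
    """A deliberately poor hypothesis that ignores its input."""
    return True


overall_error = error(samples, always_true)
```

The hypothesis would be learnt from `train` only, and `error(val, h)` would then estimate how well it generalises.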

  20. Overfitting • The learnt hypothesis performs well on the training set but badly on the validation set • Formally: h is overfitted if there exists a hypothesis h' with error(D,h) < error(D,h') and error(X,h) > error(X,h'), where X is the validation set and D is the training set

  21. Avoiding Overfitting • Stopping • Don't split further if some criterion holds • Examples: • Size of node n: don't split if n contains fewer than a threshold number of examples. • Purity of node n: don't split if the purity gain is not big enough. • Pruning • Reduce the decision tree after training. • Examples: • Reduced Error Pruning • Minimal Cost-Complexity Pruning • Rule Post-Pruning

  22. Pruning • Pruning Syntax: • If T0 was produced by (repeated) pruning on T we write

  23. Maximum Tree Creation • Before pruning we need a maximum tree Tmax • What is a maximum tree? • All leaf nodes are smaller than some threshold or • All leaf nodes represent only one class or • All leaf nodes have only objects with the same attribute values • Tmax is then pruned starting from the leaves.

  24. Reduced Error Pruning • 1. Consider a branch Tn of T • 2. Replace Tn by a leaf with the class that is most frequently associated with Tn • 3. If error(X, h(T)) < error(X, h(T/Tn)), take back the decision • 4. Go back to 1 until all non-leaf nodes have been considered
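The loop on this slide can be sketched as below. The tree representation is an assumption made for the example: a leaf is `{"label": c}` and an inner node is `{"attr": name, "branches": {...}, "majority": c}`, where `majority` stores the class most frequently associated with that branch. Visiting inner nodes bottom-up is also an assumption, in line with pruning from the leaves; the tree and validation set are made up.

```python
def classify(tree, row):
    while "label" not in tree:
        tree = tree["branches"][row[tree["attr"]]]
    return tree["label"]


def error(X, tree):
    """Misclassification rate of the tree on validation set X."""
    return sum(1 for row, c in X if classify(tree, row) != c) / len(X)


def inner_nodes(tree):
    """Yield inner nodes bottom-up, so pruning starts near the leaves."""
    if "label" in tree:
        return
    for child in tree["branches"].values():
        yield from inner_nodes(child)
    yield tree


def reduced_error_prune(tree, X):
    for node in list(inner_nodes(tree)):
        before = error(X, tree)
        saved = dict(node)
        node.clear()
        node["label"] = saved["majority"]  # replace the branch by a leaf
        if before < error(X, tree):        # the unpruned tree was better
            node.clear()
            node.update(saved)             # take back the decision
    return tree


# Made-up tree and validation set.
tree = {"attr": "colour", "majority": "edible", "branches": {
    "green": {"label": "edible"},
    "red": {"attr": "size", "majority": "poisonous", "branches": {
        "small": {"label": "poisonous"},
        "big": {"label": "edible"}}}}}
X = [({"colour": "red", "size": "small"}, "poisonous"),
     ({"colour": "red", "size": "big"}, "poisonous"),
     ({"colour": "green", "size": "small"}, "edible")]
pruned = reduced_error_prune(tree, X)
```

In this example the size test under the red branch is replaced by a leaf because that lowers the validation error, while the attempt to prune the root is taken back.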

  25. Exercise • Fred wants to buy a VW Beetle and classifies all offers into the classes interesting and uninteresting. Help Fred by creating a decision tree using the ID3 algorithm.

  26. Outline • Learning by Example • Motivation • Decision Trees • ID3 • Overfitting • Pruning • Exercise • Reinforcement Learning • Motivation • Markov Decision Processes • Q-Learning • Exercise

  27. How to program a robot to drive a bicycle?

  28. Reinforcement Learning: The Idea • A way of programming agents by reward and punishment without specifying how the task is to be achieved

  29. Learning to Balance on a Bicycle • States: • Angle of handle bars • Angular velocity of handle bars • Angle of bicycle to vertical • Angular velocity of bicycle to vertical • Acceleration of angle of bicycle to vertical

  30. Learning to Balance on a Bicycle • Actions: • Torque to be applied to the handle bars • Displacement of the center of mass from the bicycle's plane (in cm)

  31. Reward • Angle of bicycle to vertical is greater than 12°? • yes: Reward = -1 • no: Reward = 0

  32. Reinforcement Learning: Applications • Board Games • TD-Gammon program, based on reinforcement learning, has become a world-class backgammon player • Control a Mobile Robot • Learning to Drive a Bicycle • Navigation • Pole-balancing • Acrobot • Robot Soccer • Learning to Control Sequential Processes • Elevator Dispatching

  33. Deterministic Markov Decision Process

  34. Value of Policy and Agent’s Task

  35. Nondeterministic Markov Decision Process • Transition probabilities shown in the figure: P = 0.8, P = 0.1, P = 0.1

  36. Methods • Model (reward function and transition probabilities) is known: • Discrete states: Dynamic Programming • Continuous states: Value Function Approximation + Dynamic Programming • Model (reward function or transition probabilities) is unknown: • Discrete states: Reinforcement Learning • Continuous states: Value Function Approximation + Reinforcement Learning

  37. Q-learning Algorithm

  38. Q-learning Algorithm
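The transcript carries no text for the two algorithm slides, so the following is a sketch of the standard tabular Q-learning update, $Q(s,a) \leftarrow Q(s,a) + \alpha\,(r + \gamma \max_{a'} Q(s',a') - Q(s,a))$, and not necessarily the exact variant shown on the slides. The four-state chain world, the function names and all parameter values are hypothetical choices for illustration.

```python
import random


def q_learning(n_states, actions, step, episodes=200, alpha=0.5,
               gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning; step(s, a) returns (next_state, reward, done)."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(n_states) for a in actions}
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda act: q[(s, act)])
            s2, r, done = step(s, a)
            # Terminal states have no future value.
            target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q


def chain_step(s, a):
    """Hypothetical deterministic world: states 0..3, state 3 is the goal."""
    s2 = s + 1 if a == "right" else max(0, s - 1)
    done = s2 == 3
    return s2, (1.0 if done else 0.0), done


actions = ["left", "right"]
q = q_learning(4, actions, chain_step)
policy = {s: max(actions, key=lambda a: q[(s, a)]) for s in range(3)}
```

After training, the greedy policy moves right in every non-goal state, and the learnt values decay by roughly the factor gamma with each step of distance from the goal.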

  39. Example

  40. Example: Q-table Initialization

  41. Example: Episode 1

  42. Example: Episode 1

  43. Example: Episode 1

  44. Example: Episode 1

  45. Example: Episode 1

  46. Example: Q-table

  47. Example: Episode 1

  48. Episode 1

  49. Example: Q-table

  50. Example: Episode 2
