
Ch10 Machine Learning: Symbol-Based



Presentation Transcript


  1. Ch10 Machine Learning: Symbol-Based Dr. Bernard Chen Ph.D. University of Central Arkansas Spring 2011

  2. Machine Learning Outline • The book presents four chapters on machine learning, reflecting four approaches to the problem: • Symbol-Based • Connectionist • Genetic/Evolutionary • Stochastic

  3. Ch.10 Outline • A framework for Symbol-Based Learning • ID3 Decision Tree • Unsupervised Learning

  4. The Framework for Symbol-Based Learning

  5. The Framework Example • The data and its representation: • Size(small)^color(red)^shape(round) • Size(large)^color(red)^shape(round)

  6. The Framework Example • A set of operations: based on Size(small)^color(red)^shape(round), replacing a single constant with a variable produces the generalizations: Size(X)^color(red)^shape(round), Size(small)^color(X)^shape(round), Size(small)^color(red)^shape(X)
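
As an illustration (not from the slides), here is a minimal Python sketch of this generalization operator, assuming an instance is represented as a dictionary of property/value pairs and a variable is just a placeholder string:

```python
# Minimal sketch (assumed representation): an instance is a dict mapping each
# property to a constant; generalizing replaces one constant with a variable.

def generalize(instance, prop, var="X"):
    """Return a copy of the instance with the constant for `prop` replaced by a variable."""
    g = dict(instance)
    g[prop] = var
    return g

ball = {"size": "small", "color": "red", "shape": "round"}

# The three single-variable generalizations from the slide:
for p in ball:
    print(generalize(ball, p))
# {'size': 'X', 'color': 'red', 'shape': 'round'}
# {'size': 'small', 'color': 'X', 'shape': 'round'}
# {'size': 'small', 'color': 'red', 'shape': 'X'}
```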

  7. The Framework Example • The concept space • The learner must search this space to find the desired concept. • The complexity of this concept space is a primary measure of the difficulty of a learning problem

  8. The Framework Example

  9. The Framework Example • Heuristic search: given the positive instance Size(small)^color(red)^shape(round), the learner makes that example a candidate “ball” concept; this concept correctly classifies the only positive instance. • If the algorithm is given a second positive instance, Size(large)^color(red)^shape(round), the learner may generalize the candidate “ball” concept to Size(Y)^color(red)^shape(round)

  10. Learning process • The training data is a series of positive and negative examples of the concept: examples of blocks-world structures that fit the category, along with near misses. • The latter are instances that almost belong to the category but fail on one property or relation

  11. Examples and near misses for the concept arch

  12. Examples and near misses for the concept arch

  13. Examples and near misses for the concept arch

  14. Examples and near misses for the concept arch

  15. Learning process • This approach was proposed by Patrick Winston (1975) • The program performs a hill-climbing search on the concept space guided by the training data • Because the program does not backtrack, its performance is highly sensitive to the order of the training examples • A bad order can lead the program to dead ends in the search space

  16. Ch.10 Outline • A framework for Symbol-Based Learning • ID3 Decision Tree • Unsupervised Learning

  17. ID3 Decision Tree • ID3, like candidate elimination, induces concepts from examples • It is particularly interesting for • Its representation of learned knowledge • Its approach to the management of complexity • Its heuristic for selecting candidate concepts • Its potential for handling noisy data

  18. ID3 Decision Tree

  19. ID3 Decision Tree • The previous table can be represented as the following decision tree:

  20. ID3 Decision Tree • In a decision tree, each internal node represents a test on some property • Each possible value of that property corresponds to a branch of the tree • Leaf nodes represent classifications, such as low or moderate risk

  21. ID3 Decision Tree • A simplified decision tree for credit risk management

  22. ID3 Decision Tree • ID3 constructs decision trees in a top-down fashion. • ID3 selects a property to test at the current node of the tree and uses this test to partition the set of examples • The algorithm recursively constructs a sub-tree for each partition • This continues until all members of the partition are in the same class

  23. ID3 Decision Tree • For example, ID3 selects income as the root property for the first step

  24. ID3 Decision Tree

  25. ID3 Decision Tree • How to select the 1st node? (and the following nodes) • ID3 measures the information gained by making each property the root of current subtree • It picks the property that provides the greatest information gain

  26. ID3 Decision Tree • If we assume that all the examples in the table occur with equal probability, then: • P(risk is high)=6/14 • P(risk is moderate)=3/14 • P(risk is low)=5/14

  27. ID3 Decision Tree • Based on these probabilities, the information in the table as a whole is: I[6,3,5] = -(6/14)log2(6/14) - (3/14)log2(3/14) - (5/14)log2(5/14) = 1.531
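
A quick numeric check of this value, as a sketch in plain Python (only the standard math module is used):

```python
from math import log2

# Entropy of the credit-risk table: 6 high, 3 moderate, and 5 low-risk examples out of 14.
counts = [6, 3, 5]
total = sum(counts)
info = -sum((c / total) * log2(c / total) for c in counts)
print(round(info, 3))  # 1.531
```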

  28. ID3 Decision Tree

  29. ID3 Decision Tree • The information gain from income is: Gain(income) = I[6,3,5] - E[income] = 1.531 - 0.564 = 0.967 Similarly, • Gain(credit history) = 0.266 • Gain(debt) = 0.063 • Gain(collateral) = 0.206

  30. ID3 Decision Tree • Since income provides the greatest information gain, ID3 will select it as the root of the tree

  31. Attribute Selection Measure: Information Gain (ID3/C4.5) • Select the attribute with the highest information gain • Let pi be the probability that an arbitrary tuple in D belongs to class Ci, estimated by |Ci,D|/|D| • Expected information (entropy) needed to classify a tuple in D: Info(D) = -Σi pi log2(pi)

  32. Attribute Selection Measure: Information Gain (ID3/C4.5) • Information needed (after using A to split D into v partitions) to classify D: InfoA(D) = Σj (|Dj|/|D|) × Info(Dj) • Information gained by branching on attribute A: Gain(A) = Info(D) - InfoA(D)
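
A sketch of these three measures as Python functions, assuming a dataset is a list of dictionaries and `target` names the class attribute (these names are illustrative, not part of the slides):

```python
from collections import Counter
from math import log2

def info(rows, target):
    """Info(D) = -sum(p_i * log2(p_i)) over the class distribution of D."""
    counts = Counter(r[target] for r in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def info_a(rows, attr, target):
    """Info_A(D): weighted entropy after splitting D on attribute A."""
    n = len(rows)
    values = set(r[attr] for r in rows)
    return sum(
        (len(part) / n) * info(part, target)
        for part in ([r for r in rows if r[attr] == v] for v in values)
    )

def gain(rows, attr, target):
    """Gain(A) = Info(D) - Info_A(D)."""
    return info(rows, target) - info_a(rows, attr, target)
```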

  33. ID3 Decision Tree Pseudo Code
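
The pseudo-code figure itself does not survive in this transcript; the following is a compact recursive sketch of the ID3 idea in Python, reusing the `info` and `gain` helpers from the previous sketch (the tree representation and names are illustrative assumptions, not the book's pseudo code):

```python
from collections import Counter

# A tree is either a class label (leaf) or a dict:
#   {"attribute": A, "branches": {value: subtree, ...}}

def id3(rows, attrs, target):
    classes = Counter(r[target] for r in rows)
    # Base case: all examples in the partition belong to the same class,
    # or there are no attributes left to test (return the majority class).
    if len(classes) == 1 or not attrs:
        return classes.most_common(1)[0][0]
    # Choose the attribute with the greatest information gain as the test.
    best = max(attrs, key=lambda a: gain(rows, a, target))
    tree = {"attribute": best, "branches": {}}
    # Recursively build a sub-tree for each partition induced by the test.
    for v in set(r[best] for r in rows):
        part = [r for r in rows if r[best] == v]
        rest = [a for a in attrs if a != best]
        tree["branches"][v] = id3(part, rest, target)
    return tree
```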

  34. Another Decision Tree Example

  35. Decision Tree Example • Info(Tenured) = I(3,3) = -(3/6)log2(3/6) - (3/6)log2(3/6) = 1 • To compute log2 by hand, use log2(x) = log(x)/log(2); for example, log2(12) = 1.07918/0.30103 = 3.584958 • A tutorial on logarithms: http://www.ehow.com/how_5144933_calculate-log.html • Convenient tool: http://web2.0calc.com/

  36. Decision Tree Example • InfoRANK(Tenured) = 3/6 I(1,2) + 2/6 I(1,1) + 1/6 I(1,0) = 3/6 (0.918) + 2/6 (1) + 1/6 (0) = 0.79 • 3/6 I(1,2) means “Assistant Prof” has 3 out of 6 samples, with 1 yes and 2 no's. • 2/6 I(1,1) means “Associate Prof” has 2 out of 6 samples, with 1 yes and 1 no. • 1/6 I(1,0) means “Professor” has 1 out of 6 samples, with 1 yes and 0 no's.

  37. Decision Tree Example • InfoYEARS(Tenured) = 1/6 I(1,0) + 2/6 I(0,2) + 1/6 I(0,1) + 2/6 I(2,0) = 0 • 1/6 I(1,0) means “years=2” has 1 out of 6 samples, with 1 yes and 0 no's. • 2/6 I(0,2) means “years=3” has 2 out of 6 samples, with 0 yes's and 2 no's. • 1/6 I(0,1) means “years=6” has 1 out of 6 samples, with 0 yes's and 1 no. • 2/6 I(2,0) means “years=7” has 2 out of 6 samples, with 2 yes's and 0 no's.
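
These values can be verified with a short script; the `I` helper below mirrors the slide's I(·,·) notation (a sketch, not part of the original slides):

```python
from math import log2

def I(*counts):
    """Entropy of a class distribution given as counts, e.g. I(1, 2)."""
    n = sum(counts)
    return sum(-(c / n) * log2(c / n) for c in counts if c > 0)

# RANK split: 3/6 I(1,2) + 2/6 I(1,1) + 1/6 I(1,0)
info_rank = 3/6 * I(1, 2) + 2/6 * I(1, 1) + 1/6 * I(1, 0)
print(round(I(1, 2), 3), round(info_rank, 2))   # 0.918 0.79

# YEARS split: every partition is pure, so the weighted entropy is 0.
info_years = 1/6 * I(1, 0) + 2/6 * I(0, 2) + 1/6 * I(0, 1) + 2/6 * I(2, 0)
print(info_years)                                # 0.0
```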

  38. Ch.10 Outline • A framework for Symbol-Based Learning • ID3 Decision Tree • Unsupervised Learning

  39. Unsupervised Learning • The learning algorithms discussed so far implement forms of supervised learning • They assume the existence of a teacher, some fitness measure, or other external method of classifying training instances • Unsupervised learning eliminates the teacher and requires that the learners form and evaluate concepts on their own

  40. Unsupervised Learning • Science is perhaps the best example of unsupervised learning in humans • Scientists do not have the benefit of a teacher. • Instead, they propose hypotheses to explain their observations

  41. Unsupervised Learning • The clustering problem starts with (1) a collection of unclassified objects and (2) a means for measuring the similarity of objects • The goal is to organize the objects into classes that meet some standard of quality, such as maximizing the similarity of objects in the same class

  42. Unsupervised Learning • Numerical taxonomy is one of the oldest approaches to the clustering problem • A reasonable similarity metric treats each object as a point in n-dimensional space • The similarity of two objects is the Euclidean distance between them in this space

  43. Unsupervised Learning • Using this similarity metric, a common clustering algorithm builds clusters in a bottom-up fashion, also known as agglomerative clustering: • Examine all pairs of objects, select the pair with the highest degree of similarity, and mark that pair as a cluster • Define the features of the cluster as some function (such as the average) of the features of the component members, and then replace the component objects with this cluster definition • Repeat this process on the collection of objects until all objects have been merged into a single cluster
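
A minimal sketch of this bottom-up procedure, assuming objects are tuples of numeric features, similarity is measured by Euclidean distance (smaller distance = more similar), and a merged cluster is summarized by the average of its components (all names are illustrative):

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def agglomerate(objects):
    """Repeatedly merge the two most similar clusters until one remains."""
    # Each cluster is (centroid, tree); leaves of the tree are the original objects.
    clusters = [(obj, obj) for obj in objects]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest distance (highest similarity).
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: euclidean(clusters[p[0]][0], clusters[p[1]][0]),
        )
        (ci, ti), (cj, tj) = clusters[i], clusters[j]
        # Define the new cluster's features as the average of its components.
        merged_centroid = tuple((x + y) / 2 for x, y in zip(ci, cj))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((merged_centroid, (ti, tj)))
    return clusters[0][1]   # binary tree whose leaves are the original objects

print(agglomerate([(0, 0), (0, 1), (5, 5)]))
# ((5, 5), ((0, 0), (0, 1)))  -- the two nearby points merge first
```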

  44. Unsupervised Learning • The result of this algorithm is a Binary Tree whose leaf nodes are instances and whose internal nodes are clusters of increasing size • We may also extend this algorithm to objects represented as sets of symbolic features.

  45. Unsupervised Learning • Object1 = {small, red, rubber, ball} • Object2 = {small, blue, rubber, ball} • Object3 = {large, black, wooden, ball} • This metric would compute the similarity values: • Similarity(object1, object2) = 3/4 • Similarity(object1, object3) = 1/4
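
One plausible reading of this metric is the fraction of an object's features that also appear in the other object; a small sketch under that assumption (chosen so it reproduces the 3/4 and 1/4 values above):

```python
# Assumed interpretation of the symbolic similarity metric: shared features
# divided by the size of the larger feature set.

def similarity(a, b):
    return len(a & b) / max(len(a), len(b))

object1 = {"small", "red", "rubber", "ball"}
object2 = {"small", "blue", "rubber", "ball"}
object3 = {"large", "black", "wooden", "ball"}

print(similarity(object1, object2))  # 0.75
print(similarity(object1, object3))  # 0.25
```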

  46. Partitioning Algorithms: Basic Concept • Given k, find a partition into k clusters that optimizes the chosen partitioning criterion • Global optimum: exhaustively enumerate all partitions • Heuristic methods: the k-means and k-medoids algorithms • k-means (MacQueen ’67): Each cluster is represented by the center of the cluster • k-medoids or PAM (Partitioning Around Medoids) (Kaufman & Rousseeuw ’87): Each cluster is represented by one of the objects in the cluster

  47. The K-Means Clustering Method • Given k, the k-means algorithm is implemented in four steps: • Partition the objects into k nonempty subsets • Compute seed points as the centroids of the clusters of the current partition (the centroid is the center, i.e., mean point, of the cluster) • Assign each object to the cluster with the nearest seed point • Go back to Step 2; stop when there are no new assignments
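
A minimal sketch of these four steps for 2-D points, using Euclidean distance (the function and variable names are illustrative assumptions, not from the slides):

```python
import math
import random

def kmeans(points, k, max_iter=100):
    # Step 1: start from k arbitrary seed points (here, k random data points,
    # a common variant of starting from an initial partition).
    centroids = random.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Step 3: assign each object to the cluster with the nearest seed point.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Step 4: stop when no assignment changes.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 2: recompute each centroid as the mean point of its cluster.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, assignment

pts = [(1, 1), (1.5, 2), (8, 8), (9, 9)]
print(kmeans(pts, 2))  # two centroids, one near (1.25, 1.5) and one near (8.5, 8.5)
```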

  48. K-means Clustering

  49. K-means Clustering

  50. K-means Clustering
