KDD Group Research Seminar, Fall 2001 - Presentation 8 – 11 Incremental Learning Friday, November 16 • James Plummer • jwp1924@ksu.edu • Reference: Mitchell, Tom M. Machine Learning. McGraw-Hill, 1997.
Outline • Machine Learning • Extracting information from data • Forming concepts • The Data • Arrangement of Data • Attributes, Labels, and Instances • Categorization of Data • Results • MLJ (Machine Learning in Java) • Collection of Machine Learning algorithms • Current Inducers • Incremental Learning • Description of technique • Nearest Neighbor Algorithm • Distance-Weighted Algorithm • Advantages and Disadvantages • Gains and Losses
Machine Learning • Sometimes called Data Mining • The process of extracting useful information from data • Marketing databases, medical databases, weather databases • Finding consumer purchase patterns • Used to form concepts • Predictions • Classifications • Numeric answers
The Data • Arrangement of Data • A piece of data is an instance x_j made up of a set of attribute values a_1, a_2, ..., a_n • Attributes can be considered evidence • Each instance has a label or category f(x_j) (its outcome value): x_j = ⟨a_1, a_2, ..., a_n⟩, with label f(x_j) • A data set is a set of instances • Categorization • A set of labeled instances serves as the training data against which new query instances x_q are compared • Calculate f^(x_q) from the training data • f^(x_q) is the predicted value of the actual f(x_q) • Results • Accuracy is the number of correctly predicted query instances over the total number of query instances • A minimal Java sketch of this representation follows
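To make the representation above concrete, here is a minimal Java sketch of a labeled instance and the accuracy measure just described. It is illustrative only and is not MLJ's actual API; the class and method names are assumptions, and numeric attributes are assumed.

```java
// Hypothetical representation of one labeled instance x = <a_1, ..., a_n>, f(x).
// Not MLJ code; names are illustrative only.
public class Instance {
    private final double[] attributes; // a_1 ... a_n (numeric attributes assumed)
    private final String label;        // f(x), the instance's category

    public Instance(double[] attributes, String label) {
        this.attributes = attributes;
        this.label = label;
    }

    public double[] getAttributes() { return attributes; }
    public String getLabel() { return label; }

    // Accuracy: correctly predicted query instances over total query instances.
    public static double accuracy(String[] predicted, String[] actual) {
        int correct = 0;
        for (int i = 0; i < predicted.length; i++) {
            if (predicted[i].equals(actual[i])) correct++;
        }
        return (double) correct / predicted.length;
    }
}
```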
Data Example • [The slide's table of training examples is an image and is not recoverable from this text] • Predict the labels of Examples 6, 7, and 8 given data Examples 1 through 5
MLJ (Machine Learning in Java) • MLJ is a collection of learning algorithms • Inducers • Categorize data to learn concepts • Currently in Development • ID3 • Builds decision trees • Naïve Bayes • Applies Bayes' rule under a conditional-independence assumption • C4.5 • Builds decision trees with pruning techniques • Incremental Learning • Uses instance-comparison techniques • Soon to be added
Incremental Learning • Instance-Based Learning • k-Nearest Neighbor • All instances correspond to points in an n-dimensional space • The distance between two instances is the Euclidean distance $d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} \left( a_r(x_i) - a_r(x_j) \right)^2}$, where a_r(x) is the r-th attribute of instance x • Given a query instance x_q to be categorized, the k nearest neighbors are found • f^(x_q) is assigned the most frequent value among the k neighbors' labels f(x_j) • For k = 1, f^(x_q) is assigned f(x_i), where x_i is the closest instance in the space • A minimal Java sketch appears below
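A minimal sketch of plain k-nearest-neighbor classification matching the definitions above, written in Java since the seminar targets MLJ. It is not MLJ's inducer; it reuses the hypothetical Instance class from the earlier sketch and breaks voting ties arbitrarily.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Sketch of k-NN: just the technique as defined above, not MLJ's API.
public class NearestNeighbor {
    // Euclidean distance d(x_i, x_j) over all n attributes.
    static double distance(double[] xi, double[] xj) {
        double sum = 0.0;
        for (int r = 0; r < xi.length; r++) {
            double diff = xi[r] - xj[r];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // f^(x_q): the most frequent label among the k nearest training instances.
    static String classify(Instance[] training, double[] query, int k) {
        Instance[] sorted = training.clone();
        Arrays.sort(sorted, Comparator.comparingDouble(
                (Instance t) -> distance(t.getAttributes(), query)));
        Map<String, Integer> votes = new HashMap<>();
        for (int i = 0; i < k && i < sorted.length; i++) {
            votes.merge(sorted[i].getLabel(), 1, Integer::sum);
        }
        // Assumes k >= 1 and a non-empty training set, so votes is non-empty.
        return votes.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get().getKey();
    }
}
```

Note that all of the work happens at query time: training amounts to storing the instances, which is the "very small training time" gain noted on the advantages slide.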
Distance-Weighted Nearest Neighbor • Same as k-Nearest Neighbor, except each neighbor's influence on f^(x_q) depends on its distance d(x_q, x_j) • In Mitchell's formulation each of the k neighbors votes with weight $w_i = 1 / d(x_q, x_i)^2$, and the prediction is the label with the largest total weight: $\hat{f}(x_q) = \arg\max_{v} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i))$, where δ(a, b) = 1 if a = b and 0 otherwise • In the special case x_q = x_i (zero distance), f^(x_q) = f(x_i) • Examine three cases for a 2-dimensional instance space (the slide's figure is not recoverable): • k = 1 • k = 5 • Weighted, k = 5 • A sketch of the weighted variant follows
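A sketch of the distance-weighted variant using the w_i = 1/d² weighting above. It reuses the hypothetical Instance class and the distance method from the previous sketch; the zero-distance check implements the special case x_q = x_i.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;

// Sketch of distance-weighted k-NN; assumes the NearestNeighbor sketch above.
public class DistanceWeighted {
    static String classify(Instance[] training, double[] query, int k) {
        Instance[] sorted = training.clone();
        Arrays.sort(sorted, Comparator.comparingDouble(
                (Instance t) -> NearestNeighbor.distance(t.getAttributes(), query)));
        Map<String, Double> votes = new HashMap<>();
        for (int i = 0; i < k && i < sorted.length; i++) {
            double d = NearestNeighbor.distance(sorted[i].getAttributes(), query);
            if (d == 0.0) return sorted[i].getLabel(); // x_q coincides with x_i
            votes.merge(sorted[i].getLabel(), 1.0 / (d * d), Double::sum);
        }
        return votes.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .get().getKey();
    }
}
```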
Advantages and Disadvantages • Gains of using k-Nearest Neighbor • Individual attributes can be weighted differently • Change d(x_i, x_q) to give the nearest x_i a stronger or weaker effect on f^(x_q) (see the sketch after this list) • Robust to noise in the training data • Very effective when provided a large set of training data • Flexible: f^(x_q) can be calculated in many useful ways • Very small training time • Losses • Not good when training data is insufficient • Not very effective if similar x_i have dissimilar f(x_i) • More computation time is needed to categorize new instances
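The attribute-weighting gain in the list above can be sketched as a weighted distance: scaling attribute r by a weight w_r stretches or shrinks that axis of the instance space, so more informative attributes dominate the neighbor search. The weights themselves are assumptions here; in practice they would be chosen by hand or tuned, for example by cross-validation.

```java
// Sketch of an attribute-weighted distance for k-NN (weights are hypothetical).
public class WeightedDistance {
    static double distance(double[] xi, double[] xj, double[] weights) {
        double sum = 0.0;
        for (int r = 0; r < xi.length; r++) {
            double diff = xi[r] - xj[r];
            sum += weights[r] * diff * diff; // w_r = 0 ignores attribute r entirely
        }
        return Math.sqrt(sum);
    }
}
```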
References • Mitchell, Tom M. Machine Learning. McGraw-Hill, 1997. • Witten, Ian H., and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 2000.