Classification

This article provides an overview of the problem of classification in data mining, including different approaches and key issues. It also explores specific algorithms such as decision trees, K-nearest neighbors, Naïve Bayesian classifiers, and neural networks.

Presentation Transcript


  1. Classification: A task of induction to find patterns (CSE 591: Data Mining by H. Liu)

  2. Outline • Data and its format • Problem of Classification • Learning a classifier • Different approaches • Key issues

  3. Data and its format • Data • attribute-value pairs • with/without class • Data type • continuous/discrete • nominal • Data format • flat

  4. Sample data

  5. Induction from databases • Induction: inferring knowledge from data • Deduction, by contrast: inferring information that is a logical consequence of querying a database • Who conducted this class before? • Which courses are attended by Mary? • Deductive databases: extending the RDBMS

  6. Classification • It is one type of induction • data with class labels • Examples: • If weather is rainy then no golf • If • If

  7. Different approaches • There exist many techniques • Decision trees • Neural networks • K-nearest neighbors • Naïve Bayesian classifiers • Support Vector Machines • Ensemble methods • Co-training • and many more ...

  8. A decision tree • Root: Outlook • sunny: Humidity (high: NO, normal: YES) • overcast: YES • rain: Wind (strong: NO, weak: YES)
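The tree on this slide can be read as nested if-then rules. The sketch below assumes the usual layout of the golf example (Humidity under sunny, YES under overcast, Wind under rain), since the figure's geometry is lost in the transcript; the function name and the lowercase string values are illustrative choices, not part of the slides.

```python
def play_golf(outlook: str, humidity: str, wind: str) -> str:
    """Classify one day with the slide's tree (branch layout is an assumption)."""
    if outlook == "overcast":
        return "YES"
    if outlook == "sunny":
        # Sunny days are decided by humidity.
        return "NO" if humidity == "high" else "YES"
    if outlook == "rain":
        # Rainy days are decided by wind.
        return "NO" if wind == "strong" else "YES"
    raise ValueError(f"unknown outlook: {outlook!r}")


print(play_golf("sunny", "high", "weak"))
print(play_golf("rain", "normal", "weak"))
```

Each root-to-leaf path corresponds to one rule of the form "If weather is rainy then no golf".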

  9. Inducing a decision tree • There are many possible trees • Let’s try it on the golfing data • How to find the most compact one that is consistent with the data? • Why the most compact? • Occam’s razor principle • Issue of efficiency w.r.t. optimality

  10. Entropy and information gain • Entropy: a measure of the uncertainty (impurity) of the class labels at a node • Information gain: the difference in entropy between the node before and after splitting
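The formulas on this slide did not survive in the transcript. The standard definitions, which are presumably what the slide showed, are given below. The symbols are assumptions introduced here: S is the set of examples at a node, p_c is the fraction of S in class c, A is the attribute used for the split, and S_v is the subset of S where A takes the value v.

```latex
H(S) = -\sum_{c} p_c \log_2 p_c
\qquad
IG(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\, H(S_v)
```

For a node with equal numbers of YES and NO examples, H(S) = 1 bit; for a pure node, H(S) = 0.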

  11. Building a compact tree • The key to building a decision tree is choosing which attribute to branch on • The heuristic is to choose the attribute with the maximum information gain (IG) • Equivalently, reduce uncertainty as much as possible at each split
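A minimal sketch of the maximum-IG heuristic, assuming entropy is measured in bits and that each example is a dictionary of attribute values. The toy rows are hypothetical and are not the course's sample data; the function names are illustrative.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the node minus the weighted entropy of its children."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


def best_attribute(rows, attributes, target):
    """Pick the attribute whose split yields the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical toy rows, made up for illustration only.
toy_rows = [
    {"outlook": "sunny", "wind": "weak", "play": "NO"},
    {"outlook": "sunny", "wind": "strong", "play": "NO"},
    {"outlook": "overcast", "wind": "weak", "play": "YES"},
    {"outlook": "overcast", "wind": "strong", "play": "YES"},
    {"outlook": "rain", "wind": "weak", "play": "YES"},
    {"outlook": "rain", "wind": "strong", "play": "NO"},
]

for name in ("outlook", "wind"):
    print(name, round(information_gain(toy_rows, name, "play"), 3))
print("branch on:", best_attribute(toy_rows, ["outlook", "wind"], "play"))
```

A full tree learner would apply `best_attribute` recursively to each child subset until the subsets are pure. That greedy procedure is efficient but does not guarantee the most compact tree.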

  12. Learn a decision tree • Root: Outlook • sunny: Humidity (high: NO, normal: YES) • overcast: YES • rain: Wind (strong: NO, weak: YES)

  13. K-Nearest Neighbor • One of the most intuitive classification algorithms • An unseen instance’s class is determined by its nearest neighbor • The problem is that a single neighbor is sensitive to noise • Instead of using one neighbor, we can use k neighbors
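A minimal sketch of k-NN, assuming numeric attributes, Euclidean distance, and a majority vote among the k neighbors (the slide does not specify the distance or the voting rule). The points are hypothetical; the lone "B" near the "A" cluster plays the role of a noisy label.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """train is a list of (point, label) pairs; return the majority label
    among the k training points closest to query."""
    neighbors = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D data, made up for illustration only.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.8, 1.2), "A"),
    ((1.1, 1.0), "B"),  # noisy point inside the "A" cluster
    ((4.0, 4.0), "B"), ((4.2, 3.9), "B"),
]
query = (1.1, 1.05)

print("k=1:", knn_predict(train, query, k=1))
print("k=3:", knn_predict(train, query, k=3))
```

With k=1 the noisy point decides the answer; with k=3 it is outvoted. Note that nothing is learned ahead of time: the whole training set is stored and scanned for every query.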

  14. K-NN • New problems • lazy learning • large storage • An example • How good is k-NN?

  15. Naïve Bayes Classifier • This is a direct application of Bayes’ rule • P(C|X) = P(X|C)P(C)/P(X), where X is a vector (x1, x2, …, xn) • That’s the best classifier you can build • But, there are problems

  16. NBC (2) • Assume conditional independence between the xi’s given the class C • We have P(X|C) = P(x1|C) P(x2|C) … P(xn|C) • An example • How good is it in reality?
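A minimal sketch of a naïve Bayes classifier for discrete attributes, assuming probabilities are estimated by simple counting. Under conditional independence the score for class C is P(C) times the product of the P(xi|C); P(X) is the same for every class and is dropped. The function names and toy rows are hypothetical, not the course's example.

```python
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def predict_nb(model, row):
    """Return the class maximizing P(C) * product over i of P(x_i | C)."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(C)
        for i, value in enumerate(row):
            score *= value_counts[(i, label)][value] / count  # P(x_i | C)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data: (outlook, humidity) -> play
rows = [("sunny", "high"), ("sunny", "high"), ("overcast", "high"),
        ("rain", "normal"), ("rain", "normal"), ("sunny", "normal")]
labels = ["NO", "NO", "YES", "YES", "YES", "YES"]

model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "high")))
print(predict_nb(model, ("rain", "normal")))
```

This sketch applies no smoothing, so a value never seen with a class drives that class's score to zero; practical implementations usually add a small count to avoid this.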

  17. Classification via Neural Networks • A perceptron (figure): the inputs are combined by a weighted sum (Σ) and passed through a squashing function

  18. What can a perceptron do? • Neuron as a computing device • It can separate linearly separable points • Nice things about a perceptron • distributed representation • local learning • weight adjusting

  19. Linear threshold unit • Basic concepts: projection, thresholding • (figure) W vectors evoke 1; W = [.11 .6], L = [.7 .7], .5

  20. E.g. 1: Solution region for AND problem • Find a weight vector that satisfies all the constraints • AND problem (x1, x2 : output): 0 0 : 0 • 0 1 : 0 • 1 0 : 0 • 1 1 : 1
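Each row of the AND table is one constraint on the weights. The sketch below assumes a unit that outputs 1 when the weighted sum strictly exceeds a threshold (the slides do not state whether the comparison is strict), and checks candidate weight vectors against all four rows. The candidate values are illustrative.

```python
AND_TABLE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]


def threshold_unit(weights, threshold, inputs):
    """Output 1 when the weighted sum exceeds the threshold, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation > threshold else 0


def satisfies(table, weights, threshold):
    """True if the unit reproduces every row of the truth table."""
    return all(threshold_unit(weights, threshold, x) == y for x, y in table)


print(satisfies(AND_TABLE, (1.0, 1.0), 1.5))  # inside the solution region
print(satisfies(AND_TABLE, (1.0, 1.0), 0.5))  # outside: fires on a single 1
```

Under this convention the constraints are w1 <= threshold, w2 <= threshold, 0 <= threshold, and w1 + w2 > threshold; the solution region is the set of weight vectors satisfying all four.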

  21. E.g. 2: Solution region for XOR problem? • XOR problem (x1, x2 : output): 0 0 : 0 • 0 1 : 1 • 1 0 : 1 • 1 1 : 0
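For XOR the constraints contradict each other. Assuming a unit that fires when the weighted sum strictly exceeds the threshold, the rows require threshold >= 0, w1 > threshold, w2 > threshold, and w1 + w2 <= threshold, yet the second and third give w1 + w2 > 2 * threshold >= threshold. The sketch below illustrates this with a brute-force search over a small grid of weights and thresholds; the grid is an arbitrary choice, and the search is an illustration rather than a proof.

```python
from itertools import product

AND_TABLE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR_TABLE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def threshold_unit(weights, threshold, inputs):
    """Output 1 when the weighted sum exceeds the threshold, else 0."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation > threshold else 0


def count_solutions(table, grid):
    """Count (w1, w2, threshold) triples on the grid that reproduce the table."""
    return sum(
        1
        for w1, w2, theta in product(grid, repeat=3)
        if all(threshold_unit((w1, w2), theta, x) == y for x, y in table)
    )


grid = [i / 4 for i in range(-8, 9)]  # -2.0 to 2.0 in steps of 0.25

print("AND solutions on the grid:", count_solutions(AND_TABLE, grid))
print("XOR solutions on the grid:", count_solutions(XOR_TABLE, grid))
```

The AND problem has many solutions on the grid, while XOR has none: its solution region is empty because the four points are not linearly separable.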

  22. Learning by error reduction • Perceptron learning algorithm • If the activation level of the output unit is 1 when it should be 0, reduce the weight on the link to the ith input unit by r*Li, where Li is the ith input value and r is a learning rate • If the activation level of the output unit is 0 when it should be 1, increase the weight on the link to the ith input unit by r*Li • Otherwise, do nothing
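The update rule on this slide can be sketched as follows. Two details are assumptions not stated on the slide: the threshold is folded in as a weight on a constant input of 1, and training stops after a full pass with no errors (or after `max_epochs` passes). The function names are illustrative.

```python
def output_of(weights, inputs):
    """Threshold unit; inputs must already include the constant bias input."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) > 0 else 0


def classify(weights, inputs):
    """Prepend the constant bias input and apply the unit."""
    return output_of(weights, (1.0,) + tuple(inputs))


def train_perceptron(samples, rate=0.5, max_epochs=100):
    """samples is a list of (inputs, target) pairs with targets 0 or 1."""
    weights = [0.0] * (len(samples[0][0]) + 1)  # bias weight comes first
    for _ in range(max_epochs):
        errors = 0
        for inputs, target in samples:
            x = (1.0,) + tuple(inputs)
            out = output_of(weights, x)
            if out == 1 and target == 0:
                # Fired when it should not have: reduce each weight by r * L_i.
                weights = [w - rate * xi for w, xi in zip(weights, x)]
                errors += 1
            elif out == 0 and target == 1:
                # Stayed silent when it should have fired: increase by r * L_i.
                weights = [w + rate * xi for w, xi in zip(weights, x)]
                errors += 1
            # Otherwise, do nothing.
        if errors == 0:
            break
    return weights


AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
learned = train_perceptron(AND_SAMPLES)
print([classify(learned, x) for x, _ in AND_SAMPLES])
```

On linearly separable data such as AND the loop reaches an error-free pass; on XOR it would run until `max_epochs` without ever classifying all four points correctly.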

  23. Multi-layer perceptrons • Using the chain rule, we can back-propagate the errors for a multi-layer perceptron • (figure: input layer, hidden layer, output layer)
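A minimal sketch of back-propagation for one training example, under assumptions the slide does not specify: a 2-2-1 network, sigmoid units, and squared error E = 0.5 * (output - target)^2. The weights are arbitrary illustrative values. The chain-rule gradients are compared with finite differences as a sanity check.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def forward(params, x):
    """params = (W1, b1, W2, b2) for a 2-2-1 network with sigmoid units."""
    W1, b1, W2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(W2, hidden)) + b2)
    return hidden, output


def loss(params, x, target):
    _, output = forward(params, x)
    return 0.5 * (output - target) ** 2


def backprop(params, x, target):
    """Gradients of the squared error, obtained layer by layer via the chain rule."""
    W1, b1, W2, b2 = params
    hidden, output = forward(params, x)
    # Output layer: dE/dz_out = (output - target) * sigmoid'(z_out)
    delta_out = (output - target) * output * (1.0 - output)
    grad_W2 = [delta_out * h for h in hidden]
    # Hidden layer: the output error is propagated back through W2.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(W2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    return grad_W1, delta_hidden, grad_W2, delta_out


def numerical_grad_W1(params, x, target, j, i, eps=1e-6):
    """Central finite difference for one first-layer weight, used as a check."""
    W1, b1, W2, b2 = params
    plus = [row[:] for row in W1]
    minus = [row[:] for row in W1]
    plus[j][i] += eps
    minus[j][i] -= eps
    return (loss((plus, b1, W2, b2), x, target)
            - loss((minus, b1, W2, b2), x, target)) / (2 * eps)


# Arbitrary illustrative weights and a single XOR example.
params = ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2], [0.7, -0.6], 0.05)
x, target = (1.0, 0.0), 1.0

grad_W1, grad_b1, grad_W2, grad_b2 = backprop(params, x, target)
for j in range(2):
    for i in range(2):
        print(j, i, grad_W1[j][i], numerical_grad_W1(params, x, target, j, i))
```

Training would repeat this for every example and move each weight a small step against its gradient; with a hidden layer, the network is no longer limited to linearly separable problems such as AND.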
