
Class 4 – More Classifiers


Presentation Transcript


  1. Class 4 – More Classifiers Ramoza Ahsan, Yun Lu, Dongyun Zhang, Zhongfang Zhuang, Xiao Qin, Salah Uddin Ahmed

  2. Lesson 4.1 Classification Boundaries

  3. Classification Boundaries • Visualizing the data during the training stage of building a classifier can provide guidance for parameter selection • Weka visualization tool • 2-dimensional data set

  4. Boundary Representation With OneR • The color diagram shows the decision boundaries together with the training data • Spatial representation of the decision boundary of the OneR algorithm

  5. Boundary Representation With IBk • Lazy classifier (instance-based learner) • Chooses the nearest instance(s) to classify • Piecewise linear boundary • Increasing k blurs the boundaries

  6. Boundary Representation With Naïve Bayes • Naïve Bayes treats each of the two attributes as contributing equally and independently to the decision • Multiplying the probabilities along the two dimensions gives a checkerboard pattern of probabilities

  7. Boundary Representation With J48 • Increasing the minNumObj parameter results in a simpler tree

  8. Classification Boundaries • Different classifiers have different capabilities for carving up instance space ("bias") • Usefulness: • Important visualization tool • Provides insight into how the algorithm works on the data • Limitations: • Restricted to numeric attributes and 2-dimensional plots
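
  A minimal sketch of building the classifiers from this lesson with the Weka Java API; the class names are the standard Weka 3 ones, while the file name and the parameter values (k = 5, minNumObj = 10) are illustrative assumptions:

    import java.util.Random;
    import weka.classifiers.Classifier;
    import weka.classifiers.Evaluation;
    import weka.classifiers.bayes.NaiveBayes;
    import weka.classifiers.lazy.IBk;
    import weka.classifiers.rules.OneR;
    import weka.classifiers.trees.J48;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class BoundaryClassifiers {
        public static void main(String[] args) throws Exception {
            // Load a 2-dimensional numeric dataset (file name is an assumption)
            Instances data = DataSource.read("iris.2D.arff");
            data.setClassIndex(data.numAttributes() - 1);

            OneR oneR = new OneR();          // rule on a single attribute -> axis-parallel boundary
            IBk ibk = new IBk(5);            // instance-based; larger k smooths the boundary
            NaiveBayes nb = new NaiveBayes(); // independent per-attribute contributions
            J48 j48 = new J48();
            j48.setMinNumObj(10);            // raising minNumObj gives a simpler tree

            // Compare the classifiers with 10-fold cross-validation
            for (Classifier c : new Classifier[] {oneR, ibk, nb, j48}) {
                Evaluation eval = new Evaluation(data);
                eval.crossValidateModel(c, data, 10, new Random(1));
                System.out.println(c.getClass().getSimpleName() + ": "
                        + eval.pctCorrect() + "% correct");
            }
        }
    }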

  9. Lesson 4.2 Linear Regression

  10. What Is Linear Regression? • In statistics, linear regression is an approach to modeling the relationship between a dependent variable y and one or more explanatory variables denoted X • Straight-line regression analysis: one explanatory variable • Multiple linear regression: more than one explanatory variable • In data mining, we use this method to make predictions for numeric classes based on numeric attributes • The NominalToBinary filter converts nominal attributes into binary numeric ones so regression can use them

  11. Why Linear Regression? • A regression models the past relationship between variables in order to predict their future behavior • Businesses use regression to predict such things as future sales, stock prices, currency exchange rates, and productivity gains resulting from a training program • Example: A person's salary is related to years of experience. The dependent variable here is salary, and the explanatory variable (also called the independent variable) is years of experience

  12. Mathematics Of Simple Linear Regression • The simplest form of regression function is y = b + wx, where y is the dependent variable, x is the explanatory variable, and b and w are regression coefficients • Thinking of the regression coefficients as weights, we can write y = w0 + w1·x, where the least-squares estimates are: w1 = Σi (xi − x̄)(yi − ȳ) / Σi (xi − x̄)² and w0 = ȳ − w1·x̄, with x̄ and ȳ the means of the training values

  13. Previous Example: Salary Dataset • From the given dataset we obtain the coefficients w1 ≈ 3.5 and w0 ≈ 23.6 (salary in $1000s) • Thus the fitted line is y = 23.6 + 3.5x • Using this equation, we predict that a person with 10 years of experience will earn 23.6 + 3.5 × 10 = 58.6, i.e. a salary of $58,600 per year
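
  To make the arithmetic concrete, here is a small sketch of the least-squares computation from slide 12; the tiny (x, y) arrays are made-up illustrative values, not the slide's salary dataset:

    public class SimpleLinearRegressionSketch {
        // Least-squares estimates for y = w0 + w1 * x
        static double[] fit(double[] x, double[] y) {
            double xBar = 0, yBar = 0;
            for (int i = 0; i < x.length; i++) { xBar += x[i]; yBar += y[i]; }
            xBar /= x.length;
            yBar /= y.length;
            double num = 0, den = 0;
            for (int i = 0; i < x.length; i++) {
                num += (x[i] - xBar) * (y[i] - yBar);
                den += (x[i] - xBar) * (x[i] - xBar);
            }
            double w1 = num / den;
            double w0 = yBar - w1 * xBar;
            return new double[] {w0, w1};
        }

        public static void main(String[] args) {
            // Made-up data lying on y = 3 + 2x, so the result is easy to check
            double[] x = {1, 2, 3, 4};
            double[] y = {5, 7, 9, 11};
            double[] w = fit(x, y);
            System.out.printf("y = %.1f + %.1f x ; prediction at x = 10: %.1f%n",
                    w[0], w[1], w[0] + w[1] * 10);   // prints y = 3.0 + 2.0 x ; 23.0
        }
    }

  Substituting the slide's (years of experience, salary) pairs into fit() would reproduce the coefficients quoted above.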

  14. Run This Dataset On Weka

  15. Run This Dataset On Weka

  16. Run This Dataset On Weka

  17. Run This Dataset On Weka

  18. Multiple Linear Regression • Multiple linear regression is an extension of straight-line regression that involves more than one predictor variable. It allows the dependent variable y to be modeled as a linear function of n predictor variables described by the tuple (x1, x2, …, xn): y = w0 + w1·x1 + w2·x2 + … + wn·xn • The weights are then adjusted to minimize the squared error on the training data: Σi (yi − (w0 + w1·xi1 + … + wn·xin))² • This system is tedious to solve by hand, so we use a tool like Weka to do it
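
  A minimal sketch of fitting a (multiple) linear regression model with the Weka Java API, along the lines of the "Run This Dataset On Weka" slides; the ARFF file name is an assumption, and the numeric class is taken to be the last attribute:

    import weka.classifiers.functions.LinearRegression;
    import weka.core.Instance;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RunLinearRegression {
        public static void main(String[] args) throws Exception {
            // Load a dataset with a numeric class (file name is an assumption)
            Instances data = DataSource.read("salary.arff");
            data.setClassIndex(data.numAttributes() - 1);

            LinearRegression lr = new LinearRegression();
            lr.buildClassifier(data);
            System.out.println(lr);   // prints the fitted weights w0, w1, ..., wn

            // Predict the class value of the first training instance
            Instance first = data.instance(0);
            System.out.println("Predicted: " + lr.classifyInstance(first)
                    + "  Actual: " + first.classValue());
        }
    }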

  19. Non-linear Regression • Often there is no linear relationship between the dependent variable (class attribute) and the explanatory variables • A common approach is to convert the problem into a patchwork of local linear regression models • In Weka, the "model tree" method M5P addresses this: a model tree is a decision tree in which each leaf holds its own linear regression model. The coefficients of each leaf's linear function are calculated, and a prediction is made by routing an instance down the tree and applying that leaf's model
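
  The same pattern works for a model tree; a brief sketch using Weka's M5P class (again with an assumed file name):

    import weka.classifiers.trees.M5P;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RunModelTree {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("cpu.arff");   // assumed numeric-class dataset
            data.setClassIndex(data.numAttributes() - 1);

            M5P tree = new M5P();        // model tree: a linear regression model at each leaf
            tree.buildClassifier(data);
            System.out.println(tree);    // prints the tree and the per-leaf linear models
        }
    }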

  20. Lesson 4.3 Classification By Regression

  21. Review: Linear Regression • Several numeric attributes a1, a2, …, ak • A weight for each attribute plus a constant: w0, w1, …, wk • Weighted sum of the attributes: x = w0 + w1·a1 + w2·a2 + … + wk·ak • Choose the weights to minimize the squared error on the training data: Σi (x(i) − Σj wj·aj(i))²

  22. Using Regression In Classification • Convert the class values to numeric values (usually binary 0/1) • Decide the class according to the regression output • The regression output is NOT a probability! • Set a threshold (e.g. 0.5) for the decision

  23. 2-Class Problems • Assign the binary values (e.g. 0 and 1) to the two classes • Training: linear regression on the converted class • Output the prediction by thresholding the regression result

  24. Multi-Class Problems • Multi-response linear regression • Divide the task into n regression problems, one per class • Build a separate regression model for each problem • Classify by selecting the model with the largest output
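
  Weka packages this one-regression-per-class idea as a meta-classifier; a minimal sketch using ClassificationViaRegression with plain linear regression as the base learner (the dataset path is an assumption):

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.LinearRegression;
    import weka.classifiers.meta.ClassificationViaRegression;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class ClassifyByRegression {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("iris.arff");   // assumed nominal-class dataset
            data.setClassIndex(data.numAttributes() - 1);

            // One regression problem per class value; predict the class whose
            // model produces the largest output
            ClassificationViaRegression cvr = new ClassificationViaRegression();
            cvr.setClassifier(new LinearRegression());

            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(cvr, data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }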

  25. More Investigations • Cool stuff: this leads to the foundation of logistic regression • Convert the class value to binary • Add the linear regression result as a new attribute • Detect the split (threshold) using OneR

  26. Lesson 4.4 Logistic Regression

  27. Logistic Regression • In linear regression, we calculate the weights from the training data by minimizing the squared error Σi (x(i) − Σj wj·aj(i))² (1) • In logistic regression, we estimate the class probabilities directly: Pr[1 | a1, a2, …, ak] = 1 / (1 + exp(−(w0 + w1·a1 + … + wk·ak))) (2)

  28. Classification • Email: Spam / Not Spam? • Online Transactions: Fraudulent (Yes / No)? • Tumor: Malignant / Benign? • 0: "negative class" (e.g., benign tumor) • 1: "positive class" (e.g., malignant tumor) • Coursera – Machine Learning – Prof. Andrew Ng, Stanford University

  29. [Figure: tumor size (x-axis) vs. malignant? 0 (No) / 1 (Yes) (y-axis), with a linear fit and the 0.5 threshold] • Threshold the classifier output hθ(x) at 0.5: if hθ(x) ≥ 0.5, predict "y = 1"; if hθ(x) < 0.5, predict "y = 0" • Coursera – Machine Learning – Prof. Andrew Ng, Stanford University

  30. [Figure: malignant? 0 (No) / 1 (Yes) vs. tumor size, with the 0.5 threshold] • Threshold the classifier output hθ(x) at 0.5: if hθ(x) ≥ 0.5, predict "y = 1"; if hθ(x) < 0.5, predict "y = 0" • Coursera – Machine Learning – Prof. Andrew Ng, Stanford University

  31. Classification: y = 0 or 1 • In linear regression, the output hθ(x) can be > 1 or < 0 • Logistic regression: 0 ≤ hθ(x) ≤ 1 • Coursera – Machine Learning – Prof. Andrew Ng, Stanford University

  32. Logistic Regression Model • We want 0 ≤ hθ(x) ≤ 1 • hθ(x) = g(θᵀx) (3) • Sigmoid (logistic) function: g(z) = 1 / (1 + e^(−z)) (4) • Coursera – Machine Learning – Prof. Andrew Ng, Stanford University

  33. Interpretation Of Hypothesis Output • hθ(x) = estimated probability that y = 1 on input x (5) • Example: if x = [x0; x1] = [1; tumorSize] and hθ(x) = 0.7 (6), tell the patient there is a 70% chance of the tumor being malignant (y = 1) • hθ(x) = P(y = 1 | x; θ) (7): "probability that y = 1, given x, parameterized by θ" • Coursera – Machine Learning – Prof. Andrew Ng, Stanford University
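
  A tiny sketch of the hypothesis computation for a single feature; the weight values and the tumor-size input are made-up numbers for illustration only:

    public class LogisticHypothesis {
        // Sigmoid (logistic) function: g(z) = 1 / (1 + e^(-z))
        static double sigmoid(double z) {
            return 1.0 / (1.0 + Math.exp(-z));
        }

        // h(x) = g(theta0 + theta1 * x1) for one feature
        static double hypothesis(double theta0, double theta1, double x1) {
            return sigmoid(theta0 + theta1 * x1);
        }

        public static void main(String[] args) {
            double theta0 = -6.0, theta1 = 0.12;   // made-up weights
            double tumorSize = 55.0;               // made-up feature value
            double p = hypothesis(theta0, theta1, tumorSize);
            System.out.printf("Estimated P(y = 1 | x) = %.2f%n", p);   // about 0.65
            System.out.println(p >= 0.5 ? "predict y = 1 (malignant)"
                                        : "predict y = 0 (benign)");
        }
    }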

  34. Lesson 4.5 Support Vector Machine

  35. Things About SVM • Works well on small data that is not linearly separable • Maps data from low dimensions to higher dimensions (where it may become separable) • Support vectors • Maximum marginal hyperplane

  36. Overview • The support vectors are the most difficult tuples to classify and give the most information regarding classification.

  37. SVM searches for the hyperplane with the largest margin, that is, the Maximum Marginal Hyperplane (MMH).

  38. SVM Demo • CMsoft SVM Demo Tool • Question(s)

  39. More • Very resilient to overfitting • Boundary depends on a few points • Parameter setting (regularization) • Weka: functions > SMO • Restricted to two classes • So use multi-response linear regression … or pairwise linear regression • Weka: functions > LibSVM • External library for support vector machines • Faster than SMO, more sophisticated options
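
  A minimal sketch of training Weka's SMO support vector classifier from Java; the dataset path is an assumption, and SMO builds pairwise binary models when the class has more than two values:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.SMO;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;

    public class RunSMO {
        public static void main(String[] args) throws Exception {
            Instances data = DataSource.read("iris.arff");   // assumed dataset
            data.setClassIndex(data.numAttributes() - 1);

            SMO svm = new SMO();    // functions > SMO in the Weka Explorer
            svm.setC(1.0);          // complexity (regularization) parameter

            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(svm, data, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }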

  40. Lesson 4.6 Ensemble Learning
