1 / 21

Ensemble Methods

Ensemble Methods. Construct a set of classifiers from the training data Predict class label of previously unseen records by aggregating predictions made by multiple classifiers In Olympic Ice-Skating you have multiple judges? Why?. General Idea. Why does it work?.

tyronek
Download Presentation

Ensemble Methods

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Ensemble Methods • Construct a set of classifiers from the training data • Predict class label of previously unseen records by aggregating predictions made by multiple classifiers • In Olympic Ice-Skating you have multiple judges? Why?

  2. General Idea

  3. Why does it work? • Suppose there are 25 base classifiers • Each classifier has error rate,  = 0.35 • Assume classifiers are independent • Probability that the ensemble classifier makes a wrong prediction: • Practice has shown that even when independence does not hold results are good

  4. Methods for generating Multiple Classifiers • Manipulate the training data • Sample the data differently each time • Examples: Bagging and Boosting • Manipulate the input features • Sample the featurres differently each time • Makes especially good sense if there is redundancy • Example: Random Forest • Manipulate the learning algorithm • Vary some parameter of the learning algorithm • E.g., amount of pruning, ANN network topology, etc. • Use different learning algorithms

  5. Background • Classifier performance can be impacted by: • Bias: assumptions made to help with generalization • "Simpler is better" is a bias • Variance: a learning method will give different results based on small changes (e.g., in training data). • When I run experiments and use random sampling with repeated runs, I get different results each time. • Noise: measurements may have errors or the class may be inherently probabilistic

  6. How Ensembles Help • Ensemble methods can assist with the bias and variance • Averaging the results over multiple runs will reduce the variance • I observe this when I use 10 runs with random sampling and see that my learning curves are much smoother • Ensemble methods especially helpful for unstable classifier algorithms • Decision trees are unstable since small changes in the training data can greatly impact the structure of the learned decision tree • If you combine different classifier methods into an ensemble, then you are using methods with different biases • You are more likely to use a classifier with a bias that is a good match for the problem • You may even be able to identify the best methods and weight them more

  7. Examples of Ensemble Methods • How to generate an ensemble of classifiers? • Bagging • Boosting • These methods have been shown to be quite effective • A technique ignored by the textbook is to combine classifiers built separately • By simple voting • By voting and factoring in the reliability of each classifier

  8. Bagging • Sampling with replacement • Build classifier on each bootstrap sample • Each sample has probability (1 – 1/n)n of being selected (about 63% for large n) • Some values will be picked more than once • Combine the resulting classifiers, such as by majority voting • Greatly reduces the variance when compared to a single base classifier

  9. Boosting • An iterative procedure to adaptively change distribution of training data by focusing more on previously misclassified records • Initially, all N records are assigned equal weights • Unlike bagging, weights may change at the end of boosting round

  10. Boosting • Records that are wrongly classified will have their weights increased • Records that are classified correctly will have their weights decreased • Example 4 is hard to classify • Its weight is increased, therefore it is more likely to be chosen again in subsequent rounds

  11. The Netflix Prize

  12. Netflix Prize Video • https://www.youtube.com/watch?v=ImpV70uLxyw

  13. Netflix • Netflix is a subscription-based movie and television show rental service that offers media to subscribers: • Physically by mail • Over the internet • Has a catalog of over 100,000 movies and television shows • Subscriber base of over 10 million

  14. Recommendations • Netflix offers users the ability to rate movies and television shows that they have seen • Depending on those ratings, Netflix provides recommendations of movies and television shows that the subscriber would like to watch • These recommendations are based on an algorithm called Cinematch

  15. Cinematch • Uses straightforward statistical linear models with a lot of data conditioning • This means that the more a subscriber rates, the more accurate the recommendations will become

  16. Netflix Prize • Competition for the best collaborative filtering algorithm to predict user ratings for movies and television shows, based on previous ratings • Offered a $1 million prize to the team who could improve Cinematch’s accuracy by 10% • Awarded a $50,000 progress prize for the team who makes the most progress for each year before the 10% mark was reached • The contest started on October 2, 2006 and would run until at least October 2, 2011, depending on when a winner was chosen

  17. Metrics • The accuracy of the algorithms was measured by using root mean square error, or RMSE • Chosen because it is a well-known, single value that can account for and amplify the contributions of errors such as false positives and false negatives

  18. Metrics • Cinematch scored 0.9525 on the test subset • The winning team needed to score at least 10% lower, with an RMSE of 0.8563

  19. Results • The contest ended on June 26, 2009 • The threshold was broken by the teams “BellKor's Pragmatic Chaos” and “The Ensemble”, both achieving a 10.06% improvement over Cinematch, with an RMSE of 0.8567 • “BellKor's Pragmatic Chaos” won the prize due to the team submitting their results 20 minutes before “The Ensemble”

  20. Netflix Prize Sequel • Due to the success of their contest, Netflix announced another contest to further improve their recommender system • Unfortunately, it was discovered that the anonymized customer data that they provided to the contestants could actually be used to identify individual customers • This, combined with a resulting investigation by the FTC and a lawsuit, led Netflix to cancel their sequel

  21. Sources • http://blog.netflix.com/2010/03/this-is-neil-hunt-chief-product-officer.html • http://www.netflixprize.com • http://www.nytimes.com/2010/03/13/technology/13netflix.html?_r=1

More Related