Margin-Based Active Learning by Maria-Florina Balcan at Carnegie Mellon University

This talk covers the use of unlabeled data in learning, motivated by applications such as web page and document classification at Yahoo! Research. Maria-Florina Balcan reviews semi-supervised passive learning and active learning, where the learner chooses which labels to request so that fewer labeled examples are needed. The core of the talk is a class of margin-based active learning algorithms for linear separators, analyzed in the realizable case and in the bounded noise setting.


Presentation Transcript


  1. Margin-Based Active Learning. Maria-Florina Balcan, Carnegie Mellon University. Joint work with Andrei Broder & Tong Zhang, Yahoo! Research.

  2. Incorporating Unlabeled Data in the Learning Process • OCR, image classification • Web page, document classification • All the classification problems at Yahoo! Research. Unlabeled data is cheap and easy to obtain; labeled data is much more expensive.

  3. Semi-Supervised Passive Learning • Several SSL methods have been developed that use unlabeled data to improve performance, e.g.: • Transductive SVM [Joachims ’98] • Co-training [Blum & Mitchell ’98] • Graph-based methods [Blum & Chawla ’01] • Unlabeled data allows the learner to focus on a priori reasonable classifiers. See Avrim’s talk at the “Open Problems” session.

  4. Active Learning • The learner can choose specific examples to be labeled: the learner works harder in order to use fewer labeled examples. Setting • P is a distribution over X × Y; C is the hypothesis class. • Get a set of unlabeled examples from P_X. • Interactively request labels of any of these examples. Goal: find h with small error over P while minimizing the number of label requests. This talk: linear separators.

  5. Can Adaptive Querying Help? [CAL ’92, Dasgupta ’04] • C = {linear separators in R^1}, realizable case. Active setting: O(log 1/ε) labels suffice to find an ε-accurate threshold, an exponential improvement in sample complexity. • In general, the number of queries needed depends on C and P. • C = {linear separators in R^2}: for some target hypotheses no improvement can be achieved; learning to accuracy ε requires 1/ε labels. [Figure: hypotheses h0, h1, h2, h3]
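The threshold example on this slide can be illustrated with a binary search over a pool of unlabeled points. This is a minimal sketch, not code from the talk: the function name `learn_threshold`, the pool size of 1000, the uniform distribution on [0, 1], and the target threshold 0.37 are all assumptions chosen for illustration.

```python
import random


def learn_threshold(pool, oracle):
    """Binary-search a pool of unlabeled points for a threshold.

    Assumes the realizable case: oracle(x) returns -1 for points left of
    the true threshold and +1 for points at or right of it.
    Returns (estimated_threshold, number_of_label_requests).
    """
    pool = sorted(pool)
    lo, hi = 0, len(pool)  # index of the first +1 point lies in [lo, hi]
    labels_used = 0
    while lo < hi:
        mid = (lo + hi) // 2
        labels_used += 1           # each oracle call is one label request
        if oracle(pool[mid]) > 0:
            hi = mid               # first +1 point is at mid or to its left
        else:
            lo = mid + 1           # first +1 point is strictly to the right
    estimate = pool[lo] if lo < len(pool) else 1.0
    return estimate, labels_used


# Illustrative run (all values are assumptions for the demo).
random.seed(0)
true_threshold = 0.37
pool = [random.random() for _ in range(1000)]
estimate, labels_used = learn_threshold(
    pool, lambda x: 1 if x >= true_threshold else -1
)
print(estimate, labels_used)
```

With 1000 unlabeled points the search requests at most 10 labels, whereas labeling the whole pool would cost 1000. The unlabeled pool still has to be large (on the order of 1/ε points) for the located threshold to be ε-accurate; only the label cost is logarithmic.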

  6. When Active Learning Helps. In general, the number of queries needed depends on C and P. Setting: C = homogeneous linear separators in R^d, P_X = uniform distribution over the unit sphere. • Realizable case: O(d log 1/ε) labels to find a hypothesis with error ε [Freund et al. ’97; Dasgupta, Kalai, Monteleoni ’05]. • Agnostic case: with low noise, O(d^2 log 1/ε) labels to find a hypothesis with error ε, using the A^2 algorithm [Balcan, Beygelzimer, Langford ’06] [Hanneke ’07].

  7. An Overview of Our Results. We analyze a class of margin-based active learning algorithms for learning linear separators. • For C = homogeneous linear separators in R^d and P_X = uniform distribution over the unit sphere, we get an exponential improvement in the realizable case. • The analysis extends naturally to the bounded noise setting. • We get dimension-independent bounds when we have a good margin distribution.

  8. Margin Based Active-Learning, Realizable Case. Algorithm:
     • Draw m_1 unlabeled examples, label them, add them to W(1).
     • Iterate k = 2, …, s:
       - Find a hypothesis w_{k-1} consistent with W(k-1).
       - Set W(k) = W(k-1).
       - Sample m_k unlabeled examples x satisfying |w_{k-1} · x| ≤ γ_{k-1}.
       - Label them and add them to W(k).
     • End iterate.
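The loop on this slide can be sketched in a few lines. The version below is a toy illustration under loudly stated assumptions, not the authors' implementation: it works in d = 2 only (so that a consistent hypothesis can be found exactly by bisecting an angular range), uses a constant m_k = m, and uses a hypothetical halving schedule γ_{k-1} = sin(π / 2^(k-1)), since the schedule from the talk is not preserved in this transcript. All function names are made up for the example.

```python
import math
import random


def sample_circle(rng):
    """Draw a point uniformly from the unit circle (d = 2)."""
    a = rng.uniform(-math.pi, math.pi)
    return (math.cos(a), math.sin(a))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]


def consistent_hypothesis(labeled):
    """Return a unit vector w with y * (w . x) > 0 for every (x, y).

    2-D only: in the realizable case the flipped points z = y * x all lie
    in an open half-plane, so the bisector of their angular range works.
    """
    zs = [(y * x[0], y * x[1]) for x, y in labeled]
    ref = math.atan2(sum(z[1] for z in zs), sum(z[0] for z in zs))
    offsets = []
    for z in zs:
        d = math.atan2(z[1], z[0]) - ref
        offsets.append(math.atan2(math.sin(d), math.cos(d)))  # wrap angle
    mid = ref + (min(offsets) + max(offsets)) / 2.0
    return (math.cos(mid), math.sin(mid))


def margin_based_active_learning(oracle, rng, m=20, s=6):
    """Toy version of the loop; m and the gamma schedule are assumptions."""
    W = [(x, oracle(x)) for x in (sample_circle(rng) for _ in range(m))]
    for k in range(2, s + 1):
        w_prev = consistent_hypothesis(W)          # w_{k-1}
        gamma = math.sin(math.pi / 2 ** (k - 1))   # hypothetical gamma_{k-1}
        added = 0
        while added < m:
            x = sample_circle(rng)                 # unlabeled draw, free
            if abs(dot(w_prev, x)) <= gamma:       # inside the margin band
                W.append((x, oracle(x)))           # one label request
                added += 1
    return consistent_hypothesis(W), W


# Illustrative run with an assumed target separator w_star.
w_star = (math.cos(1.0), math.sin(1.0))
rng = random.Random(0)
w_hat, W = margin_based_active_learning(
    lambda x: 1 if dot(w_star, x) >= 0 else -1, rng
)
error = math.acos(max(-1.0, min(1.0, dot(w_hat, w_star)))) / math.pi
print(len(W), error)
```

Only points that fall inside the band around the current separator cost a label; points outside it are discarded unlabeled. The run requests 120 labels in total (6 rounds of 20).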

  9. Margin Based Active-Learning, Realizable Case • Draw m_1 unlabeled examples, label them, add them to W(1). • Iterate k = 2, …, s: find a hypothesis w_{k-1} consistent with W(k-1); set W(k) = W(k-1); sample m_k unlabeled examples x satisfying |w_{k-1}^T · x| ≤ γ_{k-1}; label them and add them to W(k). [Figure: successive hypotheses w_1, w_2, w_3 with margin bands γ_1, γ_2]

  10. u (u,v) v v u v  Margin Based Active-Learning, Realizable Case Theorem PX is uniform over Sd. If and then after iterations ws has error ·. Fact 1 Fact 2 If and Maria-Florina Balcan

  11. Margin Based Active-Learning, Realizable Case • Iterate k = 2, …, s: find a hypothesis w_{k-1} consistent with W(k-1); set W(k) = W(k-1); sample m_k unlabeled examples x satisfying |w_{k-1}^T · x| ≤ γ_{k-1}; label them and add them to W(k). Proof Idea. Induction: all w consistent with W(k) have error ≤ 1/2^k; so w_k has error ≤ 1/2^k. For [formula missing] ≤ 1/2^{k+1}. [Figure: w, w_{k-1}, w* and the margin band γ_{k-1}]

  12. Proof Idea. Under the uniform distribution, for [formula missing] ≤ 1/2^{k+1}. [Figure: w, w_{k-1}, w* and the margin band γ_{k-1}]

  13. Proof Idea. Under the uniform distribution, for [formula missing] ≤ 1/2^{k+1}. Enough to ensure [formula missing]. Can do with only [formula missing] labels. [Figure: w, w_{k-1}, w* and the margin band γ_{k-1}]

  14. Realizable Case, Suboptimal Alternative. Could imagine: [formula missing] zero. Suboptimal: need [formula missing], so [formula missing] and [formula missing] labels to find a hypothesis with error ε. Similar to [CAL ’92, BBL ’06, H ’07]. [Figure: w, w_{k-1}, w*]

  15. Margin Based Active-Learning, Non-realizable Case. Guarantee: Assume P_X is uniform over S^d. Assume that |P(Y=1|x) - P(Y=-1|x)| ≥ β for all x. Assume w* is the Bayes classifier. Then [formula missing]. The previous algorithm and proof extend naturally, and again give an exponential improvement.
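The bounded noise condition on this slide can be made concrete with a simulated label oracle. The sketch below is an illustration under assumptions, not part of the talk: it uses the simplest noise model satisfying the condition, in which every label agrees with sign(w* · x) with probability (1 + β)/2, so that |P(Y=1|x) - P(Y=-1|x)| equals β exactly and w* is the Bayes classifier. The name `make_noisy_oracle`, the value β = 0.4, and the test point are hypothetical.

```python
import random


def make_noisy_oracle(w_star, beta, rng):
    """Label oracle with P(Y = sign(w_star . x) | x) = (1 + beta) / 2.

    For 0 < beta <= 1 this gives |P(Y=1|x) - P(Y=-1|x)| = beta for all x,
    and the Bayes-optimal prediction at x is sign(w_star . x).
    """
    def oracle(x):
        clean = 1 if sum(a * b for a, b in zip(w_star, x)) >= 0 else -1
        keep = rng.random() < (1.0 + beta) / 2.0
        return clean if keep else -clean
    return oracle


# Empirical check of the noise gap at one (assumed) point x.
rng = random.Random(0)
beta = 0.4
oracle = make_noisy_oracle((1.0, 0.0), beta, rng)
x = (0.3, -0.9)
labels = [oracle(x) for _ in range(20000)]
gap = abs(sum(labels)) / len(labels)   # estimates |P(Y=1|x) - P(Y=-1|x)|
print(gap)
```

Because individual labels can now be wrong, a hypothesis consistent with all labeled examples may not exist, so the "find a consistent hypothesis" step has to be replaced by something noise-tolerant, for example minimizing empirical error on the labeled set.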

  17. Summary • We analyze a class of margin-based active learning algorithms for learning linear separators. Open Problems • Analyze a wider class of distributions, e.g. log-concave. • Characterize the right sample complexity terms for the active learning setting.

  19. Thank you! Also, special thanks to Alina Beygelzimer, Sanjoy Dasgupta, and John Langford for useful discussions.
