Active Learning 02-750

Active Learning 02-750
Jaime Carbonell, Language Technologies Institute, Carnegie Mellon University
www.cs.cmu.edu/~{jgc | pinard | jinruih | vamshi}
27 September 2010

Active Learning
• Training data:
• Special case:
• Functional space:
• Fitness Criterion:
  • a.k.a. loss function
• Sampling Strategy:
Jaime G. Carbonell, Language Technologies Institute

Cost Sensitive Active Learning (pp37-39 Settles)
• Suppose not all instances cost the same to label
  • Cytoplasmic vs membrane proteins for structure prediction via X-ray crystallography
  • Books vs web pages for topic labels
  • Near-misses vs clear examples
• Suppose labelers vary in costs
  • Crystallography vs MRI for protein structures
  • Linguists vs Turkers for Machine Translation
• How to cope with cost-accuracy tradeoffs?
  • Proactive learning (coming later)
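One simple way to fold labeling cost into instance selection is to rank candidates by uncertainty per unit cost. The sketch below is only an illustration under stated assumptions: the `entropy` and `pick_query` helpers, the pool entries, and all probabilities and costs are hypothetical and do not come from the slides or from Settles.

```python
import math

def entropy(p):
    """Binary entropy (in bits) of a predicted positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def pick_query(pool):
    """Return the id of the instance with the highest uncertainty per unit cost.

    pool: list of (instance_id, predicted_probability, labeling_cost) tuples.
    """
    return max(pool, key=lambda item: entropy(item[1]) / item[2])[0]

# Hypothetical pool: the most uncertain instance is also the most expensive.
pool = [
    ("membrane_protein", 0.50, 10.0),
    ("cytoplasmic_protein", 0.70, 1.0),
    ("clear_example", 0.99, 1.0),
]
best = pick_query(pool)
print(best)
```

With these made-up numbers the cheaper, moderately uncertain instance wins over the maximally uncertain but expensive one, which is the tradeoff the slide points at.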

Active Learning Beyond Instances
• Active Class Selection (p33 Settles)
  • Given a class, query instances thereof
  • Typical vs boundary instances
• Active Feature Selection
  • Query values of features across many instances
  • Enables meaningful “batch” experiments
  • Generalized to Instance-Feature matrix
• Active Clustering (p33-34 Settles)
  • Semi-supervised: new classes can spawn
  • Subsampling for effective unsupervised clustering

Batch-Mode Active Learning
• Why would we want Q-batch vs Q-1?
  • Amortize experimental setup
  • Keep human labeler efficiently busy
  • “Staleness” vs utilization (Ringer, 2010)
  • Crowd sourcing → parallelizable AL
• How do we select batches? (pp 35-36 Settles)
  • Instance diversity in batch as part of sampling (Brinker 2003, Donmez & Carbonell, 2008)
  • Modular and submodular functions (Hoi 2006)
  • Need a joint optimization criterion
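To make "instance diversity in batch" concrete, here is a minimal greedy sketch that scores each candidate by a weighted sum of its uncertainty and its distance to the nearest instance already in the batch. This is an assumed, simplified joint criterion for illustration; it is not the method of Brinker (2003), Donmez & Carbonell (2008), or Hoi (2006), and the function name, `trade_off` parameter, and data are hypothetical.

```python
import math

def greedy_diverse_batch(points, uncertainty, batch_size, trade_off=0.5):
    """Greedily build a batch that balances uncertainty against diversity.

    points: list of coordinate tuples; uncertainty: one score per point.
    trade_off: weight on uncertainty (1 - trade_off goes to diversity).
    """
    selected = []
    candidates = list(range(len(points)))
    while candidates and len(selected) < batch_size:
        def score(i):
            if not selected:
                # First pick: nothing to be diverse from, use uncertainty only.
                return uncertainty[i]
            nearest = min(math.dist(points[i], points[j]) for j in selected)
            return trade_off * uncertainty[i] + (1 - trade_off) * nearest
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical pool: points 0 and 1 are near-duplicates with high uncertainty.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (0.9, 1.0)]
uncertainty = [0.9, 0.85, 0.6, 0.2]
batch = greedy_diverse_batch(points, uncertainty, batch_size=2)
print(batch)
```

Picking the top two by uncertainty alone would return the two near-duplicate points; the diversity term instead pulls the second query from the other region of the space.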

Noisy Labelers or Experiments (pp37-39 Settles)
• Labeling noise → version-space learning flawed
  • E.g. cannot apply SVM shrinking-margin
  • Underlying ML algorithm must be noise resistant
• Reducing noisy labels if p(correct) > 0.5
  • Repeated labeling (if random noise)
  • Majority vote (if semi-independent labelers)
  • Tradeoffs in repeat vs new labels
  • Cost vs accuracy tradeoffs
• What if the labeler accuracy is not known?
  • Learn/estimate labeler accuracy as part of AL
  • → Proactive Learning (later class)
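The p(correct) > 0.5 condition can be checked with a short calculation. The sketch below assumes fully independent labelers who all share the same accuracy p (a stronger assumption than the "semi-independent" case on the slide) and an odd number of labels n; the helper names are hypothetical.

```python
from collections import Counter
from math import comb

def majority_vote(labels):
    """Return the most frequent label among repeated labels for one instance."""
    return Counter(labels).most_common(1)[0][0]

def majority_accuracy(p, n):
    """Probability that the majority of n independent labels is correct.

    Assumes binary labels, n odd, and each label correct with probability p.
    """
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )

print(majority_vote(["A", "B", "A"]))
print(majority_accuracy(0.7, 3))   # better than a single labeler at 0.7
print(majority_accuracy(0.4, 3))   # worse than a single labeler at 0.4
```

Under these assumptions, repeating labels helps only when p > 0.5; below that threshold the majority vote is less accurate than a single label, so the extra labeling cost buys nothing.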

Readings
• Settles, B. “Comprehensive Survey of AL” http://www.cs.cmu.edu/~bsettles/pub/settles.activelearning.pdf
• Donmez, P., Carbonell, J. and Bennett, P. “Dual-Strategy Active Learning” http://www.cs.cmu.edu/~jgc/publication/Dual_Strategy_ECML_2007.pdf
• Cohn, Ghahramani and Jordan, “Active Learning with Statistical Models” http://dspace.mit.edu/bitstream/handle/1721.1/7192/AIM-1522.pdf;jsessionid=13C2A9BF0DEC1567B9CA33F0C43BC3C3?sequence=2

THANK YOU!
