
Multiple Instance Learning with Query Bags



Presentation Transcript


  1. Multiple Instance Learning with Query Bags Boris Babenko, Piotr Dollar, Serge Belongie [In prep. for ICML’09 – feedback appreciated!]

  2. Outline • Multiple Instance Learning (MIL) Review • Typical MIL Applications • Query Bag Model for MIL • Filtering Strategies • Conclusion

  3. Outline • Multiple Instance Learning (MIL) Review • Typical MIL Applications • Query Bag Model for MIL • Filtering Strategies • Conclusion

  4. Multiple Instance Learning (MIL) • Ambiguity in training data • Instead of instance/label pairs, get bag of instances/label pairs • Bag is positive if one or more of its members is positive

  5. Multiple Instance Learning (MIL) • Supervised Learning Training Input: {(x_1, y_1), ..., (x_n, y_n)} • MIL Training Input: {(X_1, y_1), ..., (X_n, y_n)}, where each bag X_i = {x_i1, ..., x_im} • Goal: learn an instance classifier

  6. MIL Assumptions • Bags are predefined/fixed & finite (size m) • Bag label determined by: y_i = max_j y_ij (a bag is positive iff some instance is) • Typical assumption: instances all drawn i.i.d. • Refer to this as a classical bag
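
A minimal sketch of this classical bag model (the helper names sample_instance and label_instance are hypothetical, not from the paper):

```python
def make_classical_bag(sample_instance, label_instance, m):
    """Draw a classical MIL bag: m instances sampled i.i.d.;
    the bag is positive iff at least one instance is positive."""
    instances = [sample_instance() for _ in range(m)]
    labels = [label_instance(x) for x in instances]
    bag_label = max(labels)  # y_i = max_j y_ij
    return instances, bag_label
```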

  7. MIL Theory • Best known PAC bound due to Blum et al. • d = dimensionality • m = bag size • ε = the desired error • Problem is harder for larger bags: sample complexity grows with m • Result relies on the i.i.d. assumption

  8. Outline • Multiple Instance Learning (MIL) Review • Typical MIL Applications • Query Bag Model for MIL • Filtering Strategies • Conclusion

  9. MIL Applications • Most MIL applications: bags are generated by breaking an object into many overlapping pieces. • Let’s see some examples…

  10. Vision • Image known to contain object, but precise location unknown [Andrews et al. 02, Viola et al. 05]

  11. Audio • Audio wave can be broken up spatially or in the frequency domain [Saul et al. 01]

  12. Biological Sequences • Known to contain short subsequence of interest ACTGTGTGACATGTAGC { ACTG, CTGT, TGTG…} … [Ray et al. 05]

  13. Text • Text document broken down into smaller pieces [Andrews et al. 02]

  14. Observations • Sliding windows: bags are large/infinite. • In practice, the bag is sub-sampled • Could violate the MIL assumption: sub-sampling may drop every positive instance from a positive bag! • Instances of a bag are not independent – they often lie on a low-dim. manifold (e.g. image patches)

  15. Outline • Multiple Instance Learning (MIL) Review • Typical MIL Applications • Query Bag Model for MIL • Filtering Strategies • Conclusion

  16. Query Bags for MIL • Bag not fixed – can query an oracle to get an arbitrary number of instances • Each query bag is represented by an object o_i • To retrieve instances, use a query function q_i(θ) with location parameter θ

  17. Query Bags for MIL • Instances often lie on a low-dim. manifold • Can query for nearby instances

  18. Query Bags for MIL • Can express the bag as X_i = {q_i(θ) : θ ∈ Θ} • Define the bag label as y_i = max_{θ ∈ Θ} y(q_i(θ))
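
As a rough illustration of this interface (class and method names are our own assumptions, not the paper's API), a query bag can be stored implicitly as an object plus its query function, with instances generated on demand rather than enumerated up front:

```python
class QueryBag:
    """A bag defined implicitly by a query function q_i(theta):
    instances are generated on demand for any location theta."""
    def __init__(self, obj, query_fn, bag_label):
        self.obj = obj            # underlying object, e.g. a large image
        self.query_fn = query_fn  # q_i: location theta -> instance
        self.label = bag_label    # bag-level label y_i

    def query(self, theta):
        return self.query_fn(self.obj, theta)

    def sample(self, thetas):
        # Retrieve a finite sub-bag at the given locations.
        return [self.query(t) for t in thetas]
```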

  19. Distribution of locations • Assume for each bag there is some distribution p_i(θ) over locations (known or unknown) • Could provide some prior information. • Let θ ~ p_i(θ); how informative is p_i(θ)?

  20. Query Bag Size • To determine the bag label with high confidence, need to sample enough instances m • Bigger bag = better: less chance of missing the correct positive instance • Note the difference between query bags and classical bags
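
One way to make the required bag size concrete (a back-of-the-envelope bound of ours, not a result from the paper): if a fraction p of a positive bag's locations yield positive instances and m locations are sampled independently, then requiring the chance of missing every positive instance to be at most δ gives

```latex
(1-p)^m \le \delta
\quad\Longrightarrow\quad
m \;\ge\; \frac{\log\delta}{\log(1-p)} \;\approx\; \frac{1}{p}\log\frac{1}{\delta}.
```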

  21. Example: Line Bags • Instances of a bag lie on a line.

  22. Example: Hypercube Bags • Instances of a bag lie in a hypercube
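
A hedged sketch of how these two synthetic bag types might be generated (our own construction, for illustration only):

```python
import numpy as np

def line_bag_query(x0, direction):
    """Line bag: instances lie on a line through x0; the scalar
    location theta picks a point along the direction vector."""
    return lambda theta: x0 + theta * direction

def hypercube_bag_query(x0, radius):
    """Hypercube bag: instances fill an axis-aligned cube around x0;
    theta is an offset vector clipped to [-radius, radius]."""
    return lambda theta: x0 + np.clip(theta, -radius, radius)
```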

  23. Example: Image Translation Bags • Let o_i be a large image and q_i(θ) the patch centered at location θ • Could easily extend this to rotations, scale changes, etc.
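
For example, a translation bag's query function might be a simple patch extractor (the patch size and centering convention are assumptions for this sketch):

```python
import numpy as np

def patch_query(image, theta, patch_size=24):
    """Query function for an image translation bag: return the
    square patch centered at location theta = (row, col)."""
    r, c = theta
    h = patch_size // 2
    return image[r - h:r + h, c - h:c + h]

image = np.random.rand(256, 256)       # the object o_i
patch = patch_query(image, (100, 50))  # one instance of the bag
```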

  24. Experiments • Goal: compare behavior of synthetic classical bags and query bags to real dataset (MNIST). • Use MILBoost (Viola et al. ’05). • Expect qualitatively similar results for other MIL algorithms. • For query bags, subsample instances

  25. Results

  26. Experiment: Variance • How does the distribution p_i(θ) affect error? • Repeat the Line Bag experiment, increasing the variance of p_i(θ) – this spreads points out along the line.

  27. Observations • PAC results not applicable to query bags – performance increases as bag size m increases. • MNIST results closely resemble the synthetic query bag examples. • Need a computational strategy for dealing with large bags. • Take advantage of relationships between instances.

  28. Outline • Multiple Instance Learning (MIL) Review • Typical MIL Applications • Query Bag Model for MIL • Filtering Strategies • Conclusion

  29. MILBoost Review • Train a strong classifier H(x) = Σ_t α_t h_t(x) (just like AdaBoost) • Optimize the log likelihood of bags: L = Σ_i [y_i log p_i + (1 – y_i) log(1 – p_i)], where p_i = 1 – Π_j (1 – p_ij) (noisy-OR) and p_ij = σ(H(x_ij)) • Use Gradient Boosting (Friedman ’01) • In each iteration add the weak classifier h closest to the functional gradient ∂L/∂H
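
A compact sketch of the noisy-OR likelihood and the resulting instance weights, following the Viola et al. formulation (function names are ours):

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def bag_probability(scores):
    """Noisy-OR bag probability p_i = 1 - prod_j (1 - p_ij),
    where p_ij = sigmoid(H(x_ij)) and `scores` holds H(x_ij)."""
    p_inst = sigmoid(np.asarray(scores, dtype=float))
    return 1.0 - np.prod(1.0 - p_inst), p_inst

def instance_weights(scores, y_bag):
    """Boosting weights w_ij = dL/dH(x_ij) = p_ij * (y_i - p_i) / p_i;
    the next weak classifier is fit to these signed weights."""
    p_bag, p_inst = bag_probability(scores)
    p_bag = max(p_bag, 1e-12)  # guard against p_i = 0
    return p_inst * (y_bag - p_bag) / p_bag
```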

  30. MILBoost w/ Query Bags • Bag probability over all instances: p_i = 1 – Π_{θ ∈ Θ} (1 – p(q_i(θ))) • In practice, subsample the bag: draw locations θ_1, ..., θ_m and use p_i ≈ 1 – Π_j (1 – p(q_i(θ_j))) • Could subsample once in the beginning, or do something more clever…

  31. Filtering Strategies • Recently, Bradley & Schapire proposed FilterBoost, which learns from a continuous stream of data. • Alternates between training a weak classifier and querying an oracle for more data. • Apply this idea to MILBoost

  32. Filtering Strategies • Want the highest probability instances • Parameters: • T = number of boosting iterations • R = number of instances to evaluate • F = frequency of filtering

  33. Filtering Strategies: RAND & MEM • Random Sampling (RAND) • Query R instances, keep the best m • Memory (MEM) • Query R new instances, combine with the previously kept ones, keep the best m
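
In the notation of the QueryBag sketch above, RAND and MEM might look like this (sample_locations and score_fn are assumed helpers: a sampler for locations θ and the current classifier's instance probability):

```python
import numpy as np

def rand_filter(bag, sample_locations, score_fn, R, m):
    """RAND: query R fresh instances and keep the m highest-scoring
    ones under the current classifier."""
    instances = bag.sample(sample_locations(R))
    scores = np.array([score_fn(x) for x in instances])
    keep = np.argsort(-scores)[:m]
    return [instances[i] for i in keep]

def mem_filter(bag, sample_locations, score_fn, R, m, kept):
    """MEM: query R new instances, pool them with the previously
    kept ones, and again retain the best m."""
    pool = list(kept) + bag.sample(sample_locations(R))
    scores = np.array([score_fn(x) for x in pool])
    keep = np.argsort(-scores)[:m]
    return [pool[i] for i in keep]
```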

  34. Filtering Strategies: SRCH • Search (SRCH) • Assume instances lie on a low dimensional manifold • Search for nearby locations θ′ such that p(q_i(θ′)) > p(q_i(θ)) • Test nearby locations
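
A sketch of SRCH in the same style (neighbors_fn is an assumed helper returning locations adjacent to θ on the manifold, e.g. one-pixel shifts for translation bags):

```python
def srch_filter(bag, kept_thetas, score_fn, neighbors_fn, m):
    """SRCH: also score locations adjacent to the currently kept
    ones, exploiting the manifold structure, then retain the best m."""
    candidates = list(kept_thetas)
    for theta in kept_thetas:
        candidates.extend(neighbors_fn(theta))  # nearby locations
    candidates.sort(key=lambda t: score_fn(bag.query(t)), reverse=True)
    return candidates[:m]
```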

  35. MNIST Filtering Experiments • Turn SRCH and MEM on and off. • Sweep through each parameter (defaults in parentheses): • R = sampling amount (16) • m = bag size (4) • F = sampling frequency (1) • T = boosting iterations (64)

  36. MNIST Filtering Exp: m • Filtering converges w/ smaller memory usage

  37. MNIST Filtering Exp: R & F • MEM is very effective • SRCH helps when MEM is OFF, not as big of a difference when MEM is ON

  38. MNIST Filtering Exp: T • w/o MEM, filtering does not converge • Positive region becomes sparse

  39. Why MEM Works • Let L^(t) be the log likelihood computed with the instances kept at iteration t • Can show (for a fixed classifier H) that L^(t+1) ≥ L^(t) • Using MEM, we add R new instances per bag in each iteration, so the kept set only improves • In reality H is not fixed; hard to show convergence.
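
Spelling out the argument (our reconstruction of the missing formulas): for a positive bag, MEM only ever swaps a kept instance for a higher-probability one, so with H fixed the noisy-OR probability of the retained sub-bag, and hence its likelihood term, never decreases; being monotone and bounded above, it must converge:

```latex
\mathcal{L}^{(t+1)} \;\ge\; \mathcal{L}^{(t)} \quad (\text{fixed } H),
\qquad
\mathcal{L}^{(t)} \le 0
\;\;\Longrightarrow\;\;
\mathcal{L}^{(t)} \text{ converges.}
```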

  40. Outline • Multiple Instance Learning (MIL) Review • Typical MIL Applications • Query Bag Model for MIL • Filtering Strategies • Conclusion

  41. Summary • Current assumptions for MIL are not appropriate for typical MIL applications. • We proposed the query bag model, which fits real data better. • For query bags, sampling more instances is better. • We proposed some simple strategies for dealing with large/infinite query bags.

  42. Future Work • Develop more theory for the query bag model. • Experiments with other domains (audio, bioinformatics). • MCL (Multiple Component Learning) – learning pedestrian parts automatically.

  43. Questions?

  44. Filtering Query Bags

  45. MILBoost with Filtering
