1 / 75

Hierarchy and Reusability in Image Analysis Stuart Geman Eran Borenstein, Ya Jin, Wei Zhang

Hierarchy and Reusability in Image Analysis Stuart Geman Eran Borenstein, Ya Jin, Wei Zhang. Remarks on Computer Vision Approaches Bayesian Image Analysis Probability Models Demonstration System: Reading License Plates Generalization: Face Detection. Remarks on Computer Vision

Download Presentation

Hierarchy and Reusability in Image Analysis Stuart Geman Eran Borenstein, Ya Jin, Wei Zhang

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Hierarchy and Reusability in Image AnalysisStuart GemanEran Borenstein, Ya Jin, Wei Zhang

  2. Remarks on Computer Vision • Approaches • Bayesian Image Analysis • Probability Models • Demonstration System: Reading License Plates • Generalization: Face Detection

  3. Remarks on Computer Vision • Vision is hard • Why is vision hard? • Approaches • Bayesian Image Analysis • Probability Models • Demonstration System: Reading License Plates • Generalization: Face Detection

  4. License plate images from Logan Airport Machines still can’t reliably read license plates

  5. Wafer ID’s Machines can’t read fixed-font fixed-scale characters as well as humans

  6. Super Bowl Machines can’t find the bad guys at the Super Bowl

  7. same Empire style table twins Instantiation Vision is content sensitive

  8. Human Interactive Proofs “Clutter” Background is structured, and made of the same stuff!

  9. Remarks on Computer Vision • Approaches • Pure learning • Fodor & Pylyshyun, 1988, and the critique of neural networks • Observations from the cognitive, neural, and mathematical sciences • Bayesian Image Analysis • Probability Models • Demonstration System: Reading License Plates • Generalization: Face Detection

  10. Pure learning • Google: • 2 billion images • train classifier: pornographic/not pornographic • good classification • not nearly as good as human performance

  11. Pure learning • Human learning: • 1 sample/10 seconds • 16 hours/day • 80 years • < 170 million samples/lifetime Enough examples? Did evolution have enough examples? D. Geman: “The interesting limit is N goes to zero, not N goes to infinity”

  12. Fodor & Pylyshyun, 1988, and the critique of neural networks Properties of human cognition: * compositionality roughly: representation through syntactically constrained hierarchy of reusable parts * productivity roughly: capable of an infinite number of well- formed actions, thoughts, sentences … * systematicity roughly: invariance

  13. If is a triangle, then so are Fodor & Pylyshyun, 1988, and the critique of neural networks systematicity if ‘the boy ran home’ makes sense, then so does ‘the girl ran home’ ‘john ran home’ ‘john ran to school’ 4 4 4 44 4 44

  14. Observations from the cognitive, neural, and mathematical sciences • Human brains utilize strong representations • Damassio (simulation=perception) • Kosslyn (“resolution of the imaging debate”) • Lakeoff, Fauconnier (the role of mental imagery in language understanding)

  15. Structure: retina LGN V1 V2 V4 IT Topography: highly retinotopic little topology Receptive Field: small large Specificity: low high Invariance: low high SUMMARY: A hierarchy of less-to-more invariant representations Observations from the cognitive, neural, and mathematical sciences Consider the ventral visual pathway:

  16. Observations from the cognitive, neural, and mathematical sciences Problem: test for ‘L’ Thought experiment: what if the world is a hierarchy of reusable parts?

  17. Observations from the cognitive, neural, and mathematical sciences impossible in practice… God’s ROC curve (Neyman-Pearson Lemma):

  18. Observations from the cognitive, neural, and mathematical sciences Testing against a universal null (pragmatic):

  19. Observations from the cognitive, neural, and mathematical sciences Testing “parts” against “wholes”:

  20. 1.00 0.99 0.98 0.97 probability of detection 0.96 0.95 0.94 0.0 0.2 0.4 0.6 0.8 1.0 probability of false alarm Observations from the cognitive, neural, and mathematical sciences ROCs God’s ROC parts against wholes universal null

  21. Observations from the cognitive, neural, and mathematical sciences Take Home Message: in a compositional world • background confusions occur at subsets of objects • objects come equipped with their own background models Theorem (S. Geman, Y. Jin, W. Zhang) At any fixed probability of detection (`type I error’): & various generalizations…

  22. Remarks on Computer Vision • Approaches • Bayesian Image Analysis • Overview • Interpretations • Probability Models • Demonstration System: Reading License Plates • Generalization: Face Detection

  23. Bayesian image analysis - overview Given an image Y: find a ‘good’ interpretation I using P(I|Y)

  24. Interpretations: built from hierarchies of reusable parts e.g. animals, trees, rocks e.g. contours, intermediate objects “Bricks” e.g. linelets, curvelets, T-junctions e.g. discontinuities, gradient

  25. Hierarchy of Disjunctions of Conjunctions

  26. Hierarchy of Disjunctions of Conjunctions

  27. Hierarchy of Disjunctions of Conjunctions

  28. Hierarchy of Disjunctions of Conjunctions

  29. Hierarchy of Disjunctions of Conjunctions

  30. Hierarchy of Disjunctions of Conjunctions

  31. Hierarchy of Disjunctions of Conjunctions

  32. selected complete subgraph Interpretations Interpretation

  33. selected complete subgraph Interpretations Interpretation

  34. Remarks on Computer Vision • Approaches • Bayesian Image Analysis • Probability Models • P(I) • P(Y|I) • Demonstration System: Reading License Plates • Generalization: Face Detection

  35. Probability modeling

  36. Architecture • Every brick is • off, or • on, and selects a set of children Bricks Image

  37. Image interpretation: a complete subgraph Image interpretation Bricks Image

  38. Notation Bricks B Image

  39. Markov backbone Bricks B Image

  40. Markov backbone Is this sufficiently constrained? Bricks B Image

  41. ? Beyond Markovian distributions Compositions depend on instantiations…e.g. the positioning of parts

  42. Compositional distribution – a perturbed Markov model Bricks B Image

  43. Content-Sensitive Perturbation

  44. Content-Sensitive Perturbation

  45. Content-Sensitive Perturbation …. but perturbations interact! …. nevertheless …

  46. Probability modeling

  47. Data Model (overview) patch locations Interpretation sufficiency assumption extend to overlap pixels template iid “background”

  48. Select a terminal brick

  49. pixels covered by Select a state (e.g. left-eye brick)

  50. Image data Assume C = Corr( , ) sufficient: -1 +1 Model Template – one for each state of brick

More Related