Grammar of Image

This presentation covers the challenges of image parsing and proposes a framework based on And-Or Graphs and Stochastic Context-Free Grammar to bridge the semantic gap. The framework aims to generalize from small samples by synthesizing new configurations with Monte Carlo simulation, and to handle overlapping parts and ambiguity in images. The slides introduce image grammar and discuss its formulation, learning, and testing. They describe the Lotus Hill Institute dataset, a large annotated ground-truth collection of objects and scenes. They also show how a visual vocabulary and the relations between its elements combine texture and structure in the primal sketch to build high-level image representations.

Presentation Transcript


  1. Grammar of Image. Zhaoyin Jia, 03-30-2009

  2. Problems • Enormous amount of vision knowledge: • Computational complexity • Semantic gap …… Classification, Recognition

  3. Task of image parsing

  4. Objectives in this paper • Framework for vision • And-Or Graph • Algorithm for this framework • Top-down/bottom-up computation • Generalization from small samples • Use Monte Carlo simulation to synthesize more configurations • Fill the semantic gap

  5. Grammar • Language: co-occurrence of s is more than chance • Image: Parallel; T-junction CONSTANTINOPLE

  6. Formulation of grammar • Start symbol: S • Non-terminal nodes: V_N • Production rules: R • Terminal nodes: V_T

  8. Formulation of grammar • Start symbol: S • Non-terminal nodes: V_N • Production rules: R • Terminal nodes: V_T • Example rules: S → NP VP; VP → VP PP; VP → V NP; ……
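
The four components listed on this slide map onto a small data structure. The Python sketch below is illustrative only: the lexical rules and the words are assumptions added so that the grammar can produce sentences, and `derive` is a hypothetical helper name, not something from the slides.

```python
import random

# Toy grammar: start symbol S, non-terminals V_N, terminals V_T, rules R.
# S -> NP VP, VP -> VP PP and VP -> V NP come from the slide; every other
# rule and all the words are assumptions made for illustration.
START = "S"
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["V", "NP"], ["VP", "PP"]],
    "NP": [["the", "N"]],
    "PP": [["P", "NP"]],
    "N": [["dog"], ["park"]],
    "V": [["saw"]],
    "P": [["in"]],
}
NON_TERMINALS = set(RULES)
TERMINALS = {
    sym for alternatives in RULES.values() for rhs in alternatives for sym in rhs
} - NON_TERMINALS


def derive(symbol, rng, depth=0, max_depth=6):
    """Expand `symbol` into a list of terminal words by applying rules."""
    if symbol in TERMINALS:
        return [symbol]
    alternatives = RULES[symbol]
    if depth >= max_depth:
        # The first alternative of every rule is non-recursive here,
        # so restricting to it guarantees that the derivation ends.
        alternatives = alternatives[:1]
    rhs = rng.choice(alternatives)
    words = []
    for child in rhs:
        words.extend(derive(child, rng, depth + 1, max_depth))
    return words


sentence = derive(START, random.Random(0))
print(" ".join(sentence))
```

The recursive rule VP → VP PP is what lets a finite rule set generate an unbounded set of sentences; the depth cap is only a practical guard for sampling.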

  11. Image grammar • Start symbol: S • Production rules • Non-terminal nodes: V_N • Terminal nodes: V_T

  12. Overlapping parts/Ambiguity

  13. Overlapping parts/Ambiguity • Similar color, occlusion, etc.

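  14. Stochastic Context Free Grammar • For each non-terminal $A \in V_N$, we have production rules $A \to \beta_1 \mid \beta_2 \mid \cdots \mid \beta_{n(A)}$, with a probability associated with each one: $p(A \to \beta_i)$, where $\sum_{i=1}^{n(A)} p(A \to \beta_i) = 1$ • Probability of a parse tree: $p(pt) = \prod_{(A \to \beta) \in pt} p(A \to \beta)$ • Probability of a sentence $\omega$: $p(\omega) = \sum_{pt \text{ yielding } \omega} p(pt)$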
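
In a stochastic context-free grammar, the probability of a parse tree is the product of the probabilities of the rules it uses. A minimal Python sketch of that product follows; the rule probabilities, the words, and the name `tree_probability` are assumptions chosen for illustration.

```python
import math

# p(A -> beta) for each non-terminal A; the values are illustrative
# assumptions and sum to 1 for every left-hand side.
RULE_PROBS = {
    "S": {("NP", "VP"): 1.0},
    "VP": {("V", "NP"): 0.7, ("VP", "PP"): 0.3},
    "NP": {("dog",): 0.5, ("park",): 0.5},
    "PP": {("in", "NP"): 1.0},
    "V": {("saw",): 1.0},
}


def tree_probability(tree):
    """Multiply the probabilities of all rules used in `tree`.

    A tree is (label, [children]); a terminal is a plain string.
    """
    if isinstance(tree, str):
        return 1.0
    label, children = tree
    # The rule applied at this node is label -> (labels of the children).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = RULE_PROBS[label][rhs]
    for child in children:
        p *= tree_probability(child)
    return p


# Parse tree for "dog saw park in park", with the PP attached to the VP.
example_tree = (
    "S",
    [
        ("NP", ["dog"]),
        ("VP", [
            ("VP", [("V", ["saw"]), ("NP", ["park"])]),
            ("PP", ["in", ("NP", ["park"])]),
        ]),
    ],
)
print(tree_probability(example_tree))
```

A sentence with several valid parse trees (for example, a different PP attachment) would get the sum of the probabilities of those trees.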

  15. Stochastic Grammar with Context • From left to right: bi-gram model (Markov chain). For a sentence with n words: $p(w_1, \ldots, w_n) = p(w_1) \prod_{i=1}^{n-1} p(w_{i+1} \mid w_i)$ • Non-local relations: tree model
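
A bi-gram model scores a sentence as the probability of its first word times the probability of each later word given its predecessor. The Python sketch below estimates those probabilities by counting; the toy corpus and the function names are assumptions, and no smoothing or end-of-sentence token is used.

```python
from collections import Counter


def train_bigram(sentences):
    """Count first words, predecessor words, and adjacent word pairs."""
    first = Counter()
    predecessor = Counter()  # how often a word is followed by another word
    pair = Counter()
    for words in sentences:
        first[words[0]] += 1
        for a, b in zip(words, words[1:]):
            predecessor[a] += 1
            pair[(a, b)] += 1
    return first, predecessor, pair


def sentence_probability(words, model):
    """p(w_1) * product of p(w_{i+1} | w_i), using relative frequencies."""
    first, predecessor, pair = model
    p = first[words[0]] / sum(first.values())
    for a, b in zip(words, words[1:]):
        if predecessor[a] == 0:
            return 0.0  # the word was never seen with a successor
        p *= pair[(a, b)] / predecessor[a]
    return p


corpus = [
    ["the", "dog", "runs"],
    ["the", "cat", "runs"],
    ["a", "dog", "sleeps"],
]
model = train_bigram(corpus)
print(sentence_probability(["the", "dog", "runs"], model))
```

Because each word depends only on its left neighbour, this model cannot express the non-local relations that the tree model is meant to capture.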

  16. New issues in Image Grammar • Loss of “left to right” order: region adjacency graph
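
Image regions have no natural left-to-right order, so neighbour relations have to be stored explicitly. A minimal Python sketch of a region adjacency graph follows, assuming the segmentation is given as a 2D grid of integer region labels and that two regions are neighbours when they share a pixel edge (4-connectivity); both choices are assumptions.

```python
def region_adjacency_graph(labels):
    """Return {region: set of neighbouring regions} for a 2D label grid."""
    graph = {}
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            graph.setdefault(a, set())
            # Looking only right and down visits each pixel pair once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        graph[a].add(b)
                        graph.setdefault(b, set()).add(a)
    return graph


# Toy segmentation with four regions labelled 0..3.
segmentation = [
    [0, 0, 1],
    [0, 2, 1],
    [3, 2, 1],
]
rag = region_adjacency_graph(segmentation)
print(rag)
```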

  17. New issues in Image Grammar • Scaling changes which terminals appear in the parse tree

  18. New issues in Image Grammar • Switch between texture and structure

  19. Building the image grammar • Visual vocabulary: primitives, sketch graph, textons… • Relations and configurations: co-occurrence, attached, hinged, supported, occluded… • And-Or Graph representation embedding the image grammar • Learning and testing: infer the possible parse graphs

  20. Database • Lotus Hill Institute Dataset • 636,748 images, 3,927,130 physical objects • A few hundred are free. Benjamin Yao, Xiong Yang, and Song-Chun Zhu, “Introduction to a large scale general purpose ground truth dataset: methodology, annotation tool, and benchmarks,” EMMCVPR, 2007. http://www.imageparsing.com/

  21. Free Data http://yoshi.cs.ucla.edu/yao/data/ • 6 categories, 145 subsets: Manmade Object (75), Nature Object (40), Objects in Scene (6), Transportation (9), UCLA Aerial Image (5), UIUC Sport Activity (10) • Outline & segmentation of the object

  22. Free Data http://yoshi.cs.ucla.edu/yao/data/ • 6 categories, 145 subsets: Manmade Object (75), Nature Object (40), Objects in Scene (6), Transportation (9), UCLA Aerial Image (5), UIUC Sport Activity (10) • Segmentation of a scene (street)

  23. Free Data http://yoshi.cs.ucla.edu/yao/data/ • 6 categories, 145 subsets: Manmade Object (75), Nature Object (40), Objects in Scene (6), Transportation (9), UCLA Aerial Image (5), UIUC Sport Activity (10) • Physical parts of the object

  24. Visual Vocabulary • The “Lego Land” • Language

  25. Visual Vocabulary • : function of image primitives : a) geometric transformation b) appearance • : bonds between primitives

  26. Visual Vocabulary • Sketch and Texture S. C. Zhu, Y. N. Wu, and D. B. Mumford, “Minimax entropy principle and its applications to texture modeling,” Neural Computation, vol. 9, no. 8, pp. 1627–1660, November 1997

  27. Primal sketch model (figure panels: input image, sketch graph, texture pixels). C. E. Guo, S. C. Zhu, and Y. N. Wu, “Primal sketch: Integrating texture and structure,” in Proceedings of International Conference on Computer Vision, 2003.

  28. Primal sketch model. C. E. Guo, S. C. Zhu, and Y. N. Wu, “Primal sketch: Integrating texture and structure,” in Proceedings of International Conference on Computer Vision, 2003.

  29. High level visual vocabulary • Cloth: collar, left/right sleeves, hands. H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu, “Composite templates for cloth modeling and sketching,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, New York, June 2006.

  30. Relations and configurations • Definition of relation: bonds: relations: , : structure, : compatibility • Three types of relations • Bonds and connections • Joints and junctions • Object interactions/semantics • Definition of configurations:

  31. Relations • Bonds and connections: connect primitives into bigger graphs; intensity/color compatibility

  32. Relations • Joints and junctions

  33. Relations • Object interactions

  34. Configuration • Spatial layout of entities at a certain level: primal sketch – parts – object – scene

  35. Reconfigurable graphs • Treat bonds as random variables: address nodes

  36. Inference of the configuration • Have the primal sketch of the image • Detect the ‘T-junction’ • Simulated annealing to infer the Gestalt Law • Figure legend: red dot: connected region; black line: known edge; green line: inferred connection. R. X. Gao and S. C. Zhu, “From primal sketch to 2.1D sketch,” Technical Report, Lotus Hill Institute, 2006.
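
Simulated annealing searches over on/off decisions for candidate connections by accepting worse configurations with a probability that shrinks as a temperature parameter is lowered. The Python sketch below shows the generic procedure on a toy problem. The energy function, the candidate bonds, and the cooling schedule are all assumptions made for illustration; they are not the model of Gao and Zhu.

```python
import math
import random

# Hypothetical candidate bonds between contour endpoints: (end_a, end_b, cost).
# A negative cost rewards good continuation; values are made up.
CANDIDATE_BONDS = [(0, 1, -1.0), (1, 2, -0.4), (2, 3, -1.0)]
CONFLICT_PENALTY = 5.0  # assumed penalty when an endpoint joins two bonds


def energy(state):
    """Sum the costs of active bonds plus penalties for shared endpoints."""
    total = 0.0
    usage = {}
    for active, (a, b, cost) in zip(state, CANDIDATE_BONDS):
        if active:
            total += cost
            usage[a] = usage.get(a, 0) + 1
            usage[b] = usage.get(b, 0) + 1
    total += CONFLICT_PENALTY * sum(n - 1 for n in usage.values() if n > 1)
    return total


def simulated_annealing(energy_fn, n_bonds, rng, steps=2000,
                        t_start=2.0, t_end=0.01):
    """Flip one bond per step; accept uphill moves with Metropolis probability."""
    state = [rng.random() < 0.5 for _ in range(n_bonds)]
    current = energy_fn(state)
    best, best_energy = list(state), current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(n_bonds)
        state[i] = not state[i]
        proposed = energy_fn(state)
        accept = proposed <= current or \
            rng.random() < math.exp((current - proposed) / temperature)
        if accept:
            current = proposed
            if current < best_energy:
                best, best_energy = list(state), current
        else:
            state[i] = not state[i]  # undo the rejected flip
    return best, best_energy


best_state, best_value = simulated_annealing(
    energy, len(CANDIDATE_BONDS), random.Random(0))
print(best_state, best_value)
```

In this toy problem the lowest-energy configuration keeps the two strong bonds and drops the weak middle bond that would compete for their endpoints.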

  37. Reconfigurable graphs • Layer extraction (figure panels: source image, T-junction, inferred connection). Ru-Xin Gao, Tian-Fu Wu, Song-Chun Zhu, and Nong Sang, “Bayesian Inference for Layer Representation with Mixed Markov Random Field”

  38. Reconfigurable graphs R. X. Gao and S. C. Zhu, “From primal sketch to 2.1D sketch,” Technical Report, Lotus Hill Institute, 2006

  39. And-Or Graph • Parse graph of the image; pt: parse tree of vocabulary; E: relations • Inferring the parse graph: Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu, “Recursive top-down/bottom up algorithm for object recognition,” Technical Report, Lotus Hill Research Institute, 2007.

  40. And-Or Graph • Contains all the valid parse graphs • And node, Or node, leaf node • Relations between the children of an And node • Parse tree: assigning a label to each Or node. Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu, “Recursive top-down/bottom up algorithm for object recognition,” Technical Report, Lotus Hill Research Institute, 2007.
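
An And-Or graph packs many parse trees into one structure: an And node requires all of its children, an Or node selects exactly one. The Python sketch below enumerates the leaf configurations that a small graph can produce. The tuple encoding, the helper names, and the clock example are assumptions for illustration, and relations between the children of And nodes are left out.

```python
from itertools import product


# Node encoding (an assumption): ("leaf", name) or (kind, name, children).
def leaf(name):
    return ("leaf", name)


def and_node(name, *children):
    return ("and", name, list(children))


def or_node(name, *children):
    return ("or", name, list(children))


def enumerate_configurations(node):
    """Yield every tuple of leaf names obtainable by fixing each Or choice."""
    kind = node[0]
    if kind == "leaf":
        yield (node[1],)
    elif kind == "or":
        # An Or node contributes the configurations of one chosen child.
        for child in node[2]:
            yield from enumerate_configurations(child)
    else:
        # An And node concatenates one configuration from every child.
        options = [list(enumerate_configurations(c)) for c in node[2]]
        for combo in product(*options):
            yield tuple(name for part in combo for name in part)


# Hypothetical object category with two independent Or choices.
clock = and_node(
    "clock",
    or_node("frame", leaf("round_frame"), leaf("square_frame")),
    or_node("hands", leaf("two_hands"), leaf("three_hands")),
)
configurations = list(enumerate_configurations(clock))
print(configurations)
```

Two Or nodes with two alternatives each already give four configurations, which is how a compact graph can cover a large set of valid instances.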

  41. And-Or Graph • Definition: • image primitives • relations at all levels • : probability model defined on the And-Or graph • : valid configuration of terminal nodes

  42. Stochastic Model on And-Or graph • Terminal (leaf) node: • And-Or node: • Set of links: • Switch variable at Or-node: • Attributes of primitives:

  43. Stochastic Model on And-Or graph • Terminal (leaf) node: • And-Or node: • Set of links: • Switch variable at Or-node: • Attributes of primitives: SCFG: weigh the frequency at the children of or-nodes
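
Weighing the frequency at the children of Or nodes can be pictured as counting how often each alternative is chosen in annotated parse trees. A minimal Python sketch follows, assuming the observations are available as (Or node, chosen child) pairs; the data and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_switch_probabilities(observed_choices):
    """Relative frequency of each child at each Or node."""
    counts = defaultdict(Counter)
    for or_node, child in observed_choices:
        counts[or_node][child] += 1
    return {
        or_node: {child: n / sum(c.values()) for child, n in c.items()}
        for or_node, c in counts.items()
    }


# Hypothetical choices collected from annotated parse trees.
observations = (
    [("frame", "round")] * 3
    + [("frame", "square")]
    + [("hands", "two")] * 2
)
switch_probs = estimate_switch_probabilities(observations)
print(switch_probs)
```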

  44. Stochastic Model on And-Or graph • Terminal (leaf) node: • And-Or node: • Set of links: • Switch variable at Or-node: • Attributes of primitives: Weigh the local compatibility of primitives (geometric and appearance)

  45. Stochastic Model on And-Or graph • Terminal (leaf) node: • And-Or node: • Set of links: • Switch variable at Or-node: • Attributes of primitives: Spatial and appearance relations between primitives (parts or objects)

  46. Learning And-Or Graph • Learning the vocabulary • Learning the relation set R, given • Learning the parameters , given R and

  47. Learning And-Or Graph • Learning the vocabulary , and hierarchic And-Or Graph • Learning the relation set R, given • Learning the parameters , given R and Discussed in the paper

  48. Learning And-Or Graph • Learning and pursuing the relation set R: • Start from the Stochastic Context Free Grammar (a) • Learn the relations that maximally reduce the KL divergence to the observation (b-e) Observation: Learning model: J. Porway, Z. Y. Yao, and S. C. Zhu, “Learning an And–Or graph for modeling and recognizing object categories,” Technical Report, Department of Statistics, 2007.
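
One way to picture the pursuit step is to compare, for each candidate relation, a histogram measured on observed data with the same histogram measured on samples from the current model, and to pick the relation where the two differ most. The Python sketch below uses that per-relation KL divergence as a stand-in for the reduction in divergence; this simplification, the relation names, and the histogram values are assumptions, not the procedure of the cited report.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def next_relation(observed, synthesized, selected):
    """Pick the unselected relation whose histograms disagree the most."""
    best, best_gain = None, 0.0
    for name in observed:
        if name in selected:
            continue
        gain = kl_divergence(observed[name], synthesized[name])
        if gain > best_gain:
            best, best_gain = name, gain
    return best, best_gain


# Hypothetical histograms of relation responses (three bins each).
uniform = [1 / 3, 1 / 3, 1 / 3]
observed = {
    "relative_position": [0.8, 0.1, 0.1],
    "relative_scale": [0.4, 0.3, 0.3],
}
synthesized = {"relative_position": uniform, "relative_scale": uniform}

first_pick, first_gain = next_relation(observed, synthesized, selected=set())
print(first_pick, first_gain)
```

After a relation is added, the model would be refit and resampled so that the synthesized histograms change before the next relation is chosen.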

  49. Learning And-Or Graph • Learning graph parameter • Approximating to • Similar to texture synthesis S. C. Zhu, Y. N. Wu, and D. B. Mumford, “Minimax entropy principle and its applications to texture modeling,” Neural Computation, vol. 9, no. 8, pp. 1627–1660, November 1997

  50. Case I: Rectangle • Nodes: Rectangle • Two vanishing points, four edge directions • Rules: F. Han and S. C. Zhu, “Bottom-up/top-down image parsing by attribute graph grammar,” Proceedings of International Conference on Computer Vision, Beijing, China, 2005.
