Visual Grouping and Recognition

Visual Grouping and Recognition. Jitendra Malik, U.C. Berkeley. Collaborators. Grouping: Jianbo Shi (CMU), Serge Belongie (UCSD), Thomas Leung (Fuji). Database of human segmented images and ecological statistics: David Martin, Charless Fowlkes, Xiaofeng Ren

Presentation Transcript


  1. Visual Grouping and Recognition Jitendra Malik U.C. Berkeley

  2. Collaborators • Grouping: Jianbo Shi (CMU), Serge Belongie (UCSD), Thomas Leung (Fuji) • Database of human segmented images and ecological statistics: David Martin, Charless Fowlkes, Xiaofeng Ren • Recognition: Serge Belongie, Jan Puzicha

  3. The visual system performs • Inference of lightness, shape and spatial relations • Perceptual Organization • Active interaction with environment

  4. A brief history of vision science • 1850-1900 • Trichromacy, stereopsis, eye movements, contrast, visual acuity.. • 1900-1950 • Apparent movement, grouping, figure-ground.. • 1950-2000 • Ecological optics, geometrical analysis of shape cues, physiology of V1 and extra-striate areas..

  5. Physiological Optics 1840-1894

  6. The Empiricist-Nativist debate

  7. The debate.. (and sometimes both were right!) • Helmholtz argued that perception is unconscious inference. Associations are learned through experience. • Hering proposed physiological mechanisms: opponent color channels, contrast mechanisms, conjunctive and disjunctive eye movements..

  8. The Twentieth Century.. • The Gestalt movement emphasized perceptual organization. • Grouping • Figure/ground • Configuration effects on perception of brightness and lightness

  9. Gibson’s ecological optics (1950) • Emphasized richness of information about shape and surface layout available to a moving observer • Optical flow • Texture gradients • (and the classical cues such as stereopsis, etc.)

  10. Visual Processing Areas

  11. The visual system performs • Inference of lightness, shape and spatial relations • Perceptual Organization • Active interaction with environment

  12. From Images to Objects

  13. What enables us to parse a scene? • Low level cues • Color/texture • Contours • Motion • Mid level cues • T-junctions • Convexity • High level cues • Familiar object • Familiar motion

  14. Grouping factors

  15. Grouping Factors

  16. The Figure-Ground Problem

  17. Focus of this talk • Provide a mathematical foundation for the grouping problem in terms of the ecological statistics of natural images. • This research agenda was first proposed more than 50 years ago by Egon Brunswik, who sought to justify Gestalt grouping factors in probabilistic terms.

  18. Outline of talk • Creating a dataset of human segmented images • Measuring ecological statistics of various Gestalt grouping factors • Using these measurements to calibrate and validate approaches to grouping

  19. Outline of talk • Creating a dataset of human segmented images • Measuring ecological statistics of various Gestalt grouping factors • Using these measurements to calibrate and validate approaches to grouping

  20. What kind of segmentations? • What is a valid segmentation? • Is there a correct segmentation? • What granularity?

  21. The Image Dataset • 1000 Corel images • Photographs of natural scenes • Texture is common • Large variety of subject matter • 481 x 321 x 24b

  22. Establishing Ground truth • Def: Segmentation = Partition of image pixels into exclusive sets • Custom tool to facilitate manual segmentation • Java application, on website • Multiple segmentations/image • Currently: 1000 images, 5000 segmentations, 20 subjects • Data collection ongoing • Naïve subjects (UCB undergrads) given simple, non-technical instructions

  23. Directions to Image Segmentors • You will be presented a photographic image • Divide the image into some number of segments, where the segments represent “things” or “parts of things” in the scene • The number of segments is up to you, as it depends on the image. Something between 2 and 30 is likely to be appropriate. • It is important that all of the segments have approximately equal importance.

  24. Segmentations are not identical

  25. But are they consistent?

  26. Perceptual organization produces a hierarchy • Each subject picks a cross section from this hierarchy • [Figure: tree rooted at "image" with children "background", "left bird" and "right bird"; node labels beneath them: grass, bush, far, beak, eye, head, body]

  27. Quantifying inconsistency.. • How much is segmentation S1 a refinement of segmentation S2 at pixel p_i? • $E(S_1, S_2, p_i) = \frac{|R(S_1, p_i) \setminus R(S_2, p_i)|}{|R(S_1, p_i)|}$ • [Figure: segmentations S1 and S2 illustrating refinement]

  28. Segmentation Error Measure • One-way Local Refinement Error: $\mathrm{LRE}(S_1, S_2, p_i) = \frac{|R(S_1, p_i) \setminus R(S_2, p_i)|}{|R(S_1, p_i)|}$ • Segmentation Error defined to allow refinement in either direction at each pixel: $\mathrm{SE}(S_1, S_2) = \frac{1}{n} \sum_i \min\{\mathrm{LRE}(S_1, S_2, p_i), \mathrm{LRE}(S_2, S_1, p_i)\}$
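
The two measures on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes each segmentation is an integer label map of the same shape, and that R(S, p) denotes the set of pixels sharing the label of pixel p in S. The function names are hypothetical.

```python
import numpy as np

def local_refinement_error(s1, s2):
    """Per-pixel LRE(S1, S2, p_i) for two integer label maps of equal shape."""
    # Relabel both maps to 0..k-1 so they can index a contingency table.
    a = np.unique(np.ravel(s1), return_inverse=True)[1].reshape(-1)
    b = np.unique(np.ravel(s2), return_inverse=True)[1].reshape(-1)
    n_b = b.max() + 1
    # joint[i, j] = number of pixels with label i in S1 and label j in S2.
    joint = np.bincount(a * n_b + b, minlength=(a.max() + 1) * n_b).reshape(-1, n_b)
    size_1 = joint.sum(axis=1)[a]   # |R(S1, p_i)| for every pixel
    overlap = joint[a, b]           # |R(S1, p_i) intersect R(S2, p_i)|
    # |R1 \ R2| = |R1| - |R1 intersect R2|
    return (size_1 - overlap) / size_1

def segmentation_error(s1, s2):
    """SE(S1, S2): mean over pixels of the smaller of the two one-way errors."""
    lre_12 = local_refinement_error(s1, s2)
    lre_21 = local_refinement_error(s2, s1)
    return float(np.minimum(lre_12, lre_21).mean())

# Toy label maps (assumed data, for illustration only).
coarse = np.array([[0, 0, 1, 1], [0, 0, 1, 1]])
finer = np.array([[0, 0, 1, 2], [0, 0, 1, 2]])    # splits one region of `coarse`
crossing = np.array([[0, 0, 0, 0], [1, 1, 1, 1]])  # cuts across both regions
print(segmentation_error(coarse, finer))     # pure refinement
print(segmentation_error(coarse, crossing))  # mutually inconsistent
```

Because the minimum is taken per pixel, a segmentation that only subdivides another one is not penalized, which matches the idea that subjects pick different cross sections of the same hierarchy.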

  29. Distribution of SE over Dataset

  30. Gray, Color, InvNeg Datasets • Explore how various high/low-level cues affect the task of image segmentation by subjects • Color = full color image • Gray = luminance image • InvNeg = inverted negative luminance image
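
As a rough sketch of how the Gray and InvNeg variants might be derived from a color image: the slide does not specify the conversion, so the Rec. 601 luminance weights and the reading of "inverted negative" as a photometric negative flipped upside down are both assumptions here, and the function names are hypothetical.

```python
import numpy as np

def to_gray(rgb):
    """Luminance image from an H x W x 3 uint8 RGB array.

    Assumption: Rec. 601 weights; the talk does not say which were used.
    """
    weights = np.array([0.299, 0.587, 0.114])
    return np.rint(rgb.astype(np.float64) @ weights).astype(np.uint8)

def to_invneg(rgb):
    """Assumed reading of InvNeg: negative of the luminance, flipped vertically."""
    negative = 255 - to_gray(rgb)
    return negative[::-1]  # reverse the row order (upside down)

# Toy image (assumed data): white top row, black bottom row.
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
rgb[0] = 255
print(to_gray(rgb))
print(to_invneg(rgb))
```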

  31. Color Gray InvNeg

  32. InvNeg

  33. Color Gray InvNeg

  34. Gray vs. Color vs. InvNeg Segmentations • SE(gray, gray) = 0.047 • SE(gray, color) = 0.047 • SE(gray, invneg) = 0.059 • Color may affect attention, but doesn’t seem to affect perceptual organization • InvNeg seems to interfere with high-level cues • Data: 2500 gray segmentations, 2500 color segmentations, 200 invneg segmentations

  35. Outline of talk • Creating a dataset of human segmented images • Measuring ecological statistics of various Gestalt grouping factors • Using these measurements to calibrate and validate approaches to grouping

  36. Natural images aren’t generic signals • Filter statistics are far from Gaussian.. • Ruderman 1994, 1997 • Field, Olshausen 1996 • Huang, Mumford 1999, 2000 • Buccigrossi, Simoncelli 1999 • These properties (e.g. scale-invariance, sparsity, heavy tails) can be exploited for image compression.

  37. P (SameSegment | Proximity)

  38. P (SameSegment | Luminance)
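
Curves such as P(SameSegment | Proximity) can be estimated empirically from a human segmentation by sampling pixel pairs and binning them by the cue value. The sketch below does this for distance only; the sampling scheme, bin count, and function name are assumptions for illustration, not the procedure used in the talk. A luminance version would bin by the absolute luminance difference of the pair instead.

```python
import numpy as np

def same_segment_given_distance(labels, max_dist=30, n_pairs=20000, n_bins=10, seed=0):
    """Estimate P(same segment | distance) from one label map by random pair sampling."""
    rng = np.random.default_rng(seed)
    h, w = labels.shape
    y1 = rng.integers(0, h, n_pairs)
    x1 = rng.integers(0, w, n_pairs)
    dy = rng.integers(-max_dist, max_dist + 1, n_pairs)
    dx = rng.integers(-max_dist, max_dist + 1, n_pairs)
    y2, x2 = y1 + dy, x1 + dx
    dist = np.hypot(dy, dx)
    # Keep pairs that stay inside the image and lie within (0, max_dist].
    keep = (y2 >= 0) & (y2 < h) & (x2 >= 0) & (x2 < w) & (dist > 0) & (dist <= max_dist)
    y1, x1, y2, x2, dist = y1[keep], x1[keep], y2[keep], x2[keep], dist[keep]
    same = (labels[y1, x1] == labels[y2, x2]).astype(np.float64)
    edges = np.linspace(0.0, max_dist, n_bins + 1)
    idx = np.clip(np.digitize(dist, edges) - 1, 0, n_bins - 1)
    total = np.bincount(idx, minlength=n_bins)
    hits = np.bincount(idx, weights=same, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        prob = hits / total  # NaN where a bin received no pairs
    return edges, prob

# Toy segmentation (assumed data): left half is one segment, right half another.
labels = np.zeros((64, 64), dtype=int)
labels[:, 32:] = 1
edges, prob = same_segment_given_distance(labels)
print(np.round(prob, 3))
```

On real data the estimate would be pooled over many images and many subjects' segmentations rather than computed from a single label map.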

  39. Quantifying the power of cues • Bayes Risk • Mutual information
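
Both quantities can be computed from a joint histogram of a discretized cue and the same-segment indicator. The following is a minimal sketch under the assumption of a binary indicator and 0-1 loss; the table layout and function names are hypothetical and not taken from the talk.

```python
import numpy as np

def bayes_risk(joint):
    """Minimum achievable error rate when predicting y from the cue bin.

    joint[k, c] holds the count or probability of cue bin k with class c
    (c = 0: different segments, c = 1: same segment).
    """
    p = joint / joint.sum()
    # The optimal rule picks the likelier class per bin; the rest is error.
    return float(np.minimum(p[:, 0], p[:, 1]).sum())

def mutual_information(joint):
    """Mutual information I(x; y) in bits from the same joint table."""
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)  # marginal over cue bins, shape (K, 1)
    py = p.sum(axis=0, keepdims=True)  # marginal over classes, shape (1, 2)
    nz = p > 0                         # skip empty cells (0 * log 0 = 0)
    return float((p[nz] * np.log2(p[nz] / (px @ py)[nz])).sum())

# Two extreme toy tables (assumed data).
perfect_cue = np.array([[0.5, 0.0], [0.0, 0.5]])
useless_cue = np.array([[0.25, 0.25], [0.25, 0.25]])
print(bayes_risk(perfect_cue), mutual_information(perfect_cue))
print(bayes_risk(useless_cue), mutual_information(useless_cue))
```

A more informative cue lowers the Bayes risk and raises the mutual information, so either number can be used to rank grouping cues.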

  40. Bayes Risk for Proximity Cue

  41. Mutual information • x is a cue and y is an indicator of being in the same segment
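
The formula from this slide is not present in the transcript. For reference, the standard definition of mutual information between a discretized cue x and the binary indicator y is written out below; that the slide used exactly this form (for example a sum rather than an integral, or a particular log base) is an assumption.

```latex
I(x; y) = \sum_{x} \sum_{y \in \{0, 1\}} p(x, y) \, \log \frac{p(x, y)}{p(x) \, p(y)}
```

I(x; y) is zero when the cue and the indicator are independent, and it is bounded above by the entropy of y, which is at most one bit for a binary indicator.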

  42. Bayes Risk for Various Cues Given Proximity
