
Attentive People Finding


Presentation Transcript


  1. Attentive People Finding. James Elder, Centre for Vision Research, York University, Toronto, Canada. Joint work with: Simon Prince, Bob Hou.

  2. Research Context. Collaborative Project: “Monitoring Changes to Urban Environments with a Network of Sensors”. Funding: GEOIDE (Geomatics for Informed Decisions), a Canadian agency: "This ‘network of networks’ brings together the skills, technology and people from different communities of practice, in order to develop and consolidate the Canadian competences in geomatics."

  3. What is our project? "This project will study visual detection and interpretation of changes to urban environments using continuous and non-continuous sensing from a multiplicity of diverse sensors using networks of video cameras, augmented with high-resolution satellite imagery. It will also investigate the problem of how such information can be integrated and managed within a computer, leading to the development of a prototype information system for monitoring urban environments."

  4. Project Team. University Principal Investigators: David Clausi (Waterloo), Geoffrey Edwards (Laval), James Elder (York), Frank Ferrie (McGill), Jim Little (UBC). Main Industry Partners: CAE, Genetec, Aimetis.

  5. Timeframe: April 2005 – March 2009.

  6. Objectives
     1. Establishment of urban test facilities involving networks of multi-sensor wireless cameras with associated satellite data, and development of intercalibration software. (Elder, Ferrie, Little)
     2. Development of algorithms for fusing offline satellite data with streaming video from terrestrial sensors for the construction of more complete 3D urban models. (Clausi)
     3. Development of algorithms for inferring approximate intrinsic images from monocular video (ordinal depth maps, reflectance maps, …). (Elder, Ferrie, Little)
     4. Development of algorithms for identifying and modeling typical dynamic events (e.g. pedestrian and automobile traffic, changes in climate, air quality, seasonal changes) and detecting unusual events. (Elder, Ferrie, Little)
     5. Development of algorithms for deriving and updating navigational maps based upon derived models. (Edwards)
     6. Development of an integrated demonstration system. (Ferrie)

  7. Disaster management (e.g., earthquakes) Traffic monitoring (e.g., automobile, trucking, pedestrian) Security (e.g., people tracking, activity and identity recognition) Urban planning (e.g., 3D dynamic scene visualization) Environmental monitoring (e.g., air quality) Possible Application Areas

  8. Pre-Attentive and Attentive Sensing (with S. Prince, Y. Hou, M. Sizintsev, E. Olevskey). [Diagram: wide-field image alongside a pan/tilt foveal image.]

  9. Homographic fusion of attentive and pre-attentive streams
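
  A minimal sketch (Python, assuming OpenCV and NumPy) of how such homographic fusion can work: the high-resolution foveal frame is warped into wide-field coordinates through a 3x3 homography and overlaid. The matrix H below is a hypothetical placeholder for one estimated by the project's intercalibration software.

```python
import cv2
import numpy as np

# Hypothetical homography mapping foveal coordinates to wide-field
# coordinates; in practice this comes from sensor intercalibration.
H = np.array([[0.25, 0.00, 310.0],
              [0.00, 0.25, 180.0],
              [0.00, 0.00, 1.0]])

def fuse_streams(wide_frame, foveal_frame, H):
    """Warp the attentive (foveal) frame into the wide-field frame."""
    h, w = wide_frame.shape[:2]
    warped = cv2.warpPerspective(foveal_frame, H, (w, h))
    # Warp a mask of ones to find which wide-field pixels are covered.
    mask = cv2.warpPerspective(
        np.ones(foveal_frame.shape[:2], np.uint8), H, (w, h))
    fused = wide_frame.copy()
    fused[mask > 0] = warped[mask > 0]
    return fused
```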

  10. Wide-Field Body Detection. Detected body sizes: min 15x2 pixels, max 98x78 pixels, median 52x14 pixels.

  11. Wide-Field Face Detection. Detected face sizes: min 2x2 pixels, max 34x31 pixels, median 6x6 pixels.

  12. Detecting people in realistic environments

  13. Biological vision?

  14. Motion scaling From Johnston & Wright, 1986

  15. Biological Motion From Ikeda, Blake & Watanabe, 2005

  16. Structural Coherence (with L. Velisavljevic): Psychophysical Method. [Figure: trial sequence with intervals of 1000 ms, 59 ms, and 506 ms, then until response.]

  17. Image Conditions. [Figure: example stimuli for each condition: scrambled vs. coherent, colour vs. monochrome.]

  18. Results. [Bar chart: % correct (range 58 to 82) for data and model across four conditions: colour coherent, colour incoherent, BW coherent, BW incoherent.]

  19. Spatial Coherence. [Plot: percent correct (50 to 90) vs. mean distance from fixation (3 to 18°) for unscrambled and scrambled images, in colour and monochromatic conditions.]

  20. Summary: Pre-Attentive (Peripheral) Vision. Motion discrimination; colour discrimination; biological motion; contour integration; coherent structure.

  21. Preattentive System Design. [Block diagram: three parallel channels (motion, foreground, skin). In each channel, a pixel model maps raw pixels to pixel posteriors, and a spatial integrator pools these under a region model into a region response, a region likelihood ratio. The three region likelihood ratios are multiplied together with the system priors to give the system posterior; a sketch of this combination follows below.]
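
  The sketch below (Python/NumPy) illustrates one plausible reading of this design; the generalized-mean pooling and the conditional-independence combination are illustrative assumptions, not the authors' exact implementation.

```python
import numpy as np

def region_log_lr(pixel_posteriors, g=1.0):
    """Spatial integrator: pool pixel posteriors over a candidate region
    (generalized mean with exponent g) and convert the region response
    to log odds, used here as the region log likelihood ratio."""
    p = np.clip(pixel_posteriors, 1e-6, 1 - 1e-6)
    response = np.mean(p ** g) ** (1.0 / g)
    return np.log(response / (1.0 - response))

def system_posterior(motion_p, fg_p, skin_p, log_prior_odds=0.0):
    """Combine the motion, foreground, and skin region likelihood ratios
    (assumed conditionally independent) with the system prior odds."""
    log_odds = (region_log_lr(motion_p) + region_log_lr(fg_p) +
                region_log_lr(skin_p) + log_prior_odds)
    return 1.0 / (1.0 + np.exp(-log_odds))
```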

  22. Priors as Attentive Feedback. [Block diagram: a motion kernel propagates a spatial prior built from confirmed face locations and a mean body indicator; the prior is combined with the likelihood to form a posterior, from which a random sampler (with non-max suppression) draws the next gaze command for the attentive sensor's high-resolution face detection. A toy sketch follows below.]
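
  A toy Python sketch of this feedback loop, with a Gaussian blur standing in for the motion kernel and the sampling step simplified; all names here are hypothetical.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def next_gaze(likelihood_map, confirmed_faces, sigma=15.0, rng=None):
    """Build a spatial prior from previously confirmed face locations,
    combine it with the preattentive likelihood map, and randomly
    sample the next fixation from the resulting posterior."""
    rng = rng if rng is not None else np.random.default_rng()
    prior = np.zeros_like(likelihood_map)
    for r, c in confirmed_faces:
        prior[r, c] = 1.0
    # Diffuse the prior (a stand-in for the motion kernel); the small
    # floor keeps some probability everywhere so exploration continues.
    prior = gaussian_filter(prior, sigma) + 1e-3
    posterior = likelihood_map * prior
    posterior /= posterior.sum()
    idx = rng.choice(posterior.size, p=posterior.ravel())
    return np.unravel_index(idx, posterior.shape)  # (row, col) to fixate
```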

  23. Pixel Posteriors. [Figure: original frame with motion, foreground, and skin pixel-posterior maps, each on a 0 to 1 scale.]

  24. Spatial Integration

  25. Spatial Integration. [Plot: area under the ROC curve (0.70 to 0.86) vs. pooling exponent g (10^-1 to 10^1) for the motion, foreground, and skin channels.] (A helper sketch follows below.)
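
  A small helper (Python, assuming scikit-learn; `regions` is a hypothetical list of per-region pixel-posterior arrays with 0/1 person labels) showing how such a curve can be traced by sweeping the pooling exponent g and scoring each value by area under the ROC curve.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pooled_response(p, g):
    """Generalized-mean pooling of a region's pixel posteriors."""
    return np.mean(p ** g) ** (1.0 / g)

def best_exponent(regions, labels, exponents=np.logspace(-1, 1, 21)):
    """Sweep g over 10^-1..10^1 (the range on the slide) and return
    the exponent maximizing area under the ROC curve."""
    aucs = [roc_auc_score(labels,
                          [pooled_response(p, g) for p in regions])
            for g in exponents]
    return exponents[int(np.argmax(aucs))], max(aucs)
```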

  26. Spatial Integration. [Scatter plots: the joint region log likelihood ratio against the motion, foreground, and skin region log likelihood ratios, each axis spanning roughly -4 to 4.]

  27. Combining Detectors. [ROC plot: p(Hit) vs. p(False Positive) for the foreground (13x20), skin (4x5), motion (20x20), and combined detectors, compared against Xiong & Jaynes.] System evaluation on a distinct test database: 74% of fixations capture human heads.
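
  One way to reproduce such a comparison (Python/NumPy sketch; summing the per-channel region log likelihood ratios for the combined detector is an assumption consistent with the system design above):

```python
import numpy as np

def roc_points(scores, labels, n_thresh=100):
    """Trace (p(False Positive), p(Hit)) pairs by sweeping a threshold
    over the detector scores; labels are 0/1 ground truth."""
    pos, neg = labels == 1, labels == 0
    pts = []
    for t in np.quantile(scores, np.linspace(0.0, 1.0, n_thresh)):
        detected = scores >= t
        pts.append((detected[neg].mean(), detected[pos].mean()))
    return np.array(pts)

# Combined detector: sum the per-channel region log likelihood ratios.
# combined = motion_llr + foreground_llr + skin_llr
# curve = roc_points(combined, labels)
```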

  28. Performance. System evaluation on a distinct test database: 74% of fixations capture human heads; 83% of people are fixated at least once.

  29. Automatically Confirmed High-Resolution Faces

  30. 3D Pose Problem. Capture a training and test database in which horizontal pose varies over 180 degrees and is known precisely for each image. Points on each face are identified, image regions are extracted, and features are computed as weighted sums of the pixels in each region (sketch below).
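
  A minimal sketch of that last step, computing one feature per extracted region as a weighted sum of its pixels; the square patches and the weight masks are illustrative assumptions.

```python
import numpy as np

def region_features(image, points, weights, half=8):
    """For each identified face point, extract a square patch and compute
    a feature as a weighted sum of its pixels; `weights` holds one
    (2*half x 2*half) mask per point."""
    feats = []
    for (r, c), w in zip(points, weights):
        patch = image[r - half:r + half, c - half:c + half]
        feats.append(float(np.sum(patch * w)))
    return np.array(feats)
```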

  31. An Alternate Approach: 2D to 3D (with VisionSphere Technologies)

  32. Simon Prince

  33. Attentive People Finding: Summary. Realistic environments and behaviour make this a hard problem. Humans: primitive mechanisms are preserved in the periphery; more complex mechanisms are not. Our approach: probabilistic combination of simple, weak cues. Ongoing work: attentive feedback.

  34. Colour Scaling From Rovamo & Iivanainen, 1991

  35. Contour Integration From Hess & Dakin, 1999

  36. Contour Integration From Hess & Dakin, 1999

  37. Interactive Attentive Sensing. Needed: fast saccadic programming algorithms!

  38. Spatial Integration. [Plot repeated from slide 25: area under the ROC curve vs. pooling exponent g for the motion, foreground, and skin channels.]

  39. 3D Hugh

  40. Sal Khan (VisionSphere)

  41. Summary: EIGEN-LIGHTFIELDS < INVARIANCE << 3D MODEL (Gross, Matthews & Baker; Prince & Elder; Blanz et al.). Invariance (Prince & Elder): a supervised method to make a feature set more invariant to a known nuisance parameter; fast; requires no knowledge of faces and no knowledge of 3D transformations. 3D model (Blanz et al.): slower; uses lots of domain-specific knowledge; better results.

  42. Algorithm Summary. To train: estimate the mean and covariance of the manifold as a function of the distractor variable, alternately estimating the invariant vectors Ci and the transformations F1..n. To calculate invariant vectors: estimate the nuisance value v, then transform by the appropriate Fv. (A sketch of the alternating scheme follows below.)
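
  The sketch below (Python/NumPy) shows one way the alternating scheme can be realized, with each transformation Fv modeled as a linear map per nuisance bin; this is an illustrative reading, not the authors' exact algorithm.

```python
import numpy as np

def train_invariance(X, identities, poses, n_poses, n_iters=10):
    """Alternately estimate invariant vectors C[i] (one per identity) and
    linear transformations F[v] (one per nuisance bin) so that each
    observation satisfies x ~= F[pose] @ C[identity].
    X: (N, d) feature matrix; identities, poses: length-N label arrays."""
    ids = np.unique(identities)
    d = X.shape[1]
    C = {i: X[identities == i].mean(axis=0) for i in ids}  # init invariants
    F = [np.eye(d) for _ in range(n_poses)]
    for _ in range(n_iters):
        # Update each transformation by least squares over its bin.
        for v in range(n_poses):
            sel = poses == v
            if not sel.any():
                continue
            Cm = np.stack([C[i] for i in identities[sel]])
            F[v] = np.linalg.lstsq(Cm, X[sel], rcond=None)[0].T
        # Update each invariant vector from back-transformed observations.
        for i in ids:
            sel = identities == i
            back = [np.linalg.pinv(F[v]) @ x
                    for x, v in zip(X[sel], poses[sel])]
            C[i] = np.mean(back, axis=0)
    return C, F
```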

  43. Attentive Snapshots

  44. Problem Statement. Image variation due to nuisance parameters such as pose change is greater than variation due to identity, and this is reflected in most "features". [Figure: feature space.]

  45. Goal: decompose the conventional feature vector into an invariant vector plus nuisance parameters, xi = fi(c, θi), where c is the invariant vector and the θi are the nuisance parameters. [Diagram: feature vectors X1 and X2 map to a common invariant vector C under transformations (f1, θ1) and (f2, θ2).] (Sketch below.)
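
  Continuing the sketch from slide 42, the test-time decomposition can be approximated as below; the single manifold mean is a crude stand-in for the full mean/covariance manifold model, and all names are hypothetical.

```python
import numpy as np

def decompose(x, F, manifold_mean):
    """Estimate the nuisance value v and the invariant vector: try each
    candidate transformation Fv (from the training sketch after slide 42),
    back-transform x, and keep the value whose invariant vector lies
    closest to the training manifold."""
    candidates = [np.linalg.pinv(Fv) @ x for Fv in F]
    dists = [np.linalg.norm(c - manifold_mean) for c in candidates]
    v = int(np.argmin(dists))
    return candidates[v], v  # invariant feature, estimated nuisance value
```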

  46. Toy Data Set: In-Plane Orientation. Training images: angle known; several images of each face present. Probe image: angle unknown. Test images: angle unknown. Choice of features: the first few eigenvectors (sketch below).
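
  For the feature choice, a standard eigenvector (PCA) projection suffices; a minimal NumPy sketch:

```python
import numpy as np

def pca_basis(train_images, k=5):
    """Return the mean image and the first k eigenvectors of the
    training set (rows of Vt from the SVD of the centered data)."""
    X = np.stack([im.ravel() for im in train_images]).astype(float)
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def features(image, mean, basis):
    """Project an image onto the eigenvector basis."""
    return basis @ (image.ravel().astype(float) - mean)
```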

  47. The First Two Feature Dimensions. [Plot: feature dimensions x1 vs. x2, with the manifold traced out by increasing θ.]

  48. Estimate Nuisance Parameter. [Plot: feature dimensions x1 vs. x2.]
