Real-time Tracking of Multiple People Using Stereo. David Beymer Bob Bolles Kurt Konolige Chris Eveland Artificial Intelligence Center SRI International. Problem: people tracking for surveillance. return coarse 3D locations of people real-time on standard hardware
Real-time Tracking of Multiple People Using Stereo • David Beymer, Bob Bolles, Kurt Konolige, Chris Eveland • Artificial Intelligence Center, SRI International
Problem: people tracking for surveillance • return coarse 3D locations of people • real-time on standard hardware • multiple people in scene • stationary camera
Approach • consider: template-based tracking • maintain template of object • correlation used to update object position • template is recursively updated to handle changing object appearance • limitations/problems 1) object initialization/detection 2) template drift
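A minimal sketch of the template-tracking loop described on this slide, in pure Python on small 2D lists. The similarity measure (zero-mean normalized cross-correlation), the search radius, and the blending factor `alpha` are assumptions for illustration; the slides do not specify them.

```python
import math

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equal-size 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0

def crop(image, top, left, h, w):
    """Return the h x w sub-image whose top-left corner is (top, left)."""
    return [row[left:left + w] for row in image[top:top + h]]

def locate(image, template, center, radius):
    """Search around `center` (top, left) for the best-matching position."""
    h, w = len(template), len(template[0])
    best_score, best_pos = -2.0, center
    top_lo, top_hi = max(0, center[0] - radius), min(len(image) - h, center[0] + radius)
    left_lo, left_hi = max(0, center[1] - radius), min(len(image[0]) - w, center[1] + radius)
    for top in range(top_lo, top_hi + 1):
        for left in range(left_lo, left_hi + 1):
            score = ncc(crop(image, top, left, h, w), template)
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score

def update_template(template, patch, alpha=0.2):
    """Recursive update: blend the newly matched patch into the template."""
    return [[(1 - alpha) * t + alpha * p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(template, patch)]
```

Because the template is blended with whatever patch was matched, small localization errors accumulate over time, which is the template drift listed as limitation 2.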
Goal: add modality of stereo • segmentation: background subtraction on stereo disparities to detect foreground • detection: person templates encoding head and torso shape • tracking: • person templates used to avoid drift • stereo segmentation used to add “support” template • [Figure panels: left image, background disparities, foreground]
Approach • detection • segment foreground into depth layers • correlate with person templates • tracking • intensity and "support" templates are recursively updated • Kalman filtering on person location in 3D • person templates used to avoid drift
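One simple way to split the foreground into depth layers is to group foreground pixels by quantized disparity, sketched below. The fixed bin width is an assumption for illustration; the slides do not say how layers are formed, and a real system would more likely pick layer boundaries from the data (for example, from peaks in a disparity histogram).

```python
def depth_layers(disparity, mask, bin_width=4):
    """Group foreground pixels into layers keyed by quantized disparity.

    disparity: 2D list of integer disparities.
    mask:      2D list of booleans, True where the pixel is foreground.
    Returns a dict mapping layer index -> list of (x, y) pixel coordinates.
    """
    layers = {}
    for y, (d_row, m_row) in enumerate(zip(disparity, mask)):
        for x, (d, is_fg) in enumerate(zip(d_row, m_row)):
            if is_fg:
                layers.setdefault(d // bin_width, []).append((x, y))
    return layers
```

Each layer can then be correlated with the head/torso person templates separately, so that people standing at different depths do not merge into one blob. Fixed bins can split a person who straddles a bin boundary, which is one reason to prefer data-driven boundaries.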
Related Work • Companies • Teleos Research/Autodesk, People Tracker • DEC/Compaq, Smart Kiosk [Rehg, et al., 1997] • Interval, Morphin' Mirror [Darrell, et al., 1998] • Sarnoff [IUW, 1998] • Texas Instruments [Flinchbaugh, 1998] • Electric Planet • Universities • MIT, Pfinder [Wren, et al., 1997] • Toronto [Fieguth and Terzopoulos, 1997] • Maryland, W4S [Haritaoglu, et al., 1998] • MIT, Forest of Sensors [Grimson, et al., 1998] • CMU [Kanade, et al., 1998] • Columbia/Lehigh [Nayar and Boult, 1998] • Boston Univ. [Rosales and Sclaroff, 1998]
Stereo module: SRI's Small Vision System (SVS) • Hardware • two CMOS cameras • low power (150 mW), inexpensive ($100 components) • adjustable baseline: 2.7'' to 6.2'' in 1'' increments • another version with DSP processing onboard • Software • stereo algorithm is area-correlation based • optimized C and MMX code • 20 Hz on 320x240 image, 24 disparities, 400 MHz Pentium II
SVS Stereo Results • [Figure panels: left image, right image] • notation: current disparities; background estimate disparities
Background subtraction • look for disparities closer than background • using stereo disparities versus intensities • less sensitive to lighting changes, shadows • can segment people at different depths • more computationally expensive • tends to blur & expand object boundaries
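A minimal sketch of background subtraction on disparities. A larger disparity means a closer surface, so a pixel is marked foreground when its current disparity exceeds the background estimate by a margin. The `threshold` margin and the `invalid` marker for pixels without a stereo match are assumptions, not values from the slides.

```python
def foreground_mask(disparity, background, threshold=2, invalid=-1):
    """Return a 2D boolean mask of pixels that are closer than the background.

    disparity:  current disparity image (2D list).
    background: background disparity estimate (2D list, same size).
    Pixels with no valid disparity in either image are never foreground.
    """
    mask = []
    for d_row, b_row in zip(disparity, background):
        mask.append([
            d != invalid and b != invalid and d > b + threshold
            for d, b in zip(d_row, b_row)
        ])
    return mask
```

Because the test compares depth rather than brightness, a shadow or lighting change that leaves the scene geometry unchanged does not trigger it.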
Handling scale • idea: range info from stereo can be used to fix the scale of processing, avoiding a search over the scale parameter • person width is proportional to disparity • from similar triangles: (equation image not preserved) • stereo equation: (equation image not preserved) • notation: d: disparity, b: baseline, K: constant
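The proportionality between person width and disparity follows from the standard pinhole stereo model, sketched here. Under that model, a person of physical width W at depth Z projects to an image width w = f·W/Z (similar triangles), and the disparity is d = f·b/Z, so w = (W/b)·d and the focal length cancels. The symbols f, W, Z and w are assumptions introduced for this sketch; the slide itself names only d, b and K.

```python
def depth_from_disparity(d, baseline, focal_px):
    """Depth Z from disparity d (pixels) for rectified cameras: Z = f * b / d."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline / d

def template_width_px(d, baseline, person_width):
    """Expected image width of a person, in pixels: w = (W / b) * d.

    baseline and person_width must use the same length unit.
    """
    return person_width * d / baseline
```

For example, with a 0.1 m baseline and an assumed 0.5 m shoulder width, a person at disparity 24 should appear about 120 pixels wide, and half that at disparity 12. The template size can therefore be set directly from the layer's disparity.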
Detection example • during detection, extract intensity and “support” template from layer(x,y)
Tracking -- coordinate space • image: (x, disparity) • 3D: (X, Z)
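The mapping between the two coordinate spaces can be sketched with the standard pinhole stereo relations. The focal length `focal_px`, the principal-point column `cx`, and the function names are assumptions for illustration; the slide gives only the coordinate pairs.

```python
def image_to_ground(x, d, focal_px, baseline, cx):
    """Map image coordinates (x, disparity) to ground-plane coordinates (X, Z).

    Z = f * b / d          depth along the optical axis
    X = (x - cx) * Z / f   lateral offset from the optical axis
    """
    if d <= 0:
        raise ValueError("disparity must be positive")
    Z = focal_px * baseline / d
    X = (x - cx) * Z / focal_px
    return X, Z

def ground_to_image(X, Z, focal_px, baseline, cx):
    """Inverse mapping, e.g. to turn a predicted (X, Z) into (x, disparity)."""
    if Z <= 0:
        raise ValueError("depth must be positive")
    return cx + focal_px * X / Z, focal_px * baseline / Z
```

Tracking in (X, Z) keeps a person's motion model independent of distance from the camera, while the inverse mapping gives the predicted disparity needed to select a foreground layer.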
Tracking Steps • prediction • predict Kalman filter (X, Z) • predict person disparity • segmentation • select foreground layer around predicted disparity • localization • correlate gray level template against left image, weighted by support template [coarse localization] • correlate head/torso shape template against segmented foreground layer [re-centering step that addresses template drift] • update • Kalman filter • recursive update of intensity and support templates
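The prediction and update steps can be sketched with a small Kalman filter. The slides do not give the motion model or noise parameters, so this sketch assumes a constant-velocity model applied independently to X and Z, with made-up variances; the class name and its parameters are hypothetical.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one axis with state [position, velocity]."""

    def __init__(self, pos, vel=0.0, pos_var=1.0, vel_var=1.0,
                 process_var=0.01, meas_var=0.05):
        self.x = [pos, vel]
        self.P = [[pos_var, 0.0], [0.0, vel_var]]
        self.q = process_var  # assumed process noise, added to the diagonal
        self.r = meas_var     # assumed measurement noise on position

    def predict(self, dt):
        """Advance the state by dt seconds and return the predicted position."""
        p, v = self.x
        self.x = [p + dt * v, v]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with F = [[1, dt], [0, 1]] and Q = q * I
        self.P = [[a + dt * (b + c) + dt * dt * d + self.q, b + dt * d],
                  [c + dt * d, d + self.q]]
        return self.x[0]

    def update(self, z):
        """Correct the state with a position measurement z."""
        (a, b), (c, d) = self.P
        s = a + self.r               # innovation variance
        k0, k1 = a / s, c / s        # Kalman gain for H = [1, 0]
        y = z - self.x[0]            # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b],
                  [c - k1 * a, d - k1 * b]]


# One filter per ground-plane axis, stepped at an assumed 20 Hz frame rate.
kx, kz = ConstantVelocityKalman1D(pos=0.0), ConstantVelocityKalman1D(pos=3.0)
dt = 1 / 20
for X_meas, Z_meas in [(0.05, 3.0), (0.10, 2.95)]:
    kx.predict(dt)
    kz.predict(dt)  # predicted Z gives the disparity used to pick the layer
    kx.update(X_meas)
    kz.update(Z_meas)
```

In the tracking loop, the measurement passed to `update` would be the (X, Z) position found by the two correlation steps in the localization stage.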
Tracking Videos • recursive template update • [Videos: walking figure eight; running]
Tracking Videos • [Videos: visualizing tracks from map view; tracking under multiple occlusions]
Evaluating use of stereo in tracker • Experiment: disable stereo in tracker • code modifications: • disable re-centering step • weighted intensity correlation → unweighted correlation • results: • mean tracking rate (TR) drops 4% • mean false positive rate (FP) increases from 3% to 10% • (qualitative) template drift causes people to be lost and re-detected
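The difference between the weighted and unweighted matching in this experiment can be illustrated with a small sketch. It uses a mean absolute difference as the match score (lower is better) rather than whatever correlation measure the tracker actually uses, and the function name and the 0-to-1 support weights are assumptions.

```python
def weighted_sad(patch, template, support=None):
    """Mean absolute difference between patch and template.

    support: optional 2D list of per-pixel weights (e.g. 1 on the person,
             0 on background). None gives the unweighted score.
    """
    total, weight_sum = 0.0, 0.0
    for r in range(len(template)):
        for c in range(len(template[0])):
            w = 1.0 if support is None else support[r][c]
            total += w * abs(patch[r][c] - template[r][c])
            weight_sum += w
    return total / weight_sum if weight_sum > 0 else float("inf")


template = [[10, 10], [10, 10]]
patch = [[10, 90], [10, 10]]   # top-right pixel is background clutter
support = [[1, 0], [1, 1]]     # support template masks that pixel out
weighted = weighted_sad(patch, template, support)   # 0.0
unweighted = weighted_sad(patch, template)          # 20.0
```

With the support weights, background pixels inside the template window do not affect the score; without them, clutter behind the person is matched and then absorbed into the recursively updated template, which is consistent with the drift reported in the qualitative result.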
Conclusion • Stereo is an effective segmentation tool: • detection: provides a foreground layer divided into different depth layers • tracking: helps to avoid template drift by focusing on foreground pixels at object’s depth • Combine segmentation with priors on person shape (i.e. head/torso templates) for person localization.