A General Framework for Tracking Multiple People from a Moving Camera. Wongun Choi, Caroline Pantofaru, Silvio Savarese. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, July 2013. Overview: Motivation, Related Work, Introduction, Proposed Method, Experimental Results, Conclusion.
A General Framework for Tracking Multiple People from a Moving Camera. Wongun Choi, Caroline Pantofaru, Silvio Savarese. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, July 2013
Overview • Motivation • Related Work • Introduction • Proposed Method • Experimental Results • Conclusion
Motivation 1. The final goal is to track multiple people from a moving camera in both outdoor and indoor video scenes. 2. Several challenges must be solved: • People appear in a wide variety of poses • Multiple people in the same scene produce complex motion patterns • The scene and the illumination change over time
Related work
• Tracking by online learning:
  Learning an appearance model [10],[5],[34],[7],[26]
  Color histogram and mean shift [10]
• Tracking with a moving camera:
  Probabilistic framework with multiple detectors [42],[43]
  Stereo and graphical model [12],[13]
[5] S. Avidan. Ensemble tracking. PAMI, 2007.
[7] C. Bibby and I. Reid. Robust real-time visual tracking using pixel-wise posteriors. In ECCV, 2008.
[10] D. Comaniciu and P. Meer. Mean shift: A robust approach toward feature space analysis. PAMI, 2002.
[12] A. Ess, B. Leibe, K. Schindler, and L. van Gool. A mobile vision system for robust multi-person tracking. In CVPR, 2008.
[13] A. Ess, B. Leibe, K. Schindler, and L. van Gool. Robust multi-person tracking from a mobile platform. PAMI, 2009.
[26] S. Kwak, W. Nam, B. Han, and J. Han. Learning occlusion with likelihoods for visual tracking. In ICCV, 2011.
[34] D. Ramanan, D. Forsyth, and A. Zisserman. Tracking people by learning their appearance. PAMI, Jan. 2007.
[42] C. Wojek, S. Walk, S. Roth, and B. Schiele. Monocular 3D scene understanding with explicit occlusion reasoning. In CVPR, 2011.
[43] C. Wojek, S. Walk, and B. Schiele. Multi-cue onboard pedestrian detection. In CVPR, 2009.
Introduction(1) The proposed method addresses each challenge: • Variety of poses: fuse multiple person detection methods and additional observation cues • Complex motion patterns of multiple people in the same scene: build a motion model that captures the interactions between targets • Changing scene and illumination: propose a novel 3D model that explains the process of video generation
Introduction(2) Observation cues:
Introduction(3) Build 3D Model:
Introduction(4) Particle filter: 1. Definition: a posterior density estimation algorithm that estimates the posterior density of the state space by directly implementing the Bayesian recursion equations. 2. Sampling generates the state distribution of the posterior, and resampling reconstructs the new distribution.
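The sample / weight / resample cycle described on this slide can be sketched as a minimal bootstrap particle filter. This is only an illustration on a 1-D state: the random-walk motion model, the Gaussian observation likelihood, and the names `particle_filter_step`, `motion_std`, and `obs_std` are assumptions made for the example, not the tracker presented in the paper.

```python
import math
import random


def particle_filter_step(particles, observation, motion_std=0.5, obs_std=1.0, rng=random):
    """One predict / weight / resample cycle of a bootstrap particle filter (1-D state)."""
    # Predict: propagate every particle through an assumed random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Weight: assumed Gaussian likelihood of the observation given each particle.
    weights = [math.exp(-0.5 * ((observation - p) / obs_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # Every weight underflowed to zero; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw an equally weighted particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Posterior mean approximated by the average of the particle set."""
    return sum(particles) / len(particles)
```

Calling `particle_filter_step` once per frame with the newest measurement keeps the particle set as a running approximation of the posterior, and `estimate` reads off its mean.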
Introduction(5) Reversible-Jump Markov Chain Monte Carlo (RJMCMC): a class of algorithms for sampling from probability distributions by constructing a Markov chain that allows the dimensionality of the state to change
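To make the idea of a chain whose state changes dimension concrete, here is a toy birth/death sampler. It is a sketch under stated assumptions: the target is a Poisson point process on [0, 1] with intensity `rate`, chosen only because its acceptance ratios are simple, and the function name `rjmcmc_point_count` is hypothetical. It does not reproduce the moves or the posterior used in the paper's tracker.

```python
import random


def rjmcmc_point_count(rate=3.0, n_iter=20000, seed=0):
    """Birth/death reversible-jump sampler over a variable-size set of points in [0, 1].

    Assumed target: a Poisson point process with intensity `rate`, so the
    number of points in the state is Poisson(rate) distributed.
    """
    rng = random.Random(seed)
    points = []   # current state; its length is the current dimension
    counts = []   # trace of the dimension after every iteration
    for _ in range(n_iter):
        k = len(points)
        if rng.random() < 0.5:
            # Birth move: propose adding one point drawn uniformly on [0, 1].
            # Acceptance ratio for this target and proposal is rate / (k + 1).
            if rng.random() < min(1.0, rate / (k + 1)):
                points.append(rng.random())
        elif k > 0:
            # Death move: propose deleting one uniformly chosen point.
            # Acceptance ratio is the inverse of the matching birth, k / rate.
            if rng.random() < min(1.0, k / rate):
                points.pop(rng.randrange(k))
        # A death proposed on an empty state is rejected, leaving it unchanged.
        counts.append(len(points))
    return counts
```

In a tracking setting the analogous jumps would add or remove a target from the joint state, which is why a sampler of this family suits a scene where people enter and leave the view.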
Proposed Method System overview: 1. Use observation cues to generate detection hypotheses and an observation model 2. Build a motion model that accounts for both people’s unexpected motions and the interactions between people 3. Run the sampling procedure of the RJ-MCMC tracker, which includes evaluation (resampling)
Proposed Method Model representation:
Proposed Method • Treat the camera, the targets, and the geometric features as random variables and model their relationship with a joint posterior probability • The tracking problem can be formulated as finding the maximum a posteriori (MAP) solution • The posterior combines three terms: the observation likelihood, the motion model (transition model), and the posterior at time t-1
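The three terms named on this slide are the factors of the standard recursive Bayesian filtering form. It is written out below as a sketch; the symbols are assumptions, since the slide's own equation did not survive: \Omega_t stands for the joint state (camera, targets, geometric features) and I_t for the image observation at time t.

```latex
P(\Omega_t \mid I_{1:t}) \;\propto\;
\underbrace{P(I_t \mid \Omega_t)}_{\text{observation likelihood}}
\int
\underbrace{P(\Omega_t \mid \Omega_{t-1})}_{\text{motion model}}\,
\underbrace{P(\Omega_{t-1} \mid I_{1:t-1})}_{\text{posterior at time } t-1}
\, d\Omega_{t-1},
\qquad
\hat{\Omega}_t = \arg\max_{\Omega_t} P(\Omega_t \mid I_{1:t})
```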
Proposed Method • Observation likelihood: Camera projection function:
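As an illustration of what a camera projection function does, the sketch below maps a 3-D point to pixel coordinates with a plain pinhole model. The model, the axis convention (x right, y down, z forward), and the parameter names `focal_length` and `principal_point` are assumptions for the example; the camera parameterization used in the paper is not reproduced here.

```python
def project_point(point_cam, focal_length, principal_point):
    """Project a 3-D point in camera coordinates onto the image plane (pinhole model).

    point_cam: (x, y, z) with x to the right, y downward, z along the optical axis.
    Returns the pixel coordinates (u, v).
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    # Perspective division followed by a shift to the principal point.
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return u, v
```

For example, a point on the optical axis projects to the principal point itself, and points farther from the camera move toward it.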
Proposed Method Target Observation Likelihood: j: index over detectors; wj: weight for detector j
Proposed Method Target Observation Likelihood: 1) pedestrian detector 2) upper body detector 3) target-specific detector based on appearance model 4) detector based on upper-body shape from depth 5) face detector 6) skin detector 7) motion detector
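One common way to fuse several detectors with per-detector weights is a weighted sum of their scores for a target hypothesis. The sketch below assumes that form and assumes the scores are already log-likelihood-like numbers; the function name, the dictionary layout, and the example weights are hypothetical, and the slide's exact formula is not recoverable from the text.

```python
def target_log_likelihood(detector_scores, weights):
    """Weighted sum of per-detector scores for one target hypothesis.

    detector_scores: {detector name: score of detector j for this hypothesis}
    weights:         {detector name: weight w_j}
    Detectors that produced no score for this hypothesis are simply absent.
    """
    missing = set(detector_scores) - set(weights)
    if missing:
        raise KeyError(f"no weight defined for detectors: {sorted(missing)}")
    return sum(weights[name] * score for name, score in detector_scores.items())
```

Because absent detectors contribute nothing, the same function works when a cue is unavailable, for example when there is no depth sensor or the face is not visible.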
Proposed Method Pedestrian and upper body detector using HOG:
Proposed Method Face detector using the OpenCV Viola-Jones face detector:
Proposed Method Skin color detector using a threshold in HSV color space:
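Thresholding in HSV space can be sketched with the standard-library `colorsys` module. The threshold values below (`h_max`, `s_range`, `v_min`) are illustrative assumptions, not the values used in the paper, and the helper names `is_skin` and `skin_mask` are hypothetical.

```python
import colorsys


def is_skin(rgb, h_max=50 / 360, s_range=(0.23, 0.68), v_min=0.35):
    """Return True if an 8-bit (r, g, b) pixel falls inside an assumed skin range in HSV."""
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # each component lies in [0, 1]
    # Simplification: hues that wrap around just below 1.0 (reddish) are not included.
    return h <= h_max and s_range[0] <= s <= s_range[1] and v >= v_min


def skin_mask(image):
    """Apply the threshold to every pixel of an image given as rows of (r, g, b) tuples."""
    return [[is_skin(pixel) for pixel in row] for row in image]
```

The resulting boolean mask marks candidate skin pixels, which a tracker could then score inside each hypothesized face or upper-body region.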
Proposed Method Depth shape detector using world coordinate system:
Proposed Method Motion detector: project motion points onto the image plane and threshold:
Proposed Method Geometric Feature likelihood by interest point detector: is the uniform distribution
Proposed Method Motion prior:
Proposed Method Camera motion prior:
Proposed Method Target motion prior:
Proposed Method Existence prior:
Proposed Method Motion prior: Independent Interacting
Proposed Method Independent Motion prior : update
Proposed Method Interacting Motion prior: Mode variable
Proposed Method Repulsion: Group motion: Repulsion force
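The two interaction modes can be illustrated with pairwise potentials: one that penalizes two targets occupying nearly the same position (repulsion) and one that favors similar velocities (group motion). The functional forms, the constants `c_r` and `c_g`, and all function names below are assumptions chosen for illustration; the slide's formulas did not survive, so this should not be read as the paper's definition.

```python
import math


def repulsion_potential(pos_i, pos_j, c_r=2.0):
    """Near 0 when two targets almost coincide, approaching 1 as they move apart."""
    r = math.dist(pos_i, pos_j)
    if r == 0.0:
        return 0.0
    return math.exp(-1.0 / (c_r * r))


def group_potential(vel_i, vel_j, c_g=3.0):
    """Equal to 1 for identical velocities, decaying as the velocities diverge."""
    return math.exp(-c_g * math.dist(vel_i, vel_j))


def interaction_potential(state_i, state_j, mode):
    """Select the potential according to the mode variable ('repulsion' or 'group').

    Each state is a pair (position, velocity) of coordinate tuples.
    """
    (pos_i, vel_i), (pos_j, vel_j) = state_i, state_j
    if mode == "repulsion":
        return repulsion_potential(pos_i, pos_j)
    if mode == "group":
        return group_potential(vel_i, vel_j)
    raise ValueError(f"unknown interaction mode: {mode!r}")
```

Multiplying such a potential into the motion prior for every interacting pair lowers the probability of joint states in which tracks collapse onto each other or in which group members drift apart.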
Proposed Method Tracking by Reversible Jump Markov Chain Monte Carlo Particle filtering: • Sampling: • Convert posterior problem:
Experimental result • Dataset: ETH dataset [12] • Video frame rate: ~14 Hz • Resolution: 640 × 480 pixels
Experimental result • Single-frame detection accuracy is measured by the overlap ratio between the ground-truth bounding box and the tracked bounding box.
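A common definition of the overlap ratio between two boxes is intersection over union, sketched below. Treating the slide's overlap ratio as intersection over union, and the `(x_min, y_min, x_max, y_max)` box layout, are assumptions made for the example.

```python
def overlap_ratio(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the intersection rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A tracked box would then count as a correct detection in a frame when its overlap with the ground-truth box exceeds a chosen threshold.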
Conclusion • Combines a probabilistic model with joint variables • Models the relationship between the camera, the targets, and the geometric features • Combines multiple cues • Adaptable to different sensor configurations and different environments • Allows people to interact • Automatically detects people