Tracking Pedestrians Using Local Spatio-Temporal Motion Patterns in Extremely Crowded Scenes Louis Kratz and Ko Nishino IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2012
Outline • Motivation • Introduction • Proposed method • Experimental results • Conclusion
Motivation • Goal: tracking single or multiple pedestrians in crowded scenes • Solve conventional tracking problems • Occlusion • Pedestrians moving in different directions • Appearance change
Introduction (1) • Observe a phenomenon in crowded scenes
Observation • Small areas of instantaneous motion tend to repeat • Temporally • Spatially
Introduction (2) • Spatio-temporal motion pattern • Describes crowd motion • Build a spatial and temporal statistical model • Use it to predict the movement of individuals
Spatio-temporal motion pattern • 3D gradient vector ∇I = (∂I/∂x, ∂I/∂y, ∂I/∂t) computed at each pixel • Calculate the mean motion vector or build a statistical model at each cuboid
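Below is a minimal sketch (not the paper's code) of this computation with NumPy: the 3D spatio-temporal gradient at every pixel of a grayscale clip, followed by the mean motion vector of each non-overlapping cuboid. The array shapes and the 10×10×10 cuboid size (taken from the experimental settings later in the talk) are assumptions.

```python
import numpy as np

def spatio_temporal_gradients(video):
    """video: (T, H, W) grayscale clip. Returns the 3D gradient at every pixel."""
    # np.gradient differentiates along each axis of the (T, H, W) volume.
    g_t, g_y, g_x = np.gradient(video.astype(np.float64))
    return np.stack([g_x, g_y, g_t], axis=-1)            # (T, H, W, 3)

def mean_motion_per_cuboid(gradients, cuboid=(10, 10, 10)):
    """Average the gradient vectors inside each non-overlapping cuboid."""
    T, H, W, _ = gradients.shape
    ct, cy, cx = cuboid
    means = np.zeros((T // ct, H // cy, W // cx, 3))
    for i in range(T // ct):
        for j in range(H // cy):
            for k in range(W // cx):
                block = gradients[i*ct:(i+1)*ct, j*cy:(j+1)*cy, k*cx:(k+1)*cx]
                means[i, j, k] = block.reshape(-1, 3).mean(axis=0)
    return means
```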
Introduction (3) • Hidden Markov Model: • States are not directly visible • Composed of three components • Observation probabilities • Transition probabilities • Initial probabilities
Introduction (4) • Posterior distribution: given the observed data X, find the probability of the parameters
Introduction (5) • Particle filter: used to predict the next state • Differs from the Kalman filter: • Robust to nonlinear systems and can handle non-Gaussian noise • Measurement model
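For illustration, a generic particle-filter step is sketched below: predict, re-weight by the measurement likelihood, and resample. The transition and likelihood functions are hypothetical placeholders; the paper's specific models appear in steps (c) and (d).

```python
import numpy as np

def particle_filter_step(particles, weights, transition_fn, likelihood_fn, z):
    """One predict / update / resample cycle; transition_fn and likelihood_fn are placeholders."""
    # Predict: propagate each particle through the (possibly nonlinear) dynamics.
    particles = np.array([transition_fn(p) for p in particles])
    # Update: re-weight by how well each particle explains the new measurement z.
    weights = weights * np.array([likelihood_fn(z, p) for p in particles])
    weights = weights / weights.sum()
    # Resample: draw particles in proportion to their weights (handles non-Gaussian posteriors).
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```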
(a) Divide the training video into spatio-temporal cuboids, calculate motion vectors, and build a statistical model for each motion pattern • (b) Train a collection of hidden Markov models • (c) Use observed local motion patterns to predict the motion pattern at each location • (d) Use the predicted motion patterns to track individuals
Step (a): statistical model for motion patterns • 1. First, calculate the motion vector at each pixel using the 3D gradient vector • 2. Next, build a statistical model using a 3D Gaussian distribution
3. Define the local spatio-temporal motion pattern at location n and frame t as the 3D Gaussian fit to that cuboid's gradients
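A small sketch of step (a), under the assumption that each cuboid's gradients are modeled with a single 3D Gaussian (mean and covariance); the function name and shapes are illustrative.

```python
import numpy as np

def fit_motion_pattern(grads):
    """grads: (N, 3) gradient vectors collected from one spatio-temporal cuboid."""
    mu = grads.mean(axis=0)
    sigma = np.cov(grads, rowvar=False) + 1e-6 * np.eye(3)   # regularized covariance
    return mu, sigma
```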
Step (b): train hidden Markov models • 1. Using a clustering algorithm, divide the motion patterns into S clusters • 2. Define states {s = 1, …, S}, where S is the number of clusters • 3. For a specific hidden state s, the probability of an observed motion pattern is computed from the divergence between the observed distribution and the state's distribution
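The emission term can be illustrated as below, assuming the observed pattern and the hidden state are both 3D Gaussians compared with the KL divergence and mapped to a probability via exp(-KL); the exact divergence and normalization used in the paper may differ.

```python
import numpy as np

def kl_gaussian(mu0, sig0, mu1, sig1):
    """KL divergence between two multivariate Gaussians N0 and N1."""
    k = mu0.shape[0]
    inv1 = np.linalg.inv(sig1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ sig0) + diff @ inv1 @ diff - k
                  + np.log(np.linalg.det(sig1) / np.linalg.det(sig0)))

def observation_prob(obs, state):
    """obs, state: (mu, Sigma) pairs for the observed and state motion patterns."""
    return np.exp(-kl_gaussian(obs[0], obs[1], state[0], state[1]))
```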
Step (c): predict motion patterns • Take the expected value of the predictive distribution • Solved with the forward-backward algorithm • Reference: [23] L. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proc. IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989.
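A sketch of the prediction in step (c): a standard HMM forward pass over the observed motion patterns, followed by a one-step-ahead state prediction and an expected motion vector. Variable names and the expectation over state means are illustrative assumptions.

```python
import numpy as np

def forward_predict(obs_probs, trans, init):
    """
    obs_probs: (T, S) likelihoods P(O_t | s) of the observed motion patterns
    trans: (S, S) state transition matrix, init: (S,) initial distribution
    Returns P(s_{T+1} | O_1..O_T), the predicted state distribution.
    """
    alpha = init * obs_probs[0]
    alpha = alpha / alpha.sum()
    for t in range(1, obs_probs.shape[0]):
        alpha = (alpha @ trans) * obs_probs[t]
        alpha = alpha / alpha.sum()
    return alpha @ trans                      # one-step-ahead prediction

def expected_motion(pred_state, state_means):
    """state_means: (S, 3) mean gradient vector of each hidden state."""
    return pred_state @ state_means
```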
Step (d): track individuals • Use a particle filter to maximize the posterior distribution: P(xf | z1:f) ∝ P(zf | xf) P(xf | xf-1), i.e., posterior ∝ likelihood × prior
xf-1 = [x, y, w, h]T is the state in frame f-1 • The figure shows that the state vector xf-1 defines a target window at frame f-1
Past and current measurements: z1:f = {z1, …, zf} • zf is the frame at time f
Priors • We use the motion pattern at the center of the tracked target to estimate priors on the distribution of the next state xf
Transition distribution • P(xf | xf-1) is the transition distribution • We model it with a normal distribution centered at the previous position shifted by v, with covariance Σ, where v is the 2D optical flow vector from the predicted motion pattern [27] and Σ is the covariance matrix of the predicted motion pattern distribution • Reference: [27] J. Wright and R. Pless, "Analysis of Persistent Motion Patterns Using the 3D Structure Tensor," Proc. IEEE Workshop Motion and Video Computing, pp. 14-19, 2005.
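A sketch of drawing candidate states from this transition model: particle positions are sampled from a normal distribution centered at the previous position shifted by the flow vector v, with covariance Σ from the predicted pattern; keeping the window size fixed is an assumption.

```python
import numpy as np

def sample_transition(x_prev, v, sigma, n_particles=100):
    """x_prev = [x, y, w, h]; v: (2,) flow vector; sigma: (2, 2) covariance."""
    x_prev = np.asarray(x_prev, dtype=float)
    xy = np.random.multivariate_normal(x_prev[:2] + v, sigma, size=n_particles)
    wh = np.tile(x_prev[2:], (n_particles, 1))
    return np.hstack([xy, wh])                # (n_particles, 4) candidate states
```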
Likelihood distribution P(zf | xf) • T: template of the tracked individual • R: region of the bounding box at frame f • Z: normalization constant • σ: variance accounting for appearance change
Define the distance measure as the average angle between corresponding gradient vectors, d(T, R) = (1/M) Σi angle(ti, ri) • ti: template gradient vector • ri: region gradient vector • M: number of pixels in the template • If the distance is large, the likelihood is small; if the distance is small, the likelihood is large
Add weight information to handle appearance change • The error Ei accounts for appearance change • Pixels in occluded regions have a large angle between ti and ri, so the error Ei is large • When Ei is large, the weight becomes small
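The distance and weighting can be sketched as follows, assuming the per-pixel error Ei is the angle between template and region gradient vectors and the weight decays with Ei (here exp(-Ei), an assumed form):

```python
import numpy as np

def weighted_gradient_distance(t_vecs, r_vecs, eps=1e-8):
    """t_vecs, r_vecs: (M, D) per-pixel gradient vectors of template and region."""
    cos = np.sum(t_vecs * r_vecs, axis=1) / (
        np.linalg.norm(t_vecs, axis=1) * np.linalg.norm(r_vecs, axis=1) + eps)
    errors = np.arccos(np.clip(cos, -1.0, 1.0))   # per-pixel error E_i (angle between t_i and r_i)
    weights = np.exp(-errors)                     # large E_i -> small weight (assumed form)
    return np.sum(weights * errors) / (np.sum(weights) + eps)
```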
Experimental results • Implementation: • Intel Xeon X5355 2.66GHz processor • 10 frames per second • Cuboid size 10×10×10
Datasets • From the UCF crowd data set • 300, 350, 300, and 120 frames, respectively • (a) train station concourse • (b) ticket gate • (c) sidewalk • (d) intersection
Experiment 1 • White indicates high error • Errors occur in areas with little texture or noise • The intersection scene has higher error due to the small amount of training data
When occlusion is severe, the variance of the likelihood increases (at frames 56, 112, and 201)
Experiment 4 • Errors are caused by initial (trained) states that do not contain this direction of motion
Conclusion • We proposed an efficient method for tracking individuals in crowded scenes • The method addresses errors caused by occlusion, appearance change, and movement in different directions