Fusing Multiple Video Sensors for Surveillance
By: Lauro Snidaro, Ingrid Visentini, Gian Luca Foresti
Presented by: Sushma Ajjampur Jagadeesh
Introduction
• Why a surveillance system?
• What is a video surveillance system?
• Fundamental issues with surveillance systems.
• This paper presents a novel fusion framework for combining the data coming from multiple sensors observing a surveillance area.
Types of Sensor Fusion
• Signal-level fusion: allows sensors of the same type to be fused.
  • E.g. imaging sensors.
• Feature-level fusion: combines multiple, heterogeneous features extracted from a single image.
  • E.g. audio and video sensors.
Example of Video Surveillance Setup
• Fusion techniques cannot be applied at the image level.
• Instead, features are extracted from each sensor view and fused in a common representational format.
• In video surveillance, this can be achieved by tracking the target and projecting it onto a top-view map.
Tracking
• Sensor-Level Tracking
  • An algorithm matches the current detections with those extracted in the previous frame.
  • Enhances the robustness of the tracker.
  • Heterogeneous features extracted from the data can improve the quality of the tracking result.
• Tracking via Classification
  • A single classifier or a classifier ensemble tracks an object by separating the target from the background.
  • More robust to occlusions and illumination changes.
  • This approach is appealing when the classifier is updated by learning the current appearance of the target.
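The frame-to-frame matching step of sensor-level tracking can be illustrated with a small sketch. The slides do not say which matching algorithm is used, so this assumes a simple greedy nearest-neighbour association on detection centroids; the function name `match_detections` and the `max_distance` gate are hypothetical.

```python
import math


def match_detections(previous, current, max_distance=50.0):
    """Greedily pair previous-frame detections with current-frame detections.

    previous, current: lists of (x, y) centroids in pixels.
    max_distance: assumed gating threshold; farther pairs are never matched.
    Returns a sorted list of (previous_index, current_index) pairs.
    """
    # All candidate pairs, closest first.
    candidates = sorted(
        (math.dist(p, c), i, j)
        for i, p in enumerate(previous)
        for j, c in enumerate(current)
    )
    used_prev, used_curr, matches = set(), set(), []
    for distance, i, j in candidates:
        if distance > max_distance:
            break  # every remaining pair is even farther apart
        if i in used_prev or j in used_curr:
            continue  # each detection is matched at most once
        used_prev.add(i)
        used_curr.add(j)
        matches.append((i, j))
    return sorted(matches)


# Two targets in the previous frame, three detections in the current one.
previous = [(10, 10), (100, 100)]
current = [(102, 98), (12, 11), (300, 300)]
pairs = match_detections(previous, current)
```

Detections left unmatched, such as the third one here, would typically start a new track; that policy is outside this sketch.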
Projection
• Gaussian Approximation
• Full Likelihood Map
Projection of Gaussian Approximation
Figure: projection of the point (xg, yg) representing the target, via homographic transformation, onto the top-view map plane.
• (xc, yc): the center of the bounding box, which represents the most likely position of the searched target in the current frame.
• (xg, yg): the projection of (xc, yc) onto the lower side of the bounding box, where the target touches the ground.
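The point projection described on this slide can be sketched as follows. The bounding-box layout, the function names, and the matrix `H` are assumptions made for illustration; in practice the homography would come from calibrating each camera against the map.

```python
def ground_point(box):
    """Return (xg, yg): the box centre dropped onto the lower side of the box.

    box is (x_min, y_min, x_max, y_max) in image coordinates,
    with y growing downward (an assumed convention).
    """
    x_min, y_min, x_max, y_max = box
    xc = (x_min + x_max) / 2.0
    return xc, y_max


def project(H, point):
    """Apply a 3x3 homography H (nested lists, row-major) to an image point."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under H")
    # Divide by the homogeneous coordinate to get map-plane coordinates.
    return u / w, v / w


# Hypothetical calibration matrix and bounding box, for illustration only.
H = [[0.5, 0.0, 10.0],
     [0.0, 0.5, 20.0],
     [0.0, 0.001, 1.0]]
box = (100, 50, 140, 200)
map_x, map_y = project(H, ground_point(box))
```

Using the ground-contact point rather than the box centre matters because the homography is only valid for points lying on the ground plane.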
Figure: proposed projection of a Gaussian distribution approximating the likelihood map within the search region.
• Instead of projecting a single point estimate of the target's position for each sensor, this work approximates the likelihood over the search region of each sensor's image plane with a single Gaussian and projects that Gaussian onto the map plane.
• Fusion via Independent Likelihood Pool.
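To make the fusion step concrete, here is a minimal sketch of an independent likelihood pool for Gaussian likelihoods. It assumes a flat prior and that each sensor contributes a 2-D Gaussian on the map plane, in which case the product of the likelihoods is again Gaussian, with the inverse covariances adding up. The function names are hypothetical and the numbers are made up.

```python
def inv2(m):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular covariance")
    return [[d / det, -b / det], [-c / det, a / det]]


def fuse_gaussians(estimates):
    """Fuse per-sensor Gaussians by multiplying them (flat prior assumed).

    estimates: list of (mean, cov) with mean = [x, y] and cov a 2x2 nested list.
    Returns (fused_mean, fused_cov).
    """
    info = [[0.0, 0.0], [0.0, 0.0]]  # sum of inverse covariances
    vec = [0.0, 0.0]                 # sum of inverse covariance times mean
    for mean, cov in estimates:
        p = inv2(cov)
        for r in range(2):
            for c in range(2):
                info[r][c] += p[r][c]
            vec[r] += p[r][0] * mean[0] + p[r][1] * mean[1]
    fused_cov = inv2(info)
    fused_mean = [fused_cov[r][0] * vec[0] + fused_cov[r][1] * vec[1]
                  for r in range(2)]
    return fused_mean, fused_cov


# A confident sensor (small covariance) and an uncertain one (large covariance).
camera_a = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
camera_b = ([10.0, 0.0], [[4.0, 0.0], [0.0, 4.0]])
fused_mean, fused_cov = fuse_gaussians([camera_a, camera_b])
```

The fused mean lands closer to the more confident sensor, and the fused covariance is smaller than either input, which is how fusion can give a more precise position estimate than any single camera.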
Limitations of Gaussian Approximation
• A single Gaussian cannot represent a multimodal distribution, so the approach is limiting whenever the likelihood has more than one peak.
• During occlusions or other ambiguous conditions, the classifiers cannot properly detect the target, and the resulting likelihood can be multimodal.
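A toy 1-D example shows the problem. It assumes the Gaussian is fitted by moment matching (weighted mean and variance), which the slides do not specify; the likelihood values are made up.

```python
def gaussian_fit(positions, likelihood):
    """Moment-match a single Gaussian to a sampled 1-D likelihood."""
    total = sum(likelihood)
    mean = sum(x * w for x, w in zip(positions, likelihood)) / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(positions, likelihood)) / total
    return mean, var


# A likelihood with two equal peaks, e.g. the target and a look-alike occluder.
positions = list(range(11))
likelihood = [0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0]  # peaks at positions 2 and 8
mean, var = gaussian_fit(positions, likelihood)
# The fitted mean falls midway between the peaks, where the likelihood is zero.
likelihood_at_mean = likelihood[int(mean)]
```

The fitted Gaussian places its maximum at a position the classifier considers impossible, which is the failure mode that motivates projecting the full likelihood map.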
Projection of Full Likelihood Map
• Project the likelihoods produced by the classifiers on the sensors' image planes onto the common top-view map.
• This projection preserves all the characteristics of the likelihood on the original image plane.
• The drawback is that the projections are not in analytical form and must be properly normalized.
• Fusion using Bayesian inference.
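Once each sensor's likelihood has been projected onto a common map grid, the fusion step can be sketched as below. This assumes the maps are already resampled onto the same grid, that sensors are conditionally independent, and that the prior is uniform unless one is passed in; the function names and grid values are hypothetical.

```python
def normalize(grid):
    """Scale a 2-D grid of non-negative values so that it sums to 1."""
    total = sum(sum(row) for row in grid)
    if total <= 0:
        raise ValueError("likelihood map has no mass")
    return [[v / total for v in row] for row in grid]


def fuse_maps(maps, prior=None):
    """Bayesian fusion on a grid: prior times the product of normalized maps."""
    rows, cols = len(maps[0]), len(maps[0][0])
    fused = prior if prior is not None else [[1.0] * cols for _ in range(rows)]
    for grid in maps:
        g = normalize(grid)  # projections are not normalized by construction
        fused = [[fused[r][c] * g[r][c] for c in range(cols)]
                 for r in range(rows)]
    return normalize(fused)


def argmax_cell(grid):
    """Return the (row, col) of the most likely map cell."""
    cells = ((r, c) for r in range(len(grid)) for c in range(len(grid[0])))
    return max(cells, key=lambda rc: grid[rc[0]][rc[1]])


# Camera A is ambiguous (two equal peaks); camera B favours the lower cell.
camera_a = [[1, 4, 1],
            [0, 4, 0]]
camera_b = [[0, 1, 0],
            [0, 3, 0]]
posterior = fuse_maps([camera_a, camera_b])
best = argmax_cell(posterior)
```

Because the whole map is kept, camera A's second peak survives until camera B's evidence resolves the ambiguity, which a single Gaussian per camera could not express.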
Experimental Results
Two-camera experiment at a parking lot
Figure: field of view of both cameras in the parking lot. Centre: confidence of the detectors on the two camera sequences.
Figure: fused projected likelihoods of the parking lot sequence. Top view of position estimates from the first camera (dark red) and the second camera (green). Fusion of the trajectories obtained from the two sensors using the single Gaussian approximation and the full likelihood projection.
Three-camera experiment at a train station
Figure: quality of the detection process for the three cameras.
Figure panels: Gaussian fusion output; likelihood fusion result.
Figure panels: two-camera result; three-camera result.
Conclusion
• This paper showed how a multi-camera surveillance system can exploit data fusion to obtain more precise estimates of a target's position.
• The result was obtained through a novel framework that projects the likelihoods produced by the online-trained classifiers used for target tracking.
• The method has the benefit of dynamically performing sensor fusion or sensor selection depending on the available data.
• Future research: further experiments with the proposed approach, and investigating the addition of a filtering step before fusion on the map plane.