Visual Surveillance in Retail Stores and Home • Authors: Tomas Brodsky, Robert Cohen, Eric Cohen-Solal, Srinivas Gutta, Damian Lyons, Vasanth Philomin, Miroslav Trajkovic (Philips Research USA) • Course: CIS 750 - Video Processing and Video Mining, Spring 2003 • Presenter: Nilesh Ghubade (nileshg@temple.edu) • Advisor: Dr. Longin Jan Latecki
Agenda • Abstract • Introduction • PTZ Camera Calibration • Intruder Detection and Tracking with a PTZ Camera • Video Content Analysis • Indexing and Retrieval • Residential Intruder Detection • Object Classification with Radial-Basis Networks • Conclusions • References
Abstract • Professional security market: retail-store monitoring. • Low-cost automated residential security. • Pan-Tilt-Zoom (PTZ) camera: • Intruder tracking • Calibration and enhanced camera control. • Video content analysis: detection of security-related objects and events. • The system processes video in real time and issues immediate alarms to alert the security operator. • Relevant information is stored in a database for later retrieval. • Residential monitoring: intruder detection system • Robust to changes in lighting • Object classification scheme based on radial-basis networks.
Introduction • Traditional commercial video surveillance systems: • Capture several hours' or days' worth of video. • Manual search is a tedious job. • Predefined alarms: an improvement over manual search, but … • Alarms usually must be defined before the video is captured. • Search is limited to the predefined binary alarms. • Search becomes cumbersome if the alarms are too simplistic or generate too many false alarms. • This system • Detects a whole range of events such as 'enter', 'people met', 'deposit object', 'leave', etc. • Uses a semantic indexing and retrieval process for search. • In the residential environment, the low-cost requirement introduces constraints: • Grayscale cameras (instead of color). • Limited computational power. • No supervision. • Robustness to environmental changes.
PTZ Camera Calibration • Advantages of a pan-tilt-zoom (stationary, but rotating and zooming) camera: • One camera can be used for surveillance of a large area. • It can look closely at points of interest. • Knowledge of camera position and orientation is crucial for geometric reasoning: • Automatically pointing the camera at a location by clicking on its position on the area map. • Displaying the current field of view of the camera. • Knowledge of the internal camera calibration parameters is important for: • Tracking with a rotating camera. • Obtaining metric measurements. • Knowing how much to zoom to achieve a desired view, etc. • Goal: automatic calibration of surveillance cameras. • Assumptions: • The camera principal point and the center of rotation of the pan and tilt units coincide. • The skew factor is zero. • The principal point does not move while the camera is zooming. • The maximum zoom-out factor s of the camera is known.
PTZ Camera Internal Calibration • Point the camera at a texture-rich area in the room. • The camera zooms fully in and out to acquire two images I1 and I2. • The principal point is then determined by scaling image I1 down by the factor s and finding the best match for the resulting template in image I2. • Take two images at a fixed pan and different tilt settings. Then f = -d / tan(Δθ), where • f = focal length at the particular zoom setting, • d = displacement of the principal point between the two images, • Δθ = difference in the tilt angles. • Compute the mapping between zoom settings and focal length by fitting the inverse of the focal length (lens power) to a second-order polynomial in zoom ticks. It can be shown that this fit not only has desirable numerical properties (i.e., stability), but also yields a linear solution.
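The two internal-calibration relations above (focal length from a known tilt change, and lens power as a quadratic in zoom ticks) can be sketched as follows. This is a minimal illustration, not the paper's implementation; function names, units (pixels, radians), and the sample tick/focal values are all assumed for the example.

```python
import numpy as np

def focal_from_tilt(d_pixels, delta_tilt_rad):
    """Focal length (pixels) from the displacement d of the principal
    point between two images whose tilt settings differ by delta_tilt:
    f = -d / tan(delta_tilt)."""
    return -d_pixels / np.tan(delta_tilt_rad)

def fit_lens_power(zoom_ticks, focal_lengths, degree=2):
    """Fit the inverse focal length (lens power) as a second-order
    polynomial in zoom ticks; a linear least-squares problem."""
    return np.polyfit(zoom_ticks, 1.0 / np.asarray(focal_lengths, float), degree)

def focal_at_zoom(coeffs, ticks):
    """Evaluate the fitted model to recover focal length at any zoom setting."""
    return 1.0 / np.polyval(coeffs, ticks)
```

Fitting lens power rather than focal length itself is what keeps the problem linear, as the slide notes.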
PTZ Camera External Calibration • Assumes a known camera height and lets the installer determine the camera position and orientation as follows: 1. The user points the camera at several points in the area and clicks on their respective positions on the area map shown in the GUI in Fig. 1. 2. Each time the user clicks a point on the map, the system records the current camera pan/tilt settings and the map location the camera is pointing at. 3. The algorithm then computes the camera position and orientation from the data acquired in step 2.
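One way to sketch the geometry of these steps: the known camera height and the downward tilt to a floor point give a ground range, two such ranges intersect in at most two candidate camera positions on the map, and one clicked bearing fixes the pan offset. This is an illustrative reconstruction under those assumptions, not the paper's actual algorithm; all names are hypothetical.

```python
import math

def ground_range(cam_height, tilt_down_rad):
    """Horizontal distance to a floor point seen at a given downward tilt,
    assuming known camera height."""
    return cam_height / math.tan(tilt_down_rad)

def locate_camera(p1, r1, p2, r2):
    """Intersect two circles (clicked map point, ground range) to get the
    two candidate camera positions on the floor plan."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    d = math.hypot(dx, dy)
    a = (r1 * r1 - r2 * r2 + d * d) / (2 * d)
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))
    mx, my = p1[0] + a * dx / d, p1[1] + a * dy / d
    return ((mx - h * dy / d, my + h * dx / d),
            (mx + h * dy / d, my - h * dx / d))

def pan_offset(cam, p, pan_reading):
    """Camera orientation: map bearing to a clicked point minus the raw
    pan reading at which it was observed."""
    return math.atan2(p[1] - cam[1], p[0] - cam[0]) - pan_reading
```

A third clicked point would disambiguate the two circle-intersection candidates; with several points, a least-squares fit over all observations is the natural extension.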
Intruder Detection and Tracking • Target selection: the first step of the tracking process. • A 'Target Rectangle' (TR) is placed over the torso, head, and part of the trousers. • Hue and saturation color model; gray colors are modeled separately. • The TR is represented by its combined color/gray histogram. • Motion detection: the system has a procedure for fast, recursive histogram matching. • Velocity commands are issued so that the camera moves toward the TR and acquires the next image. • The system has an improved procedure for feature-based image alignment that requires no information about camera calibration or camera motion.
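The combined color/gray histogram of the Target Rectangle can be sketched like this: chromatic pixels are binned by (hue, saturation), low-saturation pixels are binned separately by gray level, and candidate windows are compared by histogram intersection. The bin counts, the gray-saturation threshold, the normalized HSV range, and the intersection score are all assumptions for this sketch, not the paper's parameters.

```python
import numpy as np

GRAY_SAT = 0.1  # saturation below this counts as "gray" (assumed threshold)

def target_histogram(hsv_pixels, h_bins=8, s_bins=4, g_bins=4):
    """Combined color/gray histogram of the TR. hsv_pixels: (N, 3) array
    of (hue, saturation, value), each normalized to [0, 1]."""
    h, s, v = hsv_pixels[:, 0], hsv_pixels[:, 1], hsv_pixels[:, 2]
    gray = s < GRAY_SAT
    # Chromatic pixels: joint hue/saturation histogram.
    color_hist, _, _ = np.histogram2d(h[~gray], s[~gray],
                                      bins=[h_bins, s_bins],
                                      range=[[0, 1], [0, 1]])
    # Gray pixels: separate histogram over the value (gray level) channel.
    gray_hist, _ = np.histogram(v[gray], bins=g_bins, range=(0, 1))
    hist = np.concatenate([color_hist.ravel(), gray_hist]).astype(float)
    return hist / max(hist.sum(), 1.0)

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; the tracker searches for the candidate
    window whose histogram maximizes this score."""
    return float(np.minimum(h1, h2).sum())
```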
Video Content Analysis • The system processes video in real time: • Extracts relevant object and event information. • Indexes this information into a database. • Issues alarms to alert the operator. • Separate retrieval software is used to search for specific data in the database and to quickly review the associated video content. • Assumes a stationary camera and uses a background subtraction technique. • Each video frame is compared with the background model; foreground pixels are extracted, grouped into connected components, and tracked. • Event detection: • Simple events like enter/leave and merge/split are based on the appearance and disappearance of foreground regions. • An event reasoning module generates more complicated events derived from the simple event stream, based on user-provided rules that specify sequences of events, lengths of time intervals, etc. • Hierarchies of events are constructed using a feedback strategy.
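The foreground-extraction step above (compare each frame to the background model, then group foreground pixels into connected components) can be sketched as follows. This is a minimal illustration assuming a static grayscale background image and a fixed difference threshold, not the paper's background model.

```python
import numpy as np

def foreground_mask(frame, background, thresh=25):
    """Pixels whose gray level differs from the background model by
    more than thresh (assumed fixed threshold)."""
    return np.abs(frame.astype(int) - background.astype(int)) > thresh

def connected_components(mask):
    """4-connected labeling of foreground pixels via flood fill.
    Returns (label image, number of components)."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    rows, cols = mask.shape
    for y in range(rows):
        for x in range(cols):
            if mask[y, x] and labels[y, x] == 0:
                count += 1
                labels[y, x] = count
                stack = [(y, x)]
                while stack:
                    cy, cx = stack.pop()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = count
                            stack.append((ny, nx))
    return labels, count
```

Tracking the labeled components across frames then yields the appearance/disappearance events (enter/leave, merge/split) that feed the event reasoning module.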
Indexing and Retrieval • Query types: • Merge/split: show all people that an identified pickpocket interacted with. • Color model: find a group of employees talking without attending to a customer. • New event: theft, i.e., a person hiding an object.
Residential Intruder Detection • Detection of moving objects proceeds in two steps: • First, a background subtraction technique is used to detect pixels that differ from the background model. • Then an additional filter classifies such pixels into real objects or lighting changes. Examples: • Separating a person from his/her shadow. • A moving flashlight beam on a sofa in the living room produces a moving bright spot; the current system detects this as a lighting change. • The gray-level structure of a 3x3 or 5x5 neighborhood around each detected pixel is compared (using normalized cross-correlation filters) with the corresponding region in the reference (background) image. If the regions are similar, the pixel is marked as caused by a lighting change. • Foreground pixels are grouped into objects. • Objects are classified into people and animals, so that the system can suppress alarms caused by pets (false alarms).
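The lighting-change filter described above can be sketched as a normalized cross-correlation test between a small neighborhood of the detected pixel and the same region of the background image: a lighting change scales brightness but preserves gray-level structure, so the correlation stays high. The similarity threshold here is an assumed value, not one from the paper.

```python
import numpy as np

NCC_THRESH = 0.9  # similarity above this counts as a lighting change (assumed)

def ncc(a, b):
    """Normalized cross-correlation of two equal-size gray patches,
    in [-1, 1]; invariant to brightness offset and scaling."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 1.0

def is_lighting_change(frame, background, y, x, half=1):
    """Compare the (2*half+1)^2 neighborhood of a detected foreground
    pixel with the corresponding background region; similar structure
    means the change is attributed to lighting, not a real object."""
    fa = frame[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    fb = background[y - half:y + half + 1, x - half:x + half + 1].astype(float)
    return ncc(fa, fb) > NCC_THRESH
```

With half=1 this is the 3x3 case from the slide; half=2 gives the 5x5 neighborhood.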
Object Classification • Objects are classified based on horizontal and vertical gradient features that capture shape information. • The extracted gradients are used to train a Radial Basis Function (RBF) classifier, an architecture very similar to that of a traditional three-layer back-propagation (neural) network. • Overall classification performance: 93.5%.
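A minimal sketch of this pipeline: simple horizontal/vertical gradient features feed an RBF network whose hidden layer holds Gaussian units at training prototypes, with a least-squares linear output layer. The feature definition, kernel width, and the use of every training sample as a center are illustrative assumptions, not the paper's design.

```python
import numpy as np

def gradient_features(img):
    """Horizontal and vertical gradient magnitudes, flattened into one
    shape-descriptor vector (a stand-in for the paper's features)."""
    gx = np.diff(img.astype(float), axis=1)
    gy = np.diff(img.astype(float), axis=0)
    return np.concatenate([np.abs(gx).ravel(), np.abs(gy).ravel()])

class RBFClassifier:
    """Minimal radial-basis-function net: Gaussian hidden units centered
    at the training prototypes, linear output layer fit by least squares."""

    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def _phi(self, X):
        # Gaussian activations of each sample against every center.
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * self.sigma ** 2))

    def fit(self, X, y):
        self.centers = np.asarray(X, float)
        onehot = np.eye(int(max(y)) + 1)[np.asarray(y)]
        self.W, *_ = np.linalg.lstsq(self._phi(self.centers), onehot,
                                     rcond=None)
        return self

    def predict(self, X):
        return (self._phi(np.asarray(X, float)) @ self.W).argmax(axis=1)
```

Unlike back-propagation, only the output weights are trained here, which is what makes the RBF architecture fast to fit.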
Conclusions • Automated camera calibration and PTZ tracking. • Easy-to-use graphical user interface. • Efficient indexing and retrieval of video content. • Improved object classification technique. • Surveillance/security system for the professional market (retail stores) and the low-end market (residential).
References • C. Stauffer and W.E.L. Grimson, "Adaptive Background Mixture Models for Real-Time Tracking", Proc. CVPR, 1999. • F. Bremond and M. Thonnat, "Object Tracking and Scenario Recognition for Video Surveillance", Proc. IJCAI, 1997. • E. Stringa and C.S. Regazzoni, "Real-Time Video-Shot Detection for Scene Surveillance Applications", IEEE Trans. Image Processing, Jan. 2000. • R.P. Lippmann and K.A. Ng, "Comparative Study of the Practical Characteristics of Neural Networks and Pattern Classifiers", MIT Lincoln Laboratory Technical Report 894, 1991. • D.M. Lyons, T. Brodsky, E. Cohen-Solal and A. Elgammal, "Video Content Analysis for Surveillance Applications", Philips Digital Video Technologies Workshop, 2000.