Using a Spatio-Temporal Probabilistic Framework for Object Tracking Emphasis on Face Detection & Tracking By: Guy Koren-Blumstein Supervisor: Dr. Hayit Greenspan
Agenda • Previous research overview (PGMM) • Under-segmentation problem • Face tracking using PGMM • Modeling skin color in [L,a,b] color space – over-segmentation problem • Optical flow – overview • Approaches for using optical flow • Examples
Previous research • Complementary to M.Sc. thesis research conducted by A. Mayer under the supervision of Dr. H. Greenspan and Dr. J. Goldberger. • Research goal: building a probabilistic framework for spatio-temporal video representation. • Useful for: • Offline – automatic search in video databases • Online – characterization of events and alerting on those defined as 'suspicious'
Previous research Pipeline: • Source clip → parse clip into BOFs • Build the [L a b] feature space • Build a GMM model in [L a b] space • Label the BOF pixels • Blob extraction: connected components on [L,a,b,x,y,t], or learn a GMM model on [L,a,b,x,y,t] → labeled BOF • Leads to the under-segmentation problem…
Face Detection & Tracking • Most known techniques fall into two categories: • Search for skin color, then apply shape analysis to distinguish facial from non-facial objects. • Search for facial features regardless of pixel color (eyes, nose, mouth, chin, symmetry, etc.)
Apply framework to track faces • The framework can extract and track objects in an image sequence. • Applying shape analysis to each skin-colored blob labels it as 'face' or 'non-face'. • Faces are then tracked by virtue of the framework's built-in tracking capabilities.
Skin color in [L a b] • Skin color is modeled in the [a b] components only • Gives very good discriminability between 'skin' and 'non-skin' pixels (high true-negative rate) • Not optimal in terms of true positives (some skin-colored pixels are missed)
Over-segmentation of faces • Blobs are built in the [L a b] color space. • More than one blob may have skin-colored [a b] components. • Solution: unite all blobs whose [a b] statistics are close enough to the skin color model (an adaptive threshold can be used).
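The blob-uniting rule above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the skin-model mean, covariance, the Mahalanobis-distance test, and the blob dictionary layout are all assumptions for the example.

```python
import numpy as np

# Hypothetical skin-color model in the [a, b] chrominance plane.
# In practice the mean and covariance would be learned from labeled skin pixels.
SKIN_MEAN = np.array([18.0, 15.0])                    # assumed [a, b] mean
SKIN_COV_INV = np.linalg.inv(np.diag([25.0, 25.0]))   # assumed covariance

def is_skin_blob(blob_ab_mean, threshold=3.0):
    """Accept a blob if the Mahalanobis distance of its mean [a, b]
    chrominance to the skin model is below an (adaptive) threshold."""
    d = blob_ab_mean - SKIN_MEAN
    dist2 = float(d @ SKIN_COV_INV @ d)
    return dist2 < threshold ** 2

def unite_skin_blobs(blobs):
    """Collect all blobs whose [a, b] statistics are close to the skin model,
    so they can be merged into a single face candidate."""
    return [b for b in blobs if is_skin_blob(b["ab_mean"])]
```

The threshold could be adapted per clip (e.g. from the spread of already-accepted skin pixels), matching the "adaptive threshold" note in the slide.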
Under-segmentation • Faces moving in front of a skin-colored background are not extracted well. • Applying shape analysis to such an under-segmented map leads to mis-detection of faces.
Employing motion information • Motion information helps distinguish dynamic foreground objects from the static background • Two levels of motion information: • Binary – indicates for each pixel whether it is in motion or not; does not supply a motion vector. Feature space: [L a b x y t m], where m ∈ {0,1} • Optical flow – supplies a motion vector according to a given model. Feature space: [L a b x y t Vx Vy]
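The binary level can be sketched with simple frame differencing. This is only an illustrative stand-in for whatever change-detection step is actually used; the threshold value and the feature-stacking helper are assumptions.

```python
import numpy as np

def binary_motion_mask(frame_t, frame_prev, thresh=15):
    """Per-pixel binary motion indicator m in {0, 1}, computed by thresholding
    the absolute frame difference (a stand-in for real change detection)."""
    diff = np.abs(frame_t.astype(np.int32) - frame_prev.astype(np.int32))
    return (diff > thresh).astype(np.uint8)

def build_feature_space(L, a, b, xs, ys, t, m):
    """Stack per-pixel features into the 7-D [L a b x y t m] space."""
    return np.stack(
        [L, a, b, xs, ys, np.full_like(L, t, dtype=float), m], axis=-1
    )
```

Each pixel then contributes one 7-D feature vector, so a static skin-colored background (m = 0) and a moving face (m = 1) fall into different regions of the feature space.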
Optical Flow • Optical flow is the apparent motion of image brightness • If I(x,y,t) is the brightness, two main assumptions are made: • I(x,y,t) varies smoothly with the coordinates x, y over the greater part of the image • The brightness of every point of a moving object does not change in time
Optical Flow • If an object moves during time dt with displacement (dx, dy), then using a first-order Taylor series: I(x+dx, y+dy, t+dt) ≈ I(x,y,t) + (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt • According to assumption 2, I(x+dx, y+dy, t+dt) = I(x,y,t), so (∂I/∂x)dx + (∂I/∂y)dy + (∂I/∂t)dt = 0 • Dividing by dt gives the optical flow equation: Ix·Vx + Iy·Vy + It = 0, where Vx = dx/dt, Vy = dy/dt
Optical Flow – Block Matching • Does not use the optical flow equation directly. • Divides the image into blocks. • For every block in frame It, searches for the best-matching block in It-1. • Matching criteria: cross-correlation, squared difference, SAD, etc.
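A minimal SAD-based block matcher can be sketched as below. It is an exhaustive-search illustration, not the implementation used in the research; the block size, search radius, and the convention that the returned (Vx, Vy) is the displacement from the current block back into the previous frame are assumptions.

```python
import numpy as np

def block_match(prev, cur, block=8, search=4):
    """For each block in the current frame, exhaustively search a window in
    the previous frame for the displacement minimizing the sum of absolute
    differences (SAD). Returns per-block (Vx, Vy) displacement maps."""
    h, w = cur.shape
    vx = np.zeros((h // block, w // block))
    vy = np.zeros_like(vx)
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            ref = cur[y0:y0 + block, x0:x0 + block].astype(np.int32)
            best, best_d = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y0 + dy, x0 + dx
                    if yy < 0 or xx < 0 or yy + block > h or xx + block > w:
                        continue  # candidate window falls outside the frame
                    cand = prev[yy:yy + block, xx:xx + block].astype(np.int32)
                    sad = int(np.abs(ref - cand).sum())
                    if best is None or sad < best:
                        best, best_d = sad, (dx, dy)
            vx[by, bx], vy[by, bx] = best_d
    return vx, vy
```

The exhaustive search is O(block² · search²) per block; real systems typically use coarse-to-fine or diamond search patterns to cut this cost.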
Working with 8-D feature space Pipeline: parse clip into BOFs → build the [L a b] feature space → build a GMM model in [L a b] space → label the BOF pixels → then one of: connected components on [x,y,t,Vx,Vy], learning a GMM model on [x,y,t,Vx,Vy], or frame-by-frame tracking. • Connected component analysis: • Does not require initializing the order of the model • Prone to hard decisions • GMM model via EM: • Initialized by k-means; requires choosing K • Imposes an elliptic shape on the objects • Prone to soft decisions
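The GMM-via-EM branch can be illustrated with scikit-learn, whose `GaussianMixture` runs EM with k-means initialization. The library choice, the synthetic [x, y, t, Vx, Vy] samples, and the cluster parameters are all assumptions for the sketch, not the thesis code.

```python
import numpy as np
from sklearn.mixture import GaussianMixture  # EM, initialized via k-means

# Hypothetical per-pixel [x, y, t, Vx, Vy] features for two objects moving
# in opposite directions, drawn from two well-separated Gaussians.
rng = np.random.default_rng(0)
obj1 = rng.normal([20.0, 20.0, 0.0, 2.0, 0.0], 1.0, size=(200, 5))
obj2 = rng.normal([60.0, 60.0, 0.0, -2.0, 0.0], 1.0, size=(200, 5))
X = np.vstack([obj1, obj2])

# K (n_components) must be chosen up front; EM imposes soft, elliptic
# (Gaussian) clusters, matching the trade-offs listed on the slide.
gmm = GaussianMixture(n_components=2, covariance_type="full",
                      init_params="kmeans", random_state=0).fit(X)
labels = gmm.predict(X)
```

`predict` gives a hard label per pixel, but `predict_proba` exposes the soft per-component responsibilities, which is the "soft decision" property the slide contrasts against connected components.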
Frame by frame tracking • Widely used in the literature • Can handle variations in an object's velocity • Tracking can be improved by employing a Kalman filter to predict the object's location and velocity Per-frame loop: label pixels by predicted parameters → update blob parameters → label by updated parameters → split / merge blobs → kill old blobs / create new blobs → predict parameters for the next frame
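The predict/update steps of the loop can be sketched with a constant-velocity Kalman filter on a blob's centroid. The state layout, noise covariances, and the measurement model (position only) are assumptions for illustration; the slide only says a Kalman filter can be employed.

```python
import numpy as np

# Constant-velocity Kalman filter for one blob's centroid.
# State: [x, y, vx, vy]; measurement: centroid position [x, y].
dt = 1.0
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)   # state transition
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)    # only position is observed
Q = np.eye(4) * 0.01                          # assumed process noise
R = np.eye(2) * 1.0                           # assumed measurement noise

def predict(x, P):
    """Predict the blob state for the next frame (used to pre-label pixels
    before the measured blob parameters are available)."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Correct the prediction with the measured blob centroid z."""
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)            # Kalman gain
    x = x + K @ (z - H @ x)
    P = (np.eye(4) - K @ H) @ P
    return x, P
```

In the loop above, `predict` supplies the "label by predicted parameters" step, and `update` feeds the "update blob's params / label by updated parameters" steps once the frame has been segmented.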
Examples • Opposite directions: • Optical Flow, Connected component (Extracted Faces), GMM • Same direction, different velocity • Optical Flow, Connected component, GMM (Faces) • Different directions – complex background • Optical Flow, Connected component, GMM: K=5,K=3,Faces • Variable velocity • Optical Flow, Connected component, GMM, Frame By Frame
Real-world sequences • Face tracking: optical flow, no motion info, connected components, GMM, frame-by-frame • Car tracking: optical flow, no motion info, GMM • Flower garden: optical flow, no motion info, connected components, GMM
Summary • Applying a probabilistic framework to track faces in video clips • Working in the [L,a,b] color space to detect faces • Handling over-segmentation • Handling under-segmentation by employing optical flow information in 3 different ways: • Connected component analysis • Learning a GMM model • Frame-by-frame tracking
Further Research • Adaptive face color model • Variable length BOF (using MDL) • Using more complex motion model
Thank you for listening Questions ?