Online Detection of Unusual Events in Videos via Dynamic Sparse Coding
Outline • Unusual Event Detection • Video Representation • Dynamic Sparse Coding • Empirical Study • Conclusions
Unusual events: incidents that occur very rarely in the entire video
Unusual Event Detection • Easy to verify • Given a frame, it is fairly easy to decide whether an unusual event occurs • Hard to describe • Cannot enumerate all possible unusual events • Cannot model unusual events directly • Solution: model usual events instead, and flag anything different as unusual • Open question: are usual events easy to model?
Challenges • Unsupervised learning • The only input is the video itself • Online detection • In most cases, cannot afford multiple passes through the video • Concept drift • Usual events change over time • Truly unusual event vs. noisy usual event
Previous Work • Clustering-based method (CVPR 2004) • Finding spatially isolated clusters • Reconstruction (IJCV 2007) • Space-time Markov random field (CVPR 2009)
Outline • Unusual Event Detection • Video Representation • Dynamic Sparse Coding • Empirical Study • Conclusions
Video Features • Static features based on edges and object shapes • Image-level information • Dynamic features based on optical flow measurements • Motion information • Spatio-Temporal Interest Points • Obtained from local video patches • Shown to be useful in human action categorization
Spatio-Temporal Interest Point • Detection • Basic idea: generalize interest point detectors from the spatial domain to the spatio-temporal domain • Spatial (image): Laplacian, Hessian, Harris corner detector, etc. • Spatio-temporal (video): spatio-temporal corners, Laplacian on the spatial and temporal axes • Output: small video patches extracted around each interest point
Spatio-Temporal Interest Point (Cont.) • Description • Similar to detection, generalization of spatial method to spatio-temporal domain • Spatial: histogram of directional gradients – SIFT, HOG • Spatio-Temporal: gradients on x, y, and time directions
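The idea of describing a patch by its gradients along x, y, and time can be illustrated with a toy sketch. This is not the descriptor used in the talk: real descriptors histogram the gradients over sub-cells of the patch, whereas the hypothetical `gradient_descriptor` below only averages absolute forward differences per axis. Plain Python lists keep the sketch dependency-free.

```python
def gradient_descriptor(patch):
    """Mean absolute forward difference along x, y, and t.

    patch[t][y][x] holds gray-level intensities; each axis needs length >= 2.
    """
    n_t, n_y, n_x = len(patch), len(patch[0]), len(patch[0][0])
    gx = [abs(patch[t][y][x + 1] - patch[t][y][x])
          for t in range(n_t) for y in range(n_y) for x in range(n_x - 1)]
    gy = [abs(patch[t][y + 1][x] - patch[t][y][x])
          for t in range(n_t) for y in range(n_y - 1) for x in range(n_x)]
    gt = [abs(patch[t + 1][y][x] - patch[t][y][x])
          for t in range(n_t - 1) for y in range(n_y) for x in range(n_x)]
    return [sum(g) / len(g) for g in (gx, gy, gt)]


# A patch that brightens uniformly over time has only a temporal gradient.
brightening = [[[0, 0], [0, 0]], [[1, 1], [1, 1]]]
# A static patch with a left-to-right ramp has only an x gradient.
static_ramp = [[[0, 1], [0, 1]], [[0, 1], [0, 1]]]
```

The temporal component is what separates a moving or changing patch from a static one with the same spatial appearance.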
Outline • Unusual Event Detection • Video Representation • Dynamic Sparse Coding • Empirical Study • Conclusions
Motivation of the Approach • Sparse Reconstruction • Reconstruct an event from other events in the video • Usual events: appear many times, so a few other events suffice to reconstruct them (SPARSE) • Unusual events: appear rarely, so a large number of events is needed for reconstruction (DENSE) • A learned dictionary concisely represents the knowledge of usual events
The Proposed Approach • Define events in the video • A sliding window runs through the video • Spatio-temporal interest points within the same window define an event • Knowledge of usual events • Stored in the learned dictionary D • Abnormality of an event • Update dictionary D
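The event definition above can be sketched as follows. The function name, the `(x, y, t)` point layout, and the window and stride sizes are all illustrative assumptions; the slides do not give the actual window geometry.

```python
from itertools import product


def sliding_window_events(points, extent, window, stride):
    """Group interest points into events, one event per non-empty window.

    points: list of (x, y, t) interest-point locations
    extent: (width, height, n_frames) of the video
    window: (wx, wy, wt) window size; stride: (sx, sy, st) step size
    """
    # Window origins along each axis, keeping every window inside the video.
    axes = [range(0, max(e - w, 0) + 1, s)
            for e, w, s in zip(extent, window, stride)]
    events = {}
    for origin in product(*axes):
        inside = [p for p in points
                  if all(o <= c < o + w for c, o, w in zip(p, origin, window))]
        if inside:
            events[origin] = inside
    return events


points = [(5, 5, 2), (12, 6, 3), (40, 40, 18)]
events = sliding_window_events(points, extent=(64, 64, 20),
                               window=(32, 32, 10), stride=(16, 16, 5))
```

Because the windows overlap, one interest point can contribute to several events; here the point at `(40, 40, 18)` falls inside four windows.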
(Ab)Normality • Reconstruction error • Sparsity regularization • Smoothness regularization
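The formula on this slide did not survive extraction. As a hedged sketch, one objective with these three terms can be written as below, assuming a squared L2 reconstruction error, an L1 sparsity penalty, and a weighted squared difference between the coefficients of neighboring interest points. The names `lam_sparse`, `lam_smooth`, and the weight matrix `W` are hypothetical, not taken from the slides.

```python
def matvec(D, a):
    """Multiply dictionary D (list of rows) by coefficient vector a."""
    return [sum(d * c for d, c in zip(row, a)) for row in D]


def event_objective(X, A, D, W, lam_sparse, lam_smooth):
    """Sum of the three terms for one event.

    X: descriptors x_j; A: matching coefficient vectors alpha_j
    D: dictionary as a list of rows (one column per basis)
    W: W[j][k] is an assumed similarity weight between points j and k
    """
    # Reconstruction error: 0.5 * sum_j ||x_j - D alpha_j||^2
    recon = sum(0.5 * sum((x - r) ** 2 for x, r in zip(xj, matvec(D, aj)))
                for xj, aj in zip(X, A))
    # Sparsity regularization: sum_j ||alpha_j||_1
    sparsity = sum(abs(c) for aj in A for c in aj)
    # Smoothness regularization: sum_{j,k} W_jk * ||alpha_j - alpha_k||^2
    smooth = sum(W[j][k] * sum((p - q) ** 2 for p, q in zip(A[j], A[k]))
                 for j in range(len(A)) for k in range(len(A)))
    return recon + lam_sparse * sparsity + lam_smooth * smooth


D = [[1.0, 0.0], [0.0, 1.0]]
X = [[1.0, 0.0], [0.0, 2.0]]
A = [[1.0, 0.0], [0.0, 1.0]]
W = [[0.0, 0.5], [0.5, 0.0]]
value = event_objective(X, A, D, W, lam_sparse=0.1, lam_smooth=0.2)
```

Under this reading, a low minimum of the objective means the event is reconstructed well by a few bases (usual), and a high minimum marks it as a candidate unusual event.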
(Ab)Normality (Cont.) • Empirical demonstration
Optimization • Learning 𝜶 with Fixed D • Learning D with Fixed 𝜶 • Online Dictionary Update
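The slides name the alternating steps but not the solvers. The sketch below swaps in two standard choices, labeled as assumptions: iterative soft-thresholding (ISTA) to learn the coefficients with D fixed, and a single projected gradient step to update D with the coefficients fixed. For brevity it codes one descriptor at a time and omits the smoothness term; `sparse_code` and `dictionary_step` are hypothetical names.

```python
import math


def residual(x, D, a):
    """x - D a, with D stored as a list of rows."""
    return [xi - sum(d * c for d, c in zip(row, a)) for xi, row in zip(x, D)]


def sparse_code(x, D, lam, step, n_iter=100):
    """ISTA for min_a 0.5 * ||x - D a||^2 + lam * ||a||_1 with D fixed.

    step should not exceed 1 / (largest eigenvalue of D^T D).
    """
    n_atoms = len(D[0])
    a = [0.0] * n_atoms
    for _ in range(n_iter):
        r = residual(x, D, a)
        # Gradient of the quadratic term with respect to a is -D^T r.
        grad = [-sum(D[i][k] * r[i] for i in range(len(x)))
                for k in range(n_atoms)]
        moved = [c - step * g for c, g in zip(a, grad)]
        # Soft-thresholding is the proximal step for the L1 penalty.
        a = [math.copysign(max(abs(c) - step * lam, 0.0), c) for c in moved]
    return a


def dictionary_step(x, a, D, step):
    """One gradient step on D with a fixed, then rescale columns to norm <= 1."""
    r = residual(x, D, a)
    # Gradient of the quadratic term with respect to D is -r a^T.
    new_D = [[D[i][k] + step * r[i] * a[k] for k in range(len(a))]
             for i in range(len(x))]
    for k in range(len(a)):
        norm = math.sqrt(sum(row[k] ** 2 for row in new_D))
        if norm > 1.0:
            for row in new_D:
                row[k] /= norm
    return new_D


D = [[1.0, 0.0], [0.0, 1.0]]
x = [3.0, 0.2]
alpha = sparse_code(x, D, lam=0.5, step=1.0)
D_next = dictionary_step(x, alpha, D, step=0.1)
```

Calling `dictionary_step` once per newly observed event is one simple way to realize an online update, since it needs only the current event rather than another pass over the video.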
Outline • Unusual Event Detection • Video Representation • Dynamic Sparse Coding • Empirical Study • Conclusions
Video Data • Subway Surveillance Videos • Subway exit: 43 minutes, 65K frames • Usual events: people exiting subway • Unusual events: entering subway, loitering, etc. • Subway entrance: 1 hour 36 minutes, 144K frames • Usual events: people entering subway • Unusual events: exiting subway, no payment, loitering • YouTube Videos: 8 short videos • Different camera motions (rotation, zoom in/out, fast tracking, slow motion, etc.) • Different categories of targets (humans, vehicles, animals, etc.) • Wide variety of activities and environmental conditions (indoor, outdoor)
Learned Dictionary • Subway exit surveillance video • Subway entrance surveillance video
Quantitative Comparison • Subway exit surveillance video • Subway entrance surveillance video
Analysis Experiment • Online Update of the Learned Dictionary • Our approach: update the learned dictionary after observing each new event • Baseline for comparison: fixed dictionary
Detected Unusual Events • Subway Exit
Detected Unusual Events (Cont.) • Subway entrance
Detected Unusual Events (Cont.) • YouTube Videos • For each video, approximately the first 1/5 of the video data is used to learn the initial dictionary • Unusual event detection is carried out on the remaining video • Red boxes mark sliding windows in which an unusual event was detected
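The protocol above (learn from roughly the first fifth, then detect and keep updating on the rest) can be sketched as a generic loop. Everything here is illustrative: `learn_initial`, `score`, and `update` are hypothetical callables, and the toy stand-ins use a running mean of scalar "events" in place of a dictionary and a sparse-coding objective.

```python
def detect_online(events, learn_initial, score, update, threshold,
                  warmup_fraction=0.2):
    """Return indices of events flagged as unusual after the warm-up part."""
    n_warmup = max(1, int(len(events) * warmup_fraction))
    model = learn_initial(events[:n_warmup])
    flagged = []
    for index in range(n_warmup, len(events)):
        if score(events[index], model) > threshold:
            flagged.append(index)
        # Update after every observed event so the model follows drift.
        model = update(model, events[index])
    return flagged


# Toy stand-ins (assumptions): the "model" is a running mean of scalars.
events = [1.0, 1.1, 0.9, 1.0, 1.0, 5.0, 1.0, 1.1, 0.9, 1.0]
flagged = detect_online(
    events,
    learn_initial=lambda warmup: sum(warmup) / len(warmup),
    score=lambda event, mean: abs(event - mean),
    update=lambda mean, event: mean + 0.1 * (event - mean),
    threshold=2.0,
)
```

Replacing `update` with a function that returns the model unchanged gives the fixed-dictionary baseline from the analysis experiment.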
Outline • Unusual Event Detection • Video Representation • Dynamic Sparse Coding • Empirical Study • Conclusions
Conclusions • Fully unsupervised dynamic sparse coding approach for detecting unusual events in videos • The dictionary of bases is updated online as the algorithm observes more data, which lets it adapt to concept drift