Week 6 • Fatemeh Yazdiananari
Feature Extraction
• 75 validation videos were sent to the cluster for DTF feature extraction
• The extracted features are large:
  • Max size: 152 GB
  • Min size: 540 MB
  • Total size: 1.6 TB
• Programming
  • Because the files are large, reading relied on textscan, fgetl, and dlmread
  • Ran each feature file and saved all the information into a .mat file
  • Used the .mat files to run the rest of the code
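Note: files of this size cannot be loaded in one piece, so they have to be read one line at a time. The sketch below illustrates that streaming idea in Python rather than with the MATLAB functions named on the slide. The column layout (10 leading info values followed by the four descriptors) and the descriptor widths are assumptions, not values taken from the slides, and `iter_features` is a hypothetical helper name.

```python
from typing import Dict, Iterator, List, Optional, TextIO

# Assumed layout of one feature line: 10 info values, then the descriptors.
# These widths are illustrative placeholders, not values from the slides.
INFO_LEN = 10
DEFAULT_WIDTHS = {"tr": 30, "hog": 96, "hof": 108, "mbh": 192}


def iter_features(handle: TextIO,
                  widths: Optional[Dict[str, int]] = None
                  ) -> Iterator[Dict[str, List[float]]]:
    """Yield one parsed feature per line without loading the whole file."""
    widths = DEFAULT_WIDTHS if widths is None else widths
    expected = INFO_LEN + sum(widths.values())
    for line in handle:
        values = [float(token) for token in line.split()]
        if len(values) != expected:
            continue  # skip blank or truncated lines
        record = {"info": values[:INFO_LEN]}
        start = INFO_LEN
        for name, width in widths.items():
            record[name] = values[start:start + width]
            start += width
        yield record
```

Because the function is a generator, memory use stays bounded by one line at a time; the caller decides what to accumulate and save.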
Histogram Code
• For each video:
  • Obtain the 4 feature matrices
  • For each feature, find the closest codebook term and save its index
  • Run a histogram function on the saved indices
  • Save: the video name; the first 10 elements of the features; the indices for Tr, Hof, Hog, and Mbh; and the histograms of Tr, Hof, Hog, and Mbh
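Note: the codebook assignment and histogram steps can be sketched as follows. This is a minimal illustration, assuming that "closest" means smallest squared Euclidean distance (the slides do not state the metric); `nearest_codeword` and `bow_histogram` are hypothetical names.

```python
from typing import List, Sequence


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Return the index of the codebook entry closest to the descriptor.

    Assumption: closeness is measured by squared Euclidean distance.
    """
    best_index, best_dist = 0, float("inf")
    for index, word in enumerate(codebook):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def bow_histogram(descriptors: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[int]:
    """Count how many descriptors fall on each codebook entry."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    return counts


# Toy example with a 3-word codebook in 2 dimensions.
codebook = [[0, 0], [10, 10], [0, 10]]
descriptors = [[1, 1], [9, 9], [0, 1], [1, 9]]
print(bow_histogram(descriptors, codebook))  # [2, 1, 1]
```

The same two functions would be applied once per descriptor type (Tr, Hof, Hog, Mbh), each with its own codebook.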
Overview
• 15 validation videos
• Holistic
  • Histograms of the 15 videos
  • Binary SVM
• Sliding windows (10-second windows)
  • Histograms of the sliding windows
  • Binary SVM
Histograms
• 15 validation videos
• Normalized histograms
[Figures: HOF histogram for one video; HOG histogram for one video]
Histograms (cont.)
[Figures: MBH histogram for one video; TR histogram for one video]
Action Recognition Steps
• Binary SVM
  • Trained using the UCF101 train splits (split 1)
  • Obtained models
  • Tested using the 15 validation videos
SVM & Results
• Added two UCF101 features: class 1 and class 2
• Ran binary SVM
• Classification results (figure at right)
• Accurately predicted classes 1 and 2 of UCF101 (11.76% accuracy overall)
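Note: the 11.76% figure is consistent with 2 correct predictions out of 17 test samples. The short check below reproduces that arithmetic; it assumes the test set was the 15 validation videos plus the 2 added UCF101 samples, which the slides imply but do not state outright.

```python
def accuracy_percent(num_correct: int, num_total: int) -> float:
    """Accuracy as a percentage of correctly classified samples."""
    return 100.0 * num_correct / num_total


# Assumption: 15 validation videos + 2 UCF101 samples = 17 test samples,
# of which only the 2 UCF101 samples were classified correctly.
num_total = 15 + 2
num_correct = 2
print(f"{accuracy_percent(num_correct, num_total):.2f}%")  # 11.76%
```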
Sliding Window Steps
• Obtain the frame rate of the video (VideoReader)
• 10 seconds' worth of frames: multiply 10 s by the frame rate to get the number of frames per window (window size)
• Load the histograms of the 15 videos
• Using the frame numbers, read in the histograms for each window and save them into a structure
• After all the videos are divided into windows and saved, load them and:
  • Run normalization
  • Run binary SVM
    • Trained on UCF101 split 1
    • Tested on all the windows
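Note: the windowing steps can be sketched as follows. This is an illustration under stated assumptions: windows do not overlap, frame numbers are 0-based integers stored with each feature, and normalization is L1 (the slides say only "normalization"). `window_histograms` and `l1_normalize` are hypothetical names.

```python
from typing import List, Sequence


def window_histograms(frame_numbers: Sequence[int],
                      word_indices: Sequence[int],
                      fps: float,
                      num_words: int,
                      window_seconds: float = 10.0) -> List[List[int]]:
    """Build one codeword histogram per fixed-length window.

    Assumptions: non-overlapping windows and 0-based frame numbers.
    """
    # Window size in frames = window length in seconds * frame rate.
    window_size = max(1, int(round(window_seconds * fps)))
    if not frame_numbers:
        return []
    num_windows = max(frame_numbers) // window_size + 1
    histograms = [[0] * num_words for _ in range(num_windows)]
    for frame, word in zip(frame_numbers, word_indices):
        histograms[frame // window_size][word] += 1
    return histograms


def l1_normalize(histogram: Sequence[int]) -> List[float]:
    """Scale a histogram so its bins sum to 1 (assumed normalization)."""
    total = sum(histogram)
    if total == 0:
        return [0.0] * len(histogram)
    return [count / total for count in histogram]


# Toy example: 2 frames per second gives 20-frame windows.
hists = window_histograms([0, 5, 19, 20, 45], [0, 1, 1, 2, 0],
                          fps=2.0, num_words=3)
print(hists)  # [[1, 2, 0], [0, 0, 1], [1, 0, 0]]
```

Each normalized window histogram would then be passed to the trained SVM as a separate test sample.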
Sliding Window Histogram
• 15 validation videos
• Divided into 10-second sliding windows
• Normalized histograms
[Figures: Hof histogram for one sliding window; Hog histogram for one sliding window]
Sliding Window Histogram (cont.)
[Figures: Tr histogram for one sliding window; Mbh histogram for one sliding window]
SVM & Results
• Added two UCF101 features: class 1 and class 2
• Ran binary SVM
• Classification results
  • Classes 1 and 2 of UCF101 were classified accurately
  • The sliding windows were misclassified as class 38
Conclusion
• Action recognition using the existing methods was performed on temporally untrimmed videos.
  • The videos are long YouTube videos (THUMOS'14).
• The approach based on one bag-of-words histogram per video (state of the art) failed, as expected.
• The sliding window approach failed because:
  • The histograms extracted from the untrimmed clips may not include any particular action;
  • Such histograms happen to be similar to typical histograms of class 38 → misclassification as class 38.
• Not having the exact boundaries of the shots/clips has a nontrivial impact on the resulting histograms.
Next Steps
• Visually compare the class 38 histograms with the untrimmed histograms → visual similarities should be observed.
• Extract the histograms from UCF101 videos and classify them using the same code used for the untrimmed videos → sanity check.
• Manually identify the boundaries of the action in the untrimmed videos and select the windows accordingly → investigate the impact of unknown boundaries and partial windows.