Statistical analysis and evaluation of spatio-temporal and compressed domains moving object detection Presented by Rajesh Radhakrishnan Instructor: K.R. Rao
Scope of the Project • Three modules are required to carry out the comparative study of moving object detection across the two domains. • First is the manual annotation of hand locations using a GUI to obtain the coordinate location of the object in every frame. • Second is to obtain a time series of hand locations using the spatio-temporal algorithm. • Third is to obtain a time series of hand locations using the compressed domain algorithm.
Spatio-temporal moving object detection: • Given a test video sequence, background subtraction is performed to separate the foreground moving objects from the background model. Fig 1: Block diagram of the steps involved in spatio-temporal object detection
Background modeling is the process of obtaining the static image regions from a video sequence. • Frame differencing is one of the techniques used to perform background modeling (a minimal sketch follows). • Parametric and non-parametric estimation models are ways to improve candidate foreground object detection.
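A minimal sketch of the frame-differencing step, assuming grayscale frames are available as numpy arrays; the threshold value is only illustrative, not taken from the project.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Candidate foreground mask by frame differencing.

    prev_frame, curr_frame: grayscale frames as 2-D uint8 numpy arrays.
    threshold: illustrative intensity-difference threshold.
    """
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold  # True where the pixel is candidate foreground
```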
Parametric model • A simple Gaussian model is an example of a parametric model; its parameters are the mean and standard deviation. • Consider a block of the ground-truth image and estimate the mean and standard deviation of the block. • Ground truth is defined as the area under the actual moving object in each frame. • By Bayes rule, P(color|RGB) = P(RGB|color) * P(color) / P(RGB), • where P(color|RGB) is the conditional (posterior) probability and P(x) is the probability of x.
P(RGB|color) is estimated from the training set as a Gaussian probability with parameters (mean, std), where std is the standard deviation. • Assuming P(R), P(G) and P(B) are mutually independent, P(RGB|color) is given by P(RGB|color) = P(R|color) * P(G|color) * P(B|color), where P(R|color) = Gaussian_probability(R(i,j), mean, std) for 1 ≤ i ≤ N, 1 ≤ j ≤ M, for an image of size N x M.
An example to find P(RGB|color) • Consider the sub-block image shown in Fig 2 to estimate the mean and standard deviation of the green color object to be detected. This is a sub-block image of size 80x60; the mean and standard deviation of each color band are estimated separately (a sketch follows). Fig 2: Sub-block image used to estimate mean and standard deviation.
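A sketch of the parameter estimation and likelihood computation described above, assuming the ground-truth sub-block and the input image are RGB numpy arrays; the function names are illustrative.

```python
import numpy as np

def channel_stats(sub_block):
    """Per-channel mean and standard deviation of a ground-truth sub-block
    (e.g. the 80x60x3 green patch of Fig 2)."""
    pixels = sub_block.reshape(-1, 3).astype(np.float64)
    return pixels.mean(axis=0), pixels.std(axis=0)

def gaussian(x, mean, std):
    """Gaussian probability density evaluated element-wise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def likelihood_rgb_given_color(image, mean, std):
    """P(RGB|color) per pixel, assuming R, G and B are mutually independent:
    the product of the three per-channel Gaussian probabilities."""
    img = image.astype(np.float64)
    p = np.ones(img.shape[:2])
    for c in range(3):
        p *= gaussian(img[:, :, c], mean[c], std[c])
    return p
```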
How to estimate P(color) and P(RGB)? • P(color) can take any prior value; it acts as a color adjustment factor. Some example prior distributions are shown in Fig. 2.1, Fig. 2.2 and Fig. 2.3 (prior probability distributions, [4]). • P(RGB) is obtained from the law of total probability: P(RGB) = P(RGB|color) * P(color) + P(RGB|non_color) * P(non_color).
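A sketch of the posterior computation, assuming the per-pixel likelihoods from the previous step are available; the default prior value here is only an illustrative choice.

```python
def posterior_color_given_rgb(p_rgb_given_color, p_rgb_given_noncolor, p_color=0.5):
    """P(color|RGB) by Bayes rule, with P(RGB) from the law of total probability.

    p_rgb_given_color, p_rgb_given_noncolor: per-pixel likelihood arrays.
    p_color: prior probability of the color class (illustrative value).
    """
    p_noncolor = 1.0 - p_color
    p_rgb = p_rgb_given_color * p_color + p_rgb_given_noncolor * p_noncolor
    return (p_rgb_given_color * p_color) / p_rgb
```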
Non-parametric Estimation • A non-parametric model does not require any parameter estimates. • A histogram-based distribution is one example of a non-parametric model. • For basic color object detection (red, green and blue objects), an approximation-based method can be used.
Suppose a green object has to be detected in a color image of dimension N x M x 3. • Subtract the red and blue channels from twice the green channel to obtain a map of the green distribution in the image (see the sketch below): Green_dist(i,j) = 2*Image(i,j,2) - Image(i,j,1) - Image(i,j,3)
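A sketch of the green-distribution approximation, assuming an N x M x 3 numpy image with channel order R, G, B.

```python
import numpy as np

def green_distribution(image):
    """Approximate green-object likelihood map for an N x M x 3 image:
    Green_dist(i,j) = 2*G - R - B, clipped at zero."""
    img = image.astype(np.int16)
    green_dist = 2 * img[:, :, 1] - img[:, :, 0] - img[:, :, 2]
    return np.clip(green_dist, 0, None)
```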
Experimental Results - Spatio-temporal object detection • The spatio-temporal moving object detection algorithm was tested on two video sequences, one with a nearby object and one with a distant object. • Four sets of outputs were generated, with single and multiple detection boxes.
Object detection output of a frame with three detection boxes Fig 3: Close-up video, frame #17, 3 detection boxes. Fig 4: Distant video, frame #19, 3 detection boxes.
Object detection output of a frame with a single detection box Fig 5: Close-up video, frame #19, 1 detection box. Fig 6: Close-up video, frame #17, 1 detection box.
Compressed Domain Object Detection • Motion vector estimates are used to predict the moving object blocks. Algorithm: • Rearrange the frames from bit stream order to display order. • Maintain three pairs of arrays (past, present and future) for storing the motion vectors. • The process of placing the motion vectors into the correct arrays and reordering the frames was incorporated into the decoder.
Each video sequence is divided into one or more groups of pictures (GOPs); the display order of a GOP takes the form shown in Fig. 7, • where I, B and P denote intra-coded, bidirectionally predicted and predicted frames respectively. Fig 7: MPEG group of pictures – display order [11]. The encoder output in bit stream order takes the form I P B B P B B I B B P B B [11].
Converting from bit stream order to display order Fig. 8: Block diagram illustrating the conversion from bit stream order to display order [11].
If a P frame is encountered, place it in a temporary storage slot called future. • The I or P frame stays in future until another I or P frame arrives; on arrival of a new I or P frame, the frame already in future is moved into the display order. • All B frames are placed in the display order immediately (see the reordering sketch below). • The next step is to obtain the motion vectors from these frames.
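A sketch of the bit stream to display order rule described above, assuming frames arrive as (type, payload) pairs; this is an illustration of the rule, not the decoder code used in the project.

```python
def bitstream_to_display(frames):
    """Reorder frames from bit stream order to display order.

    frames: list of (frame_type, payload) pairs in bit stream order,
    where frame_type is 'I', 'P' or 'B'.
    Rule: B frames go straight to the display list; an I or P frame is
    held in `future` until the next I or P frame arrives and pushes it out.
    """
    display = []
    future = None
    for frame_type, payload in frames:
        if frame_type in ('I', 'P'):
            if future is not None:
                display.append(future)
            future = (frame_type, payload)
        else:  # B frame: output immediately
            display.append((frame_type, payload))
    if future is not None:
        display.append(future)  # flush the last held reference frame
    return display

# Bit stream order I P B B P B B I B B P B B reorders to
# display order I B B P B B P B B I B B P.
```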
Frame handling - Program operation Fig. 9: Flow chart of the program operation [9].
Each incoming frame is placed in the past, present or future array based on its type (i.e. P, I or B frame). • The array size equals the frame size in blocks: the frame size used in this project is 240x320, so for a motion vector block size of 8x8 the array size is 30x40 (see the sketch below). • Once the motion vectors are stored, the next step is to find the motion from frame to frame.
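A sketch of one possible layout for these arrays, assuming forward and backward vectors are kept for each slot; the dictionary structure is an assumption, not the project's actual data structure.

```python
import numpy as np

FRAME_H, FRAME_W = 240, 320                        # frame size used in the project
BLOCK = 8                                          # motion-vector block size
ROWS, COLS = FRAME_H // BLOCK, FRAME_W // BLOCK    # 30 x 40 array of blocks

# One (rows, cols, 2) array per direction, holding (dx, dy) per block,
# kept for each of the past, present and future frame slots.
motion_vectors = {
    slot: {direction: np.zeros((ROWS, COLS, 2))
           for direction in ('forward', 'backward')}
    for slot in ('past', 'present', 'future')
}
```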
Finding motion from frame to frame • The motion vectors stored in the present and past frame arrays are used to find the motion from frame to frame. Table 1: Constraints to be taken into account.
For example, consider a transition from a B frame to a P or B frame; both the forward and backward vectors have to be considered. • Let a B frame macroblock motion vector have the values (4,-6) for forward prediction and (-6,1) for backward prediction. • Let a P frame macroblock motion vector have the values (9,-7) for forward and (0,0) for backward, since a P frame has no backward prediction. • The total motion is the average of the forward and backward motion (a worked sketch follows). • Forward = (9,-7) - (4,-6) = (5,-1); backward = (0,0) - (-6,1) = (6,-1); total = ((5,-1) + (6,-1)) / 2 = (5.5,-1).
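A worked sketch of the per-block motion computation above; the function name and tuple representation are illustrative.

```python
def block_motion(present_fwd, present_bwd, past_fwd, past_bwd):
    """Motion of one macroblock between consecutive frames.

    Each argument is an (x, y) motion vector. The per-direction motion is
    the difference between the present and past vectors; the total motion
    is the average of the forward and backward components.
    """
    forward = (present_fwd[0] - past_fwd[0], present_fwd[1] - past_fwd[1])
    backward = (present_bwd[0] - past_bwd[0], present_bwd[1] - past_bwd[1])
    total = ((forward[0] + backward[0]) / 2.0, (forward[1] + backward[1]) / 2.0)
    return forward, backward, total

# Example from the slide: forward = (5,-1), backward = (6,-1), total = (5.5,-1).
print(block_motion((9, -7), (0, 0), (4, -6), (-6, 1)))
```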
The corresponding motion vector values are written into two files, one for the horizontal and one for the vertical component, and plotted using MATLAB. • The motion vector with the maximum magnitude is located and its corresponding spatial domain coordinates are noted (see the sketch below). • For example, if array location (16,24) gives the maximum motion vector magnitude, the corresponding spatial coordinates are marked as (128,192).
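A sketch of locating the peak motion vector and mapping it back to spatial coordinates, assuming a (rows, cols, 2) numpy motion field and an 8x8 block size.

```python
import numpy as np

def peak_motion_location(motion_field, block_size=8):
    """Find the array location with the largest motion magnitude and map it
    back to spatial pixel coordinates.

    motion_field: (rows, cols, 2) array of per-block motion (dx, dy).
    """
    magnitude = np.hypot(motion_field[:, :, 0], motion_field[:, :, 1])
    row, col = np.unravel_index(np.argmax(magnitude), magnitude.shape)
    # e.g. array location (16, 24) maps to spatial coordinates (128, 192)
    return (row * block_size, col * block_size)
```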
Motion vector plot Fig 10: Motion vector values (horizontal and vertical motion) from frame #15 of close detect1.
Close detect frame #15 Fig 11: Corresponding spatial domain frame of the moving object. In this particular example the object was correctly detected from the motion vector estimate.
Problems/constraints in object detection using motion vectors • If the test video contains moving background objects other than the object to be detected, the detection accuracy drops considerably. • For detection of a specific moving object, in our case a color object, even the user's hand moving the object can be falsely classified as a correctly detected object. • These constraints result in reduced detection accuracy.
GUI for annotating hand locations: Fig 12: GUI for annotating hand locations
Accuracy of Detection Fig 13: Evaluation of accuracy 1. Accuracy of detection = correctly classified object / correct answers. 2. The rectangular boxes, namely the correct answers and the questions answered, are 40x40 detection boxes. 3. The correct-answer detection boxes are obtained manually from the GUI described earlier, and the questions answered are the results returned by the object detection algorithm. 4. The correctly classified object is the region of overlap between the two detection boxes (see the sketch below).
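A sketch of the accuracy computation, assuming both boxes are given as (x, y, w, h) tuples; this follows the overlap definition above rather than any specific evaluation library.

```python
def detection_accuracy(gt_box, det_box):
    """Accuracy of detection = overlap area / ground-truth box area.

    Boxes are (x, y, w, h); in this project both are 40x40, with gt_box
    from the annotation GUI and det_box from the detection algorithm.
    """
    x1 = max(gt_box[0], det_box[0])
    y1 = max(gt_box[1], det_box[1])
    x2 = min(gt_box[0] + gt_box[2], det_box[0] + det_box[2])
    y2 = min(gt_box[1] + gt_box[3], det_box[1] + det_box[3])
    overlap = max(0, x2 - x1) * max(0, y2 - y1)
    return overlap / float(gt_box[2] * gt_box[3])
```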
Results: • The results were obtained from object detection on two different test video sequences and are tabulated below. Table 2: Performance of moving object detection
Further Improvements • The accuracy of moving object detection in the compressed domain can be further improved by considering parameters such as the DCT coefficients associated with each macroblock [6]. • Adding pre-processing and post-processing steps would enable the object detection algorithm to adapt to non-stationary background motion such as waving trees, image changes due to camera motion, and illumination changes.
References • [1] Z. Qiya and L. Zhicheng, "Moving object detection algorithm for H.264/AVC compressed video stream", ISECS International Colloquium on Computing, Communication, Control and Management, pp. 186-189, Sep. 2009. • [2] T. Yokoyama, T. Iwasaki, and T. Watanabe, "Motion vector based moving object detection and tracking in the MPEG compressed domain", Seventh International Workshop on Content-Based Multimedia Indexing, pp. 201-206, Aug. 2009. • [3] K. Kapotas and A. N. Skodras, "Moving object detection in the H.264 compressed domain", International Conference on Imaging Systems and Techniques, pp. 325-328, Aug. 2010. • [4] S.-C. Cheung and C. Kamath, "Robust techniques for background subtraction in urban traffic video", Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Jul. 2004. • [5] S. Y. Elhabian and K. M. El-Sayed, "Moving object detection in spatial domain using background removal techniques - state of the art", Recent Patents on Computer Science, Vol. 1, pp. 32-54, Apr. 2008.
[6] O. Sukmarg and K. R. Rao, "Fast object detection and segmentation in MPEG compressed domain", IEEE TENCON 2000 Proceedings, pp. 364-368, Mar. 2000. • [7] W. B. Thompson and T.-C. Pong, "Detecting moving objects", International Journal of Computer Vision, pp. 39-57, Jun. 1990. • [8] JM software - http://iphome.hhi.de/suehring/tml/ • [9] V. Y. Mariano, et al., "Performance evaluation of object detection algorithms", International Conference on Pattern Recognition, Vol. 3, pp. 965-969, Jun. 2002. • [10] J. C. Nascimento and J. S. Marques, "Performance evaluation of object detection algorithms for video surveillance", IEEE Transactions on Multimedia, Vol. 8, pp. 761-774, Dec. 2006. • [11] J. Gilvarry, "Calculation of motion using motion vectors extracted from an MPEG stream", Proc. ACM Multimedia 99, Boston, MA, pp. 3-50, Sep. 1999. • [12] S. Aramvith and M. T. Sun, "MPEG-1 and MPEG-2 video standards", Image and Video Processing Handbook, Vol. 2, pp. 320-342, Jun. 1999. • [13] FFMPEG - http://www.ffmpeg.org/download.html