A fuzzy video content representation for video summarization and content-based retrieval

Anastasios D. Doulamis, Nikolaos D. Doulamis, Stefanos D. Kollias 2000 • Presented by Mohammed S. Al-Logmani

Agenda • Introduction • Motivation/ Problem Statement • Video Sequence Analysis • Fuzzy Visual Content Representation • Video Summarization • Content-Based Retrieval • Experimental Results • Future Work • Conclusion

Introduction • The increasing amount of digital image & video data requires new technologies for efficient searching, indexing, content-based retrieval & management of multimedia databases. • Drawbacks of keyword annotations: • A large amount of effort is required to develop them. • Text alone cannot efficiently characterize the rich visual content

Introduction Cont. • Content-based algorithms • QBIC • VisualSEEk • Virage • Cannot easily be applied to video DBs. • Performing queries on every frame is inefficient & time-consuming • Video DBs are distributed, which imposes large storage & transmission requirements

Introduction Cont. • Content-based sampling algorithms • Extract small but meaningful info. (summarization) • Require a more meaningful representation of visual content than the traditional pixel-based one • Related Work: • A hidden Markov model for color image retrieval • An approach of image retrieval based on user sketches • A hierarchical color clustering method • Construction of a compact image map or image mosaics for video summarization • A pictorial summary of video sequences based on story units

Motivation/ Problem Statement • Increase the flexibility of content-based retrieval systems • Provide an interpretation closer to human perception • Result in a more robust description of visual content • Possible instabilities of the segmentation are reduced

Fuzzy representation of visual content • Video summarization • Performed by minimizing a cross-correlation criterion among the video frames using a genetic algorithm (GA). • The correlation is computed from several features extracted by a color/motion segmentation and arranged in a fuzzy feature vector formulation. • Content-based indexing & retrieval • The user provides queries (images or sketches), which are analyzed in the same way as video frames in the video summarization scheme. • A distance metric or similarity measure is then used to find the set of frames that best match the user's query.

Video Sequence Analysis • A color/motion segmentation algorithm is applied for visual content description • Multiresolution Recursive Shortest Spanning Tree (M-RSST) • Recursively applies the RSST to images of increasing resolution (a truncated image pyramid is created) • Produces the same results as RSST in less time. • Eliminates regions of small segments
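
The truncated image pyramid mentioned on this slide can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows and uses 2x2 block averaging to halve the resolution at each level; the function names `downsample_2x` and `truncated_pyramid` and the averaging filter are illustrative assumptions, not details taken from the paper. The segmentation step itself is not shown.

```python
def downsample_2x(image):
    """Halve the resolution by averaging each 2x2 block.

    `image` is a list of rows of floats with even height and width.
    The averaging filter is an assumption made for this sketch.
    """
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [
        [
            (image[2 * r][2 * c] + image[2 * r][2 * c + 1]
             + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def truncated_pyramid(image, top_level):
    """Return [level 0 (full resolution), ..., level top_level (coarsest)].

    The pyramid is "truncated" because it stops at top_level instead of
    continuing down to a single pixel.
    """
    levels = [image]
    for _ in range(top_level):
        levels.append(downsample_2x(levels[-1]))
    return levels


# A 16x16 test image; level 3 is 2x2, i.e. each pixel covers 8x8 originals.
image = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
pyramid = truncated_pyramid(image, top_level=3)
```

A multiresolution scheme would start segmenting at `pyramid[3]` and refine the result at `pyramid[2]`, `pyramid[1]`, and `pyramid[0]`.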

Video Sequence Analysis cont. • Factors affecting the segmentation efficiency • The initial image resolution level • Selected to be 3 (downsampling by 8x8 pixels) • The threshold used for terminating the algorithm • Based on the Euclidean distance of the color or motion intensities between two neighboring segments • The segmentation terminates if no segments are merged from one step to the next.
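
The threshold-based termination can be sketched as follows. This is a deliberately simplified, one-dimensional stand-in for the real algorithm: segments form a chain in which each segment neighbors only the next one, the closest pair is merged first, and merging stops once the smallest Euclidean color distance exceeds the threshold. The function name `merge_until_threshold`, the chain layout, and the size-weighted mean color are assumptions made for illustration.

```python
import math


def merge_until_threshold(segments, threshold):
    """Greedily merge neighboring segments whose colors are close.

    segments: list of (mean_color, size) tuples arranged in a 1-D chain
              (a simplification of 2-D image adjacency).
    Stops when no neighboring pair is within `threshold`.
    """
    segs = list(segments)
    while len(segs) > 1:
        # Euclidean color distance between each pair of chain neighbors.
        dists = [math.dist(segs[i][0], segs[i + 1][0])
                 for i in range(len(segs) - 1)]
        i = min(range(len(dists)), key=dists.__getitem__)
        if dists[i] > threshold:
            break  # nothing left to merge: terminate
        (color_a, size_a), (color_b, size_b) = segs[i], segs[i + 1]
        total = size_a + size_b
        merged_color = tuple((a * size_a + b * size_b) / total
                             for a, b in zip(color_a, color_b))
        segs[i:i + 2] = [(merged_color, total)]
    return segs


chain = [((0, 0, 0), 1), ((1, 0, 0), 1), ((10, 0, 0), 2)]
result = merge_until_threshold(chain, threshold=2.0)
```

Here the first two segments merge because their distance is 1, and the process then stops because the remaining pair is 9.5 apart.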

Video Sequence Analysis cont.

Fuzzy visual content representation • The segment size & location cannot be used directly • The number of segments is not constant across video frames • To overcome this problem, segments are classified into pre-determined classes of color/motion properties • To avoid classifying two similar segments into different classes, a degree of membership is allocated to each class • This results in a fuzzy classification formulation • A fuzzy multidimensional histogram is created

Fuzzy visual content representation Cont. • Example: a single property s is used for each segment. • s takes values in [0,1] • It is classified into Q classes using Q membership functions • Each function gives the degree of membership of s in the nth class

Fuzzy visual content representation Cont. • Assume a video frame consists of K segments • First, evaluate the degree of membership of the feature s_i of the ith segment, i = 1, 2, …, K • Then, find the degree of membership of the whole frame in the nth class through the fuzzy histogram
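
The two steps above can be sketched for a single property s in [0,1]. The sketch assumes triangular membership functions with evenly spaced centers and 50% overlap, and assumes the frame-level value for class n is the average of the segment memberships; both choices, along with the names `triangular_membership` and `fuzzy_histogram`, are illustrative assumptions rather than the exact definitions used in the paper.

```python
def triangular_membership(s, n, num_classes):
    """Degree of membership of s (in [0, 1]) in class n (0-based).

    Assumes num_classes >= 2 triangular functions centered at
    0, 1/(Q-1), ..., 1, each overlapping its neighbors by 50%.
    """
    width = 1.0 / (num_classes - 1)
    center = n * width
    return max(0.0, 1.0 - abs(s - center) / width)


def fuzzy_histogram(features, num_classes):
    """Frame-level membership in each class.

    features: the property value s_i of each of the K segments.
    Returns a fixed-length vector of Q values regardless of K,
    here taken as the mean membership over the segments.
    """
    k = len(features)
    return [
        sum(triangular_membership(s, n, num_classes) for s in features) / k
        for n in range(num_classes)
    ]


# A segment with s = 0.25 belongs half to class 0 and half to class 1.
single = fuzzy_histogram([0.25], num_classes=3)
frame = fuzzy_histogram([0.1, 0.4, 0.9], num_classes=3)
```

Because a value near a class boundary contributes to both neighboring classes, a small change in s produces a small change in the histogram, which is the robustness the fuzzy formulation aims for.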

Video summarization

Video summarization Cont. • Extraction of key-frames • Key-frames are extracted by minimizing a cross-correlation criterion, so that the selected frames are not similar to each other. • The genetic algorithm (GA) approach • The problem has similarities to the traveling salesman problem (TSP). • Initially, a population of m chromosomes is created. • Evaluate the performance of all chromosomes in population P(n) using a correlation measure. • Evaluate the chromosomes' quality using fitness functions. • Select appropriate parents so that a fitter chromosome produces more offspring • The GA terminates when the best chromosome's fitness remains constant for a large number of generations
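
The GA steps listed above can be sketched roughly as below. Each chromosome is a set of key-frame indices, the cost is the mean squared Pearson correlation between the feature vectors of the selected frames, and parents are chosen with rank-based weights so that fitter chromosomes produce more offspring. The squared Pearson criterion, the crossover and mutation operators, the elitism step, and all parameter values are assumptions chosen for illustration, not the operators reported in the paper.

```python
import itertools
import random
import statistics


def correlation(x, y):
    """Pearson correlation of two equal-length feature vectors."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = (sum((a - mx) ** 2 for a in x)
            * sum((b - my) ** 2 for b in y)) ** 0.5
    return cov / norm if norm else 0.0


def cost(chromosome, features):
    """Mean squared correlation over all selected pairs (lower is better)."""
    pairs = list(itertools.combinations(chromosome, 2))
    return sum(correlation(features[i], features[j]) ** 2
               for i, j in pairs) / len(pairs)


def select_key_frames(features, num_key, pop_size=30, patience=25, seed=0):
    """Return (best_chromosome, best_cost); requires 2 <= num_key <= frames."""
    rng = random.Random(seed)
    n = len(features)
    population = [tuple(sorted(rng.sample(range(n), num_key)))
                  for _ in range(pop_size)]
    best, best_cost, stale = None, float("inf"), 0
    # Stop once the best cost has not improved for `patience` generations.
    while stale < patience:
        ranked = sorted(population, key=lambda c: cost(c, features))
        top_cost = cost(ranked[0], features)
        if top_cost < best_cost - 1e-12:
            best, best_cost, stale = ranked[0], top_cost, 0
        else:
            stale += 1
        weights = [pop_size - rank for rank in range(pop_size)]
        children = [ranked[0]]  # keep the current best (elitism)
        while len(children) < pop_size:
            p1, p2 = rng.choices(ranked, weights=weights, k=2)
            # Crossover: draw the child's frames from the parents' union.
            child = rng.sample(sorted(set(p1) | set(p2)), num_key)
            if rng.random() < 0.2:  # mutation: swap in an unused frame
                unused = [i for i in range(n) if i not in child]
                if unused:
                    child[rng.randrange(num_key)] = rng.choice(unused)
            children.append(tuple(sorted(child)))
        population = children
    return best, best_cost


# Toy feature vectors standing in for per-frame fuzzy histograms.
features = [[((i * 7 + j * 3) % 5) / 4 for j in range(6)] for i in range(12)]
key_frames, key_cost = select_key_frames(features, num_key=3)
```

In practice the feature vector of each frame would be its fuzzy histogram, so the search favors sets of frames whose visual content is mutually uncorrelated.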

Video summarization Cont. • Examined about 170 shots; number of key-frames Kf = 6, Q = 3

Content-based retrieval • Apply the summarization scheme to discard all redundant temporal video information • The user can submit: • Images (query by example) • Sketches (query by sketch) • Analyze the query using M-RSST • Extract and classify the segments • Apply a distance or similarity measure
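
The final ranking step can be sketched as a weighted Euclidean distance between the query's fuzzy histogram and the histogram of each stored key-frame. The function names `weighted_distance` and `retrieve`, the dictionary layout of the database, and the specific distance form are assumptions for illustration; the slides only state that a distance or similarity measure with weights is applied.

```python
def weighted_distance(query_hist, frame_hist, weights):
    """Weighted Euclidean distance between two fuzzy histograms."""
    return sum(w * (q - f) ** 2
               for w, q, f in zip(weights, query_hist, frame_hist)) ** 0.5


def retrieve(query_hist, key_frame_hists, weights, top_k=3):
    """Return the ids of the top_k key-frames closest to the query.

    key_frame_hists: hypothetical mapping of frame id -> fuzzy histogram,
    built offline from the key-frames kept by the summarization step.
    """
    scored = sorted(
        (weighted_distance(query_hist, hist, weights), frame_id)
        for frame_id, hist in key_frame_hists.items()
    )
    return [frame_id for _, frame_id in scored[:top_k]]


database = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.0, 1.0, 0.0],
    "c": [0.9, 0.1, 0.0],
}
matches = retrieve([1.0, 0.0, 0.0], database, weights=[1.0, 1.0, 1.0], top_k=2)
```

Searching only the key-frame histograms, rather than every frame, is what makes the query cost independent of the full video length.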

Experimental results

Experimental results Cont.

Future Work • Increase the system accuracy by developing a fuzzy adaptive mechanism for estimating the distance weights.

Conclusion • Presented a fuzzy video content representation • Efficient for: • Video summarization • Content-based image indexing & retrieval • Experimental results indicate that this approach outperforms the other methods in both accuracy and computational efficiency
