MediaEval Workshop 2011

MediaEval Workshop 2011 Pisa, Italy 1-2 September 2011

Introduction • Genre Tagging task: given 1727 videos and 26 genre tags, decide which tag goes to which video. • Genres included art, health, literature, technology, sports, blogs, religion, travel, etc. • Videos came from the online video hosting site blip.tv.

Introduction cont. • Data given to us: videos, speech transcripts, metadata, and some user-defined tags. • The videos were divided into two sets. • Development set: 247 videos with ground truth provided, so that we could experiment with our algorithm. • Test set: 1727 videos without ground truth, for which we had to submit our results to the workshop.

TUD-MIR at MediaEval 2011 Genre Tagging Task: Query Expansion from a Limited Number of Labeled Videos

Main Idea • Information retrieval approach • Used only the textual data • Used the relatively small number of labeled videos in the development set to mine query expansion terms that are characteristic of each genre.

Approach • Combine all development-set videos of the same genre into one genre document. • Apply preprocessing such as stop word removal and stemming. • Weight and rank all the terms in the development set vocabulary. • Use the top 20 terms from each genre document as the expanded query terms.

Offer Weighting Formula • In the offer weight formula, r is the number of videos of a particular genre in which term t(i) appears, R is the total number of videos of that genre, N is the total number of videos in the collection, and n is the number of videos in the collection in which term t(i) appears.
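
The formula itself did not survive on this slide. As a sketch only, assuming the authors used the standard Robertson offer weight, OW = r * log(((r + 0.5)(N - n - R + r + 0.5)) / ((n - r + 0.5)(R - r + 0.5))), which uses the same four quantities, ranking the terms of one genre and keeping the top k could look like this. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def offer_weight(r, R, n, N):
    """Robertson offer weight with the usual 0.5 smoothing (assumed form)."""
    relevance_weight = math.log(
        ((r + 0.5) * (N - n - R + r + 0.5)) /
        ((n - r + 0.5) * (R - r + 0.5))
    )
    return r * relevance_weight

def top_expansion_terms(genre_videos, all_videos, k=20):
    """Rank the terms of one genre by offer weight and return the top k.

    genre_videos: list of term sets, one per video of the genre
                  (must be a subset of all_videos)
    all_videos:   list of term sets, one per video in the collection
    """
    R, N = len(genre_videos), len(all_videos)
    r_counts = Counter(t for terms in genre_videos for t in terms)
    n_counts = Counter(t for terms in all_videos for t in terms)
    scored = {t: offer_weight(r, R, n_counts[t], N) for t, r in r_counts.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]

# Toy collection: 2 sports videos out of 5 videos in total.
sports = [{"goal", "match", "video"}, {"goal", "team", "video"}]
others = [{"recipe", "video"}, {"prayer", "video"}, {"travel", "video"}]
print(top_expansion_terms(sports, sports + others, k=3))
```

In the toy data, "goal" occurs in both sports videos and nowhere else, so it ranks first, while "video" occurs in every video and gets a negative weight.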

Other Query Expansion Techniques • They also ran several other query expansions: pseudo-relevance feedback (PRF), WordNet, Google Sets and YouTube. • To expand queries via YouTube, they first downloaded the metadata (e.g. title, description and tags) of the top-50 ranked videos returned by YouTube for each genre label, except for the default category, and then sampled 20 expansion terms from that metadata using the Offer Weight explained earlier.

LIA @ MediaEval 2011: Compact Representation of Heterogeneous Descriptors for Video Genre Classification

Main Idea • Classification approach • A method that extracts a low-dimensional feature space from text, audio and video information. • Late fusion of the SVM results for each modality.

Data Collection • The training data set was collected from the web. • They first expanded the query terms using Latent Dirichlet Allocation (LDA) on the Gigaword corpus and then used the top 10 expanded terms for each genre. • They queried YouTube and Dailymotion for videos (3120 videos in total). • For textual data they used web pages returned by Google (1560 pages).

Features Extracted • Text: TF-IDF metric • Audio: MFCC acoustic frames computed every 10 ms over a 20 ms Hamming window. • Visual: color structure descriptor, dominant color descriptor, homogeneous texture descriptor, or edge histogram descriptor. Texture was the best feature according to them.

Classification • Each modality is fed separately to an SVM classifier, and the per-modality scores are combined by linear interpolation.
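
As an illustration of the fusion step only, the sketch below combines per-genre scores from three modalities by linear interpolation. The scores and the interpolation weights are made up for the example; the slides do not give the weights the authors used.

```python
def fuse_scores(modality_scores, weights):
    """Linearly interpolate per-genre scores from several modalities.

    modality_scores: {modality: {genre: score}}
    weights:         {modality: weight}, expected to sum to 1
    """
    genres = set().union(*(s.keys() for s in modality_scores.values()))
    return {
        g: sum(weights[m] * scores.get(g, 0.0)
               for m, scores in modality_scores.items())
        for g in genres
    }

# Hypothetical classifier scores for one video.
scores = {
    "text":   {"sports": 0.8, "travel": 0.1},
    "audio":  {"sports": 0.4, "travel": 0.5},
    "visual": {"sports": 0.6, "travel": 0.3},
}
weights = {"text": 0.5, "audio": 0.2, "visual": 0.3}  # hypothetical weights

fused = fuse_scores(scores, weights)
best = max(fused, key=fused.get)
print(best, fused)
```

Because the fusion happens on scores rather than on features, each modality can keep its own feature space and its own classifier.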

User Name Similarity • They also tried to exploit user names in the training set. They treat the relation between genres and user names as a knowledge base and use it to boost the genre scores. • That is, they increase the genre scores of a video if its user name exists in the knowledge base (development set).

TUB @ MediaEval 2011 Genre Tagging Task: Prediction using Bag-of-(visual)-Words Approaches
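
A minimal sketch of the boosting idea follows. The slides do not say how large the boost is or how it is applied, so the additive bonus and its value of 0.5 are assumptions, as are the function names and the sample data.

```python
from collections import defaultdict

def build_knowledge_base(dev_videos):
    """Map each uploader name to the genres seen for it in the development set."""
    kb = defaultdict(set)
    for user, genre in dev_videos:
        kb[user].add(genre)
    return kb

def boost_scores(scores, user, kb, boost=0.5):
    """Add a fixed bonus to genres already seen for this uploader.

    The additive bonus is an assumption, not the authors' documented rule.
    """
    known = kb.get(user, set())
    return {g: (s + boost if g in known else s) for g, s in scores.items()}

# Hypothetical development-set pairs of (user name, genre).
kb = build_knowledge_base([("chef_anna", "food"), ("chef_anna", "health")])
boosted = boost_scores({"food": 0.25, "sports": 0.5}, "chef_anna", kb)
print(boosted)
```

Videos from uploaders that never appear in the development set keep their original scores.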

Main Idea • Classification task • Bag-of-words approaches with different features derived from visual content and associated textual information

Features Extracted • Mainly textual features: • They translated foreign-language ASR transcripts into English using Google Translate. • Used a bag-of-words (TF-IDF) model for the textual features. • For visual features: • They used local SURF features extracted from each key frame of the video sequence.
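
To make the textual bag-of-words step concrete, here is a small sketch of one common TF-IDF variant (term frequency normalised by document length, idf = log(N / df)). The slides do not state which variant the authors used, and the sample transcripts are invented.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per tokenised document."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(t for doc in docs for t in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            t: (count / len(doc)) * math.log(n_docs / df[t])
            for t, count in tf.items()
        })
    return vectors

# Hypothetical transcripts, already lower-cased and tokenised.
docs = [
    "the team scored a late goal".split(),
    "the hotel was near the beach".split(),
    "the goal of the trip was the beach".split(),
]
vecs = tfidf_vectors(docs)
print(vecs[0])
```

A term that occurs in every document, such as "the" here, gets weight zero, while a term unique to one document gets the highest weight.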

Classification • Fusion: • Early fusion of visual and textual features, followed by SVM classification. • Classification: • Used multi-class SVM, Multinomial Naïve Bayes and Nearest Neighbor classifiers.
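
Early fusion means the feature vectors are joined before any classifier sees them. The sketch below shows this with plain concatenation and, for brevity, a nearest-neighbor classifier, which is only one of the three classifiers listed. The vectors, labels and function names are hypothetical.

```python
import math

def early_fusion(text_vec, visual_vec):
    """Concatenate the textual and visual feature vectors into one vector."""
    return list(text_vec) + list(visual_vec)

def nearest_neighbor(query, training):
    """Return the label of the closest training vector (Euclidean distance).

    training: list of (feature_vector, label) pairs
    """
    return min(training, key=lambda item: math.dist(query, item[0]))[1]

# Hypothetical fused training examples: 2 text dimensions + 2 visual dimensions.
train = [
    (early_fusion([1.0, 0.0], [0.2, 0.1]), "sports"),
    (early_fusion([0.0, 1.0], [0.9, 0.8]), "travel"),
]
query = early_fusion([0.9, 0.1], [0.3, 0.2])
print(nearest_neighbor(query, train))
```

In practice the two feature blocks usually need comparable scaling before concatenation, otherwise the larger block dominates the distance.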

SINAI-Genre tagging of videos based on information retrieval and semantic similarity using WordNet

Main Idea • IR approach • Query expansion using WordNet • A similarity measure other than cosine similarity

Approach • Query expansion: produce a bag of words for each genre term from its WordNet synonyms, hyponyms and domain terms. • An existing framework, the Terrier IR system, was used to measure the relatedness between the videos and the genre terms.

Second Approach • They also used a formula proposed by Lin, which is based on WordNet, to measure the semantic similarity between the nouns detected in each test video and the bag of words generated for each genre category. • They then kept only the matches whose score exceeded a threshold of 0.75. • Finally, the accumulated similarity score was divided by the number of words detected in the video, giving the final semantic similarity score.
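
The thresholding and normalisation steps can be sketched as follows. The Lin similarity itself is not implemented here; a hypothetical lookup table stands in for it. The sketch also assumes that every noun/term pair above the threshold contributes to the accumulated score, which the slides do not spell out.

```python
def genre_similarity(video_nouns, genre_bag, similarity, threshold=0.75):
    """Accumulate above-threshold similarities and normalise by noun count.

    similarity: callable (word_a, word_b) -> float, a stand-in for a
                WordNet-based Lin similarity
    """
    if not video_nouns:
        return 0.0
    total = 0.0
    for noun in video_nouns:
        for term in genre_bag:
            score = similarity(noun, term)
            if score > threshold:  # keep only matches above the threshold
                total += score
    return total / len(video_nouns)

# Hypothetical similarity values standing in for real Lin scores.
toy_scores = {
    ("football", "sport"): 0.9,
    ("goal", "sport"): 0.8,
    ("kitchen", "sport"): 0.2,
}

def toy_similarity(a, b):
    return toy_scores.get((a, b), 0.0)

score = genre_similarity(["football", "goal", "kitchen"], ["sport"], toy_similarity)
print(score)
```

Here two of the three nouns pass the threshold, so the result is (0.9 + 0.8) / 3; the third noun contributes nothing to the sum but still counts in the denominator.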

Results for all
