Semantic Video Classification Based on Subtitles and Domain Terminologies Polyxeni Katsiouli, Vassileios Tsetsos, Stathes Hadjiefthymiades Pervasive Computing Research Group Communication Networks Laboratory Department of Informatics and Telecommunications University of Athens – Greece KAMC ’07 @ Genoa, Italy polina@di.uoa.gr b.tsetsos@di.uoa.gr shadj@di.uoa.gr
Outline • The Polysema Platform • Introduction - Motivation • Video Categorization Method • Experimental Evaluation • Conclusions - Future Work
Polysema platform • Develops an end-to-end platform for iTV services • Semantics-related research focuses on the development of: • semantics extraction techniques for automatic annotation of audiovisual content, • a personalization framework for iTV services with SW technologies, • a tool with GUI for video annotation and MPEG-7 metadata creation http://polysema.di.uoa.gr
Introduction - Motivation • Multimedia databases are becoming popular • Most video classification methods are based on visual/audio signal processing • Text processing is more lightweight than visual/audio processing • High-level semantics are more closely related to human language than to visual features • Subtitles capture the semantics of the corresponding video
Step 1: Text Preprocessing • Subtitles are segmented into sentences • A Part of Speech Tagger is applied to each sentence • Stop words (e.g., “to”, “him”) are removed based on a stop words list
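The segmentation and stop-word steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the `STOP_WORDS` set is a tiny hypothetical list, the function names are made up for this sketch, and part-of-speech tagging is left out because it needs a trained tagger.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption, not the list used in the talk).
STOP_WORDS = {"a", "an", "and", "the", "to", "him", "of", "in", "is", "are", "it"}

def split_sentences(text):
    """Split subtitle text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def remove_stop_words(sentence):
    """Lower-case a sentence, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def preprocess(text):
    """Return one list of content tokens per sentence."""
    return [remove_stop_words(s) for s in split_sentences(text)]
```

For example, `preprocess("Lions hunt at night. He gave the food to him!")` yields one token list per sentence with "the", "to" and "him" filtered out.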
Step 2: Keyword extraction • We used the TextRank algorithm to extract keywords • TextRank • represents the text as a graph, • applies to the vertices a ranking algorithm based on Google’s PageRank, • sorts vertices in decreasing rank order, • extracts the top highly ranked vertices for further processing TextRank: Mihalcea, R., Tarau, P.: TextRank: Bringing Order into Texts, in Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), Barcelona, Spain, July 2004
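A rough sketch of the TextRank idea described above, under stated assumptions: words are linked when they co-occur within a small window, the graph is undirected and unweighted, and a fixed number of PageRank-style iterations is run. The function name and the parameters `window`, `damping`, `iterations` and `top_k` are illustrative choices, not values taken from the talk.

```python
from collections import defaultdict

def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words of a token list with a PageRank-style score and return the top ones."""
    # Build an undirected co-occurrence graph: each word is linked to the
    # words that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Iterate S(v) = (1 - d) + d * sum over neighbors u of S(u) / degree(u).
    scores = {w: 1.0 for w in neighbors}
    for _ in range(iterations):
        scores = {
            w: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[w])
            for w in neighbors
        }

    # Sort vertices by decreasing score (alphabetical order breaks ties).
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]
```

Words that co-occur with many distinct neighbors end up at the top of the list, which is the behavior the slide relies on when it keeps only the highest-ranked vertices.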
Step 3: Word Sense Disambiguation • Words have many possible meanings, called senses • A Word Sense Disambiguation (WSD) algorithm is applied to determine the correct sense of each word • WSD • is based on the lexical database WordNet, • is a variation of Lesk’s WSD algorithm WSD: Banerjee, S., Pedersen, T.: An Adapted Lesk Algorithm for Word Sense Disambiguation Using WordNet. In the Proceedings of the 3rd International Conference on Intelligent Text Processing and Computational Linguistics (CICLING-02) Mexico City, Mexico (2002)
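To make the gloss-overlap idea behind Lesk-style WSD concrete, here is a simplified sketch. It is not the adapted Lesk algorithm of Banerjee and Pedersen cited above, which also compares glosses of related synsets, and it does not query WordNet: the `SENSES` dictionary, its sense identifiers and its glosses are a hypothetical toy inventory.

```python
# Hypothetical toy sense inventory standing in for WordNet glosses (assumption).
SENSES = {
    "bank": {
        "bank#1": "sloping land beside a body of water such as a river",
        "bank#2": "financial institution that accepts deposits and lends money",
    }
}

def simplified_lesk(word, context_tokens, senses=SENSES):
    """Pick the sense whose gloss shares the most words with the context."""
    context = set(context_tokens) - {word}
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.get(word, {}).items():
        overlap = len(context & set(gloss.split()))
        if overlap > best_overlap:  # ties keep the first listed sense
            best_sense, best_overlap = sense_id, overlap
    return best_sense
```

With this toy inventory, a context mentioning "river" selects `bank#1`, while a context mentioning "deposits" and "money" selects `bank#2`; a word missing from the inventory returns `None`.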
Step 4: WordNet Domains Extraction (1/2) • WordNet Domains augment WordNet with domain labels • a taxonomy of ~200 domain labels • synsets have been annotated with at least one domain label WN Domains: http://wndomains.itc.it/wordnetdomains.html
Step 4: WordNet Domains Extraction (2/2) • For each video: • Extract the WordNet domains for each keyword’s sense • Calculate the occurrence frequency of each domain label • Sort domain labels in decreasing order of occurrence frequency
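The per-video counting and sorting above might look like the following sketch. The `SENSE_DOMAINS` mapping is hypothetical toy data standing in for the WordNet Domains annotation, and the sense identifiers are invented for illustration.

```python
from collections import Counter

# Hypothetical sense -> domain-label mapping (assumption, not real WordNet Domains data).
SENSE_DOMAINS = {
    "lion#1": ["animals"],
    "zebra#1": ["animals"],
    "telescope#1": ["astronomy", "science"],
}

def rank_domains(keyword_senses, sense_domains=SENSE_DOMAINS):
    """Count domain labels over the disambiguated keywords of one video
    and return (label, count) pairs sorted by decreasing count."""
    counts = Counter()
    for sense in keyword_senses:
        counts.update(sense_domains.get(sense, []))
    # Alphabetical order breaks ties so the output is deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

The resulting ordered list is the per-video domain ranking that the later steps compare against each category's ranking.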
Step 5: Correspondences between categories & WN domains • For each category label: • Look up in WordNet the senses related to it (include senses related through hypernym & hyponym relations) • Obtain the corresponding WordNet domains • Calculate the occurrence score for each domain • Sort domains in decreasing occurrence order Example:
Step 6: Category label assignment • Compare the ordered list of WN domains of each video with the ordered list of WN domains of each category Example: WN domains of a video science animals
Experimental Evaluation (1/2) • 36 subtitle files of documentaries Statistical information of files (average values): • Classify under the categories: geography, animals, history, war, technology, science, accidents, music, transportation, people, religious, politics, arts
Experimental Evaluation (2/2) • Classifiers: • Proposed method • Proposed method in which Step 6 has been replaced with Spearman’s footrule distance • J4.8 • decision tree classifier • supervised approach
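For the variant that uses Spearman's footrule distance, the comparison of two ranked domain lists can be sketched as follows. Two points are assumptions of this sketch rather than details from the talk: a label missing from one list is placed just past that list's end, and a video is assigned the category with the smallest distance. The function names and sample data are illustrative.

```python
def footrule_distance(ranking_a, ranking_b):
    """Spearman's footrule: sum of absolute rank differences over all labels.
    A label absent from a ranking is given position len(ranking) (assumption)."""
    pos_a = {label: i for i, label in enumerate(ranking_a)}
    pos_b = {label: i for i, label in enumerate(ranking_b)}
    labels = set(ranking_a) | set(ranking_b)
    return sum(
        abs(pos_a.get(label, len(ranking_a)) - pos_b.get(label, len(ranking_b)))
        for label in labels
    )

def assign_category(video_domains, category_domains):
    """Return the category whose domain ranking is closest to the video's ranking."""
    return min(
        sorted(category_domains),  # sorted names make tie-breaking deterministic
        key=lambda c: footrule_distance(video_domains, category_domains[c]),
    )
```

Identical rankings have distance 0, and the distance grows as shared labels drift apart in rank or as the two lists share fewer labels.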
Conclusions – Future Work • Conclusions • A novel approach that is based only on text and uses natural language processing techniques • No training phase is required (unsupervised approach) • Future Work • Application of the method on a per-video-segment basis • Definition of domain knowledge closer to movie classification • Performance comparison with other unsupervised approaches
Thank you! Questions??? http://p-comp.di.uoa.gr