Multimedia Data Mining Arvind Balasubramanian arvind@utdallas.edu Multimedia Lab The University of Texas at Dallas
Me and My Research • Research Interests: • Machine Learning • Data Mining • Statistical Analysis • Applications of the above in Multimedia • I am currently working on • Optimizing index and retrieval structures for human motion data • Analysis of tongue motion data to identify baseline characteristics of pronunciations (classification, speech therapy)
Data Mining and Multimedia • Uncovering hidden information from data. • Exploiting data to obtain new knowledge and interpret results. • Immense applications in Multimedia.
Data Mining Techniques • Classification • Prediction • Cluster Analysis & Class Discovery • Extraction and Retrieval • Statistical Analysis
Ideas for Projects Text Mining • Information Extraction from Domain-specific documents • involves extracting data from free text pieces and populating a database • Serves to organize required information available in unorganized form • Not enough in itself; combine with class discovery
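A minimal sketch of the extraction idea on this slide, assuming the free text follows one fixed sentence layout. The sample sentences, the regular expression, and the `films` table are illustrative assumptions, not part of the original talk; real domain documents would need more robust patterns or a trained extractor.

```python
import re
import sqlite3

# Sample free-text pieces; the fixed sentence layout is an assumption.
DOCUMENTS = [
    "The film Heat was directed by Michael Mann in 1995.",
    "The film Alien was directed by Ridley Scott in 1979.",
    "This sentence carries no structured information.",
]

# Pattern for the assumed layout: title, director, year.
PATTERN = re.compile(
    r"The film (?P<title>[A-Z]\w+) was directed by "
    r"(?P<director>[A-Z]\w+ [A-Z]\w+) in (?P<year>\d{4})\."
)

def extract_records(texts):
    """Return (title, director, year) tuples for every text that matches."""
    records = []
    for text in texts:
        match = PATTERN.search(text)
        if match:
            records.append((match["title"], match["director"], int(match["year"])))
    return records

def populate_database(records):
    """Store the extracted records in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE films (title TEXT, director TEXT, year INTEGER)")
    conn.executemany("INSERT INTO films VALUES (?, ?, ?)", records)
    conn.commit()
    return conn

records = extract_records(DOCUMENTS)
conn = populate_database(records)
# Once the data is in a table, the unorganized text becomes queryable.
rows = conn.execute("SELECT title, director FROM films ORDER BY year").fetchall()
print(rows)
```

Sentences that do not match the pattern are simply skipped, which is why the slide notes that extraction alone is not enough and should be combined with class discovery.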
Ideas for Projects Text Mining • New Class Discovery using Clustering techniques • identifying groups of keywords that do not fall into known categories • creating new categories and validating them • Possibly employ clustering algorithms with a suitable similarity measure or distance function
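One possible way to group keywords, sketched under assumptions: each keyword is described by the set of documents it appears in, similarity is the Jaccard coefficient, and keywords are joined by single-link clustering above a threshold. The data, the threshold of 0.5, and the function names are hypothetical; the slide does not prescribe a particular algorithm or similarity measure.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets (1.0 = identical, 0.0 = disjoint)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def cluster_keywords(occurrences, threshold=0.5):
    """Single-link clustering: merge keywords whose similarity >= threshold."""
    keywords = sorted(occurrences)
    parent = {k: k for k in keywords}  # union-find forest

    def find(k):
        # Follow parent links to the cluster representative.
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    for i, a in enumerate(keywords):
        for b in keywords[i + 1:]:
            if jaccard(occurrences[a], occurrences[b]) >= threshold:
                parent[find(b)] = find(a)

    clusters = {}
    for k in keywords:
        clusters.setdefault(find(k), []).append(k)
    return sorted(clusters.values())

# Hypothetical data: keyword -> ids of the documents in which it appears.
OCCURRENCES = {
    "codec": {1, 2, 3},
    "bitrate": {1, 2, 3, 4},
    "pixel": {5, 6},
    "resolution": {5, 6, 7},
    "tempo": {8},
}

clusters = cluster_keywords(OCCURRENCES)
print(clusters)
```

A cluster that matches no known category (here the singleton `tempo`) is a candidate new class, which would still need the validation step the slide mentions.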
Ideas for Projects Text Mining (contd.) • Query-based document retrieval system • employ one of several base models such as a probabilistic model or a vector space model • design an efficient indexing system • include relevance ranking feature • possibly make the system intelligent using machine learning techniques
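The vector space model with an inverted index and relevance ranking could be sketched as follows. This is an illustration under assumptions: whitespace tokenization, TF-IDF weights, and scores normalized by document length only (the query norm is the same for every document, so it does not affect the ranking). The class name and sample documents are hypothetical.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Very simple tokenizer (assumption): lowercase and split on whitespace."""
    return text.lower().split()

class VectorSpaceIndex:
    def __init__(self, documents):
        # Inverted index: term -> {doc_id: term frequency}
        self.postings = defaultdict(dict)
        for doc_id, text in enumerate(documents):
            for term, tf in Counter(tokenize(text)).items():
                self.postings[term][doc_id] = tf
        n = len(documents)
        self.idf = {t: math.log(n / len(p)) for t, p in self.postings.items()}
        # Precompute the TF-IDF vector length of every document.
        squared = [0.0] * n
        for term, posting in self.postings.items():
            for doc_id, tf in posting.items():
                squared[doc_id] += (tf * self.idf[term]) ** 2
        self.norms = [math.sqrt(s) for s in squared]

    def search(self, query, top_k=3):
        """Return up to top_k (doc_id, score) pairs, best match first."""
        scores = defaultdict(float)
        for term, q_tf in Counter(tokenize(query)).items():
            if term not in self.postings:
                continue
            q_weight = q_tf * self.idf[term]
            for doc_id, tf in self.postings[term].items():
                scores[doc_id] += q_weight * tf * self.idf[term]
        ranked = [(d, s / self.norms[d]) for d, s in scores.items() if s > 0]
        ranked.sort(key=lambda pair: (-pair[1], pair[0]))
        return ranked[:top_k]

DOCS = [
    "motion capture data retrieval",
    "audio signal retrieval and indexing",
    "human motion data",
    "video content analysis",
]
index = VectorSpaceIndex(DOCS)
for doc_id, score in index.search("motion data"):
    print(doc_id, round(score, 3), DOCS[doc_id])
```

Only the postings of the query terms are visited, so the cost of a query depends on how many documents contain those terms rather than on the size of the whole collection.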
Ideas for Projects Pattern Recognition in Multimedia Data • Scope • analyze and identify interrelationships within Multimedia data sets • Derive a composite score from several different sub-scores • Methods • classic techniques like Principal Component Analysis (PCA) and Factor Analysis (FA) • Statistical methods such as Regression analysis
Ideas for Projects Pattern Recognition in Multimedia Data (contd.) • Methods • Principal Component Analysis (PCA) • Dimensionality Reduction • Efficient Storage and Retrieval of Media data • Applications in any multi-dimensional media: Images (noise reduction), Video (content analysis), Audio (Voice Signature recognition)
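A small sketch of dimensionality reduction with PCA, assuming tiny 2-D data and using power iteration to find only the first principal component. It is written with the standard library so it runs anywhere; in practice a numerical package (for example the Matlab tools mentioned later in the talk) would compute all components at once. The data points are made up for illustration.

```python
import math

def mean_center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance matrix of mean-centered data."""
    n = len(centered)
    dims = len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]

def first_principal_component(cov, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue.

    Assumes the covariance matrix is non-zero and the start vector is not
    orthogonal to the leading eigenvector.
    """
    vec = [1.0] * len(cov)
    for _ in range(iterations):
        nxt = [sum(c * v for c, v in zip(row, vec)) for row in cov]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    return vec

def project(centered, component):
    """Reduce each row to a single coordinate along the component."""
    return [sum(x * c for x, c in zip(row, component)) for row in centered]

# Made-up 2-D points lying close to the line y = 2x.
DATA = [[2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]

centered = mean_center(DATA)
component = first_principal_component(covariance_matrix(centered))
scores = project(centered, component)  # 2-D data stored as one number per point
print(component, scores)
```

Storing one score per point instead of two coordinates is the storage saving the slide refers to; the discarded second component mostly carries the small deviations from the line.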
Ideas for Projects Pattern Recognition in Multimedia Data (contd.) • Methods • Factor Analysis (FA) • Minimize data redundancy • Reveal hidden patterns • combining attributes to form a single attribute by determining the importance and contribution of each attribute • Medical analysis, IQ tests, Personality tests, Software measurement, Multimedia content analysis, Motion Capture Data analysis.
Ideas for Projects Pattern Recognition in Multimedia Data (contd.) • Methods • Statistical Analysis • Correlation analysis to bring out interrelationships between data attributes • Regression analysis to analyze the ability of a set of data attributes to predict other data attributes
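Both analyses on this slide can be illustrated with two attributes. The sketch below computes the Pearson correlation coefficient and fits a simple least-squares line; the attribute names and values are hypothetical and chosen to be exactly linear so the result is easy to check by hand.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def linear_regression(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical attributes of five media clips.
duration_min = [1.0, 2.0, 3.0, 4.0, 5.0]
file_size_mb = [3.0, 5.0, 7.0, 9.0, 11.0]

r = pearson(duration_min, file_size_mb)
slope, intercept = linear_regression(duration_min, file_size_mb)
predicted = slope * 6.0 + intercept  # predicted size of a 6-minute clip
print(r, slope, intercept, predicted)
```

The correlation answers whether two attributes move together; the regression line goes one step further and lets one attribute predict the other.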
Ideas for Projects Prediction and Suggestion Systems • An intelligent shopping application or a movie review application that • learns from user ratings or purchases, and suggests other products or options • Examples: Netflix & Amazon • Many machine learning techniques could be employed: Bayesian reasoning and classification algorithms like AdaBoost
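As one concrete starting point, the sketch below suggests items using item-based collaborative filtering with cosine similarity. This is a swapped-in technique chosen for brevity, not the Bayesian or boosting approaches named on the slide, and it makes no claim about how Netflix or Amazon work. The users, items, and ratings are invented.

```python
import math

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "ann": {"A": 5, "B": 4, "C": 1},
    "bob": {"A": 4, "B": 5},
    "cat": {"C": 5, "D": 4},
    "dan": {"A": 5},
}

def item_vectors(ratings):
    """Invert the data: item -> {user: rating}."""
    vectors = {}
    for user, items in ratings.items():
        for item, rating in items.items():
            vectors.setdefault(item, {})[user] = rating
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def recommend(ratings, user, top_k=2):
    """Rank unseen items by similarity to the items the user already rated."""
    vectors = item_vectors(ratings)
    seen = ratings[user]
    scores = {}
    for candidate in vectors:
        if candidate in seen:
            continue
        score = sum(
            cosine(vectors[candidate], vectors[item]) * rating
            for item, rating in seen.items()
        )
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_k]

print(recommend(RATINGS, "dan"))
print(recommend(RATINGS, "bob"))
```

Items that share no raters with anything the user has rated get a score of zero and are never suggested, which is the usual cold-start limitation of this kind of approach.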
Ideas for Projects Prediction and Suggestion Systems • An intelligent media hosting application that • learns from user queries and requests, and accordingly suggests other media items • Suggested items would be retrieved by querying on the media features and metadata • Examples: Esnips music hosting • Many machine learning techniques could be employed: Bayesian reasoning and classification algorithms
Ideas for Projects • Ideas for alternative projects that apply machine learning, data mining and statistical analysis to multimedia are welcome. • Tools – Weka, MATLAB, statistical software packages (even Excel helps a lot!).
Thank You • Arvind Balasubramanian • arvind@utdallas.edu • Multimedia Lab • The University of Texas at Dallas