Mining the Web for Multimedia Question Answering
Yiming Yang
Language Technologies Institute & Computer Science Department
Carnegie Mellon University
Research Team
• Yiming Yang (PI)
• Jaime Carbonell (Co-PI)
• Yan Liu (PhD student)
• Bora Cenk Gazen (PhD student)
• Fan Li (PhD student)
• Shoubin Dong (Visiting researcher)
A picture is worth a thousand words
Example 1. Student demographics
• What is the racial distribution of the international students at university X?
• Is there an unexpected trend in admissions?
• Have other universities shown a similar trend recently?
Example 2. Macro-economics
• What is the overall trend in the balance of trade between the US and mainland China?
• Do similar cases exist in recent history, and if so, between which countries?
• Is there any unexpected trend in the trade of agricultural products in the Asia/Pacific region?
• …
Technical Components
• Templates defined for Q/A of certain types
• Web crawling tools
• Text categorization for relevant pages
• HTML/XML parsing for tables & images
• Information extraction techniques
• Named Entities, tables, graphs & text
• Templates connecting multi-media objects
• Q/A mapping and reasoning
• Collaborative filtering of frequent Q’s and A’s
• Case-based reasoning across text, tables & graphs
• Numerical analysis of graphical similarity (lines)
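As an illustration of the "HTML/XML parsing for tables" component, here is a minimal sketch using only Python's standard library. It is not the project's actual parser: the class name `TableExtractor` and the helper `extract_tables` are hypothetical, nested tables are not handled, and the sample values in the usage comment are made up.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> on a page as a list of rows of cell strings.

    Hypothetical helper for illustration; assumes tables are not nested.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per table
        self._row = None   # row currently being filled, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join fragments and collapse runs of whitespace.
            text = " ".join("".join(self._cell).split())
            self._row.append(text)
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def extract_tables(html):
    """Return all tables found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Usage (made-up values):
# extract_tables("<table><tr><th>Year</th></tr><tr><td>2001</td></tr></table>")
```

The extracted rows could then be passed to the information extraction and template-matching steps listed above; how those steps consume the rows is not specified in the slides.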
Hypotheses
• Answers combining text and pictures are beneficial for certain types of questions.
• Valuable information in multi-media forms can be gathered from the Web.
• Information extraction & comparative image analysis (graphs, curves) can be jointly used for cross-media data mining.
• Template-based reasoning can serve as a vehicle for multi-media Q/A.
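One way the comparative analysis of curves could be approached is sketched below. The slides do not say which similarity measure is used, so this example assumes Pearson correlation between two numeric series after resampling them to a common length. The function names `resample` and `trend_similarity` are hypothetical.

```python
from math import sqrt


def resample(series, n):
    """Linearly interpolate a series onto n evenly spaced points."""
    if len(series) < 2 or n < 2:
        raise ValueError("need at least two points")
    last = len(series) - 1
    out = []
    for i in range(n):
        pos = i * last / (n - 1)   # fractional index into the series
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(series[lo] * (1 - frac) + series[hi] * frac)
    return out


def trend_similarity(a, b, n=50):
    """Pearson correlation of two curves after resampling to n points.

    Values near 1 mean the curves rise and fall together, values near -1
    mean opposite trends. Returns 0.0 if either curve is flat.
    Assumption: correlation is a stand-in for the unspecified measure.
    """
    xa, xb = resample(a, n), resample(b, n)
    mean_a, mean_b = sum(xa) / n, sum(xb) / n
    da = [x - mean_a for x in xa]
    db = [x - mean_b for x in xb]
    norm = sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if norm == 0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / norm
```

Because correlation ignores scale and offset, two trade-balance curves in different currencies or magnitudes can still be compared by shape alone, which suits "similar trend" questions like those in the examples.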