CSE 574 Extracting, Managing & Personalizing Web Information
• Staffing
  • Dan Weld
  • Raphael Hoffmann
• Content
  • Intersection of AI, ML, DB & HCI
• Student Responsibilities
  • Reading, Reports, Discussion
  • Project (for those taking 3 credits)
Class Focus: Extracting, Managing & Personalizing Web Information
Why Information Extraction?
• Next-Generation Search
  • Citeseer, Google Scholar, MSRA Libra
  • Google product search
  • Flipdog
  • Zvents
  • Zoominfo
• Question Answering
Making Structured Content
• Information Extraction
  • E.g. Google Scholar
  • Cons: Noisy
• Communal Content Creation
  • E.g. Wikipedia
  • Cons: Bootstrapping & Incentives
Why Managing?
• Select
• Store, Index, Aggregate
• Search, Query, Explore
• Share, Collaborate, “Publish”
Example: Personalized Portals (cf. DBlife, Rexa, Dontcheva UIST-07)
Why Personalize? • Because we can.
Preliminary Schedule
• Information Extraction
  • Traditional Machine Learning Approaches
  • Self-Supervised Methods
  • Other Issues: Coreference & Ontology
• Collaborative Content Creation & UI Issues
  • Applying Constraints from Interaction to Learning
  • Decision Theoretic Interaction
  • Faceted Interfaces
• Community Information Management
  • Extraction over Evolving Text
  • Data Provenance
  • Mashups & Personalized Web
• Next-Generation Search
  • Inference, Textual Entailment, Machine Reading
  • Entity Search
For next time
• Read
  • Agichtein, Gravano. Snowball: Extracting Relations from Large Plain-Text Collections.
• Add yourself to the mailing list
• Look at the papers on the website wiki
  • Add new ones
  • Add a summary (different from a report)
  • Note if you wish to present one
• Think about a project (form a group?)