A Brief Overview of Watson
CSC 9010, Spring 2011. Paula Matuszek
Watson
• QA system developed by IBM and collaborators: a “massively parallel probabilistic evidence-based architecture”
• Hardware is a high-end IBM system, the IBM Power7 platform:
  • 90 Power7 servers in 10 racks
  • 4 processors/server, 8 cores/processor
• Robotic “arm” to press the buzzer
• Input is text only: no speech recognition, no visual input
Watson
• Software is built on top of UIMA, the Unstructured Information Management Architecture: a framework built by IBM and since open-sourced
• The information corpus was downloaded and indexed offline; no web access during the game
• Corpus was developed from a large variety of text sources:
  • baseline from Wikipedia, Project Gutenberg, newspaper articles, thesauri, etc.
  • extended via web retrieval: extract potentially relevant text “nuggets”, score them for informativeness, and merge the best into the corpus (sketched below)
• Primary corpus is unstructured text, not a semantically tagged or formal knowledge base
• About 2% of Jeopardy! answers can be looked up directly
• Also leverages semistructured and structured sources such as WordNet and YAGO
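A minimal sketch of that corpus-expansion loop, assuming a hypothetical `web_search` retrieval function and using simple word overlap with the seed document as the informativeness score (IBM's actual scoring was far more sophisticated):

```python
# Corpus expansion sketch: retrieve web text related to a seed document,
# score candidate "nuggets" for informativeness, and keep the best for
# merging into the corpus. web_search() is a hypothetical stand-in for a
# retrieval service; word overlap is a placeholder informativeness score.

def informativeness(nugget: str, seed_terms: set) -> float:
    """Fraction of the nugget's distinct words that also occur in the seed."""
    words = set(nugget.lower().split())
    return len(words & seed_terms) / len(words) if words else 0.0

def expand_corpus(seed_doc: str, web_search, top_k: int = 5) -> list:
    seed_terms = set(seed_doc.lower().split())
    candidates = web_search(seed_doc)              # hypothetical retrieval call
    best_first = sorted(candidates,
                        key=lambda n: informativeness(n, seed_terms),
                        reverse=True)
    return best_first[:top_k]                      # nuggets to merge into the corpus
```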
Components of DeepQA
• About 100 different techniques overall
• Content acquisition: corpus and sample games; done offline, before the game itself
• Preprocessing
• Natural language tools
• Retrieve possible answers
• Score answers
• Buzz in
• Game strategies
The overall flow of these stages is sketched below.
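The stage list above, expressed as one linear pipeline. Every function here is a placeholder stub standing in for a whole subsystem; only the control flow reflects the slide:

```python
# DeepQA's stages as a linear pipeline. Each stub stands in for a whole
# subsystem (the real system combines ~100 techniques and evaluates
# candidates in parallel); only the overall control flow is meaningful.

def preprocess(clue):                  # category + LAT detection (stub)
    return {"text": clue, "lat": "person"}

def retrieve_candidates(question):     # recall-oriented search (stub)
    return ["candidate A", "candidate B"]

def score(candidate, question):        # many scoring components (stub)
    return 0.5

def answer_clue(clue):
    question = preprocess(clue)
    candidates = retrieve_candidates(question)
    ranked = sorted(candidates, key=lambda c: score(c, question), reverse=True)
    best = ranked[0]
    return best, score(best, question)   # answer plus confidence
```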
Preprocessing
• Determine question category:
  • factoid
  • decomposable
  • puzzle
• Note: questions with audio/visual components and “special instruction” categories were excluded
• Determine lexical answer type (LAT):
  • film? person? place? novel? song?
  • about 2500 distinct LATs in a sample of 20,000 questions; about 12% of clues do not indicate a type
A toy LAT detector is sketched below.
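To make the idea concrete, a toy detector that scans the clue for a known type word. Watson actually derives the LAT from the parse of the clue; the keyword list here is invented:

```python
# Toy lexical-answer-type (LAT) detection: look for a known type word in
# the clue. Watson derives the LAT from the clue's parse; this keyword
# lookup is only illustrative. Some clues carry no LAT at all (about 12%
# in the sample cited above), so None is a legitimate outcome.

KNOWN_LATS = {"film", "person", "place", "novel", "song",
              "city", "country", "president", "author"}

def detect_lat(clue: str):
    for word in clue.lower().replace(",", " ").split():
        if word in KNOWN_LATS:
            return word
    return None   # no type indicated; typing must rely on other evidence

print(detect_lat("This novel by Herman Melville features Captain Ahab"))  # novel
```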
Initial Natural Language Processing
• Parse the question
• Semantically tag the components of the question
• Reference and coreference resolution
• Named entity recognition
• Relation detection
• Decomposition into subqueries
Two of these annotation layers are illustrated below.
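Parsing and named entity recognition shown with spaCy, an off-the-shelf NLP library; Watson used its own UIMA-based annotators, so this only illustrates the kind of output each step produces:

```python
# Question analysis illustrated with spaCy (not what Watson used; Watson
# ran its own UIMA annotators). Prints a dependency parse and the named
# entities for a sample clue.
# Setup: pip install spacy && python -m spacy download en_core_web_sm

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("This British naturalist wrote On the Origin of Species in 1859.")

for tok in doc:                      # dependency parse of the clue
    print(tok.text, tok.dep_, tok.head.text)

for ent in doc.ents:                 # named entity recognition
    print(ent.text, ent.label_)
```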
Retrieve Relevant Text
• The component most similar to a web search
• Focus is on recall
• Search engines include Indri and Lucene, plus SPARQL queries over structured sources
• For some “closed” LATs (all US states, presidents, etc.) the candidate list can be generated directly
• Otherwise extract the actual answer from the retrieved text:
  • document title?
  • person mentioned? etc.
• Several hundred hypotheses are typically generated
Candidate generation is sketched below.
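A sketch of that two-path candidate generation: an enumerable list for closed LATs, falling back to document search otherwise. The lists are truncated and `search_fn` is a hypothetical stand-in for an Indri/Lucene-style engine:

```python
# Candidate generation: a "closed" LAT lets us enumerate the whole answer
# space up front; otherwise candidates are extracted from search results
# (here, document titles). The lists are truncated and search_fn is a
# hypothetical stand-in for an Indri/Lucene-style engine.

CLOSED_LATS = {
    "us state": ["Alabama", "Alaska", "Arizona"],       # ...truncated
    "president": ["Washington", "Adams", "Jefferson"],  # ...truncated
}

def generate_candidates(lat, clue, search_fn):
    if lat in CLOSED_LATS:
        return list(CLOSED_LATS[lat])       # enumerate the closed answer space
    hits = search_fn(clue)                  # recall-oriented document search
    return [hit["title"] for hit in hits]   # e.g., treat titles as candidate answers
```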
Score Hypotheses
• Evaluate candidate answers:
  • soft filtering: fast, lightweight filters prune the candidates to about 100
  • evidence retrieval: additional structured or unstructured queries
• Score answers:
  • LOTS of algorithms! More than 50 components
  • range from simple word counts to complex spatial and temporal reasoning
  • creates an evidence profile: taxonomic, geospatial, temporal, source reliability, etc.
• Merge equivalent answers
• Determine ranking and confidence estimation (a toy confidence model is sketched below)
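One way to picture the final combination step: a logistic model that squashes an answer's evidence profile into a single confidence. The feature names and weights below are invented; in DeepQA the combination is learned from historical question/answer pairs:

```python
# Toy confidence estimation: combine an answer's evidence-profile scores
# with a logistic model. Feature names and weights are invented for
# illustration; real weights would be learned from past Q/A pairs.

import math

WEIGHTS = {"type_match": 2.1, "passage_support": 1.4,
           "temporal_consistency": 0.8, "source_reliability": 0.6}
BIAS = -2.0

def confidence(evidence: dict) -> float:
    z = BIAS + sum(w * evidence.get(f, 0.0) for f, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))    # squash to a 0..1 confidence

print(confidence({"type_match": 1.0, "passage_support": 0.7}))  # ~0.75
```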
And Game Strategies!
• Picking a category:
  • tries to find the Daily Double
  • goes for lower-cost clues first, to help learn the category
• When to buzz in?
  • normally buzzes in if more than 50% certain
  • will buzz in at lower confidence if that is the only way to win
  • will not buzz in when it can only lose by making a mistake
• How much to bet?
The buzz-in policy is sketched below.
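The buzz-in rules above as a small decision function. The 0.5 threshold comes straight from the slide; the two boolean game-state flags are simplified stand-ins for Watson's actual game-state reasoning:

```python
# The slide's buzz-in policy as a decision function. The 0.5 threshold is
# from the slide; the boolean game-state flags are simplifications of
# Watson's actual game-state model.

def should_buzz(confidence: float,
                must_buzz_to_win: bool,
                can_only_lose_by_erring: bool) -> bool:
    if can_only_lose_by_erring:
        return False           # safe lead: never risk a wrong answer
    if must_buzz_to_win:
        return True            # no other path to a win: accept lower confidence
    return confidence > 0.5    # normal case: buzz when more likely right than wrong

print(should_buzz(0.62, False, False))   # True
```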
References
• A clip of the end of the Jeopardy! game: http://www.youtube.com/watch?v=8W36OuMU0yE
• A good high-level overview: theswimmingsubmarine.blogspot.com/2011/02/how-ibms-deep-question-answering.html
• A detailed description: www.stanford.edu/class/cs124/AIMagzine-DeepQA.pdf
• Many clips, blogs, and links: ibmwatson.com