Watson Systems
By Team 7: Pallav Dhobley 09005012, Vihang Gosavi 09005016, Ashish Yadav 09005018
Motivation:
• Deep Blue's triumph over Kasparov in 1997.
• In search of a new challenge.
Jeopardy!
• 2004: the search ends!
• One of the most popular quiz shows in the U.S.A.
• Broad/open domain.
• Complex language.
• High speed.
• High precision.
• Accurate confidence.
Easier than playing chess? NO!!
• Chess:
  - Finite moves and states.
  - Mathematically well-defined search space.
  - Symbols have mathematical meaning.
• Natural language:
  - Implicit
  - Highly contextual
  - Ambiguous
  - Imprecise
Easy question:
(LN(1,25,46,798*π))^3 / 34,600.47 ≈ 0.155
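The "easy" question is easy for a machine precisely because it is pure arithmetic. A minimal Python check, assuming LN means the natural logarithm and that 1,25,46,798 uses Indian digit grouping (i.e. 12,546,798):

```python
import math

# 1,25,46,798 in Indian digit grouping is 12,546,798 (assumption about the slide's notation).
n = 12_546_798

# (ln(n * pi))^3 / 34,600.47
result = math.log(n * math.pi) ** 3 / 34_600.47

print(round(result, 3))  # prints 0.155
```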
Hard question:
• Where was our “Father of the Nation” born?
  - Contextual.
  - Imprecise.
• It is easy for us Indians to relate the term “Father of the Nation” to M.K. Gandhi.
• It is not so for computers.
• Hence the need to learn from as-is content.
What is Watson?
• Advanced search engine? ×
• Some fancy database retrieval system? ×
• Beginning of Skynet? ×
• Science behind an answer? √
Principles of DeepQA:
• Massive parallelism - Each hypothesis and interpretation is analyzed independently, in parallel, to generate candidate answers.
• Many experts - Facilitate the integration and contextual evaluation of a wide range of analytics produced by several algorithms running in parallel.
Principles of DeepQA (ctd.)
• Pervasive confidence estimation - No component commits to an answer; every component produces scores with associated confidences.
• Integrate shallow and deep knowledge - Use both shallow and deep semantics for better precision, e.g.
  - Shallow semantics: keyword matching
  - Deep semantics: logical relationships
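The first three principles can be illustrated with a toy sketch: several independent "experts" score every candidate in parallel, and none of them picks a winner. The two experts, their names and their scoring rules are hypothetical stand-ins chosen for illustration, not Watson's actual analytics.

```python
import re
from concurrent.futures import ThreadPoolExecutor

def words(text):
    """Lowercased word set of a string."""
    return set(re.findall(r"\w+", text.lower()))

def brevity_expert(question, candidate):
    """Hypothetical expert: shorter candidates score higher."""
    return 1.0 / (1 + len(words(candidate)))

def novelty_expert(question, candidate):
    """Hypothetical expert: share of candidate words not already in the question."""
    c = words(candidate)
    return len(c - words(question)) / len(c) if c else 0.0

EXPERTS = [brevity_expert, novelty_expert]

def score_all(question, candidates):
    """Score every (candidate, expert) pair independently, in parallel.

    No expert commits to an answer; each only contributes a score.
    """
    jobs = [(c, e) for c in candidates for e in EXPERTS]
    with ThreadPoolExecutor() as pool:
        scores = list(pool.map(lambda job: job[1](question, job[0]), jobs))
    table = {c: {} for c in candidates}
    for (candidate, expert), score in zip(jobs, scores):
        table[candidate][expert.__name__] = score
    return table
```

Calling `score_all(question, ["Long John Silver", "Treasure Island"])` returns one score per expert per candidate; combining those scores into a single confidence is left to a later stage.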
Step 0: Content Acquisition
• Identify and gather the content to be used as answer and supporting-evidence sources.
• Involves analyzing example questions from the problem space, which consists of question-answer pairs from previous games.
• Encyclopedias, dictionaries, wiki pages, etc. are used to make up the evidence sources.
• Extract, verify and merge the most informative nuggets as part of content acquisition.
Step 1: Question Analysis
The initial analysis that determines how the question will be processed by the rest of the system.
• Question classification, e.g. puzzle/math.
• Focus and Lexical Answer Type (LAT), e.g. “On this day”: the LAT is date/day.
• Relation detection, e.g. sea(India, x, west).
• Decomposition - divide and conquer.
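As a rough illustration of LAT detection, the sketch below takes the word following "this"/"these" as the LAT and otherwise falls back on the question's wh-word. Both rules and the `WH_WORD_TO_LAT` table are simplifying assumptions made for this example; the slides do not describe how Watson actually determines the LAT.

```python
import re

# Hypothetical fallback mapping from wh-words to a coarse answer type.
WH_WORD_TO_LAT = {"who": "person", "where": "place", "when": "date"}

def detect_lat(question):
    """Return a guessed Lexical Answer Type, or None if no rule fires."""
    # Rule 1: the noun right after "this"/"these" ("On this day ..." -> "day").
    match = re.search(r"\bth(?:is|ese)\s+(\w+)", question, flags=re.IGNORECASE)
    if match:
        return match.group(1).lower()
    # Rule 2: fall back on the leading wh-word, if any.
    first_word = re.findall(r"\w+", question.lower())[:1]
    if first_word and first_word[0] in WH_WORD_TO_LAT:
        return WH_WORD_TO_LAT[first_word[0]]
    return None
```

For instance, `detect_lat("On this day in 1492, Columbus set sail")` yields `"day"`, matching the slide's example.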
Step 2: Hypothesis Generation
1. Primary search:
  - Keyword-based search.
  - The top 250 results are considered for candidate answer (CA) generation.
  - Empirical statistic: 85% of the time, the answer is within the top 250 results.
2. CA generation: the above results are further processed to generate CAs.
3. Soft filtering:
  - Reduces the set of candidate answers using superficial analysis (machine learning).
  - Cuts the number of CAs to approximately 100.
  - Filtered answers are not fully discarded; they may be reconsidered at the final stage.
Step 2: Hypothesis Generation (ctd.)
4. Each CA plugged back into the question is considered a hypothesis, which the system has to prove correct with some threshold of confidence.
5. If the correct answer is not generated as a candidate at this stage, the system has no hope of answering the question at all.
• Noise tolerance: this step therefore favors recall over precision, and later stages must cope with the noise.
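The hypothesis-generation steps can be sketched in a few lines. This is a toy version under stated assumptions: the corpus is a list of dicts with hypothetical `title` and `text` fields, ranking is plain keyword overlap, document titles serve as candidate answers, and the soft filter uses whatever cheap scoring function the caller supplies. Only the cut-offs of 250 and 100 come from the slides.

```python
import re

def tokens(text):
    """Lowercased word set of a string."""
    return set(re.findall(r"\w+", text.lower()))

def primary_search(question, corpus, top_k=250):
    """Rank documents by keyword overlap with the question; keep the top_k."""
    q = tokens(question)
    ranked = sorted(corpus, key=lambda doc: len(q & tokens(doc["text"])), reverse=True)
    return ranked[:top_k]

def generate_candidates(documents):
    """Assumption for this sketch: each retrieved document's title is a candidate."""
    return [doc["title"] for doc in documents]

def soft_filter(candidates, cheap_score, keep=100):
    """Split candidates into survivors and a reserve list.

    The reserve list is returned rather than dropped, since filtered
    answers may be reconsidered at the final stage.
    """
    ranked = sorted(candidates, key=cheap_score, reverse=True)
    return ranked[:keep], ranked[keep:]
```

Keeping the reserve list around mirrors the "soft" part of soft filtering: the cut saves work on expensive scoring without making the decision irreversible.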
Step 3: Hypothesis & Evidence Scoring
• Evidence retrieval:
  - Further evidence is gathered to support the hypotheses formed in the previous step, e.g. passage search: gathering passages by adding the CA to the primary search query.
• Scoring:
  - Deep content analysis.
  - Determines the degree of certainty that the retrieved evidence supports the CA.
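A minimal sketch of passage search and scoring follows. It assumes a passage "supports" a candidate when it mentions the candidate and shares at least one word with the question, and it scores by the share of question words covered by the best such passage. This keyword-overlap score is a shallow stand-in for the deep content analysis the slide refers to.

```python
import re

def tokens(text):
    """Lowercased word set of a string."""
    return set(re.findall(r"\w+", text.lower()))

def passage_search(question, candidate, passages):
    """Passages that mention the candidate and share a word with the question."""
    q = tokens(question)
    return [p for p in passages if candidate.lower() in p.lower() and q & tokens(p)]

def evidence_score(question, candidate, passages):
    """Share of question words covered by the best supporting passage (0.0 if none)."""
    q = tokens(question)
    support = passage_search(question, candidate, passages)
    return max((len(q & tokens(p)) / len(q) for p in support), default=0.0)
```

A candidate with no supporting passage scores 0.0 rather than being removed, which leaves the final decision to the merging and ranking stage.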
Step 4: Final Merging and Ranking
• Merging:
  - Merge all hypotheses that yield the same answer.
  - Using an ensemble of matching, normalization and co-reference resolution algorithms, Watson identifies equivalent and related hypotheses.
• Ranking and confidence estimation:
  - The merged hypotheses are ranked, and their confidence estimated, by a model trained by running the system over a set of training questions with known answers.
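Merging can be sketched as normalizing answer strings and grouping the ones that collapse to the same key. The sketch assumes simple string normalization only (no co-reference resolution) and keeps the highest score per group; both choices are simplifications made for illustration, not a description of Watson's ensemble.

```python
import re
from collections import defaultdict

def normalize(answer):
    """Lowercase and drop punctuation so surface variants compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", answer.lower()))

def merge_and_rank(scored_answers):
    """Merge (answer, score) pairs with the same normalized form, then rank.

    Keeping the maximum score per group is an assumption of this sketch.
    """
    merged = defaultdict(float)
    for answer, score in scored_answers:
        key = normalize(answer)
        merged[key] = max(merged[key], score)
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)
```

With this, "Long-John Silver" and "long john silver" end up as one hypothesis instead of competing with each other.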
Example:
• Q: “Who is the antagonist of Stevenson's Treasure Island?”
• Step 1: Parse the question and generate a logical structure to describe it.
  - antagonist(X)
  - antagonist_of(X, Stevenson’s TI)
  - adj_possessive(Stevenson, TI)
Example (ctd.):
• Step 2: Generate semantic assumptions.
  - island(TI)
  - book(TI)
  - movie(TI)
  - author(Stevenson)
  - director(Stevenson)
• Step 3: Build different semantic queries based on phrases, keywords and semantic assumptions.
• Step 4: Generate hundreds of answers based on the passages, documents and facts returned from Step 3. Long-John Silver is likely to be one of them.
Example (ctd.):
• Step 5: Formulate evidence in support or refutation.
  (+VE) evidence:
  1. Long-John Silver is the main character in TI.
  2. The antagonist in Treasure Island is Long-John Silver.
  3. Treasure Island, by Stevenson, was a great book.
  (-VE) evidence:
  - Stevenson = Richard Lewis Stevenson
  - antagonist = Wolverine
Example (ctd.):
• Step 6:
  - Combine all the evidence and its scores.
  - Analyze the evidence to compute confidence and return the most confident answer: Long-John Silver in this case!
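The last step of the example, turning evidence scores into one confidence and answering only when it is high enough, might look like the sketch below. The logistic combination, the feature names, the weights, the bias and the 0.5 threshold are all hypothetical values chosen for illustration; in a real system the weights would be learned from training questions with known answers.

```python
import math

# Hypothetical weights and bias; a trained model would supply these.
WEIGHTS = {"passage_support": 2.0, "type_match": 1.5}
BIAS = -2.0

def confidence(features):
    """Logistic combination of evidence scores into a value in (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def best_answer(candidates, threshold=0.5):
    """Return (answer, confidence), or (None, confidence) if below the threshold."""
    answer, features = max(candidates.items(), key=lambda item: confidence(item[1]))
    conf = confidence(features)
    return (answer, conf) if conf >= threshold else (None, conf)
```

With made-up features such as `{"passage_support": 0.9, "type_match": 1.0}` for Long John Silver and `{"passage_support": 0.0, "type_match": 1.0}` for Wolverine, only the former clears the threshold, so that is the answer returned.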
Watson’s Brain (Software):
• Languages used: Java, C++, Prolog.
• Apache Hadoop framework for distributed computing.
• Apache UIMA framework:
  - Helps meet DeepQA’s demand for massive parallelism.
  - Facilitates rapid component integration, testing and evaluation.
• SUSE Linux Enterprise Server 11.
Watson’s Brain (Hardware):
• One Jeopardy! question takes 2 hours on a normal desktop computer!
• The real task: confidence determination before buzzing.
• Hence the pressing need for faster hardware support.
Watson’s Brain (ctd.):
• Ninety POWER-750 servers in total.
• 2,880 POWER7 processor cores in total.
• 16 terabytes of RAM in total.
• Each POWER-750 server uses 3.5 GHz POWER7 eight-core processors, with 4 threads per core.
• About the size of 8 refrigerators in total.
• Can process data at up to 500 GB/s.
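A quick back-of-the-envelope check of what these totals imply per server. It uses only the figures on the slide, plus the assumption that 1 TB = 1024 GB and that cores and RAM are spread evenly across servers.

```python
servers = 90
total_cores = 2880
threads_per_core = 4
total_ram_tb = 16

cores_per_server = total_cores // servers            # 32 cores, i.e. four eight-core processors
total_threads = total_cores * threads_per_core       # 11520 hardware threads
ram_per_server_gb = total_ram_tb * 1024 / servers    # about 182 GB, assuming an even split

print(cores_per_server, total_threads, round(ram_per_server_gb))
```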
The Final Blow!
• 3 rounds of Jeopardy! between Watson, Rutter and Jennings.
• Watson comprehensively defeats its competitors with a net score of $77,147.
• Jennings managed $24,000.
• Rutter ended third with $21,600.
The Final Blow! (ctd.)
“I for one welcome our new computer overlords” - Jennings
Conclusion:
• High-performance analytics
• Non-cognitive
• Smart learner
• Not invincible
Watson & Suits
• Tech support
• Knowledge management
• Business intelligence
• Improved information sharing
Watson for Society - Health Care
• Inputs: symptoms, patient records, tests, medications, notes/hypotheses, texts and journals.
• Diagnosis models.
• Goal: find the appropriate “disease” for the given “symptoms” and “records”.