Watson: The Jeopardy! Machine Robin Sturm
Who is Watson? This is Watson! (His “face” at least)
What is Watson? • A computer developed by IBM to play Jeopardy! • Watson was IBM’s next “Grand Challenge” after Deep Blue defeated Garry Kasparov in a chess match in 1997.
Watson’s Challenge • Winning Jeopardy was popularly associated with intelligence. • Competing in Jeopardy required mastery of natural language processing, something computers had struggled with for a while. • Jeopardy would demand far more of a computer than simple arithmetic. • All the same, mastering question answering would pay off well beyond Jeopardy.
Why Jeopardy? • Jeopardy would be the platform for developing question-answering abilities. • Jeopardy questions are unlike most others: their linguistic nuances, puns, and allusions make them extremely complex. • The game also incorporates timing and strategy.
What is Jeopardy? • A trivia-based game show. • 2 rounds, each with 30 questions divided into 6 categories. • 1 Daily Double in round 1, 2 in round 2. • Open-domain questions (any topic). • Contestants answer by buzzing in. • Players gain money for correct answers and lose money for incorrect ones. • Final Jeopardy: one last question on which each contestant wagers.
AI required for Jeopardy • Natural Language Processing: determining what a question is asking for, what kind of answer is required, what are the key words involved. • Information Retrieval: Looking through all of the stored data and finding potential answers.
Watson Development • Piquant: an early question-answering system; it performed well below the threshold needed for Jeopardy. • It later became Blue J as it was developed to be better equipped to play Jeopardy. • Watson was continuously updated and improved to answer specific types of questions better.
How would you answer this question? On Sept. 1, 1715 Louis XIV died in this city, site of a fabulous palace he built. • Is it easier to answer the question if you are choosing from a list of options? • Paris, Athens, London, Versailles, Berlin, Milan, Vienna.
How to Solve the Problem
Humans:
• Breaking down the question: typically a brief process.
• Intuition: you know it, or you don’t.
• Searching through memory: think back to books, magazines, classes…
• Analyzing confidence: Is it a wild guess? Will being wrong cost you dearly?
Computer:
• Breaking down the question: a complicated task with several possibilities, and no intuition.
• Searching through documents: parallel computing produces several candidate answers.
• Analyzing confidence: which of the possible answers is the best?
Question Analysis • Parse the question: break it down into parts of speech and find the key words. • Understand what the question is asking for, then send out queries: several possible interpretations of the question, each aiming to find its own answer. • Question classification: identify the question type. • Focus and LAT (lexical answer type) detection: find the blank to fill in, often marked by “this.” • Relation detection. • Decomposition: two-part questions can be better solved in parts.
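As a toy illustration of focus/LAT detection (a hypothetical sketch, not IBM’s actual parser), many Jeopardy clues mark the blank with “this” followed by a noun naming the answer type:

```python
import re

def detect_lat(clue):
    # Simplified, hypothetical LAT detector: in many clues the focus is
    # marked by "this <noun>", and that noun is the lexical answer type
    # (e.g. "this city" -> the answer should be a city).
    match = re.search(r"\bthis\s+([a-z]+)", clue.lower())
    return match.group(1) if match else None

clue = ("On Sept. 1, 1715 Louis XIV died in this city, "
        "site of a fabulous palace he built.")
print(detect_lat(clue))  # -> city
```

A real parser relies on full syntactic analysis; this regex only covers the common “this X” pattern.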
Using Predictive Annotation • Mark up question with category types. • Works well for who, what, where, when questions.
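A minimal sketch of the idea (the labels and rules here are invented for illustration, not Piquant’s actual annotation set): spans in the corpus are tagged at indexing time with the kind of question they could answer, so a “where” question retrieves only place-tagged spans:

```python
import re

# Hypothetical annotation rules: each pattern tags a span with the
# question type it could answer ("where" -> PLACE$, "when" -> DATE$).
TYPE_RULES = [
    (re.compile(r"\b(?:in|at|near)\s+([A-Z][a-z]+)"), "PLACE$"),
    (re.compile(r"\b1[0-9]{3}\b"), "DATE$"),
]

def annotate(sentence):
    # Tag every matching span so the index can later be searched by type.
    tags = []
    for pattern, label in TYPE_RULES:
        tags.extend((label, m.group(0)) for m in pattern.finditer(sentence))
    return tags

print(annotate("Louis XIV died at Versailles in 1715."))
```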
Hypothesis Generation • Each interpretation of the question is sent off to find possible answers. • Aims to propose many possibilities. • The more possibilities, the better. • Different algorithms find different types of answers. • Several search techniques • Document search (key word) • Passage search • Knowledge base search (database) • Candidate answers • 85% of the time the correct answer is in the top 250 original candidates.
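The search fan-out above can be sketched roughly as follows (all data and helper names are hypothetical; Watson’s actual search components were far richer): several independent searches each contribute candidates, and the union is kept, since recall matters more than precision at this stage:

```python
def title_search(keywords, titles):
    # Document search: any title containing a keyword is a candidate.
    return {t for t in titles if any(k in t.lower() for k in keywords)}

def passage_search(keywords, passages):
    # Passage search: keep entities from passages matching >= 2 keywords.
    hits = set()
    for text, entities in passages:
        if sum(k in text.lower() for k in keywords) >= 2:
            hits.update(entities)
    return hits

titles = ["Palace of Versailles", "Louis XIV", "Paris"]
passages = [("Louis XIV built a fabulous palace at Versailles "
             "and died there in 1715.", ["Versailles"])]
keywords = ["louis", "palace", "died"]

# Union the candidates from every search strategy; pruning comes later.
candidates = title_search(keywords, titles) | passage_search(keywords, passages)
print(candidates)
```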
Hypothesis/Evidence Scoring • Eliminates answers that are obviously wrong • Finds passages that may support a certain answer. • Positive and negative scoring based on content and context of passages • Algorithms in parallel score all of the possible answers. • Several scoring algorithms are used • Counting number of IDF-weighted terms in common • Measuring length of longest similar subsequences • Measuring alignment of logical forms (grammatical relationships) • Geospatial and temporal reasoning.
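The first scorer above, counting IDF-weighted terms in common, can be sketched like this (toy corpus and numbers; the real system ran many such scorers in parallel):

```python
import math

def idf_weights(corpus):
    # Inverse document frequency over a toy corpus: terms that appear
    # in fewer documents get a higher weight.
    n = len(corpus)
    vocab = {t for doc in corpus for t in doc.split()}
    return {t: math.log(n / sum(t in doc.split() for doc in corpus))
            for t in vocab}

def overlap_score(question, passage, idf):
    # Sum the IDF weights of the terms the clue and a supporting
    # passage share; rare shared terms count for more.
    shared = set(question.split()) & set(passage.split())
    return sum(idf.get(t, 0.0) for t in shared)

corpus = [
    "louis xiv built the palace at versailles",
    "paris is the capital of france",
    "the palace of versailles lies near paris",
]
idf = idf_weights(corpus)
clue = "louis xiv died in this city site of a fabulous palace he built"
print(overlap_score(clue, corpus[0], idf) > overlap_score(clue, corpus[1], idf))
```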
Final Merging and Ranking • Incorporates experiences from prior questions. • Able to weight and apply the algorithms it ran to determine the significance of the evidence. • Calculates confidence in the possible answers it came up with. • A certain level of confidence is necessary to answer the question (changes based on the game)
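A rough sketch of the merge-and-rank step (the weights and scores are hand-set here; Watson learned its weights from prior games): each candidate’s evidence scores are combined into a single confidence, and the top candidate is answered only if it clears a threshold:

```python
import math

# Hypothetical scorer weights; Watson learned these from training data.
WEIGHTS = {"term_overlap": 1.5, "temporal": 2.0, "type_match": 1.0}

def confidence(scores):
    # Weighted sum of evidence, squashed into (0, 1) by a sigmoid.
    z = sum(WEIGHTS[name] * value for name, value in scores.items())
    return 1.0 / (1.0 + math.exp(-z))

candidates = {
    "Versailles": {"term_overlap": 0.9, "temporal": 0.8, "type_match": 1.0},
    "Paris":      {"term_overlap": 0.5, "temporal": 0.1, "type_match": 1.0},
}
ranked = sorted(candidates, key=lambda c: confidence(candidates[c]),
                reverse=True)
threshold = 0.85  # varies with the state of the game
best = ranked[0]
print(best, confidence(candidates[best]) >= threshold)
```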
DeepQA Approach • “[A] Massively parallel probabilistic evidence-based architecture.” • > 100 techniques for analyzing clue, finding sources, generating hypotheses, finding and scoring evidence, and merging and ranking. • Principles • Massive parallelism • Many experts • Pervasive confidence estimation • Integrate shallow and deep knowledge
Computing Power Behind Watson • 90 IBM Power 750 servers • 3.5 GHz POWER7 eight core processor, with four threads per core • 2880 POWER7 processor cores • Able to perform “massively parallel” tasks • 16 Terabytes of RAM • Process at 500 gigabytes per second • 80 TeraFLOPs
The Data Inside Watson • Roughly 4 terabytes of information. • Entirely text-based, no pictures/audio/video. • Included dictionaries, books, textbooks, encyclopedias, news articles, and all of Wikipedia. • Some of the data was structured (databases), but a lot was unstructured or semi-structured. • Divided into clusters and tagged for usefulness.
Leading Up to the Game • Watson developed strategy through sparring matches against past Jeopardy players.
Daily Double Betting • Betting strategy depends on the stage of the game and the opponents’ scores. • First round: catch up if behind; bet fairly conservatively if ahead. • Second round: bet more aggressively to pull ahead. • End of second round: strategic bets to maintain a lead (if any). • In one sparring match, Watson bet $100 when leading $27,500 to $8,200 and $4,600.
Learning Throughout the Game • Sometimes Watson doesn’t fully grasp what a category is asking for, so it learns during the game from the previous answers.
Final Jeopardy Betting • Watson judges its score against the others to determine what is needed to win.
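One classic wager rule illustrates the arithmetic (this is a sketch of standard Jeopardy strategy, not Watson’s exact model): the leader bets just enough that a correct answer guarantees finishing ahead of a doubled-up second place:

```python
def leader_wager(my_score, second_score):
    # If correct, the leader finishes with my_score + wager, which must
    # beat the most second place can reach: 2 * second_score. In a
    # "lock game" (my_score > 2 * second_score) no wager is needed.
    return max(0, 2 * second_score - my_score + 1)

print(leader_wager(36000, 10000))  # lock game -> 0
print(leader_wager(20000, 15000))  # -> 10001
```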
Watson’s Answering Strategy • A person won’t answer a question they don’t know enough about, and neither will Watson. • Watson determines this through confidence scoring.
Using the Buzzer • Although Watson was quite quick, it was sometimes not quick enough to hit the buzzer first. • Humans were able to anticipate when the buzzer would be activated by listening to the host read the clue. • While both humans and Watson thought about the question as it was asked, humans process it differently. • Exemplar categories: “Celebrities Middle Names,” “Actors Who Direct.”
Buzz Threshold • The programmers added a calculation to Watson that incorporated confidence into the decision of whether to buzz in. • This took the state of the game into account.
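A toy version of such a game-state-aware threshold (all numbers invented for illustration):

```python
def should_buzz(conf, my_score, leader_score, clues_left):
    # Baseline: only buzz when fairly confident, since a wrong answer
    # costs the clue's full dollar value.
    threshold = 0.50
    if my_score > leader_score:
        threshold += 0.15   # protecting a lead: be pickier
    elif clues_left < 5:
        threshold -= 0.15   # trailing late in the game: take more risks
    return conf >= threshold

print(should_buzz(0.45, 5000, 20000, 3))    # trailing late -> True
print(should_buzz(0.60, 20000, 5000, 30))   # comfortable lead -> False
```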
Pop Culture • Questions about books, history, or facts are fairly constant. • For pop culture questions (as well as current events) Watson was updated with information. • Ex: “Not to be confused with Lady Gaga is Lady this, the country music group with the CD ‘Need You Now’”
Issues between Jeopardy and IBM • Jeopardy originally feared IBM would use the show as a stunt. • IBM execs were concerned the clues would be skewed in favor of the humans. • After many practice rounds of ringing in electronically, Watson was required to physically press a button. • Concern over computer advantage.
Strengths and Weaknesses
Strengths:
• No pressure: as a computer, Watson felt no emotions. On game day, when the other players and even the programmers were nervous, Watson felt nothing.
• Precise algorithms.
• Played a smart strategy: hunted for Daily Doubles and bet intelligently.
Weaknesses:
• Lacked instinct: there is nothing Watson can know off the top of its head.
• Lacked the ability to anticipate the buzzer, a strength champions had built by listening.
• Could not learn from a competitor’s mistake, so it sometimes repeated incorrect responses.
Watson’s Stage Presence • Voice comprised of recordings by Jeff Woodman • Avatar designed by Joshua Davis • Features IBM “Smarter Planet” logo and 42 threads circling the planet. • Designed to reflect moods based on the game.
Watson’s Competition Ken Jennings Brad Rutter
The Game
First match:
• Watson made a few mistakes during the normal rounds and gave an incorrect Final Jeopardy answer.
• Final scores: Jennings $4,800, Rutter $10,400, Watson $35,734.
Second match:
• A fairly tight game; the humans pulled ahead in a few categories by virtue of timing.
• At one point Jennings was leading by a couple thousand dollars.
• Watson got the last Daily Double to pull ahead and preserved its lead in Final Jeopardy.
Future of Watson • Healthcare • Use in diagnostics and treatment suggestion. • Businesses • Customer service • Any situation in which it is necessary to answer a question by parsing through information • Could be fine-tuned to fit a field.
Bibliography
Brain, Marshall. "How was IBM’s Watson computer able to answer the questions on Jeopardy? How did the technology work? How might it be used?" HowStuffWorks, 18 Feb. 2011. Web. 6 July 2012. <http://blogs.howstuffworks.com/2011/02/18/how-was-ibms-watson-computer-able-to-answer-the-questions-on-jeopardy-how-did-the-technology-work-how-might-it-be-used/>.
"The DeepQA Project." IBM, n.d. Web. 6 July 2012. <http://www.research.ibm.com/deepqa/deepqa.shtml>.
"FAQs." IBM, n.d. Web. 6 July 2012. <http://www.research.ibm.com/deepqa/faq.shtml>.
Ferrucci, David, et al. "Building Watson: An Overview of the DeepQA Project." AI Magazine, Fall 2010. Web. 6 July 2012. <http://www.aaai.org/Magazine/Watson/watson.php>.
Hale, Mike. "Actors and Their Roles for $300, HAL? HAL!" New York Times [New York], 8 Feb. 2011. Web. 6 July 2012. <http://www.nytimes.com/2011/02/09/arts/television/09nova.html?_r=1>.
IBM Watson Team. Interview. The Reddit Blog. reddit, 23 Feb. 2011. Web. 6 July 2012. <http://blog.reddit.com/2011/02/ibm-watson-research-team-answers-your.html>.
Jackson, Joab. "IBM Watson Vanquishes Human Jeopardy Foes." PCWorld. IDG Consumer & SMB, 16 Feb. 2011. Web. 6 July 2012. <http://www.pcworld.com/businesscenter/article/219893/ibm_watson_vanquishes_human_jeopardy_foes.html>.
Loftus, Jack, ed. Gizmodo. Gawker Media, 26 Apr. 2009. Web. 6 July 2012. <http://gizmodo.com/5228887/ibm-prepping-soul+crushing-watson-computer-to-compete-on-jeopardy>.
Markoff, John. "Computer Program to Take On ‘Jeopardy!’." New York Times [New York], 26 Apr. 2009. Web. 6 July 2012. <http://www.nytimes.com/2009/04/27/technology/27jeopardy.html>.
Pearson, Tony. "Inside System Storage -- by Tony Pearson." developerWorks. IBM, 18 Feb. 2011. Web. 6 July 2012. <https://www.ibm.com/developerworks/mydeveloperworks/blogs/InsideSystemStorage/entry/ibm_watson_how_to_build_your_own_watson_jr_in_your_basement7?lang=en>.
Prager, John, et al. Question-Answering by Predictive Annotation. Technical rept. New York: ACM, 2000. ACM Digital Library. Web. 6 July 2012. <http://dl.acm.org/citation.cfm?id=345574>.
Radev, Dragomir R., John Prager, and Valerie Samn. Ranking Suspected Answers to Natural Language Questions Using Predictive Annotation. Technical rept. 2000. ACM Digital Library. Web. 6 July 2012. <http://dl.acm.org/citation.cfm?doid=974147.974168>.
"The Research Team." IBM, n.d. Web. 6 July 2012. <http://www-03.ibm.com/innovation/us/watson/research-team/algorithms.html>.
"Show #6086 - Monday, February 14, 2011." J! Archive, n.d. Web. 6 July 2012. <http://www.j-archive.com/showgame.php?game_id=3575>.
Silverman, Matt. "Engineering Intelligence: Why IBM’s Jeopardy-Playing Computer Is So Important." Mashable, 11 Feb. 2011. Web. 6 July 2012. <http://mashable.com/2011/02/11/ibm-watson-jeopardy/>.
Singh, Tarandeep. "Artificial Intelligence Algorithm behind IBM Watson." Geeknizer, 16 Feb. 2011. Web. 6 July 2012. <http://geeknizer.com/artificial-intelligence-algorithm-behind-ibm-watson/>.
"Watson (computer)." Wikipedia. Web. 6 July 2012. <http://en.wikipedia.org/wiki/Watson_%28computer%29>.
Zimmer, Ben. "Is It Time to Welcome Our New Computer Overlords?" The Atlantic. Atlantic Monthly Group, 17 Feb. 2011. Web. 6 July 2012. <http://www.theatlantic.com/technology/archive/2011/02/is-it-time-to-welcome-our-new-computer-overlords/71388/>.