Natural Language Processing
The Dream • To have human-computer interaction that is as natural as possible: language • Can be spoken or written • Not just voice recognition • Not just “learn these commands”
Commercial Efforts • iPhone Siri • Google voice
The Reality • “In reality, Siri is nothing more than a cute gimmick and fantastic marketing tool used to suck people into its mediocrity.” – The Daily Reveille • Gene Weingarten, Washington Post, “Why Google Voice stinks”
Research Efforts • Command-and-Control • Dialog system – Car helps with tasks • Dialog system – Getting directions
Types of tasks • Command-and-control • Dialog • Translation • Text classification • Information Retrieval • Information Extraction
LINGUISTIC INPUT → PRE-PROCESSOR → CLEANED-UP INPUT → SYNTACTIC ANALYZER → PARSE TREE → SEMANTIC INTERPRETER → PROPOSITIONAL REPRESENTATION → "REAL" PROCESSING → INFERENCE/RESPONSE …
What are the representations? • Linguistic input: “how dual get to um bank” • Cleaned-up input: “how do i get to a bank” • Parse tree (for “John often gives a book to Mary”): [S [NP John] [VP [V [Adv often] [V gives]] [NP [Det a] [n book]] [PP [Prep to] [NP Mary]]]] • Semantics: often(gives(john, mary, book))
Semantic Representations • Case Frames • “John broke the window with a hammer” • “The window was broken by John with a hammer” • “Using a hammer, John broke the window” [head: BREAK agent: JOHN object: WINDOW instrument: HAMMER ]
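The point of the case frame is that all three surface sentences share one meaning representation. A minimal sketch of that idea follows, assuming the frames are built by hand; `CaseFrame`, `FRAMES`, and `same_meaning` are illustrative names, and a real system would derive each frame from the parse tree rather than look it up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CaseFrame:
    """One event: a head verb plus the roles its participants play."""
    head: str
    agent: Optional[str] = None
    obj: Optional[str] = None          # the slide's "object" role
    instrument: Optional[str] = None


# Assumption: hand-written interpretations standing in for a semantic interpreter.
BREAK_FRAME = CaseFrame(head="BREAK", agent="JOHN", obj="WINDOW", instrument="HAMMER")

FRAMES = {
    "John broke the window with a hammer": BREAK_FRAME,
    "The window was broken by John with a hammer": BREAK_FRAME,
    "Using a hammer, John broke the window": BREAK_FRAME,
}


def same_meaning(sentence_a: str, sentence_b: str) -> bool:
    """Two sentences mean the same thing if they map to equal case frames."""
    return FRAMES[sentence_a] == FRAMES[sentence_b]
```

Active, passive, and fronted-instrument word orders all collapse to the same frame, so later processing does not need to care which one the speaker chose.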
Why is it hard? Ambiguity, ambiguity, ambiguity “I made her duck.”
Types of Ambiguity • Syntactic: “I saw the Grand Canyon flying to New York.” “Time flies like an arrow.” • Word sense: “The man went to the bank to get some cash.” vs. “The man went to the bank and jumped in.” “He ran the mile in four minutes.” vs. “He ran the mile in the Olympics.” • Referential: “I took the cake from the table and washed it.” vs. “I took the cake from the table and ate it.” • Indirect speech acts: “Can you open the window? I need some air.”
Speech • Using speech (instead of text) introduces a slew of issues: • Recognition errors • Grammatical errors in speech • Unclear sentence boundaries • Omissions and word fragments • Inversions • Interjections • Speech repairs
Techniques? • Parsing: • WordNet • Augmented Transition Networks [Figure: transition network with states 1–8 and arcs labeled NP, AUX, V, and “by”]
Techniques? • Processing the language • Speech acts • Context Map – both of conversation and world • Common sense/ World knowledge • Social Context 10-year old: “I want a juicy Hamburger!” Mother: “Not today, perhaps tomorrow…” General: “I want a juicy Hamburger!” Aide: “Yes, sir!!” Prisoner 1: “I want a juicy Hamburger!” Prisoner 2: “Wouldn't that be nice for once!”
History • Eliza, the first chatterbot: a computer program that mimics human conversation. • Joseph Weizenbaum, Massachusetts Institute of Technology • User types in natural language • ELIZA then analyzes the user's statement and generates some response, which it types out.
Men are all alike. IN WHAT WAY? They're always bugging us about something or other. CAN YOU THINK OF A SPECIFIC EXAMPLE? Well, my boyfriend made me come here. YOUR BOYFRIEND MADE YOU COME HERE He says I'm depressed much of the time. I AM SORRY TO HEAR YOU ARE DEPRESSED It's true. I am unhappy. DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY I need some help, that much seems certain. WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP Perhaps I could learn to get along with my mother. TELL ME MORE ABOUT YOUR FAMILY
Eliza Transformation Rules • "I am X" → "How long have you been X?" Example: I am very unhappy. HOW LONG HAVE YOU BEEN VERY UNHAPPY? • "X YOU Y ME" → "What makes you think I Y you?" Example: It seems that you hate me. WHAT MAKES YOU THINK THAT I HATE YOU?
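The two transformation rules can be sketched as pattern/template pairs. This is a minimal illustration using regular expressions, not Weizenbaum's original implementation; the names `RULES` and `respond`, and the fallback reply, are assumptions made for the example.

```python
import re

# Each rule pairs a pattern (with named slots X and Y) with a response template.
RULES = [
    # "I am X" -> "How long have you been X?"
    (re.compile(r"^I am (?P<x>.+)$", re.IGNORECASE),
     "HOW LONG HAVE YOU BEEN {x}?"),
    # "X YOU Y ME" -> "What makes you think I Y you?"
    (re.compile(r"^(?P<x>.*)\byou (?P<y>.+) me$", re.IGNORECASE),
     "WHAT MAKES YOU THINK THAT I {y} YOU?"),
]


def respond(statement: str, default: str = "PLEASE GO ON.") -> str:
    """Apply the first matching rule; fall back to a stock reply (an assumption)."""
    text = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            # Copy the matched fragments into the template, upper-cased like ELIZA's output.
            slots = {name: value.strip().upper()
                     for name, value in match.groupdict().items()}
            return template.format(**slots)
    return default


print(respond("I am very unhappy."))          # HOW LONG HAVE YOU BEEN VERY UNHAPPY?
print(respond("It seems that you hate me."))  # WHAT MAKES YOU THINK THAT I HATE YOU?
```

Nothing here understands the sentence: the program only copies matched text into a canned frame, which is why conversations like the one with PARRY drift so quickly.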
PARRY • PARRY had far more to say than ELIZA and wanted to tell you things • It modelled a psychiatrically disordered person (PARanoid), so it was easy to excuse when it didn’t make sense. • It remembered what you had said and got ‘annoyed’ and ‘frightened’ • Colby ran a series of Turing tests with psychiatrists – PARRY passed
Why are you in the hospital? I SHOULDN’T BE HERE. Who brought you here? THE POLICE. What trouble did you have with the police? COPS DON’T DO THEIR JOB. What do they do instead? COPS ARREST THE WRONG PEOPLE. They arrest the right ones sometimes. THE LAW SHOULD GET THOSE ITALIAN CROOKS. Are you Italian? MY BACKGROUND IS BRITISH BUT I WAS BORN IN THIS COUNTRY. Do you know any Italian crooks? I TRY TO AVOID THE UNDERWORLD. Are you afraid of the underworld? THE MAFIA ARE THE WORST GANGSTERS.
Parry Meets Eliza parry: People get on my nerves sometimes. eliza: I am not sure I understand you fully. parry: You should pay more attention. eliza: Suppose you should pay more attention? parry: You're entitled to your opinion. eliza: What makes you think I'm entitled to my opinion? parry: It bothers me just to be around people. eliza: Please go on.
So… • Limit the task (domain), and/or • Limit the vocabulary, and/or • Limit the initiative
Watson on Jeopardy! http://www.youtube.com/watch?v=seNkjYyG3gI (How it works: 2:20 – commercial) http://www.youtube.com/watch?v=YLR1byL0U8M (Tanks on final jeopardy: 22:20) http://www.youtube.com/watch?v=vY8NrMVvMDM (An overall win: 7:20)
“This is not really an exercise in brains. It’s rather an exercise in mere recall.” (Uncredited person at ABC News) Why this guy’s clueless…
Human v. Computers • Computers are tools – made to help humans do what we don’t do well… • Repetition • Speed • Accuracy • Computer has PERFECT recall, duh…
What makes Jeopardy! hard? • To me? The trivia • To Ken Jennings? Being fast on the buzzer • To Watson? Understanding the question
How does Watson do it? 1. Rules • Grammar rules say “Operatic” is an adjective modifying “ship” • Rules about English say the articles “a” and “an” aren’t good search terms • A rule of thumb for Jeopardy says the answer is a synonym for “garment” (?)
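The rule about articles is the easiest one to show in code. The sketch below is only an illustration of that one rule, not Watson's actual pipeline; the function name `search_terms` and the sample phrase are hypothetical.

```python
import re

# The slide's rule: the articles "a" and "an" are not useful search terms.
ARTICLES = {"a", "an"}


def search_terms(clue: str) -> list[str]:
    """Lower-case the clue, split it into words, and drop the articles."""
    words = re.findall(r"[a-z']+", clue.lower())
    return [word for word in words if word not in ARTICLES]


# Hypothetical phrase built from the words mentioned on the slide.
print(search_terms("A garment and an operatic ship"))
# ['garment', 'and', 'operatic', 'ship']
```

A fuller rule set would also drop other function words such as "and", but that goes beyond what the slide states.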
How does Watson do it? 2. Big, parallel search • What does Google say? …. Not much helpful. • What is the probability that the word “garment” means “clothing”? • Do I have a list of clothing words, especially children’s clothing? • Operatic – Can I have a list of operas? plays? musicals? • Ship – What words are associated with ships? Does it mean the noun or the verb?
How does Watson do it? 3. Evidence Gathering • What each document tells me gives me some evidence, with a probability score • I combine the scores (somehow) into an overall confidence measure, and pick the highest one.
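The slide leaves the combination step as "somehow". One simple possibility, shown here purely as an assumption, is a noisy-OR: treat each piece of evidence as independent and ask how likely it is that at least one of them is right. The function names and the candidate answers below are made up for illustration.

```python
from collections import defaultdict


def combine_noisy_or(evidence):
    """evidence: iterable of (candidate_answer, probability) pairs, one per document.

    Returns {candidate: confidence}, where confidence is the chance that at
    least one supporting document is right, assuming independence.
    """
    chance_all_wrong = defaultdict(lambda: 1.0)
    for candidate, probability in evidence:
        chance_all_wrong[candidate] *= (1.0 - probability)
    return {c: 1.0 - wrong for c, wrong in chance_all_wrong.items()}


def best_answer(evidence):
    """Pick the candidate with the highest combined confidence."""
    confidence = combine_noisy_or(evidence)
    answer = max(confidence, key=confidence.get)
    return answer, confidence[answer]


# Made-up evidence: two weak documents support "bib", one stronger one supports "apron".
evidence = [("bib", 0.5), ("bib", 0.5), ("apron", 0.6)]
print(best_answer(evidence))  # ('bib', 0.75)
```

Two mediocre sources that agree can outweigh one better source, which is the intuition behind gathering evidence from many documents in parallel.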
How does Watson do it? 4. Machine Learning • Over time, the machine learns which resources give the best evidence • During one category, the machine learns the kind of answer the category is looking for (all the answers so far in this category have been clothing words, for example)
So what kinds of questions stump Watson? See if you can figure out why…
Question has NOTHING to do with Wimbledon – but that seems like an important word and will issue lots of evidence…
Well, there are lots! Watson doesn’t “get” that the category is computer keys… the answer is F1.
What exactly is the question asking? Where is the verb? There are lots of extra words here… Answer is Alberto.
Not at all Human! • When the computer gets it wrong, it is really wrong! • Once, when asked for the Russian word for "goodbye," Watson gave the answer "cholesterol." • "To me, that's just crazy. There's no way a human player could duplicate that kind of mistake.” - Jennings
"If you want to build something that thinks like a human, we have a great way to do that. It only takes like nine months and it's really fun.” – Bart Massey, Portland State
Rutter, on ABC interview: “Watson, as far as I know, can’t write a symphony or paint a lovely picture…” Wrong again, Rutter! Of course, he also said, “Hey, I’m just a guy who answers trivia questions!”
So, is it a significant accomplishment? "This is the most significant breakthrough of this century. I know the phones are ringing off the hook with interest in Watson systems. The Internet may trump Watson, but for this century, it's the most significant advance in computing." - Richard Doherty, Envisioneering Group
What do you think we can do in the real world with a system like Watson?