
ARTIFICIAL INTELLIGENCE: THE MAIN IDEAS


Presentation Transcript


  1. ARTIFICIAL INTELLIGENCE: THE MAIN IDEAS. OLLI COURSE SCI 102. Tuesdays, 11:00 a.m. – 12:30 p.m., Winter Quarter, 2013, Higher Education Center, Medford Room 226. Nils J. Nilsson, nilsson@cs.stanford.edu, http://ai.stanford.edu/~nilsson/. Course Web Page: www.sci102.com/. For information about parking near the HEC, go to: http://www.ci.medford.or.us/page.asp?navid=2117. There are links on that page to parking rules and maps.

  2. AI in the News?

  3. PART THREE: AGENTS THAT REASON (Continued)

  4. Adding Reasoning to the Model of an Agent (diagram components: Perception, Memory, Reasoner, Planner, Action Selection)

  5. Reasoning Under Uncertainty We (and AI agents) are uncertain about almost everything! In these matters the only certainty is that nothing is certain. -- Pliny the Elder When one admits that nothing is certain one must, I think, also admit that some things are much more nearly certain than others. -- Bertrand Russell

  6. How Should We Reason if Things Are Uncertain? ∀x [Green(x) => Liftable(x)], Green(A); Conclusion: Liftable(A). But, what if we are not certain that A is green? Perhaps it’s only “likely” that it’s green. And, perhaps we are not certain that all green blocks are liftable? We need to use “probabilistic reasoning”.
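To see what “probabilistic reasoning” looks like here, a minimal sketch in Python with made-up numbers (the slides give none) for how sure we are that A is green and that green, or non-green, blocks are liftable:

```python
# Hypothetical numbers, not from the course: we are only fairly sure
# that A is green and that green blocks are liftable.
p_green = 0.9                 # p(Green(A))
p_lift_given_green = 0.95     # p(Liftable(A) | Green(A))
p_lift_given_not_green = 0.2  # p(Liftable(A) | not Green(A))

# Total probability: marginalize over whether A is green.
p_liftable = (p_lift_given_green * p_green
              + p_lift_given_not_green * (1 - p_green))
print(f"p(Liftable(A)) = {p_liftable:.3f}")  # 0.875
```

Instead of a certain conclusion “Liftable(A)”, we get a degree of belief that reflects our uncertainty about both premises.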

  7. Probabilities: A Tool for Dealing With Uncertainty. Probability of Rain in Pendleton, OR, as of Jan. 19, 2013.

  8. Good Poker Players Are Familiar with Probabilities
     Hand: Probability
     Single Pair: 0.422569
     Two Pair: 0.047539
     Triple: 0.021128
     A Straight: 0.00392465
     Full House: 0.001441

  9. So Are Physicians. Symptoms, Family History, Medical History, Tests, . . . feed Probabilistic Reasoning, together with Medical Knowledge Bases and Clinical Experience, to produce likely diagnoses and to rule out unlikely but serious diagnoses.

  10. Methods for Coming Up With Probability Values
     Mathematical: as in calculating poker probabilities (see the sketch below)
     Frequency: from large databases of records (such as mortality tables, etc.)
     “Subjective”: as in guessing football odds, horse-racing odds, etc.
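As a concrete instance of the “mathematical” route, a short sketch (mine, not from the slides) that reproduces the single-pair figure from the previous slide by counting 5-card hands:

```python
from math import comb

# Hands with exactly one pair: choose the pair's rank, 2 of its 4 suits,
# 3 other distinct ranks, and a suit for each of those three cards.
single_pair = comb(13, 1) * comb(4, 2) * comb(12, 3) * 4**3
total_hands = comb(52, 5)

print(single_pair / total_hands)  # ~0.422569, matching the table above
```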

  11. Markets Are Often Used to Establish Subjective Probabilities Foresight Exchange Prediction Market “A power plant will sell energy produced by nuclear fusion by 31 December 2045. After its initial energy sale, it must operate (i.e., sell energy) regularly for a minimum of one year. ‘Regularly’ is defined as >50% of the time.” Last Price: $0.71 That is, the probability is 0.71 http://www.ideosphere.com/fx/index.html

  12. More Prices http://www.ideosphere.com/fx/index.html

  13. Defining Bets Cold fusion of hydrogen in nickel can produce over 10 watts/cc net power [by Jan. 1, 2015]. The phrase "cold fusion" has its vernacular meaning of any low energy nuclear reaction that produces heat. http://www.ideosphere.com/fx/index.html

  14. Some Basics About Probabilities
     p(x) denotes the probability of x [sometimes Pr(x)]. Example: p(heads) = 0.5
     p(x) always between 0 and 1 (sometimes expressed as a percentage)
     p(x) sometimes expressed as “odds” (4 to 1 odds in favor of x is the same as p(x) = 0.8; see the sketch below)
     p(x) + p(y) = 1 if x and y are mutually exclusive and exhaustive. Example: p(heads) + p(tails) = 1
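A small added sketch of the odds-to-probability conversion mentioned above: odds of a to b in favor of x correspond to p(x) = a / (a + b).

```python
def odds_to_probability(a: float, b: float) -> float:
    """Convert 'a to b' odds in favor of an event into a probability."""
    return a / (a + b)

print(odds_to_probability(4, 1))  # 0.8, as in the slide
print(odds_to_probability(1, 1))  # 0.5, even odds
```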

  15. Conditional Probabilities. The probability that Joe has a fever, given that he has the flu, is greater than the probability that he has a fever when that fact is unknown: p(Joe has high fever|Joe has flu) (a conditional probability) > p(Joe has high fever) (a prior probability).

  16. Much Causal Knowledge is Probabilistic. Flu, the cause, (usually) produces a high fever, a possible symptom: p(Joe has high fever|Joe has flu).

  17. What is the Likely Cause of a Symptom? p(Joe has high fever|Joe has flu) is known from medical (causal) knowledge. But, what about the reverse: p(Joe has flu|Joe has high fever)?

  18. Bayes’ Rule to the Rescue: p(x|y) = p(y|x)p(x)/p(y), that is, p(cause|symptom) = p(symptom|cause)p(cause)/p(symptom). Thomas Bayes (c. 1701 – April 1761) was an English mathematician and Presbyterian minister.

  19. Deriving Bayes’ Rule: p(x|y)p(y) = p(x & y) (unconditioning), and p(x & y) = p(y & x) (the order of x and y doesn’t matter); likewise p(y & x) = p(y|x)p(x). From whence: p(x|y) = p(y|x)p(x)/p(y).

  20. Using Bayes’ Rule: p(Joe has flu|Joe has high fever) = p(Joe has high fever|Joe has flu) p(Joe has flu) / p(Joe has high fever). The evidence of high fever comes from a thermometer; the probabilities on the right-hand side come from medical and statistical “knowledge”.

  21. Bayesian Networks. Judea Pearl.

  22. In AI, Probabilistic Knowledge is Represented in “Bayesian Networks”. A Very Small Network: a “causality” link runs from the node “Joe has flu” to the node “Joe has high fever” and is annotated with a table of probabilities:
     Joe has flu = T: p(Joe has high fever) = 0.95
     Joe has flu = F: p(Joe has high fever) = 0.005
     Going in the direction of the arrow is called “causality reasoning”.

  23. Use Bayes’ Rule to Go Against the Arrow. Given the symptom “Joe has high fever”, calculate the probability of “Joe has flu” using Bayes’ rule; here it works out to 0.99. Going against the arrow is called “evidential reasoning”.
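A sketch of this evidential step using the conditional probabilities from the previous slide. The slides don’t state the prior p(Joe has flu) behind the 0.99 figure, so the prior below is a made-up placeholder; the posterior therefore comes out lower, but the mechanics are the same:

```python
# Conditional probabilities from the small network on slide 22.
p_fever_given_flu = 0.95
p_fever_given_no_flu = 0.005

# Hypothetical prior, NOT from the slides; the slide's 0.99 posterior
# implies a considerably higher prior for flu.
p_flu = 0.05

# Total probability of the evidence, then Bayes' rule.
p_fever = (p_fever_given_flu * p_flu
           + p_fever_given_no_flu * (1 - p_flu))
p_flu_given_fever = p_fever_given_flu * p_flu / p_fever
print(f"p(flu | high fever) = {p_flu_given_fever:.3f}")  # ~0.909
```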

  24. The Calculations Are More Complex for Bigger Networks Conditional Probability Table Expresses Causality Information

  25. A Larger Causality Network Check out interactive applet at: http://aispace.org/bayes/ and click on sample problems

  26. The Car Doesn’t Start: Evidential Reasoning Needed. (Applet screenshot: with the evidence given as p = 0, the network computes p = 0.283 for a node whose value was 0.023 beforehand.)

  27. New Information Changes Things. (Applet screenshot: with a second piece of evidence also given as p = 0, the computed p = 0.283 drops to p = 0.1.)

  28. Bayesian Networks Can Be Learned. Can this net be learned using the statistical data that it generates?

  29. Suppose We Know There Are 37 Nodes: Can We Learn What Is Linked to What?

  30. Generate Thousands of Samples That Obey the Underlying Probabilities Given by the Unknown Network. (The samples shown were generated from the unknown network; a small sampling sketch follows.)
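To make the idea concrete, a tiny sketch (mine, not the course’s method) that samples from the two-node flu/fever network of slide 22 and then re-estimates its conditional probability table from the samples; real structure-learning algorithms, such as the one cited on the next slide, must also discover which nodes are linked, not just the probabilities.

```python
import random

random.seed(0)
p_flu = 0.05                                 # hypothetical prior, not from the slides
p_fever_given = {True: 0.95, False: 0.005}   # CPT from slide 22

# Generate samples that obey the network's probabilities.
samples = []
for _ in range(100_000):
    flu = random.random() < p_flu
    fever = random.random() < p_fever_given[flu]
    samples.append((flu, fever))

# Re-estimate p(fever | flu) and p(fever | not flu) from the data.
for value in (True, False):
    rows = [fever for flu, fever in samples if flu == value]
    print(f"p(fever | flu={value}) ~= {sum(rows) / len(rows):.3f}")
```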

  31. The Learned Network. Spirtes, P., and Meek, C. 1995. “Learning Bayesian networks with discrete variables from data.” Proc. 1st Int. Conf. on Knowledge Discovery and Data Mining.

  32. Comparison Learned network (generated from the data) Target (from which data was generated)

  33. Availability of Large Amounts of Data Permits the Statistical Analysis Needed to Learn Bayesian Networks

  34. Some Learned Networks Can Be Quite Large. Gene network inferred by analyzing human cell cycle expression data. http://www.sciencedirect.com/science/article/pii/S1532046411000311#

  35. Applications of Bayesian Networks
     * Computational biology and bioinformatics (gene regulatory networks, protein structure, . . .)
     * Medicine
     * Document classification
     * Information retrieval
     * Image processing
     * Decision support systems
     * Engineering
     * Law
     * Speech recognition
     http://en.wikipedia.org/wiki/Bayesian_network#Applications

  36. PART FOUR: AGENTS THAT UNDERSTAND HUMAN LANGUAGE

  37. Adding Language Ability to the Model of an Agent (diagram components: Perception, Language Processing, Memory, Reasoner, Planner, Action Selection)

  38. Natural Language Processing (NLP)
     * Converting Speech to Text
     * “Understanding” Text
     * Translation
     * Speech and Text Generation

  39. Converting Speech to Text
     * Capturing the Speech Waveform
     * Division of Waveform Into “Frames” (a framing sketch follows this list)
     * Determining “Features” of Each Frame
     * Recognizing “Phonemes”
     * Converting Phonemes to Text
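A minimal sketch (not from the slides) of the “frames” step: the waveform is cut into short, overlapping windows, each of which is later turned into a feature vector. The 25 ms frame length and 10 ms step are typical but assumed values.

```python
import numpy as np

def split_into_frames(waveform: np.ndarray, sample_rate: int,
                      frame_ms: float = 25.0, step_ms: float = 10.0) -> np.ndarray:
    """Cut a 1-D waveform into overlapping frames (one frame per row)."""
    frame_len = int(sample_rate * frame_ms / 1000)   # e.g. 400 samples at 16 kHz
    step = int(sample_rate * step_ms / 1000)         # e.g. 160 samples at 16 kHz
    n_frames = 1 + max(0, (len(waveform) - frame_len) // step)
    return np.stack([waveform[i * step: i * step + frame_len]
                     for i in range(n_frames)])

# One second of fake audio at 16 kHz yields roughly 98 frames.
frames = split_into_frames(np.random.randn(16000), 16000)
print(frames.shape)  # (98, 400)
```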

  40. A Hard Problem! There are many ways to recognize speech. There are many ways to wreck a nice beach. I scream for ice cream

  41. Example Speech Waveform. (Figure: a waveform labeled with the sentence that was spoken; “phonemes” are acoustic elements.)

  42. Notation For English Phonemes

  43. Early Processing. The speech waveform passes through Spectral Analysis, Etc., producing features of the waveform: F1, F2, F3, . . .
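A rough sketch of one way the spectral-analysis box could be realized, computing a few band-energy features per frame with a Fourier transform; the equal-width bands and log energies are my assumptions, not the course’s.

```python
import numpy as np

def frame_features(frame: np.ndarray, n_bands: int = 8) -> np.ndarray:
    """Log energy in n_bands equal-width frequency bands of one frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    bands = np.array_split(spectrum, n_bands)   # crude equal-width bands
    return np.log(np.array([band.sum() for band in bands]) + 1e-10)

# Feature vector (F1, F2, F3, ...) for one 25 ms frame of fake 16 kHz audio.
features = frame_features(np.random.randn(400))
print(features.shape)  # (8,)
```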

  44. Different Phonemes Cause Different Waveform Features (with some uncertainty). A Bayesian network, the “acoustic model”: u and v are variables, u the phoneme (could be one of 40 or so), v the features (could be one of several), and the link carries p(v|u).

  45. Given a Particular Feature, F, Select the Most Probable Phoneme. In the “acoustic model” Bayesian network, u is the phoneme (could be one of about 40) and the link carries p(F|u). Bayes’ Rule: p(u|F) = p(F|u)p(u)/p(F). Substitute each of the 40 or so phonemes in the above equation and note which gives the largest value. That’s our guess for which phoneme was spoken.
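A toy sketch of that argmax with a made-up three-phoneme acoustic model (real systems use roughly 40 phonemes and continuous feature models). Since p(F) is the same for every candidate, it can be dropped from the comparison.

```python
# Hypothetical acoustic model: p(F | phoneme) for one observed feature F,
# plus prior probabilities p(phoneme). The numbers are illustrative only.
p_F_given_phoneme = {"ah": 0.30, "iy": 0.05, "s": 0.01}
p_phoneme = {"ah": 0.4, "iy": 0.4, "s": 0.2}

# Bayes' rule: p(u|F) is proportional to p(F|u) p(u); p(F) cancels in the argmax.
scores = {u: p_F_given_phoneme[u] * p_phoneme[u] for u in p_phoneme}
best = max(scores, key=scores.get)
print(best, scores)  # 'ah' wins for this particular (made-up) F
```

The word-level selection on slide 47 works the same way, just with thousands of candidate words instead of about 40 phonemes.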

  46. Different Words Cause Different Phonemes (again, with some uncertainty). A Bayesian network, the “word model”: x is the word (could be one of thousands), y the phonemes (could be one of 40 or so), and the link carries p(y|x).

  47. Given a Particular Phoneme, PH, Select the Most Probable Word. In the “word model” Bayesian network, x is the word (could be one of thousands) and the link carries p(PH|x). Bayes’ Rule: p(x|PH) = p(PH|x)p(x)/p(PH). Substitute each of the thousands of words in the above equation and note which gives the largest value. That’s our guess for which word was spoken.

  48. But, the Process is MUCH More Complicated! At the word level (x1, x2, x3), a language model says what words are likely, and a word model connects the words to the phoneme level (y1, y2, y3). The Bayesian network connecting words with phonemes is called a “Hidden Markov Model” (HMM).

  49. We Also Have an HMM Connecting Phonemes With Features. At the phoneme level (u1, u2, u3), an articulation model says what phonemes are likely, and an acoustic model connects the phonemes to the feature level (v1, v2, v3).

  50. Use Both HMMs and Select the Most Probable Word Sequence. The waveform’s feature level (v1, v2, v3) leads to the phoneme level (u1, u2, u3), which leads to the word level (x1, x2, x3). Learning: the various probabilities can be tuned for a particular speaker. (A decoding sketch follows.)
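The slides don’t show the selection algorithm itself; the standard way to pick the most probable hidden sequence in an HMM is the Viterbi algorithm, sketched here for a single tiny, made-up model with phonemes as hidden states and discrete features as observations (the full system stacks two such models, as in the slide).

```python
import numpy as np

# Tiny, made-up HMM: hidden states are phonemes, observations are feature symbols.
states = ["ah", "iy", "s"]
start = np.array([0.5, 0.3, 0.2])              # p(first phoneme)
trans = np.array([[0.6, 0.3, 0.1],             # p(next phoneme | current phoneme)
                  [0.2, 0.6, 0.2],
                  [0.3, 0.3, 0.4]])
emit = np.array([[0.7, 0.2, 0.1],              # p(feature | phoneme)
                 [0.1, 0.8, 0.1],
                 [0.2, 0.2, 0.6]])

def viterbi(observations):
    """Return the most probable hidden-state sequence for the observations."""
    log_p = np.log(start) + np.log(emit[:, observations[0]])
    back = []
    for obs in observations[1:]:
        scores = log_p[:, None] + np.log(trans)   # scores[i, j]: come from i, move to j
        back.append(scores.argmax(axis=0))
        log_p = scores.max(axis=0) + np.log(emit[:, obs])
    path = [int(log_p.argmax())]
    for pointers in reversed(back):
        path.append(int(pointers[path[-1]]))
    return [states[i] for i in reversed(path)]

print(viterbi([0, 1, 1, 2]))  # ['ah', 'iy', 'iy', 's'] for this toy model
```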
