
Uncertainty in Environments



  1. Uncertainty in Environments Comp3710 Artificial Intelligence, Computing Science, Thompson Rivers University

  2. Course Outline • Part I – Introduction to Artificial Intelligence • Part II – Classical Artificial Intelligence • Part III – Machine Learning • Introduction to Machine Learning • Neural Networks • Probabilistic Reasoning and Bayesian Belief Networks • Artificial Life: Learning through Emergent Behavior • Part IV – Advanced Topics

  3. Learning Outcomes • Relate a proposition to probability, i.e., random variables. • Give the sample space for a set of random variables. • Use one of Kolmogorov's 3 axioms: P(A ∨ B) = P(A) + P(B) − P(A ∧ B) • Give the joint probability distribution for a set of random variables. • Compute probabilities given a joint probability distribution. • Give the joint probability distribution for a set of random variables that have conditional independence relations. • Compute the diagnostic probability from causal probabilities.

  4. Outline • How to deal with uncertainty in environments • Probability • Syntax and Semantics • Inference • Independence and Bayes' Rule

  5. 1. Uncertainty in environments • An example of a decision-making problem: • Let action At = leave for airport t minutes before the flight. • Will At get me there on time? • [Q] How to decide? • Problems: Application environments are nondeterministic and unpredictable, with many hidden factors: • partial observability (road state, other drivers' plans, etc.) • noisy sensors (traffic reports) • uncertainty in action outcomes (flat tire, etc.) • immense complexity of modeling and predicting traffic • Hence a purely logical approach either • risks falsehood: "A25 will get me there on time", or • leads to conclusions that are too weak for decision making: • "A25 will get me there on time if there's no accident on the bridge and it doesn't rain and my tires remain intact etc. etc." • (A1440 might reasonably be said to get me there on time, but I'd have to stay overnight in the airport …)

  6. Methods for handling uncertainty • [Q] How to handle uncertainty? • [Q] Default logic? • Default assumptions: My car does not have a flat tire. ... • A25 works unless contradicted by evidence. • But the issue: which assumptions are reasonable? • [Q] Can we use fuzzy logic? • Probability – uncertainty/certainty • Model of degree of belief • Given the available evidence or knowledge of the environment, A25 will get me there on time with probability 0.04. • Fuzzy logic – ambiguity/clearness • Degree of truth – value (observed data/facts) • E.g., my age

  7. 2. Probability • Probability is a way of expressing knowledge or belief that an event will occur or has occurred. E.g., … • Probabilistic assertions summarize the effects of hidden factors, i.e., • laziness: failure to enumerate exceptions, qualifications, etc. • ignorance: lack of relevant facts, initial conditions, etc. • Subjective or Bayesian probability: • Probabilities can relate propositions to the agent's own state of knowledge (or conditions), e.g., P(A25 | no reported accidents) = 0.06 • Probabilities of propositions change with new evidence, e.g., P(A25 | no reported accidents, 5 a.m.) = 0.15 • Hence probability theory can be a good tool.

  8. Making decisions under uncertainty • Suppose I believe the following: P(A25 gets me there on time | …) = 0.04 P(A90 gets me there on time | …) = 0.70 P(A120 gets me there on time | …) = 0.95 P(A1440 gets me there on time | …) = 0.9999 • [Q] Which action to choose? • Depends on my preferences for missing the flight vs. time spent waiting, etc. • Utility (the quality of being useful) theory is used to represent and infer preferences. • Decision theory = probability theory + utility theory, as in the sketch below.
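A minimal sketch of choosing the action with maximum expected utility, using the probabilities from the slide and entirely hypothetical utility numbers:

```python
# Decision theory sketch: expected utility = P(on time) * U(on time)
# + P(not on time) * U(not on time); pick the action that maximizes it.
# The probabilities are from the slide; the utilities are made up.
actions = {
    # action: (P(on time | ...), utility if on time, utility otherwise)
    "A25":   (0.04,   100, -1000),
    "A90":   (0.70,    95, -1000),
    "A120":  (0.95,    90, -1000),
    "A1440": (0.9999,  10, -1000),  # on time, but a night in the airport
}

def expected_utility(p_on_time, u_on_time, u_late):
    return p_on_time * u_on_time + (1 - p_on_time) * u_late

best = max(actions, key=lambda a: expected_utility(*actions[a]))
print(best)  # "A120" under these hypothetical utilities
```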

  9. 3. Syntax and Semantics • Sample space, event, probability, probability distribution • Random variable and proposition • Prior probability (aka unconditional probability) • Posterior probability (aka conditional probability)

  10. Sample Space and Probability • Definition. Begin with a set Ω, the sample space, whose elements are atomic events of the same type. • E.g., the 6 possible rolls of a die -> Ω = {???} • ω in Ω is a sample point / possible world / atomic event. • Definition. A probability space or probability distribution or probability model is a sample space with an assignment P(ω) for every ω in Ω s.t. • 0 ≤ P(ω) ≤ 1 • Σω P(ω) = 1 • E.g., P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6. [Q] Sample space? [Q] Probability distribution? • Definition. An event A can be considered as a subset of Ω. • P(A) = Σω in A P(ω) • E.g., P(die roll < 4) = P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 1/2

  11. 3. Axioms of probability • Kolmogorov's axioms: For any propositions (events) A, B • 0 ≤ P(A) ≤ 1 • P(true) = 1 and P(false) = 0 • P(A ∨ B) = P(A) + P(B) − P(A ∧ B)
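A minimal sketch of the die example as a probability model, checking the axioms numerically (the events A and B below are illustrative):

```python
from fractions import Fraction

# Sample space for one die roll, with P(w) = 1/6 for every sample point.
omega = {w: Fraction(1, 6) for w in range(1, 7)}
assert sum(omega.values()) == 1          # probabilities sum to 1

def P(event):
    """P(A) = sum of P(w) over the sample points w in the event A."""
    return sum(p for w, p in omega.items() if event(w))

A = lambda w: w < 4         # the event "die roll < 4"
B = lambda w: w % 2 == 1    # the event "roll is odd"
print(P(A))                 # 1/2, as on the previous slide
# Kolmogorov's third axiom: P(A or B) = P(A) + P(B) - P(A and B)
assert P(lambda w: A(w) or B(w)) == P(A) + P(B) - P(lambda w: A(w) and B(w))
```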

  12. Random Variable and Proposition • Definition. Basic element: random variable, representing a sample space • A function from sample points to some range, e.g., the reals or Booleans • E.g., Odd: N -> {true, false}, Odd(1) = true • Boolean random variables • E.g., Cavity (do I have a cavity?); [Q] What is the sample space? • Discrete random variables (finite or infinite) • E.g., Weather is one of <sunny, rainy, cloudy, snow>; [Q] What is the sample space? • Continuous random variables (bounded or unbounded) • E.g., RoomTemperature < 22; [Q] What is the sample space?

  13. 3. Proposition (how to relate to probability?) • Think of a proposition as the event where the proposition is true. • E.g., I have a toothache; The weather is sunny and there is no cavity. • [Q] How to relate to probability, i.e., random variables? • Similar to propositional logic: possible worlds defined by assignment of values to random variables. • A sample point is an assignment of values to a set of random variables. • A sample space is the Cartesian product of the ranges of the random variables (see the sketch below). • [Q] What is the sample space for "Cavity and Weather"? • Elementary proposition constructed by assignment of a value to a random variable, e.g., Weather = sunny; Cavity = false (can be abbreviated as ¬cavity) • Complex propositions formed from elementary propositions and standard logical connectives, e.g., (Weather = sunny) ∨ (Cavity = false) • Proposition = conjunction/disjunction of atomic events
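A small sketch of the Cartesian-product idea for the "Cavity and Weather" question, with the variable ranges taken from the slides:

```python
from itertools import product

# The sample space for "Cavity and Weather" is the Cartesian product
# of the two variables' ranges: 2 * 4 = 8 atomic events.
cavity_range  = [True, False]
weather_range = ["sunny", "rainy", "cloudy", "snow"]

sample_space = list(product(cavity_range, weather_range))
print(len(sample_space))  # 8
# Each sample point assigns a value to every random variable, e.g.
# (False, "sunny")  <=>  (Cavity = false) AND (Weather = sunny)
```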

  14. Prior and Posterior Probability • Prior (aka unconditional) probability • Posterior (aka conditional) probability

  15. Prior probability • Prior or unconditional probabilities of propositions, e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72, correspond to belief prior to the arrival of any (new) evidence. • Probability distribution gives values for all possible assignments: P(Weather) = <0.72, 0.1, 0.08, 0.1> (normalized, i.e., sums to 1) • Joint probability distribution for a set of random variables gives the probability of every atomic event on those random variables. P(Weather, Cavity) = a 4 × 2 matrix of values:

Weather =        sunny   rainy   cloudy   snow
Cavity = true    0.144   0.02    0.016    0.02
Cavity = false   0.576   0.08    0.064    0.08

[Q] What is the sum of all the values above? • Every question about a domain can be answered with the joint probability distribution. [Q] Any problem? The size of the table.

  16. P(Weather, Cavity) = a 4 × 2 matrix of values:

Weather =        sunny   rainy   cloudy   snow
Cavity = true    0.144   0.02    0.016    0.02
Cavity = false   0.576   0.08    0.064    0.08

• [Q] What is the probability of Weather = cloudy? • [Q] What is the probability of snow ∧ ¬cavity? • [Q] What is the probability of sunny ∨ cavity? P(Weather = sunny ∨ Cavity = true) = P(Weather = sunny) + P(Cavity = true) − ??? = (.144 + .576) + (.144 + .02 + .016 + .02) − ???
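A sketch of answering these queries by summing entries of the joint table (the predicate-based helper P is just one way to write it):

```python
# The joint distribution P(Weather, Cavity) from the slide, keyed by
# (weather value, cavity value).
joint = {
    ("sunny", True): 0.144,  ("rainy", True): 0.02,
    ("cloudy", True): 0.016, ("snow", True): 0.02,
    ("sunny", False): 0.576, ("rainy", False): 0.08,
    ("cloudy", False): 0.064, ("snow", False): 0.08,
}

def P(pred):
    """Sum the probabilities of the atomic events where pred holds."""
    return sum(p for (w, c), p in joint.items() if pred(w, c))

print(P(lambda w, c: w == "cloudy"))           # P(Weather = cloudy)
print(P(lambda w, c: w == "snow" and not c))   # P(snow AND no cavity)
# Inclusion-exclusion for the disjunction on the slide:
p_or = (P(lambda w, c: w == "sunny") + P(lambda w, c: c)
        - P(lambda w, c: w == "sunny" and c))
print(p_or)                                    # P(sunny OR cavity)
```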

  17. Prior probability – continuous variables • In general, express the probability distribution as a parameterized function of value, called a probability density function (pdf). • E.g., P(X = x) = U[18, 26](x) = 0.125 • P is a density; it integrates to 1. [Figure: uniform density of height 0.125 between x = 18 and x = 26]
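A minimal sketch of the uniform density from the example:

```python
def uniform_pdf(x, a=18.0, b=26.0):
    """Density of U[a, b]: 1/(b - a) inside the interval, 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0

print(uniform_pdf(20.5))  # 0.125, matching the slide
# The density integrates to 1: 0.125 * (26 - 18) = 1.0
```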

  18. Example of the normal distribution, also called the Gaussian distribution. [Figure: Gaussian pdf, probabilities on the vertical axis vs. events on the horizontal axis; image from Wikipedia.org]
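For reference, the Gaussian density is easy to write down; a sketch with the standard parameterization (mean mu, standard deviation sigma):

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal (Gaussian) distribution N(mu, sigma^2)."""
    return (math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))
            / (sigma * math.sqrt(2 * math.pi)))

print(gaussian_pdf(0.0))  # peak of the standard normal, about 0.399
```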

  19. 3. • Interesting facts: • P(A) = 1 − P(¬A) • P(A) = P(A ∧ B) + P(A ∧ ¬B) for any B

  20. Conditional probability • Posterior or conditional probabilities, e.g., P(cavity | toothache) = 0.6, i.e., given that toothache is all I know, there is a 60% chance of cavity (not "if toothache then cavity"). Another example: P(toothache | cavity) = 0.8 • New evidence/observation/fact may be irrelevant, allowing simplification, e.g., when Weather is irrelevant, P(cavity | toothache, sunny) = P(cavity | toothache) = 0.6 • This kind of inference, sanctioned by domain knowledge, is crucial. • Very interesting examples: [Q] A stiff neck is one symptom of meningitis; what is the probability of having meningitis given a stiff neck? [Q] A car won't start. What is the probability of a bad starter? What is the probability of a bad battery?

  21. • Definition of conditional probability: P(a | b) = P(a ∧ b) / P(b) if P(b) > 0 • Product rule gives an alternative formulation: P(a ∧ b) = P(a) P(b | a) = P(b) P(a | b) • Chain rule is derived by successive application of the product rule: P(a ∧ b ∧ c) = ??? = P(a ∧ b) P(c | a ∧ b) = P(a) P(b | a) P(c | a ∧ b) • In general: P(X1, …, Xn) = P(X1, …, Xn−1) P(Xn | X1, …, Xn−1) = P(X1, …, Xn−2) P(Xn−1 | X1, …, Xn−2) P(Xn | X1, …, Xn−1) = … = P(X1) P(X2 | X1) P(X3 | X1, X2) P(X4 | X1, X2, X3) … = Π_{i=1..n} P(Xi | X1, …, Xi−1) • Will be used in Inference later!
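A sketch that checks the chain rule numerically on a randomly generated joint distribution over three Boolean variables (the encoding is illustrative):

```python
import itertools
import random

# Build an arbitrary joint distribution P(X1, X2, X3) over Booleans.
random.seed(0)
weights = {w: random.random() for w in itertools.product([True, False], repeat=3)}
total = sum(weights.values())
joint = {w: p / total for w, p in weights.items()}   # normalize to sum to 1

def P(pred):
    return sum(p for w, p in joint.items() if pred(w))

# Verify P(x1, x2, x3) = P(x1) P(x2 | x1) P(x3 | x1, x2) entry by entry.
for (a, b, c), p in joint.items():
    p1 = P(lambda w: w[0] == a)                          # P(x1)
    p12 = P(lambda w: w[0] == a and w[1] == b)           # P(x1, x2)
    assert abs(p - p1 * (p12 / p1) * (p / p12)) < 1e-12
```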

  22. 3. • Interesting fact: • P(A | B) = 1 − P(¬A | B)

  23. 4. Inference by Enumeration • Start with the joint probability distribution over Toothache (I have a toothache), Cavity (meaning?), and Catch (the probe catches?):

              toothache           ¬toothache
              catch    ¬catch     catch    ¬catch
cavity        0.108    0.012      0.072    0.008
¬cavity       0.016    0.064      0.144    0.576

• [Q] What are the random variables here? • [Q] What is the sum of the probabilities of all the atomic events? • [Q] How can you obtain the above probability distribution? • For any proposition φ, the probability is the sum of the atomic events where it is true: P(φ) = Σω:ω╞φ P(ω)

  24. Start with the joint probability distribution over Toothache, Cavity, Catch. • For any proposition φ, sum the atomic events where it is true: P(φ) = Σω:ω╞φ P(ω) • [Q] P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2 • [Q] P(cavity ∨ toothache) = ??? • [Q] P(cavity ∧ toothache) = ???

  25. Start with the joint probability distribution over Toothache, Cavity, Catch. • Can also compute conditional probabilities: Very important! P(¬cavity | toothache) = P(¬cavity ∧ toothache) / P(toothache) = (0.016 + 0.064) / (0.108 + 0.012 + 0.016 + 0.064) = 0.4 • [Q] P(cavity | toothache) = ??? P(cavity | ¬toothache) = ??? • [Q] P(cavity | catch) = ??? • [Q] P(cavity | toothache ∧ catch) = ???

  26. The denominator can be viewed as a normalization constant α: P(Cavity | toothache) = P(Cavity, toothache) / P(toothache) = α P(Cavity, toothache) = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)] = α [<0.108, 0.016> + <0.012, 0.064>] = α <0.12, 0.08> = <0.6, 0.4> [Q] What is α? • General idea: compute the distribution on the query variable by fixing the evidence variables and summing over the hidden variables.
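The same normalization, written out as a short sketch:

```python
# Inference by enumeration with normalization, following the slide:
# sum out the hidden variable Catch, then normalize by alpha.
unnormalized = [0.108 + 0.012, 0.016 + 0.064]   # <0.12, 0.08>
alpha = 1.0 / sum(unnormalized)                  # alpha = 1 / P(toothache)
posterior = [alpha * p for p in unnormalized]
print(posterior)                                 # <0.6, 0.4> = P(Cavity | toothache)
```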

  27. • Obvious problems: For n random variables, • worst-case time complexity O(d^n), where d is the largest arity • space complexity O(d^n) to store the joint distribution • how to find the numbers for the O(d^n) entries? • In other words, too much. • [Q] How to reduce?

  28. 5. Independence & Bayes' Rule • Independence • Conditional independence • Bayes' rule

  29. 5. Independence • A and B are independent iff P(A | B) = P(A) or P(B | A) = P(B) or P(A, B) = P(A) P(B). P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) × P(Weather) • In the above example, 32 (2×2×2×4) entries are reduced to 12 (2×2×2 + 4); see the sketch below. • Absolute independence is powerful but rare. • Dentistry is a large field with hundreds of variables, none of which are independent. [Q] What do we have to do?
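A sketch of the saving: with absolute independence we store the two factors (8 + 4 = 12 numbers) and recompute any of the 32 joint entries on demand. The 8-entry dentistry joint below is the one from slide 23 and the Weather prior is from slide 15:

```python
# Factor 1: P(Toothache, Catch, Cavity), keyed (toothache, catch, cavity).
p_tcc = {
    (True, True, True): 0.108,   (True, False, True): 0.012,
    (False, True, True): 0.072,  (False, False, True): 0.008,
    (True, True, False): 0.016,  (True, False, False): 0.064,
    (False, True, False): 0.144, (False, False, False): 0.576,
}
# Factor 2: P(Weather).
p_weather = {"sunny": 0.72, "rainy": 0.1, "cloudy": 0.08, "snow": 0.1}

def joint(toothache, catch, cavity, weather):
    """Reassemble a full joint entry from the two independent factors."""
    return p_tcc[(toothache, catch, cavity)] * p_weather[weather]

print(joint(True, True, True, "sunny"))  # 0.108 * 0.72
```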

  30. Conditional Independence • If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache: (1) P(catch | toothache ∧ cavity) = P(catch | cavity) • The same independence holds if I haven't got a cavity: (2) P(catch | toothache ∧ ¬cavity) = P(catch | ¬cavity) • Catch is conditionally independent of Toothache given Cavity: P(Catch | Toothache, Cavity) = P(Catch | Cavity) • When Catch is conditionally independent of Toothache given Cavity, equivalent statements: P(Toothache | Catch, Cavity) = P(Toothache | Cavity) P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity) (similarly P(A, B) = P(A) P(B) when A and B are independent)

  31. Write out the full joint distribution using the chain rule: P(Toothache, Catch, Cavity) = P(Toothache | Catch, Cavity) P(Catch, Cavity) by the product rule = P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity) by ??? = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity) by ??? • 2 × 2 × 2 = 8 entries -> 2 + 2 + 1 numbers. How ??? [Figure: Bayes net with Cavity as the parent of Toothache and Catch] • [Q] P(toothache) from the tables below ??? = P(t ∧ cavity) + P(t ∧ ¬cavity) = P(t | c) P(c) + P(t | ¬c) P(¬c) = ??? • [Q] P(¬t | cavity) = ??? = 1 − P(t | c) = ???

  32. Write out the full joint distribution using the chain rule: P(Toothache, Catch, Cavity) = P(Toothache | Catch, Cavity) P(Catch, Cavity) = P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity) = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity) • In most cases, the use of conditional independence reduces the size of the representation of the joint distribution from exponential in n to linear in n; see the sketch below. • Conditional independence is our most basic and robust form of knowledge about uncertain environments. [Figure: Bayes net with Cavity as the parent of Toothache and Catch]
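A sketch of the reduction: the 8-entry joint is rebuilt from 2 + 2 + 1 = 5 numbers. The CPT values below are the ones implied by the joint table on slide 23:

```python
# The 5 stored numbers: P(Cavity) plus two 2-entry CPTs.
p_cavity = 0.2
p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache | Cavity)
p_catch_given     = {True: 0.9, False: 0.2}   # P(catch | Cavity)

def joint(toothache, catch, cavity):
    """P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)."""
    pc = p_cavity if cavity else 1 - p_cavity
    pt = p_toothache_given[cavity] if toothache else 1 - p_toothache_given[cavity]
    pk = p_catch_given[cavity] if catch else 1 - p_catch_given[cavity]
    return pc * pt * pk

print(joint(True, True, True))  # 0.2 * 0.6 * 0.9 = 0.108, matching the table
```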

  33. 5. • Interesting fact: • P(A, B | C) = P(A | C) P(B | C) when A and B are conditionally independent of each other given C. • Actually it is a definition. • P(toothache ∧ catch | cavity) = .108 / (.108 + .012 + .072 + .008) = .54 • P(toothache | cavity) = (.108 + .012) / .2 = .12 / .2 = .6 • P(catch | cavity) = (.108 + .072) / .2 = .9 • P(toothache | cavity) P(catch | cavity) = .6 × .9 = .54 • Hence P(toothache ∧ catch | cavity) = P(toothache | cavity) P(catch | cavity) • … • Therefore Toothache and Catch are conditionally independent given Cavity. This is how to show conditional independence.

  34. Bayes' Rule • Product rule: P(a ∧ b) = P(a | b) P(b) = P(b | a) P(a). Bayes' rule, also called Bayes' theorem: P(a | b) = P(b | a) P(a) / P(b) • Or in distribution form: P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y) • Useful for assessing diagnostic probability from causal probability: • P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect) • [Q] A stiff neck is one symptom of meningitis; what is the probability of having meningitis given a stiff neck? • [Q] A car won't start. What is the probability of a bad starter? What is the probability of a bad battery?
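A minimal sketch of Bayes' rule as a function; the numbers in the usage line are hypothetical, not the slide's quiz answers:

```python
def bayes(p_effect_given_cause, p_cause, p_effect):
    """Diagnostic from causal: P(Cause | Effect) via Bayes' rule."""
    return p_effect_given_cause * p_cause / p_effect

# Hypothetical numbers for illustration only:
print(bayes(p_effect_given_cause=0.9, p_cause=0.01, p_effect=0.1))  # 0.09
```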

  35. Useful for assessing diagnostic probability from causal probability: • P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect) • [Q] When there is a cavity, there is usually a toothache (causal): what is P(cavity | toothache) (diagnostic)? • [Q] E.g., let M be meningitis and S be stiff neck; a stiff neck is one symptom of meningitis. The probability of having meningitis given a stiff neck? P(s) = 0.1; P(m) = 0.0001; P(s | m) = 0.8 P(m | s) = ??? • [Q] How to find P(s | m), P(m) and P(s)??? • Note: the posterior probability of meningitis is still very small!

  36. Useful for assessing diagnostic probability from causal probability: • P(Cause | Effect) = P(Effect | Cause) P(Cause) / P(Effect) • [Q] E.g., let B be 'bad battery' and N be 'won't start on a cold morning'; N is one symptom of B. The probability of a bad battery when a car won't start on a cold morning? P(n) = 0.1; P(b) = 0.05; P(n | b) = 0.8 P(b | n) = ???

  37. 5. Bayes' rule and conditional independence • P(Cavity | toothache ∧ catch) = α P(toothache ∧ catch | Cavity) P(Cavity) = α P(toothache | Cavity) P(catch | Cavity) P(Cavity) • This is an example of a naïve Bayes model: P(Cause, Effect1, …, Effectn) = P(Cause) Πi P(Effecti | Cause), where the Effects are conditionally independent of each other given Cause; see the sketch below. • We will use this later again!
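A sketch of the naïve Bayes computation on the dentistry example, using the conditional probabilities implied by the joint table on slide 23 and the normalization constant α from slide 26:

```python
# Naive Bayes: P(Cavity | toothache, catch) is proportional to
# P(Cavity) * P(toothache | Cavity) * P(catch | Cavity).
p_cavity          = {True: 0.2, False: 0.8}
p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache | Cavity)
p_catch_given     = {True: 0.9, False: 0.2}   # P(catch | Cavity)

unnormalized = {c: p_cavity[c] * p_toothache_given[c] * p_catch_given[c]
                for c in (True, False)}
alpha = 1.0 / sum(unnormalized.values())
posterior = {c: alpha * p for c, p in unnormalized.items()}
print(posterior)  # P(Cavity | toothache AND catch)
```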

  38. 6. Summary • Probability is a rigorous formalism for uncertain knowledge. • The joint probability distribution for discrete random variables specifies the probability of every atomic event. • Queries can be answered by summing over atomic events. • For nontrivial domains, we must find a way to reduce the joint size. • Independence and conditional independence provide the tools.
