Outline [AIMA Ch. 13]
• Uncertainty
• Probability
• Syntax and Semantics
• Inference
• Independence and Bayes' Rule
Uncertainty
• Let action A_t = leave for airport t minutes before flight
• Will A_t get me there on time?
• Problems:
  1) partial observability (road state, other drivers' plans, etc.)
  2) noisy sensors (KCBS traffic reports)
  3) uncertainty in action outcomes (flat tire, etc.)
  4) immense complexity of modeling and predicting traffic
• Hence a purely logical approach either
  1) risks falsehood: "A_25 will get me there on time"
  or 2) leads to conclusions that are too weak for decision making: "A_25 will get me there on time if there's no accident on the bridge and it doesn't rain and my tires remain intact etc., etc."
• (A_1440 might reasonably be said to get me there on time, but I'd have to stay overnight in the airport…)
Probability
• Probabilistic assertions summarize effects of
  • laziness: failure to enumerate exceptions, qualifications, etc.
  • ignorance: lack of relevant facts, initial conditions, etc.
• Subjective or Bayesian probability:
  • Probabilities relate propositions to one's own state of knowledge, e.g., [example formula missing from the source]
  • These are not claims of a "probabilistic tendency" in the current situation (but might be learned from past experience of similar situations)
• Probabilities of propositions change with new evidence, e.g., [example formula missing from the source]
• (Analogous to logical entailment status KB ⊨ α, not truth.)
Making decisions under uncertainty
• Suppose I believe the following: [list of probabilities missing from the source]
• Which action to choose?
• Depends on my preferences for missing flight vs. airport cuisine, etc.
• Utility theory is used to represent and infer preferences
• Decision theory = utility theory + probability theory
Propositions
• Think of a proposition as the event (set of sample points) where the proposition is true
• Given Boolean random variables A and B:
  • event a = set of sample points where A(ω) = true
  • event ¬a = set of sample points where A(ω) = false
  • event a ∧ b = points where A(ω) = true and B(ω) = true
• Often in AI applications, the sample points are defined by the values of a set of random variables, i.e., the sample space is the Cartesian product of the ranges of the variables
• With Boolean variables, sample point = propositional logic model, e.g., A = true, B = false, or a ∧ ¬b
• Proposition = disjunction of atomic events in which it is true, e.g., (a ∨ b) ≡ (¬a ∧ b) ∨ (a ∧ ¬b) ∨ (a ∧ b)
• ⇒ P(a ∨ b) = P(¬a ∧ b) + P(a ∧ ¬b) + P(a ∧ b)
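The event view of propositions can be sketched in a few lines of Python. The four sample-point probabilities below are made up for illustration (they are not from the slides); the helper name `prob` is likewise hypothetical.

```python
from itertools import product

# Sample points for two Boolean variables A and B: every (A, B) assignment.
sample_points = list(product([True, False], repeat=2))

# ASSUMED probabilities for the four sample points (illustrative; they sum to 1).
P = {
    (True, True): 0.3,
    (True, False): 0.2,
    (False, True): 0.4,
    (False, False): 0.1,
}

def prob(proposition):
    """Probability of a proposition = sum over the sample points where it is true."""
    return sum(P[w] for w in sample_points if proposition(*w))

# P(a or b) computed directly from the event "a or b".
p_a_or_b = prob(lambda a, b: a or b)

# The same quantity as a sum over the three atomic events in which a or b holds.
p_disjuncts = (prob(lambda a, b: (not a) and b)
               + prob(lambda a, b: a and (not b))
               + prob(lambda a, b: a and b))

print(p_a_or_b, p_disjuncts)
```

Both numbers agree up to floating-point rounding, which is the identity on the last bullet of the slide.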
Why use probability?
• The definitions imply that certain logically related events must have related probabilities
• E.g., P(a ∨ b) = P(a) + P(b) − P(a ∧ b)
• de Finetti (1931): an agent who bets according to probabilities that violate these axioms can be forced to bet so as to lose money regardless of outcome.
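As a short worked derivation, the inclusion-exclusion formula follows from writing each event as a disjunction of atomic events, as on the Propositions slide:

```latex
\begin{align*}
P(a) &= P(a \land b) + P(a \land \lnot b) \\
P(b) &= P(a \land b) + P(\lnot a \land b) \\
P(a) + P(b) - P(a \land b)
  &= P(a \land \lnot b) + P(\lnot a \land b) + P(a \land b) \\
  &= P(a \lor b)
\end{align*}
```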
Syntax for propositions
• Propositional or Boolean random variables, e.g., Cavity (do I have a cavity?)
  • Cavity = true is a proposition, also written cavity
• Discrete random variables (finite or infinite), e.g., Weather is one of (sunny, rain, cloudy, snow)
  • Weather = rain is a proposition
  • Values must be exhaustive and mutually exclusive
• Continuous random variables (bounded or unbounded), e.g., Temp = 21.6; also allow, e.g., Temp < 22.0
• Arbitrary Boolean combinations of basic propositions
Prior probability
• Prior or unconditional probabilities of propositions, e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72, correspond to belief prior to arrival of any (new) evidence
• Probability distribution gives values for all possible assignments:
  P(Weather) = ⟨0.72, 0.1, 0.08, 0.1⟩ (normalized, i.e., sums to 1)
• Joint probability distribution for a set of r.v.s gives the probability of every atomic event on those r.v.s (i.e., every sample point)
  P(Weather, Cavity) = a 4 × 2 matrix of values: [table missing from the source]
• Every question about a domain can be answered by the joint distribution because every event is a sum of sample points
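To make the 4 × 2 joint concrete, here is a minimal sketch that builds one from the two priors quoted on the slide. It assumes Weather and Cavity are independent so that each joint entry is a product; the slide's own table is not available, so these entries are an illustration, not a reconstruction of it.

```python
# P(Weather) as given on the slide, in the order (sunny, rain, cloudy, snow).
weather_values = ["sunny", "rain", "cloudy", "snow"]
p_weather = dict(zip(weather_values, [0.72, 0.1, 0.08, 0.1]))

# P(Cavity = true) = 0.1 as given on the slide.
p_cavity = {True: 0.1, False: 0.9}

# ASSUMPTION: Weather and Cavity are independent, so P(w, c) = P(w) * P(c).
joint = {(w, c): p_weather[w] * p_cavity[c]
         for w in weather_values for c in (True, False)}

# Any question about the domain is a sum of sample points of the joint.
total = sum(joint.values())
marg_weather = {w: sum(joint[(w, c)] for c in (True, False))
                for w in weather_values}
marg_cavity_true = sum(joint[(w, True)] for w in weather_values)

for w in weather_values:
    print(w, joint[(w, True)], joint[(w, False)])
```

Summing rows or columns of `joint` recovers the two priors, which is the sense in which the joint answers every query about the domain.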
Conditional probability
• Conditional or posterior probabilities, e.g., P(cavity | toothache) = 0.8
  i.e., given that toothache is all I know
  NOT "if toothache then 80% chance of cavity"
• (Notation for conditional distributions: P(Cavity | Toothache) = 2-element vector of 2-element vectors)
• If we know more, e.g., cavity is also given, then we have P(cavity | toothache, cavity) = 1
• Note: the less specific belief remains valid after more evidence arrives, but is not always useful
• New evidence may be irrelevant, allowing simplification, e.g., P(cavity | toothache, TeamWon) = P(cavity | toothache) = 0.8
• This kind of inference, sanctioned by domain knowledge, is crucial
Inference by enumeration, contd.
• Let X be all the variables. Typically, we want the posterior joint distribution of the query variables Y given specific values e for the evidence variables E
• Let the hidden variables be H = X − Y − E
• Then the required summation of joint entries is done by summing out the hidden variables:
  P(Y | E = e) = α P(Y, E = e) = α Σ_h P(Y, E = e, H = h), where α is a normalization constant
• The terms in the summation are joint entries because Y, E, and H together exhaust the set of random variables
• Obvious problems:
  • Worst-case time complexity O(d^n), where d is the largest arity
  • Space complexity O(d^n) to store the joint distribution
  • How to find the numbers for O(d^n) entries???
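The summing-out procedure can be sketched as follows. The joint table over (Toothache, Catch, Cavity) is an assumed set of illustrative numbers, not taken from the slides, and it is not calibrated to the 0.8 figure quoted on the Conditional probability slide; `enumerate_query` is a hypothetical helper name.

```python
# ASSUMED illustrative joint over (Toothache, Catch, Cavity); entries sum to 1.
VARS = ("Toothache", "Catch", "Cavity")
JOINT = {
    (True, True, True): 0.108, (True, False, True): 0.012,
    (False, True, True): 0.072, (False, False, True): 0.008,
    (True, True, False): 0.016, (True, False, False): 0.064,
    (False, True, False): 0.144, (False, False, False): 0.576,
}

def enumerate_query(query_var, evidence):
    """Return P(query_var | evidence) by summing joint entries over hidden variables."""
    unnormalized = {}
    for value in (True, False):
        total = 0.0
        for point, p in JOINT.items():
            assignment = dict(zip(VARS, point))
            if assignment[query_var] != value:
                continue  # wrong value of the query variable
            if any(assignment[var] != val for var, val in evidence.items()):
                continue  # inconsistent with the evidence e
            total += p  # every remaining (hidden) assignment is summed out
        unnormalized[value] = total
    alpha = 1.0 / sum(unnormalized.values())  # normalization constant
    return {value: alpha * p for value, p in unnormalized.items()}

posterior = enumerate_query("Cavity", {"Toothache": True})
print(posterior)
```

The loop touches every entry of the joint for each query value, which is the O(d^n) cost noted under "Obvious problems".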
Independence
• A and B are independent iff
  P(A | B) = P(A) or P(B | A) = P(B) or P(A, B) = P(A) P(B)
• P(Toothache, Catch, Cavity, Weather) = P(Toothache, Catch, Cavity) P(Weather)
• 32 entries reduced to 12; for n independent biased coins, 2^n → n
• Absolute independence is powerful but rare
• Dentistry is a large field with hundreds of variables, none of which are independent. What to do?
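The entry counts on this slide can be checked with a small sketch. The domain sizes follow the slides (three Boolean variables, four weather values); the function name `coin_table_sizes` is hypothetical.

```python
from math import prod

# Domain sizes: three Boolean dental variables and four-valued Weather.
dental = {"Toothache": 2, "Catch": 2, "Cavity": 2}
weather = {"Weather": 4}

# Full joint: one entry per combination of all four variables.
full_joint_entries = prod(dental.values()) * prod(weather.values())

# Factored form: one table for the dental variables plus one for Weather.
factored_entries = prod(dental.values()) + prod(weather.values())

def coin_table_sizes(n):
    """Entries for n biased coins: full joint table vs. one number per independent coin."""
    return 2 ** n, n

print(full_joint_entries, factored_entries)  # 32 12
print(coin_table_sizes(10))                  # (1024, 10)
```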
Conditional Independence
• P(Toothache, Cavity, Catch) has 2^3 − 1 = 7 independent entries
• If I have a cavity, the probability that the probe catches in it doesn't depend on whether I have a toothache:
  (1) P(catch | toothache, cavity) = P(catch | cavity)
• The same independence holds if I haven't got a cavity:
  (2) P(catch | toothache, ¬cavity) = P(catch | ¬cavity)
• Catch is conditionally independent of Toothache given Cavity:
  P(Catch | Toothache, Cavity) = P(Catch | Cavity)
• Equivalent statements:
  P(Toothache | Catch, Cavity) = P(Toothache | Cavity)
  P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity)
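Equations (1) and (2) can be checked numerically against a joint table. The table below is an assumed illustrative joint (not given in the slides) chosen so that the independence holds; `p` and `cond` are hypothetical helper names.

```python
# ASSUMED illustrative joint, keyed by (toothache, catch, cavity).
JOINT = {
    (True, True, True): 0.108, (True, False, True): 0.012,
    (False, True, True): 0.072, (False, False, True): 0.008,
    (True, True, False): 0.016, (True, False, False): 0.064,
    (False, True, False): 0.144, (False, False, False): 0.576,
}

def p(event):
    """Probability of an event given as a predicate over (toothache, catch, cavity)."""
    return sum(pr for point, pr in JOINT.items() if event(*point))

def cond(event, given):
    """Conditional probability P(event | given) = P(event and given) / P(given)."""
    return p(lambda t, c, v: event(t, c, v) and given(t, c, v)) / p(given)

checks = []
for cavity in (True, False):
    # Left side: P(catch | toothache, Cavity = cavity)
    lhs = cond(lambda t, c, v: c, lambda t, c, v: t and v == cavity)
    # Right side: P(catch | Cavity = cavity)
    rhs = cond(lambda t, c, v: c, lambda t, c, v: v == cavity)
    checks.append((cavity, lhs, rhs))

for cavity, lhs, rhs in checks:
    print(cavity, round(lhs, 6), round(rhs, 6))
```

For this table the two sides match for both values of Cavity, so knowing Toothache adds nothing about Catch once Cavity is known.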
Conditional independence contd.
• Write out full joint distribution using chain rule:
  P(Toothache, Catch, Cavity)
  = P(Toothache | Catch, Cavity) P(Catch, Cavity)
  = P(Toothache | Catch, Cavity) P(Catch | Cavity) P(Cavity)
  = P(Toothache | Cavity) P(Catch | Cavity) P(Cavity)
• I.e., 2 + 2 + 1 = 5 independent numbers (equations 1 and 2 remove 2)
• In most cases, the use of conditional independence reduces the size of the representation of the joint distribution from exponential in n to linear in n.
• Conditional independence is our most basic and robust form of knowledge about uncertain environments.
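The factorization can be sketched by rebuilding all eight joint entries from five numbers. The five parameter values are assumptions chosen for illustration; the slides state only that five numbers suffice. The helper `bern` is hypothetical.

```python
from itertools import product

# ASSUMED illustrative parameters: 1 + 2 + 2 = 5 independent numbers.
p_cavity = 0.2                                # P(cavity)
p_toothache_given = {True: 0.6, False: 0.1}   # P(toothache | Cavity)
p_catch_given = {True: 0.9, False: 0.2}       # P(catch | Cavity)

def bern(p_true, value):
    """Probability that a Boolean variable takes `value` when P(true) = p_true."""
    return p_true if value else 1.0 - p_true

# P(T, C, V) = P(T | V) * P(C | V) * P(V) for every assignment (t, c, v).
joint = {
    (t, c, v): bern(p_toothache_given[v], t)
               * bern(p_catch_given[v], c)
               * bern(p_cavity, v)
    for t, c, v in product((True, False), repeat=3)
}

print(len(joint), sum(joint.values()))
```

Adding another effect variable that depends only on Cavity would add two parameters rather than doubling the table, which is the exponential-to-linear reduction the slide describes.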
Bayes' Rule and conditional independence
• P(Cavity | toothache ∧ catch)
  = α P(toothache ∧ catch | Cavity) P(Cavity)
  = α P(toothache | Cavity) P(catch | Cavity) P(Cavity)
  where α is a normalization constant
• This is an example of a naive Bayes model:
  P(Cause, Effect_1, ..., Effect_n) = P(Cause) Π_i P(Effect_i | Cause)
• Total number of parameters is linear in n
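A minimal sketch of the naive Bayes computation, assuming Boolean effects. The prior and likelihood values are illustrative assumptions, not figures from the slides, and `naive_bayes_posterior` is a hypothetical function name.

```python
def naive_bayes_posterior(prior, likelihoods, observed):
    """Posterior over causes under a naive Bayes model with Boolean effects.

    prior:       {cause: P(cause)}
    likelihoods: {effect: {cause: P(effect = true | cause)}}
    observed:    {effect: observed Boolean value}
    """
    scores = {}
    for cause, p_cause in prior.items():
        score = p_cause
        for effect, value in observed.items():
            p_true = likelihoods[effect][cause]
            score *= p_true if value else 1.0 - p_true
        scores[cause] = score  # P(cause) * prod_i P(effect_i | cause)
    alpha = 1.0 / sum(scores.values())  # normalization constant
    return {cause: alpha * s for cause, s in scores.items()}

# ASSUMED illustrative numbers for the dentist example.
prior = {"cavity": 0.2, "no_cavity": 0.8}
likelihoods = {
    "toothache": {"cavity": 0.6, "no_cavity": 0.1},
    "catch": {"cavity": 0.9, "no_cavity": 0.2},
}
posterior = naive_bayes_posterior(
    prior, likelihoods, {"toothache": True, "catch": True})
print(posterior)
```

Each additional effect contributes one likelihood entry per cause, so the parameter count grows linearly with the number of effects.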
Wumpus World
• P_ij = true iff [i, j] contains a pit
• B_ij = true iff [i, j] is breezy
• Include only [variable list missing from the source] in the probability model
Using conditional independence
• Basic insight: observations are conditionally independent of other hidden squares given neighboring hidden squares
• Define Unknown = Fringe ∪ Other
• Manipulate query into a form where we can use this!
Summary
• Probability is a rigorous formalism for uncertain knowledge
• Joint probability distribution specifies probability of every atomic event
• Queries can be answered by summing over atomic events
• For nontrivial domains, we must find a way to reduce the joint size
• Independence and conditional independence provide the tools