• 400 likes • 560 Views
QUIZ!!. T/F: Forward sampling is consistent. True T/F: Rejection sampling is faster, but inconsistent. False T/F: Rejection sampling requires less memory than Forward sampling. True T/F: In likelihood weighted sampling you make use of every single sample. True
E N D
QUIZ!! • T/F: Forward sampling is consistent. True • T/F: Rejection sampling is faster, but inconsistent. False • T/F: Rejection sampling requires less memory than Forward sampling. True • T/F: Inlikelihood weighted sampling you make use of every single sample. True • T/F: Forward sampling samples from the joint distribution. True • T/F: L.W.S. only speeds up inference for r.v.’s downstream of evidence. True • T/F: Too few samples could result in division by zero. True xkcd
CSE511a: Artificial IntelligenceFall 2013 Lecture 18: Decision Diagrams 04/03/2013 Robert Pless, via Kilian Q. Weinberger Several slides adapted from Dan Klein – UC Berkeley
Announcements Class Monday April 8 cancelled in lieu of talk by Sanmay Das: “Bayesian Reinforcement Learning With Censored Observations and Applications in Market Modeling and Design” Friday, April 5, 11-12 Lopata 101.
Sampling Example • There are 2 cups. • The first contains 1 penny and 1 quarter • The second contains 2 quarters • Say I pick a cup uniformly at random, then pick a coin randomly from that cup. It's a quarter (yes!). What is the probability that the other coin in that cup is also a quarter?
Cloudy C S R W Recap: Likelihood Weighting • Sampling distribution if z sampled and e fixed evidence • Now, samples have weights • Together, the weighted sampling distribution is consistent
Likelihood Weighting Cloudy Cloudy Sprinkler Sprinkler Rain Rain Samples: WetGrass WetGrass +c, +s, +r, +w …
Likelihood Weighting Example • Inference: • Sum over weights that match query value • Divide by total sample weight • What is P(C|+w,+s)?
Cloudy C S Rain R W Likelihood Weighting • Likelihood weighting is good • Takes evidence into account as we generate the sample • At right: W’s value will get picked based on the evidence values of S, R • More samples will reflect the state of the world suggested by the evidence • Likelihood weighting doesn’t solve all our problems • Evidence influences the choice of downstream variables, but not upstream ones (C isn’t more likely to get a value matching the evidence) • We would like to consider evidence when we sample every variable
Markov Chain Monte Carlo* • Idea: instead of sampling from scratch, create samples that are each like the last one. • Procedure: resample one variable at a time, conditioned on all the rest, but keep evidence fixed. E.g., for P(b|c): • Properties: Now samples are not independent (in fact they’re nearly identical), but sample averages are still consistent estimators! • What’s the point: both upstream and downstream variables condition on evidence. -b +a +c -b -a +c +b +a +c
Gibbs Sampling • Set all evidence E1,...,Em to e1,...,em • Do forward sampling t obtain x1,...,xn • Repeat: • Pick any variable Xi uniformly at random. • Resample xi’ from p(Xi | markovblanket(Xi)) • Set all other xj’=xj • The new sample is x1’,..., xn’
Markov Blanket • Markov blanket of X: • All parents of X • All children of X • All parents of children of X (except X itself) X X is conditionally independent from all other variables in the BN, given all variables in the markov blanket (besides X).
Summary • Sampling can be your salvation • Dominant approach to inference in BNs • Pros/Conts: • Forward Sampling • Rejection Sampling • Likelihood Weighted Sampling • Gibbs Sampling/MCMC
Google’s PageRank http://en.wikipedia.org/wiki/Markov_chain Page, Lawrence; Brin, Sergey; Motwani, Rajeev and Winograd, Terry (1999). The PageRank citation ranking: Bringing order to the Web. See also: J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
Google’s PageRank H Graph of the Internet (pages and links) A E I D B C F J G
Google’s PageRank H Start at a random page, take a random walk. Where do we end up? A E I D B C F J G
Google’s PageRank H Add 15% probability of moving to a random page. Now where do we end up? A E I D B C F J G
Google’s PageRank H PageRank(P) = Probability that a long random walk ends at node P A E I D B C F J G
Where do the CPTs come from? • How do you build a Bayes Net to begin with? • Two scenarios: • Complete data • Incomplete data --- touch on, you are not responsible for this.
Learning from complete data • Observe data from real world: • Estimate probabilities by counting: • (Just like rejection sampling, except we observe the real world.) (Indicator function I(a,b)=1 if and only if a=b.)
Learning from incomplete data • Observe real-world data • Some variables are unobserved • Trick: Fill them in • EM algorithm • E=Expectation • M=Maximization
EM-Algorithm • Initialize CPTs (e.g. randomly, uniformly) • Repeat until convergence • E-Step: • Compute for all i and x (inference) • M-Step: • Update CPT tables • (Just like likelihood-weighted sampling, but on real-world data.)
EM-Algorithm • Properties: • No learning rate • Monotonic convergence • Each iteration improves log-likelihood of data • Outcome depends on initialization • Can be slow for large Bayes Nets ...
U Decision Networks • MEU: choose the action which maximizes the expected utility given the evidence • Can directly operationalize this with decision networks • Bayes nets with nodes for utility and actions • Lets us calculate the expected utility for each action • New node types: • Chance nodes (just like BNs) • Actions (rectangles, cannot have parents, act as observed evidence) • Utility node (diamond, depends on action and chance nodes) Umbrella Weather Forecast
U Decision Networks • Action selection: • Instantiate all evidence • Set action node(s) each possible way • Calculate posterior for all parents of utility node, given the evidence • Calculate expected utility for each action • Choose maximizing action Umbrella Weather Forecast
U Example: Decision Networks Umbrella = leave Umbrella Weather Umbrella = take Optimal decision = leave
U(t,s) U(t,r) U(l,s) U(l,r) Decisions as Outcome Trees • Almost exactly like expectimax / MDPs {} take leave Weather Weather sun sun rain rain
U Evidence in Decision Networks • Find P(W|F=bad) • Select for evidence • First we join P(W) and P(bad|W) • Then we normalize Umbrella Weather Forecast
U Example: Decision Networks Umbrella Umbrella = leave Weather Umbrella = take Forecast =bad Optimal decision = take
U Value of Information • Idea: compute value of acquiring evidence • Can be done directly from decision network • Example: buying oil drilling rights • Two blocks A and B, exactly one has oil, worth k • You can drill in one location • Prior probabilities 0.5 each, & mutually exclusive • Drilling in either A or B has MEU = k/2 • Question: what’s the value of information? • Value of knowing which of A or B has oil • Value is expected gain in MEU from new info • Survey may say “oil in a” or “oil in b,” prob 0.5 each • If we know OilLoc, MEU is k (either way) • Gain in MEU from knowing OilLoc? • VPI(OilLoc) = k/2 • Fair price of information: k/2 DrillLoc OilLoc
Value of Information {e} • Assume we have evidence E=e. Value if we act now: • Assume we see that E’ = e’. Value if we act then: • BUT E’ is a random variable whose value is unknown, so we don’t know what e’ will be • Expected value if E’ is revealed and then we act: • Value of information: how much MEU goes up by revealing E’ first: VPI == “Value of perfect information” a P(s | e) U {e, e’} a P(s | e, e’) U {e} P(e’| e) {e, e’}
U VPI Example: Weather MEU with no evidence Umbrella MEU if forecast is bad Weather MEU if forecast is good Forecast Forecast distribution
VPI Properties • Nonnegative • Nonadditive ---consider, e.g., obtaining Ej twice • Order-independent
Quick VPI Questions • The soup of the day is either clam chowder or split pea, but you wouldn’t order either one. What’s the value of knowing which it is? • There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What’s the value of knowing which? • You have $10 to bet double-or-nothing and there is a 75% chance that the Rams will beat the 49ers. What’s the value of knowing the outcome in advance? • You must bet on the Rams, either way. What’s the value now?
That’s all • Next Lecture • Hidden Markov Models! • (You gonnalove it!!) • Reminder, next lecture is a week from today (on Wednesday). Go to talk by Sanmay Das, Lopata 101, Friday 11-12.