Graphical Models for Machine Learning and Computer Vision
Statistical Models • Statistical models describe observed 'DATA' via an assumed likelihood p(DATA | θ), with θ denoting the 'parameters' needed to describe the data. • Likelihoods measure how likely what was observed was. They implicitly assume an error mechanism (in the translation between what was observed and what was 'supposed' to be observed). • Parameters may describe model features or even specify different models.
An Example of a Statistical Model • A burglar alarm is affected by both earthquakes and burglaries. It has a mechanism to communicate with the homeowner if activated. It went off at Judea Pearl's house one day. Should he: • a) immediately call the police under suspicion that a burglary took place, or • b) go home and immediately transfer his valuables elsewhere?
A Statistical Analysis • Observation: the burglar alarm went off (i.e., a = 1); • Parameter 1: the presence or absence of an earthquake (i.e., e = 1, 0); • Parameter 2: the presence or absence of a burglary at Judea's house (i.e., b = 1, 0).
LIKELIHOODS/PRIORS IN THIS CASE • The likelihood associated with the observation is p(a = 1 | b, e), with b, e = 0, 1 (depending on whether a burglary or earthquake has taken place). • The priors p(b) and p(e) specify the probabilities of a burglary or earthquake happening.
Example Probabilities • [Table of numerical values for the likelihood p(a = 1 | b, e) and the priors p(b = 1) and p(e = 1); they are interpreted qualitatively on the next slide.]
LIKELIHOOD/PRIOR INTERPRETATION • Burglaries are as likely (a priori) as earthquakes. • It is unlikely that the alarm just went off by itself. • The alarm goes off more often when a burglary happens but an earthquake does not than the reverse, i.e., when an earthquake happens but a burglary does not. • If both a burglary and an earthquake happen, then it is (virtually) twice as likely that the alarm will go off.
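The qualitative statements above can be made concrete with numbers. These values are illustrative assumptions (the original numeric table is not reproduced here), chosen only so that each of the four bullet points holds:

```python
# Illustrative priors and likelihood table (assumed values, not from
# the original slides), encoding the four interpretive statements.
p_b = 0.001            # prior probability of a burglary
p_e = 0.001            # burglaries as likely a priori as earthquakes

# p_alarm[(b, e)] = P(a = 1 | b, e):
#  - the alarm rarely goes off by itself,
#  - a burglary alone triggers it more often than an earthquake alone,
#  - both together make it (virtually) twice as likely as burglary alone.
p_alarm = {
    (0, 0): 0.001,
    (0, 1): 0.30,
    (1, 0): 0.50,
    (1, 1): 0.95,
}
```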
Probability Propagation Graph • Nodes B (burglary) and E (earthquake) each have a directed edge into node A (alarm); b, e, and a denote their values.
PROBABILITY PROPAGATION • There are two kinds of probability propagation (see Frey 1998): a) marginalization, e.g. p(a = 1) = Σ_{b,e} p(a = 1 | b, e) p(b) p(e), and b) multiplication, e.g. forming the product p(a = 1 | b, e) p(b) p(e). • Marginalization sums over terms leading into the node; • Multiplication multiplies over terms leading into the node.
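The two propagation steps can be sketched with hypothetical helper functions (`multiply` and `marginalize` are illustrative names, not from Frey 1998), operating on probability tables keyed by tuples of variable values:

```python
def multiply(f, g):
    """Multiplication step: pointwise product of two tables
    defined over the same tuple of variables."""
    return {key: f[key] * g[key] for key in f}

def marginalize(f, var):
    """Marginalization step: sum a table over the variable in
    position `var` of each key tuple, dropping that position."""
    out = {}
    for key, p in f.items():
        kept = key[:var] + key[var + 1:]
        out[kept] = out.get(kept, 0.0) + p
    return out
```

For the alarm graph, multiplying the likelihood table p(a = 1 | b, e) by the prior table p(b) p(e) and then marginalizing out both b and e yields p(a = 1).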
CAUSAL ANALYSIS • To analyze the causes of the alarm going off, we calculate the probability that it was a burglary, p(b = 1 | a = 1), and compare it with the probability that it was an earthquake, p(e = 1 | a = 1).
CAUSAL ANALYSIS II • So, after normalization: p(b = 1 | a = 1) = Σ_e p(a = 1 | b = 1, e) p(b = 1) p(e) / p(a = 1). • Similarly, p(e = 1 | a = 1) = Σ_b p(a = 1 | b, e = 1) p(b) p(e = 1) / p(a = 1). • So, if we had to choose between burglary and earthquake as a cause of making the alarm go off, we should choose burglary.
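The normalization above can be sketched in code. `alarm_posteriors` is a hypothetical helper; `p_alarm` is assumed to map (b, e) pairs to p(a = 1 | b, e), with scalar priors `p_b` and `p_e`:

```python
def alarm_posteriors(p_alarm, p_b, p_e):
    """Compute P(b = 1 | a = 1) and P(e = 1 | a = 1): form the
    joint P(a = 1, b, e), then normalize by P(a = 1)."""
    joint = {(b, e): p_alarm[(b, e)]
                     * (p_b if b else 1.0 - p_b)
                     * (p_e if e else 1.0 - p_e)
             for b in (0, 1) for e in (0, 1)}
    z = sum(joint.values())                    # P(a = 1)
    pb1 = (joint[(1, 0)] + joint[(1, 1)]) / z  # marginalize out e
    pe1 = (joint[(0, 1)] + joint[(1, 1)]) / z  # marginalize out b
    return pb1, pe1
```

With equal priors and any likelihood table matching the earlier interpretation, the burglary posterior exceeds the earthquake posterior.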
Markov Chain Monte Carlo for the Burglar Problem • For the current value e = e*, calculate p(b = 1 | e = e*, a = 1) ∝ p(a = 1 | b = 1, e*) p(b = 1), or p(b = 0 | e = e*, a = 1) ∝ p(a = 1 | b = 0, e*) p(b = 0). • Simulate b from this distribution. Call the result b*. Now calculate p(e = 1 | b = b*, a = 1) ∝ p(a = 1 | b*, e = 1) p(e = 1), or p(e = 0 | b = b*, a = 1) ∝ p(a = 1 | b*, e = 0) p(e = 0).
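This alternating scheme is a Gibbs sampler. A minimal sketch (the function name, the fixed number of sweeps, and the seed are assumptions made for illustration):

```python
import random

def gibbs_burglar(p_alarm, p_b, p_e, n_sweeps=20000, seed=0):
    """Gibbs sampling for p(b, e | a = 1): alternately resample b
    given the current e*, then e given the new b*."""
    rng = random.Random(seed)
    b, e = 0, 0
    b_ones = 0
    for _ in range(n_sweeps):
        # p(b = 1 | e = e*, a = 1) is proportional to p(a = 1 | b = 1, e*) p(b = 1)
        w1 = p_alarm[(1, e)] * p_b
        w0 = p_alarm[(0, e)] * (1.0 - p_b)
        b = 1 if rng.random() < w1 / (w0 + w1) else 0
        # p(e = 1 | b = b*, a = 1) is proportional to p(a = 1 | b*, e = 1) p(e = 1)
        w1 = p_alarm[(b, 1)] * p_e
        w0 = p_alarm[(b, 0)] * (1.0 - p_e)
        e = 1 if rng.random() < w1 / (w0 + w1) else 0
        b_ones += b
    return b_ones / n_sweeps  # Monte Carlo estimate of p(b = 1 | a = 1)
```

Averaging the sampled b values estimates the burglary posterior without ever computing the normalizing constant p(a = 1).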
Independent Hidden Variables: A Factorial Model • In statistical modeling it is often advantageous to treat variables which are not observed as 'hidden'. This means that they themselves have distributions. In our case, suppose b and e are independent hidden variables, so that their joint prior factors as p(b, e) = p(b) p(e). • Then, optimally, the posterior over (b, e) is approximated by a product of independent factors.
Nonfactorial Hidden Variable Models • Suppose b and e are dependent hidden variables, so that the joint prior p(b, e) does not factor into p(b) p(e). • Then a similar analysis yields a related result.
INFORMATION • The difference in information available from the parameters after observing the alarm versus before is Σ_{b,e} p(b, e | a = 1) log [ p(b, e | a = 1) / p(b, e) ]. • This is the Kullback-Leibler 'distance' between the prior and posterior distributions. • Parameters are chosen to optimize this distance.
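A sketch of the Kullback-Leibler computation (the function name is assumed; `q` is the posterior and `p` the prior, each a dict over parameter states):

```python
import math

def kl_divergence(q, p):
    """KL(q || p) = sum over states x of q(x) * log(q(x) / p(x)),
    here summed over the four (b, e) parameter states; terms with
    q(x) = 0 contribute nothing."""
    return sum(qx * math.log(qx / p[x]) for x, qx in q.items() if qx > 0)
```

Taking q as the posterior p(b, e | a = 1) and p as the prior p(b) p(e) gives the information gained by observing the alarm; the divergence is zero exactly when the observation changes nothing.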
INFORMATION IN THIS EXAMPLE • The information available in this example is the Kullback-Leibler distance between the prior over (b, e) and the posterior over (b, e) given a = 1, calculated by summing over the four (b, e) states.
Markov Random Fields • Markov random fields are simply graphical models set in a 2- or higher-dimensional field. Their fundamental criterion is that the distribution of a point x conditional on all of the points that remain (i.e., −x) is identical to its distribution given a neighborhood N of it, i.e., p(x | −x) = p(x | N(x)).
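The Markov property can be illustrated with a binary field on a lattice. A minimal, Ising-style sketch (the pairwise 'agreement' potential and the `beta` parameter are illustrative assumptions) in which the conditional for one pixel depends only on its four nearest neighbours:

```python
import math

def local_conditional(x, i, j, beta=1.0):
    """P(x[i][j] = 1 | all other pixels) for a binary Markov random
    field with 4-nearest-neighbour interactions: the conditional
    depends only on the neighbourhood N(i, j), not the whole field."""
    h, w = len(x), len(x[0])
    nbrs = [(a, b) for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
            if 0 <= a < h and 0 <= b < w]
    agree1 = sum(1 for a, b in nbrs if x[a][b] == 1)  # neighbours equal to 1
    agree0 = len(nbrs) - agree1                       # neighbours equal to 0
    w1 = math.exp(beta * agree1)  # unnormalized weight for x[i][j] = 1
    w0 = math.exp(beta * agree0)  # unnormalized weight for x[i][j] = 0
    return w1 / (w0 + w1)
```

A pixel surrounded by ones is very likely to be one; a pixel whose neighbours split evenly gets probability 1/2. Sweeping such updates over a noisy frame is the basis of the 'clean up' methodology described below.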
EXAMPLE OF A RANDOM FIELD • Modeling a video frame is typically done via a random field. Parameters identify our expectations of what the frame looks like. • We can ‘clean up’ video frames or related media using a methodology which distinguishes between what we expect and what was observed.
GENERALIZATION • This can be generalized to non-discrete likelihoods with non-discrete parameters. • More generally (sans data), assume that a movie is observed, consisting of many frames, each of which consists of grey-level pixel values over a lattice. We would like to 'detect' 'unnatural' events.
GENERALIZATION II • Assume a model for frame i (given frame i−1) taking the form p(frame_i | frame_{i−1}, θ). • The parameters θ typically denote invariant features for pictures of cars, houses, etc. • The presence or absence of unnatural events can be described by hidden variables. • The (frame) likelihood describes the natural evolution of the movie over time.
GENERALIZATION III • Parameters are estimated by optimizing the information they provide. This is accomplished by ‘summing or integrating over’ the hidden variables.