Understand Bayesian classification and GMM for structured classification, handwriting recognition, and spam filtering. Learn about the importance of prior knowledge and generative models in classification problems.
Bayesian Learning & Gaussian Mixture Models
Jianping Fan, Dept. of Computer Science, UNC-Charlotte
Basic Classification (Input → Output)
• Spam filtering (binary): input is an email such as "!!!!$$$!!!!", output is Spam vs. Not-Spam.
• Character recognition (multi-class): input is a character image, e.g. 'C', output is 'C' vs. the other 25 characters.
Structured Classification (Input → Output)
• Handwriting recognition: input is a handwritten word image, output is structured text, e.g. "brace".
• 3D object recognition: input is an image of a scene (e.g. a building and a tree), output is a structured description of the objects.
Overview of Bayesian Decision
• Bayesian classification: one example
• E.g. how to decide whether a patient is sick or healthy, based on:
• A probabilistic model of the observed data (data distributions)
• Prior knowledge (class ratio or importance)
Bayes’ Rule: Who is who in Bayes’ rule
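The rule itself, with the role of each term labelled (the slide shows it only as an image):

  P(h \mid d) = \frac{P(d \mid h)\, P(h)}{P(d)}
  \qquad \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}

Here h is a hypothesis (class) and d the observed data, matching the notation of the next slide.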
Classification problem
• Training data: examples of the form (d, h(d))
• where d is the data object to classify (the input)
• and h(d) is the correct class label for d, with h(d) ∈ {1, …, K}
• Goal: given a new object d_new, predict h(d_new)
Why Bayesian?
• Provides practical learning algorithms, e.g. Naïve Bayes
• Prior knowledge and observed data can be combined
• It is a generative (model-based) approach, which offers a useful conceptual framework
• E.g. sequences can also be classified, based on a probabilistic model specification
• Any kind of object can be classified, based on a probabilistic model specification
Univariate Normal Sample: draw an i.i.d. sample x1, …, xn from a normal distribution N(μ, σ²).
Maximum Likelihood: the likelihood of the sample is L(μ, σ² | x) = ∏i p(xi | μ, σ²); given x, it is a function of μ and σ², and we want to maximize it.
Log-Likelihood Function: maximize the log-likelihood l(μ, σ²) = log L(μ, σ² | x) instead; the maximum is found by setting ∂l/∂μ = 0 and ∂l/∂σ² = 0.
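The equations on these slides are not in the extracted text; for an i.i.d. sample x1, …, xn from N(μ, σ²), the standard log-likelihood and its maximizers are:

  l(\mu,\sigma^2) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2,
  \qquad
  \hat\mu = \frac{1}{n}\sum_{i=1}^{n} x_i,
  \qquad
  \hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\hat\mu)^2.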
Missing Data: now suppose the sample is incomplete, i.e. only part of x1, …, xn is observed and the remaining values are missing.
E-Step: Let θ(t) = (μ(t), σ²(t)) be the estimated parameters at the start of the t-th iteration. The E-step fills in the missing values with their expectations under θ(t).
M-Step: With the expectations from the E-step, the M-step re-estimates the parameters, giving θ(t+1).
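The slide equations are missing from the text; for a normal sample with x1, …, xm observed and the remaining n − m values missing, the standard EM updates are:

  \text{E-step:}\quad E[x_i \mid \theta^{(t)}] = \mu^{(t)}, \qquad E[x_i^2 \mid \theta^{(t)}] = (\mu^{(t)})^2 + \sigma^{2\,(t)} \quad (i > m),
  \text{M-step:}\quad \mu^{(t+1)} = \frac{1}{n}\Big(\sum_{i=1}^{m} x_i + (n-m)\,\mu^{(t)}\Big),
  \qquad \sigma^{2\,(t+1)} = \frac{1}{n}\Big(\sum_{i=1}^{m} x_i^2 + (n-m)\big((\mu^{(t)})^2 + \sigma^{2\,(t)}\big)\Big) - \big(\mu^{(t+1)}\big)^2.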
Exercise: n = 40 (10 data values missing); estimate μ and σ² using different initial conditions.
Observed data (30 values):
375.081556 362.275902 332.612068 351.383048 304.823174 386.438672
430.079689 395.317406 369.029845 365.343938 243.548664 382.789939
374.419161 337.289831 418.928822 364.086502 343.854855 371.279406
439.241736 338.281616 454.981077 479.685107 336.634962 407.030453
297.821512 311.267105 528.267783 419.841982 392.684770 301.910093
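A minimal Python sketch of these updates, using the 30 observed values from the exercise (the variable names and convergence tolerance are mine, not the slides'):

    import numpy as np

    # The 30 observed values from the exercise; 10 of the n = 40 values are missing.
    observed = np.array([
        375.081556, 362.275902, 332.612068, 351.383048, 304.823174, 386.438672,
        430.079689, 395.317406, 369.029845, 365.343938, 243.548664, 382.789939,
        374.419161, 337.289831, 418.928822, 364.086502, 343.854855, 371.279406,
        439.241736, 338.281616, 454.981077, 479.685107, 336.634962, 407.030453,
        297.821512, 311.267105, 528.267783, 419.841982, 392.684770, 301.910093,
    ])
    n, n_missing = 40, 10
    mu, var = 200.0, 100.0            # try different initial conditions here

    for _ in range(200):
        # E-step: expected sufficient statistics of the missing values under (mu, var).
        sum_x  = observed.sum() + n_missing * mu
        sum_x2 = (observed ** 2).sum() + n_missing * (mu ** 2 + var)
        # M-step: re-estimate mu and sigma^2 from the completed statistics.
        mu_new  = sum_x / n
        var_new = sum_x2 / n - mu_new ** 2
        if abs(mu_new - mu) < 1e-9 and abs(var_new - var) < 1e-9:
            break
        mu, var = mu_new, var_new

    print(mu, var)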
Multinomial Population: sampling, N samples drawn from a multinomial population.
Maximum Likelihood: given the N samples, the likelihood is a function of the model parameters, and we want to maximize it.
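The slide's formulas are not in the text; for a plain multinomial with observed counts x1, …, xk (with x1 + … + xk = N), the likelihood and its maximizer are the standard:

  L(p_1,\dots,p_k \mid x) = \frac{N!}{x_1!\cdots x_k!}\; p_1^{x_1}\cdots p_k^{x_k},
  \qquad \hat p_i = \frac{x_i}{N}.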
Mixed Attributes: the same sampling of N samples, except that the count x3 is not available (unobserved).
E-Step: N samples, with x3 not available. Given θ(t), what can you say about x3?
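The slide's working is shown only as images; in general terms, the E-step replaces the unavailable count by its conditional expectation E[x3 | observed counts, θ(t)] and forms

  Q(\theta \mid \theta^{(t)}) = E\big[\log L(\theta \mid x_1,\dots,x_k)\,\big|\, x_{\text{observed}}, \theta^{(t)}\big],

which the M-step then maximizes over θ.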
Exercise: estimate the parameters using different initial conditions.
Binomial/Poisson Mixture
M: married obasong; X: number of children. Married obasongs may have children; unmarried obasongs have no children.
Observed data (counts of obasongs by number of children):
  # Children:  0   1   2   3   4   5   6
  # Obasongs:  n0  n1  n2  n3  n4  n5  n6
Unobserved data: n0 is the sum of nA (# married obasongs with no children) and nB (# unmarried obasongs).
Complete data: the counts (nA, nB, n1, …, n6), with corresponding probabilities (pA, pB, p1, …, p6).
Complete Data Likelihood: the likelihood of the complete data (nA, nB, n1, …, n6) under these probabilities.
Maximum Likelihood
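A minimal sketch of the EM iteration for this model, assuming married obasongs' child counts follow a Poisson(λ) distribution and a fraction π of obasongs are married (the counts below are purely illustrative, not from the slides):

    import math

    # Illustrative counts of obasongs with 0..6 children (n0..n6); NOT data from the slides.
    counts = [3062, 587, 284, 103, 33, 4, 2]

    pi_married, lam = 0.5, 1.0        # initial guesses for P(married) and the Poisson rate

    for _ in range(200):
        # E-step: split n0 into the expected number of married obasongs with 0 children (nA).
        p_married_zero = pi_married * math.exp(-lam)     # P(married and 0 children)
        p_unmarried = 1.0 - pi_married                   # unmarried => always 0 children
        nA = counts[0] * p_married_zero / (p_married_zero + p_unmarried)
        # M-step: re-estimate pi and lambda from the completed counts.
        n_total = sum(counts)
        n_married = nA + sum(counts[1:])
        pi_married = n_married / n_total
        lam = sum(k * counts[k] for k in range(1, len(counts))) / n_married

    print(pi_married, lam)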
Latent Variables: the observed data X are the incomplete data; together with the latent (unobserved) variables Y they form the complete data (X, Y).
Complete Data Likelihood: the likelihood of the complete data (X, Y), L(θ | X, Y) = p(X, Y | θ).
It is a function of the latent variable Y and the parameter θ, whereas the incomplete-data likelihood is a function of the parameter alone. If we are given a value of θ, the complete-data log-likelihood becomes a function of the random variable Y only: it is computable, and the result is expressed in terms of Y.
Expectation Step: Let θ(i−1) be the parameter vector obtained at the (i−1)-th step. Define Q(θ | θ(i−1)) as the expected complete-data log-likelihood.
Maximization Step: With θ(i−1) from the (i−1)-th step, choose θ(i) as the value of θ that maximizes Q(θ | θ(i−1)).
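Written out, the two defining equations of EM (shown only as images on the slides) are the standard ones:

  \text{E-step:}\quad Q(\theta \mid \theta^{(i-1)}) = E\big[\log L(\theta \mid X, Y)\,\big|\, X, \theta^{(i-1)}\big],
  \qquad
  \text{M-step:}\quad \theta^{(i)} = \arg\max_{\theta}\, Q(\theta \mid \theta^{(i-1)}).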
Mixture Models
• If there is reason to believe that a data set comprises several distinct populations, a mixture model can be used.
• It has the form p(x | Θ) = Σ_{j=1..M} α_j p_j(x | θ_j), with α_j ≥ 0 and α_1 + … + α_M = 1.
Mixture Models: Let yi ∈ {1, …, M} represent the (unobserved) source that generates data point xi.
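With this latent source variable, the quantities manipulated on the following slides are the standard mixture-model identities:

  p(x_i, y_i = l \mid \Theta) = \alpha_l\, p_l(x_i \mid \theta_l),
  \qquad
  p(y_i = l \mid x_i, \Theta) = \frac{\alpha_l\, p_l(x_i \mid \theta_l)}{\sum_{k=1}^{M} \alpha_k\, p_k(x_i \mid \theta_k)}.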
Mixture Models
Expectation: in the expected complete-data log-likelihood, the term is zero when yi ≠ l.
Maximization: Given the initial guess Θ^g, we want to find Θ to maximize the above expectation. In fact, this is done iteratively.
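Maximizing the expectation with respect to the mixing weights under the constraint Σ_l α_l = 1 (a standard Lagrange-multiplier step) gives:

  \alpha_l^{\text{new}} = \frac{1}{N} \sum_{i=1}^{N} p(l \mid x_i, \Theta^{g}).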
The GMM (Gaussian Mixture Model): Gaussian model of a d-dimensional source, say source j; GMM with M sources.
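The slide's formulas are not in the extracted text; the standard forms are p_j(x | μ_j, Σ_j) = N(x; μ_j, Σ_j), the d-dimensional Gaussian density of source j, and p(x | Θ) = Σ_{j=1..M} α_j N(x; μ_j, Σ_j) for the full mixture. Below is a minimal NumPy sketch of EM for this model; the function name, initialization scheme, and the small covariance regularizer are my own choices, not the slides':

    import numpy as np

    def gmm_em(X, M, n_iter=100, seed=0):
        """EM for a Gaussian mixture with M sources and full covariances."""
        rng = np.random.default_rng(seed)
        N, d = X.shape
        # Initialization: equal weights, M random data points as means, pooled covariance.
        alpha = np.full(M, 1.0 / M)
        mu = X[rng.choice(N, size=M, replace=False)].copy()
        Sigma = np.array([np.cov(X.T) + 1e-6 * np.eye(d) for _ in range(M)])

        for _ in range(n_iter):
            # E-step: responsibilities r[i, j] = p(y_i = j | x_i, Theta).
            log_r = np.empty((N, M))
            for j in range(M):
                diff = X - mu[j]
                inv = np.linalg.inv(Sigma[j])
                mahal = np.einsum('ni,ij,nj->n', diff, inv, diff)
                log_det = np.linalg.slogdet(Sigma[j])[1]
                log_r[:, j] = np.log(alpha[j]) - 0.5 * (mahal + log_det + d * np.log(2 * np.pi))
            log_r -= log_r.max(axis=1, keepdims=True)     # for numerical stability
            r = np.exp(log_r)
            r /= r.sum(axis=1, keepdims=True)

            # M-step: re-estimate alpha_j, mu_j, Sigma_j from the responsibilities.
            Nj = r.sum(axis=0)
            alpha = Nj / N
            mu = (r.T @ X) / Nj[:, None]
            for j in range(M):
                diff = X - mu[j]
                Sigma[j] = (r[:, j, None] * diff).T @ diff / Nj[j] + 1e-6 * np.eye(d)
        return alpha, mu, Sigma

For a data matrix X of shape (N, d), calling alpha, mu, Sigma = gmm_em(X, M=3) returns the estimated mixing weights, means, and covariances.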