Explore the concept of Bayesian Decision Theory focusing on Normal Density, Covariance Matrix, Discriminant Functions, and Decision Surfaces in multivariate distributions. Understand how to optimize classification with minimum error rates.
Lecture 2. Bayesian Decision Theory. Outline: multivariate normal distribution; discriminant functions for normal distributions; discriminant functions for discrete distributions.
Normal density Reminder: the covariance matrix is symmetric and positive semidefinite. Entropy is a measure of uncertainty; among all distributions with a given mean and variance, the normal distribution has the maximum entropy.
Normal density Let Σ be the covariance matrix of k-dimensional data; it has k pairs of eigenvalues and eigenvectors (λ_i, φ_i). Σ can be decomposed as Σ = Φ Λ Φ^T, where the columns of Φ are the eigenvectors and Λ = diag(λ_1, …, λ_k). Since Σ is positive semidefinite, every λ_i ≥ 0. A zero eigenvalue occurs when the data doesn't occupy the entire k-dimensional space.
Normal density Whitening transform: A_w = Φ Λ^{-1/2}, so that A_w^T Σ A_w = I; applying A_w to the data yields identity covariance.
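The whitening transform above can be sketched numerically. This is an illustrative example with an arbitrary 2-D covariance matrix (the specific numbers are made up for the demo):

```python
import numpy as np

# Whitening transform A_w = Phi @ Lambda^{-1/2}, where Phi holds the
# eigenvectors of Sigma and Lambda its eigenvalues (toy covariance assumed).
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0.0, 0.0],
                            cov=[[4.0, 1.5], [1.5, 1.0]],
                            size=5000)

Sigma = np.cov(X, rowvar=False)            # sample covariance matrix
eigvals, Phi = np.linalg.eigh(Sigma)       # Sigma = Phi diag(eigvals) Phi^T
A_w = Phi @ np.diag(eigvals ** -0.5)       # whitening matrix

X_white = X @ A_w                          # cov(X_white) = A_w^T Sigma A_w
Sigma_white = np.cov(X_white, rowvar=False)
print(np.round(Sigma_white, 2))            # identity, up to numerical precision
```

Because A_w is built from the same sample covariance it whitens, the transformed covariance is the identity matrix to machine precision.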
Discriminant function Features -> discriminant functions gi(x), i = 1, …, c. Assign class i if gi(x) > gj(x) for all j ≠ i. The decision surface is defined by gi(x) = gj(x).
Normal density To make a minimum error rate classification (zero-one loss), we use the discriminant functions gi(x) = ln p(x | ωi) + ln P(ωi). This is the log of the numerator in the Bayes formula; the log is used because we only compare the gi's, and log is monotone. When a normal density p(x | ωi) ~ N(μi, Σi) is assumed, we have: gi(x) = -(1/2)(x - μi)^T Σi^{-1} (x - μi) - (d/2) ln 2π - (1/2) ln|Σi| + ln P(ωi).
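The discriminant for a normal class density can be written directly from the log of the Gaussian. A minimal sketch, with made-up means and equal priors for the two classes:

```python
import numpy as np

# Minimum-error-rate discriminant for a normal class density:
#   g_i(x) = -1/2 (x - mu_i)^T Sigma_i^{-1} (x - mu_i)
#            - d/2 ln(2 pi) - 1/2 ln|Sigma_i| + ln P(omega_i)
def g(x, mu, Sigma, prior):
    d = len(mu)
    diff = x - mu
    inv = np.linalg.inv(Sigma)
    return (-0.5 * diff @ inv @ diff
            - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(Sigma))
            + np.log(prior))

# Two classes; assign x to the class with the larger discriminant.
x = np.array([1.0, 1.0])
g1 = g(x, np.array([0.0, 0.0]), np.eye(2), prior=0.5)
g2 = g(x, np.array([3.0, 3.0]), np.eye(2), prior=0.5)
print("class 1" if g1 > g2 else "class 2")  # x lies closer to mu_1
```

With identity covariances and equal priors this reduces to a nearest-mean rule, which is why the point near the origin is assigned to class 1.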
Discriminant function for normal density Case 1: Σi = σ²I. Linear discriminant function: gi(x) = wi^T x + wi0, with wi = μi / σ² and wi0 = -μi^T μi / (2σ²) + ln P(ωi). Note: blue boxes in the slides mark the terms dropped because they don't depend on i.
Discriminant function for normal density The decision surface is where gi(x) = gj(x), i.e. w^T (x - x0) = 0 with w = μi - μj and x0 = (1/2)(μi + μj) - (σ² / ||μi - μj||²) ln(P(ωi)/P(ωj)) (μi - μj). With equal priors, x0 is the middle point between the two means. The decision surface is a hyperplane, perpendicular to the line between the means.
Discriminant function for normal density “Linear machine”: the decision surfaces are hyperplanes.
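A small sketch of the linear machine for the Σi = σ²I case, with illustrative means and priors chosen for the demo:

```python
import numpy as np

# "Linear machine" for Sigma_i = sigma^2 I:
#   g_i(x) = w_i^T x + w_i0, with
#   w_i  = mu_i / sigma^2
#   w_i0 = -mu_i^T mu_i / (2 sigma^2) + ln P(omega_i)
sigma2 = 1.0
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0]), np.array([0.0, 2.0])]
priors = [1 / 3, 1 / 3, 1 / 3]

def linear_g(x, mu, prior):
    w = mu / sigma2
    w0 = -(mu @ mu) / (2 * sigma2) + np.log(prior)
    return w @ x + w0

x = np.array([1.5, 0.2])
scores = [linear_g(x, mu, p) for mu, p in zip(means, priors)]
print(int(np.argmax(scores)))  # with equal priors, the nearest mean wins
```

Since each gi is linear in x, every pairwise boundary gi(x) = gj(x) is a hyperplane, which is exactly the "linear machine" structure.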
Discriminant function for normal density With unequal prior probabilities, the decision boundary shifts toward the less likely mean.
Discriminant function for normal density Case 2: Σi = Σ (all classes share the same covariance matrix).
Discriminant function for normal density Set: w = Σ^{-1}(μi - μj) and x0 = (1/2)(μi + μj) - [ln(P(ωi)/P(ωj)) / ((μi - μj)^T Σ^{-1} (μi - μj))] (μi - μj). The decision boundary is: w^T (x - x0) = 0.
Discriminant function for normal density The hyperplane is generally not perpendicular to the line between the means.
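The shared-covariance case can be checked numerically. A sketch with a made-up covariance matrix and equal priors:

```python
import numpy as np

# Shared-covariance case (Sigma_i = Sigma):
#   w  = Sigma^{-1} (mu_i - mu_j)
#   x0 = 1/2 (mu_i + mu_j)
#        - [ln(P_i/P_j) / ((mu_i - mu_j)^T Sigma^{-1} (mu_i - mu_j))] (mu_i - mu_j)
# The boundary w^T (x - x0) = 0 passes through x0, but w is generally
# not parallel to mu_i - mu_j, so the hyperplane is tilted.
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])   # toy shared covariance
mu_i, mu_j = np.array([1.0, 0.0]), np.array([-1.0, 0.0])
P_i, P_j = 0.5, 0.5

diff = mu_i - mu_j
inv = np.linalg.inv(Sigma)
w = inv @ diff
x0 = 0.5 * (mu_i + mu_j) - (np.log(P_i / P_j) / (diff @ inv @ diff)) * diff

print(np.round(w, 3), x0)  # w has a nonzero second component, diff = [2, 0]
```

Here diff points along the x-axis, yet w has a nonzero y-component: the hyperplane through x0 is not perpendicular to the line between the means.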
Discriminant function for normal density Case 3: Σi is arbitrary. The decision boundaries are hyperquadrics (hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, hyperhyperboloids).
Discriminant function for normal density Decision boundary: gi(x) = x^T Wi x + wi^T x + wi0, where Wi = -(1/2) Σi^{-1}, wi = Σi^{-1} μi, and wi0 = -(1/2) μi^T Σi^{-1} μi - (1/2) ln|Σi| + ln P(ωi); the boundary gi(x) = gj(x) is quadratic in x.
Discriminant function for normal density Extension to multi-class.
Discriminant function for discrete features Discrete features: x = [x1, x2, …, xd]^t, xi ∈ {0, 1}. Let pi = P(xi = 1 | ω1) and qi = P(xi = 1 | ω2). Assuming the features are conditionally independent, the likelihood is P(x | ω1) = ∏i pi^{xi} (1 - pi)^{1 - xi}, and similarly with qi for ω2.
Discriminant function for discrete features The discriminant function, from the log likelihood ratio, is g(x) = Σi wi xi + w0, with wi = ln[pi (1 - qi) / (qi (1 - pi))] and w0 = Σi ln[(1 - pi)/(1 - qi)] + ln[P(ω1)/P(ω2)].
Discriminant function for discrete features So the decision surface is again a hyperplane.
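The binary-feature discriminant is easy to evaluate directly. A sketch with made-up per-feature probabilities pi, qi and equal priors:

```python
import numpy as np

# Discriminant for binary features x_i in {0, 1}:
#   g(x) = sum_i w_i x_i + w_0
#   w_i = ln( p_i (1 - q_i) / (q_i (1 - p_i)) )
#   w_0 = sum_i ln((1 - p_i)/(1 - q_i)) + ln(P(omega_1)/P(omega_2))
# Decide omega_1 when g(x) > 0; the surface g(x) = 0 is a hyperplane.
p = np.array([0.8, 0.7, 0.6])   # P(x_i = 1 | omega_1), toy values
q = np.array([0.2, 0.3, 0.4])   # P(x_i = 1 | omega_2), toy values
P1 = P2 = 0.5

w = np.log(p * (1 - q) / (q * (1 - p)))
w0 = np.sum(np.log((1 - p) / (1 - q))) + np.log(P1 / P2)

x = np.array([1, 1, 0])
print("omega_1" if w @ x + w0 > 0 else "omega_2")
```

Because g is linear in the xi, the decision surface is a hyperplane even though the features are discrete; here the observed pattern [1, 1, 0] falls on the ω1 side.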