
Bayesian Decision Theory: Normal Density and Discriminant Functions

Explore the concept of Bayesian Decision Theory focusing on Normal Density, Covariance Matrix, Discriminant Functions, and Decision Surfaces in multivariate distributions. Understand how to optimize classification with minimum error rates.

janices

Presentation Transcript


  1. Lecture 2. Bayesian Decision Theory: multivariate normal distribution; discriminant functions for normal distributions; discriminant functions for discrete distributions.

  2. Normal density. Reminder: the covariance matrix is symmetric and positive semidefinite. Entropy is a measure of uncertainty. The normal distribution has the maximum entropy among all distributions with a given mean and variance.

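  3. Normal density. Let Σ be the k × k covariance matrix; it has k pairs of eigenvalues and eigenvectors. Σ can be decomposed as $\Sigma = \Phi \Lambda \Phi^t$, where the columns of $\Phi$ are the orthonormal eigenvectors and $\Lambda$ is the diagonal matrix of eigenvalues. Σ is positive semidefinite: $\mathbf{x}^t \Sigma \mathbf{x} \ge 0$ for every $\mathbf{x}$. Zero is attained for some nonzero $\mathbf{x}$ when the data do not occupy the entire k-dimensional space.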
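
The eigendecomposition and the zero-eigenvalue case can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up 3-D data set whose third coordinate is the sum of the first two, so the data lie in a 2-D plane; the data and variable names are illustrative and not from the lecture.

```python
import numpy as np

# Hypothetical data: 3-D points confined to a 2-D plane (x3 = x1 + x2).
rng = np.random.default_rng(0)
xy = rng.normal(size=(500, 2))
data = np.column_stack([xy, xy[:, 0] + xy[:, 1]])

sigma = np.cov(data, rowvar=False)        # 3x3 sample covariance matrix

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigvals, phi = np.linalg.eigh(sigma)

# Rebuild Sigma = Phi * Lambda * Phi^t from the eigenpairs.
reconstructed = phi @ np.diag(eigvals) @ phi.T

print("eigenvalues:", eigvals)            # the smallest one is numerically zero
```

The smallest eigenvalue is zero up to rounding because the data have no variance along the direction orthogonal to the plane.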

  4. Normal density Whitening transform:
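
A common form of the whitening transform is $A_w = \Phi \Lambda^{-1/2}$, built from the eigenvectors $\Phi$ and eigenvalues $\Lambda$ of the covariance matrix, so that $A_w^t \Sigma A_w = I$. The sketch below assumes that form (the slide's own formula is not preserved in this transcript) and a made-up full-rank covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
mean = np.array([1.0, -2.0])
sigma = np.array([[4.0, 1.5],
                  [1.5, 1.0]])            # hypothetical full-rank covariance
x = rng.multivariate_normal(mean, sigma, size=2000)

s = np.cov(x, rowvar=False)               # sample covariance of the data
eigvals, phi = np.linalg.eigh(s)

# A_w = Phi * Lambda^(-1/2); this needs strictly positive eigenvalues.
a_w = phi @ np.diag(eigvals ** -0.5)

# Center the data, then apply y = A_w^t x to every row.
y = (x - x.mean(axis=0)) @ a_w
cov_y = np.cov(y, rowvar=False)           # identity matrix up to rounding
```

If the covariance has a zero eigenvalue, the inverse square root is undefined, and the degenerate directions have to be dropped first.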

  5. Discriminant function. Features -> discriminant functions $g_i(\mathbf{x})$, i = 1, …, c. Assign class i if $g_i(\mathbf{x}) > g_j(\mathbf{x})$ for all j ≠ i. The decision surface is defined by $g_i(\mathbf{x}) = g_j(\mathbf{x})$.

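  6. Normal density. To make a minimum-error-rate classification (zero-one loss), we use the discriminant functions $g_i(\mathbf{x}) = \ln p(\mathbf{x} \mid \omega_i) + \ln P(\omega_i)$. This is the log of the numerator in the Bayes formula. The log can be used because we are only comparing the $g_i$'s, and log is monotone. When a normal density is assumed, $p(\mathbf{x} \mid \omega_i) \sim N(\boldsymbol{\mu}_i, \Sigma_i)$, we have: $g_i(\mathbf{x}) = -\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^t \Sigma_i^{-1} (\mathbf{x}-\boldsymbol{\mu}_i) - \tfrac{d}{2}\ln 2\pi - \tfrac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i)$, where d is the dimension of $\mathbf{x}$.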
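
The log-discriminant for normal class-conditional densities can be evaluated directly. This is a minimal sketch assuming NumPy; the function names `gaussian_discriminant` and `classify` are illustrative and not part of the lecture.

```python
import numpy as np

def gaussian_discriminant(x, mean, cov, prior):
    """g_i(x) = ln p(x | class i) + ln P(class i) for a normal density."""
    d = mean.shape[0]
    diff = x - mean
    # Squared Mahalanobis distance, computed without forming the inverse.
    maha = diff @ np.linalg.solve(cov, diff)
    _, logdet = np.linalg.slogdet(cov)    # log-determinant, numerically stable
    return (-0.5 * maha - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * logdet + np.log(prior))

def classify(x, means, covs, priors):
    """Return the index of the class with the largest discriminant value."""
    scores = [gaussian_discriminant(x, m, c, p)
              for m, c, p in zip(means, covs, priors)]
    return int(np.argmax(scores))

# Hypothetical two-class problem.
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), np.eye(2)]
priors = [0.5, 0.5]
label = classify(np.array([0.2, 0.1]), means, covs, priors)
```

Only differences between the $g_i$ matter for the decision, so any term shared by all classes (here the $2\pi$ term) could be dropped without changing `classify`.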

  7. Discriminant function for normal density. Case $\Sigma_i = \sigma^2 I$. The discriminant function is linear. Note: the terms in blue boxes on the slide are irrelevant, since they are the same for every class.

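  8. Discriminant function for normal density. The decision surface is where $g_i(\mathbf{x}) = g_j(\mathbf{x})$; it passes through a point $\mathbf{x}_0$ on the line between the means. With equal priors, $\mathbf{x}_0$ is the midpoint between the two means. The decision surface is a hyperplane, perpendicular to the line between the means.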
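
For reference, the standard textbook derivation for the case $\Sigma_i = \sigma^2 I$ runs as follows. The symbols $\mathbf{w}_i$, $w_{i0}$, $\mathbf{w}$ and $\mathbf{x}_0$ follow the usual convention and are assumed to match the slides, whose formulas are not preserved in this transcript.

```latex
\begin{align}
g_i(\mathbf{x}) &= -\frac{\lVert \mathbf{x}-\boldsymbol{\mu}_i \rVert^2}{2\sigma^2} + \ln P(\omega_i)
  && \text{(class-independent terms dropped)} \\
g_i(\mathbf{x}) &= \mathbf{w}_i^t \mathbf{x} + w_{i0},
  \quad \mathbf{w}_i = \frac{\boldsymbol{\mu}_i}{\sigma^2},
  \quad w_{i0} = -\frac{\boldsymbol{\mu}_i^t \boldsymbol{\mu}_i}{2\sigma^2} + \ln P(\omega_i)
  && \text{(after dropping } \mathbf{x}^t\mathbf{x}/2\sigma^2 \text{)} \\
g_i(\mathbf{x}) = g_j(\mathbf{x}) &\iff \mathbf{w}^t(\mathbf{x}-\mathbf{x}_0) = 0,
  \quad \mathbf{w} = \boldsymbol{\mu}_i - \boldsymbol{\mu}_j, \\
\mathbf{x}_0 &= \tfrac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j)
  - \frac{\sigma^2}{\lVert \boldsymbol{\mu}_i - \boldsymbol{\mu}_j \rVert^2}
    \ln\frac{P(\omega_i)}{P(\omega_j)}\,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)
\end{align}
```

With $P(\omega_i) = P(\omega_j)$ the logarithm vanishes and $\mathbf{x}_0$ reduces to the midpoint of the two means.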

  9. Discriminant function for normal density. “Linear machine”: the decision surfaces are hyperplanes.

  10. Discriminant function for normal density. With unequal prior probabilities, the decision boundary shifts toward the less likely mean, away from the more likely one.
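
The shift can be seen numerically. The sketch below assumes the standard expression for the boundary point in the $\sigma^2 I$ case, $\mathbf{x}_0 = \tfrac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j) - \frac{\sigma^2}{\lVert\boldsymbol{\mu}_i - \boldsymbol{\mu}_j\rVert^2} \ln\frac{P(\omega_i)}{P(\omega_j)} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$; the means, priors and the helper name `boundary_point` are made up for illustration.

```python
import numpy as np

def boundary_point(mu_i, mu_j, sigma2, prior_i, prior_j):
    """Point x0 on the line between the means where the boundary crosses it."""
    diff = mu_i - mu_j
    shift = sigma2 / (diff @ diff) * np.log(prior_i / prior_j)
    return 0.5 * (mu_i + mu_j) - shift * diff

mu1 = np.array([0.0, 0.0])
mu2 = np.array([4.0, 0.0])

x0_equal = boundary_point(mu1, mu2, 1.0, 0.5, 0.5)    # midpoint of the means
x0_skewed = boundary_point(mu1, mu2, 1.0, 0.9, 0.1)   # class 1 far more likely

print(x0_equal, x0_skewed)
```

With priors 0.9 and 0.1 the boundary point moves from the midpoint toward `mu2`, the mean of the less likely class, which enlarges the region assigned to the more likely class.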

  11. Discriminant function for normal density. (2) $\Sigma_i = \Sigma$ (all classes share the same covariance matrix).

  12. Discriminant function for normal density Set: The decision boundary is:
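
For the shared-covariance case, the standard textbook form of the discriminant and boundary is sketched below. The slide's own formulas are not preserved in this transcript, so the notation here is an assumption that follows the usual convention.

```latex
\begin{align}
g_i(\mathbf{x}) &= -\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)^t \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu}_i) + \ln P(\omega_i)
  && \text{(class-independent terms dropped)} \\
g_i(\mathbf{x}) &= \mathbf{w}_i^t \mathbf{x} + w_{i0},
  \quad \mathbf{w}_i = \Sigma^{-1}\boldsymbol{\mu}_i,
  \quad w_{i0} = -\tfrac{1}{2}\boldsymbol{\mu}_i^t \Sigma^{-1} \boldsymbol{\mu}_i + \ln P(\omega_i)
  && \text{(after dropping } \tfrac{1}{2}\mathbf{x}^t \Sigma^{-1} \mathbf{x} \text{)} \\
g_i(\mathbf{x}) = g_j(\mathbf{x}) &\iff \mathbf{w}^t(\mathbf{x}-\mathbf{x}_0) = 0,
  \quad \mathbf{w} = \Sigma^{-1}(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j), \\
\mathbf{x}_0 &= \tfrac{1}{2}(\boldsymbol{\mu}_i + \boldsymbol{\mu}_j)
  - \frac{\ln\left[P(\omega_i)/P(\omega_j)\right]}
         {(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)^t \Sigma^{-1} (\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)}
    \,(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)
\end{align}
```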

  13. Discriminant function for normal density The hyperplane is generally not perpendicular to the line between the means.
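
The hyperplane's normal vector in the shared-covariance case is $\mathbf{w} = \Sigma^{-1}(\boldsymbol{\mu}_i - \boldsymbol{\mu}_j)$ in the standard formulation, which is parallel to $\boldsymbol{\mu}_i - \boldsymbol{\mu}_j$ only for special $\Sigma$. A minimal sketch with made-up numbers; the helper names are illustrative.

```python
import numpy as np

def normal_vector(sigma, mu_i, mu_j):
    """Normal of the decision hyperplane, w = Sigma^{-1} (mu_i - mu_j)."""
    return np.linalg.solve(sigma, mu_i - mu_j)

def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

mu_i = np.array([0.0, 0.0])
mu_j = np.array([2.0, 0.0])
sigma = np.array([[2.0, 1.0],
                  [1.0, 1.0]])            # hypothetical shared covariance

w = normal_vector(sigma, mu_i, mu_j)
cos_general = cosine(w, mu_i - mu_j)      # below 1: boundary is tilted

# With Sigma = I the normal is parallel to the line between the means.
cos_identity = cosine(normal_vector(np.eye(2), mu_i, mu_j), mu_i - mu_j)
```

A cosine of 1 means the hyperplane is perpendicular to the line between the means; any smaller value means it is tilted.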

  14. Discriminant function for normal density. (3) $\Sigma_i$ is arbitrary. The decision boundaries are hyperquadrics (hyperplanes, pairs of hyperplanes, hyperspheres, hyperellipsoids, hyperparaboloids, hyperhyperboloids).

  15. Discriminant function for normal density Decision boundary:
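
With arbitrary covariances the discriminant is quadratic, $g_i(\mathbf{x}) = \mathbf{x}^t W_i \mathbf{x} + \mathbf{w}_i^t \mathbf{x} + w_{i0}$, with $W_i = -\tfrac{1}{2}\Sigma_i^{-1}$, $\mathbf{w}_i = \Sigma_i^{-1}\boldsymbol{\mu}_i$ and $w_{i0} = -\tfrac{1}{2}\boldsymbol{\mu}_i^t\Sigma_i^{-1}\boldsymbol{\mu}_i - \tfrac{1}{2}\ln|\Sigma_i| + \ln P(\omega_i)$ in the standard formulation. The sketch below assumes that form; the two classes are made up so that the boundary is a circle.

```python
import numpy as np

def quadratic_params(mean, cov, prior):
    """Return (W, w, w0) of the quadratic discriminant for one class."""
    cov_inv = np.linalg.inv(cov)
    W = -0.5 * cov_inv
    w = cov_inv @ mean
    _, logdet = np.linalg.slogdet(cov)
    w0 = -0.5 * mean @ cov_inv @ mean - 0.5 * logdet + np.log(prior)
    return W, w, w0

def g(x, params):
    """Evaluate g(x) = x^t W x + w^t x + w0."""
    W, w, w0 = params
    return x @ W @ x + w @ x + w0

# Hypothetical classes: same mean, covariances I and 4I, equal priors.
p1 = quadratic_params(np.zeros(2), np.eye(2), 0.5)
p2 = quadratic_params(np.zeros(2), 4.0 * np.eye(2), 0.5)

# Setting g1 = g2 gives (3/8) r^2 = ln 4, a circle of radius r.
r = np.sqrt(8.0 / 3.0 * np.log(4.0))
```

Inside the circle the tighter class wins and outside it the broader class wins, which is one instance of a hypersphere boundary.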

  16. Discriminant function for normal density

  17. Discriminant function for normal density

  18. Discriminant function for normal density. Extension to multi-class.

  19. Discriminant function for discrete features. Discrete features: $\mathbf{x} = [x_1, x_2, \ldots, x_d]^t$, $x_i \in \{0, 1\}$, with $p_i = P(x_i = 1 \mid \omega_1)$ and $q_i = P(x_i = 1 \mid \omega_2)$. The likelihood will be:

  20. Discriminant function for discrete features The discriminant function: The likelihood ratio:
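
For binary features, the standard textbook expressions are sketched below. They assume the features are conditionally independent given the class, which the transcript does not state explicitly; $p_i$ and $q_i$ are the probabilities that $x_i = 1$ under $\omega_1$ and $\omega_2$.

```latex
\begin{align}
P(\mathbf{x} \mid \omega_1) &= \prod_{i=1}^{d} p_i^{x_i} (1-p_i)^{1-x_i},
\qquad
P(\mathbf{x} \mid \omega_2) = \prod_{i=1}^{d} q_i^{x_i} (1-q_i)^{1-x_i} \\
\frac{P(\mathbf{x} \mid \omega_1)}{P(\mathbf{x} \mid \omega_2)}
  &= \prod_{i=1}^{d} \left(\frac{p_i}{q_i}\right)^{x_i} \left(\frac{1-p_i}{1-q_i}\right)^{1-x_i} \\
g(\mathbf{x}) &= \sum_{i=1}^{d} \left[ x_i \ln\frac{p_i}{q_i} + (1-x_i) \ln\frac{1-p_i}{1-q_i} \right]
  + \ln\frac{P(\omega_1)}{P(\omega_2)}
\end{align}
```

Decide $\omega_1$ if $g(\mathbf{x}) > 0$ and $\omega_2$ otherwise.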

  21. Discriminant function for discrete features So the decision surface is again a hyperplane.
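
The discriminant is linear in $\mathbf{x}$: $g(\mathbf{x}) = \sum_i w_i x_i + w_0$ with $w_i = \ln\frac{p_i(1-q_i)}{q_i(1-p_i)}$ and $w_0 = \sum_i \ln\frac{1-p_i}{1-q_i} + \ln\frac{P(\omega_1)}{P(\omega_2)}$, assuming conditionally independent binary features. A minimal sketch follows; the probabilities are made up and are not the example from the slides.

```python
import numpy as np

def binary_feature_weights(p, q, prior1, prior2):
    """Weights of the linear discriminant for independent binary features."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    w = np.log(p * (1 - q) / (q * (1 - p)))
    w0 = np.sum(np.log((1 - p) / (1 - q))) + np.log(prior1 / prior2)
    return w, w0

def decide(x, w, w0):
    """Return 1 if g(x) > 0 (class 1), otherwise 2."""
    return 1 if w @ np.asarray(x, dtype=float) + w0 > 0 else 2

# Hypothetical values: three features, p_i = 0.8, q_i = 0.5, equal priors.
w, w0 = binary_feature_weights([0.8, 0.8, 0.8], [0.5, 0.5, 0.5], 0.5, 0.5)

print(decide([1, 1, 0], w, w0))   # two features "on"
print(decide([1, 0, 0], w, w0))   # one feature "on"
```

Each weight $w_i$ measures how much evidence feature i contributes toward class 1 when it is 1; a feature with $p_i = q_i$ gets weight zero and does not affect the decision.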

  22. Discriminant function for discrete features EX:

  23. Discriminant function for discrete features
