
From last time: PR Methods


Presentation Transcript


  1. From last time: PR Methods • Feature extraction + pattern classification • Training, testing, overfitting, overtraining • Minimum distance methods • Discriminant functions • Linear • Nonlinear (e.g., quadratic, neural networks) • -> Statistical discriminant functions

  2. Statistical Pattern Recognition • Many sources of variability in the speech signal • Far more than the known deterministic factors account for • Powerful mathematical foundation • A more general way of handling discrimination

  3. Statistical Discrimination Methods • Minimum error classifier and Bayes rule • Gaussian classifiers • Discrete density estimation • Mixture Gaussians • Neural networks

  4. [Figure: Bayes decision regions for two classes; "we decide x is in class 1" on one side of the boundary and "we decide x is in class 2" on the other.]
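
To make the rule behind that figure concrete, here is a minimal sketch of the two-class minimum-error (Bayes) decision rule. The two 1-D Gaussian class-conditional densities and the equal priors are illustrative assumptions, not values from the lecture.

```python
from scipy.stats import norm

# Decide class 1 where p(x|w1)P(w1) > p(x|w2)P(w2), else class 2.
# Densities and priors are made up for illustration.
prior1, prior2 = 0.5, 0.5
like1 = norm(loc=-1.0, scale=1.0)   # p(x|w1)
like2 = norm(loc=+1.0, scale=1.0)   # p(x|w2)

def decide(x):
    return 1 if like1.pdf(x) * prior1 > like2.pdf(x) * prior2 else 2

print(decide(-0.5), decide(0.5))  # -> 1 2; the boundary sits at x = 0
```

With equal priors and equal variances, the boundary falls halfway between the two means; unequal priors shift it toward the less probable class.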

  5. How to approximate a Bayes classifier • Parametric form with single-pass estimation • Discretize, count co-occurrences • Iterative training (mixture Gaussians, ANNs) • Kernel estimation
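
As a sketch of the "discretize, count co-occurrences" option, the snippet below estimates each class-conditional density with a per-class histogram and then applies the Bayes rule. The synthetic training data and the bin layout are assumptions chosen only for illustration.

```python
import numpy as np

# Estimate p(x|class) by discretizing the feature axis and counting.
rng = np.random.default_rng(0)
x1 = rng.normal(-1.0, 1.0, 500)   # training samples, class 1 (hypothetical)
x2 = rng.normal(+1.0, 1.0, 500)   # training samples, class 2 (hypothetical)

bins = np.linspace(-4, 4, 17)                      # 16 bins of width 0.5
h1, _ = np.histogram(x1, bins=bins, density=True)  # ~ p(x|w1)
h2, _ = np.histogram(x2, bins=bins, density=True)  # ~ p(x|w2)

def decide(x, prior1=0.5, prior2=0.5):
    i = int(np.clip(np.digitize(x, bins) - 1, 0, len(h1) - 1))
    return 1 if h1[i] * prior1 > h2[i] * prior2 else 2

print(decide(-0.5), decide(0.5))  # usually -> 1 2
```

The bin width trades bias against variance, which is the discrete analogue of the overfitting/overtraining issue from the slide 1 recap.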

  6. Minimum distance classifiers • If Euclidean distance is used, optimum if: • Gaussian • Equal priors • Uncorrelated features • Equal variance per feature • If features have different variances or are correlated, Mahalanobis distance (MD) could be better
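
A small sketch of that last point, with made-up class statistics: the two test points below are equally far from the mean in Euclidean terms, but Mahalanobis distance, which accounts for the unequal variances and the correlation, separates them clearly.

```python
import numpy as np

# Illustrative class statistics: unequal variances, correlated features.
mu = np.array([0.0, 0.0])
cov = np.array([[4.0, 1.5],
                [1.5, 1.0]])
cov_inv = np.linalg.inv(cov)

def euclidean(x):
    return float(np.linalg.norm(x - mu))

def mahalanobis(x):
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

a = np.array([2.0, 0.5])   # lies along the high-variance direction
b = np.array([0.5, 2.0])   # lies across it
print(euclidean(a), euclidean(b))      # identical: ~2.06 and ~2.06
print(mahalanobis(a), mahalanobis(b))  # different: ~1.07 and ~2.75
```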

  7. With a covariance matrix $\Sigma$ shared by all classes, the discriminant function can be $D_i(x) = w_i^T x + w_{i0}$ • where $w_i = \Sigma^{-1} \mu_i$ • and $w_{i0} = -\tfrac{1}{2}\, \mu_i^T \Sigma^{-1} \mu_i + \log p(\omega_i)$ • This is a linear classifier
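
A minimal sketch of that linear classifier, assuming (as above) one covariance matrix pooled across classes. The class means, covariance, priors, and sample counts are all made up for illustration.

```python
import numpy as np

# Hypothetical training data for two classes with a shared covariance.
rng = np.random.default_rng(0)
X1 = rng.multivariate_normal([-1.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], 300)
X2 = rng.multivariate_normal([+1.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], 300)

mu = [X.mean(axis=0) for X in (X1, X2)]
cov = np.cov(np.vstack([X1 - mu[0], X2 - mu[1]]).T)   # pooled covariance
cov_inv = np.linalg.inv(cov)
log_prior = np.log([0.5, 0.5])

# D_i(x) = w_i^T x + w_i0, with w_i = Sigma^{-1} mu_i and
# w_i0 = -1/2 mu_i^T Sigma^{-1} mu_i + log p(w_i)
w = [cov_inv @ m for m in mu]
w0 = [-0.5 * m @ cov_inv @ m + lp for m, lp in zip(mu, log_prior)]

def classify(x):
    scores = [wi @ x + wi0 for wi, wi0 in zip(w, w0)]
    return int(np.argmax(scores)) + 1

print(classify(np.array([-0.8, 0.1])))  # -> 1 (most likely)
```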

  8. General Gaussian case • Unconstrained covariance matrices per class • Then the discriminant function is $D_i(x) = x^T W_i x + w_i^T x + w_{i0}$ • This is a quadratic classifier • Gaussians are completely specified by 1st and 2nd order statistics • Is this enough for general populations of data?
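
Expanding the Gaussian log-likelihood gives one natural reading of the coefficients in that quadratic form: $W_i = -\tfrac{1}{2}\Sigma_i^{-1}$, $w_i = \Sigma_i^{-1}\mu_i$, and $w_{i0}$ absorbing the remaining constant terms. The class parameters in this sketch are illustrative, not from the lecture.

```python
import numpy as np

# Quadratic Gaussian discriminant with per-class covariance:
#   W_i  = -1/2 Sigma_i^{-1}
#   w_i  = Sigma_i^{-1} mu_i
#   w_i0 = -1/2 mu_i^T Sigma_i^{-1} mu_i - 1/2 log|Sigma_i| + log p(w_i)
def quadratic_discriminant(x, mu, cov, prior):
    cov_inv = np.linalg.inv(cov)
    W = -0.5 * cov_inv
    w = cov_inv @ mu
    w0 = (-0.5 * mu @ cov_inv @ mu
          - 0.5 * np.log(np.linalg.det(cov))
          + np.log(prior))
    return x @ W @ x + w @ x + w0

mu1, cov1 = np.array([-1.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])
mu2, cov2 = np.array([+1.0, 0.0]), np.array([[2.0, 0.5], [0.5, 1.0]])
x = np.array([0.2, -0.1])
scores = [quadratic_discriminant(x, mu1, cov1, 0.5),
          quadratic_discriminant(x, mu2, cov2, 0.5)]
print(1 + int(np.argmax(scores)))
```

When the covariances are forced equal, the quadratic terms cancel between classes and this reduces to the linear classifier of slide 7.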

  9. A statistical discriminant function: $\log p(x \mid \omega_i) + \log p(\omega_i)$
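
The same function written directly with library densities: score each class by $\log p(x \mid \omega_i) + \log p(\omega_i)$ and pick the argmax. The Gaussian parameters and priors below are hypothetical.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Each class: a Gaussian class-conditional density plus a prior (made up).
classes = [
    {"mu": [-1.0, 0.0], "cov": [[1.0, 0.0], [0.0, 1.0]], "prior": 0.7},
    {"mu": [+1.0, 0.0], "cov": [[2.0, 0.5], [0.5, 1.0]], "prior": 0.3},
]

def classify(x):
    scores = [multivariate_normal(c["mu"], c["cov"]).logpdf(x)
              + np.log(c["prior"]) for c in classes]
    return int(np.argmax(scores)) + 1

print(classify([0.0, 0.0]))
```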

  10. Remember: $P(a \mid b) = P(a, b)/P(b)$, so $P(a, b) = P(a \mid b)\,P(b) = P(b \mid a)\,P(a)$
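
A quick numeric check of these identities, with made-up probabilities chosen only to be mutually consistent:

```python
# Verify P(a,b) = P(b|a)P(a) and P(a|b) = P(a,b)/P(b) numerically.
p_a = 0.3            # P(a)
p_b = 0.4            # P(b)
p_b_given_a = 0.5    # P(b|a)

p_ab = p_b_given_a * p_a        # P(a,b) = P(b|a)P(a) = 0.15
p_a_given_b = p_ab / p_b        # P(a|b) = P(a,b)/P(b) = 0.375

# Both factorizations of the joint agree: P(a|b)P(b) == P(b|a)P(a)
assert abs(p_a_given_b * p_b - p_b_given_a * p_a) < 1e-12
print(p_ab, p_a_given_b)        # 0.15 0.375
```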

  11. Upcoming quiz etc. • Monday: first the guest talk on "deep" neural networks, then the quiz • Quiz topics: ASR basics, pattern recognition overview • Typical questions are multiple choice plus a short explanation, aimed at a 30-minute length • There will be one more HW and one more quiz; after that, everything is oriented toward the project
