Generalized Likelihood Ratio Tests and Model Order Selection Criteria


Presentation Transcript


  1. Generalized Likelihood Ratio Tests and Model Order Selection Criteria ECE 7251: Spring 2004 Lecture 29 3/26/04 Prof. Aaron D. Lanterman School of Electrical & Computer Engineering Georgia Institute of Technology AL: 404-385-2548 <lanterma@ece.gatech.edu>

  2. The Setup • Usual parametric data model: observe $y \sim p(y;\theta)$ with $\theta \in \Theta$ • In the previous lecture on LMP tests, we assumed special structures like $H_0: \theta = \theta_0$ vs. $H_1: \theta > \theta_0$ or $H_1: \theta < \theta_0$ • What should we do if we have a more general structure like $H_0: \theta \in \Theta_0$ vs. $H_1: \theta \in \Theta_1$? • Often, we do something a bit ad hoc!

  3. The GLRT • Find parameter estimates $\hat\theta_0$ and $\hat\theta_1$ under $H_0$ and $H_1$ • Substituting the estimates into the likelihood ratio yields a generalized likelihood ratio test: $\Lambda_{GL}(y) = \dfrac{p(y;\hat\theta_1)}{p(y;\hat\theta_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta$ • If convenient, use ML estimates: $\hat\theta_i = \arg\max_{\theta \in \Theta_i} p(y;\theta)$
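
As a concrete illustration of the recipe above, here is a minimal numerical sketch (my own example, not from the lecture): i.i.d. exponential data with an unknown rate, testing a point null against an unrestricted alternative by maximizing the likelihood under $H_1$ and thresholding $2\ln\Lambda_{GL}$. The chi-square threshold anticipates the asymptotics discussed later in the lecture.

```python
# Minimal GLRT sketch (illustrative, not from the slides): i.i.d. exponential
# data, H0: rate = lam0 vs. H1: rate unrestricted.
import numpy as np
from scipy.optimize import minimize_scalar

def log_lik(lam, y):
    # Exponential log-likelihood: n*ln(lam) - lam*sum(y)
    return len(y) * np.log(lam) - lam * np.sum(y)

rng = np.random.default_rng(0)
y = rng.exponential(scale=1.0 / 1.3, size=200)   # true rate 1.3 (made up)
lam0 = 1.0                                       # hypothesized rate under H0

# Under H0 the parameter is fixed at lam0; under H1 we plug in the ML estimate
# (found numerically here, although lam_hat = 1/mean(y) exists in closed form).
res = minimize_scalar(lambda lam: -log_lik(lam, y),
                      bounds=(1e-6, 100.0), method="bounded")
lam_hat = res.x

log_glr = log_lik(lam_hat, y) - log_lik(lam0, y)   # ln Lambda_GL(y)
decide_H1 = 2.0 * log_glr > 3.84                   # chi-square(1), ~5% false-alarm level
print(lam_hat, 2.0 * log_glr, decide_H1)
```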

  4. Two-Sided Gaussian Mean Example (1) • Observe i.i.d. $y_i \sim \mathcal{N}(\mu, \sigma^2)$, $i = 1, \dots, n$, with $\sigma^2$ known; test $H_0: \mu = 0$ vs. $H_1: \mu \neq 0$ • Under $H_1$, the ML estimate is $\hat\mu = \bar{y}$; substituting it into the likelihood ratio gives $\ln \Lambda_{GL}(y) = \dfrac{n\bar{y}^2}{2\sigma^2}$, so the GLRT thresholds $|\bar{y}|$

  5. Two-Sided Gaussian Mean Example (2) • Same as the LMPU test from last lecture! • Chapter 9 of Hero derives and analyzes the GLRT for every conceivable Gaussian problem – a fantastic reference!
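
A hedged numerical sketch of this example (my own code, assuming the standard setup of known variance and $H_0: \mu = 0$): the GLRT reduces to comparing $\sqrt{n}\,|\bar{y}|/\sigma$ against a two-sided Gaussian threshold, exactly as the LMPU test does.

```python
# Two-sided Gaussian mean GLRT with known variance (illustrative sketch).
import numpy as np
from scipy.stats import norm

sigma = 1.0
rng = np.random.default_rng(1)
y = rng.normal(loc=0.4, scale=sigma, size=50)    # made-up data with a nonzero mean
n = len(y)

# Plugging mu_hat = mean(y) into the likelihood ratio gives a monotone
# function of |mean(y)|, so we threshold the normalized statistic below.
z = np.sqrt(n) * np.abs(np.mean(y)) / sigma
alpha = 0.05
threshold = norm.ppf(1.0 - alpha / 2.0)          # two-sided threshold for false-alarm rate alpha
print(z, z > threshold)
```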

  6. Gaussian Performance Comparison • We take a performance hit from not knowing the true mean (graph from p. 95 of Van Trees, Vol. I)
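
Since the graph itself is not reproduced here, the following Monte Carlo sketch (my own, with made-up parameters) illustrates the same point: at a fixed false-alarm rate, the two-sided GLRT detects slightly less often than a clairvoyant one-sided test that knows the sign of the mean.

```python
# Monte Carlo illustration of the performance hit (illustrative parameters).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
n, sigma, mu_true, alpha = 25, 1.0, 0.5, 0.05
trials = 20000

y = rng.normal(loc=mu_true, scale=sigma, size=(trials, n))
z = np.sqrt(n) * y.mean(axis=1) / sigma

pd_known = np.mean(z > norm.ppf(1 - alpha))             # clairvoyant one-sided test (knows mu > 0)
pd_glrt = np.mean(np.abs(z) > norm.ppf(1 - alpha / 2))  # two-sided GLRT
print(pd_known, pd_glrt)                                # GLRT detection probability is a bit lower
```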

  7. Some Gaussian Examples • Single population: • Tests on the mean, with unknown variance, yield “t-tests” • The statistic has a Student-t distribution • Asymptotically Gaussian • Two populations: • Tests on equality of variances, with unknown means, yield a “Fisher F-test” • The statistic has a Fisher-F distribution • Asymptotically Chi-Square • See Chapter 9 of Hero
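
For concreteness, here is how the two tests named above can be run with scipy on made-up data (my example; scipy has a one-liner for the t-test, while the F statistic for equal variances is formed directly).

```python
# t-test and F-test on synthetic data (illustrative sketch).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(loc=0.3, scale=1.0, size=30)
y = rng.normal(loc=0.0, scale=1.5, size=40)

# Single population: H0: mean = 0 with unknown variance -> Student-t statistic
t_stat, t_pval = stats.ttest_1samp(x, popmean=0.0)

# Two populations: H0: equal variances with unknown means -> Fisher-F statistic
f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)
dfx, dfy = len(x) - 1, len(y) - 1
f_pval = 2.0 * min(stats.f.cdf(f_stat, dfx, dfy), stats.f.sf(f_stat, dfx, dfy))
print(t_stat, t_pval, f_stat, f_pval)
```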

  8. Asymptotics to the Rescue (1) • If the GLRT is hard to analyze directly, sometimes asymptotic results can help • Suppose a UMP test exists. Since the ML estimates are asymptotically consistent, the GLRT (which converges to that test) is asymptotically UMP • Assume a partition $\theta = (\theta_1, \varphi)$, where $\theta_1$ holds the $p$ parameters being tested and $\varphi$ holds nuisance parameters

  9. Asymptotics to the Rescue (2) • Consider the GLRT for a two-sided problem, $H_0: \theta_1 = \theta_1^0$ vs. $H_1: \theta_1 \neq \theta_1^0$, where $\varphi$ is unknown, but we don’t care what it is • When the density $p(y;\theta)$ is smooth under $H_0$, it can be shown that for large $n$, $2\ln\Lambda_{GL}(y)$ is approximately $\chi^2_p$ (chi-square with $p$ degrees of freedom, where $p = \dim(\theta_1)$)
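
A simulation sketch of this asymptotic result (my own example): for the Gaussian mean test with the variance as an unknown nuisance parameter, $2\ln\Lambda_{GL}$ computed from data generated under $H_0$ should behave approximately like a $\chi^2_1$ variable.

```python
# Check the chi-square asymptotics of 2*ln(GLR) by simulation (illustrative).
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(4)
n, trials = 200, 10000
y = rng.normal(loc=0.0, scale=2.0, size=(trials, n))   # data generated under H0: mean = 0

var_H1 = y.var(axis=1)           # ML variance with the mean estimated
var_H0 = (y ** 2).mean(axis=1)   # ML variance with the mean fixed at 0
two_ln_lambda = n * np.log(var_H0 / var_H1)

# The empirical exceedance of the chi-square(1) 95% point should be near 0.05.
print(np.mean(two_ln_lambda > chi2.ppf(0.95, df=1)))
```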

  10. A Strange Link to Bayesianland • Remember, if we had a prior $p(\theta)$, we could handle composite hypothesis tests by integrating, $p(y|H_i) = \int p(y|\theta)\,p(\theta)\,d\theta$, and reducing things to a simple hypothesis test • If $p(\theta)$ varies slowly compared to $p(y|\theta)$ around the MAP estimate, we can approximate $\int p(y|\theta)\,p(\theta)\,d\theta \approx p(\hat\theta_{MAP}) \int p(y|\theta)\,d\theta$ • Suppose the MAP and ML estimates are approximately equal: $\hat\theta_{MAP} \approx \hat\theta_{ML} = \hat\theta$

  11. Laplace’s Approximation (1) • Do a Taylor series expansion of the log-likelihood around $\hat\theta$ (the first-order term vanishes at the ML estimate): $\ln p(y|\theta) \approx \ln p(y|\hat\theta) - \frac{1}{2}(\theta-\hat\theta)^T \hat{J}(\hat\theta)\,(\theta-\hat\theta)$, where $\hat{J}(\hat\theta) = -\nabla^2_\theta \ln p(y|\theta)\big|_{\theta=\hat\theta}$ is the empirical Fisher info

  12. Laplace’s Approximation (2) • Recognize the quadratic form of the Gaussian: $\int \exp\!\left[-\tfrac{1}{2}(\theta-\hat\theta)^T \hat{J}(\hat\theta)\,(\theta-\hat\theta)\right] d\theta = (2\pi)^{p/2}\,|\hat{J}(\hat\theta)|^{-1/2}$ • So $\int p(y|\theta)\,p(\theta)\,d\theta \approx p(y|\hat\theta)\,p(\hat\theta)\,(2\pi)^{p/2}\,|\hat{J}(\hat\theta)|^{-1/2}$
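
To make the approximation concrete, here is a numerical check (my own sketch, assuming a scalar Gaussian-mean model with a wide Gaussian prior): the Laplace formula above is compared with direct numerical integration of $\int p(y|\theta)\,p(\theta)\,d\theta$.

```python
# Laplace approximation vs. numerical integration (illustrative sketch).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

rng = np.random.default_rng(5)
sigma, tau = 1.0, 10.0                       # likelihood std, prior std (assumed values)
y = rng.normal(loc=0.7, scale=sigma, size=30)
n = len(y)

def log_lik(theta):
    return np.sum(norm.logpdf(y, loc=theta, scale=sigma))

theta_hat = y.mean()                         # ML estimate (close to MAP for this wide prior)
J_hat = n / sigma ** 2                       # empirical Fisher info for the Gaussian mean

# Laplace: integral ~ p(y|theta_hat) * p(theta_hat) * (2*pi)^(p/2) * |J_hat|^(-1/2), with p = 1
laplace = np.exp(log_lik(theta_hat)) * norm.pdf(theta_hat, scale=tau) * np.sqrt(2 * np.pi / J_hat)

exact, _ = quad(lambda t: np.exp(log_lik(t)) * norm.pdf(t, scale=tau),
                -10.0, 10.0, points=[theta_hat])
print(laplace, exact)                        # the two values should agree closely
```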

  13. Large Sample Sizes • Consider the log density of $n$ i.i.d. samples: $\ln p(y|\theta) = \sum_{i=1}^{n} \ln p(y_i|\theta)$ • By the law of large numbers, $\hat{J}(\hat\theta) = -\sum_{i=1}^{n} \nabla^2_\theta \ln p(y_i|\theta)\big|_{\theta=\hat\theta} \approx n\,J_1(\hat\theta)$, where $J_1$ is the Fisher info of a single sample, so $|\hat{J}(\hat\theta)|^{-1/2} \approx n^{-p/2}\,|J_1(\hat\theta)|^{-1/2}$

  14. Schwarz’s Result • As $n$ gets big, $\ln \int p(y|\theta)\,p(\theta)\,d\theta \approx \ln p(y|\hat\theta) - \dfrac{p}{2}\ln n$ (dropping the $O(1)$ terms) • Called Bayesian Information Criterion (BIC) or Schwarz Information Criterion (SIC) • Often used in model selection; the second term is a penalty on model complexity
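
A small model-order-selection sketch (my own example with made-up data): choose a polynomial degree by minimizing $\mathrm{BIC} = -2\ln p(y;\hat\theta) + p\ln n$, which is just $-2$ times the Schwarz approximation above.

```python
# Polynomial order selection via BIC (illustrative sketch).
import numpy as np

rng = np.random.default_rng(6)
n = 100
x = np.linspace(-1.0, 1.0, n)
y = 1.0 - 2.0 * x + 0.5 * x ** 2 + rng.normal(scale=0.3, size=n)   # true degree is 2

for degree in range(6):
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    sigma2_hat = np.mean(resid ** 2)                       # ML estimate of the noise variance
    max_log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2_hat) + 1.0)
    p = degree + 2                                         # coefficients plus the variance
    bic = -2.0 * max_log_lik + p * np.log(n)
    print(degree, round(bic, 1))                           # the minimum should land near degree 2
```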

  15. Minimum Description Length • BIC is related to Rissanen’s Minimum Description Length criterion: $(p/2)\ln n$ is viewed as the optimum number of “nats” (like bits, but base $e$) used to encode the ML parameter estimate with limited precision • The data are then encoded with a string of length $-\ln p(y;\hat\theta)$ nats, the number of nats needed to encode the data given the ML estimate • Choose the model which describes the data using the smallest number of bits (or nats)
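
A toy illustration of the two-part code-length view (my own example, not from the lecture): for a Bernoulli source, compare the description length of a fixed fair-coin model, which needs no parameter bits, against a model whose bias is estimated and must itself be encoded.

```python
# Two-part code lengths in nats for two Bernoulli models (illustrative sketch).
import numpy as np

rng = np.random.default_rng(7)
n = 500
y = rng.binomial(1, 0.5, size=n)         # data actually generated by a fair coin

# Model 0: fixed fair coin, no parameters to encode
L0 = n * np.log(2.0)

# Model 1: Bernoulli(theta) with theta estimated, so p = 1 parameter to encode
theta_hat = y.mean()
nll = -(y.sum() * np.log(theta_hat) + (n - y.sum()) * np.log(1.0 - theta_hat))
L1 = nll + 0.5 * np.log(n)               # data code length + (p/2) ln(n) for the estimate

print(L0, L1)    # the fair-coin model should usually give the shorter description here
```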

  16. References • A.R. Barron, J. Rissanen, B. Yu, “The Minimum Description Length Principle in Coding and Modeling,” IEEE Trans. Info. Theory, Vol. 44, No. 6, Oct. 1998, pp. 2743-2760. • A.D. Lanterman, “Schwarz, Wallace, and Rissanen: Intertwining Themes in Theories of Model Order Estimation,” International Statistical Review, Vol. 69, No. 2, August 2001, pp. 185-212. • Special Issues: • Statistics and Computing (Vol. 10, No. 1, 2000) • The Computer Journal (Vol. 42, No. 4, 1999)
