
STATS 730: Lecture 5


mglen

Presentation Transcript


  1. STATS 730: Lecture 5 Sufficiency! Today’s lecture: 730 Lectures 5&6

  2. Theme for the next few lectures • The estimation problem: • We sample data from a population. • Data has joint density depending on a parameter θ that has a real-world interpretation.

  3. The Estimation Problem: • Given a sample X1,…,Xn with joint density f(x1,…,xn;θ), how should we combine the data to estimate θ? I.e. what statistic S(X) should we use as an estimate? • What considerations are important? • Aside: Important special case: if the X’s are iid, then the joint density is the product of the marginal densities, f(x1,…,xn;θ) = f(x1;θ)⋯f(xn;θ).

  4. Important considerations • Small bias (mean of sampling distribution close to θ) • Small standard error (std dev of sampling distribution small) • Estimate should use “all the information in the data”: sufficiency

  5. Today’s lecture: Sufficiency • The Concept • The Definition • The factorisation theorem • Examples

  6. Sufficiency: the concept • Suppose X1,…,Xn have joint density f(x;θ) where the value of θ is unknown. • We have a statistic S (i.e. a function of the sample). • How much information about θ is contained in the statistic S?

  7. Sufficiency: the concept • Suppose the sample is normal with unknown mean μ and known variance 1. • How much information about μ is contained in the sample mean? • How about the sample variance? Intuitively, the sample mean has more information.

  8. Sufficiency: the concept • Suppose the population is Poisson, with unknown mean θ. Then the population variance is also θ. • Consider two statisticians, A and B, who want to estimate θ. A gets to look at the whole sample, while B only gets to see the sample mean. Is A better off than B?

  9. Sufficiency: the concept • If A gets to look at the whole sample, and B only gets to see the sample variance, is A better off than B?
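
One way to get a feel for the questions on slides 8 and 9 is a small simulation. The sketch below is only an illustration, not part of the lecture: it assumes a Poisson population with θ = 3, samples of size 20, and a hand-rolled sampler (`poisson_draw`, a hypothetical helper using Knuth's multiplication method). It estimates the bias and standard error of the sample mean and the sample variance, both of which target θ in the Poisson case.

```python
import math
import random
import statistics


def poisson_draw(rng, theta):
    """Draw one Poisson(theta) value using Knuth's multiplication method."""
    limit = math.exp(-theta)
    count, running_product = 0, rng.random()
    while running_product > limit:
        count += 1
        running_product *= rng.random()
    return count


def compare_estimators(theta=3.0, n=20, reps=2000, seed=730):
    """Return (bias, standard error) of the sample mean and sample variance."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        sample = [poisson_draw(rng, theta) for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))
    return {
        "mean": (statistics.fmean(means) - theta, statistics.stdev(means)),
        "variance": (statistics.fmean(variances) - theta,
                     statistics.stdev(variances)),
    }


results = compare_estimators()
for name, (bias, std_error) in results.items():
    print(f"sample {name}: bias {bias:+.3f}, standard error {std_error:.3f}")
```

Under these assumptions both statistics are centred near θ, but the sample variance has a much larger standard error, which is consistent with the intuition that a statistician who sees only the sample variance is worse off.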

  10. Sufficiency: the concept • Sticking with the Poisson example, suppose A gets to see 100 random numbers. Clearly A hasn’t got any information about θ. Why? • Because the distribution of the 100 random numbers is uniform[0,1], which does not depend on θ.

  11. Sufficiency: the concept • Still sticking with the Poisson example, suppose A gets to see the mean of the Poisson sample. Then, later, A gets to see the whole sample. What does A get to see the second time? • Answer: an observation from the conditional distribution of the sample, given the mean.

  12. Sufficiency: the concept • If this conditional distribution has θ as a parameter, then A has gained some information. • If the conditional distribution does not involve θ (i.e. θ is not a parameter), then A gets no further information by observing the whole sample.

  13. Sufficiency: the definition • A statistic S is sufficient for a parameter θ if the conditional distribution of X1,…,Xn given S does not involve θ.
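
The definition can be checked by brute force in a small discrete case. The following is a minimal sketch, assuming an iid Poisson(θ) sample of size 3 and conditioning on the sum being 4 (equivalent to conditioning on the mean); the function names and the two values of θ are chosen purely for illustration. It computes the exact conditional distribution of the sample for two very different values of θ and compares them.

```python
from itertools import product
from math import exp, factorial, prod


def poisson_pmf(k, theta):
    """P(X = k) for X ~ Poisson(theta)."""
    return exp(-theta) * theta**k / factorial(k)


def conditional_given_sum(theta, n, total):
    """Exact P(X = x | sum(X) = total) for an iid Poisson(theta) sample."""
    outcomes = [x for x in product(range(total + 1), repeat=n)
                if sum(x) == total]
    joint = {x: prod(poisson_pmf(k, theta) for k in x) for x in outcomes}
    normaliser = sum(joint.values())  # equals P(sum(X) = total)
    return {x: p / normaliser for x, p in joint.items()}


cond_small = conditional_given_sum(theta=0.5, n=3, total=4)
cond_large = conditional_given_sum(theta=6.0, n=3, total=4)
max_gap = max(abs(cond_small[x] - cond_large[x]) for x in cond_small)
print(f"{len(cond_small)} outcomes, largest difference {max_gap:.2e}")
```

The two conditional distributions agree up to floating-point rounding, which is what sufficiency of the sum (and hence the mean) requires in this example.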

  14. Example 1: the Poisson distribution again • The mean is sufficient, because the conditional distribution of X1,…,Xn given the sample mean does not involve θ. To show this…

  15. Example 1 (cont) … which does not involve θ!!!!
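
The calculation behind this conclusion can be sketched as follows. This is a reconstruction under the usual assumptions (X1,…,Xn iid Poisson with mean θ, S = ΣXi, so that S is Poisson with mean nθ), and conditioning on S is the same as conditioning on the sample mean. For any x1,…,xn with Σxi = s:

```latex
\begin{align*}
P(X_1=x_1,\dots,X_n=x_n \mid S=s)
  &= \frac{P(X_1=x_1,\dots,X_n=x_n)}{P(S=s)} \\
  &= \frac{\prod_{i=1}^{n} e^{-\theta}\,\theta^{x_i}/x_i!}
          {e^{-n\theta}\,(n\theta)^{s}/s!} \\
  &= \frac{s!}{x_1!\cdots x_n!}\,\frac{1}{n^{s}} .
\end{align*}
```

The factors of e^{-nθ} and θ^s cancel, leaving a multinomial probability with equal cell probabilities 1/n, in which θ no longer appears.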

  16. Example 2 • Consider X1,…,Xn having independent binary distributions P(Xi=1)=θ, P(Xi=0)=1-θ. • ΣiXi is sufficient for θ:

  17. Example 2 (cont)
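
A sketch of the argument for Example 2, assuming the standard fact that S = ΣXi is Binomial(n, θ). For any binary x1,…,xn with Σxi = s:

```latex
\begin{align*}
P(X_1=x_1,\dots,X_n=x_n \mid S=s)
  &= \frac{\theta^{s}(1-\theta)^{n-s}}
          {\binom{n}{s}\,\theta^{s}(1-\theta)^{n-s}}
   = \frac{1}{\binom{n}{s}} .
\end{align*}
```

Given S = s, every arrangement of s ones among the n positions is equally likely, whatever the value of θ.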

  18. Factorisation theorem • Conditional distributions can be tricky – is there an easier way? • Yes: use the factorisation theorem

  19. Factorisation theorem • S is sufficient for θ if and only if the joint density f(x;θ) of the observations can be written as A(S(x);θ)B(x), where B does not depend on θ. This is the “factorisation”!!

  20. Factorisation theorem (example) • For example, the joint density of an iid Poisson sample is f(x;θ) = θ^(Σxi) e^(-nθ) / (x1!⋯xn!). Take S(x) = Σxi, A(S;θ) = θ^S e^(-nθ) and B(x) = 1/(x1!⋯xn!).
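
One consequence of the factorisation is easy to check numerically: if two samples x and x′ have the same value of S, the ratio f(x;θ)/f(x′;θ) equals B(x)/B(x′) and so cannot change with θ. The sketch below assumes an iid Poisson(θ) sample, for which B(x) = 1/(x1!⋯xn!); the two samples and the grid of θ values are arbitrary illustrative choices.

```python
from math import exp, factorial, prod


def poisson_joint(x, theta):
    """Joint pmf of an iid Poisson(theta) sample x."""
    return prod(exp(-theta) * theta**k / factorial(k) for k in x)


x_first = (0, 2, 5)    # sum is 7
x_second = (3, 3, 1)   # sum is also 7
thetas = [0.3, 1.0, 2.5, 8.0]

# Ratio of joint densities at each theta; the A(S; theta) factors cancel.
ratios = [poisson_joint(x_first, t) / poisson_joint(x_second, t)
          for t in thetas]

# B(x_first) / B(x_second), computed without any reference to theta.
b_ratio = (prod(factorial(k) for k in x_second)
           / prod(factorial(k) for k in x_first))

for t, r in zip(thetas, ratios):
    print(f"theta = {t}: ratio = {r:.6f} (B ratio = {b_ratio:.6f})")
```

Repeating the experiment with two samples whose sums differ gives ratios that do vary with θ, since the A factors no longer cancel.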

  21. Factorisation theorem: proof of continuous version • The hard bit! • The conditional distribution of X1,…,Xn given a statistic S(X) is hard to compute. • So…..

  22. Factorisation theorem: proof of continuous version • Suppose we can find a 1-to-1 function g that maps X1,…,Xn onto Y1,…,Yn, such that Y1=S(X). • Y and X contain the same information about θ, since if we know X we know Y and vice versa. • Thus, S is sufficient for θ iff the conditional distribution of Y2,…,Yn given Y1 does not involve θ.

  23. Factorisation theorem: continuous version • Notation: • y=g(x) • x=h(y) (i.e. h is the inverse of g) • S(h(y))=y1

  24. Factorisation theorem: proof of continuous version • Now we prove the theorem. • Suppose the conditional distribution of Y2,…,Yn given Y1 does not involve θ. We will show the factorisation holds, with a second factor that has no θ!!
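
This direction of the proof can be sketched as follows. The notation is an assumption made for the sketch: g is taken to be differentiable with Jacobian determinant J_g(x), g_2,…,g_n denote the remaining components of g, and the conditional density of Y2,…,Yn given Y1 is written without θ because, by hypothesis, it does not involve θ.

```latex
\begin{align*}
f_Y(y;\theta)
  &= f_{Y_1}(y_1;\theta)\, f_{Y_2,\dots,Y_n \mid Y_1}(y_2,\dots,y_n \mid y_1), \\
f_X(x;\theta)
  &= f_Y\bigl(g(x);\theta\bigr)\,\lvert J_g(x)\rvert \\
  &= \underbrace{f_{Y_1}\bigl(S(x);\theta\bigr)}_{A(S(x);\,\theta)}\;
     \underbrace{f_{Y_2,\dots,Y_n \mid Y_1}\bigl(g_2(x),\dots,g_n(x) \mid S(x)\bigr)\,
                 \lvert J_g(x)\rvert}_{B(x)} .
\end{align*}
```

The first factor depends on x only through S(x), and the second contains no θ, which is the required factorisation.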

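  25. Factorisation theorem: proof of continuous version • Now conversely suppose the factorisation is true. We show the conditional distribution of Y2,…,Yn given Y1 does not involve θ. • Step 1: by the change of variable formula, the joint density of Y is fY(y;θ) = fX(h(y);θ)|J(y)| = A(y1;θ)B(h(y))|J(y)|, where J(y) is the Jacobian determinant of h and we have used S(h(y))=y1.

  26. Factorisation theorem: proof of continuous version • Step 2: the marginal density of Y1 is obtained by integrating out y2,…,yn: fY1(y1;θ) = A(y1;θ) ∫ B(h(y))|J(y)| dy2…dyn.

  27. Factorisation theorem: proof of continuous version • Step 3: the conditional distribution is the ratio of the joint density to the marginal density. The factor A(y1;θ) cancels, leaving B(h(y))|J(y)| / ∫ B(h(y))|J(y)| dy2…dyn. No θ!!!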

  28. Factorisation theorem: example • Normal N(θ,1). The mean is sufficient. • The joint density splits into a factor that depends on the data only through the sample mean and a factor with no θ!!
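
A worked version of the normal example, assuming X1,…,Xn iid N(θ,1) and writing x̄ for the sample mean. Expanding the square in the exponent gives:

```latex
\begin{align*}
f(x;\theta)
  &= (2\pi)^{-n/2}\exp\Bigl\{-\tfrac12\sum_{i=1}^{n}(x_i-\theta)^2\Bigr\} \\
  &= \underbrace{\exp\Bigl\{n\theta\bar{x}-\tfrac12 n\theta^{2}\Bigr\}}_{A(\bar{x};\,\theta)}\;
     \underbrace{(2\pi)^{-n/2}\exp\Bigl\{-\tfrac12\sum_{i=1}^{n}x_i^{2}\Bigr\}}_{B(x)} .
\end{align*}
```

Only the first factor involves θ, and it sees the data only through x̄.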

  29. Factorisation theorem: example • Any iid sample. The order statistics are sufficient. • The joint density is f(x1;θ)⋯f(xn;θ) = f(x(1);θ)⋯f(x(n);θ), since reordering the factors does not change the product. Function of the x(i) alone!

  30. Factorisation theorem: example • An iid exponential sample. The mean is sufficient, since the joint density depends on the data only through the sum of the observations.
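
A sketch of the exponential case. The slide does not say how the exponential is parameterised, so the rate parameterisation f(x;θ) = θe^{-θx} for x > 0 is assumed here:

```latex
\begin{align*}
f(x;\theta)
  &= \prod_{i=1}^{n}\theta e^{-\theta x_i}
   = \theta^{n}\exp\Bigl\{-\theta\sum_{i=1}^{n}x_i\Bigr\}
   = \underbrace{\theta^{n}e^{-n\theta\bar{x}}}_{A(\bar{x};\,\theta)}
     \cdot \underbrace{1}_{B(x)},
  \qquad x_1,\dots,x_n > 0 .
\end{align*}
```

If θ is instead taken to be the mean, the same steps give A(x̄;θ) = θ^{-n}e^{-n x̄/θ}, and the conclusion is unchanged.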
