45-733: Lecture 8 (Chapter 7): Point Estimation

Presentation Transcript


  1. 45-733: lecture 8 (chapter 7) Point Estimation William B. Vogt, Carnegie Mellon, 45-733

  2. Samples from populations • There is some population we are interested in: • Families in the US • Products coming off our assembly line • Consumers in our product’s market segment • Employees

  3. Samples from populations • We are interested in some quantitative information (called variables) about these populations: • Income of families in the US • Defects in products coming off our assembly line • Perception of consumers of our product • Productivity of our employees

  4. Samples from populations • All the information (accessible to statistics) about a quantity in a population is contained in its distribution function • Real-world distribution functions are complicated things • In real life, we usually know little or nothing about the distribution functions of the variables we are interested in

  5. Samples from populations • Because distribution functions are complex, we only try to find out about certain aspects of them (parameters): • Average income of families in the US • Rate of defects coming off our production line • % of customers who view our product favorably • Average pieces/hour finished by a worker

  6. Samples from populations • Of course, we do not begin by knowing even these quantities • One possibility is to measure the whole population • Allows us to answer any question about the distribution or parameters, using the techniques of chapter 2 • However, this is almost always expensive and often infeasible

  7. Samples from populations • Instead, we take a sample • Taking a sample • We select only a few of the members of the population • We measure the variables of interest for those members we select • Examples • Phone survey • Take 1 out of each 10,000 units off our production line
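As a rough sketch of the second example, one way to draw a simple random sample from a population in code; the population list, seed, and sample size below are made-up assumptions, not from the slides:

```python
import random

# Hypothetical population: unit IDs coming off a production line.
population = list(range(1, 100_001))

random.seed(0)                           # for reproducibility only
sample = random.sample(population, 10)   # simple random sample, without replacement
print(sample)
```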

  8. Samples from populations • The whole of statistics is figuring out what we can learn about the population from a sample: • What can we say about the distribution of a variable from the information in a sample? • What can we say about the parameters we are interested in from our sample? • How good is the information in our sample about the population?

  9. Samples and statistics • As a practical matter, we are usually interested in using our sample to say something about a parameter of the distribution we care about • To get at this parameter, we construct a variable called an estimator or statistic

  10. Sample and estimator • An estimate is an informed guess at the value of a parameter • An estimator is an algorithm or rule for turning samples into informed guesses about the value of a parameter • An estimator is an algorithm for turning samples into estimates
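A minimal sketch of this distinction; the function name and the numbers below are illustrative only:

```python
# The estimator: a fixed rule for turning any sample into a guess at the mean.
def sample_mean(sample):
    return sum(sample) / len(sample)

# An estimate: the particular number the rule produces for one particular sample.
salaries = [48, 61, 55]            # an illustrative (made-up) sample, in thousands
estimate = sample_mean(salaries)   # just a number, not a random variable
print(estimate)
```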

  11. Sample and estimator • Example: • We are benchmarking our compensation policies for our salesforce • Therefore, we are interested in how much salespeople who work in similar jobs for similar companies are paid • Naturally, they are not all paid the same • There is a distribution of salaries among these salespeople

  12. Sample and estimator • Example: • We don’t need or want to know exactly how much each and every one of these comparable people is paid • We don’t need or want to know the exact distribution of pay for this job

  13. Sample and estimator • Example: • We do need and want to know some basic facts about pay in this job. For example: • What is the mean salary? • What is the median salary? • What is the standard deviation of salary? • What is the 25th percentile of salary? • What is the 75th percentile of salary? • How is salary related to: • Experience? • Typical hours? Travel requirements? • Job responsibilities? Etc.
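Each of these questions has a sample counterpart that is easy to compute. A sketch using the six salaries that appear later on slide 18 (in thousands); the standard library calls do the work:

```python
import statistics

salaries = [55, 62, 43, 77, 89, 61]   # the sample from slide 18, in thousands

print("sample mean:    ", statistics.mean(salaries))
print("sample median:  ", statistics.median(salaries))
print("sample std dev: ", statistics.stdev(salaries))

# quantiles(n=4) returns the three quartiles: 25th, 50th, and 75th percentiles.
q25, _, q75 = statistics.quantiles(salaries, n=4)
print("25th percentile:", q25)
print("75th percentile:", q75)
```

Whether these sample quantities are good guesses at the corresponding population quantities is exactly the question the rest of the lecture takes up.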

  14. Sample and estimator • Example: • Each of these things can be regarded as a parameter, either of the distribution of salaries or of the joint distribution of salary and other variables • Let’s focus on mean salary • We take a sample of salaries: {s1, s2, …, sn} • How can we get an estimate of E(s) = μs?

  15. Sample and estimator • Example: • Let’s focus on mean salary, E(s) = μs • There is a TRUE value of μs • This value is fixed (non-random) • It is just a number, like $47,432.81 • We wish to know it • Knowing it exactly would be nice • If we can’t know it exactly, a good guess would be useful.

  16. Sample and estimator • Example: • Let’s focus on mean salary • We take a sample of salaries: {s1, s2, …, sn} • The sample mean s̄ = (s1 + s2 + … + sn)/n is an estimator • s̄ tells us what to do with a sample to turn it into a guess at the (population) mean salary

  17. Sample and estimator • Example: • Let’s focus on mean salary • We take a sample of salaries: {s1, s2, …, sn} • s̄ is an estimator • s̄ is a random variable with a distribution function of its own • The distribution of s̄ depends on the distribution of the underlying s

  18. Sample and estimator • Example: • Let’s focus on mean salary • Suppose our sample is (in thousands): {55, 62, 43, 77, 89, 61} • Then our estimate would be: s̄ = (55 + 62 + 43 + 77 + 89 + 61)/6 = 387/6 = 64.5

  19. Sample and estimator • Example: • Let’s focus on mean salary • Suppose our sample is (in thousands): {45, 52, 33, 67, 79, 51} • Then our estimate would be: s̄ = (45 + 52 + 33 + 67 + 79 + 51)/6 = 327/6 = 54.5

  20. Sample and estimator • Example: • Let’s focus on mean salary • In both cases, the estimator is the sample mean, s̄ = (s1 + s2 + … + sn)/n • But in one case the estimate is 64.5 and in the other the estimate is 54.5

  21. Sample and estimator • A key distinction: estimator vs. estimate • An estimate is a guess, based on a sample, at the value of a parameter • It is a number, not random • It is different for each sample, and depends on the sample • An estimator is an algorithm, a rule, a formula for turning a sample into an estimate • It is a random variable • Its distribution depends only on the distribution of the underlying variable • It is exactly the same from sample to sample

  22. Sample and estimator • Review: • We wish to know about (some quantity) in a population • The distribution of the quantity = complete knowledge • A parameter of the distribution = a summary of the info in the distribution • An estimate is a guess at a parameter based on the information in a sample • An estimator is a way of turning samples into guesses

  23. All estimators are created equal? • NOT! • What makes for a good estimator? • What makes for a good guess? • Being exactly right all the time (can’t be done) • Being close to right, making few/small mistakes • Being right on average • Improving as the sample size grows

  24. All estimators are created equal? • There is a parameter we want to know, let’s call it θ. It has a true value that we don’t know. • We have an estimator, call it θ̂1, which has some distribution. • We have another estimator, call it θ̂2, which has some (other) distribution • How can we know which of these two is better than the other?

  25. All estimators are created equal? • Some examples of estimators for E(s) = μs • The sample mean: s̄ = (s1 + s2 + … + sn)/n

  26. All estimators are created equal? • Some examples of estimators for E(s) = μs • The sample mean plus one: s̄ + 1

  27. All estimators are created equal? • Some examples of estimators for E(s) = μs • The first observation: s1

  28. All estimators are created equal? • Some examples of estimators for E(s) = μs • Roll a die and use the number of spots: the estimate is whatever the die shows (1 through 6), ignoring the sample entirely

  29. All estimators are created equal? • Some examples of estimators for E(s) = μs • Seven: always guess 7, ignoring the sample entirely
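A sketch of these five candidates written as rules (functions); the function names are mine, not from the slides:

```python
import random

# Five candidate estimators of the population mean, each a rule that maps
# a sample to a guess.
def sample_mean(sample):            # slide 25
    return sum(sample) / len(sample)

def sample_mean_plus_one(sample):   # slide 26
    return sample_mean(sample) + 1

def first_observation(sample):      # slide 27
    return sample[0]

def die_roll(sample):               # slide 28: ignores the sample entirely
    return random.randint(1, 6)

def seven(sample):                  # slide 29: ignores the sample entirely
    return 7
```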

  30. All estimators are created equal? • Some examples of estimators for E(s) = μs • It should be clear that the sample mean is the best of these estimators • We want to develop objective criteria for evaluating estimators which allow us to conclude, for example, that the sample mean is the best of these estimators

  31. All estimators are created equal? • Consider the distribution of the sample mean:

  32. All estimators are created equal? • Compared to the distribution of the sample mean plus one, s̄ + 1:

  33. All estimators are created equal? • Why do we like the distribution of the sample mean better? • It is centered on the true value, μs • The estimator (the random variable) is more often close to the truth, μs

  34. All estimators are created equal? • Consider the distribution of the sample mean:

  35. All estimators are created equal? • Compare to the distribution of the first observation:

  36. All estimators are created equal? • Why do we like the distribution of the sample mean better? • Now, both are centered on the true value, μs • The sample mean is more often close to the truth, μs • This time because it has smaller variance

  37. All estimators are created equal? • Consider the distribution of the sample mean:

  38. All estimators are created equal? • Compare to the distribution of seven

  39. All estimators are created equal? • Why do we like the distribution of the sample mean better? • Sample mean is centered on the true value, μs, no matter what the true value is • The estimator “seven” is only centered on the true value if the true value happens to be μs = 7 • Similarly, the sample mean is close to the true value more often unless the true value is very close to seven
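The picture-based comparisons on slides 31 through 39 can be approximated by simulation. This is only a sketch: the normal salary distribution, the true mean of 60, the sample size of 6, and the number of repetitions are assumptions, not from the slides:

```python
import random
import statistics

random.seed(1)
true_mean, true_sd, n, reps = 60.0, 15.0, 6, 10_000   # assumed settings

estimators = {
    "sample mean":          lambda s: sum(s) / len(s),
    "sample mean plus one": lambda s: sum(s) / len(s) + 1,
    "first observation":    lambda s: s[0],
    "die roll":             lambda s: random.randint(1, 6),
    "seven":                lambda s: 7,
}

for name, estimator in estimators.items():
    estimates = []
    for _ in range(reps):
        # Draw a fresh sample and record the estimate it produces.
        sample = [random.gauss(true_mean, true_sd) for _ in range(n)]
        estimates.append(estimator(sample))
    print(f"{name:22s} mean of estimates = {statistics.mean(estimates):6.2f}  "
          f"sd of estimates = {statistics.pstdev(estimates):5.2f}")
```

Run this way, the sample mean and the first observation both average out near the true mean, but the sample mean has much less spread; the sample mean plus one is shifted up by 1; and the die roll and the constant 7 bear no relation to the true value, matching the argument on these slides.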

  40. All estimators are created equal? • Recall • In general, we are trying to estimate a parameter whose value we do not know, θ • We have a proposed estimator, θ̂1 • We have another proposed estimator, θ̂2 • We want to know which is better • So, we need some criteria to use to compare estimators

  41. All estimators are created equal? • The simplest criterion is “an estimator is good if it is always right” • But a parameter is just a fixed number, like 62. • An estimator is a random variable, so it can take on many values • So, practically no estimator will be good by this criterion. • We must lower our standards!

  42. All estimators are created equal? • Bias and unbiasedness • Since estimators are random variables, we can think about their expectations • We are going to say that an estimator θ̂ of a parameter θ is unbiased if: E(θ̂) = θ

  43. All estimators are created equal? • Bias and unbiasedness • An estimator is unbiased if it is (always) “right on average” • An unbiased estimator is “not systematically wrong”

  44. All estimators are created equal? • Bias and unbiasedness • The bias of an estimator is defined as: Bias(θ̂) = E(θ̂) − θ • Obviously, an unbiased estimator has a bias equal to zero

  45. All estimators are created equal? • Bias and unbiasedness • The sample mean is unbiased • The sample mean plus one is biased • The sample mean plus one has a bias of 1 • This is why we like the sample mean better than the sample mean plus one • Sample mean is better than sample mean plus one on the bias criterion
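Both claims on this slide follow from linearity of expectation; a short standard derivation (not shown on the slides):

```latex
\[
E(\bar{s})
  = E\!\left(\frac{1}{n}\sum_{i=1}^{n} s_i\right)
  = \frac{1}{n}\sum_{i=1}^{n} E(s_i)
  = \frac{1}{n}\, n\,\mu_s
  = \mu_s ,
\qquad
E(\bar{s}+1) = \mu_s + 1 ,
\quad\text{so}\quad
\operatorname{Bias}(\bar{s}+1) = E(\bar{s}+1) - \mu_s = 1 .
\]
```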

  46. All estimators are created equal? • Some unbiased estimators: • The sample mean for the population mean • The sample variance for the population variance • The sample proportion for the population proportion

  47. All estimators are created equal? • Some biased estimators: • The sample standard deviation for the population standard deviation • The sample median for the population median • Sample percentiles for population percentiles
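A simulation sketch of the contrast between the first items on these two lists; the normal population, its standard deviation of 10, the sample size of 5, and the number of repetitions are assumptions for illustration only:

```python
import random
import statistics

random.seed(2)
true_sd, n, reps = 10.0, 5, 20_000        # assumed population settings

var_estimates, sd_estimates = [], []
for _ in range(reps):
    sample = [random.gauss(0.0, true_sd) for _ in range(n)]
    var_estimates.append(statistics.variance(sample))   # usual n-1 sample variance
    sd_estimates.append(statistics.stdev(sample))       # sample standard deviation

print("average sample variance:", statistics.mean(var_estimates), "vs population variance", true_sd**2)
print("average sample std dev: ", statistics.mean(sd_estimates), "vs population std dev", true_sd)
```

The average sample variance should land close to 100 (the population variance), while the average sample standard deviation typically comes out noticeably below 10, which is the bias slide 47 refers to.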

  48. All estimators are created equal? • Variance (efficiency) • Suppose we are comparing two unbiased estimators, θ̂1 and θ̂2 • We say that θ̂1 is more efficient than θ̂2 if: Var(θ̂1) < Var(θ̂2)

  49. All estimators are created equal? • Variance (efficiency)

  50. All estimators are created equal? • Variance (efficiency) • We like the sample mean better than the first observation because its variance is lower: Var(s̄) = σs²/n, which is less than Var(s1) = σs² whenever n > 1
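The variance comparison uses the independence of the observations (so the variance of a sum is the sum of the variances); a short standard derivation:

```latex
\[
\operatorname{Var}(\bar{s})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} s_i\right)
  = \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(s_i)
  = \frac{n\,\sigma_s^{2}}{n^{2}}
  = \frac{\sigma_s^{2}}{n}
  \;<\; \sigma_s^{2} = \operatorname{Var}(s_1)
  \quad\text{for } n > 1 .
\]
```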
