
MGMT 276: Statistical Inference in Management Spring, 2014



Presentation Transcript


  1. MGMT 276: Statistical Inference in Management, Spring, 2014. Welcome. Green sheets.

  2. Please click in. My last name starts with a letter somewhere between:
     A. A – D
     B. E – L
     C. M – R
     D. S – Z

  3. Please read before our next exam (March 25th):
     • Chapters 5 - 11 in Lind book
     • Chapters 10, 11, 12 & 14 in Plous book
     Lind
     • Chapter 5: Survey of Probability Concepts
     • Chapter 6: Discrete Probability Distributions
     • Chapter 7: Continuous Probability Distributions
     • Chapter 8: Sampling Methods and CLT
     • Chapter 9: Estimation and Confidence Interval
     • Chapter 10: One sample Tests of Hypothesis
     • Chapter 11: Two sample Tests of Hypothesis
     Plous
     • Chapter 10: The Representativeness Heuristic
     • Chapter 11: The Availability Heuristic
     • Chapter 12: Probability and Risk
     • Chapter 14: The Perception of Randomness

  4. Use this as your study guide. By the end of lecture today (3/6/14):
     • Central Limit Theorem
     • Confidence Intervals
     • Logic of hypothesis testing
     • Steps for hypothesis testing
     • Levels of significance (levels of alpha): What does p < 0.05 mean? What does p < 0.01 mean?

  5. Homework
     • Due Tuesday (March 11th): Please print and complete homework worksheet #11. Dan Gilbert Reading: Law of Large Numbers
     • Due Thursday (March 13th): Please print and complete homework worksheet #12: Calculating Confidence Intervals and Examples of Type I and Type II Errors

  6. Comments on Dan Gilbert Reading

  7. Review of Homework Worksheet, just in case of questions

  8. Homework review
     • 2/5 = .40
     • Based on a priori probability: all options equally likely; not based on previous experience or data
     • Based on expert opinion: we don't have previous data for these two companies merging together
     • Based on frequency data (percent of rockets that successfully launched)

  9. Homework review
     • Based on a priori probability: all options equally likely; not based on previous experience or data
     • 30/100 = .30
     • Based on frequency data (percent of times at bat that successfully resulted in hits)
     • Based on frequency data (percent of pages that are “fake”)

  10. Homework review
     • Based on frequency data (percent of students who chose to be Economics majors)
     • 5/50 = .10

  11. Homework review (mean = 50, standard deviation = 4)
      z scores and areas between the mean and z:
      • z = (44 - 50) / 4 = -1.5, area = .4332
      • z = (52 - 50) / 4 = +.5, area = .1915
      • z = (55 - 50) / 4 = +1.25, area = .3944
      Answers:
      • Between 44 and 55: .4332 + .3944 = .8276
      • Between 52 and 55: .3944 - .1915 = .2029
      • Beyond 55: .5000 - .3944 = .1056

  12. Homework review (mean = 2708, standard deviation = 650)
      z scores and areas between the mean and z:
      • z = (2500 - 2708) / 650 = -0.32, area = .1255
      • z = (3000 - 2708) / 650 = 0.45, area = .1736
      • z = (3500 - 2708) / 650 = 1.22, area = .3888
      Answers:
      • Beyond 3,000: .5000 - .1736 = .3264
      • Between 2,500 and 3,500: .3888 + .1255 = .5143
      • Between 3,000 and 3,500: .3888 - .1736 = .2152

  13. Homework review (mean = 15, standard deviation = 3.5)
      z scores and areas between the mean and z:
      • z = (10 - 15) / 3.5 = -1.43, area = .4236
      • z = (20 - 15) / 3.5 = 1.43, area = .4236
      • z = (12 - 15) / 3.5 = -0.86, area = .3051
      Answers:
      • Area from z = ±1.43 across the mean to the far tail: .5000 + .4236 = .9236
      • Area in the tail beyond z = ±1.43: .5000 - .4236 = .0764
      • Between 10 and 12: .4236 - .3051 = .1185
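The table lookups in slides 11 to 13 can be cross-checked in code. What follows is a minimal Python sketch, not part of the original slides: it assumes the standard library's `statistics.NormalDist` as a stand-in for the printed z table, and the helper name `area_mean_to_z` is made up for illustration. It reworks the slide 11 numbers (mean 50, standard deviation 4). Exact areas can differ from the slide answers in the last decimal place because the slides add and subtract rounded table entries.

```python
from statistics import NormalDist

def area_mean_to_z(z):
    """Area under the standard normal curve between the mean (z = 0) and z."""
    return abs(NormalDist().cdf(z) - 0.5)

# Slide 11 values: mean = 50, standard deviation = 4
mean, sd = 50, 4
z_low = (44 - mean) / sd    # -1.5
z_mid = (52 - mean) / sd    # +0.5
z_high = (55 - mean) / sd   # +1.25

# Scores on opposite sides of the mean: add the two areas.
between_44_and_55 = area_mean_to_z(z_low) + area_mean_to_z(z_high)
# Scores on the same side of the mean: subtract the smaller area.
between_52_and_55 = area_mean_to_z(z_high) - area_mean_to_z(z_mid)
# Tail beyond a score: subtract its area from .5000.
above_55 = 0.5 - area_mean_to_z(z_high)

print(round(between_44_and_55, 3))
print(round(between_52_and_55, 3))
print(round(above_55, 3))
```

The same three patterns (add, subtract, or subtract from .5000) cover the answers on slides 12 and 13 as well.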

  14. Central Limit Theorem

  15. Sampling distribution for continuous distributions
      Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
      [Figure: Distribution of Raw Scores (Melvin, Eugene) and Sampling Distribution of Sample Means (2nd sample, 23rd sample)]

  16. Central Limit Theorem
      • Proposition 1: If sample size (n) is large enough (e.g. 100), the mean of the sampling distribution will approach the mean of the population. As n ↑, x̄ will approach µ.
      • Proposition 2: If sample size (n) is large enough (e.g. 100), the sampling distribution of means will be approximately normal, regardless of the shape of the population. As n ↑, the curve will approach a normal shape.
      • Proposition 3: The standard deviation of the sampling distribution equals the standard deviation of the population divided by the square root of the sample size. As n increases, SEM decreases. As n ↑, curve variability gets smaller.
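Propositions 1 and 3 can be checked with a small simulation. This is an illustrative Python sketch only, not part of the lecture: the skewed exponential population, the sample sizes, the number of draws, and the helper name `sample_means` are all assumptions chosen for the example. It compares the standard deviation of the simulated sample means with the population standard deviation divided by √n. It does not test Proposition 2 (normal shape) directly.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# A deliberately skewed (exponential) population; its mean and SD are both about 1.
population = [random.expovariate(1.0) for _ in range(100_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

def sample_means(n, draws=2_000):
    """Means of `draws` random samples (drawn with replacement), each of size n."""
    return [statistics.fmean(random.choices(population, k=n)) for _ in range(draws)]

results = {n: sample_means(n) for n in (4, 25, 100)}

for n, means in results.items():
    observed_sem = statistics.stdev(means)   # spread of the sample means
    predicted_sem = pop_sd / n ** 0.5         # Proposition 3: sigma / sqrt(n)
    print(n, round(statistics.fmean(means), 3),
          round(observed_sem, 3), round(predicted_sem, 3))
```

As n grows, the mean of the sample means should stay close to the population mean while the observed spread shrinks in step with σ / √n.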

  17. Central Limit Theorem: If random samples of a fixed N are drawn from any population (regardless of the shape of the population distribution), as N becomes larger, the distribution of sample means approaches normality, with the overall mean approaching the theoretical population mean.
      Animation for creating a sampling distribution of sample means: http://onlinestatbook.com/stat_sim/sampling_dist/index.html
      [Figure: Distribution of Raw Scores (Eugene, Melvin); Distribution of single sample; Sampling Distribution of Sample Means (mean for sample 7, mean for sample 12)]

  18. Writing Assignment: Writing a letter to a friend
      Imagine you have a good friend (pick one). This is a good friend whom you consider to be smart and interested in stuff generally. They are teaching themselves stats (hoping to test out of the class) but need your help on a couple of ideas. For this assignment please write your friend/mom/dad/favorite cousin a letter answering these two questions. (Feel free to use diagrams and drawings if you think that can help.)
      Dear Friend,
      1. I’m struggling with this whole Central Limit Theorem idea. Could you describe for me the difference between a distribution of raw scores and a distribution of sample means?
      2. I also don’t get the “three propositions of the Central Limit Theorem”. They all seem to address sample size, but I don’t get how sample size could affect these three things. If you could help explain it, that would be really helpful.


  20. Problem with point estimate
      • Examples: mean kids' IQ of 100; mean income of $35,000 a year; mean weight of 7 pounds.
      • Are we always right? No.
      • How close is our estimation? What other information about these distributions would we want to know? Variability!
      • Which of these distributions would allow our guess to be closest to what’s right?

  21. Standard Error of the Mean (SEM). Remember confidence intervals? Revisit Confidence Intervals
      Confidence Intervals (based on z): We use these to estimate a value, such as a population mean, with a range of values and a known degree of certainty.
      • The interval refers to possible values of the population mean.
      • We can be reasonably confident that the population mean falls in this range (90%, 95%, or 99% confident).
      • In the long run, a series of intervals like the one we figured out will capture the population mean about 95% of the time.
      • Greater confidence implies loss of precision (95% confidence is most often used).
      • You can generate a CI for any confidence level you want; these are just the most common.

  22. Confidence Intervals (based on z): A range of values that, with a known degree of certainty, includes an unknown population characteristic, such as a population mean
      How can we make our confidence interval smaller?
      • Increase sample size (this decreases the variability of the sample means, i.e. the standard error)
      • Decrease variability through more careful assessment and measurement practices (minimize noise)
      • Decrease level of confidence
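The three ways of shrinking a confidence interval can be compared numerically. This is a hedged Python sketch rather than anything from the slides: the function name `ci_width` and the baseline values (standard deviation 10, n = 100, 95% confidence) are chosen only for illustration, and `statistics.NormalDist().inv_cdf` stands in for a z-table lookup.

```python
from statistics import NormalDist

def ci_width(sd, n, confidence):
    """Full width of a z-based confidence interval for a mean."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. about 1.96 for 95%
    return 2 * z * sd / n ** 0.5

baseline = ci_width(sd=10, n=100, confidence=0.95)

larger_sample = ci_width(sd=10, n=400, confidence=0.95)     # increase sample size
less_variability = ci_width(sd=5, n=100, confidence=0.95)   # reduce measurement noise
lower_confidence = ci_width(sd=10, n=100, confidence=0.90)  # accept less confidence

for label, width in [("baseline", baseline),
                     ("larger sample", larger_sample),
                     ("less variability", less_variability),
                     ("lower confidence", lower_confidence)]:
    print(f"{label}: {width:.2f}")
```

Each of the three changes gives a narrower interval than the baseline. Note that quadrupling n only halves the width, because n enters through its square root.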

  23. Mean = 50, standard deviation = 10. Find the scores for the middle 95%.
      x = mean ± (z)(standard deviation)
      Please note: We will be using this same logic for “confidence intervals”.
      1) Go to the z table and find the z score for an area of .4750 (half of .9500): z = 1.96
      2) x = mean + (z)(standard deviation) = 50 + (-1.96)(10) = 30.4
      3) x = mean + (z)(standard deviation) = 50 + (1.96)(10) = 69.6
      Scores 30.4 - 69.6 capture the middle 95% of the curve.

  24. Confidence intervals. Mean = 50, standard deviation = 10, n = 100, s.e.m. = 1
      standard error of the mean = σ / √n = 10 / √100 = 1
      For “confidence intervals”: same logic, same z-score (.4750 on each side of the mean, .9500 in total), but we replace the standard deviation with the standard error of the mean.
      x = mean ± (z)(s.e.m.)
      x = 50 + (1.96)(1) = 51.96
      x = 50 + (-1.96)(1) = 48.04
      The 95% confidence interval is captured by the scores 48.04 – 51.96.
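The contrast between slides 23 and 24 (same z score, different measure of spread) can be sketched in a few lines of Python. This is an illustration under assumptions, not the course's own material: the variable names are arbitrary, and `statistics.NormalDist().inv_cdf(0.975)` is used in place of looking up the area .4750 in a z table.

```python
from statistics import NormalDist

mean, sd, n = 50, 10, 100

# z that leaves .4750 on each side of the mean (.9750 cumulative); about 1.96
z = NormalDist().inv_cdf(0.975)

# Slide 23: middle 95% of individual raw scores uses the standard deviation.
raw_low, raw_high = mean - z * sd, mean + z * sd

# Slide 24: 95% confidence interval for the mean uses the standard error.
sem = sd / n ** 0.5
ci_low, ci_high = mean - z * sem, mean + z * sem

print(f"middle 95% of raw scores: {raw_low:.1f} to {raw_high:.1f}")
print(f"95% confidence interval:  {ci_low:.2f} to {ci_high:.2f}")
```

With n = 100 the standard error is one tenth of the standard deviation, so the confidence interval is one tenth as wide as the range that holds the middle 95% of raw scores.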

  25. Mean = 121, standard deviation = 15, n = 25. Find a 95% confidence interval for this distribution.
      Please notice: We know the standard deviation and can calculate the standard error of the mean from it.
      standard error of the mean = σ / √n = 15 / √25 = 3
      raw score = mean ± (z score)(sem)
      x = 121 ± (1.96)(3) = 121 ± 5.88
      Confidence interval: (115.12, 126.88)
      In symbols: x̄ ± (z)(σx̄)
      [Figure: normal curve with axis marked 100, 110, 120, 130, 140]

  26. Confidence intervals. Hint: always draw a picture!
      standard error of the mean = σ / √n
      Mean = 50, standard error of the mean = 10
      • Tell me the scores that border exactly the middle 95% of the curve. We know this: raw score = mean ± (z score)(standard deviation)
      • Construct a 95 percent confidence interval around the mean. Similar, but uses the standard error of the mean: raw score = mean ± (z score)(standard error of the mean)

  27. Area outside the confidence interval is alpha. Area in the tails is called alpha. Area associated with the most extreme scores is called alpha.
      • Confidence interval of 90% has an alpha of 10% (α = .10)
      • Confidence interval of 95% has an alpha of 5% (α = .05)
      • Confidence interval of 99% has an alpha of 1% (α = .01)
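The link between confidence level, alpha, and the z score that borders the interval can be tabulated directly. A small Python sketch follows, offered as an illustration only: the function name `critical_z` is hypothetical, and the computation assumes a two-tailed interval, with alpha split evenly between the two tails.

```python
from statistics import NormalDist

def critical_z(confidence):
    """z score bordering a two-tailed interval with the given confidence level."""
    alpha = 1 - confidence               # total area outside the interval
    return NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    print(f"{confidence:.0%} confidence: alpha = {alpha:.2f}, z = {critical_z(confidence):.3f}")
```

Higher confidence means a smaller alpha and a larger bordering z score, which is why greater confidence produces a wider interval.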

  28. Moving from descriptive stats into inferential stats…
      • Measurements that occur within the middle part of the curve are ordinary (typical) and probably belong there.
      • Measurements that occur outside this middle range are suspicious; they may be an error or belong elsewhere.
      • Area outside the confidence interval is alpha (shown for 90%, 95%, and 99% intervals).

  29. Thank you! See you next time!!
