
BU255 FINAL Exam-AID Taught by: Greg Overholt


Presentation Transcript


  1. BU255 FINAL Exam-AID • Taught by: Greg Overholt

  2. What are we doing?? • Stats • Lectures 10 to 20!! • All of it..

  3. Lectures 10 & 11: Estimation

  4. Chapters 5 and 6: • Binomial, Poisson, normal, and exponential distributions allow us to make probability statements about X (an individual member of the population). • To do so we need the population parameters. • Binomial: p • Poisson: μ • Normal: μ and σ • Exponential: λ

  5. Chapter 7: • Sampling distributions allow us to make probability statements about sample statistics. • To do so we need the population parameters. • Sample mean: μ and σ • Sample proportion: p • However, in almost all realistic situations the parameters are unknown. • We will use the sampling distribution to draw inferences about the unknown population parameters.

  6. Introduction to Statistical Inference • Estimation • Point and Interval Estimators • Properties of Estimators • Interval Estimation [confidence intervals] • Determining Sample Size

  7. Estimation • Estimation: determining the approximate value of a population parameter based on a sample statistic. • 2 types: • Point Estimator • A single value. No good on its own: the chance that it lands exactly on the parameter is too small, and it says nothing about how close it is. • Interval Estimator • Used almost all the time. • Uses an interval to estimate the population parameter. • Provides a % confidence that the parameter is between a lower and an upper bound.

  8. Estimating μ when σ is known • You typically want to know μ. And if you have σ? • 100(1 − α)% confidence interval of μ when σ is known: x̄ − z(α/2)·σ/√n < μ < x̄ + z(α/2)·σ/√n

  9. EXAMPLE • Diageo sampled 85 Laurier students and determined that the sample mean of alcohol consumption was 510 drinks a term. They previously calculated that the population standard deviation was 46. Create an interval estimate of the population mean with 95% confidence.
     • x̄ = 510, n = 85, σ = 46
     • z(α/2): unknown, but we want 95% confidence. 95% in the middle leaves 2.5% in each tail, so we look up the table area of .475, which gives z = 1.96.
     • 510 − 1.96(46/√85) < μ < 510 + 1.96(46/√85)
     • 510 − 9.78 < μ < 510 + 9.78
     • 500.22 < μ < 519.78
     • We are 95% confident that the average number of drinks for the population is between 500.22 and 519.78.
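The interval in this example can be checked with a few lines of Python. This is only a sketch using the standard library (`statistics.NormalDist`, Python 3.8+); the function name `z_interval` is made up for illustration and is not from the slides.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1 - confidence
    # z-value that leaves alpha/2 in the upper tail (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin


# Numbers from the Diageo example: x-bar = 510, sigma = 46, n = 85
lower, upper = z_interval(510, 46, 85, 0.95)
print(f"{lower:.2f} < mu < {upper:.2f}")
```

Changing the `confidence` argument shows how the interval widens as you ask for more certainty.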

  10. Intro – T dist: What’s different? • In the past, we have known the standard deviation of the population (which is unrealistic). • With it, we can use the Z stat to make inferences. • NOW, we don’t know the population st dev, so we have to use the sample st dev. That is why we use the T-stat. • GOT to have a normal (or approximately normal) population dist!

  11. T-Distribution

  12. Degrees of Freedom • It is the number of items that are free to vary once the mean is fixed. The best way to think of it is to assume one of the numbers is pinned down by your mean, and the rest are simply numbers around the mean that determine its shape. So the degrees of freedom are the number of items that determine the shape: n – 1 (n minus the one center value). • As df goes to infinity, the t-distribution becomes the normal (so the normal has df = infinity).

  13. EXCEL (good MC) • T-dist calculations can be done using Excel:
     • TDIST(x, degrees_freedom, tails) • Use this when you want the % in the tail(s). • Example: TDIST(1.3, 60, 1) • 1.3 is your t-value (like your z-value), the curve is drawn with 60 degrees of freedom, and you want the 1-tail area (vs 2). ANSWER = 0.0992 (so about 9.9% in the one tail).
     • TINV(probability, degrees_freedom) • This is the inverse. Give it the % in the tails and it will give you the t-value. • NOTE: it expects the % for a 2-tail test!!!! • SO, if they wanted you to do the inverse of the question above to get back a t-value of 1.3: TINV(0.1984, 60). You double the one-tail percentage (2 × 0.0992) for 2 tails!
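To see what TDIST and TINV are doing under the hood, here is a rough pure-Python sketch. It integrates the t density numerically (Simpson's rule) and inverts it by bisection. The helper names `t_pdf`, `t_tail`, and `t_inv_two_tail` are invented for this illustration; they are not Excel or library functions, and the numerical method is an assumption, not how Excel computes its answer.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_tail(t, df, tails=1, steps=2000):
    """Tail area beyond |t|, like TDIST(t, df, tails). steps must be even."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3          # Simpson's rule on [0, |t|]
    return tails * (0.5 - area_0_to_t)   # half the curve minus the middle part


def t_inv_two_tail(p, df):
    """t-value whose TWO-tail area is p (0 < p < 1), like TINV(p, df)."""
    lo, hi = 0.0, 1.0
    while t_tail(hi, df, tails=2) > p:   # widen until the tail is small enough
        hi *= 2
    for _ in range(50):                  # bisection
        mid = (lo + hi) / 2
        if t_tail(mid, df, tails=2) > p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


one_tail = t_tail(1.3, 60, tails=1)
t_back = t_inv_two_tail(2 * one_tail, 60)
print(round(one_tail, 4), round(t_back, 3))
```

Note how the inverse is fed twice the one-tail area, which mirrors the "double the percentage" warning for TINV.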

  14. Formula / example • Assume the population is relatively normal. • Confidence interval formula: x̄ ± t(α/2, n − 1)·s/√n • QUESTION: The researched average cost of a standing-room-only ticket to a Leafs game from a scalper is $168. A random sample of 16 tickets bought from different scalpers resulted in x̄ = $172.50, s = $15.40. Find the 95% interval estimate. Assume the population distribution is relatively normal. • Degrees of freedom? (n − 1) = 15. • What is the t-value? t(.025, 15) = ?

  15. Formula / example • Assume the population is relatively normal. • Confidence interval formula: x̄ ± t(α/2, n − 1)·s/√n • QUESTION: The researched average cost of a standing-room-only ticket to a Leafs game from a scalper is $168. A random sample of 16 tickets bought from different scalpers resulted in x̄ = $172.50, s = $15.40. Find the 95% interval estimate. Assume the population distribution is relatively normal. • Degrees of freedom? (n − 1) = 15. • What is the t-value? t(.025, 15) = 2.131 • SO the formula gives: 172.50 − 2.131(15.40/√16) < μ < 172.50 + 2.131(15.40/√16) • INTERVAL of 95% confidence: 164.3 to 180.7 … YES, this includes 168.
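The same plug-and-play can be written as a tiny Python sketch. The critical value 2.131 is hard-coded from the t-table used on the slide rather than computed, and the function name `t_interval` is just an illustrative choice.

```python
from math import sqrt


def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean using the sample stdev and a t-value."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


T_CRIT_025_DF15 = 2.131  # t(.025, 15), taken from the t-table on the slide

lower, upper = t_interval(172.50, 15.40, 16, T_CRIT_025_DF15)
print(f"{lower:.1f} to {upper:.1f}")
print("includes 168:", lower <= 168 <= upper)
```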

  16. Estimating the Population Proportion • Assumption: np > 5 and n(1 − p) > 5 • Formula (did this in midterm): z = (p̂ − p) / √(p(1 − p)/n) • Confidence interval of p (new, but simply rearranging letters): p̂ ± z(α/2)·√(p̂(1 − p̂)/n)

  17. Example • What proportion of male students in Canada have a violent case of ‘Bieber Fever’? A random sample of 1,350 Laurier students was taken, and 250 of them revealed they had ‘Bieber Fever’. What is the 98% confidence interval for the population proportion?

  18. Example (continued) • p̂ = 250/1350 = .185 • z(α/2) = the z-value which has 1% in each tail (2% in total for 98% confidence)

  19. Example (continued) • z(α/2) = 2.325 • RESULT = 0.185 ± 2.325 · √(.185(1 − .185)/1350) • 98% confidence range = 0.1604 to 0.2096
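A quick Python sketch of the same proportion interval follows. It uses the standard library's `NormalDist` for the z-value (about 2.326 rather than the table's 2.325) and the unrounded p̂, so the last digits can differ slightly from the hand calculation. The function name `proportion_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 250 of 1,350 sampled students, 98% confidence
lower, upper = proportion_interval(250, 1350, 0.98)
print(f"{lower:.4f} to {upper:.4f}")
```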

  20. Selecting the sample size • The difference between the sample mean and the population mean is called the error of estimation. • You can make sure you stay within a chosen error bound with yet another formula: n = (z(α/2)·σ / E)² • E = error bound (given in the question)

  21. EXAMPLE • I want to know how many students I need to interview to find out how many times a Laurier student Facebook stalks in 1 day. I want to be 95% confident that the error is within 2. It turns out the standard deviation of this stat is 5.
     • n = ? (what we want to find out), σ = 5, E = 2
     • z(α/2): 95% confidence means 2.5% in each tail, which is a z-value of 1.96
     • n = (1.96 × 5 / 2)²
     • n = 24.01 (so we need 25 people; always round up)
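As a sanity check, the sample-size formula can be sketched in Python. `math.ceil` does the "always round up" step; the function name `sample_size_for_mean` is an illustrative assumption.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(sigma, error, confidence):
    """Smallest n that keeps the error of estimation within `error`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / error) ** 2)  # always round UP


n = sample_size_for_mean(sigma=5, error=2, confidence=0.95)
print(n)
```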

  22. Determining n when Estimating p • Formula: n = p(1 − p)·z(α/2)² / E² • 1. Use the historical min or max of p, whichever is closer to 0.5, if available. • 2. To be safe, use p = 0.5 if p is totally unknown.
     • QUESTION: What proportion of students in statistics actually open their textbook? To estimate this proportion within 5% and be 95% confident, how large a sample should you take (a) if historically fewer than 15% of students ever do, and (b) if NO historical information is available?
     • (a) n = .15(.85) × 1.96² / .05² = 195.92, so n = 196 (MUST round up!)
     • (b) n = .5(.5) × 1.96² / .05² = 384.16, so n = 385
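The two cases (a historical bound on p versus the safe p = 0.5) can be compared with a short Python sketch. The function name `sample_size_for_proportion` is made up for illustration, and the z-value comes from `NormalDist` rather than the table.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p, error, confidence):
    """Sample size needed to estimate a proportion within `error`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(p * (1 - p) * z ** 2 / error ** 2)  # MUST round up


n_historical = sample_size_for_proportion(0.15, 0.05, 0.95)  # p at most 15%
n_safe = sample_size_for_proportion(0.50, 0.05, 0.95)        # p totally unknown
print(n_historical, n_safe)
```

Because p(1 − p) peaks at p = 0.5, the "no information" case always asks for the largest sample.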

  23. Class 12 & 13: Intro Hypothesis Testing for single populations

  24. Hypothesis Testing • There are two procedures for making inferences: • Estimation. • Hypothesis testing. • The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter.

  25. Hypothesis Testing • There are two hypotheses: • Null Hypothesis (H0) • Assumed to be true • Ex. The defendant is innocent • Alternative (or research) Hypothesis (H1) • Opposite of H0 • Ex. The defendant is guilty • NOTE: The null will always state that the parameter equals the value specified in the alternative.

  26. Hypothesis Testing Process • Step 1: State the null and the alternative • Eg: You want to see if the exam average will be greater than 75%. • H0: μ = 75 • H1: μ > 75 • Step 2: Randomly sample the population and calculate a test statistic (in this case from a sample mean) • The procedure begins with the NULL BEING TRUE (and the goal is to see if there is enough evidence to say that the alternative is true). • Step 3: Make a statement about the hypotheses • If the test statistic's value is inconsistent with the null hypothesis, we reject the null and conclude that the alternative is true.

  27. Hypo Testing Decisions • Reject the null in favour of the alternative • Sufficient evidence to support the alternative • Do not reject the null in favour of alt. • Does not mean ‘accepting the null’ (just not enough evidence) • Ex. Can’t prove that the defendant is guilty does not mean that he is innocent

  28. Hypo Testing Errors • Two types of errors are possible when making the decision whether to reject H0 (the null hypothesis) • Type 1 error (probability = alpha): reject a true null hypothesis – send an innocent man to jail (reject the null when the null is true!) MOST SERIOUS OF THE TWO!

  29. Hypo Testing Errors • Type 2 error (probability = beta): don’t reject a false null hypothesis (go with the safe null assumption.. don’t have the guts to reject it!!) • Guilty man goes free (did not reject the null when the null is actually false). • It can be calculated (later). • [Figure: two sampling distributions, labelled ‘Our original hypothesis…’ and ‘our new assumption…’. THIS EXAMPLE IS TESTING AVG HYDRO BILLS with a hypothesized mean of 170. Sample bills were taken to get x̄, and we are trying to figure out the critical value.]

  30. 2 ways to Test: Rejection Region • Depending on whether you are looking for <, >, or not equal to, you define the rejection region • Level of significance = α

  31. Test It: P-value • The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true. • It is the smallest value of α for which H0 can be rejected. • QUESTION: Testing hydro bills. They think the average customer’s hydro bill is $170 (with a standard deviation of $65), and they want to test whether it is larger than that. The company sampled 400 customers and found an average of $178. Should you reject or not reject the null hypothesis? (We want 95% confidence, so α = .05.)

  32. Test It: P-value • z = (178 − 170) / (65/√400) = 8 / 3.25 = 2.46 • p-value = P(Z > 2.46) = .0069 • Since .0069 < α = .05, we reject the null.
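The hydro-bill p-value can be reproduced with a short Python sketch using the standard library. The function name `upper_tail_z_test` is invented for this illustration; the 5% significance level comes from the question.

```python
from math import sqrt
from statistics import NormalDist


def upper_tail_z_test(xbar, mu0, sigma, n):
    """z statistic and one-sided (upper-tail) p-value, sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# H0: mu = 170, H1: mu > 170, sigma = 65, n = 400, x-bar = 178
z, p_value = upper_tail_z_test(178, 170, 65, 400)
reject_null = p_value < 0.05
print(round(z, 2), round(p_value, 4), reject_null)
```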

  33. Type II Error Example • H0: µ = 170 • H1: µ > 170 • At a significance level of 5% we rejected H0 in favor of H1, since our sample mean (178) was greater than the critical value (175.34). • To calculate the probability of a Type 2 error, the question will have to give you the new mean to test ($180 mean). • β = P(x̄ < 175.34, given that µ = 180), thus…

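  34. [Figure: the sampling distribution under our original hypothesis (µ = 170) next to the one under our new assumption (µ = 180). The area of the new curve below the critical value is β, the chance we set a guilty man free.]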
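For anyone who wants to put a number on β for the hydro example, here is a hedged Python sketch. It first finds the critical value under the null (µ = 170, σ = 65, n = 400, α = .05) and then asks how much of the sampling distribution centred on the new mean (µ = 180) falls below it. The function name `type_ii_error` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(mu0, mu_alt, sigma, n, alpha):
    """Critical value and beta for an upper-tail test with sigma known."""
    se = sigma / sqrt(n)
    # Reject H0 when x-bar is above this value
    critical = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Beta: chance x-bar stays BELOW the critical value when mu is really mu_alt
    beta = NormalDist(mu_alt, se).cdf(critical)
    return critical, beta


critical, beta = type_ii_error(mu0=170, mu_alt=180, sigma=65, n=400, alpha=0.05)
print(round(critical, 2), round(beta, 3))
```

Re-running it with a larger `n` or a larger `alpha` shows β shrinking, which is the point of the next two slides.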

  35. Changing your confidence requirement!

  36. INCREASE THE SAMPLE SIZE!

  37. Estimating MEAN with T-stat (refresher) • Use the T-dist instead of the normal, and the sample stdev instead of the population stdev.
     • QUESTION: Tiger Woods is rumoured to pay his ‘girls’ $1 million per year to stay quiet. A random sample of 7 of them was taken, and the mean was $800,000 with a stdev of $100,000. Find the 95% interval estimate of the population mean (assume the population is normal).
     • Degrees of freedom = 6
     • 800K ± t(.025, 6) × (100K/√7)
     • = 800K ± 2.447 × 37,796
     • = 800,000 ± 92,487
     • RANGE between $707,513 and $892,487

  38. Hypo testing with T-stat • T-statistic: same as the z-stat, just using the sample mean and the sample stdev!
     • QUESTION: SO.. can we conclude with 95% confidence that the mean that ‘Cheetah’ pays his girls is not $1,000,000?
     • H0: μ = 1 million • H1: μ ≠ 1 million (two-tail test)
     • T = (800,000 – 1,000,000) / (100,000 / √7)
     • T = -200,000 / 37,796 = -5.29
     • T-critical with 2.5% in each tail (df = 6) is at ±2.447. -5.29 is definitely past -2.447, so reject the null: the mean isn’t 1 million!
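The t-test above boils down to one division and one comparison, sketched here in Python. The critical value 2.447 is hard-coded from the t-table on the slide (2.5% in each tail, df = 6), and the function name `one_sample_t` is made up for illustration.

```python
from math import sqrt


def one_sample_t(xbar, mu0, s, n):
    """t statistic for a single mean using the sample stdev."""
    return (xbar - mu0) / (s / sqrt(n))


T_CRIT_025_DF6 = 2.447  # t(.025, 6), taken from the t-table on the slide

t_stat = one_sample_t(800_000, 1_000_000, 100_000, 7)
reject_null = abs(t_stat) > T_CRIT_025_DF6  # two-tail test
print(round(t_stat, 2), reject_null)
```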

  39. Third type.. proportion • Assumption: np > 5, n(1 − p) > 5 • QUESTION: The high school student council believes that 11% of its students will come to the school dance wasted, and they want to test their belief. In a sample of 200 students, 28 indicated that they will, in fact, be tanked before they arrive. Use a probability of a Type I error of 0.10. • H0: p = .11 will be drunk • H1: p ≠ .11 will be drunk (have to use a 2-tail test.. we don’t know which way they are testing) • p̂ = 28/200 = .14

  40. (continued) • Z = (.14 − .11) / √(.11(.89)/200) • Z = .03 / √.0004895 • Z = .03 / .0221 • Z = 1.36 • Is 1.36 far enough? On the z-table, we need to find the tail area for 1.36 and compare it to .05 (half of the significance level of .10).

  41. (continued) • Z = 1.36 has .0869 in the right tail. • BUT this is a two-tail test, so the tail area needed to be < .05. It isn’t, so we cannot reject the null in favour of the alternative.
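Here is a small Python sketch of the same two-tail proportion test. Instead of comparing one tail to α/2, it doubles the tail area and compares the result to α = 0.10, which is the same decision. The function name `proportion_z_test` is an illustrative assumption, and the standard error uses 1 − p = 0.89.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """z statistic and two-tail p-value for H0: p = p0."""
    assert n * p0 > 5 and n * (1 - p0) > 5, "normal approximation not justified"
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value


p_hat, z, p_value = proportion_z_test(28, 200, 0.11)
reject_null = p_value < 0.10  # alpha = probability of a Type I error
print(p_hat, round(z, 2), round(p_value, 3), reject_null)
```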

  42. RECAP!! With 1 population, we talked about: • Mean (σ known): standard z-stat • Mean (σ unknown): T-statistic • Proportion: z-stat for prop.

  43. Lectures 14 & 15: Inference about comparing Two Populations

  44. Inference about comparing two populations • With two populations, we can be: • Comparing the means • Comparing paired observations • Comparing proportions

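  45. COMPARING TWO MEANS • Similar to dealing with 1 mean, but now we are looking at the difference of two pop means, μ1 − μ2. • 1. If you know the stdevs? Plug-and-play: z = ((x̄1 − x̄2) − (μ1 − μ2)) / √(σ1²/n1 + σ2²/n2) • Confidence formula: (x̄1 − x̄2) ± z(α/2)·√(σ1²/n1 + σ2²/n2)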
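A minimal Python sketch of a two-sample z-test with known population stdevs is shown below. The numbers in the usage lines are invented purely for illustration (they are NOT the data from any slide), and the function name `two_sample_z` is likewise hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, diff0=0.0):
    """z statistic and two-tail p-value for H0: mu1 - mu2 = diff0."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z = ((xbar1 - xbar2) - diff0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up illustration values, not taken from the presentation
z, p_value = two_sample_z(xbar1=10, xbar2=8, sigma1=3, sigma2=4, n1=36, n2=64)
print(round(z, 3), round(p_value, 4), p_value < 0.05)
```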

  46. Question Example • A random sample of 32 business students from Laurier are asked how often they party so hard they don’t remember what happened the previous night. A similar random sample is taken of 34 science students. The results and the population SDs are given below. • Q. Is there enough evidence to say that they differ at a 5% significance level?
