
Welcome to BUAD 310

Welcome to BUAD 310. Instructor: Kam Hamidieh Lecture 16, Monday March 24, 2014. Agenda & Announcement. Today: Chapter 17 You’ll have to read a few slides (27+) and parts of the chapter on your own. Reading: All of Chapter 17


Presentation Transcript


  1. Welcome to BUAD 310 Instructor: Kam Hamidieh Lecture 16, Monday March 24, 2014

  2. Agenda & Announcement • Today: • Chapter 17 • You’ll have to read a few slides (27+) and parts of the chapter on your own. • Reading: All of Chapter 17 • Note: Homework 4 is due this Wednesday, March 26, at 5 PM. No extensions will be given. • Next Time: Chapter 18! BUAD 310 - Kam Hamidieh

  3. From Last Time • Hypothesis testing with population mean μ: • Hypothesis testing with population proportion p: • The logic: assume H0 true, compute test stat, p-value, check for compatibility of data with H0 • Note, you can also create confidence intervals for μ and p.

  4. From Last Time • Types of errors: • Type I = H0 true but you reject H0 • Type II = Ha true but you fail to reject H0 • α is our max tolerance level for probability of making type I error. • Final practical decisions must consider: • Consequences of the possible errors • Practical versus statistical significance

  5. About P-Value • The p-value does not tell us the probability of the null hypothesis being true. • The p-value is computed by assuming that the null is true. • A low p-value means that your results are not consistent with the assumption that the null is true. • When it comes to decision time we use the cutoff α. • “Smaller p-values provide stronger evidence against the null.”

  6. Not a fan of P-Values? You’re not alone! • A scathing criticism of hypothesis testing and p-values: http://www.phil.vt.edu/dmayo/personal_website/Schmidt_Hunter_Eight_Common_But_False_Objections.pdf • A more balanced view: http://www.ncbi.nlm.nih.gov/pubmed/10937333

  7. Today…the same! • We will have a new parameter: • Test statistics

  8. Today…same thing! • The confidence intervals will have the form: Point Estimate ± (margin of error)

  9. Example, Two Sample t-test A marketing team designed a promotional web page to increase online sales. Out of the total 100 visitors to the company’s website, 55 were assigned to the old page and 45 to the new page. The assignments were done at random. We have the following sales data in dollars: Does the new page generate (statistically) significantly higher sales? Let α = 5%.

  10. Example, Two Sample t-test • Population 1: all the possible spending amounts at the new site; μnew = mean customer spending, new site. • Population 2: all the possible spending amounts at the old site; μold = mean customer spending, old site. • Independent random samples: a random sample from the new site, with x̄new = sample mean sales from the new site, and a random sample from the old site, with x̄old = sample mean sales from the old site. • x̄new − x̄old = point estimate of μnew − μold

  11. Example (Continued) Define: μold = mean customer spending at the old site, μnew = mean customer spending at the new site. We want to determine whether the means of the populations represented by two independent samples of a quantitative variable (amount spent) differ. We can form our hypotheses as follows: H0: μnew − μold ≤ 0 (μnew ≤ μold) Ha: μnew − μold > 0 (μnew > μold)

  12. Example (Continued) The test statistic: t = (x̄new − x̄old − 0) / sqrt(snew²/nnew + sold²/nold). The distribution of the test statistic is approximately a t distribution with df determined by software (or, if done by hand, use the smaller of nnew − 1 and nold − 1: min(44, 54) = 44).

  13. Example (YouTube Videos will be posted.) StatCrunch:

  14. Example (Continued) From software: P-Value ≈ 0.0042. We reject the null hypothesis since the p-value is less than α. We have sufficient evidence from our data that the new site increases the mean sales.

  15. In the Business World… An interesting article: http://www.wired.com/business/2012/04/ff_abtesting Look here too: https://www.optimizely.com/

  16. Two Sample Test in General Suppose a random sample of size n1 is drawn from a normal population with mean μ1, and an independent random sample of size n2 is drawn from a normal population with mean μ2. To perform your hypothesis test with null value D0, compute the two sample t statistic: t = (x̄1 − x̄2 − D0) / sqrt(s1²/n1 + s2²/n2), and use p-values for a t distribution where the df is approximated by the software (or, if done by hand, use the smaller of n1 − 1 and n2 − 1).
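
The recipe on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's StatCrunch output: the function name `two_sample_t` is made up here, the software df is assumed to be the usual Welch-Satterthwaite approximation, and the means and standard deviations in the usage lines are hypothetical (the sales table from the example is not reproduced in this transcript; only the sample sizes 45 and 55 come from the slides).

```python
import math


def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, d0=0.0):
    """Unpooled two-sample t statistic plus two choices of df.

    Returns (t, df_welch, df_by_hand). df_welch is the Welch-Satterthwaite
    approximation (assumed here to be what software reports); df_by_hand is
    the conservative rule from the slide: the smaller of n1 - 1 and n2 - 1.
    """
    v1 = sd1 ** 2 / n1            # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2            # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)       # standard error of (mean1 - mean2)
    t = (mean1 - mean2 - d0) / se
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    df_by_hand = min(n1 - 1, n2 - 1)
    return t, df_welch, df_by_hand


# HYPOTHETICAL summary statistics, for illustration only.
# Only n1 = 45 (new page) and n2 = 55 (old page) are taken from the slides.
t, df_welch, df_by_hand = two_sample_t(
    mean1=300.0, sd1=160.0, n1=45,
    mean2=220.0, sd2=130.0, n2=55,
)
print(f"t = {t:.3f}, Welch df = {df_welch:.1f}, by-hand df = {df_by_hand}")
```

The by-hand df is never larger than the Welch df, so a p-value looked up with the by-hand df is, if anything, slightly too large; that is why the shortcut is considered conservative.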

  17. Some Comments • Theory tells us that the random samples must be from normal populations. However, we will stick with the guidelines from Slide 25, Lecture 13. • You can always use graphical tools such as boxplots and histograms to check your data. • In general, the independence condition is met as long as subjects or items were randomly assigned to two groups.

  18. CI for Difference of Two Means Under the same conditions as the two sample t-test (see slide 16), the (1−α)100% confidence interval for μ1 − μ2 is: (x̄1 − x̄2) ± tα/2 · sqrt(s1²/n1 + s2²/n2), where tα/2 comes from a t distribution with df determined by the software (or, if done by hand, use the smaller of n1 − 1 and n2 − 1).
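
As a rough sketch of this interval, the function below follows the "point estimate ± margin of error" form. The name `diff_of_means_ci` is invented for this example, the critical value `t_crit` is passed in by the caller (looked up from a t table or software), and the summary statistics in the usage lines are hypothetical, since the original sales data are not in this transcript.

```python
import math


def diff_of_means_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Confidence interval for mu1 - mu2 using the unpooled standard error.

    t_crit is the t critical value t_{alpha/2} for the chosen df; this
    function does not compute it.
    """
    point_estimate = mean1 - mean2
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    margin = t_crit * se
    return point_estimate - margin, point_estimate + margin


# HYPOTHETICAL summary statistics; only the sample sizes come from the slides.
# t_crit of about 2.015 is the approximate 95% table value for df = 44,
# the by-hand df min(45 - 1, 55 - 1).
low, high = diff_of_means_ci(300.0, 160.0, 45, 220.0, 130.0, 55, t_crit=2.015)
print(f"95% CI for mu_new - mu_old: ({low:.1f}, {high:.1f})")
```

Because the by-hand df is smaller than the software df, the by-hand interval comes out slightly wider than the one software reports.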

  19. CI for Previous Example From StatCrunch, the 95% CI (new − old) after some rounding is: (21, 139). We are 95% confident that the mean spending on the new site is $21 to $139 higher than on the old site. (What would you get if you do this by hand?)

  20. In Class Exercise 1 Exposure to dust at work can lead to lung disease later in life. One study (this was an actual study!) measured the workplace exposure of two groups of tunnel construction workers: indoor workers vs. outdoor workers. The subjects were chosen at random. The data are summarized below: The measurement units are milligram years per cubic meter. Use the t-table: since we are doing this by hand, use df = 100. If you have your laptops, try using StatCrunch. • Create a 95% confidence interval for the difference in the mean exposures. • Conduct a hypothesis test to see if the exposures for these two groups differ. (Use α = 5%) • Comments?

  21. Relationship between CI and Testing • When dealing with a mean (or means), a two-sided test at level α can be carried out directly from a confidence interval at (1−α)100%. • More specifically, given H0: μ = μ0 vs. Ha: μ ≠ μ0: • Reject H0 at α level ↔ μ0 not in (1−α)100% CI • Fail to reject H0 at α level ↔ μ0 in (1−α)100% CI • Look at the previous problem!
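
The equivalence above is simple enough to express as a one-line check. This is a minimal sketch: the helper name `reject_two_sided` and the interval used in the usage lines are hypothetical and chosen only for illustration.

```python
def reject_two_sided(ci, mu0):
    """Two-sided test read off a confidence interval.

    ci is a (low, high) interval at confidence level (1 - alpha)100%.
    Returns True when H0: mu = mu0 is rejected at level alpha,
    i.e. when mu0 falls outside the interval.
    """
    low, high = ci
    return not (low <= mu0 <= high)


# HYPOTHETICAL 95% interval, not taken from the slides.
ci_95 = (10.0, 20.0)
print(reject_two_sided(ci_95, 12.0))  # False: 12 is inside, fail to reject
print(reject_two_sided(ci_95, 25.0))  # True: 25 is outside, reject
```

Note that the shortcut only applies to two-sided alternatives, and the interval's confidence level must match the test's α.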

  22. Examples The p-value for a two sided test of H0: μ = 30 is 0.033: • If α= 5%, does the 95% CI include the value 30? No! Since we would reject H0 => 30 not in 95% CI. • If α= 1%, does the 99% CI include the value 30? Yes! Since we fail to reject H0 => 30 in 99% CI.

  23. In Class Exercise 2 A 95% CI for a population mean is (53, 62). • Can you reject H0: μ = 58 at α = 5% against a two sided hypothesis? Explain. • Can you reject H0: μ = 63 at α = 5% against a two sided hypothesis? Explain.

  24. Some Terminology & Concepts • Think of the webpage example as an “experiment”. • You can think of the group that was sent to the new site as the treatment group. • The group that was sent to the old site is called the control group. • We are comparing the treatment and control groups. • Randomization: This is the process by which subjects (or things in our study) get assigned to different groups at random.

  25. Why Randomize? • Goal: make the two groups as similar as possible. • If the groups are similar in all aspects, and if we detect a difference based on our inference, then we can say that the difference is due to the condition we applied.

  26. Randomization • Suppose we had not randomized in our example of webpage sales. • You pick your 45 subjects for the new site on Saturday. • Then you pick your 55 subjects for the old site on Wednesday. • See any problems with this? • The day of the week is a lurking variable! It is a variable you have not thought about but that can explain the difference between the groups.
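
One way random assignment could be carried out in practice is sketched below. The function name `randomize`, the visitor IDs 1 to 100, and the seed are all hypothetical; the slides only say that 45 of 100 visitors were assigned at random to the new page and 55 to the old page.

```python
import random


def randomize(subject_ids, n_treatment, seed=None):
    """Randomly split subjects into a treatment group and a control group."""
    rng = random.Random(seed)      # seeded only so the split is reproducible
    shuffled = list(subject_ids)   # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    return shuffled[:n_treatment], shuffled[n_treatment:]


# HYPOTHETICAL visitor IDs; the group sizes match the webpage example.
visitors = list(range(1, 101))
new_page, old_page = randomize(visitors, n_treatment=45, seed=310)
print(len(new_page), len(old_page))  # 45 55
```

Because every visitor is equally likely to land in either group, a variable such as the day of the week tends to be balanced across the groups instead of lining up with the treatment.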

  27. Example of Paired T-Test
      Subject  Baseline  Final
      1        159       194
      2        93        122
      3        130       158
      4        174       154
      5        148       93
      6        148       90
      7        85        101
      8        180       99
      9        92        183
      10       89        82
      11       204       100
      12       182       104
      13       110       72
      14       88        108
      15       134       110
      16       84        81
      The data for this example come from a study of a diet. The diet consists mostly of protein and animal fats, restricting the carbohydrate intake. Triglyceride values (mg/100 ml) are given for the male participants both before the diet and at the end of a period of time following the diet. High levels of triglycerides have been linked to many diseases. We think that this diet will reduce the triglyceride levels. Set α = 0.05.

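  28. Example of Paired Data Each person has a baseline value and a final value; the difference is final − baseline. First person: 159, 194, difference 35. Second person: 93, 122, difference 29. Last person: 84, 81, difference −3. We are interested in the differences!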

  29. Paired T-Tests • The term paired data means that the data have been observed in pairs. • You have two sets of data but they are dependent. • Generally you are interested in the differences between the two sets of data. • To perform inference, you just compute the difference within each pair and perform a one sample t-test on the differences. It is that simple! • The assumptions are the same as for the one sample t-test, applied to the differences.

  30. Example Continued • Define the population parameter of interest: μd = population mean difference in the triglyceride level of the diet participants (final − baseline) • Our hypotheses: H0: μd ≥ 0 Ha: μd < 0 • Note the use of subscript d to emphasize difference.

  31. Example Continued Sample mean of the differences: x̄d, which is a point estimate for the population mean difference μd = μfinal − μbase. Sample sd of the differences: sd.

  32. Example • Our test statistic in this case: t = (x̄d − 0) / (sd / √n) = −1.20 • If the null hypothesis were true, our observed sample mean difference would be 1.20 standard errors below the hypothesized value of 0.

  33. Example • Under the H0, the test statistic has a t-distribution with df= 16 – 1 = 15. We write T ~ t, df=15. • We have: P-Value = P(T ≤ -1.20). • The exact software value is 0.1244. (Done this many times!)
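
The paired calculation can be reproduced from the 16 baseline and final values listed on slide 27. The sketch below uses only the Python standard library; the helper names `t_pdf` and `t_cdf` are made up for this example, and the p-value comes from a simple numerical integration of the t density (Simpson's rule), which is an assumption of this sketch and not how StatCrunch necessarily computes it.

```python
import math
import statistics

# Triglyceride data from slide 27 (mg/100 ml), in subject order 1..16.
baseline = [159, 93, 130, 174, 148, 148, 85, 180, 92, 89, 204, 182, 110, 88, 134, 84]
final = [194, 122, 158, 154, 93, 90, 101, 99, 183, 82, 100, 104, 72, 108, 110, 81]

diffs = [f - b for f, b in zip(final, baseline)]   # final - baseline
n = len(diffs)
d_bar = statistics.mean(diffs)                     # sample mean difference
s_d = statistics.stdev(diffs)                      # sample sd (n - 1 divisor)
se = s_d / math.sqrt(n)                            # standard error of d_bar
t_stat = (d_bar - 0) / se                          # null value is 0


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density over [0, |x|] with Simpson's rule."""
    a = abs(x)
    h = a / steps                                  # steps must be even
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


p_value = t_cdf(t_stat, df=n - 1)                  # lower-tail p-value for Ha: mu_d < 0
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t_stat:.2f}, p = {p_value:.4f}")
```

The printed t statistic and p-value should land close to the −1.20 and 0.1244 quoted on the slides.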

  34. Example • Our p-value is large (> 0.05) at 0.1244. • So we fail to reject the null hypothesis. We do not have a statistically significant result. • The data do not give sufficient evidence that the diet reduces the mean triglyceride level.

  35. Confidence Intervals • The (1−α)100% confidence intervals will be just like the confidence intervals for the one sample t tests: x̄d ± tα/2 · sd / √n • Here: x̄d ± 2.131 · sd / √16 (Note: 2.131 was obtained from StatCrunch but the t table gives the same value.) Zero is in this interval; the difference is not statistically significant.
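
For reference, the interval can be worked out from the slide 27 data and the critical value 2.131 quoted above. This is a minimal sketch using only the Python standard library; the variable names are chosen here for illustration.

```python
import math
import statistics

# Triglyceride data from slide 27 (mg/100 ml), in subject order 1..16.
baseline = [159, 93, 130, 174, 148, 148, 85, 180, 92, 89, 204, 182, 110, 88, 134, 84]
final = [194, 122, 158, 154, 93, 90, 101, 99, 183, 82, 100, 104, 72, 108, 110, 81]

diffs = [f - b for f, b in zip(final, baseline)]   # final - baseline
n = len(diffs)
d_bar = statistics.mean(diffs)                     # point estimate of mu_d
s_d = statistics.stdev(diffs)                      # sample sd of the differences

t_crit = 2.131                                     # t critical value quoted on the slide (df = 15)
margin = t_crit * s_d / math.sqrt(n)               # margin of error
ci_low, ci_high = d_bar - margin, d_bar + margin

print(f"95% CI for mu_d: ({ci_low:.2f}, {ci_high:.2f})")
print("zero inside interval:", ci_low <= 0 <= ci_high)
```

The interval runs from roughly −43 to 12 mg/100 ml, so it contains zero, in line with the slide's conclusion.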

  36. Some Comments • Again, nothing new: just perform a one sample t test on the differences. • Could we have performed a two-sample t test? Yes, but we would need to make sure the two groups are independent and similar. • Sometimes paired t-tests are not possible, for example when testing to see if a new additive to a concrete mixture improves maximum load.

  37. In Class Exercise 3 Suppose you want to compare prices in two websites. For concreteness, let’s assume you want to compare textbook prices: amazon.com vs. bn.com. Clearly (?) you can’t obtain a list of all the items and make a full comparison. Assume that you can obtain a random selection of the same items from the two websites. How would you go about making the comparison?
