MGMT 276: Statistical Inference in Management

errol

Presentation Transcript


  1. MGMT 276: Statistical Inference in Management. Welcome. Remember to pick up your homework. Please double check: all cell phones and other electronic devices are turned off and stowed away. http://www.thedailyshow.com/video/index.jhtml?videoId=188474&title=an-arab-family-man

  2. Remember… In a negatively skewed distribution: mean < median < mode. Mean = 83.7 (balance point); median = 86.5 (middle score); mode = 95 (tallest point).

  3. Hand in your homework. Questions about the homework? http://www.thedailyshow.com/video/index.jhtml?videoId=188474&title=an-arab-family-man

  4. ANOVA Extra Credit - Due November 22nd. There are five parts:
     1. A one-page report of your design (includes all of the information from the writing assignment)
        • Describe your experiment: what is your question / what is your prediction?
        • State your Independent Variable (IV), how many levels there are, and the operational definition
        • State your Dependent Variable (DV), and operational definition
        • How many participants did you measure, and how did you recruit (sample) them?
        • Was this a between or within participant design (why?)
     2. Gather the data
        • Try to get at least 10 people (or data points) per level
        • If you are working with other students in the class you should have 10 data points per level for each member of your group
     3. Input data into Excel (hand in data)
     4. Complete the ANOVA analysis and hand in the ANOVA table
     5. Statement of results (see next slide for example), including a graph of your means (just like we did in the homework)

  5. Extra Credit Opportunity Design a question/topic Gather Data Present data in a memo

  6. Please click in. My last name starts with a letter somewhere between: A. A – D B. E – L C. M – R D. S – Z. Homework due next class, November 22nd: Assignment 15, Regression worksheet (can be found on class website). Please double check: all cell phones and other electronic devices are turned off and stowed away.

  7. Use this as your study guide. By the end of lecture today (11/17/11):
     • Logic of hypothesis testing with correlations
     • Interpreting correlations and scatterplots
     • Simple and multiple regression
     • Using correlation for predictions
     • r versus r²
     • Regression uses the predictor variable (independent) to make predictions about the predicted variable (dependent)
     • Coefficient of correlation is the name for “r”
     • Coefficient of determination is the name for “r²” (remember it is always positive, so it carries no direction info)
     • Standard error of the estimate is our measure of the variability of the dots around the regression line (average deviation of each data point from the regression line, like a standard deviation)
     • Coefficient of regression will be “b” for each variable (like slope)

  8. Readings for next exam. Lind: Chapter 13 (Linear Regression and Correlation), Chapter 14 (Multiple Regression), Chapter 15 (Chi-Square). Plous: Chapter 17 (Social Influences), Chapter 18 (Group Judgments and Decisions).

  9. Five steps to hypothesis testing (Review)
     Step 1: Identify the research problem (hypothesis). Describe the null and alternative hypotheses. For correlation, the null is that r = 0 (no relationship).
     Step 2: Decision rule
        • Alpha level? (α = .05 or .01)
        • Critical statistic (e.g. critical r) value from table?
     Step 3: Calculations (for ANOVA, F = MSBetween / MSWithin; for correlation, the observed r)
     Step 4: Make decision whether or not to reject the null hypothesis. If the observed r is bigger (in absolute value) than the critical r, then reject the null.
     Step 5: Conclusion: tie findings back in to the research problem.
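     Steps 2 through 4 can be sketched for a correlation as follows. This is only an illustration, assuming the five (cavities, brushing) pairs from the teeth-brushing example later in these slides and a two-tailed α = .05. The critical r of 0.878 for df = n - 2 = 3 is read from a standard critical-value table, and the name `pearson_r` is hypothetical, not something the slides define.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation via the raw-score (computational) formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

cavities = [1, 3, 2, 3, 5]   # X values from the later worked example
brushing = [5, 4, 3, 2, 1]   # Y values from the later worked example

# Assumption: table value for df = 3, alpha = .05, two-tailed
CRITICAL_R = 0.878

observed_r = pearson_r(cavities, brushing)
# Step 4: compare the size of the observed r with the critical r
reject_null = abs(observed_r) > CRITICAL_R
print(f"observed r = {observed_r:.2f}, reject null: {reject_null}")
```

     With only five pairs, even an r of about -0.85 does not exceed the critical value, so under these assumptions the null hypothesis would not be rejected.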

  10. Correlation - the prediction line - what is it good for? (Review) The prediction line:
      • makes the relationship easier to see (even if specific observations - dots - are removed)
      • identifies the center of the cluster of (paired) observations
      • identifies the central tendency of the relationship (kind of like a mean)
      • can be used for prediction
      • should be drawn to provide a “best fit” for the data
      • should be drawn to provide maximum predictive power for the data
      • should be drawn to provide minimum predictive error

  11. Correlation - What do we need to define a line? (Review)
      • Y-intercept = “a”: where the line crosses the Y axis
      • Slope = “b”: how steep the line is
      [Figure: scatterplot with axes labeled Yearly Income and Expenses per year]

  12. Interpreting regression equation. Prediction line: Y’ = a + b1X1, here Y’ = 842 + (-37.5)X1, that is, Sales = 842 – 37.5 Price (Y-intercept a = 842, slope b1 = -37.5; notice in this case the slope is negative)
      a) Interpret the slope of the fitted regression line: a slope of “-37.5” suggests that raising “price” by 1 unit will reduce “sales” by 37.5 units.
      b) If “price” = 20, what is the prediction for “Sales”? Sales = 842 - (37.5)(20) = 842 – 750 = 92
      [Figure: Sales plotted against price of product]

  13. Interpreting regression equation. Prediction line: Y’ = a + b1X1, here Y’ = 842 + (-37.5)X1, that is, Sales = 842 – 37.5 Price (Y-intercept a = 842, slope b1 = -37.5)
      a) Interpret the slope of the fitted regression line: a slope of “-37.5” suggests that raising “price” by 1 unit will reduce “sales” by 37.5 units.
      b) If “price” = 20, what is the prediction for “Sales”? Sales = 842 - (37.5)(20) = 842 – 750 = 92. If Price = 20, Sales will probably be about 92 units: the point (20, 92) on the line.
      [Figure: Sales plotted against price of product, with the point (20, 92) marked]
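      The same prediction and slope interpretation can be checked with a few lines of code. This is a minimal sketch using the slide's coefficients; the function name `predict_sales` is illustrative only.

```python
def predict_sales(price, intercept=842.0, slope=-37.5):
    """Predicted sales from the fitted line: Sales = 842 - 37.5 * Price."""
    return intercept + slope * price

# Part b: prediction when price = 20
sales_at_20 = predict_sales(20)

# Part a: the slope is the change in predicted sales per 1-unit price increase
change_per_unit = predict_sales(21) - predict_sales(20)

print(sales_at_20)      # 92.0
print(change_per_unit)  # -37.5
```

      The other prediction problems in these slides follow the same pattern with a different intercept and slope.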

  14. Interpreting regression equation. Prediction line: Y’ = a + b1X1, here Y’ = 2,277 + (.0307)X1 (Y-intercept a = 2,277, slope b1 = .0307; notice in this case the slope is positive)
      a) The regression equation: NetIncome = 2,277 + .0307 Revenue. Interpret the slope: a slope of “.0307” suggests that when “Revenue” rises by 1 dollar, NetIncome will rise by about 3 cents.
      b) If “Revenue” = 1,000, what is the prediction for “NetIncome”? NetIncome = 2,277 + (.0307)(1,000) = 2,277 + 30.7 = 2,307.7, the point (1,000, 2,307.7)
      [Figure: NetIncome plotted against Revenue]

  15. Interpreting regression equation. Prediction line: Y’ = a + b1X1, here Y’ = 2,277 + (.0307)X1 (Y-intercept a = 2,277, slope b1 = .0307)
      a) The regression equation: NetIncome = 2,277 + .0307 Revenue. Interpret the slope: a slope of “.0307” suggests that when “Revenue” rises by 1 dollar, NetIncome will rise by about 3 cents.
      b) If “Revenue” = 1,000, what is the prediction for “NetIncome”? NetIncome = 2,277 + (.0307)(1,000) = 2,277 + 30.7 = 2,307.7. If Revenue = 1,000, NetIncome will be about 2,307.70: the point (1,000, 2,307.7).
      [Figure: NetIncome plotted against Revenue, with the point (1,000, 2,307.7) marked]

  16. Other Problems. Prediction line: Y’ = a + b1X1, here Cost = 15.22 + 19.96 Persons (Y-intercept a = 15.22, slope b1 = 19.96)
      • If “Persons” = 4, what is the prediction for “Cost”? Cost = 15.22 + 19.96 (4) = 15.22 + 79.84 = 95.06. The expected cost for dinner for two couples (4 people) would be $95.06.
      • If “Persons” = 1, what is the prediction for “Cost”? Cost = 15.22 + 19.96 (1) = 15.22 + 19.96 = 35.18
      [Figure: Cost plotted against People]

  17. Other Problems. Prediction line: Y’ = a + b1X1, here Rent = 150 + 1.05 SqFt (Y-intercept a = 150, slope b1 = 1.05)
      • If “SqFt” = 800, what is the prediction for “Rent”? Rent = 150 + 1.05 (800) = 150 + 840 = 990. The expected cost for rent on an 800 square foot apartment is $990.
      • If “SqFt” = 2500, what is the prediction for “Rent”? Rent = 150 + 1.05 (2500) = 150 + 2,625 = 2,775
      [Figure: Cost (rent) plotted against Square Feet]

  18. Other Problems. Prediction line: Y’ = a + b1X1, here Frequency of teeth brushing = 5.5 + (-.91) Cavities (Y-intercept a = 5.5, slope b1 = -.91)
      • If “Cavities” = 3, what is the prediction for “Frequency of teeth brushing”? Frequency of teeth brushing = 5.5 + (-.91) (3) = 5.5 + (-2.73) = 2.77, the point (3.0, 2.77). The expected frequency of teeth brushing for someone with three cavities is about 2.77.

  19. How did we get this regression info? (Review)

        X    Y    XY   X²   Y²
        1    5     5    1   25
        3    4    12    9   16
        2    3     6    4    9
        3    2     6    9    4
        5    1     5   25    1
      Σ 14   15   34   48   55

      r = -0.85
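      One way to arrive at r = -0.85 from the column sums is the raw-score formula for the Pearson correlation, r = (nΣXY - ΣXΣY) / √[(nΣX² - (ΣX)²)(nΣY² - (ΣY)²)]. The slide does not say which formula was used, so treat this as a sketch under that assumption; the variable names are illustrative.

```python
import math

# Column sums from the table (n = 5 pairs)
n = 5
sum_x, sum_y = 14, 15
sum_xy, sum_x2, sum_y2 = 34, 48, 55

numerator = n * sum_xy - sum_x * sum_y          # 170 - 210 = -40
x_part = n * sum_x2 - sum_x ** 2                # 240 - 196 = 44
y_part = n * sum_y2 - sum_y ** 2                # 275 - 225 = 50
r = numerator / math.sqrt(x_part * y_part)      # -40 / sqrt(2200)

print(round(r, 2))  # -0.85
```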

  20. Draw a scatterplot (Review)

        X    Y    XY   X²   Y²
        1    5     5    1   25
        3    4    12    9   16
        2    3     6    4    9
        3    2     6    9    4
        5    1     5   25    1
      Σ 14   15   34   48   55

  22. Draw a regression line Review

  24. Draw a regression line and regression equation (Review): r = -0.85, b = -0.91 (slope), a = 5.5 (intercept)

  25. Draw a regression line and regression equation. Prediction line: Y’ = b1X1 + b0, here Y’ = (-.91)X1 + 5.5, where b0 = 5.5 (intercept), b1 = -0.91 (slope), r = -0.85
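      The slope and intercept can be recovered from the same column sums used for r. The sketch below assumes the usual least-squares formulas, b1 = (nΣXY - ΣXΣY) / (nΣX² - (ΣX)²) and b0 = mean(Y) - b1 · mean(X), which the slides do not spell out.

```python
# Column sums from the data table (n = 5 pairs)
n = 5
sum_x, sum_y, sum_xy, sum_x2 = 14, 15, 34, 48

# Least-squares slope: -40 / 44
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Least-squares intercept: mean(Y) - b1 * mean(X)
b0 = sum_y / n - b1 * (sum_x / n)

print(round(b1, 2), round(b0, 2))  # -0.91 5.55
```

      The unrounded intercept is about 5.545, which the slides show as 5.5.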

  26. Correlation - Evaluating the prediction line. Prediction line: Y’ = b1X1 + b0, here Y’ = (-.91)X1 + 5.5. Does the prediction line perfectly predict the Ys from the Xs? No, let’s see. How much “error” is there, exactly? Residuals: the green lines show how much “error” there is in our prediction line, that is, how much we are wrong in our predictions.
      [Figure: scatterplot of number of times per day teeth are brushed (Y, 0 to 5) against number of cavities (X, 0 to 5), with the prediction line and green residual lines]

  27. How well does the prediction line predict the Ys from the Xs? Residuals (a note about curvilinear relationships and patterns of the residuals)
      • Shorter green lines suggest better prediction, smaller error
      • Longer green lines suggest worse prediction, larger error
      • Why are green lines vertical?
      • Remember, we are predicting the variable on the Y axis
      • So, error would be how we are wrong about Y (vertical)
      [Figure: scatterplot of number of times per day teeth are brushed (Y, 0 to 5) against number of cavities (X, 0 to 5), with green residual lines]

  28. How well does the prediction line predict the Ys from the Xs? Residuals
      • Slope doesn’t give “variability” info
      • Intercept doesn’t give “variability” info
      • Correlation “r” does give “variability” info
      • Residuals do give “variability” info
      [Figure: scatterplot of number of times per day teeth are brushed (Y, 0 to 5) against number of cavities (X, 0 to 5)]

  29. What if we want to know the “average deviation score”? Sound familiar?? Finding the standard error of the estimate (line): √[Σ(Y - Y’)² / (n - 2)], where Y’ = our estimate and Y = actual data. Standard error of the estimate:
      • a measure of the average amount of predictive error
      • the average amount that Y’ scores differ from Y scores
      • a mean of the lengths of the green lines

  30. Correlation - let’s predict how often they brushed their teeth. Find prediction line: Y’ = b1 X + b0, here Y’ = (-0.91) X + 5.5. Plot line - predict Y’ from X:
      • Pick an X. Let’s try X of 1: Y’ = (-0.91) 1 + 5.5 = 4.59 (plot 1, 4.59)
      • Pick another X. Let’s try X of 5: Y’ = (-0.91) 5 + 5.5 = 0.95 (plot 5, 0.95)
      [Figure: scatterplot of number of times per day teeth are brushed (Y, 0 to 5) against number of cavities (X, 0 to 5)]

  31. These are our “predicted values” for each X score (a note on rounding errors). r = -0.85, b1 = -0.91, b0 = 5.5, so Y’ = b1 X + b0 = (-0.91) X + 5.5

        X    Y    Y’     Y-Y’
        1    5    4.59    0.41
        3    4    2.77    1.23
        2    3    3.68   -0.68
        3    2    2.77   -0.77
        5    1    0.95    0.05

      Y’ = (-0.91) 1 + 5.5 = 4.59
      Y’ = (-0.91) 2 + 5.5 = 3.68
      Y’ = (-0.91) 3 + 5.5 = 2.77
      Y’ = (-0.91) 4 + 5.5 = 1.86
      Y’ = (-0.91) 5 + 5.5 = .95
      [Figure: scatterplot of number of times per day teeth are brushed (Y, 0 to 5) against number of cavities (X, 0 to 5), with residuals .41, 1.23, -.68, -.77 and 0.05 marked]
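      The predicted values and residuals in the table can be reproduced as follows. This is a small sketch using the slide's rounded coefficients (b1 = -0.91, b0 = 5.5); the names are illustrative.

```python
B1, B0 = -0.91, 5.5  # rounded slope and intercept from the slides

# (cavities, times brushed per day) pairs from the data table
pairs = [(1, 5), (3, 4), (2, 3), (3, 2), (5, 1)]

predictions = [B1 * x + B0 for x, _ in pairs]                  # Y'
residuals = [y - y_hat for (_, y), y_hat in zip(pairs, predictions)]  # Y - Y'

for (x, y), y_hat, res in zip(pairs, predictions, residuals):
    print(f"X={x}  Y={y}  Y'={y_hat:.2f}  Y-Y'={res:+.2f}")
```

      Because the coefficients are rounded, these residuals do not sum exactly to zero, which is presumably what the rounding note on this slide refers to.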

  32. “Standard Error of the Estimate”: this is like our average (or standard) size of our residual. r = -0.85, b1 = -0.91, b0 = 5.5, so Y’ = b1 X + b0 = (-0.91) X + 5.5

        X    Y    Y’     Y-Y’    (Y-Y’)²
        1    5    4.59    0.41   0.168
        3    4    2.77    1.23   1.513
        2    3    3.68   -0.68   0.462
        3    2    2.77   -0.77   0.593
        5    1    0.95    0.05   0.0025
                             Σ = 2.739

      Standard error of the estimate = √(2.739 / 3) ≈ 0.955 (0.95 on the slide), where 3 = n - 2
      [Figure: scatterplot of number of times per day teeth are brushed (Y, 0 to 5) against number of cavities (X, 0 to 5), with residuals .41, 1.23, -.68, -.77 and 0.05 marked]
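      The calculation on this slide can be sketched in code. The divisor n - 2 is inferred from the slide's "2.739 / 3" with five data points, and the coefficients are the slide's rounded values, so the result is approximate.

```python
import math

B1, B0 = -0.91, 5.5  # rounded slope and intercept from the slides
pairs = [(1, 5), (3, 4), (2, 3), (3, 2), (5, 1)]
n = len(pairs)

# Sum of squared residuals, sum of (Y - Y')^2
sum_sq_residuals = sum((y - (B1 * x + B0)) ** 2 for x, y in pairs)

# Standard error of the estimate: sqrt( sum (Y - Y')^2 / (n - 2) )
std_error_estimate = math.sqrt(sum_sq_residuals / (n - 2))

print(round(sum_sq_residuals, 3))
print(round(std_error_estimate, 3))
```

      Read it like a standard deviation: a typical prediction from this line misses the actual brushing frequency by a little under one brushing per day.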

  33. What if we want to know the “average deviation score”? Finding the standard error of the estimate (line). Standard error of the estimate:
      • a measure of the average amount of predictive error
      • the average amount that Y’ scores differ from Y scores
      • a mean of the lengths of the green lines

  34. Which minimizes error better? Is the regression line better than just guessing the mean of the Y variable? How much does the information about the relationship actually help? How much better does the regression line predict the observed results? r². Wow!
      [Figure: two scatterplots of number of times per day teeth are brushed (Y, 0 to 5) against number of cavities (X, 0 to 5)]
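      The comparison can be made concrete by totaling the squared errors from each way of guessing. This is a sketch using the teeth-brushing data and the slide's rounded coefficients; the variable names are illustrative.

```python
B1, B0 = -0.91, 5.5  # rounded slope and intercept from the slides
pairs = [(1, 5), (3, 4), (2, 3), (3, 2), (5, 1)]
ys = [y for _, y in pairs]
mean_y = sum(ys) / len(ys)

# Squared error when every prediction is just the mean of Y
error_from_mean = sum((y - mean_y) ** 2 for y in ys)

# Squared error when predictions come from the regression line
error_from_line = sum((y - (B1 * x + B0)) ** 2 for x, y in pairs)

# Proportion of the mean-guess error that the line removes
proportion_explained = 1 - error_from_line / error_from_mean

print(error_from_mean, round(error_from_line, 2), round(proportion_explained, 2))
```

      The proportion comes out near 0.73, which is r² for these data up to rounding of the coefficients.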

  35. What is r²? r² = the proportion of the total variance in one variable that is predictable by its relationship with the other variable. Example: if mother’s and daughter’s heights are correlated with an r = .8, then what amount (proportion or percentage) of variance of mother’s height is accounted for by daughter’s height? .64 because (.8)² = .64, or 64% because (80%)² = 64%

  36. What is r²? r² = the proportion of the total variance in one variable that is predictable by its relationship with the other variable. Example: if mother’s and daughter’s heights are correlated with an r = .8, then what proportion of variance of mother’s height is not accounted for by daughter’s height? .36 because (1.0 - .64) = .36, or 36% because 100% - 64% = 36%

  37. What is r²? r² = the proportion of the total variance in one variable that is predictable by its relationship with the other variable. Example: if ice cream sales and temperature are correlated with an r = .5, then what amount (proportion or percentage) of variance of ice cream sales is accounted for by temperature? .25 because (.5)² = .25, or 25% because (50%)² = 25%

  38. What is r²? r² = the proportion of the total variance in one variable that is predictable by its relationship with the other variable. Example: if ice cream sales and temperature are correlated with an r = .5, then what amount (proportion or percentage) of variance of ice cream sales is not accounted for by temperature? .75 because (1.0 - .25) = .75, or 75% because 100% - 25% = 75%
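      The arithmetic in these r² examples is the same each time: square r for the proportion accounted for, and subtract that from 1 for the proportion not accounted for. A minimal sketch, with a hypothetical helper name:

```python
def variance_split(r):
    """Return (proportion accounted for, proportion not accounted for)."""
    explained = r ** 2
    return explained, 1 - explained

for label, r in [("heights", 0.8), ("ice cream and temperature", 0.5)]:
    explained, unexplained = variance_split(r)
    print(f"{label}: r = {r}, accounted for = {explained:.2f}, not accounted for = {unexplained:.2f}")
```

      Because r is squared, a negative correlation of the same size gives the same split.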

  39. Correlation - the prediction line - what is it good for? The prediction line:
      • makes the relationship easier to see (even if specific observations - dots - are removed)
      • identifies the center of the cluster of (paired) observations
      • identifies the central tendency of the relationship (kind of like a mean)
      • can be used for prediction
      • should be drawn to provide a “best fit” for the data
      • should be drawn to provide maximum predictive (explanatory) power for the data (r²)
      • should be drawn to provide minimum predictive error

  40. Thank you! Good luck with your studies!
