Hypothesis testing


Presentation Transcript


  1. Hypothesis Testing - Correlation, Regression, Sample t-Tests, Test for Equal Variances

  2. What is Hypothesis Testing?
     It is a way of analysing sample data to determine whether a change or improvement seen in the sample reflects a significant difference in the population.
     Why Do Hypothesis Testing?
     1) To improve a process, we have to discover the factors that significantly affect its mean and/or standard deviation.
     2) To confirm that improvements to the inputs we selected in the Measure phase make a significant difference to the process.
     3) Sometimes we decide to improve a system by personal judgement, which is a subjective approach to resolving the problem. Hypothesis testing gives an objective response, so it can be clearly seen whether a process has improved or not.

  3. Hypothesis Testing Terminology
     [Terminology table not recoverable from the transcript; only its two threshold labels survive: "= or > 0.05" and "< 0.05".]
     Hypothesis Testing Procedure:
     1) State your Null Hypothesis (Ho). This usually follows the format ‘There will be no significant difference in process performance’.
     2) State your Alternative Hypothesis (Ha). This usually follows the format ‘There will be a significant difference in process performance’.
     3) Test the hypotheses using one of the statistical tests that will be covered later on.
     4) Based on the results, either reject Ho (and accept Ha) or fail to reject Ho (and do not accept Ha).

  4. Hypothesis Testing - Decision Errors
     Ho = Person Is Innocent, Ha = Person Is Guilty

     Verdict          Truth: Innocent (Ho)               Truth: Guilty (Ha)
     Set Free (Ho)    Innocent, Set Free (correct)       Guilty, Set Free (Type II Error)
     Jailed (Ha)      Innocent, Jailed (Type I Error)    Guilty, Jailed (correct)

     Type I Error or α Error: this error occurs when no difference has occurred but we have detected one.
     Type II Error or β Error: this error occurs when a difference has occurred but has not been detected.
     To reduce the occurrence of these errors we place significance levels on the tests we perform. Normally we use a level of 5%, which means that when Ho is true there is only a 5% chance of wrongly rejecting it (a Type I Error).

  5. Hypothesis Testing - Procedures. To perform hypothesis testing there is a procedure to follow to ensure that the tests on your sample truly reflect the behaviour of the population. • Stage 1: Normality Test - as with the majority of statistical tests, to be confident of our results we need normally distributed data. 1) Basic Statistics 2) Run Chart

  6. Hypothesis Testing - Procedures • Stage 2: Definition of inputs/outputs - the type of data you have (discrete/continuous) determines which tools you can use

  7. Hypothesis Testing - Statistical Roadmap
     • Continuous Y (Response), X (Factor) treated as Continuous: Scatterplot, Simple Regression. Example panel: Y = Gas Mileage (mpg) against X = Car Weight (tons).
     • Continuous Y (Response), X (Factor) treated as Discrete: T-Test, Homogeneity of Variance, 1-Way ANOVA. Example panel: Y = Gas Mileage (mpg) against X = Car Brand (A, B, C), with the average marked for each brand.
     • Discrete Y: Goodness of Fit, Test of Independence. Example panel: Y = Injury Severity against X = Car Brand (A, B, C), with counts as extracted: Death 75, 60, 65; Major 160, 115, 175; Minor 100, 65, 135; None 15, 10, 25.

  8. Hypothesis Testing - Continuous X & Y. [Statistical roadmap figure shown again.] For a continuous Y (Response) with X (Factor) treated as continuous, the tools are Scatterplot and Simple Regression.

  9. Hypothesis Testing - Continuous X & Y: Scatter Plot. Purpose: To study if there is a relationship between two variables. Types of Relationships [three example plots of Y against X, both axes 0 to 25]: Strong Negative Correlation, Strong Positive Correlation, No Correlation.

  10. Scatterplot Exercise

  11. Scatterplot Exercise. 1. Double click C2 then C1. 2. Click ‘OK’. Results for: AgeYear.MTW [character plot of Yrs With (vertical axis, 0 to 30) against Age (horizontal axis, 24.0 to 54.0)]. There appears to be a relationship: as age increases, the number of years with the company increases.

  12. Correlation Analysis
      • Purpose: To measure the strength and direction of the linear association between two or more continuous variables. Unlike the scatter plot, correlation can analyse up to 15 factors at once (compare to the C & E Matrix, narrowing down). It does not give a graphical display, only a statistical figure for the strength of the relationship. It does not look at the relationship between input and output (cause and effect), just the association. Significant correlation does not always imply causality between the variables.
      • Key Features: The correlation value, or ‘r’ value as it is known, is always a number between -1 and +1, where r = -1 in the case of a perfect negative association, r = +1 in the case of a perfect positive association, and r = 0 when there is no linear association. Correlation analysis is a useful tool to help narrow down from multiple inputs to the inputs that prove vital. It does not, however, distinguish inputs from outputs: it treats every variable alike and reports whether there is an association between each pair.
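
As a rough illustration of the ‘r’ value described above, the sketch below computes a Pearson correlation by hand. The data sets are hypothetical (they are not from the presentation) and are chosen only to show the three cases r = +1, r = -1 and r = 0. Note that the last data set has a clear curved relationship yet still gives r = 0, because r only measures linear association.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical illustration data, not taken from the presentation.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
perfect_positive = [2.0, 4.0, 6.0, 8.0, 10.0]   # y = 2x
perfect_negative = [10.0, 8.0, 6.0, 4.0, 2.0]   # y = 12 - 2x
curved = [4.0, 1.0, 0.0, 1.0, 4.0]              # y = (x - 3)^2, no linear trend

r_pos = pearson_r(x, perfect_positive)
r_neg = pearson_r(x, perfect_negative)
r_none = pearson_r(x, curved)
print(r_pos, r_neg, r_none)
```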

  13. Correlation Analysis Exercise 1. Select these columns and click ‘OK’

  14. Correlation Analysis Exercise
      Correlations: C/O, N/M/A, N/P, P/R, Total Downtime

                      C/O     N/M/A       N/P       P/R
      N/M/A         0.113
                    0.311
      N/P          -0.132    -0.059
                    0.237     0.596
      P/R          -0.108    -0.087    -0.031
                    0.336     0.438     0.785
      Total Do      0.129     0.794     0.053    -0.029
                    0.250     0.000     0.639     0.794

      Cell Contents: Pearson correlation
                     P-Value

      N/M/A and Total Downtime appear to be strongly correlated, as the r value (0.794) is near +1 and the p-value (0.000) is below 0.05. We will see if they are closely related by performing regression analysis.

  15. Simple Linear Regression FITTED LINE PLOT • We have shown how to make scatter plots of data and talked about positive and negative correlation of two data sets. • Regression analysis is a statistical technique used to model and investigate the relationship between two or more variables. The model is often used for prediction. • It may be used to analyze relationships between the “X’s,” or between “Y” and “X.” • Regression is a powerful tool, but can never replace engineering or manufacturing process knowledge about trends.

  16. Linear Regression Exercise File: Downtime3.MTW 2. Click ‘Options’

  17. Linear Regression Exercise. 3. Check both boxes 4. Check that this reads 95 5. Click ‘Ok’ twice. The fitted line has the form Y = C + MX, where C is the intercept (the value of Y where the line crosses X = 0) and M is the slope (how steep the line is). The equation shown above the graph is a statistical model that represents the relationship between our two factors. The value below it (R-Sq) says that this equation explains 63% of the variation in Y.
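
A minimal sketch of how a line of the form Y = C + MX and its R-Sq can be obtained by ordinary least squares. The function name `fit_line` and the data are hypothetical and only illustrate the calculation; they are not the Downtime3.MTW data or the statistics package's own routine.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = C + M*X; returns (C, M, R-squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx                 # slope: how steep the line is
    c = mean_y - m * mean_x       # intercept: fitted Y at X = 0
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (c + m * x)) ** 2 for x, y in zip(xs, ys))
    r_sq = 1 - ss_resid / ss_total  # share of the variation in Y explained
    return c, m, r_sq

# Hypothetical illustration data, not taken from the presentation.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.1, 4.9, 7.2, 8.8]

c, m, r_sq = fit_line(x, y)
print(f"Y = {c:.2f} + {m:.2f} X, R-Sq = {r_sq:.1%}")
```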

  18. Confidence & Prediction Bands • A confidence band is a measure of the certainty of the shape of the fitted regression line. In general, a 95% confidence band implies a 95% chance that the true line lies within the band. [Red lines] • A prediction band (or interval) is a measure of the certainty of the scatter of individual points about the regression line. In general, 95% of the individual points (of the population on which the regression line is based) will be contained in the band. [Blue lines] • The regression line is the line of best fit. This line shows graphically the possible relationship between the two factors. It is also used as the basis for creating the mathematical model. [Black line]

  19. The Significance Of the r - Value. From our regression analysis we had 82 samples and a confidence level of 95%. Our R-Sq value from the linear regression was 63% or 0.63. Therefore the r-value = √0.63 = 0.794. If your calculated r-value is higher than the significance level on the chart, then there is a statistically significant relationship between your factors. In our case 0.794 > 0.1829, therefore a relationship does exist between NMA and Total Downtime.
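
The arithmetic on this slide can be checked in a few lines. The sketch also converts r into a t statistic using the standard identity t = r·√(n-2)/√(1-r²); that conversion is not on the slide and is added here only as a cross-check, since it should agree with the T value the regression output reports for the slope (11.66).

```python
import math

# Figures quoted on the slide.
r_sq = 0.63          # R-Sq from the linear regression
n = 82               # number of samples
critical_r = 0.1829  # significance level read from the chart

r = math.sqrt(r_sq)

# Equivalent t statistic for the correlation (an added cross-check,
# not part of the slide's procedure).
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_sq)

relationship_exists = r > critical_r
print(f"r = {r:.3f}, t = {t_stat:.2f}, significant: {relationship_exists}")
```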

  20. Regression Analysis (Multiple) • Purpose: To measure the strength of a linear association between one or more continuous inputs (up to 9) and a continuous output. • Key Features: Regression analysis studies closely the relationship between inputs and outputs. Unlike correlation, it treats the output as dependent upon the inputs, therefore you can clearly see if the inputs you have selected are having an effect on the output. As with previous tools, the p-value is used to judge importance, with values below 0.05 indicating a significant effect. Regression analysis also produces a regression equation that lets you estimate the settings required to obtain your required target.

  21. Regression Analysis Exercise File: Downtime3.MTW 1. Click ‘OK’

  22. Regression Analysis Exercise - Session Window Output
      Here is the mathematical regression model for our data.

      Regression Analysis: Total Downtime versus N/M/A

      The regression equation is
      Total Downtime = 1.41 + 0.904 N/M/A

      Predictor       Coef   SE Coef       T       P
      Constant      1.4057    0.1842    7.63   0.000
      N/M/A        0.90381   0.07749   11.66   0.000

      S = 1.462   R-Sq = 63.0%   R-Sq(adj) = 62.5%

      Analysis of Variance
      Source            DF       SS       MS        F       P
      Regression         1   290.97   290.97   136.04   0.000
      Residual Error    80   171.11     2.14
      Total             81   462.09

      Unusual Observations
      Obs   N/M/A   Total Do      Fit   SE Fit   Residual   St Resid
       11     1.0      6.000    2.309    0.162      3.691      2.54R
       26     0.0      6.500    1.406    0.184      5.094      3.51R
       33     0.0      4.500    1.406    0.184      3.094      2.13R
       51     6.5      7.750    7.280    0.445      0.470      0.34 X
       54    11.0     11.000   11.348    0.781     -0.348     -0.28 X
       55     8.0      8.000    8.636    0.555     -0.636     -0.47 X
       56     7.5      9.920    8.184    0.518      1.736      1.27 X
       62     0.0      4.500    1.406    0.184      3.094      2.13R
       64     0.0      4.500    1.406    0.184      3.094      2.13R

      R denotes an observation with a large standardized residual
      X denotes an observation whose X value gives it large influence.

      As the p-values are below 0.05, there is a statistically significant relationship between Total Downtime and No Member Available.
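
The summary figures in the session output are related to one another through the analysis of variance table. The sketch below recomputes R-Sq, R-Sq(adj), S, F and the slope's T value from the sums of squares, degrees of freedom and coefficient printed above, using the standard definitions; small differences from the printed values come from rounding in the output.

```python
import math

# Figures copied from the session window output.
ss_regression, df_regression = 290.97, 1
ss_residual, df_residual = 171.11, 80
ss_total, df_total = 462.09, 81
coef, se_coef = 0.90381, 0.07749   # N/M/A slope and its standard error

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_sq = ss_regression / ss_total                      # R-Sq
r_sq_adj = 1 - ms_residual / (ss_total / df_total)   # R-Sq(adj)
s = math.sqrt(ms_residual)                           # S, residual std deviation
f_stat = ms_regression / ms_residual                 # F
t_slope = coef / se_coef                             # T for the slope

print(f"R-Sq = {r_sq:.1%}, R-Sq(adj) = {r_sq_adj:.1%}, S = {s:.3f}")
print(f"F = {f_stat:.2f}, T = {t_slope:.2f}")
```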

  23. Hypothesis Testing - Continuous Y & Discrete X. [Statistical roadmap figure shown again.] For a continuous Y (Response) with X (Factor) treated as discrete, the tools are T-Test, Homogeneity of Variance and 1-Way ANOVA.

  24. Hypothesis Testing - Procedures • Stage 3: This stage of the procedure compares discrete inputs (x) with continuous outputs (y). The tool you use depends on how many samples you have and what format your data is in.

  25. Single Sample Test For A Mean
      • 30 pieces of data were collected on a grinding operation for steam turbine buckets.
      • The sample mean was .96953.
      • The sample standard deviation was .00017.
      • The desired target was .96960.
      • Is this process off target?
      Ho: Population Mean = Target Value, i.e. μ = .96960 = T
      Ha: Population Mean <> Target Value, i.e. μ ≠ .96960
      α = .05
      Acceptance Region: Accept Ho if the target is in this region:
      $\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
      There is 1-α certainty that the true population mean will be contained within the given confidence interval.

  26. Test For A Mean - Hand Calculation
      Target = .96960, x̄ = .96953, s = .00017, n = 30, n-1 = dof = 29
      α = .05 (this represents our requirement to be 95% confident in our result), α/2 = .025
      t(α/2, n-1) = t(.025, 29) = 2.045 (taken from the t-Distribution Chart on the next page)
      $\bar{x} - t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1}\,\frac{s}{\sqrt{n}}$
      $.96953 - 2.045 \cdot \frac{.00017}{\sqrt{30}} < \mu < .96953 + 2.045 \cdot \frac{.00017}{\sqrt{30}}$
      $.96947 < \mu < .96959$
      The population mean lies between these two values. We are 95% confident of this as we set our α level at 0.05.
      Conclusion: Because the target is not in the acceptance region, we conclude the process is off target. Accept Ha.
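
The hand calculation can be reproduced as follows. All inputs are the slide's own figures, with the t value 2.045 taken from the t-distribution chart rather than computed. The sketch also forms the equivalent t statistic, (x̄ - target)/(s/√n), which is not on the slide but leads to the same decision: the process is judged off target when its absolute value exceeds 2.045.

```python
import math

# Figures from the slide.
target = 0.96960
x_bar = 0.96953
s = 0.00017
n = 30
t_crit = 2.045   # t(.025, 29), read from the t-distribution chart

std_error = s / math.sqrt(n)
lower = x_bar - t_crit * std_error
upper = x_bar + t_crit * std_error

# The process is judged off target when the target falls outside the interval.
off_target = not (lower < target < upper)

# Equivalent test statistic (added cross-check, not on the slide).
t_stat = (x_bar - target) / std_error

print(f"{lower:.6f} < mu < {upper:.6f}; off target: {off_target}")
print(f"t = {t_stat:.3f} versus critical value {t_crit}")
```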

  27. t - Distribution Chart

      α/2          .400    .300    .200    .100    .050     .025     .010     .005
      1 - (α/2)    .600    .700    .800    .900    .950     .975     .990     .995
      n - 1
        1         0.325   0.727   1.376   3.078   6.314   12.706   31.821   63.657
        2         0.289   0.617   1.061   1.886   2.920    4.303    6.965    9.925
        3         0.277   0.584   0.978   1.638   2.353    3.182    4.541    5.841
        4         0.271   0.569   0.941   1.533   2.132    2.776    3.747    4.604
        5         0.267   0.559   0.920   1.476   2.015    2.571    3.365    4.032
        6         0.265   0.553   0.906   1.440   1.943    2.447    3.143    3.707
        7         0.263   0.549   0.896   1.415   1.895    2.365    2.998    3.499
        8         0.262   0.546   0.889   1.397   1.860    2.306    2.896    3.355
        9         0.261   0.543   0.883   1.383   1.833    2.262    2.821    3.250
       10         0.260   0.542   0.879   1.372   1.812    2.228    2.764    3.169
       11         0.260   0.540   0.876   1.363   1.796    2.201    2.718    3.106
       12         0.259   0.539   0.873   1.356   1.782    2.179    2.681    3.055
       13         0.259   0.538   0.870   1.350   1.771    2.160    2.650    3.012
       14         0.258   0.537   0.868   1.345   1.761    2.145    2.624    2.977
       15         0.258   0.536   0.866   1.341   1.753    2.131    2.602    2.947
       16         0.258   0.535   0.865   1.337   1.746    2.120    2.583    2.921
       17         0.257   0.534   0.863   1.333   1.740    2.110    2.567    2.898
       18         0.257   0.534   0.862   1.330   1.734    2.101    2.552    2.878
       19         0.257   0.533   0.861   1.328   1.729    2.093    2.539    2.861
       20         0.257   0.533   0.860   1.325   1.725    2.086    2.528    2.845
       21         0.257   0.532   0.859   1.323   1.721    2.080    2.518    2.831
       22         0.256   0.532   0.858   1.321   1.717    2.074    2.508    2.819
       23         0.256   0.532   0.858   1.319   1.714    2.069    2.500    2.807
       24         0.256   0.531   0.857   1.318   1.711    2.064    2.492    2.797
       25         0.256   0.531   0.856   1.316   1.708    2.060    2.485    2.787
       26         0.256   0.531   0.856   1.315   1.706    2.056    2.479    2.779
       27         0.256   0.531   0.855   1.314   1.703    2.052    2.473    2.771
       28         0.256   0.530   0.855   1.313   1.701    2.048    2.467    2.763
       29         0.256   0.530   0.854   1.311   1.699    2.045    2.462    2.756
       30         0.256   0.530   0.854   1.310   1.697    2.042    2.457    2.750
       40         0.255   0.529   0.851   1.303   1.684    2.021    2.423    2.704
       60         0.254   0.527   0.848   1.296   1.671    2.000    2.390    2.660
      120         0.254   0.526   0.845   1.289   1.658    1.980    2.358    2.617
        ∞         0.253   0.524   0.842   1.282   1.645    1.960    2.326    2.576

  28. One-Sample t Test Exercise File: RunRate.MTW

  29. One-Sample t Test Of Mean Exercise. 1. Double click C11 2. Check Test Mean & enter 75 3. Check ‘Boxplots’ 4. Click ‘OK’ twice. Ho: μ = 75 (Run Rate is 75%). Ha: μ ≠ 75 (Run Rate isn’t 75%).

  30. One-Sample t Test Of Mean Exercise
      From the graph we can see that the hypothesised mean (Ho) lies within the 95% confidence interval. This is supported by the p-value below, which is > 0.05; therefore we fail to reject Ho and do not accept Ha.

      One-Sample T: No Panels Run Rate
      Test of mu = 75 vs mu not = 75

      Variable         N     Mean    StDev   SE Mean
      No Panels Ru    31    78.90    27.10      4.87

      Variable               95.0% CI         T       P
      No Panels Ru    ( 68.96, 88.85)      0.80   0.429
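
The SE Mean, T and confidence interval in this output follow from N, Mean and StDev. A minimal sketch, using the printed summary values and the t value for 30 degrees of freedom from the t-distribution chart (2.042); the upper limit differs from the printed one in the second decimal place because the printed mean is rounded.

```python
import math

# Summary figures from the session output.
n = 31
mean = 78.90
stdev = 27.10
test_mean = 75.0
t_crit = 2.042   # t(.025, 30) from the t-distribution chart

se_mean = stdev / math.sqrt(n)
t_stat = (mean - test_mean) / se_mean
ci_lower = mean - t_crit * se_mean
ci_upper = mean + t_crit * se_mean

# Ho is not rejected when the hypothesised mean lies inside the interval.
ho_inside_ci = ci_lower < test_mean < ci_upper

print(f"SE Mean = {se_mean:.2f}, T = {t_stat:.2f}")
print(f"95% CI = ({ci_lower:.2f}, {ci_upper:.2f}), Ho inside: {ho_inside_ci}")
```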

  31. p-Value in t-Tests
      • The p-value is the probability of observing an equal or more extreme test statistic result, assuming that Ho is true. It is compared against α, the probability of making a Type I error that we are prepared to accept.
      • Unless there is an exception based on engineering judgment, we will standardize on a Type I error probability of α = 0.05.
      • Thus, any p-value less than 0.05 means we reject the null hypothesis and accept the alternative hypothesis.
      p < α: Reject Ho (accept Ha)
      p > α: Fail to reject Ho (do not accept Ha)
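
The decision rule can be written as a small helper. The function name `decide` is hypothetical; the two p-values used to exercise it are the ones reported elsewhere in this presentation (0.429 for the one-sample t-test and 0.000 for the regression slope).

```python
def decide(p_value, alpha=0.05):
    """Apply the standard rule: reject Ho only when p is below alpha."""
    if p_value < alpha:
        return "reject Ho"
    return "fail to reject Ho"

one_sample_result = decide(0.429)   # one-sample t-test on run rate
regression_result = decide(0.000)   # slope in the regression output
print(one_sample_result, "|", regression_result)
```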

  32. Hypothesis Testing - Procedures • Stage 3: This stage of the procedure compares discrete inputs (x) with continuous outputs (y). The tool you use depends on how many samples you have and what format your data is in.

  33. Two Samples - Testing For Means. The Steps Required For Successful Testing: 1) Before you can test whether there is a significant difference between the means of the two data sets you have collected, you must first see if the variances of the two data sets differ. The reason for performing this test is that if they vary considerably, you cannot perform a test on the two means with any real confidence. 2) The test you will perform for testing the variances is called: Test For Equal Variances

  34. Two Samples - Testing For Means Exercise 1. Title C15 ‘No Panel & No Schedule’ and C16 ‘NP & NS Subs’ 2. Stack data from C11 and C14 into C15 File: RunRate.MTW

  35. Two Samples - Testing For Means Exercise. 3. Double click on C16 then C15. Ho: σ1² = σ2² (variances are equal). Ha: σ1² ≠ σ2² (variances aren’t equal).

  36. Two Samples - Testing For Means Exercise
      F-Test (normal distribution): Test Statistic: 1.281, P-Value: 0.502
      Levene's Test (any continuous distribution): Test Statistic: 0.680, P-Value: 0.413
      As we have 31 data points for each data set and our data is Normal, we use the result from the F-Test*. As p > 0.05, we do not reject Ho and treat the variances as equal; we can then proceed with testing the means for these two data sets.
      Use the result from Levene’s Test if you have fewer than 30 points within an individual sample and/or you have been unable to transform your data into a Normal Distribution.
      * This test may be replaced by another called Bartlett’s Test depending on how the data is configured. The same rules apply.
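
A rough check of the F-Test statistic, assuming it is the ratio of the two sample variances with the larger one on top. The standard deviations (27.1 and 23.9) are the rounded values printed in the two-sample t-test output later in this presentation, so the result only approximates the reported 1.281. The p-value needs the F distribution and is not recomputed here.

```python
# Rounded sample standard deviations from the two-sample output.
sd_no_panels = 27.1
sd_no_schedule = 23.9

# Assumed form of the statistic: larger variance divided by smaller variance.
var_large = max(sd_no_panels, sd_no_schedule) ** 2
var_small = min(sd_no_panels, sd_no_schedule) ** 2
f_stat = var_large / var_small

print(f"F = {f_stat:.3f} (reported: 1.281)")
```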

  37. Two Samples - Testing For Means Exercise. The Steps Required For Successful Testing - Continued: 4) Now that the test has found no significant difference between the variances of the two data sets (at the 95% confidence level), you must test whether the means are different. 5) Before you begin this test you must first decide what type of test you wish to perform: Do you want to test whether the two means are not equal? Do you want to test whether one mean is greater than the other? Do you want to test whether one mean is less than the other? 6) Once decided, you are ready to perform the following test on the means: 2-Sample t-Test

  38. Two Samples - Testing For Means Exercise File: RunRate.MTW

  39. Two Samples - Testing For Means Exercise. Ho: μ1 = μ2 (means are equal). Ha: μ1 ≠ μ2 (means aren’t equal). 1. Check ‘Samples in different columns’ 2. Double click on C11 into ‘First’ then double click on C14 into ‘Second’ 3. As the variances were found to be equal, check this box 4. Click on ‘Options’ 5. Ensure ‘Alternative’ reads ‘not equal’ and that the Confidence level is set at 95% 6. Click on ‘Graphs’ 7. Check ‘Boxplots of data’ 8. Click on ‘OK’ twice

  40. Two Samples - Testing For Means Exercise
      As p > 0.05 we fail to reject Ho: the means are not significantly different.

      Two-Sample T-Test and CI: No Panels Run Rate, No Schedule Run Rate
      Two-sample T for No Panels Run Rate vs No Schedule Run Rate

                    N    Mean   StDev   SE Mean
      No Panel     31    78.9    27.1       4.9
      No Sched     31    85.6    23.9       4.3

      Difference = mu No Panels Run Rate - mu No Schedule Run Rate
      Estimate for difference: -6.71
      95% CI for difference: (-19.70, 6.29)
      T-Test of difference = 0 (vs not =): T-Value = -1.03  P-Value = 0.306  DF = 60
      Both use Pooled StDev = 25.6
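
The pooled standard deviation, T-Value and confidence interval can be reproduced from the summary rows of this output. The sketch uses the standard pooled two-sample formulas and the t value for 60 degrees of freedom from the t-distribution chart (2.000); results differ slightly from the printed ones because the inputs are rounded.

```python
import math

# Summary figures from the session output.
n1, sd1 = 31, 27.1           # No Panels Run Rate
n2, sd2 = 31, 23.9           # No Schedule Run Rate
diff = -6.71                 # estimate for difference (first minus second)
t_crit = 2.000               # t(.025, 60) from the t-distribution chart

df = n1 + n2 - 2
pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
se_diff = pooled_sd * math.sqrt(1 / n1 + 1 / n2)

t_value = diff / se_diff
ci_lower = diff - t_crit * se_diff
ci_upper = diff + t_crit * se_diff

print(f"DF = {df}, Pooled StDev = {pooled_sd:.1f}, T-Value = {t_value:.2f}")
print(f"95% CI for difference: ({ci_lower:.2f}, {ci_upper:.2f})")
```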

  41. Two Samples - Testing For Means Exercise. 1. Click on the drop down menu and select ‘greater than’ 2. Repeat procedure from previous test. Ho: μ1 ≤ μ2 (first mean is less than or equal to the second). Ha: μ1 > μ2 (first mean is greater than the second).

  42. Two Samples - Testing For Means Exercise
      As p > 0.05 we fail to reject Ho: there is no evidence that the mean for No Panels is greater than that of No Schedule.

      Two-Sample T-Test and CI: No Panels Run Rate, No Schedule Run Rate
      Two-sample T for No Panels Run Rate vs No Schedule Run Rate

                    N    Mean   StDev   SE Mean
      No Panel     31    78.9    27.1       4.9
      No Sched     31    85.6    23.9       4.3

      Difference = mu No Panels Run Rate - mu No Schedule Run Rate
      Estimate for difference: -6.71
      95% lower bound for difference: -17.56
      T-Test of difference = 0 (vs >): T-Value = -1.03  P-Value = 0.847  DF = 60
      Both use Pooled StDev = 25.6
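
Two details of the one-sided output can be checked against the two-sided test. The lower bound uses the one-tailed t value for 60 degrees of freedom (1.671 in the .050 column of the t-distribution chart), and because the T-Value is negative while Ha is "greater than", the one-sided p-value is 1 minus half the two-sided p-value (0.306). Both are standard relationships, sketched here with the rounded summary figures.

```python
import math

# Summary figures from the session output.
n1, sd1 = 31, 27.1
n2, sd2 = 31, 23.9
diff = -6.71
t_value = -1.03
p_two_sided = 0.306          # from the 'not equal' test
t_crit_one_sided = 1.671     # t(.05, 60) from the t-distribution chart

pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
se_diff = pooled_sd * math.sqrt(1 / n1 + 1 / n2)

# 95% lower bound for the difference (first mean minus second mean).
lower_bound = diff - t_crit_one_sided * se_diff

# Ha is 'greater than': the p-value is the upper-tail area beyond t_value.
if t_value < 0:
    p_one_sided = 1 - p_two_sided / 2
else:
    p_one_sided = p_two_sided / 2

print(f"95% lower bound = {lower_bound:.2f}, one-sided P-Value = {p_one_sided:.3f}")
```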

  43. Two Samples - Testing For Medians (Nonparametric Test - Mann - Whitney) File: RunRate.MTW

  44. Two Samples - Testing For Medians (Nonparametric Test - Mann - Whitney)
      1. Check ‘Samples in different columns’
      Ho: Medians are equal. Ha: Medians are not equal.
      As p > 0.05 we fail to reject Ho: the medians are not significantly different.

      Mann-Whitney Test and CI: No Panels Run Rate, No Schedule Run Rate

      No Panel   N = 31   Median = 87.92
      No Sched   N = 31   Median = 100.00
      Point estimate for ETA1-ETA2 is -0.00
      95.1 Percent CI for ETA1-ETA2 is (-12.50,-0.00)
      W = 872.0
      Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.1431
      The test is significant at 0.1175 (adjusted for ties)
      Cannot reject at alpha = 0.05
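
The unadjusted significance level (0.1431) can be approximated from W and the two sample sizes alone. The sketch assumes the usual normal approximation to the rank-sum statistic with a continuity correction; that this is how the reported figure was produced is an assumption, and the tie-adjusted value (0.1175) is not reproduced because it needs the raw data.

```python
import math

# Figures from the session output.
n1 = 31        # No Panels Run Rate
n2 = 31        # No Schedule Run Rate
w = 872.0      # rank sum of the first sample

# Mean and standard deviation of W when Ho (equal medians) is true.
expected_w = n1 * (n1 + n2 + 1) / 2
sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Continuity correction: move W half a unit towards its expected value.
correction = 0.5 if w < expected_w else -0.5
z = (w + correction - expected_w) / sd_w

# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

cannot_reject = p_two_sided > 0.05
print(f"z = {z:.3f}, p = {p_two_sided:.4f}, cannot reject Ho: {cannot_reject}")
```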
