Lecture 3 Estimation and Hypothesis Testing
Objectives: • Formulate a null and alternate hypothesis for a question of interest; • Explain what a test statistic is; • Explain p-values; • Describe the reasoning of hypothesis testing; • Determine and check assumptions for the sampling distribution model; • Compare p-values to a pre-determined significance level to decide whether to reject the null hypothesis; • Recognise the value of estimating and reporting the effect size; and • Explain Type I and Type II errors when testing hypotheses. Lecture 3
Hypotheses • Unproven statements or propositions about a factor or phenomenon that is of interest • We seek evidence to determine if a null hypothesis can be rejected • Null Hypothesis (H0): states the parameter is equal to some value (no change) • Alternate Hypothesis (HA): what we are trying to prove true and states the parameter is not equal to the value of H0 (the situation has changed) • Examples • Non-directional (two-sided): H0: Customer expenditure = $180; HA: Customer expenditure ≠ $180 • Directional (one-sided): H0: Average delivery time ≤ 20 mins; HA: Average delivery time > 20 mins Hypothesis Testing
Hypotheses are tests for something • Tests require a standard or basis of conclusion • Is a parameter equal to some hypothesised value? • Will there be a difference between groups? • Will the variables be linked or associated? • Tests are selected by considering: • What you want to investigate • Whether you have one, two or more variables • Whether data are nominal, ordinal or interval or ratio. Hypothesis Testing
Considers a simple statement about a population • This is the hypothesis • Uses a sample to test whether the statement is likely to be true or is unlikely • It sees whether or not data from a sample supports a hypothesis about the population • If the hypothesis is unlikely, the hypothesis is rejected and another implied hypothesis must be true Hypothesis testing in simple terms
The same logic used in jury trials is used in statistical tests of hypotheses: • We begin by assuming that a hypothesis is true. • Next we consider whether the data are consistent with the hypothesis. • If they are, all we can do is retain the hypothesis we started with. • If they are not, then like a jury, we ask whether they are unlikely beyond a reasonable doubt. A Trial as a Hypothesis Test
A politician claims that 10% of factories in an area are losing money and we wish to test this claim. (null hypothesis is the politician’s statement) • A sample of 30 shows that all of them are profitable • The data are compared with what we would expect if the null hypothesis is true (test statistic) • If the null hypothesis is true, the probability that they are all profitable is 0.0424 (p-value) • This is very unlikely, so we can reject the null hypothesis (test result) Simple Example
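The p-value in the simple example can be reproduced with a short calculation. This is a minimal sketch, assuming the 30 sampled factories are independent of one another (an assumption the slide does not state); the variable names are illustrative only.

```python
# Probability that all 30 sampled factories are profitable,
# assuming the politician's claim (10% losing money) is true.
# Assumption: factories are sampled independently.
p_losing = 0.10      # claimed proportion losing money (null hypothesis)
n = 30               # number of factories in the sample

# Each factory is profitable with probability 0.90 under H0,
# so all 30 are profitable with probability 0.90 ** 30.
p_value = (1 - p_losing) ** n

print(round(p_value, 4))  # 0.0424
```

Because 0.0424 is below the usual 0.05 significance level, the data would be unusual if the claim were true, which is why the slide rejects the null hypothesis.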
Step 1: Establish the null and alternative hypothesis • Step 2: Determine the appropriate statistical test • Step 3: Determine the significance level (alpha) • Step 4: Establish the decision rule • Step 5: Gather and analyse sample data • Step 6: Reach a statistical conclusion • Step 7: Make a business decision • These steps fall into four stages: Hypothesise, Model, Mechanics, Conclusion Hypothesis Testing
Null (H0): • A statement of no difference or no effect • Specifies a population model parameter and proposes a value for that parameter • If null is not rejected, no changes will be made (assumes the default is true) • Always the hypothesis that is tested • Alternate (HA): • Some difference or effect is expected • Contains the values of the parameter that we consider plausible if we reject the null hypothesis • If accepted will lead to changes in opinions or actions Formulate H0 and HA
E.g. Smiths Chips uses an automatic filling machine to fill bags to their desired weight. The machine is set to fill bags with 50g of chips. Each hour, a sample of bags is collected and weighed to determine whether the machine is working properly. • Null (H0): • We assume the default is true, therefore: • H0: μ = 50g • Alternate (HA): • What other values are plausible if we reject H0? • HA: μ ≠ 50g Example H0 and HA
Type I Error • The null hypothesis is true but we mistakenly reject it. • E.g. We find a difference when there isn’t one • The significance level is the probability of making a Type I error • Type II Error • The null hypothesis is false, but we fail to reject it. • E.g. We don’t find a difference but there IS one Type I and Type II Errors
Here’s an illustration of the situations: When you choose level α, you’re setting the probability of a Type I error to α. Type I and Type II Errors
The significance level (alpha level, α) is the criterion used for rejecting the null hypothesis • It is the maximum probability of rejecting the null hypothesis when the null hypothesis is actually true • If the significance level is 0.05 this means we are willing to risk a 5% chance of rejecting a true null hypothesis Select the significance level
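One way to see what "a 5% chance of rejecting a true null hypothesis" means is to simulate it. The sketch below is illustrative only: it assumes a Normal population whose true mean really is 50g, and it uses a hypothetical standard deviation (3g), sample size (30) and two-sided z-test chosen purely for the demonstration.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

alpha = 0.05
mu0 = 50.0      # value stated by H0, and also the TRUE mean here
sigma = 3.0     # hypothetical population standard deviation
n = 30          # hypothetical sample size

z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
se = sigma / n ** 0.5                         # standard error of the mean

trials = 10_000
rejections = 0
for _ in range(trials):
    # Draw a sample from a population where H0 is true.
    sample_mean = sum(random.gauss(mu0, sigma) for _ in range(n)) / n
    z = (sample_mean - mu0) / se
    if abs(z) > z_crit:
        rejections += 1  # rejecting a true H0 is a Type I error

type_i_rate = rejections / trials
print(f"Observed Type I error rate: {type_i_rate:.3f}")
```

The observed rate should land close to α, since every rejection in this simulation is by construction a Type I error.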
How do we determine if the null hypothesis is supported? • H0: μ = 50g • HA: μ ≠ 50g • Collect a sample • Certain values will support the null hypothesis, whereas other values will support the alternate hypothesis • H0: μ = 50g Which values would support this? • HA: μ ≠ 50g Which values would support this? • What about sampling error? • Would you reject the null hypothesis if the average weight was 51g? Performing Hypothesis Tests
At what point do we stop attributing the result to sampling error? • We need to select a cut-off point • If the test statistic > Cut-off, reject H0 • If the test statistic ≤ Cut-off, do not reject H0 Sampling Distribution Probability of committing a Type I error = α Do not reject H0 Cut-off μ = 50 Reject H0 Performing Hypothesis Tests Note: A test statistic is a numerical summary of the sample data e.g. sample mean, z, t, chi square
P-value • Measures the strength of evidence in support of a null hypothesis • Is the probability of observing a test statistic (or one more extreme) if the null hypothesis is true. • If the P-value is less than the significance level, we reject the null hypothesis. • Region of Acceptance • Defined so that the chance of making a Type I error is equal to the significance level (it is a range of values) • If the test statistic falls within the region of acceptance, the null hypothesis is not rejected. • The set of values outside the region of acceptance is called the region of rejection. • If the test statistic falls within the region of rejection, the null hypothesis is rejected. Decision Rules: Two Approaches
Compute a test statistic from sample data and compare it to the hypothesized sampling distribution of the test statistic • Divide the sampling distribution into a rejection region and non-rejection region. • If the test statistic falls in the rejection region, reject H0 (concluding that HA is true); otherwise, fail to reject H0 Decision Rules – Approach 1
Sampling Distribution Probability of committing a Type I error = α Do not reject H0 Critical Value Rejection Region μ = 50 • The area of the rejection region gives the probability of getting a sample mean larger than the cut-off (critical value) when the population mean is actually 50g • This is the probability of making a Type 1 error • This probability is called the significance level (α = alpha) Performing Hypothesis Tests
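H0: = with HA: ≠ (two-tailed test: rejection regions in both tails) • H0: ≤ with HA: > (one-tailed test: rejection region in the upper tail) • H0: ≥ with HA: < (one-tailed test: rejection region in the lower tail) Rejection Regions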
A critical value, z*, corresponds to a selected confidence level. Before computers could calculate p-values, you had to: • Select an alpha level • Look up the corresponding critical value for the relevant model • Calculate how many standard deviations the observed statistic was away from the hypothesized value (z-score) • Compare that value directly against the z* values. Any z-score (test statistic) larger in magnitude than a particular critical value means the null hypothesis should be rejected. Critical Values: traditional z* critical values from the Normal model
H0: μ = 50g • HA: μ ≠ 50g • Normal (Z) distribution • Assume the significance level is 0.05 (two-sided, so α/2 = 0.025 in each tail) • The z critical value = ±1.96 • Let's assume the sample size = 30, with a s.d. of 3 • Critical sample means = 50 ± 1.96 (3/√30) = 48.93, 51.07 • Non-rejection region: sample means from 48.93 to 51.07 (z between -1.96 and 1.96) • Rejection regions: sample means below 48.93 or above 51.07
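The critical sample means on this slide can be checked with Python's standard library. This is a sketch under the slide's own assumptions (Normal model, s.d. of 3, sample size 30, two-sided test at α = 0.05); the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05   # significance level
mu0 = 50.0     # hypothesised mean under H0 (grams)
sigma = 3.0    # standard deviation used on the slide
n = 30         # sample size used on the slide

# Two-sided test: put alpha/2 in each tail of the standard Normal.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

se = sigma / n ** 0.5           # standard error of the sample mean
lower = mu0 - z_crit * se       # lower critical sample mean
upper = mu0 + z_crit * se       # upper critical sample mean

print(round(z_crit, 2), round(lower, 2), round(upper, 2))  # 1.96 48.93 51.07
```

A sample mean between the two printed limits falls in the non-rejection region; anything outside them falls in a rejection region.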
2. Model • You need to determine the appropriate model for the sampling distribution of the statistic you will use to test the null hypothesis and the parameter of interest. • Critical values differ depending on the sampling distribution (e.g. Normal (Z) distribution, t-distribution, χ² distribution, F-distribution) • Each test has its own assumptions and you will learn these as we learn each test. Performing Hypothesis Tests
3. Mechanics – Using SPSS • Perform the actual calculation of the test statistic from the data. Usually, the mechanics are handled by a statistics program (e.g. SPSS) • Different tests will have different formulas and different test statistics. • The goal of the calculation is to obtain a P-value. Performing Hypothesis Tests
p-value = probability of obtaining a test statistic value equal to or more extreme than that obtained from the sample data when H0 is true. • Test statistic outside the rejection region (p-value larger than α): DO NOT REJECT • Test statistic inside the rejection region (p-value smaller than α): REJECT Approach 2 - Using p-values
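For a test statistic that follows the standard Normal model, the definition above translates directly into a tail-area calculation. The helper below is a hypothetical illustration (the name `p_value_from_z` and its `tail` argument are not from the lecture), and it covers the z case only; t, χ² and F statistics need their own distributions.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two-sided"):
    """Tail probability of a z statistic under H0 (standard Normal model).

    tail: "two-sided" for HA: ≠, "greater" for HA: >, "less" for HA: <.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two-sided":
        # Values at least as extreme in either direction.
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "greater":
        return 1 - std_normal.cdf(z)
    if tail == "less":
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two-sided', 'greater' or 'less'")

# A z statistic exactly at the two-sided 5% critical value:
print(round(p_value_from_z(1.96), 3))  # 0.05
```

A statistic sitting exactly on the critical value gives a p-value equal to α, which is why the critical-value approach and the p-value approach always lead to the same decision.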
“Could these data plausibly have happened by chance if the null hypothesis were true?” • The p-value is the probability of seeing the observed data (or something even less likely) given the null hypothesis. • A low p-value means: • The observed data would be unlikely if the null hypothesis was true • You believe the data and reject the null hypothesis • A high p-value means: • The data are consistent with the model from the null hypothesis, and we have no reason to reject the null hypothesis. • Formally, we say that we “fail to reject” the null hypothesis. Compare to the p-value
Making a statistical Decision: • First, the difference between the data results and the null hypothesis is determined (test statistic) • Then, assuming the null hypothesis is true, the probability of a difference that large or larger is computed (p-value) • This probability is compared to the significance level (0.05). • If the probability is less than or equal to the significance level, then the null hypothesis is rejected and the outcome is said to be statistically significant Hypothesis Testing
If probability (p-value) < 0.05 • REJECT null hypothesis • Statistically significant If probability (p-value) > 0.05 • DO NOT REJECT null hypothesis • Differences / Associations are not statistically significant • Only due to sampling variation Compare p-value with significance level
H0: μ = 50g • HA: μ ≠ 50g Assume the significance level is 0.05 • Sample mean = 49.05g • Using SPSS: p-value = 0.454 Should we reject or not reject the null hypothesis? Remember the confidence interval we created earlier, do you reach the same conclusion? 95% Confidence Interval = (48.93; 51.07) Using p-values
• Formulate H0 and HA • Select an appropriate test • Choose the level of significance, α (typically α = 0.05) • Calculate test statistic and determine the probability of the data being consistent with the null hypothesis (p-value) • Compare p-value with level of significance: is HA supported? • P-value < α: Reject H0 • P-value > α: Do not reject H0 • Draw conclusion Hypothesis Testing Mechanics
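The final comparison in this flow can be written as a small decision function. This is a sketch only: `decide` is a hypothetical helper, and it treats a p-value exactly equal to α as a rejection, following the "less than or equal to" wording used earlier in the lecture.

```python
def decide(p_value, alpha=0.05):
    """Return the statistical conclusion for a given p-value and alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "reject H0"
    return "do not reject H0"

# p-values that appear in this lecture:
print(decide(0.0424))  # factories example -> reject H0
print(decide(0.454))   # chip-bag example  -> do not reject H0
```

The function returns only the statistical conclusion; cost and effect size still have to be weighed separately before any business decision.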
4. Conclusions and Decisions • The primary conclusion in a formal hypothesis test is only a statement stating whether we reject or fail to reject that hypothesis. • Your conclusion about the null hypothesis should never be the end of the process. • You can’t make a decision based solely on a P-value. Performing Hypothesis Tests
Business decisions should always take into consideration three things: • the statistical significance of the test, • the cost of the proposed action, and • the effect size of the statistic observed (How big a difference would matter?) Performing Hypothesis Tests
Is the mean profit of sales reps’ insurance policies less than $1500? • H0: μ = $1500 • HA: μ < $1500 • Using SPSS we obtain the following: • Sample mean = $1438.90 • Test statistic = -0.2517 • p-value = 0.4015 Is the difference between the sample mean and the hypothesised population mean significant?
Do males spend more a month on fast food than females? • H0: Average Expenditure (Males) = Average Expenditure (Females) • HA: Average Expenditure (Males) > Average Expenditure (Females) • Using SPSS we obtain the following: • Sample mean (males) = $43.80 • Sample mean (females) = $41.20 • Test statistic = -12.580 • p-value = 0.000 Is the difference between the two means significant? Is the effect size significant?
The rest of Module 1 is devoted to learning different hypothesis tests for different scenarios: • Testing a single sample mean • Difference between two independent sample means • Difference between two dependent sample means • Difference between more than two sample means • Relationships between categorical variables • Relationships between quantitative variables Next Week
In statistics, estimation refers to the process by which one makes inferences about a population, based on information obtained from a sample. • Sampling Distributions give us the foundation that allows us to take a sample and use it to estimate a population parameter. Estimation
A point estimate is a single value of a statistic • E.g. The sample mean is a point estimate for a population mean • An interval estimate • Defined by two values (the lower and upper confidence limits) • The population parameter is said to lie between these two values • It provides a confidence level for the estimate • These are called confidence intervals • E.g. The population mean is greater than a but less than b. • The width of the confidence interval is the distance between the lower and upper confidence limits; the point estimate lies between them
An interval gives a range of values: • Takes into consideration variation in sample statistics from sample to sample • Based on observations from 1 sample • Gives information about closeness to unknown population parameters • Stated in terms of level of confidence. (Can never be 100% confident) • The general formula for all confidence intervals is equal to: Point Estimate ± (Critical Value)(Standard Error)
Consider: Confidence level = 95% [(1 - α) = 0.95] • α is the proportion of the distribution in the two tail areas outside the confidence interval • For 95% confidence under the Normal model, the lower and upper confidence limits lie at Z = -1.96 and Z = 1.96 on either side of the point estimate
A relative frequency interpretation: • If all possible samples of size n are taken and their means and intervals are estimated, 95% of all the intervals will include the true value of the unknown parameter • A specific interval either will contain or will not contain the true parameter (the remaining 5% of intervals miss it)
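The relative frequency interpretation can be illustrated by simulation. The sketch below draws a finite number of random samples rather than "all possible samples", and every number in it (true mean 50, standard deviation 3, sample size 30, Normal population with known standard deviation) is a hypothetical choice made for the demonstration.

```python
import random
from statistics import NormalDist

random.seed(42)  # fixed seed so the run is repeatable

true_mean = 50.0   # hypothetical population mean (unknown in practice)
sigma = 3.0        # hypothetical population standard deviation
n = 30             # hypothetical sample size
confidence = 0.95

z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z_crit * sigma / n ** 0.5   # same margin for every sample

num_samples = 5_000
covered = 0
for _ in range(num_samples):
    sample_mean = sum(random.gauss(true_mean, sigma) for _ in range(n)) / n
    # Does this sample's interval contain the true mean?
    if sample_mean - margin <= true_mean <= sample_mean + margin:
        covered += 1

coverage = covered / num_samples
print(f"Proportion of intervals containing the true mean: {coverage:.3f}")
```

The printed proportion should be close to 0.95, even though any single interval simply either contains the true mean or does not.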
A confidence interval can be expressed as: Point Estimate ± (Critical Value)(Standard Error). The extent of that interval on either side of the point estimate is called the margin of error (ME), so ME = (Critical Value)(Standard Error). The general confidence interval can now be expressed in terms of the ME: Point Estimate ± ME. Margin of Error: Certainty Vs Precision
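As a worked illustration of the margin of error, the sketch below builds an interval as point estimate ± ME using the Normal model. The function name and the example numbers (a point estimate of 100 with a standard error of 2) are hypothetical and not taken from the lecture.

```python
from statistics import NormalDist

def confidence_interval(point_estimate, standard_error, confidence=0.95):
    """Return (lower limit, upper limit, margin of error) under the Normal model."""
    alpha = 1 - confidence                                # area left in the two tails
    critical_value = NormalDist().inv_cdf(1 - alpha / 2)  # z* for this level
    margin_of_error = critical_value * standard_error     # ME = z* x SE
    lower = point_estimate - margin_of_error
    upper = point_estimate + margin_of_error
    return lower, upper, margin_of_error

# Hypothetical example: point estimate 100, standard error 2.
lower, upper, me = confidence_interval(100.0, 2.0)
print(round(lower, 2), round(upper, 2), round(me, 2))  # 96.08 103.92 3.92

# Asking for more confidence widens the interval (less precision).
_, _, me_99 = confidence_interval(100.0, 2.0, confidence=0.99)
print(me_99 > me)  # True
```

The second call shows the trade-off between certainty and precision: the 99% interval has a larger margin of error than the 95% interval built from the same data.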
The more confident we want to be, the larger the margin of error must be. Every confidence interval is a balance between certainty and precision. Which wall would you be more confident hitting?
To change the confidence level, we’ll need to change the number of SEs to correspond to the new level. For any confidence level the number of SEs we must stretch out on either side of the point estimate is called the critical value. E.g. Critical values for the normal model at 95% confidence: -1.96 and 1.96
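Critical values for other confidence levels can be looked up in the same way. This is a minimal sketch for the Normal model only; `z_critical` is a hypothetical helper name, and the 90% and 99% levels are added here purely for comparison with the 95% value on the slide.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Number of SEs to stretch on either side of the point estimate."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_critical(level):.3f}")
# 90% confidence: z* = 1.645
# 95% confidence: z* = 1.960
# 99% confidence: z* = 2.576
```

Higher confidence levels require more standard errors on each side, which is what makes the interval wider.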
A hypothesis is an unproven statement or proposition about a factor or phenomenon that is of interest. It is tested using the scientific method of falsification. • The p-value is the probability of obtaining the value observed (or one more extreme) if the null hypothesis were correct. We usually reject the null hypothesis where P-value < 0.05. • A Type I error is when we reject the null hypothesis and it is actually correct. • A Type II error is when we fail to reject the null hypothesis even though the null is incorrect. • Estimation can be used to make inferences about a population based on information obtained from a sample. Summary