STATISTICS Advanced Higher Chi-squared test
Finding if there is a significant association between sets of data.
Lesson Objectives
1. Explain why it is used.
2. List the advantages and disadvantages.
3. Understand how to apply the statistical test.
4. Apply it to a relevant context.
The situation
A group of students has visited the Lake District National Park to investigate the impact of tourism on the landscape. One of their data collection techniques is to record the number of traditional-looking and modern-looking houses in 8 villages inside the National Park boundary and 8 villages outside the boundary line...
Chi-squared test: looking for a difference
• What should they do?
• What data should they have collected to complete this investigation?
• How much data should they collect?
• How can they make sure that the data is reliable?
• What data representation skill could they use to gain an initial impression?
• What statistical test should they use to state confidently whether there is or is not a relationship?
• What did you observe? (What data did you actually collect?)
• What would you expect if there was no association?
O = the Observed frequency (what you actually counted)
E = the Expected frequency (what you would expect if there was no association)
$\chi^2 = \sum \frac{(O - E)^2}{E}$
Chi-squared test: looking for a difference
Modern houses: we found 103 modern homes inside of the National Park and 452 outside.
Traditional houses: we found 180 traditional homes inside of the National Park and 23 outside.
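As a minimal sketch, the counts above can be arranged in a contingency table with row, column and grand totals. The variable names (`observed`, `row_totals`, `column_totals`, `grand_total`) are illustrative choices, not part of the lesson.

```python
# Observed frequencies (O) from the survey, as stated on the slide.
# Rows are house types; columns are locations relative to the Park.
observed = {
    "modern": {"inside": 103, "outside": 452},
    "traditional": {"inside": 180, "outside": 23},
}

# Row totals: all houses of each type, wherever they were found.
row_totals = {house: sum(counts.values()) for house, counts in observed.items()}

# Column totals: all houses at each location, whatever their type.
locations = ["inside", "outside"]
column_totals = {
    loc: sum(observed[house][loc] for house in observed) for loc in locations
}

# Grand total: every house counted in the survey.
grand_total = sum(row_totals.values())

print(row_totals)     # {'modern': 555, 'traditional': 203}
print(column_totals)  # {'inside': 283, 'outside': 475}
print(grand_total)    # 758
```

These totals are the only extra numbers needed to work out what would be expected if there were no association.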
TESTING THE RELATIONSHIP
Null Hypothesis: There is no significant difference between building ages inside and outside of the National Park.
Alternative Hypothesis: There is a significant difference between building ages inside and outside of the National Park.
1st: construct a table with the data that you have observed.
2nd: work out the expected frequency for each cell of the table.
Expected Frequency = (row total × column total) / grand total
$\chi^2 = \sum \frac{(O - E)^2}{E}$
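The two steps above can be sketched in a few lines of Python. This is an illustration under the assumption that the table is stored as a list of rows; the function names `expected_frequencies` and `chi_squared` are made up for this example and do not come from any library.

```python
# Observed counts from the survey.
# Rows: modern, traditional. Columns: inside the Park, outside the Park.
observed = [
    [103, 452],
    [180, 23],
]

def expected_frequencies(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [
        [r * c / grand_total for c in column_totals]
        for r in row_totals
    ]

def chi_squared(table):
    """Sum of (O - E)^2 / E over every cell of the table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

# Show the expected table and the test statistic, rounded for reading.
for row in expected_frequencies(observed):
    print([round(e, 1) for e in row])
print(round(chi_squared(observed), 1))
```

For the survey counts, the expected frequencies come out at roughly 207.2 and 347.8 for modern houses and 75.8 and 127.2 for traditional houses, giving a statistic of roughly 312.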
TESTING THE RELATIONSHIP
Null Hypothesis: There is no significant difference between building ages inside and outside of the National Park.
Alternative Hypothesis: There is a significant difference between building ages inside and outside of the National Park.
CALCULATE THE DEGREES OF FREEDOM: (Number of Rows – 1) × (Number of Columns – 1)
For a table with 2 rows and 2 columns this gives (2 – 1) × (2 – 1) = 1 degree of freedom.
χ² value of ____ is higher than 3.84 (critical value at the 0.05 level) and 6.64 (critical value at the 0.01 level) so…
FINAL STATEMENT
• IF χ² IS HIGHER THAN OR EQUAL TO THE CRITICAL VALUE, REJECT THE NULL HYPOTHESIS AND ACCEPT THE ALTERNATIVE.
• As χ² is (greater than / less than) the Critical Value I can (accept / reject) the Null Hypothesis and (accept / reject) the Alternative Hypothesis.
• Therefore I can state that there (is no / is a) significant association…
• …to a significance level of 0.05 (if there were no association, a χ² value this large would occur by chance less than 5% of the time).
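The degrees-of-freedom formula and the decision rule can be sketched as follows. The critical values 3.84 and 6.64 are the ones quoted on the slide; pairing them with the 0.05 and 0.01 levels for 1 degree of freedom follows the standard chi-squared table. The function names and the example statistic of 5.0 are hypothetical, chosen only to show both outcomes.

```python
# Critical values for 1 degree of freedom, keyed by significance level.
# Other degrees of freedom would need a fuller table (not included here).
CRITICAL_VALUES_DF1 = {0.05: 3.84, 0.01: 6.64}

def degrees_of_freedom(n_rows, n_columns):
    """(number of rows - 1) * (number of columns - 1)."""
    return (n_rows - 1) * (n_columns - 1)

def reject_null(chi2, critical_value):
    """Decision rule from the slide: reject H0 if chi2 >= critical value."""
    return chi2 >= critical_value

df = degrees_of_freedom(2, 2)
print(df)  # 1

# Hypothetical test statistic, used only to illustrate the rule.
example_chi2 = 5.0
for level, critical in CRITICAL_VALUES_DF1.items():
    decision = "reject" if reject_null(example_chi2, critical) else "do not reject"
    print(f"level {level}: {decision} the null hypothesis")
```

With the hypothetical value of 5.0 the null hypothesis is rejected at the 0.05 level but not at the 0.01 level, which shows why the chosen significance level should be stated in the final statement.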
There is a significant association between housing age and location inside or outside of the Lake District National Park.
State the answer in terms of the alternative hypothesis.
Reasons to use it
• It allows you to identify if there is a difference or a relationship between two characteristics.
• It is simple to carry out.
• It compares the data that you have observed with what you would expect to happen.
Disadvantages of using it
• The data must be in the form of frequencies.
• The frequency of the data must have a precise numerical value and be able to be organised into categories or groups.
• The total number of observations must be more than 20.
• The expected frequency in any one cell of the table must be at least 5.
Justify the suitability of using the χ² test.
Referring to a National Park that you have studied, comment on the results shown in this test.
• Sometimes buildings are built recently but designed to look old.
• The survey may have counted unused farm buildings as traditional even though they are not necessarily used as homes.
• It is uncertain how the survey determined what was modern or traditional.
• The survey indicates that villages inside of the Park are smaller.
• Perhaps village size is static and new buildings aren’t being built.
• You compare the observed data with the data that you would expect.
• Looking for a difference between O & E.
• If the difference is large enough to be significant, then there is an association!
Reason to use this test:
• If you have categorical data (e.g. blue eyes).
• Means are not a category. Colours, for example, are.
Must have:
• More than one category
• A minimum of 5 in each one
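The conditions for using the test can be checked before any calculation. The sketch below is one possible reading of them: it assumes "more than one category" means at least two rows and two columns, that the total must exceed 20, and that the minimum of 5 applies to each expected frequency. The function name `check_suitability` and the small example table are hypothetical.

```python
def check_suitability(table, min_total=20, min_expected=5):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    row_totals = [sum(row) for row in table]
    column_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # More than one category is needed in each direction of the table.
    if len(row_totals) < 2 or len(column_totals) < 2:
        problems.append("need more than one category in each direction")

    # The total number of observations must be more than min_total.
    if grand_total <= min_total:
        problems.append(f"total of {grand_total} is not more than {min_total}")
    if grand_total == 0:
        return problems  # expected frequencies cannot be computed

    # Every expected frequency must reach the minimum.
    for r in row_totals:
        for c in column_totals:
            expected = r * c / grand_total
            if expected < min_expected:
                problems.append(
                    f"expected frequency {expected:.1f} is below {min_expected}"
                )
    return problems

# Survey counts: rows = modern, traditional; columns = inside, outside.
print(check_suitability([[103, 452], [180, 23]]))  # []

# Hypothetical small sample that fails the checks.
for problem in check_suitability([[3, 2], [1, 4]]):
    print(problem)
```

The survey data passes every check, while the small hypothetical table fails both on its total of 10 and on all four expected frequencies.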