
EDUCATIONAL STATISTICS EDU5950 SEM1 2015-16


Presentation Transcript


  1. EDUCATIONAL STATISTICS EDU5950 SEM1 2015-16. INFERENTIAL STATISTICS: HYPOTHESIS TESTING FOR CHI-SQUARE ANALYSIS. Rohani Ahmad Tarmizi - EDU5950

  2. Chi-Square (χ²)

  3. CHI-SQUARE ANALYSIS • This is also an analysis of relationship, but it is better known as an analysis of association. • It is used to determine the association between a pair of variables measured on a nominal or ordinal scale, or when one of them is paired with interval or ratio data. • Suitable variables therefore include: • Race, • Gender, • Liking/disliking a food, • High achievement/low achievement, • High anxiety/moderate anxiety/low anxiety. • Frequency data are obtained by counting the occurrences of each event, which suits survey research. • From the observed frequencies, chi-square analysis tells us whether or not there is an association between the two variables.

  4. CHI-SQUARE ANALYSIS • SUPPOSE a researcher collects information on the race of each respondent and on each respondent's category of dietary practice, • OR a researcher surveys students in several schools on gender and interest/no interest in the science stream, • OR a researcher surveys fathers, collects their level of education (high/moderate/low) and relates it to salary category. • For all three examples the appropriate analysis is a nonparametric analysis (chi-square analysis), • and a contingency table or cross-tabulation is then constructed. • From the observed frequencies, chi-square analysis tells us whether or not there is an association between the two variables.

  5. CHI-SQUARE ANALYSIS • There are two types: the CHI-SQUARE TEST OF GOODNESS OF FIT and the TEST OF INDEPENDENCE/DEPENDENCE. • TEST OF GOODNESS OF FIT: answers the question "Is there a difference in the rates of some item/event/agreement?" • TEST OF INDEPENDENCE/DEPENDENCE: answers the question "Is there an association/dependence/relationship between two things?"

  6. CHI-SQUARE ANALYSIS • The results of this analysis are usually presented as a frequency table called a contingency table or cross-tabulation. • From the observed frequencies, chi-square analysis tells us whether or not there is a significant association between the two variables studied, • or whether or not there is a significant difference in frequencies among the categories studied.

  7. From the table we can examine whether there is a relationship or association between the two variables. • Next, a hypothesis test must be carried out to test whether the association between the two variables is significant. • This hypothesis test is the chi-square test. • If there is a significant association, the next step is to determine the degree or magnitude of the relationship.

  8. For this analysis the data are in the form of frequencies, so the distribution of scores is not normal. • The test is therefore called distribution-free. • It is also called a nonparametric test because it does not rely on normally distributed data. • As a rule of thumb, parametric tests are preferred because of their power; however, if the data are nominal or the distribution of the data is not normal, a nonparametric test is accepted. • Nonparametric tests: sign test, Mann-Whitney U test, Wilcoxon matched-pairs signed-ranks test, Kruskal-Wallis test, chi-square test.

  9. Different Scales, Different Measures of Association

  10. Types of Chi-Square Test 1. Goodness-of-fit: tests an assumption about the distribution of one categorical variable. 2. Test of independence: tests the association between two variables arranged in a contingency table.

  11. The Chi-Square Distribution ♠ The chi-square distribution has only one parameter, the degrees of freedom. ♠ The distribution curve is skewed to the right for small degrees of freedom and becomes nearly symmetric for large degrees of freedom. ♠ The entire chi-square distribution lies to the right of the y-axis. ♠ The chi-square distribution takes only nonnegative values.
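
The properties listed on this slide can be checked numerically. The sketch below is illustrative only: the function names `chi2_pdf` and `chi2_skewness` are assumptions introduced here, and the density and skewness expressions are the standard textbook ones rather than anything given in the slides.

```python
import math


def chi2_pdf(x: float, df: int) -> float:
    """Density of the chi-square distribution with `df` degrees of freedom."""
    if x <= 0:
        # No probability lies below zero; the boundary x = 0 is
        # treated as 0 here purely for simplicity.
        return 0.0
    half = df / 2
    return x ** (half - 1) * math.exp(-x / 2) / (2 ** half * math.gamma(half))


def chi2_skewness(df: int) -> float:
    """Skewness of the chi-square distribution, sqrt(8 / df)."""
    return math.sqrt(8 / df)


# The three df values shown on the next slide's figure.
for df in (2, 7, 12):
    print(df, round(chi2_skewness(df), 3))
```

The skewness shrinks as df grows, which is the "skewed for small df, nearly symmetric for large df" behaviour described above.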

  12. The Chi-Square Distribution [Figure: chi-square density curves for df = 2, df = 7 and df = 12, plotted over x = 0 to 21]

  13. χ² Goodness-of-fit

  14. Steps in Test of Hypothesis: 1. State the null and alternative hypotheses. 2. Determine the appropriate sampling distribution and the critical value. 3. Calculate the test statistic, based on O (observed frequency) and E (expected frequency): $\chi^2 = \sum \frac{(O - E)^2}{E}$. 4. Make a decision. 5. Conclusion.
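
Step 3 can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the function name `chi_square_statistic` and the coin-toss counts are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 coin tosses giving 60 heads and 40 tails,
# tested against the expectation of 50 heads and 50 tails.
stat = chi_square_statistic([60, 40], [50, 50])
print(stat)  # 4.0, since (10^2 / 50) + (10^2 / 50) = 2 + 2
```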

  15. 1. Goodness-of-fit ♠ Tests an assumption about a categorical variable ♠ Only one variable ♠ Calculation based on: O (observed frequency) and E (expected frequency) ♠ Formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$

  16. Hypothesis Test: 1. State HO and HA → 2. Determine the sampling distribution and critical value → 3. Calculate χ², based on O and E: $\chi^2 = \sum \frac{(O - E)^2}{E}$ → 4. Decision → 5. Conclusion

  17. Steps in Testing a Hypothesis: 1. State the null and alternative hypotheses (HO: statement of the assumption; HA: statement opposite to the assumption). 2. Calculate the test statistic, the χ² value. 3. Determine the critical value from α and df = k - 1. 4. Make your decision. 5. Draw a conclusion.

      Manual criteria            Decision
      χ² cal > χ² critical       Reject HO, accept HA
      χ² cal ≤ χ² critical       Fail to reject HO

      SPSS criteria              Decision
      Sig. of χ² < α             Reject HO, accept HA
      Sig. of χ² ≥ α             Fail to reject HO
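
The two decision criteria on this slide are equivalent ways of reaching the same verdict. A small sketch, assuming hypothetical function names `decide_by_critical_value` and `decide_by_p_value`, and using numbers that appear in the worked examples later in this transcript:

```python
def decide_by_critical_value(chi2_calculated: float, chi2_critical: float) -> str:
    """Manual criterion: compare the calculated statistic with the tabled value."""
    return "Reject H0" if chi2_calculated > chi2_critical else "Fail to reject H0"


def decide_by_p_value(p_value: float, alpha: float) -> str:
    """SPSS criterion: compare the reported significance with alpha."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


# Example 1: chi-square 16.5 against critical 9.49, Sig. 0.002 at alpha 0.05.
print(decide_by_critical_value(16.5, 9.49), "|", decide_by_p_value(0.002, 0.05))
# Example 2: chi-square 5.911 against critical 9.21, Sig. 0.052 at alpha 0.01.
print(decide_by_critical_value(5.911, 9.21), "|", decide_by_p_value(0.052, 0.01))
```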

  18. Example 1: The following table displays the age distribution for a sample of people summoned for traffic violations. Test the hypothesis that the proportion of people summoned for traffic violations differs across age groups at the 0.05 level of significance.

      Age        <20   20-29   30-39   40-49   >49
      Summons     32      25      19      16     8

      Assumption: p1 = p2 = p3 = p4 = p5 = 0.20

  19. Answer: 1. Hypotheses. HO: The proportion of people involved in traffic violations is the same for all age groups. HA: The proportion of people involved in traffic violations is not the same for all age groups. 2. Test statistic:

      Age       O     E    (O - E)   (O - E)²   (O - E)²/E
      <20      32    20       12        144        7.20
      20-29    25    20        5         25        1.25
      30-39    19    20       -1          1        0.05
      40-49    16    20       -4         16        0.80
      >49       8    20      -12        144        7.20
      Total   100                                 16.50

  20. 3. Critical value: df = k - 1 = 5 - 1 = 4; at α = 0.05, χ²(4, 0.05) = 9.49. Values above 9.49 fall in the "reject HO" region; values at or below it fall in the "fail to reject HO" region. 4. Decision: Since χ² cal (16.5) is greater than χ² critical (9.49), reject HO and accept HA. 5. Conclusion: The proportion of people summoned for traffic violations differs significantly across age groups, χ²(4, n = 100) = 16.5, p < .05. People in different age groups differ significantly in the frequency of traffic violations, and the observed counts indicate that the younger age groups committed more violations.

  21. SPSS Chi-Square output

      Age group involved in traffic offences
                Observed N   Expected N   Residual
      <20           32          20.0        12.0
      20-29         25          20.0         5.0
      30-39         19          20.0        -1.0
      40-49         16          20.0        -4.0
      >49            8          20.0       -12.0
      Total        100

      Test Statistics (Age group involved in traffic offences)
      Chi-Square     16.500
      df              4
      Asymp. Sig.     0.002

      χ² is valid if fewer than 20% of the cells have expected values < 5.
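
The figures in Example 1 can be reproduced without SPSS. The sketch below uses only the Python standard library; the helper names are assumptions, and the tail probability relies on the closed form for an even number of degrees of freedom, P(X > x) = e^(-x/2) · Σ (x/2)^i / i! for i = 0 … df/2 - 1, which is a standard result rather than something stated in the slides.

```python
import math


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_upper_tail_even_df(x: float, df: int) -> float:
    """P(X > x) for a chi-square variable; valid only when df is even."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half_x = x / 2
    terms = (half_x ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half_x) * sum(terms)


observed = [32, 25, 19, 16, 8]          # summons per age group
n = sum(observed)                        # 100 people in total
expected = [n * 0.20] * len(observed)    # equal proportions under H0
stat = chi_square_statistic(observed, expected)
df = len(observed) - 1
p_value = chi2_upper_tail_even_df(stat, df)
print(round(stat, 3), df, round(p_value, 3))
```

The rounded values agree with the SPSS table above (chi-square 16.500, df 4, Asymp. Sig. 0.002).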

  22. Example 2: In a 2008 poll, adults were asked, "Do you agree with the move to increase the highway speed limit to 120 km/hour?" Results revealed that 58% said yes, 31% said no and 11% said they did not know. Suppose these results hold true for 2008. A recent poll produced the following distribution in response to the same question. Test the hypothesis that the current distribution of adults across the three categories is different from that for 2008 at the 0.01 level of significance.

      Table 2
      Category        Yes     No     Do not know
      Frequency       313     146    41
      Prob. (2008)    0.58    0.31   0.11

      Variable: agreement with the move to increase the highway speed limit to 120 km/hour

  23. Answer: 1. Hypotheses. HO: The current percentage distribution of adults across the three categories is the same as that for 2008. HA: The current percentage distribution of adults across the three categories is different from that for 2008. 2. Test statistic:

      Opinion         O      E    (O - E)   (O - E)²   (O - E)²/E
      Yes           313    290       23        529       1.824
      No            146    155       -9         81       0.523
      Do not know    41     55      -14        196       3.564
      Total         500                                  5.911

  24. 3. Critical value: df = k - 1 = 3 - 1 = 2; at α = 0.01, χ²(2, 0.01) = 9.21. Values above 9.21 fall in the "reject HO" region; values at or below it fall in the "fail to reject HO" region. 4. Decision: Since χ² cal (5.911) is smaller than χ² critical (9.21), fail to reject HO. 5. Conclusion: At the 0.01 level of significance, the current percentage distribution of adults across the three categories is not significantly different from that of 2008, χ²(2, n = 500) = 5.911, p > .01. The agreement of adults in the current survey is therefore similar to that in 2008.

  25. SPSS Chi-Square output

      Perception
                   Observed N   Expected N   Residual
      Agree           313         290.0        23.0
      Disagree        146         155.0        -9.0
      Don't know       41          55.0       -14.0
      Total           500

      Test Statistics (Perception)
      Chi-Square      5.910 (a)
      df              2
      Asymp. Sig.     0.052

      a. 0 cells (0.0%) have expected frequencies less than 5. The minimum expected cell frequency is 55.0.
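
Example 2 differs from Example 1 in that the expected proportions are unequal. A minimal standard-library sketch follows; the variable names are assumptions, and the p-value uses the fact that for df = 2 the chi-square upper-tail probability is exactly e^(-x/2).

```python
import math

observed = [313, 146, 41]             # yes, no, do not know (recent poll)
proportions_2008 = [0.58, 0.31, 0.11]  # distribution assumed under H0
n = sum(observed)

# Expected frequency for each category is n times its 2008 proportion.
expected = [n * p for p in proportions_2008]
stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Exact upper-tail probability for df = 2 only.
p_value = math.exp(-stat / 2)

alpha = 0.01
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"
print(round(stat, 3), df, round(p_value, 3), decision)
```

Summing the unrounded terms gives about 5.910, as in the SPSS output; the hand calculation's 5.911 comes from adding terms already rounded to three decimals.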

  26. χ² Test of Independence

  27. Assumptions • Chi Square test of independence-dependence is used when two variables are measured on a nominal scale. • Chi-square goodness-of-fit is used for test of differences when you have only one variable. • It can be applied to interval or ratio data that have been categorized into a small number of groups. • It assumes that the observations are randomly sampled from the population.

  28. Assumptions • All observations are independent (an individual can appear only once in a table and there are no overlapping categories). • It does not make any assumptions about the shape of the distribution nor about the homogeneity of variances

  29. Steps in Test of Hypothesis: 1. State the null and alternative hypotheses. 2. Determine the appropriate sampling distribution and the critical value. 3. Calculate the test statistic, based on O (observed frequency) and E (expected frequency): $\chi^2 = \sum \frac{(O - E)^2}{E}$. 4. Make a decision. 5. Conclusion.

  30. The Hypothesis: Whether There Is an Association or Not • Ho: The two variables are independent • Ha: The two variables are associated or dependent

  31. Calculating Test Statistics • The test contrasts the observed frequency in each cell of a contingency table with the expected frequency. • The expected frequencies represent the number of cases that would be found in each cell if the null hypothesis were true (i.e. the nominal variables are unrelated). • The expected frequency for a cell, when the two variables are unrelated, is the product of its row frequency and column frequency divided by the total number of cases: $E = \frac{F_r F_c}{N}$

  32. Calculating Test Statistics Continued

  33. Calculating Test Statistics [Figure labels: observed frequencies; expected frequency]

  34. Determine Degrees of Freedom: df = (R - 1)(C - 1), where R is the number of levels in the row variable and C is the number of levels in the column variable.
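
The expected-frequency rule E = Fr × Fc / N and the degrees-of-freedom rule df = (R - 1)(C - 1) can be combined into one short routine. The sketch below is an illustration under stated assumptions: the function names are invented here, and the 2 × 2 table is hypothetical data, not the voter table from the slides (whose cell counts are not recoverable from this transcript).

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    expected = expected_frequencies(table)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical observed counts: 2 rows (groups) by 2 columns (responses).
table = [[30, 10],
         [20, 40]]
print(expected_frequencies(table))   # [[20.0, 20.0], [30.0, 30.0]]
stat, df = chi_square_independence(table)
print(round(stat, 3), df)
```

The statistic is then compared with the tabled critical value for that df, exactly as in the goodness-of-fit case.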

  35. Compare computed test statistic against a tabled/critical value • The computed value of the Pearson chi-square statistic is compared with the critical value to determine whether the computed value is improbable under the null hypothesis. • The critical tabled values are based on the sampling distribution of the Pearson chi-square statistic. • If the calculated χ² is greater than the tabled χ² value, reject Ho.

  36. Example • Suppose a researcher is interested in voting preferences on environmental control issues. • A questionnaire was developed and sent to a random sample of 90 voters. • The researcher also collects information about the political party membership of the sample of 90 respondents.

  37. Bivariate Frequency Table or Contingency Table

  38. Bivariate Frequency Table or Contingency Table Observed frequencies

  39. Row frequency Bivariate Frequency Table or Contingency Table

  40. Bivariate Frequency Table or Contingency Table Column frequency

  41. Determine The Hypothesis • Party membership (2 levels), nominal • Preference (3 levels), nominal • Ho: There is no difference between Barisan (B) and Pakatan (P) voters in their opinions on the environmental control issue. • Ha: There is a difference between Barisan (B) and Pakatan (P) voters in their opinions on the environmental control issue. • Ho: There is no association between responses to the environmental survey and party membership in the population. • Ha: There is an association between responses to the environmental survey and party membership in the population.

  42. Calculating Test Statistics

  43. Calculating Test Statistics (continued): E = 50 × 25 / 90 ≈ 13.89

  44. Calculating Test Statistics (continued): E = 40 × 25 / 90 ≈ 11.11

  45. Calculating Test Statistics (continued): χ² = 11.03

  46. Determine Degrees of Freedom df = (R-1)(C-1) =(2-1)(3-1) = 2

  47. 3. Critical value: df = 2; at α = 0.05, χ²(2, 0.05) = 5.99. Values above 5.99 fall in the "reject HO" region; values at or below it fall in the "fail to reject HO" region. 4. Decision: Since χ² cal (11.03) is greater than χ² critical (5.99), reject HO. 5. Conclusion: There is a significant association between responses to the environmental survey and party membership (Barisan or Pakatan) in the population, χ²(2, n = 90) = 11.03, p < .05. OR: There is a significant difference between Barisan and Pakatan voters in their opinions on the environmental control issue, χ²(2, n = 90) = 11.03, p < .05.

  48. Compare computed test statistic against a tabled/critical value • α = 0.05 • df = 2 • Critical tabled value = 5.991 • Test statistic, 11.03, exceeds the critical value • Null hypothesis is rejected • Barisan and Pakatan voters differ significantly in their opinions on the environmental control issue

  49. Phi Coefficient • Pearson Chi-Square provides information about the existence of relationship between 2 nominal variables, but not about the magnitude of the relationship • Phi coefficient is the measure of the strength of the association

  50. Cramer's V • When the table is larger than 2 by 2, a different index must be used to measure the strength of the relationship between the variables. One such index is Cramer's V. • If Cramer's V is large, there is a tendency for particular categories of the first variable to be associated with particular categories of the second variable. • Formula: $V = \sqrt{\frac{\chi^2}{N(k - 1)}}$, where N is the total number of cases and k is the smaller of the number of rows and the number of columns.
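
Both strength-of-association indices can be computed directly from a chi-square value. The sketch below assumes the standard definitions φ = √(χ² / N) for a 2 × 2 table and V = √(χ² / (N(k - 1))) with k the smaller of the row and column counts; the function names are hypothetical. The worked call uses the figures reported in the voter example (χ² = 11.03, N = 90, a 2 × 3 table).

```python
import math


def phi_coefficient(chi2: float, n: int) -> float:
    """Phi for a 2 x 2 table: sqrt(chi-square / N)."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramer's V: sqrt(chi-square / (N * (k - 1))), k = min(rows, cols)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (k - 1)))


# Voter example from the slides: 2 party levels by 3 preference levels.
v = cramers_v(11.03, 90, 2, 3)
print(round(v, 2))  # 0.35
```

When the smaller dimension is 2, k - 1 = 1 and Cramer's V reduces to the same expression as phi.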
