«A chi-square test showed that ...» – or did it really?

Bård Uri Jensen, http://privat.hihm.no/buj/, bard.jensen@hihm.no. "Allowing [statistical software] to do our thinking is a sure recipe for disaster." (Good & Hardin, 2012, p. xi). «Simple» statistical tests.

Presentation Transcript


  1. «A chi-square test showed that ...» – or did it really? Bård Uri Jensen http://privat.hihm.no/buj/ bard.jensen@hihm.no

  2. Allowing [statistical software] to do our thinking is a sure recipe for disaster. (Good & Hardin, 2012, p. xi)

  3. «Simple» statistical tests • chi-square (χ²) test • t-test

  4. Statistical hypothesis testing • Formulate a hypothesis • E.g. In L2 Norwegian, Vietnamese speakers make more TENSE errors than Somali speakers. • Formulate a null hypothesis • Vietnamese and Somali speakers have the same rate of TENSE errors. • «Disprove» the null hypothesis = demonstrate its unlikelihood • E.g. less than a 5% chance of observing data at least this extreme if the null hypothesis were true • = «Significance» • We choose α according to what we consider an acceptable risk of false conclusions • Often 5% in linguistic research

  5. Conditions of use • Independent observations • chi-square test • t-test • Parametric assumptions • t-test • The dangers of repeated testing • any test

  6. A simple example from ornithology

  7. A simple example from ornithology

  8. A simple example from ornithology

  9. A simple example from ornithology

  10. A simple example from corpus linguistics

  11. A simple example from corpus linguistics • The observations should be independent. • An important condition of use for • chi-squared test • t-test • The observations should be of different individuals. «Chi-square is a much-abused test in second language research studies, and often one of its assumptions (that of independence of data) is violated as a matter of course.» Larson-Hall (2010, p. 206)
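Why independence matters can be illustrated with a small simulation. The sketch below is not from the slides: the group sizes, the number of tokens per speaker and the Beta(2, 8) distribution of speaker error rates are all hypothetical choices. Both groups are drawn from the same population, so every "significant" pooled chi-square result is a false positive.

```python
import random

def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def pooled_false_positive_rate(n_sims=1000, speakers=10, tokens=50, seed=1):
    """Share of simulations in which a pooled chi-square test rejects a true null."""
    rng = random.Random(seed)
    critical = 3.841  # 5% critical value of chi-square with 1 degree of freedom
    hits = 0
    for _ in range(n_sims):
        table = []
        for _group in range(2):
            errors = 0
            for _speaker in range(speakers):
                # Hypothetical: each speaker has an individual error rate,
                # drawn from the SAME distribution in both groups.
                rate = rng.betavariate(2, 8)
                errors += sum(rng.random() < rate for _ in range(tokens))
            table.append((errors, speakers * tokens - errors))
        (a, b), (c, d) = table
        # Pooling all tokens treats 500 tokens per group as 500 independent units.
        if chi2_2x2(a, b, c, d) > critical:
            hits += 1
    return hits / n_sims

rate = pooled_false_positive_rate()
print(f"false positive rate with pooled tokens: {rate:.2f}")
```

Under these assumed settings the rejection rate should come out far above the nominal 5%, because tokens from the same speaker resemble each other more than tokens from different speakers.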

  12. Example 1: Chi-squared test, non-independent observations • Blom & Paradis 2013 • Journal of Speech, Language, and Hearing Research • On past tense production in L2 children with language impairment • 48 children with English as L2 • Overregularization of past tense • Hypothesis: Less common in verb stems ending in /d/ or /t/ • χ²(1) = 3.45, p (one-sided) = 0.032 • Problem: n = 85 + 140 observations, N = 48 children • Observations are not independent, so the result is invalid.
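The reported p-value can be checked from the test statistic alone. The following is a minimal sketch using only the figures quoted on the slide; the helper name chi2_sf_df1 is made up for illustration. It relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal variable.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2))

chi2_stat = 3.45                      # value quoted on the slide
p_two_sided = chi2_sf_df1(chi2_stat)  # the usual chi-square p-value
p_one_sided = p_two_sided / 2         # halved for a directional hypothesis

print(f"two-sided p = {p_two_sided:.3f}, one-sided p = {p_one_sided:.3f}")
```

The arithmetic only confirms that the reported numbers are internally consistent. It says nothing about the independence problem, which concerns what was counted as an observation in the first place.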

  13. Example 1: Chi-squared test, non-independent observations • Solution A: • Pick just one observation from each author/speaker • “To exclude the author as one more relevant factor, the database was cleaned so that there is only one example for each verb from any single author.” Sokolova 2012, p. 94

  14. Example 1: Chi-squared test, non-independent observations • Solution A: • Pick just one observation from each author/speaker • Sokolova 2012 • Solution B: • Calculate average values for each informant • Use the average values as independent observations • Test significance with an appropriate test, e.g. t-test or U-test • Gujord 2013 • Both these solutions might require a larger corpus! • «Solution» C: • Alter the research question • Danckaert 2011
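Solution B can be sketched as follows. All data here are invented for illustration (informant IDs, groups and counts are hypothetical), and the U statistic is computed by hand; in practice a statistics package such as scipy.stats.mannwhitneyu would also supply the p-value.

```python
# Hypothetical per-informant counts: informant -> (group, errors, tokens produced)
counts = {
    "a1": ("A", 4, 20), "a2": ("A", 9, 30), "a3": ("A", 5, 10), "a4": ("A", 6, 15),
    "b1": ("B", 1, 20), "b2": ("B", 3, 30), "b3": ("B", 3, 12), "b4": ("B", 3, 20),
}

# Step 1: one value per informant (the error rate), not one value per token.
rates = {"A": [], "B": []}
for group, errors, total in counts.values():
    rates[group].append(errors / total)

def mann_whitney_u(xs, ys):
    """U statistic for xs: number of (x, y) pairs with x > y; ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)

# Step 2: test on the informant-level values; the sample size is now 4 + 4.
u = mann_whitney_u(rates["A"], rates["B"])
print(f"n = {len(rates['A'])} + {len(rates['B'])} informants, U = {u}")
```

The sample size entering the test is the number of informants rather than the number of tokens, which is why this approach may call for a larger corpus.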

  15. Example 1: Chi-squared test, non-independent observations • Solution B:

  16. Example 2: T-test, non-independent observations • Klavan 2012 • PhD thesis from Tartu University • Investigation of adposition ‘peal’ and adessive case • 450 observations of each, from 2 corpora • t = 8.02, p < 0.001 • Conclusion: adessive phrases are longer than ‘peal’-phrases • Problem: Observations are not independent. • The conclusion is invalid.

  17.

  18. Example 3: T-test, non-normal populations • Hunter (2011, p. 48) • PhD thesis from Birmingham University • On grammaticality judgements by L2 students • Conclusion: • the accuracy (max. = 1) for the teacher group (M = .98, SD = .14) was significantly higher than the student group (M = .64, SD = .49), t(1) = 4.9, p < .001. • Problem: • Mean = 0.98, Maximum value = 1 • Standard deviation = 0.14 • The distribution cannot possibly be normal. • The result is invalid.
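A quick calculation shows why these summary figures rule out normality. The sketch below uses only the numbers quoted on the slide. The comparison with 0/1-coded data is an assumption added here, not something the slide states.

```python
import math
from statistics import NormalDist

mean, sd, maximum = 0.98, 0.14, 1.0   # teacher group, as quoted on the slide

# If accuracy were normal with this mean and SD, this share of values
# would lie above the highest possible score.
share_above_max = 1 - NormalDist(mean, sd).cdf(maximum)
print(f"share of a normal distribution above the maximum: {share_above_max:.2f}")

# Assumption for comparison: if each judgement were coded 0 or 1,
# the standard deviation would be sqrt(p * (1 - p)).
binary_sd_teachers = math.sqrt(0.98 * (1 - 0.98))
binary_sd_students = math.sqrt(0.64 * (1 - 0.64))
print(f"SD of 0/1 data: {binary_sd_teachers:.2f} and {binary_sd_students:.2f}")
```

A large share of a normal distribution with these parameters would fall on impossible scores. The quoted standard deviations are also close to what plain 0/1 data would produce, which would be consistent with strongly non-normal, binary observations.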

  19.

  20. Example 4: Repeated testing • Leedham 2011 • PhD thesis, The Open University • Features in the writing of Chinese students in UK universities • Conclusion: • There are differences in frequencies of certain phrases between 3rd-year students and younger students • Problem: • Repeated testing without adjusting the probability values • Some of the results are not valid.
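The effect of repeated testing, and one way of adjusting for it, can be sketched as follows. The slides do not name a particular adjustment, so the Holm-Bonferroni step-down method is used here as one common choice, and the p-values are invented for illustration.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive in k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on;
        # the running maximum keeps the adjusted values monotone.
        running_max = max(running_max, (m - rank) * p_values[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted

print(f"20 tests at 5%: {familywise_error_rate(0.05, 20):.2f} chance of a false positive")

raw = [0.004, 0.030, 0.020, 0.045, 0.300]   # hypothetical unadjusted p-values
adj = holm_adjust(raw)
for p, q in zip(raw, adj):
    print(f"raw p = {p:.3f}  adjusted p = {q:.3f}")
```

In this invented example, four of the five raw values fall below 0.05, but only one does after adjustment.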

  21. CV CV

  22. Moral: There are no simple tests. • You should understand the conditions of the test. • You should take the conditions into account. • You should document properly • how you perform the test, • what numbers you put into it, • how the conditions are met. «A chi-square test showed that the difference is significant.» – or did it really?

  23. Is it really that important? • «[C]ompared to other social sciences (e.g., psychology, communication, sociology, anthropology, …) or branches of linguistics (e.g., psycholinguistics, phonetics, sociolinguistics…), most of corpus linguistics has paradoxically only begun to develop this methodological awareness.» Gries (forthcoming, p. 1)

  24. Is it really that important? • «It has become increasingly apparent over a period of several years that psychologists, taken in the aggregate, employ the chi-square test incorrectly.» Lewis and Burke (1949)

  25. Whose responsibility is it?

  26. «Corpus linguistics needs to ‘catch up’ [...]» Gries (forthcoming, p. 1)

  27. References (http://privat.hihm.no/buj) • Boneau, A. C. (1960). The effects of violations of assumptions underlying the t test. Psychological Bulletin, 57(1), 49-64. • Good, P. I., & Hardin, J. W. (2012). Common errors in statistics (and how to avoid them). Hoboken: John Wiley. • Gries, S. (forthcoming). Quantitative designs and statistical techniques. http://www.linguistics.ucsb.edu/faculty/stgries/research/InProgr_STG_QuantDesAndMethCorpLing_CUPHb.pdf • Larson-Hall, J. (2010). A Guide to Doing Statistics in Second Language Research Using SPSS. New York: Routledge. • Lewis, D., & Burke, C. J. (1949). The use and misuse of the chi-square test. Psychological Bulletin, 46(6), 433-489. • Blom & Paradis (2013). Past Tense Production by English Second Language Learners With and Without Language Impairment. Journal of Speech, Language, and Hearing Research, 56, 281-294. • Danckaert, L. (2011). On the left periphery of Latin embedded clauses. Ph.D. thesis. University of Gent. • Gujord, A. H. (2013). Grammatical encoding of past time in L2 Norwegian: The roles of L1 influence and verb semantics. Ph.D. thesis. University of Bergen. • Hunter, J. D. (2011). A multi-method investigation of the effectiveness and utility of delayed corrective feedback in second-language oral production. Ph.D. thesis. University of Birmingham. • Klavan, J. (2012). Evidence in linguistics: corpus-linguistic and experimental methods for studying grammatical synonymy. Ph.D. thesis. University of Tartu. • Leedham, M. (2011). A corpus-driven study of features of Chinese students’ undergraduate writing in UK universities. Ph.D. thesis. The Open University. • Sokolova, S. (2012). Asymmetries in Linguistic Construal: Russian Prefixes and the Locative Alternation. Ph.D. thesis. University of Tromsø.
