Data Collection and Analysis in Sociolinguistics

Enrico Giai, BA in Translating and Interpreting, MA Student in Translation Studies, Turin University. Email: enrico.giai@gmail.com. Data Collection and Analysis in Sociolinguistics. Practical elements for research methods in sociolinguistics. Turin, 07-08 April 2014.

Presentation Transcript


  1. Enrico Giai, BA in Translating and Interpreting, MA Student in Translation Studies, Turin University. Email: enrico.giai@gmail.com. Data Collection and Analysis in Sociolinguistics. Practical elements for research methods in sociolinguistics. Turin, 07-08 April 2014

  2. Tuesday, April 8th: Main topics • Inferential statistics • Variables • Hypothesis • Null hypothesis • Likelihood • Chi square test • ANOVA • Rbrul for inferential and multivariate statistics

  3. Inferential Statistics – Variables • Two types of variables • Dependent • Independent • The independent variable(s) affect the dependent variable in some predictable way • Another classification (for questions): • Category-type variables (usually dependent variables) • Ordinal-type variables • Continuous-type variables (usually independent variables)

  4. Inferential Statistics – Experimental and Null Hypothesis • Experimental hypothesis • The hypothesis according to which a certain variable is affected in a predictable & systematic way by some other variable • Must be tested • Null hypothesis: the opposite of the experimental hypothesis, i.e. the variable is not affected in any systematic way by the other variable

  5. Inferential Statistics – Likelihood and Statistical Significance • Likelihood, or statistical significance • The probability (p-value) of obtaining results like those observed if the null hypothesis were true • Expressed as a percentage • By convention in the humanities and social sciences, 5% (p = 0.05) is taken as the cut-off point. If p > 0.05, we cannot reject the null hypothesis; if p ≤ 0.05, we reject the null hypothesis (Levon 2010:71)

  6. Inferential Statistics – Chi Square Test (1) • Used with 2 category-type questions • The test compares the observed frequencies with the expected ones, in order to establish whether the null hypothesis can be rejected • How to • Calculate the observed frequencies • Calculate the expected frequencies • Calculate the chi-squared values • Sum the chi-squared values up • Calculate the degrees of freedom • If the resulting p-value is higher than 0.05, the null hypothesis cannot be rejected • You can use RBRUL or the TEST.CHI.QUAD Excel formula (the Italian-language name of CHISQ.TEST)
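
The steps listed on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `chi_square` and the counts in `table` are hypothetical (they are not the survey data), and the decision is made by comparing the summed value with 5.991, the tabulated critical value for 2 degrees of freedom at p = 0.05, which is equivalent to checking p ≤ 0.05.

```python
def chi_square(observed):
    """Return (statistic, degrees_of_freedom, expected) for a table of counts.

    `observed` is a list of rows, each row a list of observed frequencies.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected frequency of a cell = row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Chi-squared value of a cell = (observed - expected)^2 / expected,
    # summed over all cells
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom = (rows - 1) * (columns - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts (NOT the survey data):
# rows = code-switching yes / no, columns = three age brackets.
table = [[20, 15, 10],
         [10, 15, 20]]
statistic, df, expected = chi_square(table)

# Tabulated chi-square critical value for df = 2 at p = 0.05.
CRITICAL_VALUE_DF2 = 5.991
reject_null = statistic > CRITICAL_VALUE_DF2
print(f"chi-square = {statistic:.3f}, df = {df}, reject null: {reject_null}")
```

For tables with a different number of rows or columns, the critical value has to be looked up for the corresponding degrees of freedom.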

  7. Inferential Statistics – Chi Square Test (2) • You can use the TEST.CHI.QUAD Excel formula • Example: occurrences of code-switching in relation to age brackets in the Filipino language survey • N.B.: Age is treated as a category-type question because we consider age brackets!

  8. Inferential Statistics – Chi Square Test (3) • Expected frequencies (row total × column total / grand total): • =(E3*B5)/E5 in H3 • =(E3*C5)/E5 in I3 • =(E3*D5)/E5 in J3

  9. Inferential Statistics – Chi Square Test (4) • Chi Square Test in J5: • =TEST.CHI.QUAD(B3:D3;H3:J3) • The value is > 0.05, therefore the null hypothesis cannot be rejected: the observed differences may be due to chance (NO STATISTICAL SIGNIFICANCE)

  10. Inferential Statistics – Scatterplot • Used with 2 continuous-type questions • Shows the correlation between two variables • Positive correlation • Negative correlation • You can use RBRUL – see slide #56
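
A correlation between two continuous variables can also be expressed as a single number. The slide does not name a coefficient, so the sketch below uses Pearson's r as an assumption; the function name `pearson_r` and the sample values are hypothetical and are not taken from the survey. A value near +1 indicates a positive correlation, a value near -1 a negative one.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical respondents (NOT the survey data): age and number of known languages.
ages = [18, 25, 32, 40, 51, 63]
languages_known = [4, 4, 3, 3, 2, 2]

r = pearson_r(ages, languages_known)
direction = "positive" if r > 0 else "negative"
print(f"r = {r:.2f} ({direction} correlation)")
```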

  11. Inferential Statistics – ANOVA • ANalysis Of VAriance • Bi/Multivariate Regression Analysis • Used with more than one category-type question and more than one continuous-type question • You can use RBRUL – see e-book

  12. Inferential and multivariate statistics • Inferential statistics • Formulating and testing hypotheses • Key concepts: likelihood, dependent and independent variables, hypothesis and null hypothesis • Multivariate statistics, or statistical modelling • How a dependent variable changes in relation to two or more independent variables • Key concept: the three lines of evidence (See Tagliamonte 2012) • Statistical significance (p<0.05) • Factor weight (FW→1) • Strength of factor group

  13. Rbrul and multivariate statistics • Rbrul • Based on R • Tool for multivariate statistics • Input: Excel worksheet • Output: numbers • What for? • Formulating hypotheses after descriptive analysis of a questionnaire/corpus • Testing hypotheses with inferential multivariate analysis • What do we need? • Excel worksheet in .csv format • R

  14. Converting .xlsx format to .csv (1) Let’s consider the Filipino language survey (.xls format) 1. Go to http://www.docspal.com (or another online converter)

  15. Converting .xlsx format to .csv (2) 2. Upload .xls or .xlsx Excel file and select .csv in “convert to”

  16. Converting .xlsx format to .csv (3) 3. Click on “Convert”

  17. Converting .xlsx format to .csv (4) 4. Click on output file

  18. Converting .xlsx format to .csv (5) 5. Click on “Salva pagina con nome” (“Save Page As”)

  19. Converting .xlsx format to .csv (6) 6. .csv output file

  20. Rbrul: Installation step-by-step (1) 1. Download R: http://cran.r-project.org/bin/windows/base/

  21. Rbrul: Installation step-by-step (2) 2. Press “Avanti” (“Next”) until the installation process finishes.

  22. Rbrul: Installation step-by-step (3) 3. Open R. If you have trouble, right-click the R icon and choose “Esegui come amministratore” (“Run as administrator”).

  23. Rbrul: Installation step-by-step (4) 4. Open R.

  24. Rbrul: Installation step-by-step (5) 5. Type: source("http://www.danielezrajohnson.com/Rbrul.R")

  25. Rbrul: Installation step-by-step (6) 6. Hit the Enter key

  26. Rbrul: Installation step-by-step (7) 7. Write rbrul()

  27. Rbrul: Installation step-by-step (8) 8. Hit the Enter key. Now you are in Rbrul.

  28. Rbrul: Loading data (1) 1. Write 1 and press the Enter key

  29. Rbrul: Loading data (2) 2. Write c and press the Enter key

  30. Rbrul: Loading data (3) 3. Open the questionnaire in .csv

  31. Rbrul: Loading data (4) 4. Now you are ready
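
Before loading the file, it can help to check that the .csv was exported correctly and to see which position each column occupies. The sketch below assumes that the numbers typed at the Rbrul prompts later in this deck (for example 50, 42 and 46) refer to column positions in the loaded file; the function name `numbered_columns` and the column names in `sample` are hypothetical. The `delimiter` parameter is there because Excel in some locales exports semicolon-separated files.

```python
import csv
import io


def numbered_columns(csv_text, delimiter=","):
    """Return (position, column name) pairs from the header row, counting from 1."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    return [(position, name.strip()) for position, name in enumerate(header, start=1)]


# Hypothetical file content (NOT the survey data).
sample = "age,languages_known,code_switching\n23,3,yes\n41,2,no\n"

for position, name in numbered_columns(sample):
    print(position, name)
```

To inspect a real file, read its text first, for example with `open(path, newline="", encoding="utf-8").read()`, and pass the result to `numbered_columns`.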

  32. Example: Linguistic survey and RBRUL (1) • Variables considered: • Code-switching (category-type variable/question) • Who speaks what language(s) at work, with friends, & with family in IT & PH (continuous-type question) • Who uses what language(s) when watching TV, reading, dreaming, & thinking (category-type question) • Number of known languages (continuous-type question) • Age (continuous-type question)

  33. Example: Linguistic survey and RBRUL (2) • Hypotheses: • Code-switching & who speaks what language(s) at work, with friends, & with family in IT & PH (cat+con: bivariate analysis) • Code-switching & who uses what language(s) when watching TV, reading, dreaming, & thinking (cat+cat: cross tabulation) • Number of known languages & age (con+con: scatterplot)

  34. Hypothesis: Code-switching and language use (1) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis. Is there a relation between the number of languages used to talk with friends in PH and in IT & the occurrences of code-switching? Average PH: 1.43; average IT: 1.31. Category+continuous: bivariate analysis 1. Press 5 for bivariate analysis and hit the Enter key.

  35. Hypothesis: Code-switching and language use (2) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 2. Choose variables (1)

  36. Hypothesis: Code-switching and language use (3) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 3. Dependent variable (50)

  37. Hypothesis: Code-switching and language use (4) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 4. Type of response (Enter)

  38. Hypothesis: Code-switching and language use (5) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 5. Choose application (2 + Enter x3)

  39. Hypothesis: Code-switching and language use (6) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 6. Choose independent variable (# lang used with Friends in IT/PH) (42 Enter 46 Enter x2)

  40. Hypothesis: Code-switching and language use (7) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 7. Choose continuous variable (42 Enter 46 Enter x2)

  41. Hypothesis: Code-switching and language use (8) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 8. Modelling (5 Enter)

  42. Hypothesis: Code-switching and language use (9) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis 8. Modelling (5 Enter)

  43. Hypothesis: Code-switching and language use (10) Formulate hypothesis on code-switching and language use with friends in IT/PH using bivariate analysis • Log odds: 0.571 vs 0.292 (if positive, higher likelihood of code-switching) • Deviance: 142.818 vs 144.821 (the larger the deviance, the worse the model fits the data) • P value: 0.0644 vs 0.234 (both > 0.05) • Therefore: the correlation between code-switching and language use with friends is NOT SIGNIFICANT
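
The figures on this slide can be read with a short calculation. The sketch below assumes that each log-odds value is a regression coefficient for a one-unit increase in the continuous variable (one more language used with friends), so that its exponential is the factor by which the odds of code-switching change. The function names and the labels "first value" and "second value" are hypothetical; the slide does not say which figure belongs to IT and which to PH.

```python
import math

ALPHA = 0.05  # conventional cut-off used throughout this deck


def odds_ratio(log_odds):
    """Convert a log-odds coefficient into an odds ratio."""
    return math.exp(log_odds)


def is_significant(p_value, alpha=ALPHA):
    """Apply the p <= 0.05 convention: True means the null hypothesis is rejected."""
    return p_value <= alpha


# Log odds and p-values as reported on the slide.
results = [("first value", 0.571, 0.0644), ("second value", 0.292, 0.234)]

for label, log_odds, p_value in results:
    verdict = "significant" if is_significant(p_value) else "not significant"
    print(f"{label}: odds ratio = {odds_ratio(log_odds):.2f}, p = {p_value} ({verdict})")
```

An odds ratio above 1 corresponds to a positive log-odds value, but with both p-values above 0.05 neither effect can be distinguished from chance.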

  44. Hypothesis: Code-switching and language use (11) The same procedure can be adopted in analysing the relation between code-switching & language used with family & at work in Italy and in the Philippines

  45. Hypothesis: Code-switching and language use – TV (1) Formulate hypothesis on code-switching and language use when watching TV using cross tabulation and Chi Square Test. Is there a relation between the languages used to watch TV & the occurrences of code-switching? Category+category: cross tabulation 1. Press 4 for cross tabulation and hit the Enter key.

  46. Hypothesis: Code-switching and language use – TV (2) Formulate hypothesis on code-switching and language use when watching TV using cross tabulation and Chi Square Test. 2. Choose factors for columns (50 Enter)

  47. Hypothesis: Code-switching and language use – TV (3) Formulate hypothesis on code-switching and language use when watching TV using cross tabulation and Chi Square Test. 3. Choose factors for rows (51 Enter x3)

  48. Hypothesis: Code-switching and language use – TV (4) Formulate hypothesis on code-switching and language use when watching TV using cross tabulation and Chi Square Test. 4. Cross tabulation: do those who watch TV in Italian code-switch more?

  49. Hypothesis: Code-switching and language use – TV (5) Formulate hypothesis on code-switching and language use when watching TV using cross tabulation and Chi Square Test. 5. Chi Square Test in Excel • Observed frequency of Italian/Code-switching: 45 • Expected frequency of Italian/Code-switching: 32.09 • To obtain the expected frequency, multiply the total of the observed frequencies for the independent variable (= 45) by the total of the observed frequencies for the dependent variable (= 87), then divide the product by the grand total of the frequencies (= 122).
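
The arithmetic behind the expected frequency on this slide can be checked directly. This is a minimal sketch using only the three totals and the observed frequency given above; the function name `expected_frequency` is hypothetical.

```python
def expected_frequency(first_total, second_total, grand_total):
    """Expected cell frequency = product of the two marginal totals / grand total."""
    if grand_total <= 0:
        raise ValueError("grand total must be positive")
    return first_total * second_total / grand_total


# Totals as given on the slide: 45, 87 and a grand total of 122.
expected = expected_frequency(45, 87, 122)

# Chi-squared value of this single cell, from the observed frequency on the slide (45).
observed = 45
cell_chi_squared = (observed - expected) ** 2 / expected

print(round(expected, 2))  # 32.09, matching the slide
```

The per-cell values computed this way are what gets summed over all cells of the cross tabulation to obtain the chi-square statistic.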

  50. Hypothesis: Code-switching and language use – TV (6) Formulate hypothesis on code-switching and language use when watching TV using cross tabulation and Chi Square Test. 5. Chi Square Test in Excel
