Chapter 8: Coding
Course: Quantitative Research
Group members: Catherine, Rainie, Louis
Outline
• Preparing data for coding
• Transcribing oral data
  - Transcription conventions
  - Transcription machines
  - Technology and transcription
• Data coding
  - Nominal data
  - Ordinal data
  - Interval data
Preparing data for coding
Once data are collected, it is necessary to organize and analyze them.
• Software: Excel, SPSS, SAS, JMP, etc.
• Data are put into digital form (numbers)
Coding involves making decisions:
• Classify or categorize data
• Raw data → coding → well-organized data
• Raw data may take the form of essays, test scores, diaries, or checkmarks on observation schemes
• Oral data must be transcribed before coding
Transcribing oral data
Two aids: transcription conventions and transcription machines
Transcription conventions
• Facilitate the representation of oral data in a written format
• Useful for coding and for providing examples
Notations in transcripts, from the study of scaffolding in L2 peer revision (De Guerrero and Villamil, 2000), p. 223:
• Italics
• [ brackets ]
• ( parentheses )
• … sequence of dots
• Boldface
• “Quotation marks”
Example of transcription conventions (p. 224)
Emma: uh HONEY I'LL PRAY FOR EV'RYBODY=
Lottie: [=Alri:ght,]
Emma: [=I:- I ] don't kno:w,hh this: [uh w]orld goes o::n=
Lottie: [Yeh.]
Emma: =we have to keep ¯ goin' do[n't we.]
Lottie: [Ye:ah, ]
(.)
Lottie: [U h h u h ]
Emma: [D'you feel h]a:ppy toda:y?
(0.4)
Lottie: Ye:ah.
Emma: Good.
(.)
Source: http://www.lboro.ac.uk/departments/ss/JP-docs/Transcription%20conventions.htm
Transcription machines (foot pedal, headphones)
• Make it easier to transcribe
• Rewind tapes automatically
Technology and transcription (digital recording equipment)
• More reasonably priced and accessible
• Can automate the bulk of the transcription task
✖ Cannot handle nonnative accents
Nominal data
• Often used for classifying categorical data
• Each category is assigned a numerical value
Dichotomous variables
  gender: 1 = male, 2 = female
Nondichotomous variables
  language: 1 = Chinese, 2 = English, 3 = Spanish
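As a minimal sketch of nominal coding, the snippet below maps category labels to the numeric codes shown on the slide. The names `GENDER_CODES`, `LANGUAGE_CODES`, and `code_nominal` are illustrative assumptions, not part of the chapter.

```python
# Numeric codes taken from the slide; the variable names are hypothetical.
GENDER_CODES = {"male": 1, "female": 2}                      # dichotomous
LANGUAGE_CODES = {"Chinese": 1, "English": 2, "Spanish": 3}  # nondichotomous


def code_nominal(values, codebook):
    """Replace each category label with its numeric code.

    A label that is missing from the codebook raises KeyError, so
    uncoded categories are noticed rather than silently dropped.
    """
    return [codebook[value] for value in values]


coded_languages = code_nominal(
    ["English", "Chinese", "Spanish", "English"], LANGUAGE_CODES
)
```

The numbers act only as labels: a code of 3 is not "more" than a code of 1, so averaging nominal codes is not meaningful.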
Ordinal data
• Often used for ranking data
• Can indicate that two students are close together
Advantages of rank groups
• Some data can be discounted (e.g., middle-range scores)
• Groups: top 25% | middle-range scores | bottom 25%
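The rank-group idea can be sketched as follows, assuming hypothetical test scores and a simple rule in which the top and bottom groups each hold 25% of the learners. The function name `rank_groups`, the rounding rule, and the tie-breaking rule are assumptions made for illustration.

```python
def rank_groups(scores, fraction=0.25):
    """Rank learners by score (1 = highest) and label rank groups.

    `scores` maps a learner ID to a score. The top and bottom groups
    each hold `fraction` of the learners (rounded down, at least one).
    Tied scores keep the order in which the learners were entered.
    """
    ordered = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, int(len(ordered) * fraction))
    result = {}
    for rank, learner in enumerate(ordered, start=1):
        if rank <= cutoff:
            group = "top"
        elif rank > len(ordered) - cutoff:
            group = "bottom"
        else:
            group = "middle"
        result[learner] = (rank, group)
    return result


# Hypothetical scores for eight learners.
example = rank_groups({"A": 91, "B": 85, "C": 78, "D": 77,
                       "E": 70, "F": 64, "G": 52, "H": 40})
```

A researcher who wants to discount the middle-range scores could then keep only the learners labeled "top" or "bottom".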
Interval data
• Often used for ranking data and indicating the distance between scores
• Pay attention to impact → -10 -10
Outline
• Coding System
  - Common coding systems and categories
    (1) T-units
    (2) Suppliance in obligatory contexts
    (3) CHAT
  - Custom-made coding systems
    (1) Question formation
    (2) Negative feedback
    (3) Classroom interaction
    (4) Second language writing instruction
8.3 Coding System
• Bernard (1995) suggested that “the general principle in research is always use the highest level of measurement that you can.”
• If researchers code data using as finely grained a measurement as possible, the data can always be collapsed into a broader level of coding.
• The categories in coding should be as narrow as possible.
• Different coding practices can be used with second language data; they allow researchers to gain a deeper understanding of the information they have collected.
• Coding systems are often referred to as (1) sheets, (2) charts, (3) techniques, or (4) schemes.
• Many researchers develop a coding scheme based on their specific research questions.
• It would be helpful if researchers made more use of existing coding schemes, because this would facilitate easy comparison across studies.
• However, sometimes (1) existing schemes require refinements to capture new knowledge, or (2) new schemes are required.
• Coding systems range from those based on standard measures to custom-made systems.
• Many different schemes have been developed for coding all sorts of second language data.
• It is important to recognize that it would be impossible to cover the whole range of existing schemes.
8.3.1 Common Coding Systems & Categories
A number of coding systems for oral and written data include:
• T-units
• Suppliance in obligatory context (SOC) counts
• CHAT conventions
• Turns
• Utterances
• Sentences
• Communication units
• Tone units
• Analysis of speech units
• Idea units
• Clauses
• S-nodes per sentence
• Type-token ratios
• Targetlike usage counts
8.3.1.1 T-Units
• Defined as “one main clause with all subordinate clauses attached to it.”
• They were originally used to measure syntactic development in children’s L1 writing but later became a common measurement in second language research as well.
Example of a T-unit:
  After she had eaten, Sally went to the park
  - “After she had eaten” is the subordinate clause
  - “Sally went to the park” is the main clause
  → This T-unit is error-free: it contains no nontargetlike language.
• An alternative T-unit example:
  After eat, Peter go to bed
  - “After eat” contains an error
  - “Peter go to bed” is the main clause
  → This T-unit contains an error.
• To code using T-units, a researcher may go through an essay or a transcription and count the total number of T-units.
• The researcher could then count the number of T-units not containing any errors and present a ratio of error-free T-units to total T-units.
• T-units have been used as a measurement of linguistic complexity, as well as accuracy.
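The ratio described above can be sketched in a few lines, assuming a human coder has already decided, for each T-unit, whether it contains an error. The function name `error_free_ratio` is hypothetical.

```python
def error_free_ratio(t_unit_has_error):
    """Return the proportion of T-units that contain no errors.

    `t_unit_has_error` holds one boolean per T-unit, as decided by a
    human coder: True if the T-unit contains an error.
    """
    total = len(t_unit_has_error)
    if total == 0:
        raise ValueError("no T-units were coded")
    error_free = sum(1 for has_error in t_unit_has_error if not has_error)
    return error_free / total


# The two example T-units from the slides:
#   "After she had eaten, Sally went to the park"  (error-free)
#   "After eat, Peter go to bed"                   (contains an error)
ratio = error_free_ratio([False, True])
```

With one error-free T-unit out of two, the ratio is one half; a real essay would supply many more coded T-units.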
8.3.1.2 Suppliance in obligatory context (SOC) counts
• Learners’ level of acquisition can be measured in terms of how often particular features are supplied where they are required. This is known as SOC.
Example: He is singing right now.
→ The -ing is required because this is a context in which the progressive form is obligatory.
• SOC was first used in studies of the acquisition of grammatical morphemes by children acquiring English as their first language, and was later applied in second language research.
• Although SOC is a useful measurement of morpheme use in required contexts, Pica (1984) criticized it.
• Pica indicated that SOC does not account for learners’ use of morphemes in inappropriate contexts.
• Pica used target-like usage (TLU) as an additional measure. It takes into account both appropriate and inappropriate contexts.
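The difference between the two measures can be illustrated with a small sketch. The slides give no formulas, so the code assumes the commonly cited simple versions: SOC as correct suppliances divided by obligatory contexts, and TLU as correct suppliances divided by obligatory contexts plus suppliances in non-obligatory contexts. The counts and function names are hypothetical.

```python
def soc_score(correct_in_obligatory, obligatory_contexts):
    """Suppliance in obligatory contexts (simple version, assumed)."""
    return correct_in_obligatory / obligatory_contexts


def tlu_score(correct_in_obligatory, obligatory_contexts,
              supplied_in_nonobligatory):
    """Target-like usage: also penalizes use in inappropriate contexts."""
    return correct_in_obligatory / (
        obligatory_contexts + supplied_in_nonobligatory
    )


# Hypothetical learner: -ing supplied correctly in 8 of 10 obligatory
# contexts, and also used 6 times where it was not required.
soc = soc_score(8, 10)
tlu = tlu_score(8, 10, 6)
```

For this learner the SOC score looks high, while the TLU score is lower because the overuse of the morpheme is counted against it.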
8.3.1.3 CHAT
• CHAT was developed as a tool for the study of first and second language acquisition as part of the Child Language Data Exchange System (CHILDES) database.
• CHAT has become a common system for the coding of conversational interactions; it employs detailed conventions.
• CHAT is particularly useful in qualitative research.
• It is important to realize that many researchers working in qualitative paradigms have argued that quantification of coding does not adequately represent their data.
• The goal for researchers is to ascertain how best to investigate their own research questions.
• In much second language research, preexisting coding systems and categories are the exception rather than the rule.
8.3.2 Custom-Made Coding System
8.3.2.1 Question Formation
• Mackey & Philp needed a coding scheme that would allow them to identify how the learners’ question formation changed over time.
• To code the data, they first designated the questions produced by their child learners as belonging to one of the six stages based on the Pienemann-Johnston hierarchy. The modified version is shown in Table 8.6.
Coding for Questions: Tentative Stages for Question Formation
Goal: to capture processing capabilities and developing linguistic complexity
• Following the assignment of each question to a particular stage, the next step is to determine the highest-level stage that the learners reached.
• The next step of the coding involved the assignment of an overall stage to each learner, based on the two highest-level question forms asked in two different tests.
• It was then possible to examine whether the learners had improved over time.
Table 8.7 Coding for Question Stage

ID  | Pretest                 | Immediate Posttest      | Delayed Posttest
    | Task 1  2  3  Final     | Task 1  2  3  Final     | Task 1  2  3  Final
    |               Stage     |               Stage     |               Stage
AB  |      3  3  2    3       |      3  3  3    3       |      3  3  2    3
AA  |      3  3  3    3       |      5  5  4    5       |      5  5  4    5
AC  |      3  4  3    3       |      2  2  3    2       |      3  3  3    3
AD  |      3  3  4    4       |      3  5  5    5       |      5  3  3    3

• Learner AB remains at Stage 3 throughout the study.
• Learner AA began the study at Stage 3 and continued through the subsequent posttests at Stage 5.
• Once this sort of coding has been carried out, the researcher can make decisions about the analysis.
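One simple way to examine change over time is to compare each learner's final stage on the pretest with the final stage on each posttest. The sketch below uses the final-stage values from Table 8.7; the function name `stage_change` and the choice of a simple difference are assumptions, not the analysis used in the original study.

```python
# Final stage per learner for (pretest, immediate posttest, delayed
# posttest), copied from the "Final Stage" columns of Table 8.7.
FINAL_STAGES = {
    "AB": (3, 3, 3),
    "AA": (3, 5, 5),
    "AC": (3, 2, 3),
    "AD": (4, 5, 3),
}


def stage_change(final_stages):
    """Return each learner's stage change from the pretest to each posttest."""
    changes = {}
    for learner, (pre, immediate, delayed) in final_stages.items():
        changes[learner] = (immediate - pre, delayed - pre)
    return changes


changes = stage_change(FINAL_STAGES)
```

A positive value means the learner was coded at a higher stage than on the pretest, zero means no change, and a negative value means a lower stage.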
8.3.2.2 Negative Feedback
• Oliver developed a hierarchical coding system for analysis that first divided all teacher-student and NS-NNS (native speaker-nonnative speaker) conversations into three parts:
(1) the NNS’s initial turn
(2) the response given by the teacher or NS partner
(3) the NNS’s reaction
→ Each part was then subjected to further coding.
Figure 8.1 Three-turn coding scheme
Initial turn (rated as) → Correct | Non-target | Incomplete
        ↙ ↓ ↘
NS response → Ignore | Negative feedback | Continue
        ↙ ↓ ↘
NNS response → Response | Ignore | No chance
• As with many schemes, this one is top-down (also known as hierarchical), and the categories are mutually exclusive, meaning that it is possible to code each piece of data in only one way.
8.3.2.3 Classroom Interaction
• When an error occurred, the next turn was examined to determine whether the error was corrected, or whether it was ignored and the topic continued by the teacher or the learner.
• If the error was corrected, the following turn was examined and coded according to whether the learner produced uptake or whether the topic was continued.
• Finally, the talk following uptake was examined with regard to whether the uptake was reinforced or the topic continued.
8.3.2.4 Second Language Writing Instruction
• Two studies used coding categories:
(1) Adams (2003): investigated the effects of written error correction on learners’ subsequent second language writing
(2) Sachs & Polio (2004): compared three feedback conditions
• Although the two studies focused on changes in linguistic accuracy in L2 writing, the researchers used different coding schemes to fit their research questions.
• Sachs & Polio, to compare the four feedback conditions with each other, coded T-units as:
(1) original error(s) (+)
(2) completely corrected (0)
(3) completely unchanged (-)
(4) not applicable (n/a)
• Adams coded individual forms as:
(1) more targetlike
(2) not more targetlike
(3) not attempted (avoided)
• Sachs & Polio considered T-unit codings of “at least partially changed” (+) to be possible evidence of noticing even when the forms were not completely more targetlike.
Introduction
• Task planning
• Coding qualitative data
• Interrater reliability
• The mechanics of coding
• Conclusion
8.3.2.5 Task planning
Through operationalization
• Yuan and Ellis (2003):
(1) Fluency: (a) number of syllables per minute, and (b) number of meaningful syllables per minute
(2) Complexity: syntactic complexity (the ratio of clauses to T-units), syntactic variety (the total number of different grammatical verb forms used), and mean segmental type-token ratio
(3) Accuracy: the percentage of error-free clauses, and correct verb forms (the percentage of accurately used verb forms)
• Benefit of such a coding system: it is similar enough to those used in previous studies that results are comparable, while also finely grained enough to capture new information.
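Several of these measures are simple ratios once the counts have been made by a human coder. The sketch below is illustrative only: the function names, the example counts, and the segment length used for the mean segmental type-token ratio are assumptions, and leftover words that do not fill a whole segment are simply dropped.

```python
def syllables_per_minute(syllable_count, seconds):
    """Fluency: syllables produced per minute of speech."""
    return syllable_count / (seconds / 60)


def clauses_per_t_unit(clause_count, t_unit_count):
    """Syntactic complexity: ratio of clauses to T-units."""
    return clause_count / t_unit_count


def percent_error_free_clauses(error_free_clauses, clause_count):
    """Accuracy: percentage of clauses that contain no errors."""
    return 100 * error_free_clauses / clause_count


def mean_segmental_ttr(words, segment_length):
    """Average the type-token ratio over consecutive full segments."""
    segments = [
        words[start:start + segment_length]
        for start in range(0, len(words) - segment_length + 1, segment_length)
    ]
    if not segments:
        raise ValueError("text is shorter than one segment")
    ratios = [len(set(segment)) / len(segment) for segment in segments]
    return sum(ratios) / len(ratios)


# Hypothetical counts for one learner's narrative.
fluency = syllables_per_minute(300, 120)
complexity = clauses_per_t_unit(30, 20)
accuracy = percent_error_free_clauses(24, 30)
# Tiny made-up text; a real analysis would use much longer segments.
msttr = mean_segmental_ttr(
    "the cat saw the dog and the dog saw the cat".split(), 4
)
```

The counting itself (what counts as a syllable, a clause, or an error) remains a coding decision that has to be defined and checked for reliability.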
8.3.3 Coding qualitative data (1)
• The schemes for qualitative coding generally emerge from the data (open coding).
• Considering aspects such as the range of variation within individual categories can assist in (1) adapting and (2) finalizing the coding system.
• Look for anything pertinent to the research question or problem.
• Examine the data for emergent patterns and themes.
8.3.3 Coding qualitative data (2)
• Themes and topics should emerge from the first round of insights into the data (independent categories).
• Problem: with highly specific coding schemes, it can be problematic to compare qualitative coding and results across studies and contexts.
• Watson-Gegeo (1988): “Although it may not be possible to compare coding between settings on a surface level, it may still be possible to do so on an abstract level.”
8.4 Interrater reliability (1)
• If interrater reliability is high, different raters’ results will be very similar.
• With only one coder and no intracoder reliability measures, the reader’s confidence may be undermined.
• To increase confidence:
(1) Have more than one rater code the data wherever possible
(2) Carefully select and train the raters
• Keep coders selectively blind about what part of the data, or for which group, they are coding, in order to reduce the possibility of inadvertent coder biases.
8.4 Interrater reliability (2)
• Another way to increase rater reliability is to schedule coding in rounds or trials to reduce boredom or drift.
• How much data should be coded: as much as is feasible given the time and resources available for the study.
• How much data should be coded by a second rater depends on the nature of the coding scheme.
• With highly objective, low-inference coding schemes, it is possible to establish confidence in rater reliability with as little as 10% of the data.
8.4.1.1 Simple percentage agreement
• The ratio of all coding agreements over the total number of coding decisions made by the coders (appropriate for continuous data)
• The drawback: simple percentages tend to ignore the possibility that some of the agreement may have occurred by chance.
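The ratio can be computed directly from two raters' codes for the same items. This is a minimal sketch; the function name `percentage_agreement` and the ten example codes ("T" for targetlike, "N" for nontargetlike) are made up for illustration.

```python
def percentage_agreement(rater_a, rater_b):
    """Return agreements divided by the total number of coding decisions."""
    if len(rater_a) != len(rater_b):
        raise ValueError("both raters must code the same items")
    if not rater_a:
        raise ValueError("no items were coded")
    agreements = sum(1 for a, b in zip(rater_a, rater_b) if a == b)
    return agreements / len(rater_a)


# Hypothetical codes from two raters for the same ten items.
first_rater = ["T", "T", "N", "T", "N", "N", "T", "T", "N", "T"]
second_rater = ["T", "N", "N", "T", "N", "T", "T", "T", "N", "T"]
agreement = percentage_agreement(first_rater, second_rater)
```

Here the raters agree on eight of ten items, but the figure says nothing about how many of those agreements could have arisen by chance.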
8.4.1.2 Cohen’s kappa
• This statistic represents the average rate of agreement for an entire set of scores, accounting for the frequency of both agreements and disagreements by category.
• In a dichotomous coding scheme (e.g., targetlike or nontargetlike), it draws on:
(1) the first coder’s counts of targetlike and nontargetlike codes
(2) the second coder’s counts of targetlike and nontargetlike codes
(3) the items that both the first and second coders coded as targetlike
• It also accounts for chance.
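A minimal sketch of the statistic, using the standard formula kappa = (observed agreement - expected agreement) / (1 - expected agreement), where expected agreement is estimated from each rater's category frequencies. The function name and the example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must code the same, non-empty set of items")
    # Proportion of items on which the two raters agree.
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical codes: "T" = targetlike, "N" = nontargetlike.
first_rater = ["T", "T", "N", "T", "N", "N", "T", "T", "N", "T"]
second_rater = ["T", "N", "N", "T", "N", "T", "T", "T", "N", "T"]
kappa = cohens_kappa(first_rater, second_rater)
```

For these codes the raters agree on 80% of the items, yet kappa is noticeably lower because a good share of that agreement would be expected by chance.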
8.4.1.3 Additional measures of reliability
• Pearson’s Product Moment and Spearman Rank Correlation Coefficients are based on measures of correlation and reflect the degree of association between the ratings provided by two raters.
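Both coefficients can be sketched with the standard library alone. The version below computes Pearson's r from its textbook definition and obtains Spearman's coefficient by applying Pearson's r to average ranks; the function names and the example ratings are assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two lists of ratings."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long lists with at least two ratings")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant ratings")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 (lowest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson's r computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical holistic scores given to five essays by two raters.
scores_a = [4, 3, 5, 2, 1]
scores_b = [5, 3, 4, 2, 1]
r = pearson(scores_a, scores_b)
rho = spearman(scores_a, scores_b)
```

A high correlation shows that the two raters order the essays similarly; it does not by itself show that they assign the same scores.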
8.4.1.4 Good practice guidelines for interrater reliability
• “There is no well-developed framework for choosing appropriate reliability measures.” (Rust & Cooil, 1994)
• General good practice guidelines suggest that researchers should state:
(1) which measure was used to calculate interrater reliability
(2) what the score was
(3) why that particular measure was chosen (briefly)
8.4.1.5 How data are selected for interrater reliability tests
• Semi-randomly select a portion of the data (say 25%), which is then coded by a second rater.
• Create comprehensive datasets so that the 25% can be randomly selected from different parts of the main dataset.
• Intrarater reliability refers to whether a rater will assign the same score after a set time period.
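One way to draw the portion for the second rater from different parts of the dataset is to sample within each part. The sketch below assumes the data are already grouped by part (for example, by test session); the function name, the grouping, the file names, and the fixed seed are all hypothetical choices.

```python
import random


def select_for_second_rater(segments_by_part, proportion=0.25, seed=None):
    """Randomly pick `proportion` of the segments from each part of the data.

    Every non-empty part contributes at least one segment, so the
    selection is spread across the whole dataset.
    """
    rng = random.Random(seed)
    selected = {}
    for part, segments in segments_by_part.items():
        k = min(len(segments), max(1, round(len(segments) * proportion)))
        selected[part] = sorted(rng.sample(segments, k))
    return selected


# Hypothetical dataset: eight transcripts from each of three sessions.
dataset = {
    "pretest": [f"pre_{i:02d}" for i in range(1, 9)],
    "immediate": [f"imm_{i:02d}" for i in range(1, 9)],
    "delayed": [f"del_{i:02d}" for i in range(1, 9)],
}
picked = select_for_second_rater(dataset, proportion=0.25, seed=1)
```

Recording the seed makes the selection reproducible, which helps when reporting how the reliability sample was chosen.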
8.4.1.6 When to carry out coding reliability checks
• Reports on coding should include the following:
(1) What measure was used
(2) The amount of data coded
(3) The number of raters employed
(4) The rationale for choosing the measurement used
(5) Interrater reliability statistics
(6) What happened to data about which there was disagreement
• Complete reporting will help the researcher provide a solid foundation for the claims made in the study, and will also facilitate the process of replicating studies.
8.5 The mechanics of coding
(1) Using highlighting pens, working directly on transcripts.
(2) Listening to tapes or watching videotapes without transcribing everything: researchers may simply mark coding sheets when the phenomena they are interested in occur.
(3) Using computer programs (CALL programs).
8.5.1 How much to code
(1) Researchers should consider and justify why they are not coding all their data.
(2) They must determine how much of the data to code (data sampling or data segmentation).
(3) The data coded must be representative of the dataset as a whole and should also be appropriate for comparisons.
(4) The research questions should ultimately drive the decisions made, and researchers should specify principled reasons for selecting data to code.
8.5.2 When to make coding decisions
• Decide how to code and how much to code prior to the data collection process.
• Carry out an adequate pilot study: this will allow for piloting not only of materials and methods, but also of coding and analysis.
• The most effective way to avoid potential problems: design coding sheets ahead of data collection and then test them out in a pilot study.
8.6 Conclusion
• Many of the processes involved in data coding can be thought through ahead of time and then pilot tested.
• These include the preparation of raw data for coding, transcription, the modification or creation of appropriate coding systems, and the plan for determining reliability.