An introduction to CHILDES Rianne Schippers r.schippers@uu.nl
Outline • What is CHILDES? • Where do you find CHILDES? • Why would you use CHILDES? • How do you use CHILDES?
What is CHILDES? • Child Language Data Exchange System • Brian MacWhinney • Online database of first and second language acquisition in children. • Written transcripts • Audio • Video • Also contains data from children who are not typically developing.
What is CHILDES? • Recorded, natural speech. • Recorded in home setting • Recorded at regular intervals • Longitudinal data • Typological variation. • Germanic • Romance • Slavic • Asian
Where is CHILDES? Link: http://childes.psy.cmu.edu/ • “Databases”, i.e. the datasets. • “Database manual” describing each dataset. • Programs you can use to browse the databases • Manuals that explain how to use the programs
Why use CHILDES? • Answer questions about language acquisition • Experimental studies • Does child at age X know Y? • Do 3-year-olds know passives? • Do 2-year-olds know inflectional morphology? • What interpretation do children at age X assign to Y? • Do 4-year-olds understand binding? • Do 5-year-olds understand scope freezing?
Why use CHILDES? • Questions experiments cannot easily answer: • Role played by input • Order of acquisition • Manner of acquisition • Causality • Longitudinal study • Big, universal questions • Lexical categories • Inflectional morphology • Argument structure
Why use CHILDES? • Does the interaction between language type and pronoun omission match the predictions of parameter-setting models? • Are children with Down syndrome responsive to maternal requests? • How do children first learn mental state verbs such as “remember” or “know”?
Why use CHILDES? • Smaller, language-specific questions • Verb second • Subjects (EPP) • Particle verbs • Comparative studies • Acquisition of determiners • Exploration • Mean Length of Utterance, frequencies
How to use CHILDES? • Download and install the dataset(s) you are interested in. The “database manual” describes: • Language • Age(s) • Number of children • Download and install CLAN (Computerized Language Analysis): • A search and statistics engine for CHILDES. • Or use NLTK’s CHILDES corpus reader.
How to use CHILDES? • All files are transcribed in CHAT format • Codes for the Human Analysis of Transcripts • Format • Files start with @-headers: information about participants and setting • The rest of the file contains *-tiers and %-tiers • *-tiers specify the speaker (*CHI = child) and contain what was said • %-tiers relate to the preceding *-tier and give extralinguistic information
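To make the three line types concrete, here is a minimal Python sketch that sorts the lines of a CHAT-style text into @-headers, *-tiers and %-tiers. The sample transcript and the function name `parse_chat` are made up for illustration, and real CHAT files have more conventions (for example continuation lines) than this handles.

```python
# Invented sample in CHAT style; not taken from any CHILDES corpus.
SAMPLE = """\
@Begin
@Participants:\tCHI Target_Child, MOT Mother
*CHI:\tI have a ball
%mor:\tPRO|I&1S V|have-PRES DET|a&INDEF N|ball
*MOT:\tyes you do
@End
"""

def parse_chat(text):
    """Split CHAT-style text into header lines and utterances with their %-tiers."""
    headers, utterances = [], []
    for line in text.splitlines():
        if line.startswith("@"):
            headers.append(line)
        elif line.startswith("*"):
            # Main tier: speaker code before the colon, words after it.
            speaker, _, words = line[1:].partition(":")
            utterances.append({"speaker": speaker, "text": words.strip(), "tiers": {}})
        elif line.startswith("%") and utterances:
            # Dependent tier: attach it to the most recent main tier.
            name, _, content = line[1:].partition(":")
            utterances[-1]["tiers"][name] = content.strip()
    return headers, utterances

headers, utterances = parse_chat(SAMPLE)
for utterance in utterances:
    print(utterance["speaker"], "->", utterance["text"], utterance["tiers"])
```

Attaching each %-tier to the most recent *-tier mirrors the rule that a %-tier always belongs to the *-tier before it.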
How to use CHILDES? • %-tiers are also used for coding • %pho for phonology • %mor for morphology, for example:
*CHI: I have a ball
%mor: PRO|I&1S V|have-PRES DET|a&INDEF N|ball
• %syn for syntax
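The %mor tier in the example pairs each word with a category before the bar and the word form plus its markers after it. A small Python sketch of how such a tier could be taken apart follows; `parse_mor` is a hypothetical helper, not part of CLAN, and it only handles the simple `CATEGORY|rest` shape shown on the slide.

```python
MOR_TIER = "PRO|I&1S V|have-PRES DET|a&INDEF N|ball"

def parse_mor(tier):
    """Return (category, rest) pairs for every token of the form CATEGORY|rest."""
    pairs = []
    for token in tier.split():
        category, bar, rest = token.partition("|")
        if bar:  # skip tokens that carry no category, such as punctuation
            pairs.append((category, rest))
    return pairs

pairs = parse_mor(MOR_TIER)
verbs = [rest for category, rest in pairs if category == "V"]
print(pairs)
print("verbs:", verbs)
```

Once the tier is split this way, questions such as "which verbs does the child use?" become simple filters on the category.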
How to use CHILDES? • Some more annotations • # unfilled pause between words • 6 schwa • & phonological fragment • xxx unintelligible speech • [/] retracing without correction, e.g.: then [/] then • [//] retracing with correction, e.g.: then [//] but • < >["] quotation mark, used when the child literally repeats something • All notation can be found in the CHAT manual
How to use CHILDES? • Go to the command window • Every search starts with a command • kwal: word search • combo: combined search for two or more words • freq: frequency counts • mlu: mean length of utterance (MLU) • A command is followed by search parameters
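As a rough picture of what frequency counts and MLU measure, the Python sketch below computes both over a few invented utterances. It is not how CLAN works internally: here MLU is counted in words for simplicity, and CLAN's own counting rules are described in its manual. All names and data are assumptions for illustration.

```python
from collections import Counter

# Invented (speaker, utterance) pairs.
UTTERANCES = [
    ("CHI", "want ball"),
    ("MOT", "do you want the ball"),
    ("CHI", "I want more ball"),
]

def word_frequencies(utterances, speaker="CHI"):
    """Count how often each word occurs in the utterances of one speaker."""
    return Counter(
        word for spk, text in utterances if spk == speaker for word in text.split()
    )

def mlu_in_words(utterances, speaker="CHI"):
    """Mean utterance length in words (a simplification of MLU)."""
    lengths = [len(text.split()) for spk, text in utterances if spk == speaker]
    return sum(lengths) / len(lengths) if lengths else 0.0

frequencies = word_frequencies(UTTERANCES)
mlu = mlu_in_words(UTTERANCES)
print(frequencies.most_common())
print("MLU in words:", mlu)
```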
How to use CHILDES? • Some standard CLAN parameters • +t selects the utterances of a specified speaker • +s selects a word to be searched • +u specifies that all search results are stored in one file • +r deals with the treatment of material between parentheses • +f output is stored in the (specified) file(s) • Not all commands have the same search parameters • Type the command in the command window and hit enter
How to use CHILDES? • Searching with kwal • Speaker(s) • Word • File(s) • The command must come first; the order in which the search parameters are given is irrelevant • The command and every search parameter must be separated from each other by a space
How to use CHILDES? • Setting the speaker parameter • Identify the speaker(s) • +t = look for that specific speaker • -t = look for everyone but that specific speaker • We are interested in the child • Pattern: command, speaker parameter (child)
kwal +t*CHI
How to use CHILDES? • Setting the word parameter • Decide what word you want to look for • +s = look for that specific word • -s = look for everything except that specific word • Let’s say we want to know whether the child has used the auxiliary ‘want’. • Pattern: command, speaker, word parameter (want)
kwal +t*CHI +s"want"
How to use CHILDES? • Specifying the file • Two ways: • Using the ‘file in’ button • Specifying the file in the command line • Let’s say we want to start our search in file sarah023.cha • Pattern: command, speaker, word, file
kwal +t*CHI +s"want" sarah023.cha
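The logic of a speaker-plus-word search can be sketched in a few lines of Python. This is only an illustration of the idea, not CLAN's implementation: `kwal_like` is a hypothetical function, the transcript is invented, and details such as case handling are ignored.

```python
# Invented transcript lines in CHAT style.
TRANSCRIPT = """\
*MOT:\tdo you want juice ?
*CHI:\tI want juice .
*CHI:\tmore juice .
"""

def kwal_like(text, speaker, word):
    """Return the *-tier lines of `speaker` that contain `word` as a whole token."""
    prefix = f"*{speaker}:"
    hits = []
    for line in text.splitlines():
        if line.startswith(prefix) and word in line[len(prefix):].split():
            hits.append(line)
    return hits

child_hits = kwal_like(TRANSCRIPT, "CHI", "want")
for hit in child_hits:
    print(hit)
```

The three arguments correspond to the three choices made on the slides: which file (here a string), which speaker, and which word.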
How to use CHILDES? Exercise: • Discover whether the mother uses the auxiliary ‘want’ in file sarah023.cha • Steps to take: • Determine the command • Identify the speaker • Decide on the word • Specify the file • Solution:
kwal +t*MOT +s"want" sarah023.cha
How to use CHILDES? • Searching for several words • Make a list in .txt format • Enter the list as the word you are looking for • For example: • A list with all auxiliaries • Named auxiliary.txt • Parameter: +s@auxiliary.txt
kwal +t*CHI +s@auxiliary.txt sarah023.cha
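The idea of searching with a word list can be sketched in Python as follows. The file layout (one word per line), the three words in the list and both function names are assumptions for illustration; check the CLAN manual for the format CLAN itself expects.

```python
import tempfile
from pathlib import Path

def load_word_list(path):
    """Read one search word per line, ignoring blank lines (assumed layout)."""
    return {w.strip() for w in Path(path).read_text().splitlines() if w.strip()}

def lines_with_any(lines, speaker, words):
    """Keep the *-tier lines of `speaker` that contain at least one listed word."""
    prefix = f"*{speaker}:"
    return [
        line for line in lines
        if line.startswith(prefix) and words & set(line[len(prefix):].split())
    ]

# Write a small, made-up list to a temporary auxiliary.txt and read it back.
with tempfile.TemporaryDirectory() as tmp:
    list_path = Path(tmp) / "auxiliary.txt"
    list_path.write_text("can\nwill\nmust\n")
    auxiliaries = load_word_list(list_path)

LINES = ["*CHI:\tI can do it .", "*CHI:\tmore juice .", "*MOT:\tyou will see ."]
hits = lines_with_any(LINES, "CHI", auxiliaries)
print(hits)
```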
How to use CHILDES? • The output screen is limited • Store the data in a separate file • Parameter: +f • File name has three letters • For example: aux • Pattern: command, speaker, word, output parameter with file name, file
kwal +t*CHI +s"want" +faux sarah023.cha
How to use CHILDES? • Retype the command:
kwal +t*CHI +s"want" sarah023.cha
• Notice: some material is in parentheses:
*CHI: wan(t) do (a)gain
• What does this mean? • The child actually said ‘wan’ instead of ‘want’. • By default, CLAN includes the material in parentheses. • CLAN will therefore look for ‘want’
How to use CHILDES? • What does this mean? • A search for ‘want’ will give you both ‘wan(t)’ and ‘want’. • Control whether the search includes material in parentheses with the +r parameter • +r1 = default, include material in parentheses • +r2 = exclude material in parentheses • +r5 = exclude rephrased material
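The difference between including and excluding parenthesised material can be illustrated with two small Python helpers. They are hypothetical and only show the two readings of a word such as wan(t); how each +r setting maps onto them is defined in the CLAN manual.

```python
import re

def with_parentheses_filled(word):
    """'wan(t)' -> 'want': keep the material that stands in parentheses."""
    return word.replace("(", "").replace(")", "")

def without_parenthesised(word):
    """'wan(t)' -> 'wan': drop the material that stands in parentheses."""
    return re.sub(r"\([^)]*\)", "", word)

utterance = "wan(t) do (a)gain"
filled = [with_parentheses_filled(w) for w in utterance.split()]
spoken = [without_parenthesised(w) for w in utterance.split()]
print(filled)
print(spoken)
```

A search for ‘want’ succeeds on the first reading and fails on the second, which is the contrast the slides describe.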
How to use CHILDES? • Try out:
kwal +t*CHI +s"want" +r2 sarah023.cha
• +r5 allows for exclusion of rephrased material • What is rephrased material?
*CHI: I wanna [: want to] eat cereal
• In the default setting, CLAN searches the rephrased material (‘want to’) • The +r5 option allows you to look for ‘wanna’ instead.
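The two views of a rephrased form can be sketched in Python with one regular expression. The helper names are hypothetical and the pattern only covers the simple `word [: replacement]` shape shown in the example.

```python
import re

# A word followed by "[: replacement]", as in "wanna [: want to]".
REPLACEMENT = re.compile(r"(\S+)\s+\[:\s*([^\]]+)\]")

def target_forms(utterance):
    """Use the rephrased material: 'wanna [: want to]' -> 'want to'."""
    return REPLACEMENT.sub(lambda m: m.group(2).strip(), utterance)

def spoken_forms(utterance):
    """Use what the child actually said: 'wanna [: want to]' -> 'wanna'."""
    return REPLACEMENT.sub(lambda m: m.group(1), utterance)

utterance = "I wanna [: want to] eat cereal"
print(target_forms(utterance))
print(spoken_forms(utterance))
```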
How to use CHILDES? • Searching with both +s and -s • CLAN only allows you to specify either +s or -s in a single command • Imagine you want to look for all the conjugations of one verb, but are not interested in other words that look similar • For example: all the verbal forms of ‘go’ • First, the wildcard • The wildcard * stands for any sequence of characters
How to use CHILDES? • Adding the * to the word search: +s"go*" • Words that this search will find are: go, gone, goes, going • But also words such as: got, good, goat, god, etc. • Ideally, you want to specify both +s and -s • Solution: the piping option
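The over-matching problem is easy to reproduce with Python's standard `fnmatch` module, whose `*` also stands for any sequence of characters. This is an analogy to the CLAN wildcard, not CLAN itself, and the word list is invented.

```python
from fnmatch import fnmatchcase

WORDS = ["go", "goes", "going", "gone", "got", "good", "goat", "dog"]

# "go*" matches every word that starts with "go", verb form or not.
matches = [word for word in WORDS if fnmatchcase(word, "go*")]
print(matches)
```

Forms of ‘go’ and unrelated words such as ‘got’ and ‘good’ all come back, which is why a second, excluding step is needed.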
How to use CHILDES? • Piping: the second command operates on the output of the first command • First command: look for ‘go*’ • Second command: exclude ‘good’, ‘got’, etc. • For the second command to be able to operate on the output of the first, the first command must give its output in CHAT format • Use the +d option for this
How to use CHILDES? • First command: • Look for ‘go*’ • For the speaker *CHI • Output must be in CHAT format • In file sarah040.cha
kwal +t*CHI +s"go*" +d sarah040.cha
• Second command: exclude ‘got’
kwal -s"got"
How to use CHILDES? • Piping the first and the second command • Pattern: first command, piping operator (|), second command
kwal +t*CHI +s"go*" +d sarah040.cha | kwal -s"got"
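The two-step logic of piping can be mimicked in Python by feeding the result of one filter into another. Both functions are hypothetical stand-ins for the two kwal calls, the lines are invented, and the second step drops a whole line as soon as it contains an excluded word.

```python
from fnmatch import fnmatchcase

def words_of(line):
    """Words of a *-tier line, i.e. everything after the first colon."""
    return line.partition(":")[2].split()

def keep(lines, speaker, pattern):
    """Step 1: keep lines of `speaker` with a word matching the wildcard pattern."""
    prefix = f"*{speaker}:"
    return [
        line for line in lines
        if line.startswith(prefix)
        and any(fnmatchcase(word, pattern) for word in words_of(line))
    ]

def drop(lines, excluded):
    """Step 2: drop every line that contains one of the excluded words."""
    return [line for line in lines if not excluded & set(words_of(line))]

LINES = [
    "*CHI:\tI go home .",
    "*CHI:\tI got it .",
    "*CHI:\tgood doggie .",
    "*MOT:\tlet's go .",
]
stage1 = keep(LINES, "CHI", "go*")
stage2 = drop(stage1, {"got", "good"})
print(stage2)
```

The first step has to hand over complete transcript lines so that the second step can search them again, which is the role the CHAT-format output plays in the piped command.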
How to use CHILDES? • Looking for more than one word at a time • Searching with combo • Speaker(s) • Words • File(s) • Boolean operators: • ^ = immediately followed by • * = any character • + = or • ! = not
How to use CHILDES? • Setting the speaker parameter
combo +t*CHI
• Setting the word parameter • Let’s look for the combination of ‘want’ and ‘to’ • ‘want’ immediately followed by ‘to’
combo +t*CHI +s"want^to"
How to use CHILDES? • Specifying the file • Let’s look in file sarah034.cha
combo +t*CHI +s"want^to" sarah034.cha
• Combo looks for the words in sequence by default • The +x parameter allows you to look for two or more words in any order
How to use CHILDES? • Searching for ‘want’ directly followed by ‘to’ without +x only gives ‘want to’
combo +t*CHI +s"want^to" sarah034.cha
• The same search with +x ignores the order and gives both ‘want to’ and ‘to want’
combo +t*CHI +s"want^to" +x sarah034.cha
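The difference between an ordered and an order-free search can be sketched in Python. The functions are hypothetical, and the order-free version assumes that the two words still have to be adjacent, which is one reading of the slide; the CLAN manual defines what +x does exactly.

```python
def has_sequence(tokens, first, second):
    """True if `first` is immediately followed by `second`."""
    return any(a == first and b == second for a, b in zip(tokens, tokens[1:]))

def has_pair_any_order(tokens, first, second):
    """True if the two words are adjacent in either order (assumed reading of +x)."""
    return has_sequence(tokens, first, second) or has_sequence(tokens, second, first)

ordered = "I want to go".split()
reversed_order = "to want more".split()

print(has_sequence(ordered, "want", "to"))
print(has_sequence(reversed_order, "want", "to"))
print(has_pair_any_order(reversed_order, "want", "to"))
```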
Pitfalls and limitations • Cannot test for acceptability or ungrammaticality • Be aware of: • Routines • Imitations • Speech errors • Mistranscriptions
Protocol • CHILDES transcripts were collected with great effort and are now freely available. In return for using them, credit the creators by citing their work. • Cite the latest edition of MacWhinney’s book: MacWhinney, B. (2000). The CHILDES project: Tools for analyzing talk (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates. • Cite the publication selected by the creator(s) of the database(s) you have used. • References can be found in the ‘database manuals’ on the site