Estimating Rates Of Lexical Change

Estimating Rates Of Lexical Change. Andrew Meade a.meade@reading.ac.uk University of Reading. Rates Of Lexical Change. Lexical rate variation Inferring evolutionary histories Calculating rates of lexical evolution Underlying processes. 15 lines representing a bull.


Presentation Transcript


  1. Estimating Rates Of Lexical Change Andrew Meade a.meade@reading.ac.uk University of Reading

  2. Rates Of Lexical Change • Lexical rate variation • Inferring evolutionary histories • Calculating rates of lexical evolution • Underlying processes

  3. 15 lines representing a bull

  4. Phylogenetic Comparative Methods • Popular in biology for 20 years • Ancestral states, correlated evolution, rates of evolution, and hypothesis testing • Traditional statistics: assume the data are independent • Comparative methods (figure)

  5. The Language ‘gene’ • Swadesh list, Morris Swadesh, 1940 onwards • 200 meanings forming a basic vocabulary • Chosen to be stable, fundamental and resistant to borrowing • 95 Indo-European languages + Hittite and Tocharian

  6. Cognate classes • Words with a common evolutionary ancestry • Class “Fish”: English Fish, Danish Fisk, Dutch Visch, + 34 other languages • Class “Ryba”: Czech Ryba, Russian Ryba, Bulgarian Riba, + 23 other languages
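
The cognate classes above can be turned into the 0/1 (non-cognate/cognate) coding that the phylogenetic inference slide refers to. The sketch below is only an illustration: the function name `binary_code` and the dictionary layout are assumptions, not the author's data format, and only the six languages named on the slide are included.

```python
# Cognate judgements for the meaning "fish", taken from the slide.
# Only the six languages named there are listed.
cognate_class = {
    "English": "Fish",
    "Danish": "Fish",
    "Dutch": "Fish",
    "Czech": "Ryba",
    "Russian": "Ryba",
    "Bulgarian": "Ryba",
}


def binary_code(classes):
    """Return {class_label: {language: 0 or 1}} presence/absence coding.

    Hypothetical helper: 1 means the language's word belongs to the
    cognate class, 0 means it does not.
    """
    labels = sorted(set(classes.values()))
    return {
        label: {lang: int(cls == label) for lang, cls in classes.items()}
        for label in labels
    }


coded = binary_code(cognate_class)
for label, row in coded.items():
    print(label, [row[lang] for lang in cognate_class])
```

Each cognate class becomes one binary character, so a meaning with many classes contributes many columns to the data matrix.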

  7. IE cognate classes per meaning • Minimum 1: “Who”, “Three” • Average 17 • Maximum 35: “Person”, “Dirty”

  8. [Figure with axes: Languages, Meanings]

  9. Phylogenetic inference • States: 0 = non-cognate, 1 = cognate • Transition rates: Q01 (0 → 1) and Q10 (1 → 0) • Time scale: 1000 years • [Tree figure; tip states: 0 0 0 0 0 0 1 1]
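
A two-state model with rates Q01 and Q10 is usually treated as a continuous-time Markov chain, for which the probability of changing state along a branch of length t has a closed form. The sketch below assumes that standard model; the slide does not spell it out, and the rate values and function name are hypothetical.

```python
import math


def transition_probs(q01, q10, t):
    """Transition matrix P(t) for a two-state continuous-time Markov chain.

    q01 is the rate 0 -> 1 (gain of a cognate), q10 the rate 1 -> 0 (loss).
    Returns [[P00, P01], [P10, P11]] for a branch of length t.
    """
    total = q01 + q10
    decay = math.exp(-total * t)
    p01 = (q01 / total) * (1.0 - decay)
    p10 = (q10 / total) * (1.0 - decay)
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]


# Hypothetical rates, in changes per 1000 years; t is in the same unit.
P = transition_probs(q01=0.2, q10=0.5, t=1.0)
for row in P:
    print([round(p, 4) for p in row])
```

At t = 0 the matrix is the identity, and for long branches each row approaches the stationary frequencies q10/(q01+q10) and q01/(q01+q10).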

  10. MCMC Phylogenetic inference • Creates a statistically justified sample of trees • Samples trees in proportion to their probability • Used to correct for the non-independence in the data • Results = Data + Method
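
"Sampling trees in proportion to their probability" can be illustrated with a toy Metropolis sampler. This is a minimal sketch, not the software used in the talk: the three "trees" and their probabilities are made up for illustration, and a real phylogenetic sampler proposes tree rearrangements rather than picking states uniformly.

```python
import math
import random


def metropolis(log_prob, n_steps, seed=0):
    """Toy Metropolis sampler over a small discrete set of states.

    log_prob[i] is the (unnormalised) log probability of state i.
    Proposals are drawn uniformly, so the proposal is symmetric and the
    plain Metropolis acceptance rule applies.
    Returns how many steps were spent in each state.
    """
    rng = random.Random(seed)
    n_states = len(log_prob)
    current = rng.randrange(n_states)
    counts = [0] * n_states
    for _ in range(n_steps):
        proposal = rng.randrange(n_states)
        log_ratio = log_prob[proposal] - log_prob[current]
        # Accept better states always, worse states with probability ratio.
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            current = proposal
        counts[current] += 1
    return counts


# Hypothetical example: three candidate trees with probabilities 0.6, 0.3, 0.1.
toy_log_prob = [math.log(0.6), math.log(0.3), math.log(0.1)]
n_steps = 50_000
counts = metropolis(toy_log_prob, n_steps)
print([round(c / n_steps, 3) for c in counts])
```

The fraction of steps spent on each tree approximates its probability, which is what lets later analyses average over the tree sample instead of trusting a single tree.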

  11. Random tree: -58204 log units • Most probable tree • 4.1 × 10^14107 • Infinite number of poor trees

  12. [Tree figure; clade labels: Outgroup, Greek, Indo-Iranian, Slavic, Celtic, Germanic, Romance]

  13. Inferring lexical rates • “Name”, 3 cognate classes • Class A: Gypsy (Alav), Persian (Esm) • Class B: Latvian (Vards), Lithuanian (Vardas) • Class C: all the rest, e.g. Hindi (Nam), Greek (Onoma), Italian (Nome) • The estimated instantaneous transition rates: B → A, C → B, etc. • Six rates among the three classes: A → B, A → C, B → A, B → C, C → A, C → B • Too many parameters, not enough data
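
The "too many parameters" point follows from simple counting: a model with a separate rate for every ordered pair of k cognate classes needs k(k-1) rates. The short sketch below runs that arithmetic for the class counts mentioned in this talk (3 for “Name”, an average of 17, a maximum of 35); the function name is an assumption.

```python
def n_rate_parameters(k):
    """Number of directed transition rates among k cognate classes."""
    return k * (k - 1)


# 2 = binary coding, 3 = "Name", 17 = average, 35 = maximum per meaning.
for k in (2, 3, 17, 35):
    print(k, "classes ->", n_rate_parameters(k), "rates")
```

With a single word observed across fewer than a hundred languages, hundreds of free rates cannot be estimated, which motivates collapsing each meaning to a simpler model.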

  14. Inferring lexical rates • 2 cognate classes: Class 1, Class 2 • Slow rate • Fast rate

  15. “Salt” “Red” “Five”

  16. Mean rates for the 200 words • Mean = 3.05 ± 1.82 • Median = 2.74 • Min. = 0.09 • Max. = 9.27 • 100-fold difference • Slow: ‘two’, ‘who’, ‘one’, ‘night’, ‘to die’ • Fast: ‘dirty’, ‘to turn’, ‘to stab’

  17. Word half-life • 50% chance of the word being replaced by a non-cognate form • Based on IE being 8000 years old

  18. I-E tree showing variation in rates of lexical replacement, per 10k years • “One”: 0.43 • “Ear”: 0.88 • “Sand”: 4.5 • [Tree clade labels: Romance, Germanic, Slavic, Indo-Iranian, Greek]
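
The word half-life and the per-10k-year rates can be connected with one formula if replacement is assumed to happen at a constant rate r, so that the chance a word has not yet been replaced after time t is exp(-r t). Under that assumption (the slides do not state how the half-lives were computed) the half-life is ln 2 / r. A minimal sketch using the three rates quoted for “One”, “Ear” and “Sand”:

```python
import math

RATE_UNIT_YEARS = 10_000  # the quoted rates are per 10k years

# Rates of lexical replacement as quoted on the I-E tree slide.
rates = {"One": 0.43, "Ear": 0.88, "Sand": 4.5}


def half_life_years(rate_per_unit, unit_years=RATE_UNIT_YEARS):
    """Time after which there is a 50% chance the word has been replaced.

    Assumes a constant replacement rate (exponential waiting time).
    """
    return math.log(2) / rate_per_unit * unit_years


for word, rate in rates.items():
    print(f"{word}: about {half_life_years(rate):.0f} years")
```

On this assumption a slowly evolving word such as “One” has a half-life longer than the 8000-year age assumed for Indo-European, while “Sand” turns over in well under two millennia.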

  19. Approximately 100-fold variation in rates of word evolution • Cultural replicators can evolve more slowly than some human genes (e.g., compare 'five' with the lactase gene) • Possibility of deep linguistic reconstructions • What processes explain the variation?

  20. Spoken word frequency • British National Corpus • N = 4840 words • Mean = 194 • Geometric mean = 35.94 • Median = 25

  21. Distribution of frequency of word use (20-100 million words) • Most words used < 100 times per million

  22. Correlations between frequencies of word use • r = 0.88 • r = 0.87 • r = 0.87 • Frequency of use is very stable throughout IE

  23. Frequency vs rate of lexical evolution • r = -0.37 • r = -0.35 • r = -0.41 • r = -0.32
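
To compare these correlations with the R² values reported once parts of speech are included, it helps to square them: for a simple linear relationship, r² is the share of variance in rate accounted for by frequency alone. The sketch below just performs that arithmetic on the four r values from the slide; which panel each value belongs to is not recoverable from the transcript.

```python
# Correlations between word-use frequency and rate of lexical evolution,
# as listed on the slide (one per panel).
correlations = [-0.37, -0.35, -0.41, -0.32]


def variance_explained(r):
    """Share of variance accounted for by a simple linear fit with correlation r."""
    return r * r


shares = [variance_explained(r) for r in correlations]
for r, share in zip(correlations, shares):
    print(f"r = {r:+.2f} -> r^2 = {share:.3f}")
```

Frequency alone therefore accounts for roughly 10 to 17 percent of the variation in rates, noticeably less than the figure of about 50% quoted in the summary for the simple model.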

  24. Parts of speech • Legend: conjunctions, prepositions, adjectives, verbs, nouns, special adverbs, pronouns, numbers • R² = 0.48, 0.48, 0.48, 0.50 (one per panel) • Numbers, pronouns, special adverbs: stronger selection?

  25. Summary • Simple model accounts for 50% of variation in rates of evolution across 87 languages representing ~130,000 years of evolution • Spoken word frequency seems to exert a general influence on rates of word evolution • High-frequency words are less likely to be borrowed • Languages evolve initially in less frequently used parts of vocabulary, retaining mutual intelligibility • Cultural replicators can evolve more slowly than some human genes (e.g., compare 'five' with the lactase gene)

  26. Acknowledgements • Mark Pagel • Quentin Atkinson • Russell Gray • ACET

  27. Some similarities between linguistic and genetic systems
