Calculating the geospatial cost of sound change Mark Livengood, Thomas Purnell, Eric Raimy and Joseph Salmons University of Wisconsin-Madison
Goals • To advance understanding of the relationship between geographically-coded data and language data • To transform our notion of dialect and speech community based on geographical, demographic and social distribution of multiple features
Overview We know that /æ/ is changing in this region and this time period. Question: How does that change spread over time and space? • Geo-social structures (the gravity model; Trudgill 1974) can trump straight-line geography (the wave model; Schmidt 1872) • Georeferenced social factors add value (Britain 2002; Lee & Kretzschmar 1993)
The issue of scale The most important related work (e.g. Trudgill, Lee & Kretzschmar, Labov) has focused on vast areas; Grieve et al. use North America. We start from this position: language and social structure are local. We use data that is more representative than ANAE, and we measure diphthongization.
Neighborhoods by demographics x distance x linguistic features Chambers and Trudgill (1998: 178ff): cross-city influence matrix P=population of geographic center d=distance between centers S=index of linguistic similarity
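The formula itself did not survive on this slide. As a minimal sketch, assume the form in which Trudgill's gravity model is usually cited, I_ij = S · (P_i · P_j / d_ij²) · (P_i / (P_i + P_j)), where I_ij is the influence of center i on center j. The function names and the populations and distances below are hypothetical, chosen only to illustrate the calculation.

```python
from typing import Dict, Tuple


def influence(pop_i: float, pop_j: float, distance: float,
              similarity: float = 1.0) -> float:
    """Influence of center i on center j (assumed gravity formulation)."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    interaction = (pop_i * pop_j) / distance ** 2   # gravity term
    dominance = pop_i / (pop_i + pop_j)             # share held by the source center
    return similarity * interaction * dominance


def influence_matrix(pops: Dict[str, float],
                     dists: Dict[Tuple[str, str], float],
                     similarity: float = 1.0) -> Dict[Tuple[str, str], float]:
    """Return {(i, j): influence of i on j} for every ordered pair of distinct centers."""
    matrix = {}
    for i in pops:
        for j in pops:
            if i == j:
                continue
            # Distances are symmetric, so accept either key order.
            d = dists.get((i, j), dists.get((j, i)))
            if d is None:
                raise KeyError(f"no distance given for {i} and {j}")
            matrix[(i, j)] = influence(pops[i], pops[j], d, similarity)
    return matrix


# Hypothetical centers: populations in arbitrary units, one distance between them.
pops = {"A": 4.0, "B": 1.0}
dists = {("A", "B"): 2.0}
m = influence_matrix(pops, dists)
```

The matrix is asymmetric by construction: the larger center A influences B more than B influences A, which is the property that lets the gravity model depart from straight-line distance.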
Sociolinguistic Literature tells us … • Language varies … • As individuals speak to one another (locality) • Language is a brokered agreement between humans and used for various ends • Both within and across geographic domains (identity) (historical continuity) (translocational communication) • E.g., Blacks share markers with whites within a location differentiating them from Blacks elsewhere; yet, speakers often share pan-AAE markers
Geolinguistic Literature tells us … • Language varies … • By presence/absence of barriers (boundary conditions) • By sphere of influence to immediately smaller locations where similarity and status matter (gravity) • E.g., Chicago to Rockford and St. Louis • By large sweeping patterns where distance matters (wave) • E.g., CAUGHT~COT merger in US
Social Science Literature tells us … • Local knowledge varies … • In a rapidly decaying fashion (rapid decay) • E.g., there is a ‘nearness’ factor and not all data points have equal influence over each other • Multiple factors influence the spread (or not) of local knowledge (regressive covariation) (costly) • E.g., cost involved with transferring information regarding competition and cooperation
Features of Model • Locality, identity & historical continuity by community: geographic and social barriers • Gender, ethnicity, age, immigration, topography • Gravity & rapid decay: attraction by population centers within proximate range based on population • Regressive covariation & cost: varying weights and multiple solutions by location • Wave & measurable features: known markers that spread
Speakers • 20 speakers from WELS and DARE datasets • 1870s: 2 • 1880s & 1890s: 4 + 2 • 1900s & 1910s: 4 + 2 • 1920s, 1930s, & 1940s: 3 + 1 + 2 • 16 Locations in WI
Idealized Model This model accounts for regressive covariation and cost. For some speech knowledge qua behavior in locale ℓ, $K_\ell = \beta_{K1} S_\ell + \beta_{K2} G_\ell + \varepsilon_K$, where K is a proxy for knowledge output (acoustic measures), $S_\ell$ = social factors, and $G_\ell$ = geographic factors.
Society • Locality, identity & historical continuity by community: geographic and social barriers • Gender, ethnicity, age, immigration, topography • $S_\ell = \beta_{S1} F_\ell + \beta_{S2} E_\ell + \beta_{S3} \log(L_\ell) + \beta_{S4} \log(W_\ell) + \varepsilon_S$ • F = % of population, foreign born in 1900 • E = % of population, black in 1900 • L = value of livestock in 1900 • W = total manufacturing wages in 1900
Geography Gravity & rapid decay: attraction by population centers within proximate range based on population. The geographic factors can be stated similarly, as $G_\ell = \beta_{G1} \log(P_\ell) + \beta_{G2} \log(T_\ell) + \beta_{G3} B_\ell + \varepsilon_G$, where P = county population per sq. mi., T = time to Milwaukee per time to Minneapolis, and B = index of public or private transportation costs to MKE.
Geographic Measures • Designed to capture gravity and decay • Population density • 1900 population / sq mi in county • Measure of time of transportation • log(distance to Mke / distance to Minn) for locale ℓ • Negative value is beneficial • Measure of manner of transportation • Number of ‘jumps’ in transportation type, and cost of transportation (0-3) • Private is more costly than public • Train is more costly than bus
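The two gradient measures can be sketched as follows. This is an illustration under assumptions: the function names are hypothetical, the inputs are invented, and the slide does not say which logarithm base was used, so the natural log is assumed.

```python
import math


def population_density(population: float, area_sq_mi: float) -> float:
    """1900 county population per square mile."""
    if area_sq_mi <= 0:
        raise ValueError("area must be positive")
    return population / area_sq_mi


def travel_differential(to_milwaukee: float, to_minneapolis: float) -> float:
    """Log ratio of travel to Milwaukee over travel to Minneapolis.

    Negative values mean the locale is closer to Milwaukee than to
    Minneapolis; zero means the two are equally far away.
    """
    if to_milwaukee <= 0 or to_minneapolis <= 0:
        raise ValueError("travel measures must be positive")
    return math.log(to_milwaukee / to_minneapolis)


# Hypothetical locale: 2 hours from Milwaukee, 6 hours from Minneapolis.
example = travel_differential(2.0, 6.0)
```

Taking the log of the ratio makes the measure symmetric around zero: a locale three times closer to Milwaukee and one three times closer to Minneapolis get values of equal size and opposite sign.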
So, what’s $K_\ell$? • Ceteris paribus, presence or absence of regional markers • /æ/ class of words • $K_\ell = \beta_{K1} \log(VD_\ell) + \beta_{K2} \log(F1N_\ell) + \beta_{K3} \log(F2N_\ell) + \beta_{K4} \log(TL_\ell) + \beta_{K5} \log(\Theta_\ell) + \varepsilon_K$ • Speaker variables • birth year • gender
Vowel Measures • Recordings of “Arthur the Rat” • Extracted from WELS/DARE recordings • Aligned TextGrid for Praat from Penn Aligner • Corrected edges of /æ/ and neighboring segments • Post processing selection • Pre-obstruent V > 40 msec in the front of the vowel space • /æ/ measures from Praat: VD=vowel duration, F1N=nucleus height, F2N=nucleus backness, TL=trajectory length, Θ=trajectory angle
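The slide names trajectory length and trajectory angle but does not define them. One common way to derive such measures, assumed here rather than taken from the study, is to treat the nucleus and the glide as two points in F1 x F2 space, with TL as the Euclidean distance between them and Θ as the direction of movement. The function name and formant values are hypothetical.

```python
import math
from typing import Tuple

Formants = Tuple[float, float]  # (F1, F2) in Hz


def trajectory(nucleus: Formants, glide: Formants) -> Tuple[float, float]:
    """Return (length in Hz, angle in degrees) of movement from nucleus to glide.

    Assumption: the angle is atan2(delta F1, delta F2), so 0 degrees is
    movement along F2 only and a negative angle means F1 is falling
    (the vowel is raising).
    """
    d_f1 = glide[0] - nucleus[0]
    d_f2 = glide[1] - nucleus[1]
    length = math.hypot(d_f1, d_f2)
    angle = math.degrees(math.atan2(d_f1, d_f2))
    return length, angle


# Hypothetical token: nucleus at (700, 1800) Hz, glide at (400, 2200) Hz.
tl, theta = trajectory((700.0, 1800.0), (400.0, 2200.0))
```

A monophthong measured this way has a trajectory length near zero, which is why TL can serve as a gradient index of diphthongization.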
Preliminaries Problem 1: acoustic similarity and grouping speakers with respect to birth year and gender Problem 2: Covariance matrix for Geography Problem 3: Covariance matrix for Society
K = Acoustic similarity • Cluster analysis on individual characteristics • First removed one speaker, an outlier on vowel height • New N = 19; the removed speaker came from one of the communities with two speakers • Clusters emerge, but they are driven by birth year and gender • 1. males of all ages • 2. females born before 1900 • 3. females born after 1900
Geographic measures • Recall: two gradient measures • Travel time differential to Milwaukee • Population density • Linear covariation is near significant • R² = 0.15, p = 0.056 • One potential outlier; removing it would make the relation significant • Selected transportation time • Transportation captures density
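The covariation check between two gradient measures amounts to a simple linear fit. A minimal sketch of the R² calculation, assuming a plain Pearson correlation; the function name and sample values are hypothetical, and the p-value (which needs a t-test on n − 2 degrees of freedom) is not computed here.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("a sample with zero variance has no correlation")
    return s_xy ** 2 / (s_xx * s_yy)


# Hypothetical values standing in for travel differential and density.
weak = r_squared([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0])
```

When two predictors share much of their variance, keeping only one of them, as the slide does with transportation time, avoids splitting a single effect across two coefficients.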
Society measures • Recall 4 measures • Urban class, rural class, ethnicity, immigration • Covary? • Rural class with urban class (R² = 0.19, p<0.05) • Rural covaries with transportation time (R² = 0.39, p<0.05); urban doesn’t • Immigrants with rural class (R² = 0.48, p<0.05) and urban class (R² = 0.41, p<0.05) • Ethnicity does not covary with urban class or transportation
Revised (realistic) model • Dependent variable: individual acoustic measures • Independent variables: urban class + ethnicity + transportation time • Weighted by speaker class (birth year, gender)
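A weighted regression of this shape can be sketched with the normal equations, (XᵀWX)β = XᵀWy. This is an illustration only: the slides do not say how the speaker-class weights were derived or which software fit the model, so the function name, the weights, and the data below are all hypothetical.

```python
from typing import List, Sequence


def weighted_least_squares(X: Sequence[Sequence[float]],
                           y: Sequence[float],
                           w: Sequence[float]) -> List[float]:
    """Solve (X'WX) b = X'Wy by Gauss-Jordan elimination.

    Each row of X is one speaker: a leading 1 for the intercept, then the
    predictors. w holds one non-negative weight per speaker.
    """
    k = len(X[0])
    # Augmented normal-equation matrix [X'WX | X'Wy].
    A = [[sum(wi * row[r] * row[c] for row, wi in zip(X, w)) for c in range(k)]
         + [sum(wi * row[r] * yi for row, yi, wi in zip(X, y, w))]
         for r in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Hypothetical data: intercept plus two predictors, with y = 1 + 2*x1 - 3*x2.
X = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 2, 1]]
y = [1, 3, -2, 0, 2]
w = [1, 2, 1, 3, 1]  # made-up speaker-class weights
coefs = weighted_least_squares(X, y, w)
```

Because the made-up y values fit the line exactly, any positive weights recover the same coefficients; with real acoustic measures the weights change how much each speaker class pulls the fit.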
Not significant • Vowel backness • Vowel height • Angle of trajectory
Significant relation 1 Duration x urban social class
Significant relation 2 Trajectory length x transportation time
Whence straight-line distances? • Longitude is significant for vowel trajectory and almost for duration • Neither latitude nor longitude is significant for the other three measures • Interpretation • Bias toward westward settlement patterns • For eastward moving CAUGHT~COT expect inverse relation
Summary • Clarification of the broad sociolinguistic category of “geography” • Parametric power: encodes distance and population • Reduces complex matrix of Chambers & Trudgill • Broadly reconceptualizes the notion of “geography” • Lx measures = urban class + ethnicity + transportation time, weighted by age and gender • Keeps the focus local
Geographic influence on language variation? • Testing to see if georeferenced data is better than straight-line distance • Knew this going in, but need to demonstrate it because current studies continue to ignore it • Some features do fall out by longitude (duration, trajectory length); how many other studies' results arise because the source of change sits at the statistical corner of the analysis space? • Transportation time should overcome this problem because it doesn’t matter which direction one comes from.
Future work • Convert county data to more local data (April 2, 2012??) • Will permit more robust GIS computation • Better treatment of biases • Ethnicity • Immigration • Geography • Build continuity with new data collections
Thanks! UW Graduate School The Dictionary of American Regional English Wisconsin Englishes Project (Luke Annear, Trini Stickle, Nick Williams)