A Stochastic, Corpus-Based Approach to Mid-Vowel Distribution in Italian

Erica Cei, UCLA, Spring 2012. Presentation based on a Linguistics 199 project done under Prof. Bruce Hayes in Fall 2011.

Presentation Transcript


  1. A Stochastic, Corpus-Based Approach to Mid-Vowel Distribution in Italian
     Erica Cei, UCLA, Spring 2012
     Presentation based on a Linguistics 199 project done under Prof. Bruce Hayes in Fall 2011

  2. Italian: a 7-vowel system
     • Tense vs. lax vowels: minimal pairs
       <pesca> [ˈpes.ka] ‘(s)he fishes’
       <pesca> [ˈpɛs.ka] ‘peach’
       <torre> [ˈtor.re] ‘tower’
       <torre> [ˈtɔr.re] ‘to remove’
     • Vowel reduction in unstressed syllables
       <peschina> [pesˈki.na] ‘little peach’
       <torretta> [torˈret.ta] ‘little tower’
     Vowel inventory: i, u, e, o, ɛ, ɔ, a
     Note: This study focuses on the dialect of Tuscan Italian spoken in the province of Pisa (esp. the town of Cascina).

  3. Constructing the Corpus
     • Text sources
       • Television subtitles from 2008 (Matthias Buchmeier)
       • Text of a 1923 novel (Italo Svevo, La coscienza di Zeno)
     • Consultants
       • Me (fluent heritage speaker, monolingual in Italian until age 3)
       • Parents and friends from the same region

  4. Constructing the Corpus (cont’d.)
     • Methodology
       • First rough pass: orthography to IPA (with Excel)
       • Refinement: details not in the orthography added by hand (with a custom program made by Bruce Hayes of UCLA)
         • [s]/[z] distinction, [ts]/[dz] distinction, mid vowel height, stress, transcription of foreign words
         • Tags
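The first rough pass from orthography to IPA could look something like the sketch below. This is only an illustration in Python, not the Excel formulas used in the project, which the slides do not show. The function name `rough_ipa`, the rule set, and the placeholders `E` and `O` for mid vowels of undecided height are all assumptions made for this example. Details the orthography does not encode (vowel height, [s]/[z], [ts]/[dz], stress) are deliberately left for the hand-refinement step.

```python
FRONT = "ei"  # letters that trigger the "soft" readings of c, g, sc


def rough_ipa(word: str) -> str:
    """Hypothetical first-pass converter: spelling to a rough transcription."""
    w = word.lower()
    out = []
    i = 0
    while i < len(w):
        ch = w[i]
        nxt = w[i + 1] if i + 1 < len(w) else ""
        nxt2 = w[i + 2] if i + 2 < len(w) else ""
        if ch == "c" and nxt == "h":
            out.append("k")          # <ch> is always [k]
            i += 2
        elif ch == "g" and nxt == "h":
            out.append("g")          # <gh> is always [g]
            i += 2
        elif ch == "g" and nxt == "n":
            out.append("ɲ")          # <gn> is the palatal nasal
            i += 2
        elif ch == "s" and nxt == "c" and nxt2 and nxt2 in FRONT:
            out.append("ʃ")          # <sc> before e, i
            i += 2
        elif ch == "c" and nxt and nxt in FRONT:
            out.append("tʃ")         # <c> before e, i
            i += 1
        elif ch == "g" and nxt and nxt in FRONT:
            out.append("dʒ")         # <g> before e, i
            i += 1
        elif ch == "c":
            out.append("k")          # <c> elsewhere
            i += 1
        elif ch == "z":
            out.append("ts")         # placeholder: [ts] vs. [dz] is fixed by hand
            i += 1
        elif ch in "eo":
            out.append(ch.upper())   # E, O: height not recoverable from spelling
            i += 1
        else:
            out.append(ch)           # everything else is copied unchanged
            i += 1
    return "".join(out)


for w in ["pesca", "peschina", "logica", "impegna"]:
    print(w, rough_ipa(w))
```

The output of such a pass is intentionally underspecified: every `E` and `O` marks a spot where a human still has to choose between tense and lax.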

  5. Questioning the Status of e, ɛ, o, and ɔ
     • …Is there really a phonemic distinction between tense and lax vowels?
     • In words unknown to me, I guessed the vowel height at better than chance
     • Enter the English Phonology Search software (programmed by Bruce Hayes in 2011)
     • Goal: Identify the effect of every possible preceding and following environment on mid vowels; do some favor one height over the other?
     English Phonology Search is available at linguistics.ucla.edu/people/hayes/EnglishPhonologySearch/index.htm
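The kind of environment search described here can be sketched as a simple tally. The Python below is not the English Phonology Search software; it is a minimal stand-in that assumes transcriptions in the format used on slide 2 (stress mark `ˈ` before the stressed syllable, `.` between syllables). The function name `following_contexts` is hypothetical, and it only looks at the symbol immediately after a stressed mid vowel, whereas the study examined both preceding and following environments.

```python
from collections import Counter, defaultdict

VOWELS = set("iueoɛɔa")
MID = {"e": "tense", "o": "tense", "ɛ": "lax", "ɔ": "lax"}


def following_contexts(words):
    """Count tense vs. lax stressed mid vowels by the symbol that follows them.

    '.' as a context means the vowel ends an open syllable;
    '#' means the vowel is word-final.
    """
    counts = defaultdict(Counter)
    for w in words:
        start = w.find("ˈ")
        if start == -1:
            continue  # no stress mark: skip the word
        for i in range(start + 1, len(w)):
            if w[i] in VOWELS:
                # The first vowel after the stress mark is the stressed nucleus.
                if w[i] in MID:
                    nxt = w[i + 1] if i + 1 < len(w) else "#"
                    counts[nxt][MID[w[i]]] += 1
                break
    return counts


# The example words from slide 2.
sample = ["ˈpes.ka", "ˈpɛs.ka", "ˈtor.re", "ˈtɔr.re", "pesˈki.na", "torˈret.ta"]
for context, tally in following_contexts(sample).items():
    print(context, dict(tally))
```

With a full corpus, contexts whose tallies are strongly lopsided would be the candidates for environments that favor one height.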

  6. Logistic Regression
     • Statistical model that separates out ‘overlapping’ factors (i.e. accounts for interaction effects) (with R)
     • Set up to predict the probability that a mid vowel is lax
     R (a statistics program) is available at http://www.r-project.org/
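For readers unfamiliar with the technique, a logistic regression of this kind can be sketched in a few lines. The study itself used R; the Python below is only an illustration that fits the model by plain gradient descent. The two features (`before_lateral`, `closed_syllable`), the toy data, and the function names are all invented for this example and are not taken from the corpus or from the 42 factors reported.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and intercept by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * k
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_lax(w, b, x):
    """Predicted probability that the vowel is lax, given feature vector x."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Invented toy data: each row is [before_lateral, closed_syllable]; y = 1 means lax.
X = [[1, 0], [1, 0], [1, 1], [1, 1], [0, 1], [0, 1], [0, 0], [0, 0], [0, 0], [0, 0]]
y = [1, 1, 1, 0, 1, 0, 0, 0, 1, 0]

w, b = fit_logistic(X, y)
print("weights:", w, "intercept:", b)
print("P(lax | before lateral, open syllable):", predict_lax(w, b, [1, 0]))
```

Each fitted weight estimates the contribution of one context while holding the others fixed, which is what lets the model tease apart contexts that tend to co-occur.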

  7. Results • 42 significant factors:

  8. Some Highlights (cont’d.)
     • Front and back vowels pattern differently
     • Almost all significant contexts encouraged lax
     [Charts: Mid vowels before glides; Mid vowels before codas]

  9. Some Highlights
     • There are zero sequences *ɛɲ]σ
     • Mid vowels before laterals tend to be lax
     [Chart: Mid vowels before ɲ]σ; legend: Lax, Tense]

  10. Performance of the Model
      Thin line: prediction of a model that assumes tense and lax vowels are equally likely to occur in every context.
      Scatterplot: the bulge above the thin line shows that the model performs better than chance.

  11. Performance of the Model (cont’d.)
      As before, the thin line shows what we would predict if we assumed that mid vowels were equally likely to be tense or lax in any given context.
      The downward bulge shows that the model performs better than chance.

  12. Performance of the Model (cont’d.)
      • On a -1 to 1 scale, the correlation between reality and the model's predictions is 0.577565
      • Words with low probability of lax: impegna, sgombra
      • Words with medium probability of lax: aziendali, gomiti
      • Words with high probability of lax: coniugi, logica
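A correlation on a -1 to 1 scale between what the model predicts and what actually occurs can be computed as below. The slide does not say which correlation coefficient was used; this sketch assumes a Pearson correlation between the predicted probability of lax and the observed outcome coded as 0 (tense) or 1 (lax). The numbers are invented for illustration and are not the study's data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented example: model probabilities of lax, and what the words really have.
predicted = [0.1, 0.2, 0.5, 0.6, 0.9, 0.8]
observed = [0, 0, 1, 0, 1, 1]  # 1 = lax, 0 = tense

r = pearson(predicted, observed)
print("correlation:", r)
```

A value of 0 would mean the predictions carry no information about vowel height, and 1 would mean the predictions track the outcomes perfectly.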

  13. Future steps
      • Accounting for morphology (in progress)
        • Suffixes with a stressed mid vowel, e.g. –ɛllo
        • May cause some effects to vanish
      • Wug testing
        • A wug test has been designed but needs to be run with more subjects
        • Very preliminary results suggest that the model tends to predict responses correctly

  14. Bibliography and Acknowledgements
      Ryan, Kevin. Gradient Weight in Phonology. UCLA, 2011. Web. <http://www.linguistics.ucla.edu/general/dissertations/RyanDissertationUCLA2011.pdf>.
      Many thanks to Prof. Bruce Hayes
