Introducing Phon 1.4

Introducing Phon 1.4 • Yvan Rose • Memorial University of Newfoundland

Acknowledgements
Funding: Current development of Phon and PhonBank is supported by the National Institutes of Health. Earlier development of Phon was funded by grants from National Science Foundation, Canada Fund for Innovation, Social Sciences and Humanities Research Council of Canada, Petro-Canada Fund for Young Innovators, and the Office of the Vice-President (Research) and the Faculty of Arts at Memorial University of Newfoundland.
Dictionaries: Built-in dictionaries of pronounced forms were obtained from several organizations (see http://phon.ling.mun.ca/phontrac/ for details).
Special thanks: We owe special thanks to a wonderful group of early adopters and beta testers. These include researchers and PhD students from Universidade de Lisboa (Laetitia Almeida, Susana Correia, Teresa da Costa, Maria João Freitas); Universitat Autònoma de Barcelona (Joan Borràs Comes, Ana Estrella, Maria del Mar Vanrell, Pilar Prieto, Jill Thorson); Center for Advanced Research in Theoretical Linguistics (Helene Nordgård Andreassen, Bruce Morén); Universiteit Leiden (Claartje Levelt); Radboud Universiteit Nijmegen (Paula Fikkert, Nicole Altvater-Mackensen); Université Lumière Lyon 2 (Christophe dos Santos, Sophie Kern); Université Paris 3 (Aliyah Morgenstern, Naomi Yamaguchi); Université Paris 10 (Christophe Parisse); UBC (May Bernhardt, Joe Stemberger); Memorial University of Newfoundland (Lindsay Babcock, Christine Champdoizeau, Carla Peddle, Erica Davis, Sarah Knee, Megan Maloney, Ashleigh Noel, Erin Swain, Kevin Terry); Cynthia, Babs, Suzanne, Jessie, Heather, Mits, Marijn, H. Buchan…

Phon development • Development teams: • Phon team at Memorial University of Newfoundland • CHILDES team at Carnegie Mellon University • Design and implementation criteria • Reliability • Simplicity • Flexibility/Neutrality (no analytical bias) • Compatibility, Extensibility • Availability • Phon can be used for all types of transcription-based phonological research (L1, L2, Adult, disordered)

Technical overview • Programmed in Java • Main computer platforms supported • Unicode compliant • XML (TalkBank) data structure • Working toward compatibility with other TalkBank-compliant applications • Support for IPA characters and diacritics • Most features integrated within a single interface • Open-source

Phon’s interface (r)evolution • Early interfaces posed a number of problems • Cluttered in many ways • Not very flexible • Improvement of user experience, mostly based on user feedback • Comfort • Flexibility • Additional refinements from tools available in the open-source universe

Look back: Interface in Phon 1.0, 1.1

Look back: Interface in Phon 1.2, 1.3

Phon 1.4: Interface improvements - Streamlined visuals - External media player (stable; optional) - Flexible, user-defined interface - Waveform visualization

Workflow supported • Project management • Media linkage & segmentation • Data transcription • Transcript validation • Syllabification and alignment • Corpus query • Query results visualization & management

Project management

Project management • Project management from within the application • Ability to move/copy transcripts across corpora/projects • Project structure: a Project contains Corpora (Corpus 1, Corpus 2, Corpus n…), each of which contains Transcript(s)
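The Project > Corpus > Transcript hierarchy described on this slide can be pictured with a minimal sketch. The class and function names below (`Project`, `Corpus`, `Transcript`, `move_transcript`) are hypothetical and chosen for illustration only; they are not Phon's actual Java classes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Transcript:
    name: str
    # Assumption for illustration: a transcript links to at most one media file.
    media_file: Optional[str] = None


@dataclass
class Corpus:
    name: str
    transcripts: Dict[str, Transcript] = field(default_factory=dict)


@dataclass
class Project:
    name: str
    corpora: Dict[str, Corpus] = field(default_factory=dict)


def move_transcript(name: str, source: Corpus, destination: Corpus) -> None:
    """Move a transcript between corpora (possibly in different projects)."""
    if name in destination.transcripts:
        raise ValueError(f"{name!r} already exists in {destination.name!r}")
    # pop() raises KeyError if the transcript is not in the source corpus.
    destination.transcripts[name] = source.transcripts.pop(name)


# Usage: one project, two corpora, one transcript moved across.
anne = Corpus("Anne", {"session01": Transcript("session01", "session01.mov")})
ben = Corpus("Ben")
project = Project("Demo", {"Anne": anne, "Ben": ben})
move_transcript("session01", anne, ben)
```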


Media linkage and segmentation

Media linkage and segmentation • For projects based on multimedia data • Linkage of media file to transcript • Identification of the time intervals that are relevant for research, for each participant • Media playback • Whole media • Segmented portion • (Scene playback not yet implemented in 1.4) • One crucial constraint: no more than one media file can be associated with a transcript (session)

Data transcription

Data transcription • Support for IPA transcriptions • Built-in IPA chart • Dictionaries of pronounced forms • Languages supported: Catalan, German, English, French, Icelandic, Italian, Dutch, and Spanish • Support for ‘sandhi’ rules • English plurals (e.g. cat[s] versus dog[z]) • French contractions (e.g. l’ami) • Dictionary utility (proof of concept) • Supplements built-in dictionaries
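The English plural example (cat[s] versus dog[z]) can be illustrated with a small sketch of the textbook allomorphy rule. This is not Phon's own sandhi implementation, which is not described here; the function names and the phone sets below are assumptions made for illustration.

```python
# Textbook rule: [ɪz] after sibilants, [s] after other voiceless
# consonants, [z] elsewhere (voiced consonants and vowels).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_suffix(final_phone: str) -> str:
    """Pick the plural allomorph from the last phone of the stem."""
    if final_phone in SIBILANTS:
        return "ɪz"
    if final_phone in VOICELESS:
        return "s"
    return "z"


def pluralize(phones: list) -> list:
    """Return a new phone list with the plural suffix appended."""
    return phones + [plural_suffix(phones[-1])]


cats = pluralize(["k", "æ", "t"])   # ends in "s"
dogs = pluralize(["d", "ɒ", "ɡ"])   # ends in "z"
```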

Data transcription • Word grouping (for sub-utterance segmentation) • Ability to export sound/video clips • Facilitates access to acoustic measurements • (Also useful for presentation purposes) • Integrated system for multiple-blind support • Password-protected blind transcriptions • Interface for transcription validation


Transcript validation

Transcript validation • Required under the multi-blind protocol • Method based on comparisons between multiple blind transcriptions • Integrated within the session editor • Best performed by a team of transcript validators • Simultaneous listening to the media • Exporting of sound clips for acoustic measurement • Selection of the most accurate transcription • Further adjustment of the selected form if needed
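The comparison step in validation can be sketched as a tally of the blind transcriptions for one record, so that validators see at a glance where transcribers disagree. This is a minimal illustration, assuming transcriptions are plain IPA strings keyed by a transcriber label; `compare_blind_transcriptions` is a hypothetical helper, not part of Phon.

```python
import unicodedata
from collections import Counter


def compare_blind_transcriptions(transcriptions: dict):
    """Return (unanimous, ranked) for one record.

    transcriptions maps a transcriber label to an IPA string.
    ranked lists (form, count) pairs, most frequent first.
    """
    # Normalize so that visually identical IPA strings compare equal.
    forms = [unicodedata.normalize("NFC", t) for t in transcriptions.values()]
    counts = Counter(forms)
    return len(counts) == 1, counts.most_common()


unanimous, ranked = compare_blind_transcriptions(
    {"transcriber_A": "kæt", "transcriber_B": "kæt", "transcriber_C": "kæ"}
)
```

The final choice still rests with the validators; a majority count only suggests where to listen first.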


Syllabification and alignment

Syllabification and alignment • Automatic labeling of segments for syllable information • Support for: • Various languages • Various theoretical assumptions • Automatic alignment of transcribed phones in IPA Target-Actual pairs of transcribed words • Required for comparison, process identification • In all cases: ultimate control by user
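To give a feel for Target-Actual phone alignment, here is a minimal sketch based on plain edit distance (dynamic programming with unit costs). Phon's own alignment method is not described on these slides and may well differ, for example by using phonetic similarity; the `align` function below is a hypothetical stand-in.

```python
def align(target: list, actual: list) -> list:
    """Align two phone lists; None marks a deleted or inserted phone."""
    n, m = len(target), len(actual)
    # cost[i][j] = minimal edit cost of aligning target[:i] with actual[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = cost[i - 1][j - 1] + (target[i - 1] != actual[j - 1])
            cost[i][j] = min(substitution, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (target[i - 1] != actual[j - 1])):
            pairs.append((target[i - 1], actual[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((target[i - 1], None))   # target phone deleted
            i -= 1
        else:
            pairs.append((None, actual[j - 1]))   # phone inserted in actual
            j -= 1
    pairs.reverse()
    return pairs


# Cluster reduction: target "stɑp" produced as "tɑp".
aligned = align(["s", "t", "ɑ", "p"], ["t", "ɑ", "p"])
```

Aligned pairs of this kind are what make process identification possible: a `(phone, None)` pair is a deletion, and a pair of two different phones is a substitution.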


Data compilations

Data compilations • Search ‘plugin’ system • Ability to create own search plugin without reprogramming Phon • Based on JavaScript (ECMAScript) scripting • Support for text, phonological expressions and regular expressions • Built-in script editor • Persistent search results • Queries and results saved in a relational database • Query history

Data compilations • Inventories (phones, syllable types, stress patterns) • Textual or phonological data • Character strings, feature sets, syllable positions • Syllable types • Aligned phones and groups • Consonant and vowel harmony • Consonant metathesis • Combinations of the above • You can even build your own!
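As a rough picture of what a phone inventory and a regular-expression query compute, here is a minimal sketch over a toy list of transcribed words. It is written in Python for brevity and does not use Phon's query scripting interface; `phone_inventory` and `search` are hypothetical helpers, and each word is assumed to be a list of IPA phones.

```python
import re
from collections import Counter


def phone_inventory(words: list) -> Counter:
    """Count how often each phone occurs across all words."""
    return Counter(phone for word in words for phone in word)


def search(words: list, pattern: str) -> list:
    """Return the words whose joined transcription matches the pattern."""
    rx = re.compile(pattern)
    return [word for word in words if rx.search("".join(word))]


words = [["k", "æ", "t"], ["d", "ɒ", "ɡ"], ["t", "æ", "p"]]
inventory = phone_inventory(words)
# Words starting with [k] or [t] followed by [æ].
hits = search(words, r"^[kt]æ")
```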

Data visualization and reporting • Search results integrated with the session editor • Session opens as results are displayed • Results summaries • Clipboard-accessible (for quick usage in other applications) • Report format • Several file formats supported • Various result types combined into reports Let’s see some real action!

Other useful features • Multiple undos supported in data tiers • User-defined tiers • Data copy to clipboard (tier, record, session) • Audio/video clip export • Compatibility with CLAN through data conversion utilities • CHAT2XML, XML2Phon (lots accomplished on this front in recent months) • XML and CSV data import and export (to be reimplemented soon) • Discussion group

On the horizon: Phon ⇔ Praat • Integration between Phon and Praat, through TextGrid • Acoustic measurements performed within Praat • Integration of Praat-generated data in Phon data compilations • E.g. Get F0, intensity and duration data for all mid vowels in word-initial, stressed syllables • Fuller vision to be expressed in Paul’s talk • Building on this development: • Platform for testing transcription accuracy • Support for tonal/intonational coding

Longer-term goal • Phonological data • Acoustic data • CLAN, ELAN, SFS…

Some areas of contribution • Easier exchange between researchers • Study and comparisons of corpora • Within and across languages, populations, … • Better understanding of: • Linguistic phenomena • Acquisition-related patterns • Speech impediments • More efficient educational and clinical interventions • Development and verification of theoretical models

Thanks for your attention!
Phon and user manual: http://childes.psy.cmu.edu/phon/ http://phon.ling.mun.ca/phontrac/
Corpora: http://childes.psy.cmu.edu/data/PhonBank/ http://childes.psy.cmu.edu/data/PhonBank-Phon/
Discussion Group: phon@googlegroups.com
Technical forums: http://phon.ling.mun.ca/phontrac/
Questions, feedback: yrose@mun.ca
