Introducing Phon 1.4. Yvan Rose Memorial University of Newfoundland. Acknowledgements.
Introducing Phon 1.4 • Yvan Rose • Memorial University of Newfoundland
Acknowledgements • Funding: Current development of Phon and PhonBank is supported by the National Institutes of Health. Earlier development of Phon was funded by grants from the National Science Foundation, the Canada Foundation for Innovation, the Social Sciences and Humanities Research Council of Canada, the Petro-Canada Fund for Young Innovators, and the Office of the Vice-President (Research) and the Faculty of Arts at Memorial University of Newfoundland. • Dictionaries: Built-in dictionaries of pronounced forms were obtained from several organizations (see http://phon.ling.mun.ca/phontrac/ for details). • Special thanks: We owe special thanks to a wonderful group of early adopters and beta testers. These include researchers and PhD students from Universidade de Lisboa (Laetitia Almeida, Susana Correia, Teresa da Costa, Maria João Freitas); Universitat Autònoma de Barcelona (Joan Borràs Comes, Ana Estrella, Maria del Mar Vanrell, Pilar Prieto, Jill Thorson); Center for Advanced Study in Theoretical Linguistics (Helene Nordgård Andreassen, Bruce Morén); Universiteit Leiden (Claartje Levelt); Radboud Universiteit Nijmegen (Paula Fikkert, Nicole Altvater-Mackensen); Université Lumière Lyon 2 (Christophe dos Santos, Sophie Kern); Université Paris 3 (Aliyah Morgenstern, Naomi Yamaguchi); Université Paris 10 (Christophe Parisse); UBC (May Bernhardt, Joe Stemberger); Memorial University of Newfoundland (Lindsay Babcock, Christine Champdoizeau, Carla Peddle, Erica Davis, Sarah Knee, Megan Maloney, Ashleigh Noel, Erin Swain, Kevin Terry); Cynthia, Babs, Suzanne, Jessie, Heather, Mits, Marijn, H. Buchan…
Phon development • Development teams: • Phon team at Memorial University of Newfoundland • CHILDES team at Carnegie Mellon University • Design and implementation criteria • Reliability • Simplicity • Flexibility/Neutrality (no analytical bias) • Compatibility, Extensibility • Availability • Phon can be used for all types of transcription-based phonological research (L1, L2, Adult, disordered)
Technical overview • Programmed in Java • Main computer platforms supported • Unicode compliant • XML (TalkBank) data structure • Working toward compatibility with other TalkBank-compliant applications • Support for IPA characters and diacritics • Most features integrated within a single interface • Open-source
Phon’s interface (r)evolution • Early interfaces posed a number of problems • Cluttered in many ways • Not very flexible • Improvement of user experience, mostly based on user-feedback • Comfort • Flexibility • Additional refinements from tools available in the open-source universe
Phon 1.4: Interface improvements - Streamlined visuals - External media player (stable; optional) - Flexible, user-defined interface - Waveform visualization
Workflow supported • Project management • Media linkage & segmentation • Data transcription • Transcript validation • Syllabification and alignment • Corpus query • Query results visualization & management
Project management • Project management from within the application • Ability to move/copy transcripts across corpora/projects • Project structure: Project → Corpus 1, Corpus 2, Corpus n… → Transcript(s) under each corpus
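The project structure above (a project holding corpora, each holding transcripts) can be pictured with a minimal Python sketch. The class and function names (`Project`, `Corpus`, `move_transcript`) are hypothetical illustrations, not part of Phon's actual code base.

```python
from dataclasses import dataclass, field


@dataclass
class Corpus:
    name: str
    # Hypothetical storage: transcript name -> transcript content.
    transcripts: dict = field(default_factory=dict)


@dataclass
class Project:
    name: str
    # Corpus name -> Corpus.
    corpora: dict = field(default_factory=dict)

    def add_corpus(self, name):
        self.corpora[name] = Corpus(name)
        return self.corpora[name]


def move_transcript(src, dst, name, copy=False):
    """Move (or copy) one transcript between corpora.

    The two corpora may belong to the same project or to different ones.
    """
    if name not in src.transcripts:
        raise KeyError(f"no transcript named {name!r} in corpus {src.name!r}")
    if name in dst.transcripts:
        raise ValueError(f"corpus {dst.name!r} already has a transcript {name!r}")
    if copy:
        dst.transcripts[name] = src.transcripts[name]
    else:
        dst.transcripts[name] = src.transcripts.pop(name)
```

Refusing to overwrite an existing transcript in the destination is a design choice of this sketch only; the slides do not say how Phon handles name clashes.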
Media linkage and segmentation • For projects based on multimedia data • Linkage of media file to transcript • Identification of the time intervals that are relevant for research, for each participant • Media playback • Whole media • Segmented portion • (Scene playback not yet implemented in 1.4) • One crucial constraint: no more than one media file can be associated with a transcript (session)
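A rough sketch of the one-media-file-per-session constraint and of per-participant time intervals, assuming times in seconds. The names `Session`, `Segment`, `link_media` and `add_segment` are invented for illustration and do not reflect Phon's internals.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Segment:
    participant: str
    start: float  # seconds from the start of the media file
    end: float


@dataclass
class Session:
    name: str
    media: Optional[str] = None
    segments: list = field(default_factory=list)

    def link_media(self, path):
        # Enforce the constraint from the slide: at most one media file.
        if self.media is not None and self.media != path:
            raise ValueError("a session can be linked to only one media file")
        self.media = path

    def add_segment(self, participant, start, end):
        if self.media is None:
            raise ValueError("link a media file before segmenting")
        if not 0 <= start < end:
            raise ValueError("expected 0 <= start < end")
        segment = Segment(participant, start, end)
        self.segments.append(segment)
        return segment

    def segments_for(self, participant):
        """Return the intervals identified for one participant."""
        return [s for s in self.segments if s.participant == participant]
```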
Data transcription • Support for IPA transcriptions • Built-in IPA chart • Dictionaries of pronounced forms • Languages supported: Catalan, German, English, French, Icelandic, Italian, Dutch, and Spanish • Support for ‘sandhi’ rules • English plurals (e.g. cat[s] versus dog[z]) • French contractions (e.g. l’ami) • Dictionary utility (proof of concept) • Supplements built-in dictionaries
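The English plural example (cat[s] versus dog[z]) can be sketched as a simple rule on the final phone of the stem. This is a simplified illustration of the general phonological pattern, not Phon's dictionary code; the phone sets and the function names are assumptions, and the sibilant case ([ɪz]) is added here for completeness.

```python
# Assumed, simplified phone classes for English.
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ"}


def plural_suffix(final_phone):
    """Pick the plural allomorph from the stem-final phone."""
    if final_phone in SIBILANTS:
        return "ɪz"
    if final_phone in VOICELESS:
        return "s"
    return "z"  # voiced consonants and vowels


def pluralize(phones):
    """Return the phone list of the plural form of a stem."""
    return phones + [plural_suffix(phones[-1])]
```

For example, `pluralize(["k", "æ", "t"])` ends in `"s"`, while a stem ending in a voiced stop ends in `"z"`.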
Data transcription • Word grouping (for sub-utterance segmentation) • Ability to export sound/video clips • Facilitates access to acoustic measurements • (Also useful for presentation purposes) • Integrated system for multiple-blind support • Password-protected blind transcriptions • Interface for transcription validation
Transcript validation • Required under the multi-blind protocol • Method based on comparisons between multiple blind transcriptions • Integrated within the session editor • Best performed by a team of transcript validators • Simultaneous listening to the media • Export of sound clips for acoustic measurement • Selection of the most accurate transcription • Further adjustment of the selected form if needed
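The comparison step can be pictured as follows: given the blind transcriptions of one record, list the competing forms and flag whether the transcribers agree. This is only a sketch with a hypothetical function name; in the protocol described above, the final choice is made by human validators listening to the media, not by a vote count.

```python
from collections import Counter


def compare_blind(transcriptions):
    """Summarize blind transcriptions of a single record.

    transcriptions: dict mapping transcriber name -> IPA string.
    Returns (ranked, unanimous), where ranked is a list of
    (form, number_of_transcribers) pairs, most frequent first.
    """
    counts = Counter(transcriptions.values())
    ranked = counts.most_common()
    unanimous = len(ranked) == 1
    return ranked, unanimous
```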
Syllabification and alignment • Automatic labeling of segments for syllable information • Support for: • Various languages • Various theoretical assumptions • Automatic alignment of transcribed phones in IPA Target-Actual pairs of transcribed words • Required for comparison, process identification • In all cases: ultimate control by user
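To make the idea of Target-Actual phone alignment concrete, here is a minimal sketch that uses Python's standard `difflib.SequenceMatcher`. It matches identical phones only and does not attempt to reproduce Phon's own alignment algorithm; the function name `align` is hypothetical. Deleted or inserted phones are paired with `None`.

```python
from difflib import SequenceMatcher


def align(target, actual):
    """Pair each target phone with an actual phone (or None).

    target, actual: lists of phone strings.
    """
    pairs = []
    matcher = SequenceMatcher(None, target, actual, autojunk=False)
    for _tag, i1, i2, j1, j2 in matcher.get_opcodes():
        t_span = target[i1:i2]
        a_span = actual[j1:j2]
        # Pad the shorter side with None so every phone gets a partner.
        for k in range(max(len(t_span), len(a_span))):
            t = t_span[k] if k < len(t_span) else None
            a = a_span[k] if k < len(a_span) else None
            pairs.append((t, a))
    return pairs
```

With target `["b", "l", "u"]` and actual `["b", "u"]`, the pair `("l", None)` marks the deleted phone, which is the kind of information needed for process identification.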
Data compilations • Search ‘plugin’ system • Ability to create own search plugin without reprogramming Phon • Based on JavaScript (ECMAScript) scripting • Support for text, phonological expressions and regular expressions • Built-in script editor • Persistent search results • Queries and results saved in a relational database • Query history
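The combination of a regular-expression query with persistent results can be sketched as below. This is written in Python for illustration and is not Phon's scripting API; the table layout and the function names `run_query` and `save_results` are assumptions.

```python
import re
import sqlite3


def run_query(records, pattern):
    """Return (record_id, matched_text) for every match of pattern.

    records: iterable of (record_id, transcription) pairs.
    """
    rx = re.compile(pattern)
    return [(rec_id, m.group(0))
            for rec_id, text in records
            for m in rx.finditer(text)]


def save_results(conn, query, results):
    """Store a query and its results so they can be reopened later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results "
        "(query TEXT, record_id INTEGER, match TEXT)"
    )
    conn.executemany(
        "INSERT INTO results VALUES (?, ?, ?)",
        [(query, rec_id, match) for rec_id, match in results],
    )
    conn.commit()
```

A query for stop-plus-l onsets could then be run as `run_query(records, "[pk]l")` and saved with `save_results(sqlite3.connect(":memory:"), "[pk]l", results)`.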
Data compilations • Inventories (phones, syllable types, stress patterns) • Textual or phonological data • Character strings, feature sets, syllable positions • Syllable types • Aligned phones and groups • Consonant and vowel harmony • Consonant metathesis • Combinations of the above • You can even build your own!
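An inventory is essentially a frequency count over transcribed units. The sketch below counts phones and, as a stand-in for syllable types, word-level CV shapes; the vowel set and all function names are assumptions made for this example, and no syllabification is performed.

```python
from collections import Counter

# Assumed, non-exhaustive set of vowel symbols.
VOWELS = set("aeiouæɑɔəɛɪʊʌ")


def phone_inventory(words):
    """Count every phone across a list of words (lists of phones)."""
    return Counter(phone for word in words for phone in word)


def cv_shape(word):
    """Map a word to its consonant/vowel skeleton, e.g. CVC."""
    return "".join("V" if phone in VOWELS else "C" for phone in word)


def shape_inventory(words):
    """Count the CV shapes found in a list of words."""
    return Counter(cv_shape(word) for word in words)
```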
Data visualization and reporting • Search results integrated with the session editor • Session opens as results are displayed • Results summaries • Clipboard-accessible (for quick usage in other applications) • Report format • Several file formats supported • Various result types combined into reports Let’s see some real action!
Other useful features • Multiple undos supported in data tiers • User-defined tiers • Data copy to clipboard (tier, record, session) • Audio/video clip export • Compatibility with CLAN through data conversion utilities • CHAT2XML, XML2Phon (lots accomplished on this front in recent months) • XML and CSV data import and export (to be reimplemented soon) • Discussion group
On the horizon: Phon ⇔ Praat • Integration between Phon and Praat, through TextGrid • Acoustic measurements performed within Praat • Integration of Praat-generated data in Phon data compilations • E.g. Get F0, intensity and duration data for all mid vowels in word-initial, stressed syllables • Fuller vision to be expressed in Paul’s talk • Building on this development: • Platform for testing transcription accuracy • Support for tonal/intonational coding
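Since this integration is described as upcoming, the following is only a sketch of the kind of compilation envisioned: extracting durations for mid vowels from labeled time intervals. It assumes the intervals have already been read from a TextGrid tier as `(label, start, end)` tuples; the vowel set and the function name are hypothetical, and the word-initial and stress conditions are left out.

```python
# Assumed set of mid-vowel symbols.
MID_VOWELS = {"e", "ø", "o", "ɛ", "œ", "ɔ", "ə"}


def mid_vowel_durations(intervals):
    """Return (label, duration) for each mid-vowel interval.

    intervals: iterable of (label, start, end), times in seconds.
    """
    return [(label, end - start)
            for label, start, end in intervals
            if label in MID_VOWELS]
```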
Longer-term goal • Phonological data • Acoustic data • CLAN, ELAN, SFS…
Some areas of contribution • Easier exchange between researchers • Study and comparisons of corpora • Within and across languages, populations, … • Better understanding of: • Linguistic phenomena • Acquisition-related patterns • Speech impediments • More efficient educational and clinical interventions • Development and verification of theoretical models
Thanks for your attention! • Phon and user manual: http://childes.psy.cmu.edu/phon/ http://phon.ling.mun.ca/phontrac/ • Corpora: http://childes.psy.cmu.edu/data/PhonBank/ http://childes.psy.cmu.edu/data/PhonBank-Phon/ • Discussion Group: phon@googlegroups.com • Technical forums: http://phon.ling.mun.ca/phontrac/ • Questions, feedback: yrose@mun.ca