This project aims to construct and preserve a comprehensive lexicon for the Chemehuevi language, a Uto-Aztecan language spoken in Western Arizona and Eastern California. The lexicon will be accessible online and stored in an open exchange format (XML) for easy sharing and exportation. The project also implements an innovative approach to the maintenance and updating of digital language resources using XSLT to transform the XML lexicon into HTML webpages.
A Chemehuevi Lexicon
Hans Nelson, Mike Manookin, and Dirk Elzinga
Department of Linguistics, Brigham Young University
Chemehuevi
• A Uto-Aztecan language closely related to Southern Paiute
• Spoken in Western Arizona and Eastern California
• The Colorado and Chemehuevi Valley Indian Reservations
• Presently, there are fewer than 20 speakers (all over 40 years old)
Previous Work
• Previous documentation of Chemehuevi is rather sparse.
• A few words are collected in Kroeber's Notes on Shoshonean Dialects of Southern California (1909).
• J. P. Harrington collected large amounts of Chemehuevi vocabulary.
• Carobeth Laird later published two books on Chemehuevi ethnology, which include short texts in Chemehuevi and fair-sized word lists (Laird 1976, 1984).
• Margaret Press's UCLA Ph.D. dissertation (1975, revised 1979) is the most detailed linguistic study of the language.
• All of these are print sources; nothing is digitally archived.
Project Task
• Task: the continued construction and preservation of a lexicon for the Chemehuevi language.
• Designed to satisfy two goals: (1) provide the Chemehuevi community with online access to a dictionary of their language and (2) store that dictionary in an open exchange format (XML) capable of export to various other formats (such as other XML formats, HTML, etc.).
• A further requirement: easy data entry for linguists (Excel on Mac or PC).
Excel XML Application
• 'ExcelApp' does two things: (1) transforms an Excel spreadsheet into XML and (2) provides a GUI for invoking MSXML 4.0 XSLT
• VB.NET application
• Interacts directly with the Excel Object Model
• Checks for special XML characters
• Excel XML, TBX, eXML
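The spreadsheet-to-XML step can be sketched roughly as follows. This is a minimal Python illustration, not the project's VB.NET ExcelApp. It assumes each spreadsheet row has already been read into a dictionary keyed by the element names of the project DTD. The function name `rows_to_clxml`, the `lang` value, the numeric `id` scheme, and the sample row are hypothetical placeholders, not real lexicon data.

```python
import xml.etree.ElementTree as ET

# Child elements of <term>, in the order the project DTD lists them.
TERM_FIELDS = [
    "surfaceForm", "pos", "singularForm", "dualForm", "pluralForm",
    "momentaneous", "durative", "definition", "translation", "etymology",
    "source", "audio", "sample", "sentence", "illustration",
]

def rows_to_clxml(rows, lang):
    """Build a <clxml> document from row dictionaries (hypothetical helper)."""
    root = ET.Element("clxml", {"lang": lang})
    for number, row in enumerate(rows, start=1):
        # Assumption: ids are simple running numbers.
        term = ET.SubElement(root, "term", {"id": str(number)})
        for field in TERM_FIELDS:
            child = ET.SubElement(term, field)
            # Empty cells become empty elements so every term keeps all
            # fifteen children; ElementTree escapes &, < and > on output.
            child.text = str(row.get(field, ""))
    return ET.tostring(root, encoding="unicode")

# Placeholder row: the values are invented for illustration only.
rows = [{
    "surfaceForm": "placeholder-form",
    "pos": "noun",
    "definition": "placeholder gloss with & and < characters",
}]
xml_text = rows_to_clxml(rows, lang="Chemehuevi")
print(xml_text)
```

Leaving the escaping to the XML serializer, rather than replacing characters by hand, is one way to cover the "special XML characters" check: the ampersand and angle bracket in the placeholder gloss come out as entity references, and parsing the output restores the original text.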
<!-- Version 1 Chemehuevi Language XML DTD (Root Element: "clxml") -->
<!-- Author: Hans Nelson Feb. 28th, 2004 -->
<!ELEMENT clxml (term)+>
<!ATTLIST clxml lang CDATA #REQUIRED >
<!ELEMENT term (surfaceForm,pos,singularForm,dualForm,pluralForm,momentaneous,durative,definition,translation,etymology,source,audio,sample,sentence,illustration)>
<!ATTLIST term id CDATA #REQUIRED >
<!ELEMENT surfaceForm (#PCDATA)>
<!ELEMENT pos (#PCDATA)>
<!ELEMENT singularForm (#PCDATA)>
<!ELEMENT dualForm (#PCDATA)>
<!ELEMENT pluralForm (#PCDATA)>
<!ELEMENT momentaneous (#PCDATA)>
<!ELEMENT durative (#PCDATA)>
<!ELEMENT definition (#PCDATA)>
<!ELEMENT translation (#PCDATA)>
<!ELEMENT etymology (#PCDATA)>
<!ELEMENT source (#PCDATA)>
<!ELEMENT audio (#PCDATA)>
<!ELEMENT sample (#PCDATA)>
<!ELEMENT sentence (#PCDATA)>
<!ELEMENT illustration (#PCDATA)>
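To make the constraints in this DTD concrete, the sketch below checks a document against them by hand. Python's standard library does not validate against DTDs, so this is a hand-written structural check that mirrors the declarations above (root element, required attributes, and the fixed order of the fifteen children), not full DTD validation. The function name `check_clxml` and both sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET

# The content model of <term>: all fifteen children, in this order.
EXPECTED_CHILDREN = [
    "surfaceForm", "pos", "singularForm", "dualForm", "pluralForm",
    "momentaneous", "durative", "definition", "translation", "etymology",
    "source", "audio", "sample", "sentence", "illustration",
]

def check_clxml(xml_text):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "clxml":
        problems.append(f"root element is <{root.tag}>, expected <clxml>")
    if "lang" not in root.attrib:
        problems.append("<clxml> is missing the required lang attribute")
    terms = list(root)
    if not terms:
        problems.append("<clxml> must contain at least one <term>")
    for position, term in enumerate(terms, start=1):
        if term.tag != "term":
            problems.append(f"child {position} is <{term.tag}>, expected <term>")
            continue
        if "id" not in term.attrib:
            problems.append(f"term {position} is missing the required id attribute")
        if [child.tag for child in term] != EXPECTED_CHILDREN:
            problems.append(
                f"term {position} does not have the fifteen children in DTD order"
            )
    return problems

# Hypothetical sample documents with empty placeholder elements.
good = (
    '<clxml lang="Chemehuevi"><term id="1">'
    + "".join(f"<{name}/>" for name in EXPECTED_CHILDREN)
    + "</term></clxml>"
)
bad = "<clxml><term><pos/><surfaceForm/></term></clxml>"

print(check_clxml(good))
print(check_clxml(bad))
```

Because the content model is a plain comma-separated sequence, every term must carry all fifteen children even when a field is empty, which is why the well-formed sample uses empty elements rather than omitting them.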
http://linguistics.byu.edu/faculty/elzingad/chemehuevi_dictionary/
Final Thoughts
• Because the content is in XML, it can be mapped easily to common ontologies and linguistic terminology.
• The format is nonproprietary: XML constrained by a DTD.
• The lexicon is accessible online and preserved in XML.
• This project provides crucial documentation for Chemehuevi.
• Storage in XML makes the data available in a variety of formats.
• This project also implements a new approach to maintaining and updating online documentation of digital language resources: XSLT rapidly and automatically transforms the XML-based lexicon into HTML web pages.
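The XML-to-HTML step can be illustrated with a short sketch. The project performs this transformation with XSLT; Python's standard library has no XSLT processor, so the code below swaps in a plain ElementTree traversal that plays the same role. The function name `lexicon_to_html`, the choice of a definition list as the page layout, and the abbreviated sample entry (which omits most of the children the DTD requires) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

def lexicon_to_html(xml_text):
    """Render each <term> as a definition-list entry (hypothetical layout)."""
    root = ET.fromstring(xml_text)
    parts = ["<html><body><h1>Lexicon</h1><dl>"]
    for term in root.findall("term"):
        headword = term.findtext("surfaceForm", default="")
        pos = term.findtext("pos", default="")
        definition = term.findtext("definition", default="")
        # escape() keeps &, < and > in the data from breaking the HTML.
        parts.append(f"<dt>{escape(headword)} <i>{escape(pos)}</i></dt>")
        parts.append(f"<dd>{escape(definition)}</dd>")
    parts.append("</dl></body></html>")
    return "\n".join(parts)

# Abbreviated placeholder entry, not real lexicon data.
sample = (
    '<clxml lang="Chemehuevi"><term id="1">'
    "<surfaceForm>placeholder-form</surfaceForm>"
    "<pos>noun</pos>"
    "<definition>placeholder gloss &amp; note</definition>"
    "</term></clxml>"
)
html_page = lexicon_to_html(sample)
print(html_page)
```

The maintenance benefit is the same in either approach: the web pages are never edited by hand, so updating the lexicon means editing the XML and rerunning the transformation.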