Language Codes Anthony Aristar LINGUIST List, ILIT & Eastern Michigan University DELAMAN, London, 3rd November 2006
A Brief History of Ethnologue • Begun over 50 years ago • First edition 1951 • 3-letter codes instituted in 1971 • Publishes a new version every 4 years • The latest version is Ethnologue 15
Principles behind Ethnologue Codes • Consistently apply an operational definition of language so that all entities for which an identifier is assigned are of a comparable nature • Encompass all of the languages of the world • Clearly document the speech variety that each identifier denotes • Maintain and update the system on an on-going basis • Make the system freely and readily accessible to the public over the Internet
Range of Coverage • The Ethnologue system is intended to encompass only those languages of the world in current use. Thus Ge’ez (Ethnologue code gez) and Sanskrit (Ethnologue code san), which remain in liturgical use, both appear in Ethnologue • Most other ancient languages are absent
Ancient Languages added • Agreement made between Ethnologue and the LINGUIST List that LINGUIST would add codes for ancient and constructed languages • Furthermore…
The Canary Agreement, 2002 • All languages which require codes and which became extinct before 1950 should become the responsibility of LINGUIST. All languages after 1950 will be in the purview of Ethnologue.
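The division of responsibility described on the last two slides can be sketched as a small rule. This is only an illustration: the function name and parameters are hypothetical, and the treatment of a language that became extinct in exactly 1950, and of living languages, is an assumption, since the slide wording does not spell out either case.

```python
from typing import Optional


def responsible_body(extinction_year: Optional[int], constructed: bool = False) -> str:
    """Return which body would assign a code under the 2002 agreement.

    extinction_year is None for a language still in use (assumed to
    fall to Ethnologue, in line with its stated range of coverage).
    """
    # Constructed languages were assigned to LINGUIST by the earlier agreement.
    if constructed:
        return "LINGUIST"
    # Extinct before 1950: LINGUIST. The boundary year itself is an assumption.
    if extinction_year is not None and extinction_year < 1950:
        return "LINGUIST"
    # Everything else: Ethnologue.
    return "Ethnologue"
```

For example, `responsible_body(1200)` gives `"LINGUIST"`, while `responsible_body(None)` gives `"Ethnologue"`.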
The need for a standard • In 1988 ISO had adopted a 2-letter set of language codes (ISO 639-1) • Inadequate: 136 codes • In 1998 it adopted the 3-letter ISO 639-2 codes • Still inadequate: 460 codes, many defining multiple languages.
Becoming a standard • In 2002, ISO TC37/SC2 invited Ethnologue to participate in the development of a new standard • Must be a superset of ISO 639-2 • Would provide identifiers for all known languages.
Issues • Ethnologue had to violate its own rules to accomplish this: • Macrolanguages had to be included, e.g. code zho for all Chinese languages • Codes had to be reused, e.g. code san, now used for Sanskrit (previously skt), was once used for the Niger-Congo language Sakata • But collective codes (e.g. afa for Afroasiatic) were abandoned
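Why code reuse is an issue can be seen in a minimal sketch: the same three letters resolve to different languages depending on which version of the code-set a record was tagged with. The table names and the `resolve` function are hypothetical, and the tables contain only the entries mentioned on the slide.

```python
# Illustrative tables only; entries are limited to those named on the slide.
OLD_ETHNOLOGUE = {"san": "Sakata", "skt": "Sanskrit"}
REVISED_CODES = {"san": "Sanskrit", "zho": "Chinese (macrolanguage)"}

# Hypothetical version labels for the two code-sets.
CODE_SETS = {"ethnologue-old": OLD_ETHNOLOGUE, "revised": REVISED_CODES}


def resolve(code: str, code_set: str):
    """Look up a code in the named code-set; return None if it is not listed."""
    return CODE_SETS[code_set].get(code.lower())


# The same code means different languages in the two versions.
old_meaning = resolve("san", "ethnologue-old")   # Sakata
new_meaning = resolve("san", "revised")          # Sanskrit
```

A record tagged only with `san`, and with no note of which code-set was in force, cannot be interpreted reliably, which is why reuse conflicts with the principle of clearly documenting what each identifier denotes.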
Ethnologue/LINGUIST codes a standard • In 2004 the revised Ethnologue/LINGUIST codes became what is called a DIS or “Draft International Standard”. • Usually called simply ISO 639-3, but its correct title is ISO/DIS 639-3. • SIL became the Registration Authority or curator of the codes
Dissatisfaction with Ethnologue • But now that Ethnologue was a standard, people had to use it • E.g. NSF now requires ISO 639-3 codes • LSA requires them when you submit an abstract… • Many digital sites using them… • And there are shortcomings in Ethnologue…
Shortcomings in Ethnologue • Every language in Ethnologue is documented to a greater or lesser degree, but… • We usually do not have a clear idea of the evidence upon which it was decided to assign the language a unique code. • Languages which should not be there • Languages which should be there, but aren’t • Dialects called languages… and vice versa • Wrong names used for languages… • Wrong locations… wrong populations…
Most of all… • It was very hard to get SIL to make a change in the code-set!
Meeting at LSA in Oakland • Members of the community and representatives of SIL • Very cooperative meeting • SIL accepting of need for more community input
Decision that committees would best handle mass code-set changes for specific areas • Committee set up to oversee thoroughgoing revision of code-set for the Americas • Chairman Lyle Campbell • SSILA (Society for the Study of the Indigenous Languages of the Americas) initiating the process
How the process works • Forms available from address: • ISO639-3@sil.org • Two forms: • Change request form • New code request form • Formal review of next set of code changes starts December 1, 2006