Constructing language databases requires rethinking the entities they describe. Field data record utterances, not entire languages. Coding language varieties accurately calls for standards that locate them by geographical coordinates. The slides propose an encoding based on minimal language communities and minimal discrete linguistic areas, and note that better mapping tools are needed for precise representation.
Problems
• Database construction forces you to think in terms of discrete entities
• Languages are not discrete entities
• Field data (and linguistic data in general) do not represent "languages" as wholes but utterances specific in time and space
• There are just too many language varieties in the world to be assigned codes
Cutting the Gordian knot
• We need standards for specifying language varieties more precisely
• …in particular, for locating language varieties geographically
• …by an open-ended system
Consequences
• We need to use coordinates…
• …and standards for representing them:
  50°20´N 12°15´W (degrees and minutes)
  50.33 -12.25 (signed decimal degrees)
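The two notations on this slide describe the same point, and the conversion between them can be sketched in a few lines. This is a minimal illustration, not part of the original proposal: the function name `dm_to_decimal` is hypothetical, it assumes degrees-and-minutes input with a hemisphere letter, and it rounds to two decimals only because the slide does.

```python
import re

# Assumed input shape: "<degrees>°<minutes>´<hemisphere>", e.g. "50°20´N".
# Several minute marks are accepted because sources vary in which one they use.
_DM_PATTERN = re.compile(r"^\s*(\d+)\s*°\s*(\d+(?:\.\d+)?)?\s*[´'′]?\s*([NSEW])\s*$")


def dm_to_decimal(text):
    """Convert a degrees-and-minutes coordinate to signed decimal degrees."""
    match = _DM_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised coordinate: {text!r}")
    degrees = int(match.group(1))
    minutes = float(match.group(2) or 0)
    value = degrees + minutes / 60
    # South and west are negative in the signed decimal convention.
    if match.group(3) in "SW":
        value = -value
    return round(value, 2)


print(dm_to_decimal("50°20´N"), dm_to_decimal("12°15´W"))
```

Signed decimal degrees are easier to store and compare in a database, which is presumably why the slide pairs the two forms.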
New entities
• Instead of languages:
• minimal language communities? Typically a settlement of fewer than 1000 inhabitants, or a nomadic group of similar size
• minimal discrete linguistic areas (= dialect continua)? Example: Continental Scandinavian
Possible encoding: SCD 59.25 18.33 = the variety of Scandinavian spoken at 59.25 degrees north 18.33 degrees east
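As a rough sketch of how such a code might be handled in a database layer, the snippet below parses a string of the form shown above into its three parts. The names `VarietyCode` and `parse_variety` are hypothetical, and the assumptions (whitespace-separated fields, latitude before longitude, signed decimal degrees with north and east positive) are read off the single example on the slide rather than from any published standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarietyCode:
    """A language variety: a linguistic-area code plus a point on the map."""
    area: str    # code for the linguistic area, e.g. "SCD" (from the slide)
    lat: float   # decimal degrees, north positive (assumed)
    lon: float   # decimal degrees, east positive (assumed)


def parse_variety(code):
    """Parse '<area> <lat> <lon>' into a VarietyCode, validating the ranges."""
    parts = code.split()
    if len(parts) != 3:
        raise ValueError(f"expected '<area> <lat> <lon>', got {code!r}")
    area, lat_text, lon_text = parts
    lat, lon = float(lat_text), float(lon_text)
    if not -90 <= lat <= 90 or not -180 <= lon <= 180:
        raise ValueError(f"coordinates out of range in {code!r}")
    return VarietyCode(area, lat, lon)


print(parse_variety("SCD 59.25 18.33"))
```

Because the coordinates are free values rather than entries in a fixed list, the scheme stays open-ended: a new variety needs no new code, only a new location within an existing area.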
Better maps!
• …but that is for another time