DICTIONARY OF TOPONYMS IN SERBIAN
Gordana Pavlović-Lažetić (1), Cvetana Krstev (2), Duško Vitas (1)
(1) Faculty of Mathematics, (2) Faculty of Philology, University of Belgrade, Serbia
DICTIONARY OF TOPONYMS: OVERVIEW
• AIMS
• SOURCES & CONTENTS
• STRUCTURE & FORMAT
• TAGS
• SPECIFICS
• EXAMPLES: daily newspapers
• FURTHER DEVELOPMENT
6th Intex Workshop, 28-30 May 2003
DICTIONARY OF TOPONYMS: AIMS
• Daily newspaper analysis: Intex
• Web browsers
• Unknown words: proper names (personal names and toponyms)
• Prolintex: Maurel, Piton
• Encyclopedic dictionary with morphological tags (Intex; experimental)
• Delas-top: toponyms, hydronyms, oronyms; size ~2600; simple words
DICTIONARY OF TOPONYMS: SOURCES
• YU / foreign geography
• Geographic atlas used in education in Serbia
• Official statistics register of inhabited places in former Yugoslavia
• Belgrade, Serbia, Montenegro
• Bosnia, Macedonia, Croatia, Slovenia
• South / Central / West / North Europe
• Near East, Asia, Africa
• North America (USA & Canada), South America
• Australia, Pacific islands
• Arctic & Antarctic, seas & oceans
• Earth, continents, cosmos, Solar System
DICTIONARY OF TOPONYMS: CHOICE
• Country
• Official languages
• Capital city
• Administrative divisions of common importance (e.g., US states)
• Cities (more than 10000 / 50000 / 100000 inhabitants)
• Hydronyms (rivers, lakes, swamps; associated with the country of the mouth)
• Mountains, volcanoes, etc., if of common importance (e.g., for newspaper text)
DICTIONARY OF TOPONYMS: CONTENTS
• Proper names
• Relational adjectives (-ski, -sxki, -cyki)
• Names of inhabitants, including pejoratives (if they exist; e.g., sxvaba, sxiptar)
• Possessive adjectives (-ov/-ev; -in)
• Examples:
  • Pariz,N - pariski,A+Rel
  • Parizxanin,Nm - Parizxaninov,A+Poss
  • Parizxanka,Nf - Parizxankin,A+Poss
  • parizxanski,A+Rel
DICTIONARY OF TOPONYMS: CONTENTS (cont.)
• YU-geography: local official names
• Foreign: exonyms (rarely official / traditional)
• E.g., Becy, Rim, Solun, Bitolx, Skoplxe, Prag
• Transcription (adjusted): Cyrillic orthography (e.g., Bolonxa)
• Orthography transcription rules
• Phonetic / adjusted to writing (e.g., Cykago, Cyikagou) / declension
• Tradition (spontaneous: e.g., Peking)
• Latin: diacritic (Beč) / rarely transliteration
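The entries on these slides write Serbian letters as ASCII digraphs (Becy for Beč, Bolonxa for Bolonja). A minimal sketch of a decoder follows. The mapping is inferred only from examples that appear on the slides, so it may be incomplete, and the names `ASCII_TO_LATIN` and `decode` are hypothetical.

```python
import re

# Digraphs attested in the slide examples only (an assumption, not a full spec).
ASCII_TO_LATIN = {
    "cy": "č",   # Becy -> Beč
    "sx": "š",   # Sxvabica -> Švabica
    "zx": "ž",   # Parizxanin -> Parižanin
    "dx": "đ",   # Beogradxanka -> Beograđanka
    "lx": "lj",  # Skoplxe -> Skoplje
    "nx": "nj",  # Bolonxa -> Bolonja
}

_DIGRAPH = re.compile("|".join(ASCII_TO_LATIN), re.IGNORECASE)


def decode(word: str) -> str:
    """Replace each ASCII digraph with its Serbian Latin letter, keeping case."""
    def repl(match: re.Match) -> str:
        source = match.group(0)
        target = ASCII_TO_LATIN[source.lower()]
        # An upper-case first letter in the digraph marks a capitalized letter.
        return target.capitalize() if source[0].isupper() else target

    return _DIGRAPH.sub(repl, word)


if __name__ == "__main__":
    for w in ["Becy", "Bitolx", "Skoplxe", "Bolonxa", "Sxvabica"]:
        print(w, "->", decode(w))
```

The same table could be inverted to encode diacritics back into ASCII, provided the digraphs never occur naturally in the source words.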
DICTIONARY OF TOPONYMS: STRUCTURE & FORMAT
• Delas-top entry: syntactic & semantic attributes
• Syntactic: inflectional classes for simple words (Delas for Serbian)
• A part of the DELAF-top dictionary has been generated
• Semantics: Prolintex
DICTIONARY OF TOPONYMS: TAGS - codes
• Der (derivative): Beogradxanka,.Nfs+Der(Beograd)
• Top (toponym, place): Beograd,.Nms+Top
• Hyd (hydronym): dunavski,.A+Rel+Hyd
• Oro (oronym): Gocy,.Nms+Oro
• Hum (human beings): Valxevka,.Nfs+Hum
• IsoXX (ISO country code for toponyms): Beograd,.Nms+Top+IsoYU
• LngXX (language code): grcyki,.A+LngEL
• PR (proper names): Beograd,.Nms+Top+PR
• PG (pejorative name): Sxvabica,.Nfs+PG
DICTIONARY OF TOPONYMS: TAGS
• PAut (autonomous region): Vojvodina,.Nfs+Top+PAut+IsoYU
• PCen (regional center): Nisx,.Nms+PCen
• PDgr (city or place part): Autokomanda,.Nfs+PDgr
• PDrz (country): Francuska,.NAfs+PDrz
• PGr1-PGr4, PGgr: Sofija,.Nfs+Top+PGgr+PGr4+IsoBG
• POps (township): Cyukarica,.Nfs+Top+POps+IsoYU
• PKon (continent): Evropa,.Nfs+PKon
• POst (island): Elba,.Nfs+PR+Top+POst+IsoIT
• PPla (mountain): Rila,.Nfs+PR+Oro+PPla+IsoBG
• PReg (region): Metohija,.Nfs+Top+PReg+IsoYU
• PRgr (regional capital city): Prisxtina,.Nfs+Top+PRgr+IsoYU
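The tagged examples above share one layout: a form, a comma, an optional lemma, a dot, a grammatical code, and `+`-separated tags. The sketch below splits such a line into its parts. It assumes that layout only as far as the slide examples show it, treats an empty lemma as identical to the form, and uses hypothetical names (`Entry`, `parse_entry`, `iso_code`).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    form: str       # text before the comma, e.g. "Beograd"
    lemma: str      # text between comma and dot; falls back to the form
    code: str       # grammatical code, e.g. "Nms"
    features: list  # remaining tags, e.g. ["Top", "IsoYU"]


def parse_entry(line: str) -> Entry:
    """Split 'form,lemma.Code+Tag1+Tag2' into its fields (whitespace tolerant)."""
    form, comma, rest = line.partition(",")
    if not comma:
        raise ValueError(f"no comma in entry: {line!r}")
    lemma, dot, tags = rest.partition(".")
    if not dot:
        raise ValueError(f"no '.' before the grammatical code: {line!r}")
    parts = [p.strip() for p in tags.split("+")]
    form = form.strip()
    return Entry(form, lemma.strip() or form, parts[0], parts[1:])


def iso_code(entry: Entry) -> Optional[str]:
    """Return the two-letter code from an IsoXX tag, or None if absent."""
    for tag in entry.features:
        if tag.startswith("Iso"):
            return tag[3:]
    return None


if __name__ == "__main__":
    e = parse_entry("Sofija, .Nfs+Top+PGgr+PGr4+IsoBG")
    print(e, iso_code(e))
```

Keeping the grammatical code separate from the semantic tags makes it straightforward to filter entries by tag (for example, all `Hyd` entries) without touching the morphology.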
DICTIONARY OF TOPONYMS: SPECIFICS: nouns
• Fem.: Top, Hyd, Oro, Hum (inhab.)
• Size
  • Total: 911
  • YU: 146 (16%)
  • Foreign: 765 (84%)
• Types
  • Tops: 364 (40%)
  • Hyds: 131 (14%)
  • Oros: 33 (4%)
  • Fem. inhab.: 383 (42%)
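As a quick consistency check on the figures above, the percentages can be recomputed from the raw counts. This is only an arithmetic sketch; the helper name `shares` is hypothetical, and the counts are copied from the slide.

```python
def shares(counts: dict) -> dict:
    """Return each count as a whole-number percentage of the group total."""
    total = sum(counts.values())
    return {name: round(100 * n / total) for name, n in counts.items()}


# Counts of feminine nouns as listed on the slide.
by_origin = {"YU": 146, "Foreign": 765}
by_type = {"Top": 364, "Hyd": 131, "Oro": 33, "Fem. inhab.": 383}

if __name__ == "__main__":
    print(sum(by_origin.values()), shares(by_origin))
    print(sum(by_type.values()), shares(by_type))
```

Both breakdowns sum to the stated total of 911, and the rounded shares match the percentages given on the slide.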
DICTIONARY OF TOPONYMS: SPECIFICS: nouns (cont.)
• Inhabitants (fem.):
  a) -ka, 361 (94%), same syntactic class, e.g., Beogradxanka, Bugarka, Parizxanka, Kanadxanka
  b) -ca, 7 (2%), different syntactic class, e.g., Sxvabica, Sremica
  c) -nxa, 15 (4%), same syntactic class, e.g., Grkinxa, Polxakinxa, Turkinxa, Francuskinxa, Renkinxa
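The three suffix groups above can be told apart by a plain ending test. The sketch below applies that test to the examples from the slide; it is a surface heuristic, not the dictionary's inflectional classification, and the names `INHABITANT_SUFFIXES` and `inhabitant_suffix` are hypothetical.

```python
from collections import Counter
from typing import Optional

# Endings named on the slide, in the ASCII encoding used there.
INHABITANT_SUFFIXES = ("nxa", "ka", "ca")


def inhabitant_suffix(name: str) -> Optional[str]:
    """Return the matching feminine suffix ('-ka', '-ca', '-nxa') or None."""
    for suffix in INHABITANT_SUFFIXES:
        if name.endswith(suffix):
            return "-" + suffix
    return None


EXAMPLES = [
    "Beogradxanka", "Bugarka", "Parizxanka", "Kanadxanka",
    "Sxvabica", "Sremica",
    "Grkinxa", "Polxakinxa", "Turkinxa", "Francuskinxa", "Renkinxa",
]

if __name__ == "__main__":
    print(Counter(inhabitant_suffix(n) for n in EXAMPLES))
```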
DICTIONARY OF TOPONYMS: SPECIFICS: adjectives
• Possessive, relational
• Size
  • Total: 1673
  • YU: 268 (16%)
  • Foreign: 1405 (84%)
• Types
  • Rel: 1002 (60%)
  • Poss: 671 (40%)
DICTIONARY OF TOPONYMS: SPECIFICS: adjectives (cont.)
• Relational:
  • -ski, -sxki, -cyki
  • e.g., beogradski, prasxki, becyki
  • orthography: lowercase initial letter
• Possessive:
  • -in, e.g., Beogradxankin (f.), Becylijin (m.)
  • -ov, -ev (m.), e.g., Beogradxaninov, Prisxtincyev
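The suffix lists above suggest a simple way to guess the `Rel` or `Poss` tag of an adjective from its ending. The following is a rough sketch under that assumption; a real dictionary assigns these tags explicitly, and the function name `adjective_kind` is hypothetical.

```python
from typing import Optional

# Endings listed on the slide, in the ASCII encoding used there.
RELATIONAL_SUFFIXES = ("ski", "sxki", "cyki")
POSSESSIVE_SUFFIXES = ("in", "ov", "ev")


def adjective_kind(adjective: str) -> Optional[str]:
    """Guess 'Rel' or 'Poss' from the ending; None if neither list matches."""
    if adjective.endswith(RELATIONAL_SUFFIXES):
        return "Rel"
    if adjective.endswith(POSSESSIVE_SUFFIXES):
        return "Poss"
    return None


if __name__ == "__main__":
    for adj in ["beogradski", "prasxki", "becyki",
                "Beogradxankin", "Becylijin", "Beogradxaninov", "Prisxtincyev"]:
        print(f"{adj},.A+{adjective_kind(adj)}")
```

Because the test looks only at endings, it would also accept non-adjectives that happen to end the same way, so it is best applied to words already known to be derived adjectives.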
DICTIONARY OF TOPONYMS: SPECIFICS: homonymy
• Mostly nouns, same syntactic class, e.g.,
  Barselona,.Nfs+Top+PGr2+IsoVE
  Barselona,.Nfs+Top+PGr4+IsoES
  Alabama,.Nfs+Top+PFed+IsoUS
  Alabama,.Nfs+Hyd+IsoUS
  Drenica,.Nfs+Top+PReg+IsoYU
  Drenica,.Nfs+Hyd+IsoYU
• Some adjectives, e.g.,
  kolumbijski,.A+Rel+Top+IsoUS
  kolumbijski,.A+Rel+Top+PDrz+IsoCO
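Homonyms of the kind listed above share a form and grammatical code and differ only in their semantic tags. A minimal sketch of how such groups could be found in a list of entries follows; the entry layout is assumed from the slide examples, `find_homonyms` is a hypothetical name, and the Beograd entry is included only as a non-homonymous control.

```python
from collections import defaultdict

ENTRIES = [
    "Barselona,.Nfs+Top+PGr2+IsoVE",
    "Barselona,.Nfs+Top+PGr4+IsoES",
    "Alabama,.Nfs+Top+PFed+IsoUS",
    "Alabama,.Nfs+Hyd+IsoUS",
    "Drenica,.Nfs+Top+PReg+IsoYU",
    "Drenica,.Nfs+Hyd+IsoYU",
    "kolumbijski,.A+Rel+Top+IsoUS",
    "kolumbijski,.A+Rel+Top+PDrz+IsoCO",
    "Beograd,.Nms+Top+IsoYU",  # control: appears once
]


def find_homonyms(lines):
    """Map (form, grammatical code) to its tag strings when there are several."""
    groups = defaultdict(list)
    for line in lines:
        form, _, rest = line.partition(",")
        # Drop the leading dot, then split the code from the semantic tags.
        code, *tags = [p.strip() for p in rest.strip().lstrip(".").split("+")]
        groups[(form.strip(), code)].append("+".join(tags))
    return {key: tags for key, tags in groups.items() if len(tags) > 1}


if __name__ == "__main__":
    for (form, code), readings in find_homonyms(ENTRIES).items():
        print(form, code, readings)
```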
DICTIONARY OF TOPONYMS: EXAMPLES: “Politika”
• Bilateral relationships in former YU
• Bilateral relationships in former YU + AL
• Officials in former YU + US + GB
• Officials + bilateral officials
DICTIONARY OF TOPONYMS: FURTHER DEVELOPMENT
• Further development of DELAS-TOP (dictionary by continents)
• Description of compounds (DELAFC-top), e.g., Novi Sad
• DB system
• Local grammars describing relationships between toponyms / grouping, e.g., Jugoslavija: ex-YU + SRJ + SCG + ... Balkan part (+ normalization - body)
• Tag reduction (e.g., PGr1, PGr2, PGr3, PGr4, PVgr, PGgr -> PGrad): Prolintex2
• Information extraction
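The proposed tag reduction amounts to mapping several city tags onto the single tag PGrad. The sketch below shows one way this could be applied to an entry's tag list; only the mapping itself comes from the slide, while the names `CITY_TAGS` and `reduce_tags` and the choice to drop the resulting duplicates are assumptions.

```python
# City tags that the slide proposes to merge into PGrad.
CITY_TAGS = {"PGr1", "PGr2", "PGr3", "PGr4", "PVgr", "PGgr"}


def reduce_tags(tags):
    """Replace every city tag with 'PGrad', keeping order and dropping repeats."""
    reduced = []
    for tag in tags:
        tag = "PGrad" if tag in CITY_TAGS else tag
        if tag not in reduced:
            reduced.append(tag)
    return reduced


if __name__ == "__main__":
    # Tags of the Sofija entry from the TAGS slide.
    print("+".join(reduce_tags(["Top", "PGgr", "PGr4", "IsoBG"])))
```

Note that merging discards the distinction the finer tags carried, so the reduction is one-way unless the original tags are stored elsewhere.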