Linguistics research and analysis of the Bulgarian folklore. Experimental implementation of linguistic components in the Bulgarian folklore digital library. Konstantin Rangochev (1), Maxim Goynov (1), Desislava Paneva-Marinova (1), Detelin Luchev (2). (1) Institute of Mathematics and Informatics-BAS; (2) Ethnographic Institute with Museum-BAS. International Conference on Information Research and Applications, 24-27 June 2010, Varna, Bulgaria
Presentation overview • Linguistics research and analysis of the Bulgarian folklore • National research project “Knowledge Technologies for Creation of Digital Presentation and Significant Repositories of Folklore Heritage” (FolkKnow) • Functionality of the “Bulgarian folklore digital library” (BFDL) multimedia digital library • Experimental implementation of a linguistic component in BFDL
Linguistics research and analysis of the Bulgarian folklore (1) • The main component of the linguistic research of the Bulgarian folklore is the analysis of its lexical structure. • How many tokens does it contain, and which ones? • Do some groups of tokens dominate, and are others missing? • Paradigmatic relationships among the folklore lexemes • Context of the lexemes / folklore language formulas • Frequency of the lexemes; the verses/sentences in which they occur; the number of those verses/sentences and their position in the song, etc. • Word forms • Regional characteristics of the folklore lexical structure, etc.
Linguistics research and analysis of the Bulgarian folklore (2) • Tools formalizing the folklore analysis: • Frequency dictionary • A general frequency dictionary: contains all the lexical units in a folklore object repository; • A regional frequency dictionary: contains all the text units that come from a given folklore region or a particular settlement; • A functional frequency dictionary: contains all the text units that have identical functions: descriptions of rites, various types of songs, narratives, etc.
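The three dictionary types above can be sketched as one counting routine with optional filters. This is only an illustration, not the BFDL implementation: the `FolkloreText` record, its `region` and `function` fields, the region names, and the second and third sample texts are assumptions made for the example, and the code counts word forms rather than lemmatized lexemes.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class FolkloreText:
    """Hypothetical record for one text folklore object."""
    text: str
    region: str    # folklore region or settlement (placeholder values below)
    function: str  # e.g. "song", "rite description", "narrative"


def tokenize(text: str) -> list[str]:
    """Lower-case the text and split it into word tokens (Unicode-aware)."""
    return re.findall(r"\w+", text.lower())


def frequency_dictionary(texts, region=None, function=None) -> Counter:
    """Count word forms over the texts that match the optional filters."""
    counts = Counter()
    for item in texts:
        if region is not None and item.region != region:
            continue
        if function is not None and item.function != function:
            continue
        counts.update(tokenize(item.text))
    return counts


# Invented sample data; only the first verse appears in the slides.
corpus = [
    FolkloreText("Fifty heroes are drinking wine", "Region A", "song"),
    FolkloreText("Marko is drinking wine", "Region B", "song"),
    FolkloreText("The heroes gather before the rite", "Region A", "rite description"),
]

general = frequency_dictionary(corpus)                      # whole repository
regional = frequency_dictionary(corpus, region="Region A")  # one region only
functional = frequency_dictionary(corpus, function="song")  # one text function only
```

With this layout the general dictionary is simply the unfiltered case, so all three views share one tokenizer and stay comparable.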
Linguistics research and analysis of the Bulgarian folklore (3) Table: Comparison of the Bulgarian folklore and spoken languages.
Linguistics research and analysis of the Bulgarian folklore (4) • Concordance dictionaries show the lexeme together with its context. • Example for songs: “Fifty heroes are drinking wine”: the underlined lexeme is the one examined, and the lexemes in italics are its context. • Example for narrative text: in the description of the rituals, one complete sentence (from full stop to full stop) is the context of the observed lexeme.
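A minimal concordance sketch follows, assuming that the context unit for a song is the verse containing the lexeme and that narrative sentences end at ".", "!" or "?". The function names and the sample texts other than the quoted verse are hypothetical.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def song_concordance(verses, lexeme):
    """Songs: return (verse number, verse) for every verse containing the lexeme."""
    target = lexeme.lower()
    return [(number, verse)
            for number, verse in enumerate(verses, start=1)
            if target in tokenize(verse)]


def narrative_concordance(text, lexeme):
    """Narrative text: return every complete sentence containing the lexeme."""
    target = lexeme.lower()
    # Split after sentence-final punctuation; a simplifying assumption.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [s for s in sentences if target in tokenize(s)]


# Invented sample data, apart from the first verse.
verses = ["Fifty heroes are drinking wine", "Marko sits among them"]
narrative = ("The guests arrive at dawn. The host pours wine for the guests. "
             "Then the song begins.")

song_hits = song_concordance(verses, "wine")
narrative_hits = narrative_concordance(narrative, "wine")
```

Here `song_hits` holds the first verse with its number, and `narrative_hits` holds only the middle sentence.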
FolkKnow project • FolkKnow project: “Knowledge Technologies for Creation of Digital Presentation and Significant Repositories of Folklore Heritage” (contract number: IO-03-03/2006) • Supported by the National Science Fund of the Bulgarian Ministry of Education and Science • Partners: Institute of Mathematics and Informatics-BAS, Institute for Folklore-BAS, Veliko Tarnovo University • Module 3: “Development of Digital Libraries and Information Portal with Virtual Exposition - Bulgarian Folklore Heritage”
Bulgarian folklore digital library Web address: http://213.191.194.27/folklor/
Folklore object preview • Description of folklore object
Main services (3) • Extended search through all the object’s characteristics
Main services (4) • Modules for: • Managing and monitoring users’ data and activities: registration, logs, data changes, level setting, actions related to object manipulation (search, preview, delete, add, edit, select, etc.), and administrative actions • File format conversion • XML export of the BFDL objects
Linguistic search in text folklore objects • Search for a word in the different types of dictionaries; • Search for two or more words, to find verbal formulas in the folklore lexis: “Drinking wine”, “Marko seated”; • Search for a group of words, to investigate the paradigmatic relations in the folklore lexis (river, stream, brook, rill…); • Search for the root of a word, to study folklore word formation: “drink” (I am drinking, I have drunk, they have drunk…).
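The formula, word-group and root searches listed above could look roughly like the sketch below. All function names and the sample verses (other than the quoted one) are hypothetical. Root search is approximated by simple prefix matching, which is a crude stand-in: it would not find irregular forms such as "drunk", so a real implementation would need morphological analysis.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def find_formula(verses, formula):
    """Return the verses in which the words of the formula occur consecutively."""
    wanted = tokenize(formula)
    if not wanted:
        return []
    hits = []
    for verse in verses:
        tokens = tokenize(verse)
        for start in range(len(tokens) - len(wanted) + 1):
            if tokens[start:start + len(wanted)] == wanted:
                hits.append(verse)
                break
    return hits


def count_group(verses, words):
    """Count each word of a paradigmatic group across all verses."""
    tokens = [t for verse in verses for t in tokenize(verse)]
    return {word: tokens.count(word.lower()) for word in words}


def find_by_root(verses, root):
    """Collect the distinct word forms that start with the given root (prefix match)."""
    root = root.lower()
    return sorted({t for verse in verses for t in tokenize(verse)
                   if t.startswith(root)})


# Invented sample data, apart from the first verse.
verses = [
    "Fifty heroes are drinking wine",
    "Marko seated drinks by the river",
    "The stream runs to the river",
]

formula_hits = find_formula(verses, "drinking wine")
group_counts = count_group(verses, ["river", "stream", "brook", "rill"])
root_forms = find_by_root(verses, "drink")
```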
Experimental implementation of a linguistic component in BFDL: Frequency dictionary functional specification • Linguistic analysis of the available set of test folklore objects; • Determining how often the lexemes occur in text folklore objects; • Creating lists of the lexemes: • in frequency order • in alphabetical order • Counting the lexical units; • Counting the repetitions of the lexical units.
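As a rough illustration of this specification, the sketch below produces both orderings and the two counts from a list of texts. The function name, the report keys and the second sample text are assumptions. "Repetitions" is interpreted here as the per-unit occurrence count plus the overall total, and word forms stand in for lexemes.

```python
import re
from collections import Counter


def build_frequency_report(texts):
    """Build frequency-ordered and alphabetical lists plus summary counts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"\w+", text.lower()))
    return {
        # Most frequent first; ties are broken alphabetically.
        "by_frequency": sorted(counts.items(), key=lambda pair: (-pair[1], pair[0])),
        "alphabetical": sorted(counts.items()),
        "lexical_units": len(counts),          # number of distinct units
        "occurrences": sum(counts.values()),   # total number of occurrences
    }


# Invented sample data, apart from the first verse.
report = build_frequency_report([
    "Fifty heroes are drinking wine",
    "Marko is drinking wine",
])
```

Both lists carry the occurrence count next to each unit, so the same structure serves the frequency view and the alphabetical view.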
Experimental implementation of a linguistic component in BFDL Sequence Diagram
Experimental implementation of a linguistic component in BFDL Analysis class diagram for the BFDL linguistic component
Implementation of the Bulgarian folklore digital library • The main tools and languages used: • Microsoft Windows Server 2008 x64 Standard; • Web server: Apache HTTP Server v 2.2, PHP v 2.2.9; • Database management system: MySQL v 5.1 Standard; • Tools for the additional modules: FFmpeg, VMware, HTML, JavaScript, AJAX; • Database query language: SPARQL