
SYRCoSE-2011

SYRCoSE-2011. APPLICATION OF THE FUNCTIONAL PROGRAMMING TOOLS IN THE TASKS OF LANGUAGE AND INTERLANGUAGE STRUCTURES REPRESENTATION. Ermakov Peter, Kozhunova Olga. The Institute of the Informatics Problems, The Russian Academy of Sciences.



  1. SYRCoSE-2011
     APPLICATION OF THE FUNCTIONAL PROGRAMMING TOOLS IN THE TASKS OF LANGUAGE AND INTERLANGUAGE STRUCTURES REPRESENTATION
     Ermakov Peter, Kozhunova Olga
     The Institute of the Informatics Problems, The Russian Academy of Sciences

  2. Natural text representation and formalisms
     • Machine text analysis – what to choose?
     • Linguistic resources: electronic dictionaries, syntactic parsers, language ontologies, generators of syntactic trees (WordNet, EuroWordNet, Ontolingua, etc.)
     • Growing demands on language models: a search for new approaches to representing language structures, and hybridization of well-functioning existing methods

  3. Uses of natural text analysis
     • Automatic translators
     • Data mining
     • Natural language interfaces
     • Parallel text analysis and comparison

  4. Formal grammars for language analysis and representation
     • Noam Chomsky – the pioneer
     • Formal grammar definition
     • Grammar classes (Chomsky classification):
       - unrestricted
       - context-sensitive
       - context-free
       - regular
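The slide names the formal grammar definition without reproducing it. For reference, the standard textbook formulation is sketched below; the symbols N, Σ, P, S are the usual textbook names and may differ from the notation the authors used on the original slide.

```latex
% A formal grammar is a 4-tuple of nonterminals, terminals, productions, start symbol.
G = (N, \Sigma, P, S), \qquad N \cap \Sigma = \emptyset, \qquad S \in N

% Allowed production shapes per Chomsky class
% (A, B \in N; \; a \in \Sigma; \; \alpha, \beta, \gamma \in (N \cup \Sigma)^{*}):
\begin{align*}
\text{unrestricted:}      &\quad \alpha A \beta \to \gamma \\
\text{context-sensitive:} &\quad \alpha A \beta \to \alpha \gamma \beta, \quad \gamma \neq \varepsilon \\
\text{context-free:}      &\quad A \to \gamma \\
\text{regular:}           &\quad A \to a \quad \text{or} \quad A \to aB
\end{align*}
```

Each class is contained in the one above it, which is why the list on the slide runs from the most general class to the most restricted.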

  5. Disadvantages of finite state automata
     • More rules – more states
     • Weak handling of natural language grammar
     • No problem-oriented analytical engine is available

  6. Functional approach to the representation of grammar rules
     • Grammar rules – functions in the mathematical sense
     • Mapping from set A to set B
     • Difference:

  7. Advantages of the functional approach
     • Use of functional programming tools
     • Use of n-tuples
     • Use of higher-order functions
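To make the three points above concrete, here is a minimal Python sketch. It is not the authors' system: the names `inflect`, `compose` and `render` are hypothetical, and the pluralization rule is only the regular "add s" case. It shows a grammar rule written as a plain function over n-tuples, combined with other functions through higher-order functions.

```python
from functools import reduce

def inflect(item):
    """Hypothetical grammar rule: map a (word, category, number) tuple to a string."""
    word, category, number = item
    if (category, number) == ("noun", "plural"):
        return word + "s"
    return word

def compose(*funcs):
    """Higher-order function: return a function applying funcs left to right."""
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)

# A new rule built from existing ones, then mapped over a list of n-tuples.
render = compose(inflect, str.upper)
words = list(map(render, [("bond", "noun", "plural"), ("agent", "noun", "singular")]))
print(words)  # ['BONDS', 'AGENT']
```

Because rules are ordinary values, adding a rule means adding a function rather than adding states to an automaton.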

  8. Computational and Functional Grammar (FCG)

  9. Set of parametric n-tuples
     The symbol «_» denotes an n-tuple element whose value may be ignored when defining the transformation function.
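One possible reading of the «_» symbol, sketched in Python under assumptions: the sentinel `ANY` and the helper `matches` are hypothetical names introduced here for illustration and are not part of the presentation.

```python
# Hypothetical sentinel standing in for the slide's «_» symbol.
ANY = object()

def matches(pattern, value):
    """Return True if the n-tuple `value` fits `pattern`; ANY slots are ignored."""
    if len(pattern) != len(value):
        return False
    return all(p is ANY or p == v for p, v in zip(pattern, value))

print(matches(("noun", ANY), ("noun", "plural")))  # True
print(matches(("verb", ANY), ("noun", "plural")))  # False
```

A transformation function can then be defined per pattern, with the ignored positions left free.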

  10. Atoms
     • Attributive characteristics of natural language words
     • A specific instrument for simplifying the analysis of natural language structures
     • All possible attributive characteristics of language structures are therefore defined in the Grammar

  11. FCG Example #1
     Atoms = {noun, singular, plural}
     func({X, noun, singular}) → X
     func({X, noun, plural}) → X ++ «s»
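Example #1 can be transcribed almost literally into Python. This is a sketch rather than the authors' implementation: n-tuples become Python tuples, atoms become strings, and the `ValueError` for unmatched input is an assumption, since the slide does not say what happens when no rule applies.

```python
ATOMS = {"noun", "singular", "plural"}

def func(item):
    """Return the surface form of a (word, category, number) tuple."""
    word, category, number = item
    if category == "noun" and number == "singular":
        return word            # func({X, noun, singular}) -> X
    if category == "noun" and number == "plural":
        return word + "s"      # func({X, noun, plural}) -> X ++ "s"
    raise ValueError(f"no rule matches {item!r}")  # assumed behaviour

print(func(("solvent", "noun", "singular")))  # solvent
print(func(("solvent", "noun", "plural")))    # solvents
```

As on the slide, only regular plurals are covered; irregular forms would need additional rules.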

  12. FCG Example #2
     Atoms = {noun, verb, plural, singular, ok, not ok}
     f({noun, _}) → {ok}
     f({verb, _}) → {not ok}
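Example #2 can be sketched in Python as follows. The second tuple element is unpacked into `_` and never inspected, mirroring the ignored position on the slide. Returning one-element tuples for {ok} and {not ok}, and raising `ValueError` when no rule applies, are assumptions made for this illustration.

```python
def f(item):
    """Accept nouns and reject verbs, whatever their grammatical number."""
    category, _ = item          # the second element is deliberately ignored
    if category == "noun":
        return ("ok",)          # f({noun, _}) -> {ok}
    if category == "verb":
        return ("not ok",)      # f({verb, _}) -> {not ok}
    raise ValueError(f"no rule matches {item!r}")  # assumed behaviour

print(f(("noun", "plural")))    # ('ok',)
print(f(("verb", "singular")))  # ('not ok',)
```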

  13. Parallel text analysis and comparison
     • Task: transfer of interlanguage structures from one language into another
     • Example: patent claims (in chemical technologies) in German and English

  14. Parallel texts example
     • Claim in German: Verfahren zur Epoxidierung einer organischen Verbindung mit wenigstens einer C-C-Doppelbindung mit Wasserstoffperoxid in Gegenwart wenigstens einer katalytisch aktiven Verbindung und wenigstens eines Lösungsmittels, dadurch gekennzeichnet, dass ein Produktgemisch umfassend α-Hydroperoxyalkohole unter Einsatz wenigstens eines Reduktionsmittels reduziert wird.
     • Claim in English: A process for the epoxidation of an organic compound having at least one C-C double bond by means of hydrogen peroxide in the presence of at least one catalytically active compound and at least one solvent, wherein a product mixture comprising α-hydroperoxyalcohols is reduced using at least one reducing agent.

  15. Parallel texts transformations
     • (a) Verfahren zur Epoxidierung → A process for the epoxidation
       N [verb, nom, neutr, sg] + Prep [zu+der, dat, comp, fem, sg] + N [dat, fem, sg] → Art [indef, sg] + N [com, sg] + Prep + Art [def, 0] + N [com, sg]
     • (b) ein Produktgemisch → a product mixture
       Art [indef, masc, nom, sg] + N [comp, nom, neutr, sg] → Art [indef, sg] + N [com, sg] + N [com, sg]
     • (c) dadurch gekennzeichnet → wherein
       Pron + Part [II f, masc, sg] → Adv

  16. (b) ein Produktgemisch → a product mixture
     v({«ein», art, indef, masc, nom, sg}) → «a»,
     v({«Produktgemisch», noun, comp, nom, neutr, sg}) → «product mixture»;
     f_german-english({X1, art, indef, masc, nom, sg}, {X2, noun, comp, nom, neutr, sg}) → v({X1, art, indef, masc, nom, sg}) ++ v({X2, noun, comp, nom, neutr, sg});
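Transformation (b) can be sketched in Python as a lexical lookup `v` plus a structural rule. This is an illustration under assumptions, not the authors' code: the dictionary `V`, the name `f_german_english`, the `ValueError` on non-matching input, and the space inserted between the two translated words are all choices made here (the slide concatenates with `++` and does not show a separator). The noun is spelled Produktgemisch, as in the slide title.

```python
# Lexical transfer function v as a lookup table; both entries come from the slide.
V = {
    ("ein", "art", "indef", "masc", "nom", "sg"): "a",
    ("Produktgemisch", "noun", "comp", "nom", "neutr", "sg"): "product mixture",
}

def v(item):
    """Translate one annotated German word; raises KeyError if it is not in V."""
    return V[item]

def f_german_english(article, noun):
    """Transfer an indefinite article + compound noun pair into English."""
    # The rule applies only to these attribute patterns; X1 and X2 are free.
    if article[1:] != ("art", "indef", "masc", "nom", "sg"):
        raise ValueError("first argument does not match the article pattern")
    if noun[1:] != ("noun", "comp", "nom", "neutr", "sg"):
        raise ValueError("second argument does not match the noun pattern")
    return v(article) + " " + v(noun)  # space separator is an assumption

result = f_german_english(
    ("ein", "art", "indef", "masc", "nom", "sg"),
    ("Produktgemisch", "noun", "comp", "nom", "neutr", "sg"),
)
print(result)  # a product mixture
```

Keeping the lexical table separate from the structural rule means the same rule covers any article and noun pair with these attributes once `V` is extended.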

  17. Conclusions
     • An approach to representing natural language grammar as functions in the mathematical sense
     • Opportunities for applying functional programming tools to building transfer systems
     • Practical application of the approach: parallel text analysis and comparison

  18. Next steps
     • Adapting existing representations of natural language grammars to functional form
     • Creating a problem-oriented functional programming system
     • Enhancing functional programming tools to meet the needs and tasks of computational linguistics

  19. THANK YOU FOR YOUR ATTENTION!
