Applications of Natural Language Processing

Applications of Natural Language Processing. Course 8 – 26 April 2012. Diana Trandabăț, dtrandabat@info.uaic.ro. Content: computational lexicography, ubiquitous computing. Computational lexicography: exploiting published dictionaries for use in new computer programs.


Presentation Transcript


  1. Applications of Natural Language Processing • Course 8 – 26 April 2012 • Diana Trandabăț, dtrandabat@info.uaic.ro

  2. Content • Computational lexicography • Ubiquitous computing

  3. Computational lexicography • Exploiting published dictionaries for use in new computer programs • Using computer programs to create new dictionaries

  4. Using dictionaries for computational purposes • Inventory of the words of a language + tokenization, lemmatization • Word class recognition (noun vs. verb vs. adj.) • but dictionaries don’t give comparative frequencies • Word sense disambiguation • assumes that dictionary sense distinctions are reliable. • dictionaries don’t give comparative frequencies!

  5. Dictionaries before corpora • Based on collections of citations (from literary texts) • In some dictionaries, examples were – and are – based on introspection, not taken from actual texts. • Definitions in dictionaries of the future: associate meanings with words in context, not words in isolation.

  6. DTLR • Building and publishing the Thesaurus Dictionary of the Romanian Language (DTLR) took almost a century. • The last volume was published by the Publishing House of the Romanian Academy at the beginning of 2009. • In all, DTLR has 33 volumes, more than 15,000 pages and about 175,000 entries. • The dictionary was created in the traditional pencil-and-paper way, with citations collected from more than 2,500 volumes of written Romanian literature.

  7. eDTLR • The digital form of DTLR, including its sources in digital form and the software to access them • Steps in building eDTLR: • Preliminary processing of the paper version • Scanning • Image processing • Automatic recognition of symbols (OCR) • Correction phases • Parsing the entries • Correcting the structure • Linking the dictionary entries to sources

  8. Entry example in DTLR • VIVÁCE adj. invar., adv. I. Adj. invar., adv. 1. Adj. invar. (Livresc; despre oameni) Care are o vitalitate deosebită, manifestată prin rapiditate şi uşurinţă în mişcări. V. ager, agil, sprinten (1), vioi (1). Cf. frollo, v. 623, lm, gheţie, r.m., barcianu, alexi, w. Era mică: abia întrecea umărul lui Dănuţ, cu tocuri cu tot – dar prea vivace pentru a da răgaz ochiului să o cuprindă. teodoreanu, m. ii, 15, cf. scriban, d., dl, dm, m.d.enc., dex, dn3, drev. ♦ (Despre manifestările, fizionomia etc. oamenilor) Care dovedeşte, care exprimă vivacitate (1), însufleţire. Copilul îşi spune întîmplările, impresiunile, închipuirile foarte copilăreşte, adică în modul cel mai naiv şi vivace cîteodată. heliade, o. ii, 65. • 2. Adj. invar. (Livresc; despre oameni) Care este înzestrat cu o minte ageră, pătrunzătoare; perspicace, subtil (3); (despre mintea, inteligenţa oamenilor) care dovedeşte agerime, subtilitate. Cf. frollo, v. 623. Spirit vivace. lm. Prin mijlocirea iubitului meu profesor, Aron Pumnul, avui fericirea să fac… cunoştinţă cu renumiţii fraţi Hurmuzachi, mai de aproape… cu talentosul, vivaciul şi vîşcătoriul Alesandru. sbiera, f.s. 106. Nime dintre contimporeni nu ar pute contesta că mişcările politice şi culturale din anii 1848 şi 1871 îşi dătoresc fiinţa şi decursul, în bună parte, spiretelui vivaciu, scînteietoriu şi atîţătoriu al lui A. Hurmuzachi. id. ib. 238. • 3. Adj. invar. (Despre tempoul unei bucăţi muzicale sau, p. ext., despre ritmul versurilor) Foarte rapid, însufleţit. Cu cît se înmulţesc dactilii în exametru, cu atîta versul devine mai răpede, mai vivace şi mai uşor. heliade, o. ii, 164, cf. dsr. ◊ (Prin extensiune) Şantierul ardea în timp vivace, prin armonizarea tuturor focarelor într-un rug colosal. călinescu, s. 106. Dar dacă ar fi numai atît, – virtuozitatea absurdului rulat într-un tempo vivace, – în povestirile d-lui Mircea Damian, de bună seamă n-ar fi de ajuns. Şi ar fi mai ales periculos. perpessicius, m. iii, 201. • 4. Adv. (Indică modul de executare a unei bucăţi muzicale) În tempo foarte rapid între allegro şi presto; vivo. Cf. enc.rom., cade, dl, dm, m.d.enc., dex, dn3, d.muz., dsr. • II. Adj. invar. 1. (Astăzi rar; despre fiinţe) Care poate trăi mult timp; (învechit) vieţuielnic (2), (învechit, rar) vieţuial (v. vieţual2). V. rezistent, robust. Cf. prot. – pop., n.d., pontbriant, d. Corbul este un animal vivace. costinescu, cf. lm, resmeriţă, d., şăineanu, d.u., cade, scriban, d. ◊ Fig. Naţionalitatea georgiană, din care mingrelianii sînt o simplă ramură, nu poate fi de aceeaşi ginte cu anticii fasiani,… ci derivă dintr-o altă tulpină mai vînoasă, mai vivace, mai rezistinte, a cării aşezare în văile Caucazului… este posterioară epocei lui Ipocrat. hasdeu, i.c. i, 173. • 2. (Despre plante, mai ales despre plantele ierbacee de cultură sau despre părţi ale acestora) Care trăieşte mai mulţi ani (fără a fi nevoie de o nouă însămînţare); care rodeşte timp de mai mulţi ani la rînd; peren. Streliţia reginei, plantă foarte mîndră…, vivace, deşi ierboasă, cere… dese udări vara. brezoianu, a. 432/4, cf. 448/7. Vizdeiul... este o plantă prea vivace; dăinirea ei… se întinde adesea pînă la doisprezece sau cincisprezece ani. id. r. 199/3, cf. 207/23. Sînt unele rădăcini care trăiesc numai un an (anuale); altele trăiesc doi ani (bisanuale), pe cînd rădăcinele arburilor care trăiesc mai mulţi ani se numesc vivace. barasch, i.n. 107/20, cf. costinescu. • 3. Fig. (Astăzi rar) Persistent, durabil. Cîteva din prejudicii sînt vivace, nu se pot lesne desfiinţa. costinescu. Suflarea trecutului e încă atît de vivace în inima ei, încît o va conduce mereu cu aripele destinse către viitor. odobescu, s. ii, 261. Prejudiciile sînt vivaci. şăineanu, d.u. O datină vivace. cade, cf. scriban, d. • – Pl.: (neobişnuit) vivaci. – Şi: (învechit, rar) viváci adj. • – Din (I) it. vivace, lat. vivax, -acis, (II) fr. vivace, lat. vivax, -acis.

  9. Entry example in DTLR • VIVÁCE adj. invar., adv. • I. Adj. invar., adv. • 1. Adj. invar. Gloss Examples • ♦ Gloss Examples • 2. Adj. invar. Gloss Examples • 3. Adj. invar. Gloss Examples • 4. Adv. Gloss Examples • II. Adj. invar. 1. Gloss Examples • 2. Gloss Examples • 3. Fig. Gloss Examples • – Pl.: (neobişnuit) vivaci. – Şi: (învechit, rar) viváci adj. • – Din (I) it. vivace, lat. vivax, -acis, (II) fr. vivace, lat. vivax, -acis.

  10. Dictionary entry parsing • The Dictionary is parsed using the following components: • A set of marker classes: a marker is a boundary for a specific linguistic category; • A hypergraph-like hierarchy that establishes the dependencies among the marker classes; • A searching (parsing) algorithm. • Once a configuration is defined, parsing implies: • identifying markers in the text to be parsed, • recognizing the marked text structures • classifying them according to the marker sequences within the pre-established hierarchy • settling the dependencies and correlations among the parsed textual structures.
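The parsing steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the eDTLR parser: it assumes the markers have already been identified, and it replaces the hypergraph-like hierarchy with a simple linear ranking (RANK, a hypothetical name) in which Roman numerals dominate Arabic numerals, which dominate diamonds. The marker sequence is the skeleton of the VIVÁCE entry from slide 9.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed linear ranking (lower = nearer the root). The slides describe a
# hypergraph-like hierarchy; this sketch simplifies it to a total order.
RANK = {"entry": 0, "roman": 1, "arabic": 2, "diamond": 3}


@dataclass
class Node:
    kind: str
    label: str
    children: List["Node"] = field(default_factory=list)


def build_sense_tree(headword, markers):
    """Attach each (kind, label) marker, given in text order, to its parent."""
    root = Node("entry", headword)
    stack = [root]  # path from the root to the most recently opened node
    for kind, label in markers:
        # Close every open node that does not rank strictly above the new one.
        while RANK[stack[-1].kind] >= RANK[kind]:
            stack.pop()
        node = Node(kind, label)
        stack[-1].children.append(node)
        stack.append(node)
    return root


def outline(node, depth=0):
    """Return the tree as indented lines, one per node."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Marker skeleton of the VIVÁCE entry (slide 9).
markers = [
    ("roman", "I."), ("arabic", "1."), ("diamond", "♦"),
    ("arabic", "2."), ("arabic", "3."), ("arabic", "4."),
    ("roman", "II."), ("arabic", "1."), ("arabic", "2."), ("arabic", "3."),
]
tree = build_sense_tree("VIVÁCE", markers)
print("\n".join(outline(tree)))
```

The stack holds the currently open senses, so a new "II." closes everything opened under "I." before attaching to the entry itself. A real hierarchy with optional or interchangeable levels would need a richer comparison than a single integer rank.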

  11. Marker classes • Sense tree parsing markers • The capital letter marker class (A., B., etc.) • The Roman numeral marker class (I., II., etc.) • The Arabic numeral marker class (1., 2., etc.) • The filled diamond and the empty diamond marker class • The lowercase letter markers a), b), c) • Definitions parsing markers • Morphological definitions; • Gloss definitions; • Phrase-based definitions; • Collocation definitions; • Examples supporting various specific meanings of a certain definition.
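As a rough illustration, the sense-tree marker classes listed above could be recognized with regular expressions. The patterns below are assumptions made for this sketch, not the ones used in eDTLR: the printed dictionary also relies on typography (bold, small caps) that plain text loses, and the capital-letter class is restricted here to A-H to avoid clashing with Roman numerals such as I., V. and X.

```python
import re

# Assumed patterns, tried in this order. Typographic cues from the printed
# dictionary are not available in plain text, so these are approximations.
MARKER_PATTERNS = [
    ("roman", r"(?:I{1,3}|IV|V|VI{0,3}|IX|X)\."),
    ("capital", r"[A-H]\."),          # restricted range: an assumption
    ("arabic", r"\d{1,2}\."),
    ("filled_diamond", "♦"),
    ("empty_diamond", "◊"),
    ("lowercase", r"[a-z]\)"),
]

# One alternation with a named group per class; the lookbehind rejects
# matches that start in the middle of a word or number.
MARKER_RE = re.compile(
    "|".join(r"(?P<%s>(?<!\w)%s)" % (name, pat) for name, pat in MARKER_PATTERNS)
)


def find_markers(text):
    """Return (marker_class, matched_text) pairs in text order."""
    return [(m.lastgroup, m.group()) for m in MARKER_RE.finditer(text)]


sample = ("I. Adj. invar., adv. 1. Adj. invar. Gloss ♦ Gloss 2. Adj. invar. Gloss "
          "II. Adj. invar. 1. Gloss a) Gloss ◊ Gloss")
for kind, marker in find_markers(sample):
    print(kind, marker)
```

On real entries these patterns over-generate: in the VIVÁCE entry, "V." also introduces cross-references, so a practical recognizer would need context checks beyond the regular expression.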

  12. Parsed entry

  13. Ubiquitous computing • The word "ubiquitous" can be defined as "existing or being everywhere at the same time," "constantly encountered," and "widespread." • When applying this concept to technology, the term ubiquitous implies that technology is everywhere and we use it all the time. • In ubiquitous computing (ubicomp), computers become a helpful but invisible force, assisting the user in meeting his or her needs without getting in the way. • Also described as pervasive computing, ambient intelligence, everyware, or physical computing.

  14. Common examples • A domestic ubiquitous computing environment might interconnect lighting and environmental controls with personal biometric monitors woven into clothing so that illumination and heating conditions in a room might be modulated, continuously and imperceptibly. • Another common scenario posits refrigerators "aware" of their suitably tagged contents, able to both plan a variety of menus from the food actually on hand, and warn users of stale or spoiled food.

  15. Context-Awareness • Computers are able to understand a user’s current situation and offer services, resources, or information relevant to the particular context. • The attributes of context may include the user’s location, past activity, affective state, current date and time, other objects, etc.
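A toy sketch of how such context attributes might be represented and used. The Context fields mirror the attributes listed on the slide; the rules, service names and thresholds in suggest_services are invented for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Context:
    location: str
    past_activity: str
    affective_state: str
    now: datetime


def suggest_services(ctx: Context) -> List[str]:
    """Map a context to services; every rule here is a made-up example."""
    suggestions = []
    if ctx.location == "kitchen" and 17 <= ctx.now.hour < 21:
        suggestions.append("show dinner recipes")
    if ctx.affective_state == "tired":
        suggestions.append("dim the lights")
    if ctx.past_activity == "running":
        suggestions.append("log workout")
    return suggestions


ctx = Context("kitchen", "running", "calm", datetime(2012, 4, 26, 18, 30))
print(suggest_services(ctx))
```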

  16. Natural Interaction • The idea: to supply services, resources, or information to a user without the user having to think about the rules of how to use the computer to get them. • In this way, the user is not preoccupied with the dual tasks of using the computer and getting the services, resources, or information. • Contemporary devices that lend some support to this latter idea include mobile phones, digital audio players, RFIDs, GPS, and interactive whiteboards.

  17. Requirements (Team: max 1 person; Deadline: 3 May) • 1) Intelligent refrigerator: plan a menu from the food on hand • Input: set of recipes, list of available food • Output: possible recipes using available food • Bonus: also offer recipes that are missing one ingredient. • 2) Create a list of patterns for at least 15 commands for intelligent house monitoring. Examples: • open {window/lights} • close {tv/window/air conditioning}, etc. • Points are given for originality and task fitness.
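One possible starting point for both tasks, as a hedged sketch rather than a reference solution. The function names (plan_menu, expand_pattern), the data layout and the sample recipes are all assumptions: recipes are taken to be a mapping from recipe name to ingredient list, and a command pattern is taken to be a single verb followed by slash-separated objects in braces, as in the examples on the slide.

```python
import re


def plan_menu(recipes, available, max_missing=0):
    """Return {recipe: missing ingredients} for recipes missing at most
    max_missing ingredients (max_missing=1 covers the bonus)."""
    pantry = {item.lower() for item in available}
    menu = {}
    for name, ingredients in recipes.items():
        missing = sorted({i.lower() for i in ingredients} - pantry)
        if len(missing) <= max_missing:
            menu[name] = missing
    return menu


def expand_pattern(pattern):
    """Expand 'verb {a/b/c}' into the concrete commands it stands for."""
    m = re.fullmatch(r"(\w+) \{(.+)\}", pattern)
    if m is None:
        raise ValueError("unsupported pattern: %r" % pattern)
    verb, objects = m.group(1), m.group(2).split("/")
    return ["%s %s" % (verb, obj.strip()) for obj in objects]


# Invented sample data.
recipes = {
    "omelette": ["eggs", "butter", "cheese"],
    "pancakes": ["eggs", "flour", "milk"],
    "salad": ["tomato", "cucumber", "oil"],
}
available = ["Eggs", "butter", "cheese", "milk", "tomato"]

print(plan_menu(recipes, available))                 # recipes fully covered
print(plan_menu(recipes, available, max_missing=1))  # bonus: one missing
print(expand_pattern("close {tv/window/air conditioning}"))
```

Ingredient matching here is exact after lowercasing; quantities, synonyms and plural forms are ignored and would need to be handled separately.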

  18. Further reading • Computational lexicography • Marius Răschip, Dan Cristea, Corina Forăscu. (2008). eDTLR – Dicţionarul tezaur al limbii române în format electronic. In Lucrările Seminarului Internaţional al Uniunii Latine "Instrumente pentru asistarea traducerii", Academia Română, Bucureşti, 28-29 februarie 2008. • Neculai Curteanu, Alexandru-Mihai Moruz, Diana Trandabăț. (2008). Extracting Sense Trees from the Romanian Thesaurus by Sense Segmentation & Dependency Parsing. In Proceedings of the COLING 2008 Workshop on Cognitive Aspects of the Lexicon, pp. 55–63, Manchester. • Ubiquitous computing • Abowd, G.D., & Mynatt, E.D. (March 2000). Charting past, present, and future research in ubiquitous computing. ACM Transactions on Computer-Human Interaction, 7, pp. 29–58.

  19. Links • eDTLR project web page: https://consilr.info.uaic.ro/edtlr/wiki/index.php?title=Despre_proiect • Natural Habitat: http://www.informatics.sussex.ac.uk/research/projects/nathab/scenario.htm

  20. Thanks!
