
Stefania Spina, University for Foreigners Perugia, Italy

The Dictionary of Italian Collocations: Design and Integration in an Online Learning Environment. Part of the APRIL project (“Personalised web environment for language learning”).


Presentation Transcript


  1. The Dictionary of Italian Collocations: Design and Integration in an Online Learning Environment • Stefania Spina, University for Foreigners Perugia, Italy

  2. The Dictionary of Italian Collocations • Part of the APRIL project (“Personalised web environment for language learning”) • NLP resources as a support for the lexical competence of students of Italian within a Virtual Learning Environment (VLE). LREC 2010 - Stefania Spina - The Dictionary of Italian Collocations

  3. Presentation outline • background and motivation • reference corpus • methodology • dictionary compilation • integration within the VLE

  4. Background • Complexity of multi-word units (MWU): • different syntactic and semantic profiles • prototypical features: • semantic (non-)compositionality • (non-)substitutability of components by semantically similar words • (non-)insertion of external items • continuum rather than definite categories

  5. Motivation: collocations in SLA • improve learners' fluency • examples from Italian learner corpora • preoccupata per l’esame vado a prendere una doccia (Vietnam) • Fare la doccia “take a shower” • ho dimenticato la macchina di fotografia (China) • Macchina fotografica “camera” • non-native speakers and L2 vocabulary: first single words, then more extended chunks • tendency to overuse the creative combination of isolated words • Sinclair’s open choice principle

  6. DICI • collocations require specific pedagogical attention • Dictionary of Italian Collocations (DICI) • it is corpus-based; • it is a learner-oriented tool: list of the most common Italian collocations, classified on a frequency basis; • it is also based on statistical methodologies (dispersion in the different textual genres represented in the corpus).

  7. Reference corpus • Perugia corpus: POS-tagged, lemmatized

  8. Extraction based on POS sequences • Analysis of existing list of collocations: • 150 different POS sequences • 10 most productive (75%)

  9. Experimental methodology: 4 steps • extraction of candidate collocations from corpus; • filtering of the candidate collocations: frequency; • filtering of the candidate collocations: dispersion; • filtering of the candidate collocations: manual • 6 POS sequences • 12-million-word sample • 4 corpus sections

  10. Collocations extraction + frequency • IMS Corpus Workbench • removing all the candidates with frequency = 1 • 41643 collocations
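
The extraction and frequency steps can be sketched roughly as follows. This is a minimal illustration in plain Python, not the IMS Corpus Workbench queries used in the project. The tag names, the two POS patterns, and the sample tokens are hypothetical; the slides do not list the six sequences actually used.

```python
from collections import Counter

# Hypothetical POS patterns; the six sequences used in the study are not given.
PATTERNS = [("VERB", "DET", "NOUN"), ("NOUN", "ADJ")]


def extract_candidates(tagged_tokens, patterns=PATTERNS):
    """Count lemma sequences whose POS tags match one of the patterns.

    tagged_tokens is a list of (lemma, pos) pairs from a lemmatized,
    POS-tagged corpus.
    """
    counts = Counter()
    for pattern in patterns:
        size = len(pattern)
        for i in range(len(tagged_tokens) - size + 1):
            window = tagged_tokens[i:i + size]
            if tuple(pos for _, pos in window) == pattern:
                counts[tuple(lemma for lemma, _ in window)] += 1
    return counts


def drop_hapaxes(counts):
    """Frequency filter: discard candidates that occur only once."""
    return {cand: freq for cand, freq in counts.items() if freq > 1}


# Invented sample, for illustration only.
sample = [
    ("fare", "VERB"), ("il", "DET"), ("doccia", "NOUN"),
    ("e", "CONJ"),
    ("fare", "VERB"), ("il", "DET"), ("doccia", "NOUN"),
    ("con", "PREP"), ("macchina", "NOUN"), ("fotografico", "ADJ"),
]
candidates = extract_candidates(sample)
kept = drop_hapaxes(candidates)
print(kept)
```

Matching on lemmas rather than surface forms lets inflected variants such as "fatto la doccia" and "fare la doccia" count towards the same candidate.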

  11. Dispersion • Examples: • Aggrottare la fronte “to frown” (fiction) • Vincere le elezioni “to win the elections” (press) • Dare una definizione “to give a definition” (academic prose)

  12. Dispersion • Juilland’s D value (Juilland & Chang-Rodriguez, 1964)

  13. Dispersion + frequency • D value: combined with frequency = usage • U = FD • Usage value ≥ 2: 2047 candidate collocations • Manual selection. Final result: • list of 1553 word combinations = dictionary entries
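
Slide 12 names Juilland's D without showing the formula. The sketch below assumes the standard definition, D = 1 − V/√(n − 1), where V is the coefficient of variation (population standard deviation divided by the mean) of the frequencies in n equally sized corpus sections. It then combines D with the total frequency F as U = F·D, as on slide 13. Function names and the example counts are hypothetical.

```python
import math


def juilland_d(section_freqs):
    """Juilland's D for frequencies observed in equally sized sections.

    Returns 1.0 for a perfectly even distribution and 0.0 when all
    occurrences fall in a single section.
    """
    n = len(section_freqs)
    if n < 2:
        raise ValueError("at least two corpus sections are required")
    mean = sum(section_freqs) / n
    if mean == 0:
        return 0.0  # the item never occurs, so there is no dispersion to measure
    variance = sum((f - mean) ** 2 for f in section_freqs) / n
    coeff_variation = math.sqrt(variance) / mean
    return 1 - coeff_variation / math.sqrt(n - 1)


def usage(section_freqs):
    """Usage coefficient U = F * D, with F the total frequency."""
    return sum(section_freqs) * juilland_d(section_freqs)


# Invented counts over 4 sections, for illustration only.
even = [5, 5, 5, 5]      # spread evenly across sections
skewed = [8, 0, 0, 0]    # confined to one section
print(usage(even), usage(skewed))
```

Under this definition a candidate confined to one genre receives a usage value close to zero whatever its raw frequency, so it falls below the U ≥ 2 threshold.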

  14. Collocations list

  15. Compilation of the Dictionary • Lexical database enriched with two kinds of data: • visible to the learner (client output) • definition, examples, part-of-speech, syntactic context of occurrence of collocations • to be processed by other applications (server) • internal syntactic configuration for automatic recognition

  16. DB integration in the VLE • Virtual Learning Environment: • web application specifically devoted to language learning • LELE (Linguistically-Enhanced Learning Environment) • provide language learners with additional NLP resources, in order to improve their linguistic competence • receptive and productive learning activities concerning the recognition and the active use of collocations

  17. LELE Features • to automatically recognize and highlight multi-word units in written Italian texts; • to show additional linguistic information about the selected collocations; • to generate collocation tests for collocational competence assessment of second language learners. • …
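
The first feature, recognizing and highlighting multi-word units, might be approached along these lines. This is only a sketch and not the LELE implementation: it assumes the text has already been tokenized and lemmatized by a tagger, and that dictionary entries are stored as lemma sequences. The entries, the sample sentence, and the bracket markers are hypothetical.

```python
def highlight(tokens, entries, open_mark="[", close_mark="]"):
    """Wrap known collocations in markers.

    tokens is a list of (surface_form, lemma) pairs; entries is a set of
    lemma tuples. Longer entries are tried first at each position.
    """
    sizes = sorted({len(entry) for entry in entries}, reverse=True)
    out = []
    i = 0
    while i < len(tokens):
        for size in sizes:
            window = tokens[i:i + size]
            lemmas = tuple(lemma for _, lemma in window)
            if len(window) == size and lemmas in entries:
                surface = " ".join(form for form, _ in window)
                out.append(open_mark + surface + close_mark)
                i += size
                break
        else:
            # No entry starts at this position: copy the token unchanged.
            out.append(tokens[i][0])
            i += 1
    return " ".join(out)


# Hypothetical dictionary entries and pre-tagged input.
entries = {("fare", "il", "doccia"), ("macchina", "fotografico")}
text = [("Ho", "avere"), ("fatto", "fare"), ("la", "il"), ("doccia", "doccia")]
highlighted = highlight(text, entries)
print(highlighted)
```

This version only finds contiguous sequences. Collocations that admit inserted material, such as "fare una lunga doccia", would need the internal syntactic configuration mentioned on slide 15.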

  18. LELE scheme [diagram; labels: VLE, DB + tagger, browser, server, client]

  19. Conclusions • Next steps: • apply the same methodology to the whole corpus, for all the 10 selected POS sequences • test of the LELE system with students: starting January 2011 • Further research • refine statistical measures • assign collocations to different levels of competence • other tools (productive tasks)

  20. Stefania Spina, E-learning and Language Technologies, University for Foreigners Perugia, Italy, stefania.spina@unistrapg.it, http://april.unistrapg.it

  21. References • Juilland, A. & Chang-Rodriguez, E. (1964). Frequency Dictionary of Spanish Words. The Hague: Mouton & Co. • Meunier, F. & Granger, S. (2008). Phraseology in foreign language learning and teaching. Amsterdam: John Benjamins. • Nesselhauf, N. (2005). Collocations in a learner corpus. Amsterdam: John Benjamins. • Pazos Bretaña, M. & Pamies Bertrán, A. (2008). Combined statistical and grammatical criteria. In S. Granger & F. Meunier (Eds.), Phraseology. An interdisciplinary perspective. Amsterdam: John Benjamins, pp. 391-406.


  23. Background: prototypical features • semantic (non-)compositionality: tagliare la corda “run away” vs. aprire la porta “open the door” • (non-)substitutability: {fare|porre|rivolgere|formulare} una domanda “ask a question” vs. camera oscura “dark room”, *stanza oscura • (non-)insertion of external items: fare una lunga, calda, riposante doccia “take a long, hot, restful shower” vs. sistema *molto operativo “operating system”
