
LT4EL - Integrating Language Technology and Semantic Web techniques in eLearning

LT4EL - Integrating Language Technology and Semantic Web techniques in eLearning. Lothar Lemnitzer, GLDV AK eLearning, 11 September 2007. LT4eL - Language Technology for eLearning. Start date: 1 December 2005. Duration: 30 months. Partners: 12. EU financing: 1.5 million euro.



  1. LT4EL - Integrating Language Technology and Semantic Web techniques in eLearning Lothar Lemnitzer GLDV AK eLearning, 11. September 2007

  2. LT4eL - Language Technology for eLearning • Start date: 1 December 2005 • Duration: 30 months • Partners: 12 • EU financing: 1.5 million euro • Project type: STREP IST 027391

  3. LT4eL - Partners • Utrecht University, The Netherlands (coordinator) • University of Hamburg, Germany • University “Al.I.Cuza” of Iasi, Romania • University of Lisbon, Portugal • Charles University Prague, Czech Republic • IPP, Bulgarian Academy of Sciences, Bulgaria • University of Tübingen, Germany • ICS, Polish Academy of Sciences, Poland • Zürich University of Applied Sciences Winterthur, Switzerland • University of Malta, Malta • Eidgenössische Technische Hochschule Zürich, Switzerland • Open University, United Kingdom

  4. LT4eL - Objectives -1- • Scientific and technological objectives • Integration of language technology resources and tools in eLearning • Integration of semantic knowledge in eLearning • Improved (multilingual) retrieval of learning material

  5. LT4eL - Languages • Bulgarian • Czech • Dutch • German • Maltese • Polish • Portuguese • Romanian • English

  6. LT4eL- Objectives -2- • Political objectives • Support multilinguality • Knowledge transfer • Awareness raising • Exploitation of resources • Facilitate access to education

  7. Tasks • Creation of an archive of learning objects • Semi-automatic metadata generation driven by NLP tools: • Keyword extractor • Definition extractor • Enhancing eLearning with semantic knowledge • ontologies • Integration of functionalities in the ILIAS Learning Management System; • Validation of new functionalities in the ILIAS Learning Management System; • Address Multilinguality

  8. [Architecture diagram. Components shown: LMS with user profile; repository of user documents (PDF, DOC, HTML, SCORM, XML); Convertor 1 and Convertor 2 (SCORM and HTML documents, pseudo-structured basic XML); linguistic processor (lemmatizer, POS tagger, partial parser) for BG, CZ, DT, EN, GE, MT, PL, PT, RO; linguistically annotated XML; metadata (keywords); glossary; ontology with one lexicon per language; cross-lingual retrieval.]

  9. Creation of a learning objects archive • collection of the learning material (uploads & updates at http://consilr.info.uaic.ro/uploads_lt4el/ - password protected) • IST domains for the LOs: 1. Use of computers in education, with sub-domains: • 1.1 Teaching academic skills, with sub-domains: • 1.1.1 Academic skills • 1.1.2 Relevant computer skills for the above tasks (MS Word, Excel, PowerPoint, LaTeX, Web pages, XML) • 1.1.3 Basic computer skills (use of computer for beginners) (chats, e-mail, Internet) • 1.2 Impact of e-Learning on education 2. Calimera documents (parallel corpus developed in the Calimera FP5 project, http://www.calimera.org/ )

  10. Collection of learning materials and linguistic tools • normalization of the learning material • convertors from HTML/TXT to basic XML format • Inventory and classification of existing tools (http://consilr.info.uaic.ro/uploads_lt4el/tools/all.php?) relevant to: • the integration of language technology resources in eLearning • the integration of semantic knowledge • Inventory and classification of existing language resources • corpora and frequency lists: http://consilr.info.uaic.ro/uploads_lt4el/menu/all.php • lexica: http://www.let.uu.nl/lt4el/wiki/index.php/Lexica_Joint_Table

  11. [Architecture diagram, repeated from slide 8]

  12. Semi-automatic metadata generation with LT and NLP Aims: • supporting authors in the generation of metadata for LOs • improving keyword-driven search for LOs • supporting the development of glossaries for learning material

  13. Metadata • metadata is essential to make LOs visible for larger groups of users • authors are reluctant or not experienced enough to supply it • NLP tools will help them in that task • the project uses the LOM metadata schema as a blueprint

  14. Identification of keywords • Good keywords have a typical, non-random distribution in and across LOs • Keywords tend to appear more often at certain places in texts (headings etc.) • Keywords are often highlighted / emphasised by authors

  15. Modelling Keywordiness • Residual inverse document frequency used to model the inter-text distribution of keywords • Term burstiness used to model the intra-text distribution of keywords • Knowledge of text structure used to identify salient regions (e.g. headings) • Layout features of texts used to identify emphasised words and weight them higher
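
The two distributional measures on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's extractor: it uses the common textbook definitions (residual IDF as observed IDF minus the IDF expected under a Poisson model; burstiness as average occurrences per document containing the term), and the function name `keywordiness_stats` is hypothetical.

```python
import math
from collections import Counter

def keywordiness_stats(docs):
    """docs: list of token lists. Returns {term: (residual_idf, burstiness)}.

    Assumed definitions (not taken from the slides):
      idf          = -log2(df / N)
      expected idf = -log2(1 - exp(-cf / N))   # Poisson model
      residual idf = idf - expected idf
      burstiness   = cf / df
    """
    n_docs = len(docs)
    cf = Counter()  # collection frequency: total occurrences of each term
    df = Counter()  # document frequency: number of documents containing the term
    for tokens in docs:
        cf.update(tokens)
        df.update(set(tokens))
    stats = {}
    for term in cf:
        idf = -math.log2(df[term] / n_docs)
        expected_idf = -math.log2(1.0 - math.exp(-cf[term] / n_docs))
        stats[term] = (idf - expected_idf, cf[term] / df[term])
    return stats

docs = [["xml", "xml", "xml", "the"], ["the", "a"], ["the", "b"], ["the", "c"]]
stats = keywordiness_stats(docs)
```

In this toy collection "xml" is concentrated in one document, so it scores higher than "the" on both measures; how the project combined these scores with structure and layout features is not specified here.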

  16. Challenges • Treating multi word keywords (suffix arrays will be used to identify n-grams of arbitrary length) • Assigning a combined weight which takes into account all the aforementioned factors • Multilinguality
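
As a rough illustration of the suffix-array idea mentioned above, the sketch below finds word n-grams of any length that occur at least twice. It builds the suffix array naively by sorting, which is fine for a demonstration but not for large corpora; the function name `repeated_ngrams` and the `min_len` parameter are assumptions made for this example.

```python
def repeated_ngrams(tokens, min_len=2):
    """Return {ngram tuple: count} for word n-grams that occur at least twice."""
    # Naive suffix array: suffix start positions sorted by the suffix they begin.
    sa = sorted(range(len(tokens)), key=lambda i: tokens[i:])
    counts = {}
    for a, b in zip(sa, sa[1:]):
        # Length of the longest common prefix of two adjacent suffixes.
        lcp = 0
        while (a + lcp < len(tokens) and b + lcp < len(tokens)
               and tokens[a + lcp] == tokens[b + lcp]):
            lcp += 1
        # Every shared prefix of length >= min_len is a repeated n-gram.
        # k occurrences yield k - 1 adjacent pairs, so counting starts at 1.
        for n in range(min_len, lcp + 1):
            ngram = tuple(tokens[a:a + n])
            counts[ngram] = counts.get(ngram, 1) + 1
    return counts

text = "learning management system and learning management system or learning object"
counts = repeated_ngrams(text.split())
```

Because suffixes sharing a prefix are adjacent in the sorted order, no maximum n-gram length has to be fixed in advance.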

  17. Evaluation • Manually assigned keywords will be used to measure precision and recall of the keyword extractor • Human annotators will judge and rate the results of the extractor
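
The first evaluation measure can be made concrete with a minimal sketch, assuming keywords are compared as exact strings against the manually assigned set. The function name and the sample keywords are hypothetical.

```python
def precision_recall(extracted, gold):
    """Compare extracted keywords with manually assigned (gold) keywords."""
    extracted, gold = set(extracted), set(gold)
    hits = len(extracted & gold)
    # Precision: share of extracted keywords that annotators also assigned.
    precision = hits / len(extracted) if extracted else 0.0
    # Recall: share of manually assigned keywords that the extractor found.
    recall = hits / len(gold) if gold else 0.0
    return precision, recall

p, r = precision_recall(["xml", "html", "browser", "page"],
                        ["xml", "html", "markup"])
```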

  18. Identification of definitory contexts • Empirical approach based on linguistic annotation of LO • Identification of definitory contexts is language specific • Workflow • Definitory contexts are searched and marked in LOs (manually) • Local grammars are drafted on the basis of these examples • Linguistic annotation is used for these grammars • Grammars are applied to new LOs • Extraction of definitory context performed by Lxtransduce (University of Edinburgh - LTG)
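
To give a feel for what a local grammar over linguistic annotation does, here is a toy Python sketch that matches one English pattern ("X is a Y ...") on POS-tagged tokens. It is not Lxtransduce and not one of the project's grammars: the tag names, the single pattern, and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical pattern over "word/TAG" tokens:
# optional determiner, noun phrase (the term), copula, determiner, rest of sentence.
DEFINITION = re.compile(
    r"^(?:\S+/DET )?(?P<term>(?:\S+/(?:ADJ|NOUN) )*\S+/NOUN) "
    r"(?:is|are)/VERB \S+/DET (?P<definiens>.+)$"
)

def strip_tags(tagged_text):
    """Turn 'web/NOUN pages/NOUN' back into 'web pages'."""
    return " ".join(tok.rsplit("/", 1)[0] for tok in tagged_text.split())

def extract_definition(tagged):
    """tagged: list of (word, pos) pairs for one sentence.

    Returns (term, definiens) if the sentence matches the pattern, else None.
    """
    line = " ".join(f"{word}/{pos}" for word, pos in tagged)
    match = DEFINITION.match(line)
    if match is None:
        return None
    return strip_tags(match.group("term")), strip_tags(match.group("definiens"))

sentence = [("A", "DET"), ("browser", "NOUN"), ("is", "VERB"), ("a", "DET"),
            ("program", "NOUN"), ("for", "ADP"), ("viewing", "VERB"),
            ("web", "NOUN"), ("pages", "NOUN")]
result = extract_definition(sentence)
```

Since definitory contexts are language specific, a pattern like this one would have to be drafted separately for each language from the manually marked examples.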

  19. [Architecture diagram, repeated from slide 8]

  20. Ontology-based cross-lingual retrieval • Metadata can also be represented by ontologies • Creation of a domain ontology in the area of LOs • For consistency reasons we also employ an upper ontology (DOLCE) • Lexical material in all 9 languages is mapped onto the domain ontology and the upper ontology • The ontology will allow for multilingual retrieval of LOs
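
The retrieval idea can be sketched as follows: terms from each language's lexicon point to shared concepts, LOs are annotated with concepts, and a query in one language retrieves LOs in any language through the concept. All data below (lexicon entries, concept names, LO identifiers) is invented for the example, and the single-parent `SUBCLASS_OF` table is a simplifying assumption.

```python
# Hypothetical lexicon: (language, term) -> concept in the domain ontology.
LEXICON = {
    ("en", "web browser"): "Browser",
    ("de", "webbrowser"): "Browser",
    ("nl", "webbrowser"): "Browser",
    ("en", "program"): "Program",
}
# Hypothetical is-a links between concepts (child -> parent).
SUBCLASS_OF = {"Browser": "Program"}
# Hypothetical concept annotations of learning objects.
LO_CONCEPTS = {
    "lo_de_01": {"Browser"},
    "lo_en_07": {"Program"},
    "lo_pt_03": {"Email"},
}

def is_a(concept, ancestor):
    """True if concept equals ancestor or lies below it in the is-a hierarchy."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False

def retrieve(term, lang):
    """Return ids of LOs annotated with the query concept or one of its subconcepts."""
    concept = LEXICON.get((lang, term.lower()))
    if concept is None:
        return []
    return sorted(lo for lo, concepts in LO_CONCEPTS.items()
                  if any(is_a(c, concept) for c in concepts))

german_hits = retrieve("Webbrowser", "de")
english_hits = retrieve("program", "en")
```

Here the English query "program" also returns the German LO about browsers, because the match happens at the concept level and not on the surface string.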

  21. Domain Ontology creation • lexicon (vocabulary with natural language definitions) • simple taxonomy • thesaurus (taxonomy plus related terms) • relational model (unconstrained use of arbitrary relations) • fully axiomatized theory

  22. Domain Ontology • terminological dictionary in the chosen domain: - term in English - a short definition in English - translation of the term • definitions are formalized to reflect relations such as is-a, part-of, used-for • definitions are translated into OWL-DL • the aim is not a fully axiomatized theory, but a relational model of the domain • the connection to the upper ontology lets the concepts in the domain ontology inherit the axiomatization of the upper ontology

  23. Upper Ontology: DOLCE • the ontology should be constructed on a rigorous basis • it should be easy to represent in an ontology language such as RDF or OWL • there are domain ontologies constructed with respect to it • it can be related to lexicons - either by definition, or by an already existing mapping to some lexical resource

  24. [Architecture diagram, repeated from slide 8]

  25. Integration in ILIAS • Integration of LT4eL functionalities for semi-automated metadata generation, definitory context extraction and ontology supported extended data retrieval into a learning management system (prototype based on ILIAS LMS) • Developing and providing documentation for a standard-technology-based interface between the language technology tools and learning management systems

  26. Integration of functionalities [Diagram. Labels shown: development server (CVS) holding code and data for KW/DC, ontology and ILIAS; content portal with LOs; migration tool; nightly updates; Java web server (Tomcat) with application logic, web services (Axis), KW/DC/Onto Java classes and data, user interface (servlets/JSP); ILIAS server with nuSOAP and LOs; third-party tools. Annotations: use functionalities through SOAP; evaluate functionalities in ILIAS; evaluate functionalities directly.]

  27. Validation of enhanced LMS • The challenge is to answer these questions: • How does this compare with what can already be done with existing systems? • What added value is there? • What is the educational / pedagogic value of these functionalities? • The problem is to evaluate the functionality separately from issues of usability or unfamiliarity with the LMS platform. How can we expect users to identify any benefit?

  28. How can we expect users to identify any benefit? • Present them with tasks to complete using LMS • With no project functionality • With project functionality • Partial • Full • Identify potential users • Course Creators • Content Authors or Providers • Teachers • Students • studying in their own language • studying in a second language

  29. Create outline User Scenarios • We define scenarios, in this context, as • a story focused on a user or group of users which provides information on • the nature of the users, • the goals they wish to achieve and • the context in which the activities will take place. • They are written in ordinary language, and are therefore understandable to various stakeholders, including users. • They may also contain different degrees of detail.

  30. Conclusions • Improve retrieval of learning material • Facilitate construction of user specific courses • Improve creation of personalized content • Support decentralization of content management • Allow for multilingual retrieval of content

  31. Contact • www.lt4el.eu • Contact for information: Paola.Monachesi@let.uu.nl
