10 likes | 91 Views
Dynamic Content Discovery, Harvesting and Delivery, from Open Corpus Sources, for Adaptive Systems. Séamus Lawless Knowledge & Data Engineering Group, Trinity College Dublin slawless@cs.tcd.ie. The Problem.
E N D
Dynamic Content Discovery, Harvesting and Delivery, from Open Corpus Sources, for Adaptive Systems Séamus Lawless Knowledge & Data Engineering Group, Trinity College Dublin slawless@cs.tcd.ie The Problem Personalised elearning is being heralded as one of the grand challenges of next generation learning systems, in particular, its ability to support greater effectiveness, efficiency and student empowerment. However, a key problem with such systems is their reliance on bespoke content. The challenge for adaptive systems in scalably supporting personalised elearning is its ability to discover, harvest and deliver open corpus content to adaptive content services and personalised elearning systems. Research Goal and Scope Methodology The goal of this research is to provide dynamic contextual retrieval of open corpus content for eLearning systems. There are many sources of open corpus content, the primary examples of which are the Worldwide Web and digital content repositories. Open corpus content lacks consistency in its structuring. These inconsistencies are both semantic, relating to the vocabulary used to describe the content, and syntactic, relating to the metadata standards implemented. Metadata mappings to a canonical model using a fixed vocabulary ontology will be implemented to ensure consistency and accuracy in content discovery, analysis and description. Lexical analysis and metadata tag generation will be required for some open corpus content. The aesthetic qualities and presentation of retrieved content and IP/DRM are both deemed to be outside the scope of this research. It is proposed that a series of content requirements can be extracted from educational environments. These requirements are both technical, relating to content structure, metadata standards and taxonomies used, and semantic, relating to subject matter and pedagogical needs. A cache of candidate learning content is generated through a combination of web crawling and OAI harvesting of repositories. This content is then analysed and indexed to enable discovery. Upon creation of the content index a mapping will occur to a canonical metadata model using a fixed ontology of terms. Metadata tags will be created for content with insufficient descriptions. A query engine will take the content requirements and generate queries to run against the content index and identify suitable candidate content. Once suitable content is identified it will be harvested and delivered to a learning object generation environment for restructuring. System Architecture Status Work underway Focused Web Crawling System operational Initial attempts at generating an index of the content cache Soon to commence Initial attempt at requirements extraction from Learning Designs Metadata canonical model & vocabulary ontology creation The research is supported by IRCSET’s embark initiative and conducted in conjunction with the IBM Centre for Advanced Studies Student Program. For more information visit: www.cs.tcd.ie/~slawless