MERLIN (Metadata Enrichment for Libraries in a London Institutional Network): short overview
Martin Moyle, Digital Curation Manager, UCL Library Services, UK (m.moyle@ucl.ac.uk)
MERLIN Steering Group 1, 26 January 2010
Background
• JISC funded (Information Environment 09-11, Resource Discovery strand)
• Partners: UCL, ULCC, University of Nottingham
• Aims
  • use off-the-shelf text-mining tools to enrich repository functionality
  • refine this work with user feedback and roll out
• 18 months (April 2009 – September 2010)
Objectives
• Integrate NaCTeM tools in the SHERPA-LEAP consortial repository ‘LASSO’, deriving weighted keywords from source repositories.
• Modify LASSO to expose these mined terms for discovery.
• Work with users to ensure usefulness and user-friendliness of these modifications.
• Experiment with thesauri: can mined terms be used to support/enhance structured navigation?
• Roll out a MERLIN tool for plugging in to IRs.
• Full project evaluation.
• Dissemination.
Text-mining for repositories: why?
• Improve discoverability of repository content
• Cost-effective description: cheaper than subject cataloguing…
• Uses researchers' own vocabularies
• Supports interdisciplinarity
• Adds selectivity and weight to full text indexing
• May also be used as the basis for structured navigation
• Examples: how MERLIN outputs might look…
LASSO: with terms mined from full texts
• Terms are weighted by frequency.
• Use the slider to see more or fewer terms.
• Select a mined term...
• ...and a new set of records associated with that term is retrieved.
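The LASSO behaviour described above (frequency-weighted terms, a slider controlling how many are shown, and retrieval of the records behind a selected term) might be sketched roughly as follows. This is an illustration only, not the MERLIN or NaCTeM implementation: the sample data, the function names, and the use of raw counts as the weight are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical mined data: record id -> terms extracted from its full text.
mined_terms = {
    "rec1": ["text mining", "repositories", "text mining"],
    "rec2": ["repositories", "metadata"],
    "rec3": ["text mining", "metadata", "metadata", "metadata"],
}


def term_weights(mined):
    """Weight each term by its total frequency across all records (assumed scheme)."""
    weights = Counter()
    for terms in mined.values():
        weights.update(terms)
    return weights


def visible_terms(weights, slider):
    """Return the top `slider` terms, heaviest first; ties are broken alphabetically."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:slider]]


def records_for(mined, term):
    """Return the ids of records whose mined terms include `term`."""
    return sorted(rid for rid, terms in mined.items() if term in terms)


weights = term_weights(mined_terms)
print(visible_terms(weights, 2))            # slider set to show two terms
print(records_for(mined_terms, "metadata")) # records retrieved for a selected term
```

Moving the slider corresponds to changing the `slider` argument; selecting a term corresponds to calling `records_for` with that term.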
Progress / next steps
• Done
  • Initial text mining
  • Integration of mined dataset with LASSO data
  • LASSO customisation under way
• Tasks between now and Easter
  • User evaluation of LASSO; iterative improvement
  • Refinements to text mining; full integration into LASSO workflow
  • Begin thesaurus work...