
Robina Clayphan Interoperability Manager, EDLF

Europeana: Update on Metadata Mapping and Normalisation, Content Ingestion and Aggregation Activities. Robina Clayphan Interoperability Manager, EDLF. ECDL Workshop – Harvesting Metadata: Practices and Challenges September 30 2009.


Presentation Transcript


  1. Europeana: Update on Metadata Mapping and Normalisation, Content Ingestion and Aggregation Activities Robina Clayphan Interoperability Manager, EDLF ECDL Workshop – Harvesting Metadata: Practices and Challenges September 30 2009

  2. Introduction • A look at the metadata schema we use and the elements that must be in a standard form • The whole ingestion process • Summary of the aspects of and approach to aggregation

  3. Europeana Europeana brings together and makes available digital content from: • Four cultural heritage sectors • Museums, Archives, Libraries, Audio-visual archives • Twenty-nine countries • EU plus Norway and Switzerland • Twenty-six languages • Four types of material • Image, sound, video, text ….need for a metadata lingua franca…

  4. ESE V3.2 Europeana Semantic Elements (ESE) V3.2 developed for the prototype • A Dublin Core-based application profile • Cross-domain schema for heterogeneous data • Not intended to capture the full semantics of providers’ data • 37 Dublin Core terms – used principally to describe the objects • 12 Europeana-coined terms – used to support portal functionality • Consistent data is needed for the portal to work
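As an illustration of what a record combining Dublin Core terms with Europeana-coined terms might look like, here is a minimal Python sketch. The Dublin Core namespace is the standard one; the Europeana namespace URI, the `record` wrapper element, the `build_record` helper and all field values are assumptions made for the example and should be checked against the ESE V3.2 XML schema.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"       # standard Dublin Core namespace
ESE_NS = "http://www.europeana.eu/schemas/ese/"  # assumed ESE namespace URI

ET.register_namespace("dc", DC_NS)
ET.register_namespace("europeana", ESE_NS)


def build_record(dc_fields, europeana_fields):
    """Build one record: DC terms describe the object, Europeana terms serve the portal."""
    record = ET.Element("record")  # hypothetical wrapper element name
    for name, value in dc_fields.items():
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    for name, value in europeana_fields.items():
        ET.SubElement(record, f"{{{ESE_NS}}}{name}").text = value
    return record


# All values below are invented placeholders, not real Europeana data.
record = build_record(
    {"title": "Example title", "date": "1890-04-02", "type": "photograph"},
    {
        "provider": "Example Library",
        "type": "Image",
        "year": "1890",
        "language": "nl",
        "country": "netherlands",
        "isShownAt": "http://example.org/object/1",
    },
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Note that the provider's own `dc:type` value ("photograph") sits alongside the controlled Europeana type ("Image"): the first describes the object in the provider's terms, the second is what the portal relies on.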

  5. The Dublin Core elements

  6. Europeana elements

  7. Normalised elements • Language • ISO 639-1 standard two-character code. • Country • ISO 3166 standard • Year • Four-digit year from the Gregorian calendar (YYYY) • Generated where possible from the date supplied in <dc:date> • Provider • Controlled list of names, in the language of the provider • Type • Controlled list (in English) of four types: Text, Image, Sound, Video • mapped from the diverse types used in source data (by provider)
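The normalisation rules above could be sketched roughly as follows in Python. This is not Europeana's implementation: the function names, the regular expression and the `TYPE_MAP` vocabulary are all hypothetical, and the language check only tests the two-letter shape of the code rather than membership of the actual ISO 639-1 list.

```python
import re

# Hypothetical provider vocabulary mapped to the four controlled types.
TYPE_MAP = {
    "photograph": "Image",
    "painting": "Image",
    "book": "Text",
    "manuscript": "Text",
    "recording": "Sound",
    "film": "Video",
}

# A run of exactly four digits that is not part of a longer number.
YEAR_RE = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_year(dc_date):
    """Return the first four-digit year found in a dc:date string, or None."""
    match = YEAR_RE.search(dc_date or "")
    return match.group(1) if match else None


def normalise_type(source_type):
    """Map a provider-specific type onto Text, Image, Sound or Video, or None if unmapped."""
    return TYPE_MAP.get(source_type.strip().lower())


def looks_like_language_code(value):
    """Shape check only: two lowercase letters, as in ISO 639-1 codes."""
    return bool(re.fullmatch(r"[a-z]{2}", value))


print(extract_year("1890-04-02"))      # year taken from an ISO-style date
print(normalise_type(" Photograph "))  # provider term mapped to a controlled type
print(looks_like_language_code("nl"))
```

Returning `None` when no year or type can be derived mirrors the "where possible" wording on the slide: a value that cannot be normalised is left for the provider to resolve rather than guessed.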

  8. Mapping and Normalisation Three key reference documents for providers: • ESE Specification V3.2 • Normalisation Guidelines V1.2 • ESE V3.2 XML schema + explanatory text All available from the “Provide Content” section of the Europeana Group pages: http://group.europeana.eu/web/guest/provide_content

  9. Content Ingestion ……starting right from the beginning

  10. Global Europeana ingestion workflow

  11. Activity diagram: Steps I5 to I8

  12. Content Ingestion • Europeana has provided a Content Checker tool which has two parts: • The Content Ingestor • Allows uploading of a data set • Validation against the ESE V3.2 XML schema • Importing the data into the database • Indexing of data • Caching of thumbnails • The Test Portal • Separate from the operational portal • Allows provider to search for uploaded data
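The sequence of Content Ingestor steps (upload, validate, import, index, cache) followed by a Test Portal search can be pictured with a small in-memory Python sketch. It does not reproduce the actual Content Checker: the `DataSet` class, the function names and the record keys are hypothetical, and the required-field check is only a stand-in for real validation against the ESE V3.2 XML schema.

```python
from dataclasses import dataclass, field

REQUIRED = ("id", "provider", "type", "language", "country")  # assumed, not from the schema
ALLOWED_TYPES = {"Text", "Image", "Sound", "Video"}


@dataclass
class DataSet:
    set_id: str
    records: list                                  # uploaded records, as plain dicts
    database: dict = field(default_factory=dict)   # filled by the import step
    index: dict = field(default_factory=dict)      # word -> set of record ids
    thumbnails: set = field(default_factory=set)   # URLs marked as cached


def validate(ds):
    """Stand-in for schema validation: return a list of error messages."""
    errors = []
    for pos, rec in enumerate(ds.records):
        missing = [key for key in REQUIRED if not rec.get(key)]
        if missing:
            errors.append(f"record {pos}: missing {', '.join(missing)}")
        elif rec["type"] not in ALLOWED_TYPES:
            errors.append(f"record {pos}: unknown type {rec['type']!r}")
    return errors


def ingest(ds):
    """Run validate, import, index and cache in order; stop if validation fails."""
    errors = validate(ds)
    if errors:
        return errors
    for rec in ds.records:
        ds.database[rec["id"]] = rec                          # import
        for word in rec.get("title", "").lower().split():     # index
            ds.index.setdefault(word, set()).add(rec["id"])
        if rec.get("thumbnail_url"):                          # cache
            ds.thumbnails.add(rec["thumbnail_url"])
    return []


def search(ds, word):
    """Test-portal style lookup over the uploaded data set only."""
    return sorted(ds.index.get(word.lower(), set()))


demo = DataSet("demo01", [
    {"id": "r1", "title": "Harbour view", "provider": "Example Museum",
     "type": "Image", "language": "nl", "country": "netherlands",
     "thumbnail_url": "http://example.org/thumb/r1.jpg"},
    {"id": "r2", "title": "Harbour ledger", "provider": "Example Archive",
     "type": "Text", "language": "nl", "country": "netherlands"},
])
print(ingest(demo))
print(search(demo, "harbour"))
```

Stopping at the first failed validation reflects the order of the steps on the slide: nothing is imported, indexed or cached until the data set validates.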

  13. Content Ingestor Select “new data set” - the ingestor automatically creates a new ID – “null05” in this example

  14. Content Ingestor - upload

  15. Content Ingestor - validate

  16. Import

  17. Index

  18. Cache

  19. Test Portal - search

  20. Aggregation and the Content Strategy Move on to a look at various aspects of aggregation in Europeana – the need for it, the approach to it.

  21. Aggregation - terminology • A Content Provider • an organisation that provides metadata that enables access to its digital objects • An Aggregator • collects metadata from a group of content providers • transmits it to Europeana • helps content providers with guidance on conformance with Europeana norms • transforms metadata if necessary • supports the content providers with administration, operations and training

  22. Roles and benefits • Content providers • Know their content and data best – fewer mapping errors • Can look at the results before they are ingested into the operational system • Aggregators • Know the needs of the providers (domain, level) • Play a bridging role between providers and Europeana – single point of contact, conduit for information in both directions • Europeana • Supporting role for consultation, co-ordination, standardisation • Management of the 10 million objects • Offers the cross-domain and multi-lingual service

  23. Organisational Model

  24. Types of aggregator Matrix of aggregators: • cross-domain, single domain, thematic • level of operation – regional, national, European, global

  25. Why aggregation? • November 2008 – 5 million items in Europeana • July 2009 - content from over 1000 providers • July 2010 – target of 10 million items • Many individual organisations asking to contribute • Currently there are six projects that aggregate content for Europeana (amongst other objectives) • another three projects starting later this year • Europeana Group site at: http://group.europeana.eu/web/guest/home

  26. Why aggregation? • Labour-intensive administration and ingestion processes • Not due to the amount of data – but the number of organisations • Europeana is a small organisation! • Aggregation provides economies of scale allowing Europeana Office to remain relatively small Promoting aggregation and providing services and expertise to aggregators will be key to Europeana’s Content Strategy

  27. Aggregation activities • Aggregators survey • Establish shared issues and need for support • Formation of Aggregators group • Council of Content Providers and Aggregators is now part of Europeana Governance structure • Training for aggregators • Generic and bespoke training days as the need arises • Identifying potential aggregators • “EuropeanaLabs” for Aggregators • Test environment for content delivery and/or software development

  28. Aggregation activities • Handbook for aggregators. Content to be decided as part of survey but likely to cover: • Europeana source code, APIs, content checker etc. • Technical documentation for participating in Europeana • Templates and documentation for budget planning, fundraising, revenue generation, sustainability • Templates and documentation for administrative and organisational aspects of running an aggregator • Templates and documentation on IPR and European Licensing framework • Documentation for establishing political and network support • Templates and documentation for dissemination activities • Wiki for aggregator issues

  29. Thank you! robinaclayphan@kb.nl

  30. Thank you! robinaclayphan@kb.nl

  31. 1 isShownBy

  32. 2 isShownAt
