julie.verleyen@kb.nl , Operations Team Europeana Office, 4 February 2010 Information Day, Luxemburg. Joining EUROPEANA: Technical Requirements. Session content. Europeana.eu Service & Principles How to contribute content? Technical requirements in a nutshell Data ingestion workflow
julie.verleyen@kb.nl, Operations Team Europeana Office, 4 February 2010 Information Day, Luxemburg • Joining EUROPEANA: Technical Requirements
Session content • Europeana.eu • Service & Principles • How to contribute content? • Technical requirements in a nutshell • Data ingestion workflow • Resources: docs, tools, team • Technical evolution and developments • Releases • EuropeanaLabs.eu • Europeana and the projects • Group of current and coming projects • Collaboration requirements
Session content • Europeana.eu • Service • Principles • How to contribute content? • Technical requirements in a nutshell • Data ingestion workflow • Resources: docs, tools, team… • Technical evolution and developments • Roadmap: Rhine & Danube • EuropeanaLabs.eu • Europeana and the projects • Group of current and coming projects • Collaboration requirements
Europeana principles • Common access to digital cultural heritage objects across different domains • Complementing, not duplicating, the access already provided within each domain through specialised, locally based portals • Neutral view of the digital objects, independent of the view of any single traditional domain • Digital objects at the lowest possible level of granularity • As direct as possible access to the digitised object • Minimum click distance between a description and the object • Search and discovery of objects • Common central index of the objects’ metadata • Object-centric approach, not collection-centric approach
Europeana content • Europeana stores representations of digital objects and not the digital objects themselves • On Europeana side: • Description and preview by means of metadata • On institution/aggregator side: • Repositories where digital objects are stored • Original website to view, play and reuse the objects
Europeana data model • Needs to interoperate with the data models used by contributing projects and aggregators when ingesting the contributed data • Needs to handle data created by Europeana at the ingestion stage & user generated content • Needs to support the Europeana content dissemination • to the user centered services proposed in the Europeana business model • to other applications building services based on the Europeana repository of Digital Objects metadata
Europeana ingested data • Europeana data includes: • Metadata (descriptive, administrative) describing a digital object • Preview (thumbnail) of the described object • Active links to the described digital object on the provider’s site • Same level of granularity between Europeana data and the digital object • To give users consistency and predictability in search and navigation across domains and collections • Aggregators are key players here in preparing the data!
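As a rough illustration of the three components listed above (metadata, preview, active link), a contributed record could be modelled as follows. This is only a sketch: the class name, field names and example URLs are hypothetical and are not taken from the Europeana specifications.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ContributedRecord:
    """Hypothetical model of what Europeana holds for one digital object."""
    metadata: dict = field(default_factory=dict)  # descriptive/administrative metadata
    thumbnail_url: str = ""                       # preview of the described object
    object_url: str = ""                          # link to the object on the provider's site

    def missing_parts(self) -> list:
        """Return the names of the components that are absent or unusable."""
        missing = []
        if not self.metadata:
            missing.append("metadata")
        for name in ("thumbnail_url", "object_url"):
            parsed = urlparse(getattr(self, name))
            # A usable link needs at least an http(s) scheme and a host.
            if parsed.scheme not in ("http", "https") or not parsed.netloc:
                missing.append(name)
        return missing


record = ContributedRecord(
    metadata={"title": "Botanical plate"},
    thumbnail_url="http://provider.example.org/thumbs/plate-12.jpg",
    object_url="",  # no link to the object itself yet
)
print(record.missing_parts())  # prints ['object_url']
```

A check of this kind only confirms that each component is present; whether the links are actually live on the provider's site would need a separate test.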
Digital object in Europeana • Definition • A unique single entity which users can view or play on their computer, e.g. MPEG movie, MP3 audio, JPEG photo, PDF text, etc. • Digitised version of a physical/analogue cultural item/artifact • Examples of what is not a digital object: a museum description (even an extensive one), a scanned catalogue card
Digital object in Europeana • Granularity • User-centric approach: the object is useful/relevant to the user and has level-specific value. Example at page level: a botanical plate with its own specific description • Can be at different levels, e.g. book/page, newspaper title/issue, music record/song, film/movie/audio/script/poster, etc. • Lowest possible level with • Direct access: one click from Europeana to the object • Level-specific description
Is it interesting to provide direct access to this herbarium plate? • Probably yes, BUT: • No extracted species names… • No description of the plate… • No direct link to this image of the plate: the address-bar URL points to the first image of this digitised book…
Technical requirements in a nutshell • Digitised object available on a website at the object level through a permanent direct link • To the object and/or the object in context • To a thumbnail or a sample • Metadata at the digitised object level following the metadata element set defined by Europeana for the general interface • Qualified Dublin Core (at the moment only simple Dublin Core is indexed and displayed) • Preferably exposed in XML for harvesting on an OAI-PMH server
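To make the harvesting requirement concrete, the sketch below builds the kind of OAI-PMH request a harvester sends to a provider's server. `ListRecords`, `metadataPrefix` and `set` are standard OAI-PMH names, and `oai_dc` is the protocol's prefix for simple Dublin Core; the base URL, the set name and the function name are assumptions made up for illustration.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access is made)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict the harvest to one set exposed by the repository.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address and set name.
url = list_records_url("http://repository.example.org/oai", set_spec="herbarium")
print(url)
# prints http://repository.example.org/oai?verb=ListRecords&metadataPrefix=oai_dc&set=herbarium
```

The response to such a request is an XML document containing the records, which is what "exposed in XML for harvesting" amounts to in practice.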
Aggregators Handbook • Provides overall information and tools to aggregators so that they can provide metadata to Europeana and related institutions. • Organised in 6 areas: • Organisation • Business • Technical & Operational • Dissemination • Aggregator Case Study • Services & Contact
Submission procedure • [Provider/Europeana] Partnership/contractual agreement • [Provider] Analysis and description of submission • Data sets, volumes, schedule, availability of links, thumbnails, rights information • [Europeana] Assessment of submission • Content strategy (type, geographical provenance, principles), technical strategy (harvesting policy) • [Provider/Europeana] Negotiation/agreement • [Provider] Data preparation, repository setup and population, testing • [Europeana] Formal ingestion
Tools & doc [2/2]: Technical resources • Data requirements: • Europeana Semantic Elements Specifications v3.2.2 • ESE XML Schema • Data preparation aids: • Mapping and Normalisation Guidelines for Europeana • Specifications for Europeana thumbnails • Definition of Digital Object in Europeana context • Validation tool • Content Checker [demo]
The Content Checker • = Validation tool available to providers before submission to production environment. 2 parts: • The Content Checker Ingestor • Allows uploading of a data set • Validation against the ESE V3.2 XML schema • Importing the data into the database • Indexing of data • Caching of thumbnails • The Content Checker Portal • Separate from the operational portal • Allows provider to search for and view uploaded data • Ensure the appropriateness of the mapping decisions
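Before uploading a data set, a provider might run a lightweight pre-check of their own. The sketch below is not the Content Checker and does not validate against the ESE XML schema; it only confirms that the file is well-formed XML and that each record carries a few elements. The `record` wrapper element and the list of required elements are assumptions chosen for illustration; the real mandatory elements are defined in the ESE specification.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
# Assumption for illustration only: see the ESE specification for the real list.
REQUIRED = ("title", "identifier")


def precheck(xml_text):
    """Return (record_index, missing_element) pairs.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed XML.
    """
    root = ET.fromstring(xml_text)
    problems = []
    for index, record in enumerate(root.iter("record")):
        for name in REQUIRED:
            element = record.find(f"{{{DC}}}{name}")
            # Treat an absent or empty element as missing.
            if element is None or not (element.text or "").strip():
                problems.append((index, name))
    return problems


sample = """<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:title>Plate 12</dc:title><dc:identifier>obj-12</dc:identifier></record>
  <record><dc:title>Plate 13</dc:title></record>
</records>"""
print(precheck(sample))  # prints [(1, 'identifier')]
```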
Select “New Data Set” to create a new Data Set widget. The Ingestor automatically assigns an identifier to the Data Set, 999901 in this example
Description is saved. A data file with records can now be selected from the file system and uploaded
The file upload is followed by a validation step of the data against the ESE XML Schema
If validation succeeds, it is followed by the actual import of the records
Press “Start caching” to initiate caching the thumbnail images of the Data Set’s records. This step also requires a confirmation. The indexing job is started
The Data Set’s records are indexed and available for search in the Content Checker portal. The caching job is started
http://contentchecker.isti.cnr.it:8080/portal/brief-doc.html?start=1&view=table&query=europeana_collectionName%3A999901 • A search for europeana_collectionName:999901 returns all the records indexed for Data Set 999901. The thumbnail images are visible
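The search URL shown on this slide can be assembled programmatically for any Data Set identifier. In the sketch below, the host, path and parameter names are copied from the slide; the function name is hypothetical, and the server may no longer be reachable at this address.

```python
from urllib.parse import urlencode

# Address and parameters as shown on the slide.
BASE = "http://contentchecker.isti.cnr.it:8080/portal/brief-doc.html"


def data_set_search_url(data_set_id, start=1, view="table"):
    """Build the portal URL listing every record indexed for one Data Set."""
    query = f"europeana_collectionName:{data_set_id}"
    # urlencode percent-encodes the colon in the query as %3A.
    return f"{BASE}?{urlencode({'start': start, 'view': view, 'query': query})}"


print(data_set_search_url("999901"))
# prints http://contentchecker.isti.cnr.it:8080/portal/brief-doc.html?start=1&view=table&query=europeana_collectionName%3A999901
```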
Interaction with the Europeana Office • Business Team: contractual agreements (partnership, licensing), organisation (contacts), validation against content strategy, general schedule • Operations Team: permanent liaison regarding an ingestion job, technical agreements & support, execution of ingestion operations (harvesting), detailed schedule (tests, feedback, …) • Projects Coordination Team: each project has a contact point at the Europeana Office to facilitate communication and coordination with Europeana: alignment of planning, identification of contacts, transmission of key information on both sides
Roadmap: Rhine & Danube • Rhine, v1.0, summer 2010 • Content, 10M • Full-scale automated ingestion, processes • End-user marketing: take up and sustainability • Danube, v1.3, spring 2011 • Technology, e.g. rich mobile clients • Infrastructure, e.g. registries of semantic/multilingual resources • API
EuropeanaLabs.eu • http://europeanalabs.eu • Is a developers’ environment: R&D, tests, prototyping… • Gives access to: • Europeana software code base • Europeana development practices • Enables code contribution for integration in Europeana v1.0’s forthcoming releases • Later: engage the Open Source community to use the code & resources as a basis for innovative applications, for use within or independently of Europeana
What does Europeana.eu need from the projects? • Content compliant with ESE • Content tested in test sites provided by Europeana, normalised, quality checked and ready for ingestion • Software development that has undergone full functional testing • A distributed development community
What should project plans include? • People able to provide support/expertise in interoperability: • Able to understand Europeana’s technical requirements and to express the project’s needs • Tasks that will enable/facilitate contribution to Europeana: • Analysis of technical requirements • Setting up of necessary infrastructure: metadata transformation, repository, dissemination (OAI-PMH) • Testing: quality control • Survey of content in terms of provenance, volumes, delivery dates for initial submissions and updates
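The metadata transformation task mentioned above usually comes down to mapping a provider's local fields onto the target element set. The following is a minimal sketch for simple Dublin Core, assuming a flat local record; the local field names, the `record` wrapper element and the sample values are hypothetical, and a real mapping should follow the Mapping and Normalisation Guidelines for Europeana.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
ET.register_namespace("dc", DC)          # serialise with the usual "dc:" prefix

# Hypothetical mapping from a provider's local field names to Dublin Core elements.
FIELD_MAP = {
    "name": "title",
    "maker": "creator",
    "made": "date",
    "inventory_no": "identifier",
}


def to_dublin_core(local_record):
    """Convert one flat local record (a dict) into a <record> element."""
    record = ET.Element("record")
    for local_name, dc_name in FIELD_MAP.items():
        value = local_record.get(local_name)
        if value:  # skip fields that are absent or empty in the source
            ET.SubElement(record, f"{{{DC}}}{dc_name}").text = str(value)
    return record


element = to_dublin_core(
    {"name": "Herbarium plate 12", "maker": "Unknown", "inventory_no": "HB-0012"}
)
print(ET.tostring(element).decode())
```

Keeping the mapping in a single table makes it easy to review with the people who know the collection, which is where most mapping decisions are made.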
More info/questions? • Bilateral meetings with members of the Operations Team (today, 14:30-17:00) • Valentine Charles, Ingestion Specialist • Antoine Isaac, Scientific Coordinator • Julie Verleyen, Scientific Coordinator • Contact the Europeana Office • info@europeana.eu • Visit • Europeana portal http://europeana.eu/ • Europeana Group of projects http://europeana.group.eu