local content in a Europeana cloud
Alternative methods of ingestion for small institutions
(Stein) Runar Bergheim, Asplan Viak Internet as
LoCloud is funded by the European Commission's ICT Policy Support Programme

Overview of Presentation
• Characteristics of Europeana content providers
• Present ingestion methods for Europeana
• Alternative ingestion methods “out there”
• Experiments that may be conducted as part of LoCloud
• 7 slides
• 284 words
• 1 858 characters
• 2 illustrations
• (A seemingly endless stream of words)

Characteristics of Europeana content providers
Those who are «in»
• Professional cultural heritage institutions
• Capacity for investment in infrastructure & projects
• Technical skills beyond what may be expected
• Entities that fit into a hierarchy of aggregators
• Patient
Those who are «out»
• Very small collections
• Collections by individuals (tens to hundreds of objects)
• Independent institutions with strained funding
• «Non-conforming» online content structure 1 web page 1 object

Weaknesses of present Europeana ingestion process
• Puts great demands on content providers
  • Partly mitigated by the excellent MINT-MORE tools
• Limited capacity at harvesting end
  • Partly mitigated by aggregator hierarchy
• Low frequency of updates: each iteration takes a long time
  • Partly mitigated by modified content/aggregation architecture of Europeana Cloud

Considerations for alternative ingestion methods
• Difficult to create complete ESE/EDM from crawling
  • But... the typical Europeana record is not really all that «complete»
  • Schema.org, Microformats and other embedded semantics may help
• Deep-content URLs hidden from crawlers
  • Simple «site-map» protocol may be applied
• Increases capacity for small content providers
• Decreases time-consumption of the content ingestion life-cycle
• Will serve more than one publishing channel

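To illustrate how a simple «site-map» could expose deep-content URLs to a crawler, here is a minimal sketch in Python. It assumes the content provider publishes a file following the sitemaps.org XML format. The sample URLs and the function name `list_object_urls` are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sitemap a small institution might publish (assumption).
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://museum.example.org/objects/1</loc>
    <lastmod>2013-05-01</lastmod>
  </url>
  <url>
    <loc>http://museum.example.org/objects/2</loc>
  </url>
</urlset>"""


def list_object_urls(sitemap_xml):
    """Return (url, lastmod) pairs for every <url> entry in a sitemap.

    lastmod is None when the entry does not declare it.
    """
    root = ET.fromstring(sitemap_xml)
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:
            entries.append((loc, lastmod))
    return entries


if __name__ == "__main__":
    for loc, lastmod in list_object_urls(SAMPLE_SITEMAP):
        print(loc, lastmod)
```

The `lastmod` value, where present, could let a harvester skip unchanged pages, which is one way the update frequency might be improved.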
Experiments that may be conducted as part of LoCloud
• Content assessment
  • Assess quantity of «new» content that can be reached using alternative ingestion methods
• Technology experiments
  • HTML embedded semantics based on open standards
  • Creating a test-spider for auto-extraction of metadata from web pages
  • Transformation of data to ESE/EDM
• Design of processes
  • Embedding of spider into aggregator organizations' business processes
  • Ingestion + Quality assurance

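The technology experiments above could start from something as small as the following Python sketch. It is not the LoCloud spider. It assumes pages carry schema.org microdata (`itemprop` attributes), and the mapping from schema.org properties to ESE-style field names is an illustrative guess, not an official crosswalk. The sample page, class name and function name are hypothetical.

```python
from html.parser import HTMLParser

# Illustrative mapping only (assumption): schema.org property -> ESE-style field.
SCHEMA_TO_ESE = {
    "name": "dc:title",
    "creator": "dc:creator",
    "description": "dc:description",
    "dateCreated": "dc:date",
}

# Hypothetical object page with embedded schema.org microdata.
SAMPLE_PAGE = """
<div itemscope itemtype="http://schema.org/CreativeWork">
  <h1 itemprop="name">Fishing boat, 1920</h1>
  <span itemprop="creator">Unknown photographer</span>
  <meta itemprop="dateCreated" content="1920">
  <p itemprop="description">Photograph of a boat in the harbour.</p>
</div>
"""


class ItempropExtractor(HTMLParser):
    """Collect itemprop values; deliberately ignores nested items."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # Value given in an attribute, e.g. <meta itemprop=... content=...>
            self.properties[prop] = attrs["content"] or ""
        else:
            self._current = prop
            self.properties[prop] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None


def to_ese_like_record(page_url, properties):
    """Map extracted properties to a flat dict with ESE-style keys."""
    record = {"europeana:isShownAt": page_url}
    for prop, value in properties.items():
        field = SCHEMA_TO_ESE.get(prop)
        if field and value.strip():
            record[field] = value.strip()
    return record


if __name__ == "__main__":
    parser = ItempropExtractor()
    parser.feed(SAMPLE_PAGE)
    print(to_ese_like_record("http://museum.example.org/objects/1", parser.properties))
```

A record produced this way will usually be sparse, which is why the quality assurance step in the process design matters.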
Funding
LoCloud is funded by the European Commission's ICT Policy Support Programme.
The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.