
Integrating Data for Archaeology

Explore how the ARIADNE Infrastructure revolutionizes data integration in archaeology by providing a unified interface, improving search and retrieval, and enriching information through specialized metadata schemas.


Presentation Transcript


  1. Integrating Data for Archaeology Dimitris Gavrilis, Eleni Afiontzi, Johan Fihn, Olof Olsson, Achille Felicetti, Franco Niccolucci, Sebastian Cuy

  2. Introduction • Traditional projects in archaeology focused on aggregating data into a single format / system • Provide users with a unified interface • Improve search and retrieval • Improve retrieval semantics through specialized metadata schemas • ARIADNE goes one step further: data integration • Try to model the domain information (ARIADNE Catalog Data Model, ACDM) • Use a curation-aware aggregator to enrich information using the above model • Improve user experience through more substantial and powerful queries

  3. Innovation • Why hasn’t anyone done this before? • Complexity • Performance • Domain knowledge • Standard aggregation systems / architectures are insufficient → ARIADNE Infrastructure

  4. ARIADNE Infrastructure • Flexibility • Ingest diverse and heterogeneous data • XML, RDF, Excel, CSV, … • Handle each datastream independently and according to its requirements • Adapt aggregation, validation and enrichment workflows • Add new curation services easily and on demand
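As an illustration of handling each datastream according to its own format, the following is a minimal sketch of a parser registry. It is not the ARIADNE implementation: the names `PARSERS`, `ingest`, `parse_csv` and `parse_xml` are hypothetical, and the XML layout (a root element containing flat record elements) is an assumption made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List

Record = Dict[str, str]


def parse_csv(payload: str) -> List[Record]:
    """Read CSV text with a header row into one dict per data row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(payload))]


def parse_xml(payload: str) -> List[Record]:
    """Read XML text, assuming <root><record><field>value</field>...</record></root>."""
    root = ET.fromstring(payload)
    return [
        {field.tag: (field.text or "").strip() for field in record}
        for record in root
    ]


# Hypothetical registry: supporting a new format means registering one more parser.
PARSERS: Dict[str, Callable[[str], List[Record]]] = {
    "csv": parse_csv,
    "xml": parse_xml,
}


def ingest(fmt: str, payload: str) -> List[Record]:
    """Dispatch a datastream to the parser registered for its format."""
    try:
        parser = PARSERS[fmt.lower()]
    except KeyError:
        raise ValueError(f"no parser registered for format {fmt!r}") from None
    return parser(payload)
```

Both formats end up as the same list-of-dicts shape, so later validation and enrichment steps would not need to know where a record came from.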

  5. ARIADNE Infrastructure • Complexity • Decouple service complexity through a micro-service oriented architecture • Use loosely coupled services in a highly scalable environment • Performance • Scalable technologies

  6. ARIADNE Infrastructure • Domain knowledge • Integrate the domain model (ACDM) into the infrastructure • Make extensive use of domain thesauri (e.g. AAT) and label every resource accordingly • Create specialized micro-services for curating content according to the domain needs

  7. Data Integration Overall Architecture [Architecture diagram. Component labels: Repository, RDF Store (RDF), MORe, Validation, Integration Experiments, RDF Store (CRM), Cleaning, Excel Sheet, ARIADNE Portal, Enrichment, Elastic Search, ARIADNE Registry, Integration, Archive]

  8. Use of RDF • Every resource is assigned a unique and persistent identifier that is resolved through a URI • Every resource has an RDF representation according to the ACDM schema
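To make the idea of a persistent URI plus an RDF representation concrete, here is a minimal sketch that mints a deterministic identifier and serializes a few fields as N-Triples. The base URI `https://example.org/resource/` and the function names are hypothetical, and Dublin Core Terms predicates stand in for the ACDM schema, whose actual properties are not given in the slides.

```python
import uuid
from typing import Dict, List

BASE_URI = "https://example.org/resource/"  # hypothetical base, not an ARIADNE URI
DCT = "http://purl.org/dc/terms/"           # Dublin Core Terms, used as a stand-in for ACDM


def mint_uri(provider: str, local_id: str) -> str:
    """Derive a stable URI: the same provider and local id always yield the same URI."""
    name = f"{provider}/{local_id}"
    return BASE_URI + str(uuid.uuid5(uuid.NAMESPACE_URL, name))


def escape_literal(value: str) -> str:
    """Escape backslashes, quotes and line breaks for an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(uri: str, fields: Dict[str, str]) -> List[str]:
    """Emit one N-Triples line per field, in sorted field order."""
    return [
        f'<{uri}> <{DCT}{key}> "{escape_literal(value)}" .'
        for key, value in sorted(fields.items())
    ]
```

Deriving the identifier from the provider and local id, rather than generating a random one, means that re-ingesting the same record does not create a second URI.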

  9. Data Curation • Use of curation micro-services for enriching content • Geo-normalization (identify, extract and normalize places and coordinates) • Geo-coding (e.g. GeoNames) • Thesauri mappings (map native subject terms to a common thesaurus: AAT) • Temporal normalization (identify, extract and normalize dates) • Gazetteers (e.g. DAI Gazetteer) • Historical & ancient place name identification (Pelagios & Pleiades) • Temporal information mappings (Perio.do)
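The temporal normalization step listed above can be sketched as a small function that extracts a year range from free text. This is only an illustration under simplifying assumptions: the function name `normalize_range` is hypothetical, only era suffixes (BC, BCE, AD, CE) are recognized, and BC years are represented as plain negative integers rather than astronomical year numbering.

```python
import re
from typing import Optional, Tuple

# A year of up to four digits, optionally followed by an era suffix.
_YEAR = re.compile(r"(\d{1,4})(?:\s*(BCE|BC|CE|AD)\b)?", re.IGNORECASE)


def _signed(year: str, era: str) -> int:
    """Return the year as an integer, negative for BC/BCE."""
    value = int(year)
    return -value if era.upper() in ("BC", "BCE") else value


def normalize_range(text: str) -> Optional[Tuple[int, int]]:
    """Extract a (start, end) year range from text, or None if no year is found."""
    matches = _YEAR.findall(text)
    if not matches:
        return None
    year1, era1 = matches[0]
    year2, era2 = matches[-1]
    # In "500-300 BC" the trailing era also applies to the first year.
    if not era1 and era2:
        era1 = era2
    start, end = _signed(year1, era1), _signed(year2, era2)
    return (min(start, end), max(start, end))
```

A production service would also need to handle named periods, prefixed eras such as "AD 50", and centuries, which this sketch deliberately leaves out.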

  10. Data Integration • Data integration is based on 3+1 dimensions • Subject • Space • Time • Resource type
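One way to picture the 3+1 dimensions is as the four filters a query can combine. The sketch below is an assumption-laden illustration, not the ACDM: the `Resource` fields and the `matches` function are hypothetical, space is reduced to a single point tested against a bounding box, and time to a range of signed years.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Resource:
    title: str
    resource_type: str           # the "+1" dimension, e.g. "report" or "dataset"
    subjects: FrozenSet[str]     # subject dimension
    lat: float                   # space dimension (decimal degrees)
    lon: float
    period: Tuple[int, int]      # time dimension (start year, end year)


def matches(
    res: Resource,
    subject: Optional[str] = None,
    bbox: Optional[Tuple[float, float, float, float]] = None,  # south, west, north, east
    period: Optional[Tuple[int, int]] = None,
    resource_type: Optional[str] = None,
) -> bool:
    """Return True if the resource satisfies every filter that was supplied."""
    if subject is not None and subject not in res.subjects:
        return False
    if resource_type is not None and res.resource_type != resource_type:
        return False
    if bbox is not None:
        south, west, north, east = bbox
        if not (south <= res.lat <= north and west <= res.lon <= east):
            return False
    if period is not None:
        start, end = period
        # Two year ranges overlap unless one ends before the other starts.
        if res.period[1] < start or res.period[0] > end:
            return False
    return True
```

Because every filter is optional, the same function covers a subject-only search as well as a combined subject, space and time query.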

  11. Identify & Link together Resource Types • Model individual information resource types (e.g. collections, bibliographic reports, databases, datasets, etc.) • Identify each resource's type during ingestion • Link / group different resource types • E.g. put all related heterogeneous resource types (reports, datasets, …) under the same collection

  12. Thematic integration • ARIADNE uses the AAT thesaurus to semantically label ALL aggregated information. • AAT terms act as a glue and, when combined with spatial and temporal information, can produce great results • Semantic expansion of terms is used extensively to improve retrieval. • Expansion of multi-lingual terms facilitates cross-language search without requiring automatic translation.
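Semantic and multi-lingual expansion can be sketched with a toy thesaurus: a query term is resolved to a concept, and the query is widened to all labels of that concept and of its narrower concepts. The concept ids, labels and structure below are invented for illustration and are not real AAT data. The function name `expand` is likewise hypothetical.

```python
from typing import Dict, Set

# Toy thesaurus with hypothetical concept ids and labels (not real AAT content).
THESAURUS: Dict[str, dict] = {
    "c1": {"labels": {"en": ["vessels"], "it": ["vasi"]}, "narrower": ["c2"]},
    "c2": {"labels": {"en": ["amphorae"], "it": ["anfore"]}, "narrower": []},
}


def expand(term: str) -> Set[str]:
    """Return every label, in every language, of the concepts matching `term`
    and of all their narrower concepts."""
    wanted = term.casefold()
    # Start from each concept that carries the term as a label in any language.
    stack = [
        concept_id
        for concept_id, concept in THESAURUS.items()
        if any(
            wanted == label.casefold()
            for labels in concept["labels"].values()
            for label in labels
        )
    ]
    seen: Set[str] = set()
    result: Set[str] = set()
    while stack:
        concept_id = stack.pop()
        if concept_id in seen:
            continue
        seen.add(concept_id)
        concept = THESAURUS[concept_id]
        for labels in concept["labels"].values():
            result.update(labels)
        stack.extend(concept["narrower"])
    return result
```

With this toy data, a search for the Italian label "anfore" also retrieves records labelled "amphorae", which is the cross-language effect the slide describes, achieved without machine translation.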

  13. Spatial & Temporal • All resources with spatial information • Are assigned WGS84 projected coordinates • All resources with temporal information • Are normalized according to the ACDM dates (which take into account periods and period names, and support the ISO date format).
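As a small example of what spatial normalization might involve, the sketch below converts a degrees/minutes/seconds reading into a signed decimal-degree value and rejects out-of-range input. It assumes the source coordinate is already referenced to WGS84, so no datum transformation is performed. The function name `dms_to_decimal` is hypothetical.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    hemi = hemisphere.upper()
    if hemi not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere {hemisphere!r}")
    if degrees < 0 or not (0 <= minutes < 60) or not (0 <= seconds < 60):
        raise ValueError("degrees, minutes and seconds must be non-negative and in range")
    value = degrees + minutes / 60 + seconds / 3600
    # Latitude is limited to 90 degrees, longitude to 180 degrees.
    limit = 90 if hemi in ("N", "S") else 180
    if value > limit:
        raise ValueError(f"{value} exceeds the limit of {limit} for hemisphere {hemi}")
    # South and west are negative by convention.
    return -value if hemi in ("S", "W") else value
```

Storing a single signed decimal value per axis keeps bounding-box queries simple, since no hemisphere letters need to be interpreted at search time.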

  14. Subject Terms Curation Lifecycle [Diagram. Component labels: AAT, Native Subjects mappings, Vocabulary Mapping Tool, MORe, Provider Native Repository, Excel Sheet, XML Files, Registry, ACDM / Subjects (JSON), ARIADNE Portal, Elastic Search. Subject fields shown: *nativeSubjects, *providedSubjects, **providedSubjects, **derivedSubjects, **broaderGenericSubjects. Legend: * mono-lingual (prefLabel only); ** multi-lingual (prefLabel & altLabel)]
