VI-SEEM Data Repository

Explore the VI-SEEM data repository, learn about the underlying technology, hardware implementation, benefits, features, types of data, and information model.

Presentation Transcript


  1. VI-SEEM Data Repository • Vladimir Dimitov, IICT-BAS • Acknowledgements to Vladimir Slavnić, IPB • The VI-SEEM project initiative is co-funded by the European Commission under the H2020 Research Infrastructures contract no. 675121

  2. Agenda • VI-SEEM Data Repository • Underlying Software Technology • Hardware Implementation • Benefits of VI-SEEM Repo • Features • Types of data • Information Model • The DSpace (VI-SEEM Repo) Architecture • Repository Organization • Examples. Total number of slides: 25. Event: VI-SEEM Regional Climate Training event, Belgrade, Serbia, 11-13 Oct 2017.

  3. VI-SEEM Repository • The VI-SEEM Repository (https://repo.vi-seem.eu/) provides long-term data preservation and is suitable for dataset sharing • Use cases • To store curated datasets for long-term preservation • To share those datasets with selected collaborators, or open them up to whole communities, via the web interface • To make such datasets searchable by associating metadata with them and then harvesting that metadata • Enables scientific communities to capture and describe digital works using a custom submission workflow module
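
The metadata harvesting mentioned above can be illustrated with a small sketch. The slides do not name the harvesting protocol; this example assumes OAI-PMH, which DSpace supports, and the endpoint path used here is hypothetical (only the repository address https://repo.vi-seem.eu/ appears in the slides). The code only builds the request URL and does not contact any server.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumption: this endpoint path is hypothetical; the slides give only the
# repository address, not the location of a harvesting interface.
OAI_ENDPOINT = "https://repo.vi-seem.eu/oai/request"


def list_records_url(endpoint: str,
                     metadata_prefix: str = "oai_dc",
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Restrict harvesting to one set (for example a single collection).
        params["set"] = set_spec
    return f"{endpoint}?{urlencode(params)}"


print(list_records_url(OAI_ENDPOINT))
```

A harvester would fetch this URL and parse the returned XML; `oai_dc` is the Dublin Core metadata format that every OAI-PMH server is required to offer.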

  4. Underlying Software Technology • Based on DSpace (http://dspace.org) • DSpace is a platform that captures items in any format (text, video, audio, and data), distributes them over the web, indexes them so that users can search and retrieve them, and preserves them over the long term • Developed by the MIT Libraries with support from the HP-MIT Alliance • A platform for building an institutional repository

  5. Hardware Implementation • The VI-SEEM Repo is installed on a virtual machine with 8 GB RAM and 4 virtual cores. • The physical hosting server is an IBM 3650 M4 with two eight-core CPUs and 128 GB RAM. • The storage array is formatted with GPFS and is connected over InfiniBand (56 Gbit/s), using IBM GSS and ESS storage servers. Failover is handled automatically by GPFS. • The storage capacity dedicated to the Repo is 50 TB, of which around 16 TB is currently occupied by data. • Hosted and maintained by GRNET.
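
As a quick worked example of the capacity figures on this slide, the snippet below computes the remaining space and the fraction in use. The two input values come from the slide; the variable names are illustrative only.

```python
TOTAL_TB = 50  # storage capacity dedicated to the repository (from the slide)
USED_TB = 16   # approximate space currently occupied (from the slide)

free_tb = TOTAL_TB - USED_TB
used_fraction = USED_TB / TOTAL_TB

print(f"free: {free_tb} TB, used: {used_fraction:.0%}")  # free: 34 TB, used: 32%
```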

  6. Benefits of VI-SEEM Repo • Some example benefits: • Getting your research results out quickly, to a worldwide audience • Reaching a worldwide audience through exposure to search engines such as Google • Storing reusable teaching materials that you can use with course management systems • Archiving and distributing material you would currently put on your personal website • Storing examples of students’ projects (with the students’ permission) • Showcasing students’ theses (again with permission) • Keeping track of your own publications/bibliography • Having a persistent network identifier for your work that never changes or breaks • No more page charges for images: you can point to your images’ persistent identifiers in your published articles

  7. Features • User Interface • Web-based, for submitters, end users, and system administrators • Search and retrieval of items by browsing or searching the metadata • Workflow • Enables different submission workflows for different communities • Models "e-people" who have "roles" in the workflow of a particular community in the context of a given collection • Persistent Identifiers (Handles) • Implements CNRI handles as the persistent identifier associated with each item • Soon to be integrated with the VI-SEEM PID service • Access Control • Allows contributors to limit access to items in the repository, at both the collection and the individual item level • Integrated with the VI-SEEM Login Service • Metadata Schema • Utilises Qualified Dublin Core
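
To make the last two features more concrete, here is a minimal sketch of how Qualified Dublin Core field names and handles are typically written. It assumes the `schema.element.qualifier` naming that DSpace uses for metadata fields and the public proxy at hdl.handle.net for resolving handles. The handle value is hypothetical and does not refer to a real VI-SEEM item, and both helper functions are illustrative rather than part of any DSpace API.

```python
from typing import NamedTuple, Optional


class DCField(NamedTuple):
    schema: str
    element: str
    qualifier: Optional[str]


def parse_dc_field(name: str) -> DCField:
    """Split a 'schema.element[.qualifier]' field name into its parts."""
    parts = name.split(".")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"not a schema.element[.qualifier] name: {name!r}")
    qualifier = parts[2] if len(parts) == 3 else None
    return DCField(parts[0], parts[1], qualifier)


def handle_url(handle: str, resolver: str = "https://hdl.handle.net") -> str:
    """Turn a handle of the form 'prefix/suffix' into a resolver URL."""
    prefix, sep, suffix = handle.partition("/")
    if not (prefix and sep and suffix):
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return f"{resolver}/{handle}"


print(parse_dc_field("dc.date.issued"))
print(handle_url("123456789/42"))  # hypothetical handle, for illustration only
```

The qualifier is what makes the schema "qualified": `dc.date` alone is ambiguous, while `dc.date.issued` says which date is meant.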

  8. Types of data in the VI-SEEM Repository • Articles • Preprints, e-prints • Technical Reports • Working Papers • Conference Papers • E-theses • Audio/Video • Lecture notes, Visualizations, simulations • Datasets in various formats • Experimental • Simulation • Input • Output • Images • Visual, scientific • Teaching material • Digitized library collections

  9. Information Model • Communities • Departments, Labs, Research Centers, Schools… • Collections (in communities) • Distinct groupings of like items • Items (in collections) • Logical content objects • Receive persistent identifier • Bitstreams (in items) • Individual files • Receive preservation treatment • Versioning: item “versions” can be • All instances of a work in different formats • E.g. the XML, PDF, and PostScript versions • All editions of a work over time • Metadata lists all available versions of items
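
The containment hierarchy described on this slide can be sketched as a set of nested data classes. This is an illustrative model only, not DSpace's actual schema: the class and attribute names, the collection name, the file names, and the handle are all assumptions. "Climate Sciences" is the community shown in the later example slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    filename: str  # an individual file stored inside an item


@dataclass
class Item:
    title: str
    handle: str  # persistent identifier assigned to the item
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    name: str
    collections: List[Collection] = field(default_factory=list)


def count_bitstreams(community: Community) -> int:
    """Count every file held anywhere below a community."""
    return sum(len(item.bitstreams)
               for collection in community.collections
               for item in collection.items)


# Hypothetical content, for illustration only.
item = Item("Regional climate run", "123456789/1",
            [Bitstream("output.nc"), Bitstream("readme.pdf")])
climate = Community("Climate Sciences",
                    [Collection("Simulation outputs", [item])])
print(count_bitstreams(climate))  # 2
```

Storing the handle on the item rather than on each file mirrors the slide: items receive the persistent identifier, while bitstreams are the units that receive preservation treatment.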

  10. The DSpace (VI-SEEM Repo) Architecture

  11. Repository Organization • Each DSpace service comprises Communities, the highest level of the DSpace content hierarchy • Communities may be: • Departments • Labs • Research Centres • Schools • Each community contains descriptive metadata about itself and the collections contained within it

  12. Collections • Each community in turn has collections, which contain items or files • Collections can belong to a single community or to multiple communities (collaboration between communities may result in a shared collection) • As with communities, each collection contains descriptive metadata about itself and the items contained within it

  13. Example Structures • Structures may be based around organizational units: • Structures are hierarchical:

  14. Example: Home screen

  15. Example: Climate Sciences community

  16. Example: Browsing by title

  17. Example: Submissions and Workflow tasks

  18. Example: Item submission, first step

  19. Example: Item submission, description

  20. Example: Item submission, third step

  21. Example: File upload

  22. Example: Item review

  23. Example: Add Creative Commons (CC) license

  24. Example: Distribution license

  25. Conclusion • The VI-SEEM Repository is the main place for long-term data preservation and is suitable for dataset sharing. • The implementation is based on DSpace, a popular open source technology for building large data repositories. • The VI-SEEM Repository is hosted on a high-performance infrastructure. • Types of data may include: • Measurements, Visualizations, Simulations, Audio/Video, Images, • Digitized library collections, Articles, Technical Reports, training materials, raw data, etc. • The dataset items are described with detailed metadata records, which must be carefully and patiently filled in by the submitters. • A carefully selected license must be assigned to each dataset item. • Thank you for your attention. Questions?
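
The last two points of the conclusion (complete metadata and an assigned license) lend themselves to a simple pre-submission check. The sketch below is a hypothetical helper, not part of DSpace or the VI-SEEM submission form. The list of required fields is an assumption chosen for illustration; the slides do not say which fields are mandatory.

```python
from typing import Dict, List, Sequence

# Assumption: an illustrative set of required fields, using Dublin Core style
# names. "dc.rights" stands in for the license assigned to the item.
REQUIRED_FIELDS = (
    "dc.title",
    "dc.contributor.author",
    "dc.description.abstract",
    "dc.rights",
)


def missing_fields(record: Dict[str, str],
                   required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that are absent or blank in `record`."""
    return [name for name in required if not record.get(name, "").strip()]


# Hypothetical draft record: the author is blank and two fields are absent.
draft = {
    "dc.title": "Regional climate simulation output",
    "dc.contributor.author": "  ",
}
print(missing_fields(draft))
```

A submitter could run such a check on a draft record before starting the web submission, so that gaps are caught before the review step.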
