4TU.ResearchData data archive. Madeleine de Smaele. 4TU.ResearchData services. Organization. Collaboration in the Netherlands. 4TU.ResearchData support. Data management plan. Visibility and citation with DOI. Data publication. Search for research data. Data Lab.
4TU.ResearchData data archive Madeleine de Smaele
4TU.ResearchData support • Data management plan • Visibility and citation with DOI • Data publication • Search for research data • Data Lab • Deposit in 4TU.ResearchData http://blogs.bath.ac.uk/research360/about/
Data archive • ‘Frozen’ dataset (version) for future use & long term storage • ‘Published’ data: visible • Open (max. 2 years embargo): shareable • Persistent identifiers: findable and citable • Sustainable formats: readable • Data Seal of Approval: safe and secure https://data.4tu.nl/repository https://data.4tu.nl/
A trusted digital repository… • …is one whose mission is to provide reliable, long-term access to managed digital resources to its designated community, now and in the future. • Meets organizational, curatorial and operational responsibilities outlined in paper.
Preservation policy • provide authentic and reliable instances of datasets to researchers; • maintain the integrity and quality of the datasets; • ensure that digital resources are managed throughout their lifecycle (e.g. when migrations or changes in metadata are carried out) in the medium that is most appropriate for the task they perform; • and so to be a “trusted digital repository”.
Primary preservation strategies • Hardware migration • File format migration http://researchdata.4tu.nl/fileadmin/editor_upload/pdf/File_formats/preffered_file_formats.pdf
How to deposit • Do It Yourself: ‘simple’ sets. Standard (self-)upload form and descriptive information, single file per object (can be a ‘zipped’ collection), single DOI, … E.g.: Zandvliet, H.J.W. et al. (2010): Diffusion driven concerted motion of surface atoms: Ge on Ge(001). MESA+ Institute For Nanotechnology, University of Twente. doi:10.4121/uuid:3f71549c-6097-4bb8-bc00-6db77deb161d • Do It Together: special collections. Negotiate: deposit procedure, description (xml, picture, preview), data model, level of DOI assignment, query online, … E.g.: Otto, T., Russchenberg, H.W.J. (2010): IDRA weather radar measurements - all data. TU Delft - Delft University of Technology. doi:10.4121/uuid:5f3bcaa2-a456-4a66-a67b-1eec928cae6d
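Since each deposit is cited by its DOI, a citation string can be turned into a clickable link mechanically. The following is a minimal sketch, assuming the public doi.org resolver; the helper name `doi_to_url` is hypothetical and not part of any 4TU.ResearchData tooling.

```python
from urllib.parse import quote

# Assumption: the public DOI resolver is used to reach the landing page.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.4121/...' or '10.4121/...' into a resolver URL."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]  # drop the 'doi:' prefix used in citations
    # Keep '/' and ':' readable; percent-encode anything unusual.
    return DOI_RESOLVER + quote(doi, safe="/:")


if __name__ == "__main__":
    # DOI taken from the 'Do It Yourself' example on the slide.
    print(doi_to_url("doi:10.4121/uuid:3f71549c-6097-4bb8-bc00-6db77deb161d"))
```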
GEO Metadata https://data.4tu.nl/repository/
RDF vocabularies • Ontologies used for our metadata: • Dublin Core (dc, dcterms) • WGS84 (coordinates) • Geonames (geographic entities) • etc… • 4tu (our own) – for relations like measuredBy, calculatedFrom, measuredAtLocation, …
sample RDF graph Example relations (namespaces are omitted)
RDF in the UI • Relevant metadata from related objects is included: not too little and not too much! • This dataset has some metadata and is part of this dataset with these metadata • It was calculated from this dataset, measured by this instrument with these metadata
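The chain described on this slide (part of, calculated from, measured by) can be sketched as a walk over RDF-style triples. This is an illustration only: the dataset, instrument and location identifiers are hypothetical, namespaces are omitted as on the slide, and plain Python tuples stand in for a real RDF store.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers are hypothetical; only the predicate names come from the slides.
triples = [
    ("dataset:rain-rate", "isPartOf", "dataset:radar-collection"),
    ("dataset:rain-rate", "calculatedFrom", "dataset:raw-reflectivity"),
    ("dataset:raw-reflectivity", "measuredBy", "instrument:radar"),
    ("instrument:radar", "measuredAtLocation", "location:delft"),
]


def related(start, predicates):
    """Collect triples reachable from `start` by following the given predicates."""
    found, queue, seen = [], [start], {start}
    while queue:
        node = queue.pop(0)
        for s, p, o in triples:
            if s == node and p in predicates:
                found.append((s, p, o))
                if o not in seen:  # avoid revisiting objects in cyclic graphs
                    seen.add(o)
                    queue.append(o)
    return found


if __name__ == "__main__":
    # Which related objects should contribute metadata to the dataset page?
    for s, p, o in related("dataset:rain-rate", {"calculatedFrom", "measuredBy"}):
        print(f"{s} {p} {o}")
```

Restricting the set of predicates that are followed is one way to get "not too little and not too much": only the relations passed in contribute metadata to the page.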
NetCDF • NetCDF: data format + data model • Comes with a set of software tools / interfaces for programming languages • Binary format, but data can be dumped in ASCII or XML • Used mainly in geosciences (e.g. climate forecast models) • BUT: fit for almost any type of numeric data + metadata • Core data type: multidimensional array • >90% of 4TU.ResearchData is in NetCDF
OPeNDAP • OPeNDAP: protocol to talk to NetCDF (and similar) data over the internet • THREDDS: server that speaks OPeNDAP • Internal metadata directly visible on site • APIs for all main programming languages, e.g. your Python program directly interacts with the data without downloading a data file! (N.B. data files in 4TU.RD are up to 100 GB) • API supports queries to obtain: - the internal NetCDF metadata, info about dimensions, variables - cross-sections of data (slices, blocks) - samples of data (coarse-graining, for quick evaluation) - aggregated datasets (e.g. glue together consecutive time series)
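The metadata, slice and coarse-graining queries listed above map onto OPeNDAP (DAP2) request URLs. The sketch below only builds such URLs and makes no network request. The base URL and the variable name `rain` are hypothetical; the `.dds`/`.das`/`.ascii` suffixes and the `[start:stride:stop]` hyperslab syntax follow the DAP2 conventions.

```python
# Hypothetical dataset URL on a THREDDS server; not a real 4TU.ResearchData address.
BASE = "https://example.org/thredds/dodsC/data/radar.nc"


def hyperslab(var, *dims):
    """Build a DAP2 projection such as rain[0:1:9][0:10:990].

    Each dim is (start, stride, stop); in DAP2 the stop index is inclusive.
    """
    return var + "".join(f"[{start}:{stride}:{stop}]" for start, stride, stop in dims)


def dap_url(base, response, constraint=""):
    """Append a DAP2 response suffix (dds, das, ascii, dods) and optional constraint."""
    url = f"{base}.{response}"
    return f"{url}?{constraint}" if constraint else url


if __name__ == "__main__":
    # Internal metadata: structure (dimensions, variables) and attributes.
    print(dap_url(BASE, "dds"))
    print(dap_url(BASE, "das"))
    # Cross-section: the first 10 time steps, the first 100 range bins.
    print(dap_url(BASE, "ascii", hyperslab("rain", (0, 1, 9), (0, 1, 99))))
    # Coarse sample: every 10th range bin, for a quick look at a large file.
    print(dap_url(BASE, "ascii", hyperslab("rain", (0, 1, 9), (0, 10, 990))))
```

In practice, client libraries with OPeNDAP support usually accept the dataset URL directly and generate such requests from ordinary array indexing, so that only the requested slice travels over the network.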
OPeNDAP / Fedora UI: metadata enrichment (diagram labels: Fedora, OPeNDAP)
Example: Majorana fermions data publication
FAIR data original guiding principles at https://www.force11.org/node/6062
How we support FAIR Findable Reusable Accessible & Reusable
Licences • Data that can be shared needs a licence, so that other people know explicitly what they are allowed to do with it. • Creative Commons for data • Open Source software licences for code https://openworking.wordpress.com/2018/03/12/licenses-in-4tu-researchdata/
Questions • 4TU.ResearchData, TU Delft Library, Prometheusplein 1, 2628 ZC Delft • T +31 (0)15 27 88 600 • E researchdata@4tu.nl • http://researchdata@4tu.nl