Anthony Beitz Technical Manager, Monash e-Research Centre (MeRC) 3 December, 2008. ARCHER e-Research Tools for Research Data Management. “Conventional” Research Data Management. Research Data Management Challenges. Storing and managing datasets, with exponential growth
Anthony Beitz, Technical Manager, Monash e-Research Centre (MeRC), 3 December 2008. ARCHER e-Research Tools for Research Data Management
Research Data Management Challenges • Storing and managing datasets, with exponential growth • Sharing research resources and work spaces between institutions • Publishing large datasets and related research artifacts • Security and privacy • Push to streamline the workflows of e-research by providing centralised, persistent, and reliable storage
Research Data Management: A Crystallographers’ Vision • Crystal exposed to X-rays & diffraction pattern detected • Workflow automates analysis • Researcher grows crystal • Detector generates raw data • Analysis performed on the grid • Analysis begins during data generation • Data stored to SRB • Metadata associated to raw data • Iterative analysis saved to SRB • Analysis accessed by collaborators • Monitor telemetry during file generation • Results published in PDB & other repositories
Research Data Management Lifecycle (diagram: stages Conceive, Design, Experiment, Collaborate, Analyse, Publish, Expose; ARCHER Research Data Management shown against Experiment, Collaborate, Analyse, Publish)
Building generic tools for a secure, seamless, and collaborative e-Research space • Dataset Acquisition • Dataset Management (Web) • Dataset Management (Desktop) • Collaborative Workspaces • Workflow Automation • Metadata Management (diagram labels: Computational Grids, Institutional Repositories, Instruments, Publication Repositories, Publish, Acquire, Manual)
ARCHER • Production-ready open-source research data management infrastructure: • ARCHER Research Repository • Concurrent Data Capture and Telemetry • Scientific Dataset Manager (Web and Desktop Client) • Metadata Editing Tool • Collaborative and Adaptable Research Portal Development Environment • Funded by DIISR, through the SII (Systemic Infrastructure Initiative) • Developed by Monash, JCU, UQ • Software and documentation now available at http://www.archer.edu.au/
ARCHER Research Repository A place for Researchers to store their research data • Easily Accessible • Capable of managing large datasets • Rich metadata • Core metadata based on STFC’s Scientific Metadata Model • Flexible metadata available for samples, datasets, and datafiles • Secure
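The sample, dataset, and datafile levels mentioned above can be pictured as a simple nested structure with free-form metadata at each level. The following is only an illustrative sketch: the class names, field names, and example values are assumptions made for this example and are not taken from the STFC Scientific Metadata Model or from ARCHER's code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Datafile:
    """One file produced by an instrument (hypothetical fields)."""
    name: str
    size_bytes: int
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Dataset:
    """A group of datafiles, e.g. one data collection run."""
    name: str
    datafiles: List[Datafile] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)

    def total_size(self) -> int:
        # Sum the sizes of all files attached to this dataset.
        return sum(f.size_bytes for f in self.datafiles)


@dataclass
class Sample:
    """The physical sample that datasets were collected from."""
    name: str
    datasets: List[Dataset] = field(default_factory=list)
    metadata: Dict[str, str] = field(default_factory=dict)


# Made-up example values, purely for illustration.
sample = Sample("lysozyme-crystal-01", metadata={"space_group": "P43212"})
run = Dataset("run-001", metadata={"wavelength_angstrom": "1.5418"})
run.datafiles.append(Datafile("frame_0001.img", 8_388_608, {"exposure_s": "10"}))
run.datafiles.append(Datafile("frame_0002.img", 8_388_608))
sample.datasets.append(run)
```

Keeping the flexible metadata as a plain key/value mapping at every level is one way to let each discipline attach its own fields without changing the core structure.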
Distributed Integrated Multi-Sensor & Instrument Middleware (DIMSIM) Concurrent data capture & analysis • Allows multiple sensors to be easily integrated • Enables instruments to be more easily accessible over a network • Automatically deposits instrument datasets into a designated research repository • Easily accessible telemetry • Enables concurrent analysis
XDMS: Scientific Dataset Manager (Web) A web tool for Researchers to manage and curate their research data • Formalised research data management • Automatic metadata extraction from research datafiles • Rich metadata editing capabilities (via MDE) • Persistent identifiers generated for each dataset • Powerful search capabilities • Secure • Scratch area for ingestion of instrument and personal data • Can directly publish into an institutional repository • Customised for Crystallography, and may be easily adapted to other research disciplines
Metadata Editing Tool (MDE) Schema-driven metadata editing for e-Research • Schema-driven editor • Uses the schema to build a Web 2.0 form layout for the metadata • When the user saves the metadata record, it undergoes complete validation against the schema • Can support relatively complex schemas
Hermes: Scientific Dataset Manager (Desktop Client) A desktop tool for Researchers to transfer/manage their research data • Principally a data transfer tool • Doesn’t have the timeout issues that web apps experience with large data transfers • Platform-independent • Dock-able file browser • Supports many different types of file systems (gftp, srb, cifs, etc.) • Supports plugins, which interface to the institution’s metadata repository
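The schema-driven idea above (one schema used both to lay out the form and to validate the record on save) can be sketched in a few lines. This is a minimal illustration only: the dictionary-based schema format, the field names, and the functions `build_form` and `validate` are hypothetical and do not reflect how MDE is actually implemented.

```python
from typing import Any, Dict, List

# Hypothetical schema: field name -> expected type and whether it is required.
SCHEMA: Dict[str, Dict[str, Any]] = {
    "title": {"type": str, "required": True},
    "temperature_k": {"type": float, "required": False},
    "detector": {"type": str, "required": True},
}


def build_form(schema: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Derive a list of form fields from the schema, in schema order."""
    return [
        {
            "name": name,
            "label": name.replace("_", " ").title(),
            "required": rule["required"],
        }
        for name, rule in schema.items()
    ]


def validate(record: Dict[str, Any], schema: Dict[str, Dict[str, Any]]) -> List[str]:
    """Check a whole record against the schema and return all error messages."""
    errors: List[str] = []
    for name, rule in schema.items():
        if name not in record:
            if rule["required"]:
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(record[name], rule["type"]):
            errors.append(
                f"wrong type for {name}: expected {rule['type'].__name__}"
            )
    # Fields the schema does not know about are also rejected.
    for name in record:
        if name not in schema:
            errors.append(f"unknown field: {name}")
    return errors
```

Calling `validate({"title": "x"}, SCHEMA)` reports the missing `detector` field, while a record containing both required fields passes with an empty error list.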
ARCHER Enhanced Plone: Collaborative and Adaptable Research Portal Development Tool Bringing Researchers together • Based on Plone • Simplifies research portal development • Easy to author and manage own web content • Powerful search capabilities • Enables sharing, management, and discussions of documents • Open source Content Management System (CMS) • Secure and accessible • Access to the ARCHER Research Repository
ARCHER Expected Tool Usage (diagram: level of user of e-Research infrastructure, ranging from high-end users to low-end users) • Own Tools • ARCHER Enhanced Plone (Collaborative/Adaptable Research Portal Dev Tool) • ARCHER Research Repository • Hermes (desktop client research data manager and file transfer agent) • XDMS (web-based research data manager and curator) • DIMSIM (Distributed Integrated Multi-Sensor and Instrument Middleware)
Synergies with ANDS ARCHER’s tools increase the amount of curated data being stored in secure, reliable, and sustainable repositories, facilitating the sharing of Australian research data.
Synergies with ARCS • ARCHER Research Repository being used for ARCS Data Fabric • Hermes tool of choice for transferring data into the ARCS Data Fabric • Offering ARCHER’s enhanced version of Plone for better collaboration around research data • Access to Shibboleth-protected apps from the desktop • Conversion of Shibboleth credentials into a short-term certificate
ARCHER at Monash University: Deployment for Protein Crystallography
Research Data Management Lifecycle (diagram: stages Conceive, Design, Experiment, Collaborate, Analyse, Publish, Expose; labels: Data Management Plan, Institutional Data Store, Research Data Manager, Institutional Repository, Community Portal)
LaRDS – an Institutional Data Store (lifecycle diagram with LaRDS highlighted)
Large Research Data Storage (LaRDS) • Very large: 1.5 PB (i.e. 1.5 million GB) and growing • Institutional resource • Secure • Reliable • Accessible via various file access protocols
ARROW – an Institutional Repository (lifecycle diagram with ARROW highlighted at the Publish stage)
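The "1.5 PB is 1.5 million GB" figure follows from decimal (SI) units, where 1 PB is 10^15 bytes and 1 GB is 10^9 bytes. A short check, with the binary-unit equivalent shown for comparison (the slide does not say which convention was intended, so the decimal reading is an assumption):

```python
# Decimal (SI) units.
PB = 10**15
GB = 10**9

capacity_bytes = 15 * PB // 10          # 1.5 PB as an exact integer
capacity_gb = capacity_bytes // GB      # 1.5 million GB

# Binary (IEC) units, for comparison.
PIB = 2**50
GIB = 2**30

capacity_gib = (15 * PIB // 10) // GIB  # about 1.57 million GiB

print(capacity_gb)   # 1500000
print(capacity_gib)  # 1572864
```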
ARROW – an Institutional Repository • Generalised institutional repository solution for research information management • Initial focus on managing and exposing traditional “print equivalent” research outputs • Employing open standards where possible • Delivering a blend of open source and commercial software tools • Based on Fedora
TARDIS – a Community Portal (lifecycle diagram with TARDIS highlighted at the Publish and Expose stages)
TARDIS – The Australian Repository for Diffraction ImageS A Protein Crystallography community portal • Currently allows sharing of raw datasets • Developed tools for curating and depositing research data • Harvests OAI-PMH metadata and PURL from repositories • Later, will enable collaboration between Protein Crystallographers • See: http://www.tardis.edu.au
Research Data Management @ Monash Uni (diagram: lifecycle stages Conceive, Design, Experiment, Collaborate, Analyse, Publish, Expose; labels: Data Management Plan, LaRDS, ARCHER, ARROW, TARDIS)
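To give a feel for what harvesting OAI-PMH metadata involves, the sketch below builds a standard `ListRecords` request URL and pulls record identifiers and Dublin Core `identifier` links out of a response. It is not TARDIS's harvester: the base URL, the sample XML, and the function names are made up for illustration, and the sample response is hand-written rather than fetched from a real repository.

```python
from typing import List, Tuple
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_records(xml_text: str) -> List[Tuple[str, List[str]]]:
    """Return (OAI identifier, [dc:identifier values]) for each record."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.findall(".//oai:record", NS):
        oai_id = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        links = [el.text for el in rec.findall(".//dc:identifier", NS)]
        records.append((oai_id, links))
    return records


# Hand-written sample response; the repository and PURL are hypothetical.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:123</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example diffraction dataset</dc:title>
          <dc:identifier>http://purl.example.org/dataset/123</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

url = list_records_url("http://repository.example.org/oai")
records = extract_records(SAMPLE)
```

A real harvester would also fetch the URL over HTTP and follow `resumptionToken` values to page through large result sets; both are omitted here to keep the sketch self-contained.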
Future of ARCHER • Expecting that the partners will continue to develop the tools they created • New enhanced versions already being worked on • Some components now being offered and supported by ARCS
What researchers in general can expect from ARCHER • Open source software tools focused on deposition, management and curation of research data • A secure place to collect, store and manage experimental data • Easier collaboration and sharing of research datasets • Being able to easily customise a collaborative portal web site relevant to their research field
For more information… See: http://www.archer.edu.au/ Contact: Anthony Beitz Technical Manager, Monash e-Research Centre Ph: +613 9902-0584 Anthony.Beitz@adm.monash.edu.au