Data Architecture Final Report

Chris Jordan, with assistance from Kelly Gaither, Phil Andrews, J Ray Scott, and a cast of dozens.

Presentation Transcript


  1. Data Architecture Final Report • Chris Jordan, with assistance from Kelly Gaither, Phil Andrews, J Ray Scott, and a cast of dozens

  2. Process • Over a year of discussion and interaction with various communities • Campus Champions • Science Gateways • Data Collections and Data WG • TeraGrid conference BOFs • Team discussions and conference calls • Kelly Gaither, Phil Andrews, J Ray Scott

  3. What is the Data Architecture? • Providing a conceptual view of data • Understanding how data is used in individual projects and in a larger lifespan • Methodology for identifying gaps in data infrastructure • Not: • A specific set of hardware or software recommendations • A set of standards to be utilized by TeraGrid • A static view of the community and its needs

  4. Data Architecture “Spectrum” View • Communities of Use • From individual projects to the global public • Length of Value • Scratch data (days to weeks) to irreproducible data (to infinity and beyond) • TeraGrid must support the full breadth of data needs across both axes in some fashion

  5. Implementation Goals • Minimal number of new technologies • Minimally intrusive to users • TeraGrid and RPs don’t have to provide everything directly • TeraGrid must have a strategy to support all needs

  6. Arch Recommendations I • Software Support for Data Management and Replication • iRODS Multi-Site deployment • Global Namespace/File System Service • Distributed Storage (J-WAN) and HSM/File System integration • Policy and Documentation support for Data Collections • Improved GIG and RP support for Data Collections WG

  7. Arch Recommendations II • Policy and Allocations Support for Data Lifecycle Management • Ideally: Data Allocations • Minimally: Requirement to express data needs in allocation requests • Coordination with Data Infrastructure Providers • Coordination group formed: 1 DataNet Partner, REDDnet, HathiTrust members • User Services Support for Data Management • Chris working with Amit; presentation to AUS team 11/5 • Workflow and Portal tools to support expression of data lifecycle needs

  8. Implementation Issues • Are there any major issues or concerns? • Is there anything described that RPs would NOT participate in/support through existing resources and Data-WG staff time? • How to proceed with allocations process and policy changes?

  9. References • Wiki Page: • http://teragridforum.org/mediawiki/index.php?title=Data_Architecture • Implementation Notes: • http://teragridforum.org/mediawiki/index.php?title=Data_Architecture_Recommendations_Implementation
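
The "Spectrum" view on slide 4 and the gap-identification methodology on slide 3 can be illustrated with a small sketch. This is not part of the report: it assumes the two axes can be discretized into a grid, that each storage service covers some set of grid cells, and that a gap is any cell no service covers. Only the endpoints of each axis come from the slides; the intermediate points (`RESEARCH_COMMUNITY`, `PROJECT`), the service names, and the `find_gaps` helper are hypothetical.

```python
from dataclasses import dataclass
from enum import IntEnum
from itertools import product


class Community(IntEnum):
    """Communities-of-use axis. Endpoints are from slide 4."""
    INDIVIDUAL_PROJECT = 0
    RESEARCH_COMMUNITY = 1  # assumed intermediate point, not in the slides
    GLOBAL_PUBLIC = 2


class Lifetime(IntEnum):
    """Length-of-value axis. Endpoints are from slide 4."""
    SCRATCH = 0         # days to weeks
    PROJECT = 1         # assumed intermediate point, not in the slides
    IRREPRODUCIBLE = 2  # must be kept indefinitely


@dataclass(frozen=True)
class Service:
    """A hypothetical storage service and the part of the grid it covers."""
    name: str
    communities: frozenset
    lifetimes: frozenset


def find_gaps(services):
    """Return the (community, lifetime) cells that no service covers."""
    covered = {
        (community, lifetime)
        for service in services
        for community in service.communities
        for lifetime in service.lifetimes
    }
    return [cell for cell in product(Community, Lifetime) if cell not in covered]


# Hypothetical inventory, for illustration only.
example_services = [
    Service(
        "scratch-fs",
        frozenset({Community.INDIVIDUAL_PROJECT}),
        frozenset({Lifetime.SCRATCH}),
    ),
    Service(
        "archive",
        frozenset({Community.INDIVIDUAL_PROJECT, Community.RESEARCH_COMMUNITY}),
        frozenset({Lifetime.PROJECT, Lifetime.IRREPRODUCIBLE}),
    ),
]

if __name__ == "__main__":
    for community, lifetime in find_gaps(example_services):
        print(f"gap: {community.name} x {lifetime.name}")
```

With this made-up inventory, every cell involving the global public is reported as a gap, which is the kind of finding the "support the full breadth of data needs across both axes" requirement is meant to surface.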