“DataWay” Towards a National Infrastructure for Heterogeneous Data

Presentation at WebEx Meeting, June 15, 2012.


Presentation Transcript


  1. “DataWay” Towards a National Infrastructure for Heterogeneous Data. Presentation at WebEx Meeting, June 15, 2012

  2. DataWay Webinar Outline • Context • Challenge • Anticipated Outcomes • Framework • Timeline & Guidance • Comment and Questions

  3. The Context and the Challenge: Science is being Transformed by Data and Computation • Integrative and multi-scale • Not bound by organizational limits • Multi-disciplinary collaboration • Data Forests • Heterogeneous • Distributed, diverse • Central repositories • Data needs to be accessible and interoperable across funding agency and national boundaries

  4. Heterogeneous Data • Diverse sources and sizes. • Simulations, experiments, observations • From a variety of disciplines • Connected, distributed, or centralized. • Across a range of length and time scales, resolutions, and accuracies.

  5. Examples of Heterogeneous Data Multi-scale approaches

  6. Discovery of new materials and phenomena [Diagram labels: Computational tools, Digital Data; Experimental tools, Digital Data] • A modeling paradigm: solving complex nanostructures. • No single data set contains sufficient information by itself to constrain a unique solution.

  7. Optimizing the Physical Internet • Leveraging internet models to optimize the global “Physical Internet” for the distribution of goods worldwide, minimizing time, energy consumption and cost. • Efficient real-time access, processing, interpretation and management of global sensory and event data is a fundamental element of this work at the Center for Excellence in Logistics and Distribution (CELDi). • Universities: Arkansas, Clemson, Berkeley, Oklahoma, Missouri, Arizona State, Oklahoma State, Virginia Tech • Members: Over 25 members from industry and government

  8. DataWay Strategies Supporting Collaborative Data Use in Research Across the Sciences • A charrette is a collaborative session in which a group of designers drafts a solution to a problem. • Often refers to activities that are focused on producing actionable plans for future funding. • The DataWay charrette is the beginning of a process. • To engage the community in developing strategies that identify and support the emergence of broadly useful ideas for a data infrastructure that facilitates and promotes efficient data utilization and management across the research communities.

  9. Anticipated Outcomes of DataWay Charrette • An engaged community, increasingly contributing to an interactive website. • A growing repository of white papers defining the elements of key issues. • Iterative discovery process leading to consensus on the best approaches. • Integrated and sustained connections among elements of the discovery process.

  10. Cyberinfrastructure Framework for 21st Century Science, Engineering and Education Major NSF investment area for FY 13 and beyond • Comprehensive and integrated CI • To transform research, innovation and education • Focus on computational and data-intensive science to address complex problems • Four major components • Data-enabled science • New computational infrastructure • Community research networks • Access and connections to CI facilities

  11. Current NSF Funding for “Data” • DataNet • Long-term preservation and access of data • Data Infrastructure Building Blocks (DIBBs) • Software Infrastructure for Sustained Innovation (SI2) • Cyber-Enabled Discovery and Innovation (CDI) • Data enabled science and engineering • Core Techniques and Technologies for Advancing Big Data Science & Engineering (BIGDATA)

  12. Anticipated Outcomes of DataWay Build strategies that support the development of infrastructure that will: • Facilitate the emergence of broadly useful tools that can be used by investigators in many fields. • Support the evolution of collaborative communities around the use of data infrastructure tools by promoting better communication, exchange and cross-education.

  13. Framework Elements • An infrastructure framework that supports data • Collection • Curation • Analysis • Visualization • Integration • Searching • for data sets from experiments and simulations from many sources and at many scales. • Salient issues include • Validation • Annotation • Interoperability • Ontology

  14. An open heterogeneous data infrastructure network will evolve over time. http://sydney.edu.au/engineering/it/~shhong/temporal.htm

  15. Framework Elements A common architecture is needed to produce an effective, interworking, sustainable system to: • Support the development of integrated and interactive services that transcend fields and accelerate discovery in complex, multi-scale problems. • Create interoperable digital-access infrastructures, providing open, extensible and sustainable networks. • Foster collaborations and the sharing of observations, simulations, and other relevant scientific information. • Facilitate data transfer between individual researchers and data systems & applications. • Integrate research and education.

  16. Timeline • Milestones shown: Proposed framework approaches developed; Short-term enabling awards; Two WebEx events; Charrette; DCL Released • Dates shown: Jun 2012; Jul-Sept 2012; Oct 2012; Nov/Dec 2012; Nov/11-Apr/13; May 2013

  17. Guidance for the Charrette • The charrette planning process is open to all. • We welcome a wide range of ideas and strategies • Participant selection will consider the collective: • Expertise in relevant cyberinfrastructure, data management, and software fields • Representation of a broad range of scientific domains

  18. Guidance for the Charrette • NSF will work with the community to prepare for the charrette. • An emphasis on engaging and connecting research, both in communities where data-enabled science is already a focus and in other communities on the cusp of data science. • We seek participation from other government agencies (e.g., DOE, NIH, NASA) and the scientific communities they support. • We seek participation from other countries, particularly in Europe and Asia, and from projects they support, especially those that need to share data across countries. • Expected outcomes from the charrette include multiple enabling awards to design framework(s) and build community involvement.

  19. Next Steps • A DataWay website will provide information on the charrette. • Guidance for the community • Preparation instructions • FAQs • The charrette will be in October. • A final date will be announced on the website and in a Dear Colleague Letter. • A second webinar will be held in August. • Questions/comments to DataWay@nsf.gov

  20. Comments and/or Questions • Where discoveries begin
