
datacommons.psu



Presentation Transcript


  1. http://www.datacommons.psu.edu

  2. Overview of Today’s session
     • DataCommons@PSU background
     • Overview of capabilities
     • Case studies & data partners
     • Findings

  3. Why Develop a DataCommons?
     • Data management plans and curation are now required by funding agencies such as NSF and NIH.
     • The issue of data has been featured in journals such as Science, and it has been discussed and supported by international scientific research societies such as the Royal Society in the UK (Science as an Open Enterprise) and by organizations such as the European Commission.

  4. Why Develop a DataCommons?
     • The issues related to data acquisition, collection, curation, and access have not only become of central importance to funding agencies; they have also been recognized as vital to research, collaboration, and teaching.
     • The February 2011 special issue of Science highlights the importance of these issues: “Scientific innovation has been called on to spur economic recovery; science and technology are essential to improving public health and welfare and to inform sustainability; and the scientific community has been criticized for not being sufficiently accountable and transparent. Data collection, curation, and access are central to all of these issues.”
     • Furthermore: “Most scientific disciplines are finding the data deluge to be extremely challenging, and tremendous opportunities can be realized if we can better organize and access the data.”

  5. Science: The State of Research Data

  6. Background on the DataCommons@PSU
     • 2005
       • Concept first presented by PSIEE as an environmental data library and repository for geospatial information created by PSU faculty, researchers, and collaborators.
       • Received a $5K grant from PSU to explore the concept and acquire data.
     • 2010
       • PSIEE presented this idea to the PSIEE director, the director of the Institute for CyberScience, and the director of the High Performance Computing Center in spring 2010.
       • They recognized this as a common need and noted that efforts with similar goals and interests were already underway.
       • Growing interest and support over the next six months led to a PSU data community meeting sponsored by the Institute for CyberScience and PSIEE in September 2010.
     • 2011
       • Over the next few months, the DataCommons site and search/retrieval mechanism were developed and tested, and the first new research data was acquired.
       • The DataCommons@PSU site was officially launched in April 2011 and has grown to include a wide array of data, including geospatial data, tabular data and databases, documents, models, and protocols.

  7. Why is this important to Penn State?
     • Access to information is vital to much of the research, teaching, and outreach conducted by the Penn State community.
     • Data also demonstrates research productivity, which is usually represented only in dollars.
     • The DataCommons@PSU provides a picture of PSU research.

  8. Purpose
     • The purpose of the DataCommons@PSU is to serve as a portal to data, applications, and resources that support efforts across the Penn State community.
     • The DataCommons@PSU facilitates interdisciplinary collaboration by connecting people and resources through:
       • Data discovery & access
       • Data archiving and preservation
       • Support of data sharing
       • Development of data and application documentation (metadata)
       • Support for development of agency-required data management plans
       • Metadata development seminars (new)
     • The DataCommons@PSU does not replace existing programs or projects. It highlights them by making their information, websites, and data accessible via the DataCommons@PSU search engine.

  9. Purpose…
     • Highlight data, applications, models, and projects created by members of the university community.
     • Support collaboration and data sharing across those efforts and communities.
     • Support the development of large-scale research proposals and provide the data infrastructure to build research gateways.
     • Reduce costs by providing widespread access to data needed by multiple projects and programs, and reduce redundant data acquisition and storage (Core Data).
     • Enhance the ability to develop research proposals, publish results, and support the educational/outreach component required by major funders.
     • Provide a unifying tool that promotes cooperation and the development of cross-college/cross-campus initiatives by linking individuals and groups with similar interests and information needs.

  10. What are other universities doing?

  11. Capabilities
      • Data storage
      • Metadata development
      • Data search, retrieval, and access
      • Visualization of compatible data
      • Core data
      • Documentation and access to apps created by PSU
      • Documentation of models and protocols
      • Creation of Digital Object Identifiers (DOIs)
      • Links to existing data repositories with PSU data
      • References and links to publications based on the data

  12. Search Engine & Data Discovery Portal
      • Enhanced Data Discovery Options: Search by PSU College/Dept/Center/Institute

  13. Enhanced Data Discovery Options: Search by PSU Researcher

  14. Enhanced Data Discovery Options: Search by Research Theme

  15. Search Results

  16. • Researcher: Gabrielle Alpirez de Davie, Education. Data: Validity of O*NET Work Importance Profile web version for Spanish-speaking populations
      • Researchers: John Reichendorfer, OPP; Tom Flynn, OPP Landscape. Data: Aerial Photography, Tree Database, PSU vector data
      • Researcher: Dennis Decoteau, Horticulture. Data: Ambient air monitoring for Pennsylvania
      • Researcher: Marc Abrams, Department of Ecosystem Management. Data: Impacts of contrasting land-use history on composition, soils, and development of mixed-oak, coastal plain forests on Shelter Island, New York
      • Researcher: Kim Steiner, Department of Ecosystem Management. Data: Oak Forest Regeneration
      • Researcher: Dr. Robert P. Brooks, Geography Department, Riparia. Data: Pocono Birds--Presence & Proportion on Lakes
      • Researcher: Dr. Eric Post, Department of Biology. Data: Trophic Mismatch--Caribou Phenology
      • Researcher: Andrew Patterson, Huck Life Sciences Institute. Data: Metabolomics, Ant Tissue

  17. Data Summary Page: Downloadable Data

  18. Data Summary Page: App

  19. Apps & Tools
      • Links to Data in Application
      • Links to Data in Thematic Database

  20. Data Summary Page: Multiple Options
      • Multiple Data Viewing/Download Options
      • GIS Enabled Data

  21. Case Studies: Metabolomics Data
      • Currently hosting data for the Center for Molecular Toxicology and Carcinogenesis.
      • Data and protocols can be accessed from the DataCommons and from the Metabolomics Explorer.

  22. Case Studies: Arboretum at Penn State
      • Collaboration across departments.
      • DataCommons is working with the PSU Arboretum and OPP to acquire data and provide access to it via the DataCommons as well as an interactive application.
      • Goals are to provide both access to data and usable apps for the public, for teaching at PSU, and for research.
      • The PSU campus as a living lab!

  23. Findings
      • Need for long-term preservation.
      • Need for a plan to transition or upgrade to new versions of software.
      • Need for metadata education.
      • Curation of sensitive data.
      • Large datasets (video, astronomical observations, remotely sensed data) need to be housed and preserved.

  24. Findings continued…
      • Data storage for projects that already provide public access to data but need a centralized permanent home.
      • Need for data to be accessed by multiple interfaces.
      • Identity can be important to some providers.
      • Cross-campus workgroups that have common data and platform/software needs but no place to store the data.

  25. Questions?
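
The discovery options on slides 12 to 14 (search by college/department/center/institute, by researcher, and by research theme) amount to filtering metadata records on a few fields. The sketch below is only an illustration of that idea, not the DataCommons implementation: the `DatasetRecord` class, the `search` function, and the theme labels are hypothetical, while the titles, researchers, and units are taken from slide 16.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical metadata record; field names are assumptions."""
    title: str
    researcher: str
    unit: str   # college, department, center, or institute
    theme: str  # illustrative theme label, not from the slides


RECORDS = [
    DatasetRecord("Oak Forest Regeneration", "Kim Steiner",
                  "Department of Ecosystem Management", "forests"),
    DatasetRecord("Ambient air monitoring for Pennsylvania", "Dennis Decoteau",
                  "Horticulture", "air quality"),
    DatasetRecord("Metabolomics, Ant Tissue", "Andrew Patterson",
                  "Huck Life Sciences Institute", "metabolomics"),
]


def search(records, **facets):
    """Return records whose fields contain every facet value (case-insensitive)."""
    def matches(record):
        return all(
            value.lower() in getattr(record, field).lower()
            for field, value in facets.items()
        )
    return [record for record in records if matches(record)]


if __name__ == "__main__":
    for hit in search(RECORDS, researcher="steiner"):
        print(hit.title)
```

Facets combine with a logical AND, so `search(RECORDS, unit="huck", theme="metabolomics")` narrows by unit and theme at once, and a call with no facets returns every record.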
