New Roles for Librarians: the Application of Library Science to Scientific/Technical Research -- Purdue University - a Case Study International Council for Scientific and Technical Information – ICSTI Ottawa, Ontario, Canada James L. Mullins, PhD Dean of Libraries & Professor Purdue University
June 9, 2009
e-Science • What is meant by e-Science? • e-Science is a complex, interdisciplinary, data- and computationally intensive research process, often multi-institutional and frequently international, that is changing the methodology of science. • Such a collaborative scientific enterprise requires access to large data collections, very large-scale computing resources, and high-performance visualization delivered back to the individual user scientists • Requires large-scale storage, retrieval, and transfer
What is Data Curation? • Archival Science - "manage" implies short term; "archive" implies long-term preservation with control (dependent) • Library Science - organization, description, discovery, navigation, and access are critical for identifying, finding, accessing, and using/re-using data
Gregor Mendel – an example James L. Mullins – Purdue University
Innovative Research Concepts National Science Board. Long-lived digital data collections: Enabling research and education in the 21st century.
Innovative Research Concepts • Data Authors – benefit from their own work, broadly disseminated, safely archived. • Data Managers – collaborate by ensuring successful retention and dissemination through technical infrastructure • Data Scientists – conduct creative inquiry and analysis, enhance the research of data authors National Science Board, Long-lived digital data collections: Enabling research and education in the 21st century, p. 27.
Innovative Research Concepts Data Scientists: … crucial to the successful management of a digital data collection – lie in having their contributions fully recognized National Science Board, Long-lived digital data collections: Enabling research and education in the 21st century, p. 27.
National Science Foundation Recognition of the Challenge for Data Curation Dr. Christopher Greer Former Program Director Office of Cyberinfrastructure, NSF, USA
Why curate/manage research outputs? Re-use of data for new research, including collection-based research to generate new science. Retention of unique observational data which is impossible to re-create. More data is available for research projects. Compliance with legal requirements. Ability to validate research results. Use of data in teaching. For the public good. From: e-Science Curation Report. Data curation for e-Science in the UK: an audit to establish requirements for future curation and provision. Philip Lord and Alison Macdonald. 2003
[Diagram labels: Computational & Information Sciences; Cyberinfrastructure; Computer Science; Lib/Info Sciences; Archival Sciences; Domain Science; I-Center] Conceptualization by Chris Greer, NSF – 2007
Data Context [Diagram labels: unpublished research traditional/non; “published” research non-traditional; secondary tertiary resources; “published” data/datasets; published research traditional; analyzed data/datasets; processed data/datasets; “raw” data/datasets] The changing nature of research and scholarly communication in cyber-enabled environments allows for discovery of and access to research of small research groups and unorganized, disparate, and heterogeneous data further upstream than previously imagined… Modified from: Brandt, D.S. “Scholarly Communication” (in To Stand the Test of Time: Long-Term Stewardship of Digital Data Sets in Science and Engineering: Final Report of Workshop New Collaborative Relationships: Academic Libraries in the Digital Data Universe. ARL, Washington, DC, September 2006.)
Purdue University • Founded 1869 by gift from John Purdue • Premier programs: engineering; agriculture; hospitality and tourism; business; computer science; communications. Alumni include Neil Armstrong & Gene Cernan, first and last men on the moon. • 40,105 students in 2008/09; third-largest international student enrollment in the U.S. (6,057 for 2008/09). In fall 2009, 12% of the first-year undergraduate class will be international students; the graduate school is 60% international.
Purdue University • Nine Colleges: Agriculture, Consumer & Family Sciences, Education, Engineering, Liberal Arts, Management, Pharmacy/Nursing/Health Sciences, Technology, Vet Medicine • 73 Departments, several cross-disciplinary, e.g. Agricultural & Biological Engineering
Discovery Park Interdisciplinary collaboration Discovery Park: Eleven interdisciplinary centers designed to facilitate and promote leading-edge research • Bindley Bioscience Center • Birck Nanotechnology Center • Burton D Morgan Center for Entrepreneurship • Center for Advanced Manufacturing • Center for the Environment • Cyber Center • Discovery Learning Center • E-Enterprise Center • Energy Center • Oncological Sciences Center • Regenstrief Center for Healthcare Engineering
Purdue’s HUBzero • HUBzero™ allows creation of dynamic web sites that connect a community in scientific research and educational activities, e.g. nanoHUB (from: http://hubzero.org/) • 2009 – Google lists top 200 most trafficked university sites in the world on the web – Purdue ranked 6th.
Envisioning New Interdisciplinary Collaborations Associate Dean for Research, D. Scott Brandt, Professor of Library Science Facilitates individual and interdisciplinary research efforts of the fifty Libraries faculty
And remember Data Scientists? Data Scientists: … crucial to the successful management of a digital data collection – lie in having their contributions fully recognized National Science Board, Long-lived digital data collections: Enabling research and education in the 21st century, p. 27.
And remember Data Scientists? Jacob (Jake) Carlson, appointed Data Research Scientist Purdue Libraries April, 2007
Determine need for collaboration • Hypothesized that researchers have data management needs and that librarians can help meet them • Employed top-down and bottom-up investigation for data collection • Verified: PU researchers said they need help in collecting, organizing and providing access to their data
Outside of the library • Attended research seminars, call-outs, etc., to identify collaboration and funding opportunities • Built relationships - found researchers who understood that collecting, organizing and providing access to data and information are not only important, but critical • Found problems to solve, then collaborated on solutions • Talked about what we know—organizing data and information (different meanings to different groups) • Brought something to the table. Had to be prepared to demonstrate something tangible (initially a proof-of-concept or a prototype).
Motivation (library participants) • Directly related to work, and makes something difficult easier • It’s an extension of “everyday job” • Something new and exciting to do • Breaking new ground, want to contribute to interdisciplinary initiative • Force the issue of how it gets done (i.e., more people added to help out)
Motivation (non-participants) • Articulation of what is expected by the Dean • Partly determined on a case-by-case basis • Has to be “interesting to me” • Something that uses “the skills I can bring to it” • Need to get credit for it (recognition, reward) • Important to allow individual to define what interdisciplinary research is • Should be opportunities to "stick your toe in the water" before making big commitment • Need time to do it, and to do the “things I want to do”
Purdue University Libraries Since 2004, initiative for Libraries faculty to collaborate with other faculty across campus—apply library science knowledge and expertise to research problems: collect, organize, describe, curate, archive, disseminate data/information
Past & Current Areas of Collaboration • Agricultural Economics • Agronomy • Biology • Cancer Center • Center for the Environment • Chemical Engineering • Chemistry • Civil Engineering • Cyber Center • Discovery Learning Center • Earth & Atmospheric Sciences • Economics • English • IT at Purdue • Mechanical Engineering Technology • Regenstrief Center • Graduate School • Oncological Sciences
Distributed Data Curation Center – D2C2 • Sustainability for data curation repositories • Ontological and taxonomic organization of disciplinary datasets • Metadata to facilitate access to data collections • Data curation profiles for archiving and preserving datasets • http://d2c2.lib.purdue.edu/
D2C2 Sponsored Research 2008 Awarded: $618,383 (includes cont’ing) • INDURE: aggregating dissertation metadata (Indiana Economic Development Corporation), Witt • Enabling end-to-end geospatial data modeling workflows via INPort: The Isotope Networks Portal (NSF), Miller • Integrating Spatial Educational Experiences (ISEE) into Crop, Soil, and Environmental Science Curricula (USDA), Miller, M. Bracke • INTEROP: Developing Community-based DRought Information Network Protocols and Tools for Multidisciplinary Regional Scale Applications (DRInet) (NSF), Carlson Pending (or in process for submission): • Nitrogen Science Network (Packard Foundation), M. Bracke, Witt • lisHUB: Investigating Community-based LIS Continuing Education in a Cyberinfrastructure-enabled Environment (IMLS), P. Bracke, et al. • Object Reuse and Exchange for HUBzero (IMLS), Witt, et al. • AfricaHUB (Bill and Melinda Gates Foundation), Nelson • Kollêma DataNet Project (NSF), Mullins, Carlson, Brandt
Grants (all Libraries since 2005) 2007 – 44; 2008 – 57
D2C2 Activities • Lab in STEW G64 • DSpace -> Fedora • Storage Resource Broker -> iRODS • Sun StorageTek 5800 • Current Research Information Systems (e.g., INDURE) – SRU, Z39.50 • HUB integration (OAI-PMH, Handle, ORE) • Metadata management and services
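As an illustration of the kind of integration listed above, OAI-PMH exposes repository metadata through plain HTTP requests that carry a `verb` argument. The sketch below only builds request URLs and does not contact any server. The base URL and the helper name `oai_request_url` are assumptions made for illustration and are not taken from D2C2's systems; `Identify`, `ListRecords`, and the `oai_dc` metadata prefix are defined by the OAI-PMH specification.

```python
from urllib.parse import urlencode

# Hypothetical endpoint (assumption): a real repository publishes its own base URL.
BASE_URL = "https://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments)
    return base_url + "?" + urlencode(params)


# Ask the repository to describe itself.
identify_url = oai_request_url(BASE_URL, "Identify")

# Ask for all records as unqualified Dublin Core.
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

print(identify_url)
print(list_url)
```

A harvester would fetch these URLs and parse the XML responses; that step is left out here because it depends on the repository being harvested.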
Purdue e-Data Task Force • Chartered by the Purdue Libraries Research Council • Apply research from D2C2: what would an institutional data repository service look like in the context of Purdue Libraries? • Task force chartered: July 2008 to March 2009 • Three tasks: (1) complete a data repository prospectus; (2) work with faculty and subject-specialist librarians in six different areas to ingest six different datasets into the current Purdue e-Data prototype; (3) report findings and recommendations back to the Research Council
Working Group Methodology • interview faculty • selection & appraisal • create or enhance metadata • obtain and ingest dataset • determine and create appropriate points of access • policies: submission, use, and preservation • follow-up with faculty • examine role of librarians and the Libraries vis-à-vis an institutional data repository service
Ten Questions to Begin a Conversation With Faculty About Data Curation What is the story of your data? What form and format are the data in? What is the expected lifespan of your data? How could your data be used, reused, and repurposed? How large is your dataset, and what is its rate of growth? Who are potential audiences for your data? Who owns the data? Does the dataset include any sensitive information? What publications or discoveries have resulted from the data? How should the data be made accessible? Witt, M. & Carlson, J. (2007). Conducting a data interview. http://docs.lib.purdue.edu/lib_research/81/.
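One way a librarian might keep track of such an interview is a small record keyed by question number. The sketch below is only an illustration: the class name `DataInterview`, its methods, and the example dataset title are hypothetical and are not part of the Witt and Carlson instrument; the question text is copied from the slide.

```python
from dataclasses import dataclass, field

# The ten conversation starters, in the order given on the slide.
QUESTIONS = [
    "What is the story of your data?",
    "What form and format are the data in?",
    "What is the expected lifespan of your data?",
    "How could your data be used, reused, and repurposed?",
    "How large is your dataset, and what is its rate of growth?",
    "Who are potential audiences for your data?",
    "Who owns the data?",
    "Does the dataset include any sensitive information?",
    "What publications or discoveries have resulted from the data?",
    "How should the data be made accessible?",
]


@dataclass
class DataInterview:
    """Hypothetical record of one data interview (illustrative only)."""

    dataset_title: str
    answers: dict = field(default_factory=dict)

    def record(self, question_number, answer):
        """Store the answer to a question, numbered from 1 to 10."""
        if not 1 <= question_number <= len(QUESTIONS):
            raise ValueError("question_number must be between 1 and 10")
        self.answers[question_number] = answer

    def unanswered(self):
        """Return the questions that still need to be asked."""
        return [
            question
            for number, question in enumerate(QUESTIONS, start=1)
            if number not in self.answers
        ]


interview = DataInterview("Example dataset")  # placeholder title
interview.record(2, "(answer noted during the interview)")
interview.record(7, "(answer noted during the interview)")
print(len(interview.unanswered()))
```

Keeping the unanswered questions visible makes it easy to pick up the conversation in a follow-up meeting with the faculty member.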
Examples • “Indiana Water Quality”: Group member = Chris Miller; Doctoral Student = Cristina Carbajo; Subject Librarian = Marianne Stowell Bracke • “Survey of Indiana Assistive Technology Professionals”: Group member = Jake Carlson; Faculty = Bart Bishop, Center for Assistive Technologies; Subject Librarian = Jane Kinkus • “Controversial Facilities in Japan”: Group member = Michael Witt; Faculty = Daniel Aldrich, political science; Subject Librarian = Bert Chapman • “Vehicle Signatures”: Group member = Mark Newton; Faculty = Darcy Bullock, civil engineering; Subject Librarian = Megan Nelson
Emerging issues include... • Reward structure, role of data in scholarly communication • Trust • Sustainability (both economic and technological) • Roles • Long-term preservation • Access (presenting data in an appropriate context) • Metadata (organization and description of data) • Persistence • Provenance • Ingest and scale • Intellectual property and permissions • Policies
“100 conversations, lead to 20 discussions, lead to 5 grants, lead to 1 award”
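Read as a funnel, the rule of thumb above implies a conversion rate at each step. The short sketch below works through that arithmetic using only the four figures from the quote; the variable and function names are illustrative.

```python
# Stage counts taken directly from the quote.
stages = [
    ("conversations", 100),
    ("discussions", 20),
    ("grants", 5),
    ("awards", 1),
]


def conversion_rates(stages):
    """Return (from_stage, to_stage, rate) for each consecutive pair of stages."""
    return [
        (src_name, dst_name, dst_count / src_count)
        for (src_name, src_count), (dst_name, dst_count) in zip(stages, stages[1:])
    ]


rates = conversion_rates(stages)
overall = stages[-1][1] / stages[0][1]

for src_name, dst_name, rate in rates:
    print(f"{src_name} -> {dst_name}: {rate:.0%}")
print(f"overall: {overall:.0%}")
```

Each step keeps roughly a fifth to a quarter of what came before, so about one conversation in a hundred ends in an award.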
Thank you! Questions and Answers? James L. Mullins – Purdue University jmullins@purdue.edu