Symposium on Digital Curation in the Era of Big Data: Career Opportunities and Educational Requirements: A Data Scientist Perspective. Dr. Vicki Lynn Ferrini, Lamont-Doherty Earth Observatory
Background (What I do) • Data Documentation (Metadata) • Data Management • Data Discovery & Access Tools • Develop/Implement QA/QC • Data Syntheses • Data Compliance Tools • Education Materials • Delivery to National Data Centers, Libraries • Data Publication & Links to Scientific Literature • Data Integration, Visualization & Analysis Tools • Best Practice Guidelines for Optimizing Acquisition • Mission: “Support, sustain, and advance the geosciences by providing data services for observational solid earth data from the Ocean, Earth, and Polar Sciences.” • rvdata.us
Scientific Data Continuum • THEN: Data Producers, Scientific Literature, Data Consumers • NOW: Data Producers, Data Providers, Scientific Literature, Data Consumers • Varying Goals/Perspectives/Needs
Perspective of Data Producers • Goal: Scientific Discovery • Data Acquisition & Reduction • Data Assembly • Visualization, Integration & Interpretation • Scientific Standards • Technical & Operational Limitations • Data documentation • Varies by domain • Often difficult • Heterogeneous Domain Specialists
Perspective of Data Consumers • Goal: Discovery • Data Discoverability & Access • Cross-disciplinary • Scientific Standards • Interpretation • Increased importance of documentation • Data not self-generated • Data Quality/Reliability • Data Use/Misuse Domain Specialists & Public
Perspective of Data Providers • Goal: Access/Preservation/Re-Use • Data Formats & Standards • Data Documentation & Preservation Techniques • Scientific & Metadata Standards • Data Citation • Data Transfer Mechanisms • System Usability • Interoperability/Linked Data • Needs of Diversity of User Community • Knowledge of Content Human & Digital Bridge between Producers & Consumers
At the Intersection: The Data Scientist • Data Producers • Data Consumers • Data Providers
Data Stewardship Continuum Data Scientist DATA PRODUCERS DATA PROVIDERS DATA CONSUMERS
Key Attributes of Data Scientists • Knowledge spanning full scientific data stewardship continuum • Domain Experience • Content & applications • Data acquisition & reduction practices • Nuances of Data • Technical knowledge • Evolving Technologies • Data Acquisition & Management • Metadata
Key Attributes of Data Scientists • Other skills (seldom taught) • Communication & Organization • Understand cultural aspects of user community • People/Project Management • Balance between micro- and macro-perspectives
Key Attributes of Tech Team Members • Basic knowledge of content OR interest/curiosity • Experience with Data Production/Consumption • Technical skills: • web development & technology • geospatially enabled data management tools • experience with data analysis tools • ability to work in a variety of tech environments • Complementary skill sets • Innovation & creativity • Willingness to ask questions (assumptions can be dangerous)
Challenges & Opportunities • Difficult to find right balance between technical skills and interest in content • Team dynamics, management approaches evolving • Increasing opportunities to engage/educate computer scientists in domain science • Data producers are slow to join the digital era • Educational opportunities • Scientific benefits continue to grow • New generation incorporating data sharing into scientific workflow • Difficult to keep pace with evolving technologies • Educational & Professional Development opportunities
The Future? Data Scientists Data Producers Data Consumers Data Providers