Data Management Needs and Challenges for Telemetry Scientists
Josh M London
Wildlife Biologist, Polar Ecosystems Program
National Marine Mammal Laboratory
NOAA NMFS Alaska Fisheries Science Center

Temptation to identify biologists as the source for the raw data

The Tip of a Complex Iceberg
• Publications
• Contract reports
• Status/Listing Review
Narrowing Bottleneck: Many biologists lack the skills and training for effective, scalable database design and data management practices
Data Management
• derived products
• movement model
• synthesis
• data quality control
Field Work and Study Design
• Deployment of tags (location, age/sex, time)
• tag design/vendor
• tag programming
• opportunistic vs. planned
• hypothesis
• agency needs/mandates
• funding initiatives

Field Work & Tag Deployment
• When? Where?
• Which Tag/Vendor?
• Which Age? Which Sex? (Do we have a choice?)
• Tag Programming
• Deployment Length (attachment type)

Limited Tools for Managing Raw Telemetry Data
Needs
• Explore ‘raw’ data
• Address hypotheses
• Visualize movement/use
• Synthesize w/ dependent (e.g. health, age) and independent data (e.g. other animals, remote sensed)
‘Raw’ data
• via Argos as CSV/Text
• Process w/ Vendor Software (behavior data)
• Typically output as CSV
• Field data about animal (e.g. ID, species, sex, age, health)
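
One way to picture the "synthesize" need above is a join between the location CSV and the field data recorded at deployment. The sketch below is only illustrative: the column names (deploy_id, fix_time, lat, lon, species, sex, age_class) and all values are hypothetical, not the Argos or vendor export format.

```python
import csv
import io

# Hypothetical location records, shaped loosely like a CSV export.
LOCATIONS_CSV = """deploy_id,fix_time,lat,lon
A01,2006-07-01T12:00:00,58.10,-152.40
A01,2006-07-01T18:00:00,58.15,-152.35
A02,2006-07-02T06:00:00,57.90,-153.00
"""

# Hypothetical field data about each tagged animal.
ANIMALS_CSV = """deploy_id,species,sex,age_class
A01,seal,F,adult
A02,seal,M,subadult
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_locations_with_animals(locations, animals):
    """Attach animal attributes to every location with a known deploy_id."""
    animals_by_id = {a["deploy_id"]: a for a in animals}
    joined = []
    for loc in locations:
        animal = animals_by_id.get(loc["deploy_id"])
        if animal is None:
            continue  # no deployment record: leave the location out
        row = dict(loc)
        row.update(animal)
        row["lat"] = float(row["lat"])
        row["lon"] = float(row["lon"])
        joined.append(row)
    return joined


records = join_locations_with_animals(
    read_rows(LOCATIONS_CSV), read_rows(ANIMALS_CSV)
)
```

Each entry in records then carries both the position and the animal's attributes, which is the shape most exploration and plotting tools expect.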

Biologists Not Trained in Large Scale Data Management
Data Manager
• Postgres/PostGIS, Oracle, MySQL, SQL Server
• Normalization and Efficient Design
• Scripting, Jobs, Transactions
• Data Integrity
• Automation, Reproducible
Biologists
• Excel and/or Access
• ESRI ArcMap (shapefiles)
• Google Earth
• Mouse Click Interaction
• Programming (visual basic, R, python) recipe driven … not developers
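
To make "normalization" and "data integrity" concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the server databases listed above. The table and column names are assumptions chosen for illustration, not a recommended telemetry schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical normalized layout: one row per animal, per deployment,
# and per location fix, instead of one large flat table.
conn.executescript("""
CREATE TABLE animal (
    animal_id INTEGER PRIMARY KEY,
    species   TEXT NOT NULL,
    sex       TEXT CHECK (sex IN ('F', 'M', 'U'))
);
CREATE TABLE deployment (
    deploy_id INTEGER PRIMARY KEY,
    animal_id INTEGER NOT NULL REFERENCES animal(animal_id),
    tag_model TEXT NOT NULL
);
CREATE TABLE location (
    deploy_id INTEGER NOT NULL REFERENCES deployment(deploy_id),
    fix_time  TEXT NOT NULL,
    lat       REAL NOT NULL CHECK (lat BETWEEN -90 AND 90),
    lon       REAL NOT NULL CHECK (lon BETWEEN -180 AND 180),
    PRIMARY KEY (deploy_id, fix_time)
);
""")

# A valid animal, deployment, and location, committed as one transaction.
with conn:
    conn.execute("INSERT INTO animal VALUES (1, 'seal', 'F')")
    conn.execute("INSERT INTO deployment VALUES (10, 1, 'hypothetical-tag')")
    conn.execute(
        "INSERT INTO location VALUES (10, '2006-07-01T12:00:00', 58.1, -152.4)"
    )


def try_insert(sql):
    """Return True if the insert is accepted, False if a constraint rejects it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A location for a deployment that does not exist is rejected.
orphan_ok = try_insert(
    "INSERT INTO location VALUES (99, '2006-07-01T12:00:00', 58.1, -152.4)"
)
# A latitude outside the valid range is rejected.
bad_lat_ok = try_insert(
    "INSERT INTO location VALUES (10, '2006-07-02T12:00:00', 158.1, -152.4)"
)
location_count = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
```

The point of the design is that the database, not the person entering data, refuses orphaned or impossible records, something a spreadsheet does not do by default.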

My Perspective
To address complex questions related to marine mammal telemetry and understanding animal ecology, I had to become more of a data manager … And, in the process, I’ve become less of a biologist
Start (2006)
• Argos Monthly CDs
• SatPack Access Database
• Excel Files (limited to 56k)
• Large, Flat Tables
• No Central Repository
Current System
• Nightly FTP Argos Push
• Nightly Data Processing
• CSV/External Oracle Table
• PL/SQL Procedures
• Developed/Designed with Training via Google Search
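
A nightly load like the one listed above has to be safe to rerun, because consecutive files can overlap. The sketch below shows one possible approach using sqlite3 in place of the Oracle external table and PL/SQL procedures; it is not the author's implementation, and the table layout, column names, and sample rows are hypothetical.

```python
import csv
import io
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE location (
    deploy_id TEXT NOT NULL,
    fix_time  TEXT NOT NULL,
    lat       REAL NOT NULL,
    lon       REAL NOT NULL,
    PRIMARY KEY (deploy_id, fix_time)
)
""")


def load_nightly_csv(conn, csv_text):
    """Insert rows from one nightly file and return how many were new."""
    rows = [
        (r["deploy_id"], r["fix_time"], float(r["lat"]), float(r["lon"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    before = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
    # One transaction per file: either every row is handled or none is.
    with conn:
        # Rows whose (deploy_id, fix_time) key already exists are skipped.
        conn.executemany(
            "INSERT OR IGNORE INTO location VALUES (?, ?, ?, ?)", rows
        )
    after = conn.execute("SELECT COUNT(*) FROM location").fetchone()[0]
    return after - before


# Two hypothetical nightly files; the second repeats one earlier record.
NIGHT_1 = """deploy_id,fix_time,lat,lon
A01,2006-07-01T12:00:00,58.10,-152.40
A01,2006-07-01T18:00:00,58.15,-152.35
"""
NIGHT_2 = """deploy_id,fix_time,lat,lon
A01,2006-07-01T18:00:00,58.15,-152.35
A01,2006-07-02T00:00:00,58.20,-152.30
"""

first = load_nightly_csv(conn, NIGHT_1)
second = load_nightly_csv(conn, NIGHT_2)
rerun = load_nightly_csv(conn, NIGHT_2)
```

Keying each record on deployment and fix time is the assumption that makes the rerun harmless; a real system would need to confirm that this pair is actually unique in its data.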

My Perspective: Current Limitations
• Data access requires a minimum level of technical skills (basic SQL, Oracle framework, Oracle APEX, R spatial tools, ArcMap)
• Single Point of Access/Failure (me)
• Limited Documentation of Design
• Design May Not be Optimal/Appropriate
• Main Objective to Provide Data to Analysts – Not necessarily designed for providing data to public

My Perspective: Greatest Needs – Research Program
• Data Management and Design Consultation
• Data Design & Documentation Portal (user-friendly metadata)
• Low Tech Exploration Tools
• Database and Application Developers (data flow and data input)
• Training Opportunities

My Perspective: Greatest Needs – External to Program?
• Provide Meaningful Public Access to Data
• A Clear Data Sharing Policy w/ Best Practices
• Encourage/Facilitate Scientific Collaboration
• Meet Agency Needs and Requirements
• How to Communicate Scientific Knowledge in the Modern/Digital Age – sharing knowledge/expertise just as important as sharing data
• Publish Data Once

My Perspective: Challenges / Road Blocks
• Limited Funds and Priorities – appropriate resources for doing the priority analysis and science not available, let alone the resources to distribute data responsibly
• Database design/management often in the hands of the least skilled users
• IT Policies, Investments, and Infrastructure Varied Across Institutions
• No standard(s) for communicating and sharing ‘raw’ animal telemetry data. What is ‘raw’ data?