[Data model diagram labels: NPP, Codes, Species, Location, USDA Plants, Subsite]

Integrating Grasslands ANPP Data: Lessons Learned

Jincheng Gao (KNZ), Nicole Kaplan (SGS), Judith Kruger (KNP), Ken Ramsey (JRN), Mark Servilla (NET), Kristin Vanderbilt (SEV)
LTER Information Managers

Judy Cushing, Carri LeRoy, Juli Mallett, Lee Zeman
The Evergreen State College, Olympia, WA
Computer Scientists & Data Analysts

Christine Laney (JRN), Alan Knapp (SGS), Daniel Milchunas (SGS), Esteban Muldavin (SEV)
LTER Ecologists

The Goal: Integrate ANPP data, with its drivers (contextual data), for cross-site comparisons, past and future. What can we say?

What We've Done
• Common ANPP data model and database, incl. structures for provenance, for 4,126,700 grams measured over 20 years in 1697 plots.
• Scripts to auto-process site data for JRN, SEV, SGS; nearly done for KNZ, Kruger.
• Preliminary provision of contextual data.
• Preliminary ecological analysis.
• Planning for future….

[Figure legend: 1 JRN, 2 SEV, 3 SGS; A = 0.1112, p < 0.0001]

Why do it?
ANPP Drives Ecology Processes!
[Diagram labels: Climate, Soils, etc.; Carbon Intake & Outflow, Species Distribution]
Plant communities (based on ANPP) differ among LTER sites (duh?). These differences are correlated with temperature (r = 0.819) & precipitation (r = 0.548).

CART Model
Classification and Regression Tree Model: R² = 0.642!!

Preliminary Results
[Figure: A = 0.1821, p < 0.0001]
Variables included in model: LTER, year, PDSI, NH4, NO3, absTmax, absTmin, Tmax, Tmin, Tmean, Precip

Why it's not easy
• It's not just a technology problem; need ecology & statistics.
  a. Differences in experimental design really matter, e.g., experimental replicates, regression through zero (or not).
  b. It's essential not to confuse statistical with ecological outliers.
• Species tables and plant associations are (rightfully) highly site-specific & change over time, but not all are yet digitally processable. In any case, we need common coding of these concepts.
• Data issues that are easy to resolve with a few data sets are prohibitive with many, or when re-processing; more sites, more years mean trouble. Sheer volume of data makes a qualitative difference.
• Need to provide error coding and both static provenance and dynamic backtracking to original data, incl. species codes.
• Contextual data provision (weather, climate, soils, etc.) is neither obvious nor trivial.

• Possibly 'new' ecology results.
  • Environmental drivers of ANPP.
  • Changes in ANPP-based grassland community composition over time.
• Preliminary definition of contextual data – Ecotrends.
• Information Management: site-specific species tables, ideas for better experimental design documentation, scripting for data integration, Plants DB.
• Data integration at the physical level is necessary, but not sufficient, for ecological synthesis (to be written and submitted to DILS 2008).

Next Steps
• Ecology
  a. Write up and publish two ecology papers (outlined)
  b. More work on ANPP drivers, with more contextual data.
• Information Management
  a. Add more sites (KNZ, Kruger, …)
  b. Set up process for future data integration & distribution
  c. Refine error code reporting, change logs
• Write up and publish Computer Science results as a case study in data integration (DILS 2008).
• Make a Project Web Site
• Seek funding for further work?

Funded by NSF DBI-0417311, CISE 01-31952, BIR 99-75510, and the LTER Network Office.
For more information contact: judyc@evergreen.edu
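
Illustration (not the project's actual scripts): the need for common species coding, error coding, and backtracking to original data could be sketched roughly as below. Everything named here is an assumption made for the example: the species tables, the placeholder codes such as COMMON_A, the error-code values, and the record layout are hypothetical and do not come from the project database or from USDA Plants.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical site-specific species tables: site code -> common code.
# All codes are made-up placeholders, not real site or USDA Plants entries.
SPECIES_TABLES = {
    "JRN": {"sp01": "COMMON_A", "sp02": "COMMON_B"},
    "SEV": {"A1": "COMMON_A"},
}

# Hypothetical error codes attached to every integrated record.
ERR_OK = 0
ERR_UNKNOWN_SITE = 1
ERR_UNKNOWN_SPECIES = 2
ERR_NEGATIVE_BIOMASS = 3


@dataclass(frozen=True)
class Provenance:
    """Static pointer back to the original site data."""
    site: str
    source_file: str
    source_line: int
    original_species_code: str


@dataclass(frozen=True)
class AnppRecord:
    """One row in an assumed common ANPP model."""
    site: str
    year: int
    plot: str
    common_species_code: Optional[str]
    biomass_g: float
    error_code: int
    provenance: Provenance


def integrate_row(site, source_file, source_line, row):
    """Map one raw site row (a dict of strings) to a common record.

    Problem rows are kept and flagged with an error code rather than
    dropped, so they can be traced back and re-processed later.
    """
    table = SPECIES_TABLES.get(site)
    raw_code = row["species"]
    biomass = float(row["biomass_g"])

    if table is None:
        common, err = None, ERR_UNKNOWN_SITE
    elif raw_code not in table:
        common, err = None, ERR_UNKNOWN_SPECIES
    elif biomass < 0:
        common, err = table[raw_code], ERR_NEGATIVE_BIOMASS
    else:
        common, err = table[raw_code], ERR_OK

    return AnppRecord(
        site=site,
        year=int(row["year"]),
        plot=row["plot"],
        common_species_code=common,
        biomass_g=biomass,
        error_code=err,
        provenance=Provenance(site, source_file, source_line, raw_code),
    )
```

For example, integrate_row("JRN", "jrn_2001.csv", 7, {"species": "sp01", "year": "2001", "plot": "P1", "biomass_g": "12.5"}) yields a record with the common code filled in and the original code, file, and line kept in its provenance, while an unrecognized species code produces a flagged record instead of a silent drop.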