Draft Ambient Water Quality Guidelines for Sulfate In British Columbia Water Protection & Sustainability Branch October 13th, 2011
Overview • Review process • Background concentrations • WQGs vs. WQOs • Data requirements • Process for Sulfate guideline • Maximum likelihood estimates • Model Averaging • Questions sent from MABC
Sulfate background concentrations in BC • Table 1. Summary of ambient dissolved sulfate concentrations in BC freshwaters.
Water Quality Guidelines • Science-based and intended for generic provincial application • Protect most sensitive species and life stage during indefinite exposure • All higher components of the aquatic ecosystem (e.g. algae, macrophytes, invertebrates, amphibians, fish) are considered if the data are available. • Maximum – protect against lethal effects • 30-day average – protect against sub-lethal effects
Water Quality Objectives • Refinement of BC-approved & working WQGs • Developed to protect the most sensitive water use, at a specific location, taking local circumstances into account
What are WQGs and WQOs used for? • Input to permits, licenses, orders and regulations • As benchmarks • To report to the public on water quality • To determine if remedial action is necessary • To promote water stewardship
Data Requirements • Acute (max) & chronic (30-d avg) • Fish • ≥ 3 freshwater species resident in BC, ≥ 2 cold-water (e.g. trout) • Invertebrates • ≥ 2 invertebrates from different classes, 1 planktonic species resident in BC (e.g. daphnid) • Plants • ≥ 1 freshwater vascular plant or algal species resident in BC • Amphibians • Highly desirable
Preferred endpoints • Most appropriate ECx/ICx representing a low-effects threshold > EC15-25/IC15-25 > LOEC > MATC > EC26-49/IC26-49 Uncertainty factors (typically between 2-10) • Decided on case-by-case basis based on data quality and quantity, toxicity of the contaminant, severity of toxic effects, bioaccumulation potential and scientific judgement.
Sulfate WQG development • Literature review • Conduct chronic toxicity tests (PESC) • Rainbow trout • Chinook salmon • Fathead minnow • Algae • Hyalella • Freshwater mussel • Bullfrog • Note: all species were tested at 50, 100, and 250 hardness
Sulfate WQG development • Later we received Elphick et al. (2011) data • Rainbow trout (15 hardness) • Coho salmon (15 hardness) • Ceriodaphnia (40, 80, 160, 320 hardness) • Rotifer (40, 80, 160, 320 hardness) • Hyalella (80 hardness) • Fathead minnow (40, 80, 160, 320 hardness) • Tree frog (15, 80 hardness) • Algae (10, 80, 320 hardness) • Moss (15 hardness)
Sulfate WQG development • Statistical analysis (MLE) of toxicity data from PESC and Elphick et al. (2011) • Reviewed hardness and sulfate toxicity data with species and endpoints tested • No consistent relationship between water hardness and sulfate toxicity (similar results reported in Elphick et al. 2011). • MLE results sent to MABC for review – concern with model choices and sensitivity of model choice at low-effect concentrations – model averaging
Maximum Likelihood Estimates • Method to fit curves to data (e.g. dose-response of sulfate vs. mortality) • Quantal responses (mortality): use the Probit model • Continuous responses (e.g. growth): isotonic regression (ICPIN) or 3-parameter log-logistic • After fitting the model, assess goodness-of-fit (residual plots, etc.) • Use the fitted curve to estimate LCxx values • Caution advised if extrapolating to very low-effect LCxx endpoints, e.g. LC01 or LC001 values • MLE extracts all information from the data, but appropriate models must be chosen
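A minimal sketch of the quantal case on this slide, assuming a two-parameter probit curve on log10 concentration fitted by Fisher scoring. The data are hypothetical (not taken from the PESC or Elphick et al. (2011) tests), the function names `fit_probit` and `lc` are illustrative, and the sketch omits the overdispersion and natural-response terms discussed in the responses.

```python
from math import log10
from statistics import NormalDist

STD_NORMAL = NormalDist()

def fit_probit(doses, n_tested, n_dead, max_iter=100, tol=1e-10):
    """MLE of (a, b) in P(death) = Phi(a + b * log10(dose)) via Fisher scoring."""
    x = [log10(d) for d in doses]
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        s0 = s1 = i00 = i01 = i11 = 0.0
        for xi, ni, yi in zip(x, n_tested, n_dead):
            eta = a + b * xi
            # Clamp the probability so the binomial variance never hits zero.
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            phi = STD_NORMAL.pdf(eta)
            resid = (yi - ni * p) * phi / (p * (1 - p))   # score contribution
            weight = ni * phi * phi / (p * (1 - p))       # expected information
            s0 += resid
            s1 += resid * xi
            i00 += weight
            i01 += weight * xi
            i11 += weight * xi * xi
        # Solve the 2x2 system (information matrix) * step = score.
        det = i00 * i11 - i01 * i01
        da = (i11 * s0 - i01 * s1) / det
        db = (i00 * s1 - i01 * s0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b

def lc(x_percent, a, b):
    """Concentration at which the fitted curve predicts x_percent mortality."""
    return 10 ** ((STD_NORMAL.inv_cdf(x_percent / 100) - a) / b)

# Hypothetical test: 20 organisms per concentration (mg/L sulfate).
doses = [100, 200, 400, 800, 1600]
n_tested = [20, 20, 20, 20, 20]
n_dead = [1, 3, 8, 15, 19]

a, b = fit_probit(doses, n_tested, n_dead)
print(f"LC10 = {lc(10, a, b):.0f} mg/L, LC50 = {lc(50, a, b):.0f} mg/L")
```

Because the whole curve is fitted at once, the LC10 comes from all concentrations, not only from the lowest ones; a goodness-of-fit check would still be needed before using it.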
Model Averaging • Problem • Different models can lead to (greatly) different estimates of LCxx • Relying on a single model's estimates can be misleading. • Solution: • Fit several models • Find relative support for fit of models to data using AIC = trade-off of fit and complexity • Weighted average of model estimates of LCxx incorporates model uncertainty
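The weighting step above can be sketched as follows, assuming the usual Akaike-weight formula (weight proportional to exp(-ΔAIC/2), normalized to sum to 1). The AIC values and LC10 estimates are hypothetical, and the function names are illustrative.

```python
from math import exp

def akaike_weights(aics):
    """Relative support for each model: exp(-0.5 * delta AIC), normalized."""
    best = min(aics)
    rel = [exp(-0.5 * (aic - best)) for aic in aics]
    total = sum(rel)
    return [r / total for r in rel]

def model_average(estimates, aics):
    """AIC-weighted average of the per-model estimates (e.g. LC10 values)."""
    weights = akaike_weights(aics)
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical: two candidate models fitted to the same dataset.
lc10_estimates = [300.0, 500.0]   # mg/L, one estimate per model
aics = [100.0, 102.0]             # lower AIC = better fit/complexity trade-off

print(akaike_weights(aics))
print(model_average(lc10_estimates, aics))
```

A model whose AIC is 2 units worse still receives about a quarter of the weight here, so the averaged estimate sits between the two single-model values instead of following the best model alone.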
BC Draft Sulfate WQG • New chronic guideline of 65 mg/L, based on the 28-day LC10 for rainbow trout of 127 (47-342) mg/L with a minimum uncertainty factor of 2. • Maximum guideline increased from 100 to 250 mg/L, based on LC50 data (C. dubia, D. magna, Hyalella) with a minimum uncertainty factor of 2.
BC Draft Sulfate WQG • Water hardness may decrease the toxicity of sulfate for some endpoints and species; however, no consistent relationship was found. • Site-specific water quality objectives using site water would be more appropriate for determining whether the ion composition decreases, increases or has no effect on sulfate toxicity to organisms in a particular water body. • Site-specific water quality objectives that take local conditions into consideration are developed in consultation with Ministry of Environment staff.
MABC Questions received 1) “Early lifestage tests using trout Test validity: We would like to discuss control performance of the early life stage rainbow trout tests. We do not believe that the control in the soft water test passed a reasonable test performance criterion for this type of test. Furthermore, we are concerned that the poor survival in the soft water, and the high degree of variability in this test is indicative of a stressed population of test organisms, and that these tests should be rejected because of QA/QC concerns.”
Response (Craig Buday - Pacific & Yukon Laboratory for Environmental Testing (PESC)) Cumulative control mortality must not exceed 35% (i.e. survival of 65% or better is acceptable). For the rainbow trout eyed-egg test, cumulative mortality was 27% (73% survival), which passes the validity criterion. Reference: Environment Canada 1998. Biological Test Method: Toxicity Tests Using Early Life Stages of Salmonid Fish (Rainbow Trout), EPS 1/RM/28, second edition.
MABC Questions received 1) “Early lifestage tests using trout Statistical analysis: We would like to discuss the statistical power associated with the early life stage tests using rainbow trout. Specifically, we do not believe that the test had sufficient power to detect a 10% deviation from the control with a reasonable degree of confidence, and that the LC10 value reported does not make sense in the context of the dataset.”
Response (Carl Schwarz, P. Stat., SFU) The sulfate levels vary considerably from control levels, e.g. up to around 2000. Over the entire range of measured sulfate levels, the probit model (allowing for overdispersion) had a statistically significant slope (p=.0016 for hardness 50; p=.0252 for hardness 100; p=.0247 for hardness 200), so an effect of sulfate level over the range of doses was detected. Once the model is fit using the ENTIRE dataset, you can extrapolate back to LC10 values despite the large variability in the raw data.
MABC Questions received 3) “Tests using the freshwater mussel Statistical analysis: We would like to discuss the effect levels reported for the soft water test using the freshwater mussel. Specifically, we do not believe that the LC10 reported in the Draft document is supported by the data because the test does not have sufficient power to detect that level of effect with a reasonable degree of confidence, and the statistics used did not account for background mortality in the control from that test. The LC10 value reported does not make sense in the context of the dataset, and we believe that the LOEC, MATC, LC25 or LC50 might be a more appropriate value.”
Response (Carl Schwarz, P. Stat., SFU) • Mussel "controls" had sulfate levels of 151 at hardness 250; 60 at hardness 100; and 30 at hardness 50. "Control" mortality was 10% at hardness 50 (1 of 10 died); 10% at hardness 100 (1 of 10 died); and 0% at hardness 250 (0 of 10 died). • We did not fit a model with a natural response because the data are simply too sparse to fit such a model: the sample sizes are very small, the control doses are not at 0 sulphate, and there is virtually no change in response over the sulphate levels presented. Consequently, it is extremely difficult to know whether the mortality observed at the "control" doses is an effect of the sulfate or natural background mortality. This is reflected in the very wide confidence limits for the LC10, which indicate that it cannot be estimated very well. • The CETIS printouts fitted the linear interpolation model, which assumes that the mortality rate at the control dose is known with "certainty", but again, this is likely not true because of the very small sample sizes (only 10 organisms on test).
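To illustrate why a control mortality of 1 in 10 is far from known with certainty, here is a small sketch that computes an exact (Clopper-Pearson) binomial confidence interval by bisection. This interval is an illustration chosen here, not a method used in the guideline or in CETIS, and the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for a mortality rate from k deaths out of n."""
    def solve(g):
        # g is increasing in p with a sign change on [0, 1]; bisect for the root.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower limit: the rate at which seeing k or more deaths has probability alpha/2.
    lower = 0.0 if k == 0 else solve(
        lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit: the rate at which seeing k or fewer deaths has probability alpha/2.
    upper = 1.0 if k == n else solve(
        lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

print(exact_binomial_ci(1, 10))   # mussel "control" at hardness 50 or 100
print(exact_binomial_ci(0, 10))   # mussel "control" at hardness 250
```

With only 10 organisms, an observed 10% control mortality is compatible with true rates from roughly 0.3% to 45%, which is why treating the control rate as fixed can be misleading.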
Response (MOE) • Preferred endpoints: • Most appropriate ECx/ICx representing a low-effects threshold > EC15-25/IC15-25 > LOEC > MATC > EC26-49/IC26-49 • We report the confidence intervals for all the estimates in the guideline.
MABC Questions received 4) Tests using the Pacific Tree frog Statistical analysis: We would like to discuss the effect levels reported for moderate hardness test (80 mg/L) using the Pacific Tree frog. Specifically, the analyses used did not account for the background mortality in the control from that test and the LC10 value reported does not make sense in the context of the dataset. We believe that the point estimates reported by Elphick et al. (2011) for this test were calculated appropriately for this test.
Response (Carl Schwarz, P. Stat., SFU) Only 2 hardness levels were used. At hardness 15, the control sulfate is given as "1"; was this real or merely a placeholder for data entry? No mortality was observed (0 out of 15 on test). At hardness 80, the control sulfate level is 93 and 2/15 mortalities were observed. We fit a model with no threshold effect at either hardness level, and the 2/15 mortalities at sulfate level 93 are not unreasonable given the fitted dose-response curve. A model with a control threshold was not fit because of the sparseness of the data. If you look at the raw mortality numbers, the effect of sulfate at hardness 80 appears to be "worse" than at hardness 15, as the total mortalities tend to be higher in general at comparable sulfate levels. This might be the result of a threshold taking effect at the higher hardness level, but it is difficult to discern because the control doses are too different between the 2 studies. This is also the species where the CETIS printouts use a control threshold at 1 hardness level and not at the other. All hardness levels need to be considered simultaneously. For example, is it biologically reasonable to have no natural response at hardness 15 and a natural response at hardness 80?
MABC Questions received 5) “Tests using Hyalella azteca Statistical analysis: We would like to discuss the effect levels reported for growth of Hyalella azteca. We do not believe that these tests are sufficiently robust to calculate an IC10 with a reasonable degree of confidence. It should be noted that the data are contradictory to information from other tests performed by Nautilus, which indicate a lower sensitivity to sulphate in higher hardness using both growth and survival endpoints.”
Response (Carl Schwarz, P. Stat., SFU) In these studies the observed response "increased" from baseline and then decreased. We tried a variety of models, but the most suitable was a separate isotonic regression model for each hardness level, which gave estimates of 1326 at hardness 50; 645 at hardness 100; and 333 at hardness 200. CETIS fit a 3P log-Gompertz (IC10=683) at hardness 200; an ICPIN (isotonic regression) model (IC10=638) at hardness 100; and an ICPIN (isotonic regression) model (IC10=1321) at hardness 50. Our results are nearly identical to those from CETIS except for the very hard water, but all our estimates have very wide confidence limits, which are reported in the guideline. We didn't have access to this other dataset, but it could be integrated into the analysis if available.
MABC Questions received 6) “Tests using fathead minnows (Nautilus data) Statistical analysis: We would like to discuss the recalculated LC10 value for 80 mg/L water hardness (LC10 of 426 mg/L sulphate); this value makes no sense in the context of the dataset, in which there is no deviation from the control response at up to 1300 mg/L sulphate.”
Response (Carl Schwarz, P. Stat., SFU) The raw data have a control dose of 37 for sulfate at hardness 40 with 1/30 mortality; a control dose of 74 for sulphate at hardness 80 with 3/30 mortality; a control dose of 130 for sulfate at hardness 160 with 1/30 mortality; and a control dose of 300 for sulphate at hardness 320 with 0/30 mortality and EC10 1555. The two top models are the separate probit model (model weight 0.53) and the monotonic effect of hardness model (model weight 0.47). Neither has a natural response. Estimated LC10s were 301 for hardness 40; 426 for hardness 80; 1074 for hardness 160; and 2318 for hardness 320. CETIS EC10s were 2450 for hardness 320 (with a 0% threshold); 3231 for hardness 160 (with a 10% threshold); 1555 for hardness 80 (with a 10% threshold); and 558 for hardness 40 (with a 3% threshold). The key difference was the use of a threshold by CETIS and no threshold by us. The CETIS thresholds don't vary in a consistent fashion, i.e. the threshold goes from 3% to 10% and down to 0%. Is this a sensible thing for thresholds? Given the non-zero sulphate levels at the "control" doses, the observed mortality is consistent with an effect of sulphate rather than a threshold.
MABC Questions received 7) “Selection of point estimates - General We would like to discuss problems with calculation and use of tenth percentile effect levels from tests that allow 10 or 20 percent effect in the control as acceptable, and in which the minimum significant difference that is statistically detectable is typically in the range of 20 to 30%. In such tests, variability precludes calculation of 10th percentile estimates with a reasonable degree of confidence.”
Response (Carl Schwarz, P. Stat., SFU) • Yes, there can be a problem where the threshold effects are large and the control "doses" are not zero. Are these effects of the sulfate or natural responses? • Yes, trying to estimate small changes from an ill-determined baseline is problematic and likely highly model dependent, but don't forget that the ENTIRE dataset is used to fit the curve, so a reasonably well-fitted dose-response curve does provide some (but not overwhelming) information about small effects. The alternative is larger experiments. • Caution would be advised if estimating very small effects, e.g. LC01 or LC001, but LC10 is well within the range of observed data. • Wide confidence limits would indicate poor estimates. Model averaging would account for estimates that are highly model dependent.
MABC Questions received 8) “We would also like to discuss inconsistencies between MoE guidance on deriving water quality guidelines, which suggest that "the lowest observed-effect concentration (LOEC) or EC(low-effect generally thought to be EC15-20) from a reliable chronic exposure study, preferably on sensitive native BC species, are selected.".
Response (MOE) • Preferred endpoints: • Most appropriate ECx/ICx representing a low-effects threshold > EC15-25/IC15-25 > LOEC > MATC > EC26-49/IC26-49 • We will clarify in the Derivation document
MABC Questions received 9) “Selection of statistical methods - General We would like to discuss the use of linear interpolation for determination of point estimates from data sets with continuous data, since this approach is considered less appropriate than non-linear regression by Environment Canada".
Response (Carl Schwarz, P. Stat., SFU) • Yes, there are problems in using the linear interpolation (isotonic regression) method. Note that CETIS often chooses this type of model as well (!). One problem is that it is not possible to extrapolate below the smallest observed dose or above the largest observed dose. • Another problem is that the method implicitly chooses the observed response at the lowest dose as the baseline against which effects are to be determined, i.e. the EC10 is 10% below baseline. If the control dose is far from "0" this may not be appropriate. • We modelled continuous measurements (such as weight) using several different models (e.g. the 3-parameter logistic), but these may not fit as well. There are hundreds of other models that could be fit, but regardless of the model, extrapolation below the smallest observed dose must be taken with a grain of salt as there are no data available. • Estimates of moderate effects, e.g. LC25 or LC50, likely don't depend very much on the model (assuming the doses cover the endpoint). Estimates of LC10 are more model dependent, but still within the range of observed doses.
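The two limitations described above can be seen in a minimal sketch of an ICPIN-style calculation: treatment means are first smoothed to be non-increasing (pool adjacent violators), then the ICp is found by linear interpolation between doses. The sketch assumes equal replication, interpolation on the untransformed concentration scale, and hypothetical growth data; it is not the CETIS implementation, and the function names are illustrative.

```python
def pool_adjacent_violators(means):
    """Smooth means so they never increase with dose (equal replicates assumed)."""
    blocks = []  # each block is [sum_of_means, count]
    for m in means:
        blocks.append([m, 1])
        # Merge backwards while a later block's mean exceeds the earlier one's.
        while len(blocks) > 1 and (blocks[-1][0] / blocks[-1][1]
                                   > blocks[-2][0] / blocks[-2][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    smoothed = []
    for total, count in blocks:
        smoothed.extend([total / count] * count)
    return smoothed

def icp(doses, means, p):
    """Dose giving a fraction p reduction from the lowest-dose (baseline) response.

    Returns None when the target lies outside the observed responses, because
    linear interpolation cannot extrapolate beyond the tested doses.
    """
    smoothed = pool_adjacent_violators(means)
    target = smoothed[0] * (1 - p)   # baseline is the response at the lowest dose
    for j in range(len(doses) - 1):
        upper, lower = smoothed[j], smoothed[j + 1]
        if lower <= target <= upper and upper > lower:
            frac = (upper - target) / (upper - lower)
            return doses[j] + frac * (doses[j + 1] - doses[j])
    return None

# Hypothetical growth data: response rises above baseline, then declines.
doses = [30, 100, 300, 1000, 2000]      # mg/L sulfate; "control" is not at 0
growth = [1.0, 1.1, 0.9, 0.6, 0.3]      # mean growth per treatment

print(icp(doses, growth, 0.10))   # IC10 by interpolation
print(icp(doses, growth, 0.90))   # None: a 90% reduction was never observed
```

Note that the baseline here is the pooled response of the two lowest doses, not a true zero-sulfate control, so the IC10 is measured against whatever the lowest tested doses happened to produce.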
MABC Questions received 10) “Role of water hardness in modifying acute toxicity: We would like to discuss the role that water hardness plays in altering the acute toxicity of sulphate. Toxicity test results from acute toxicity tests are consistent with the conclusion that water hardness reduces toxicity of sulphate (with the sole exception being Chironomids, which are insensitive to sulphate), and we would like to discuss why water hardness was not incorporated into the maximum guideline for sulphate".
Response (MOE) • BC Water Quality Guidelines intended for generic provincial application • Protect most sensitive species and life stage during indefinite exposure • We did not find a consistent relationship with water hardness and sulfate toxicity
MABC Questions received 11) “Role of water hardness in modifying chronic toxicity: We would like to discuss why it is necessary that decreasing toxicity with increasing water hardness is observed with all test species. As long as sensitive test organisms are protected across the full range of hardnesses, it should not matter if some less sensitive species do not show less sensitivity to sulphate at higher hardness.”
Response (MOE) • BC Water Quality Guidelines intended for generic provincial application • Protect most sensitive species and life stage during indefinite exposure • We did not find a consistent relationship with water hardness and sulfate toxicity
MABC Questions received 12) “Role of total dissolved solids in toxicity to Ceriodaphnia: We would like to discuss adverse effects observed with Ceriodaphnia in 320 mg/L water hardness, and the relevance of this datapoint to setting water quality guidelines for sulphate".