The Test for Significant Toxicity (TST) – A “New” Hypothesis Testing Approach for Aquatic Bioassay Testing

Philip Markle, Environmental Scientist, pmarkle@lacsd.org

History of the TST • June 2010 – EPA released WET TST guidance (EPA 833-R-10-003) • Also referred to as: • Bioequivalence Testing • Alternative Null Hypothesis Testing • Accepted for FDA drug trials and evaluations • Originally proposed for use in toxicity testing in 1995 (Erickson and McDonald) • Recently proposed for CA’s WET Policy

Limitations of the TST • It is still a statistical hypothesis test • Not very useful for comparing results spatially or temporally • Pass/Fail test, provides no information on magnitude • Requires knowledge/use of a “threshold” response – “b” or bioequivalence factor • Probably (and debatably) best suited for regulatory purposes

Statistical Hypothesis Testing 101 • Statistically speaking: • You can’t “prove” anything with a hypothesis test – we can only “disprove” • The “White Swan” Parable:

Statistical Hypothesis Testing 101 • You can’t prove that “all swans are white” • If we see 10,000 white swans and no non-white swans, we fail to reject our hypothesis • In the absence of evidence to the contrary, we then assume the hypothesis is true

“Proving” with Statistics • However, after observing just one non-white swan, we can then confidently reject or disprove our hypothesis that all swans are white

Statistical Hypothesis Testing - Background • Null or “Initial” Hypothesis (Ho) • Mean(sample) ≥ Mean(control) • Conduct statistical analyses to try to reject this hypothesis • If unable to reject, we assume the null or “Initial” hypothesis is correct • Type I and Type II error

Type I and Type II Errors • Type I Error • Probability of rejecting the null or “Initial” hypothesis when it is “true” • Controlled directly by setting alpha (α) • Type II Error • Probability of accepting the null or “Initial” hypothesis when it is “false” (β) • Related to “power” (power = 1 – β) • Controlled indirectly

Standard Hypothesis Testing (NOEC) • With the NOEC: • The initial hypothesis is mean(sample) ≥ mean(control) In other words, the sample is non-toxic! • If we don’t/can’t “prove” this to be incorrect statistically, we assume it is true • Type I error = Identifying a non-toxic sample as toxic

TST Hypothesis • With the TST: • The hypothesis is mean(effluent) ≤ 0.75 * mean(control) In other words, the sample is toxic! • If we don’t/can’t “prove” this to be incorrect statistically, we assume it is true – we assume the sample is toxic • Type I error = Identifying a toxic sample as non-toxic
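Written side by side, the two approaches differ only in which statement is assumed until disproved. The alternative hypotheses are not stated on the slides; they are the complements implied by each null, and the NOEC null is shown in its usual one-sided form. The symbol b is the bioequivalence factor (0.75 in the chronic example).

```latex
\begin{aligned}
\text{NOEC:}\quad & H_0:\ \mu_{\text{sample}} \ge \mu_{\text{control}}
  &\quad H_a:\ \mu_{\text{sample}} < \mu_{\text{control}} \\
\text{TST:}\quad & H_0:\ \mu_{\text{effluent}} \le b\,\mu_{\text{control}}
  &\quad H_a:\ \mu_{\text{effluent}} > b\,\mu_{\text{control}}
\end{aligned}
```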

Bioequivalence Factor (b) • In the EPA Guidance • Set as an unacceptable or “toxic” threshold • For Chronic: • b = 0.75 = 25% Effect • For Acute: • b = 0.80 = 20% Effect

Regulatory Management Decisions (RMDs) • Setting the Type I Error Rate – alpha (α) • How frequently will you reject the Ho when it is true? • EPA desires that no more than 25% of the tests with a 25% effect or more are identified as “non-toxic” • Alpha (α) is then set at 0.05 to 0.25, depending on the test

Test/Species-Specific Alpha

Why the Different Alphas? • EPA’s Second Regulatory Management Decision • No more than 5% of tests with effects less than 10% should be identified as toxic • Type II Error Rate – not really a “false positive” • Alpha adjusted down until no more than 5% of tests with effects less than 10% were identified as “toxic” • Monte Carlo simulations
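The two rates targeted by the RMDs can be estimated by simulation. The sketch below is illustrative only and is not EPA's procedure: it assumes normally distributed responses, 4 replicates, a 15% CV, and a single fixed critical value (0.906, the standard one-tailed table value for alpha = 0.20 at 6 degrees of freedom) instead of a Welch-adjusted value for each test. The function names and defaults are hypothetical.

```python
import math
import random
from statistics import mean, variance


def tst_t(control, effluent, b=0.75):
    """TST form of Welch's t: effluent mean vs. b times the control mean."""
    term_t = variance(effluent) / len(effluent)
    term_c = b ** 2 * variance(control) / len(control)
    return (mean(effluent) - b * mean(control)) / math.sqrt(term_t + term_c)


def nontoxic_rate(effect, t_critical, n=4, control_mean=100.0, cv=0.15,
                  b=0.75, trials=2000, seed=1):
    """Fraction of simulated tests declared non-toxic at a given true effect."""
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    effluent_mean = control_mean * (1.0 - effect)
    passes = 0
    for _ in range(trials):
        control = [rng.gauss(control_mean, cv * control_mean) for _ in range(n)]
        effluent = [rng.gauss(effluent_mean, cv * effluent_mean) for _ in range(n)]
        if tst_t(control, effluent, b) > t_critical:
            passes += 1
    return passes / trials


T_CRITICAL = 0.906  # assumption: fixed table value, alpha = 0.20, df = 6

if __name__ == "__main__":
    for effect in (0.0, 0.10, 0.25):
        rate = nontoxic_rate(effect, T_CRITICAL)
        print(f"true effect {effect:.0%}: declared non-toxic in {rate:.1%} of tests")
```

The first RMD corresponds to the non-toxic rate at a 25% effect; the second corresponds to one minus the non-toxic rate at a 10% effect.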

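TST Equation (Welch’s t-test) • $t = \dfrac{\bar{Y}_t - b\,\bar{Y}_c}{\sqrt{S_t^2/n_t + b^2 S_c^2/n_c}}$ (mean $\bar{Y}$, variance $S^2$, and replicates $n$ for the effluent, t, and the control, c) • t (calculated) < t (table/critical) = toxic • t (calculated) > t (table/critical) = non-toxic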
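A minimal sketch of the calculation, assuming the TST statistic is Welch's t with the control mean scaled by b, as in the EPA guidance. The function names and replicate values are hypothetical, and the critical value must still be looked up in a t table for the chosen alpha and the returned degrees of freedom.

```python
import math
from statistics import mean, variance


def tst_welch_t(control, effluent, b=0.75):
    """Return (t, df) for the TST form of Welch's t-test."""
    n_c, n_t = len(control), len(effluent)
    # Variance contributions of the effluent mean and of b times the control mean
    term_t = variance(effluent) / n_t
    term_c = b ** 2 * variance(control) / n_c
    t = (mean(effluent) - b * mean(control)) / math.sqrt(term_t + term_c)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (term_t + term_c) ** 2 / (
        term_t ** 2 / (n_t - 1) + term_c ** 2 / (n_c - 1)
    )
    return t, df


def tst_decision(t, t_critical):
    """Apply the slide's rule: t above the critical value means non-toxic."""
    return "non-toxic" if t > t_critical else "toxic"


# Hypothetical replicate data, for illustration only
control = [20, 22, 24, 26]
effluent = [19, 21, 23, 25]
t, df = tst_welch_t(control, effluent, b=0.75)
print(f"t = {t:.3f}, df = {df:.2f}")
```

Both groups need at least two replicates and some variability; otherwise the variance terms are undefined or zero.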

Factors That Impact Ability to Statistically Reject the Hypothesis • Magnitude of Effect • Number of Replicates • Within Test Variability

TST Equation (Welch’s t-test) • $t = \dfrac{\bar{Y}_t - b\,\bar{Y}_c}{\sqrt{S_t^2/n_t + b^2 S_c^2/n_c}}$ • Any test with an observed effect of 25% or more (100% of such tests) will be identified as “toxic”, because the numerator of t is then zero or negative • The greater the within-test variability, the less likely it is that a sample will be identified as statistically different (non-toxic). • The more replication, the more likely it is that a sample will be identified as statistically different (non-toxic).

Effect of Variability: Standard t-test

Example: TST
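The three factors can be seen directly by computing t from summary statistics. This is a sketch with hypothetical numbers (control mean 100, equal standard deviations and replicate counts in both groups); the function name is an assumption.

```python
import math


def tst_t_from_summary(mean_c, mean_t, sd_c, sd_t, n_c, n_t, b=0.75):
    """TST t statistic computed from means, standard deviations, and replicates."""
    se = math.sqrt(sd_t ** 2 / n_t + b ** 2 * sd_c ** 2 / n_c)
    return (mean_t - b * mean_c) / se


# 10% observed effect, SD of 10, 4 replicates per group
baseline = tst_t_from_summary(100, 90, 10, 10, 4, 4)

# Doubling the within-test variability halves t (harder to show non-toxic)
noisier = tst_t_from_summary(100, 90, 20, 20, 4, 4)

# More replicates raise t (easier to show non-toxic)
more_reps = tst_t_from_summary(100, 90, 10, 10, 10, 10)

# An observed 25% effect gives t = 0, which cannot exceed a positive critical t
at_threshold = tst_t_from_summary(100, 75, 10, 10, 4, 4)

print(baseline, noisier, more_reps, at_threshold)
```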

Controllable Factors That Impact Ability to Statistically Reject the Hypothesis • Variability • The greater the within test variability, the harder or less likely it will be to identify a sample as being statistically different. • For the “regular” hypothesis test • Less frequent identification of “toxicity” • For the TST • Less frequent identification of “no toxicity” • Replication

Procedures That May Reduce Variability • Maximize Mean Response • CV = S.D. / Mean (From EPA Test of Significant Toxicity (TST) Document, EPA 833-R-10-003)
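Because CV is the standard deviation divided by the mean, raising the mean response lowers the CV even when the spread stays the same. A small sketch with hypothetical replicate values:

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean."""
    return stdev(values) / mean(values)


low_response = [15, 18, 21]    # hypothetical replicates, mean 18, SD 3
high_response = [27, 30, 33]   # same spread, mean 30, SD 3

print(f"CV at mean 18: {coefficient_of_variation(low_response):.3f}")
print(f"CV at mean 30: {coefficient_of_variation(high_response):.3f}")
```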

Impact of Control Mean • At the 10th Percentile (17.7) - a 25% effect is a reduction of 4.4 neonates • At the 50th Percentile (25.5) - a 25% effect is a reduction of 6.4 neonates • At the 95th Percentile (35.6) - a 25% effect is a reduction of 8.9 neonates
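The neonate reductions follow from multiplying each control mean by the effect size, 1 - b. The sketch below reproduces that arithmetic using the control means from the slide; the function name is hypothetical.

```python
def reduction_at_threshold(control_mean, b=0.75):
    """Absolute reduction in response that corresponds to the effect 1 - b."""
    return (1 - b) * control_mean


# Control means (neonates) at the percentiles quoted on the slide
control_means = {"10th": 17.7, "50th": 25.5, "95th": 35.6}

for percentile, control_mean in control_means.items():
    reduction = reduction_at_threshold(control_mean)
    print(f"{percentile} percentile ({control_mean}): {reduction:.1f} neonates")
```

A higher control mean therefore leaves more absolute room between the control response and the 25% threshold.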

Procedures That May Increase Mean Response • Dilution Water Selection • Match sample condition as much as possible • Food Supplements, Combinations • Specifically allowed (13.6.16.9.2) • Feeding Rates • Twice or three times per day • Amount of food

Fathead Minnow Feeding Rate Example

Impact of Growth on CV

Procedures That May Decrease Variability • Set Internal Control CV Criteria

Procedures That May Decrease Variability • Set Internal Control Mean Criteria

Statistical and Non-statistical Error • False Determinations of Toxicity

Dose Response Evaluation • Eliminating multiple concentrations may limit ability to evaluate spurious results.

Conclusions • Same limitations as any hypothesis test • Implications associated with variability and “power” shifted • Not a magical “black box” • You need to be aware of the impact variability, QA/QC, and test design may have • May be useful for regulation • NPDES Permits • Possible use for remediation goals?

Questions? Contact info: pmarkle@lacsd.org
