930 likes | 941 Views
This presentation by Dr. William J. Price covers the use of applied dose-response models in weed science, including statistical programs, estimation procedures, response distributions, alternative models, and parameterization. The presentation also explores treatment structure and comparison, random effect mixed models, and Bayesian methods.
E N D
Applied Dose-Response Models in Weed Science Presented by William J. Price, Ph.D. Statistical Programs College of Agricultural and Life Sciences University of Idaho
Acknowledgments • Research partially funded by USDA-ARS Hatch Project • IDA01412,Idaho Agricultural Experiment Station. • Collaborators: • Bahman Shafii, Ph. D., Director, Statistical Programs, • University of Idaho. • Steven Seefeldt, Ph. D., USDA -ARS, • University of Alaska Fairbanks.
Outline • General Considerations • Available Software Estimation: • SAS Procedures • PROBIT, NLIN, and NLMIXED • Response Distributions • Normal, Poisson, Binomial • Alternative Models • Logistic vs Exponential • Extensions to Models and Techniques • Parameterization • Treatment Structure and Comparison • Random Effect Mixed Models • Bayesian Methods • MCMC
Outline Calibration: • Parametric • Non-Parametric Conclusions References
Statistical Estimation Software • SAS • S+ • R • Statistica, etc. • Sigma Plot, AXUM, etc.
Probit Analysis • Data description • Vernalization study. • Fixed number of wheat plants • 6 to 10 wheat plants per replication and temperature. • (SAS: plants). • Five temperatures (doses): • 0, -10, -12, -14, and -16 degrees celcius • (SAS: temp = temperature + 17). • Number of wheat plants alive after 2 weeks recorded • (SAS: alive2wk).
1.00 .75 .50 .25 -16 0 0 Vernalization Data Proportion Alive Temperature
Code: proc probit data=freeze log optc lackfit inversecl; model alive2wk/plants = temp/distribution=logistic ; predpplot var=temp; Probit Analysis • SAS Procedure: PROC PROBIT. proc probit data=freeze log optc lackfit inversecl; model alive2wk/plants = temp/distribution=logistic ; predpplot var=temp; proc probit data=freeze log optc lackfit inversecl; model alive2wk/plants = temp/distribution=logistic; predpplot var=temp; proc probit data=freeze log optc lackfit inversecl; model alive2wk/plants = temp/distribution=logistic ; predpplot var=temp;
PROC PROBIT Output Probit Procedure Model Information Data Set WORK.FREEZE Events Variable alive2wk Trials Variable plants Number of Observations 20 Number of Events 122 Number of Trials 195 Name of Distribution Logistic Log Likelihood -83.4877251 Number of Observations Read 20 Number of Observations Used 20 Number of Events 122 Number of Trials 195 Missing Values 0 Algorithm converged. Goodness-of-Fit Tests Statistic Value DF Pr > ChiSq Pearson Chi-Square 18.7054 17 0.3457 L.R. Chi-Square 22.6138 17 0.1623 Response-Covariate Profile Response Levels 2 Number of Covariate Values 20 Type III Analysis of Effects Wald Effect DF Chi-Square Pr > ChiSq Ln(temp) 1 15.2620 <.0001
PROC PROBIT Output (cont) Analysis of Parameter Estimates Standard 95% Confidence Chi- Parameter DF Estimate Error Limits Square Pr > ChiSq Intercept 1 -6.9144 2.0126 -10.8590 -2.9699 11.80 0.0006 Ln(temp) 1 4.5094 1.1543 2.2470 6.7717 15.26 <.0001 _C_ 1 0.2258 0.0623 0.1037 0.3480 Probability temp 95% Fiducial Limits 0.01 1.67252 0.48881 2.54641 0.02 1.95482 0.66670 2.83256 . . . . . . . . . . . . 0.40 4.23519 3.00063 4.96946 0.45 4.43197 3.25697 5.16920 0.50 4.63365 3.52170 5.38443 0.55 4.84451 3.79739 5.62420 0.60 5.06959 4.08622 5.90119 . . . . . . . . . . . . 0.99 12.83734 9.46539 30.51920
1.00 .75 .50 .25 0 PROC PROBIT Output (cont) Proportion Alive OPTC = 0.23 -16 0 Temperature (C)
PROC PROBIT (cont) • Probit Advantages • Automatic Goodness of Fit test. • Easily computed percentiles. • Ability to do treatment comparisons. • Graphic output. • Probit Limitations • Proportional Data. • Maximum set to 1.0. • Limited number of response models.
Code: proc nlin data=freeze ; parms I = 8 B = -4.5 C = .12; bounds B<0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model per2 = mu; output out=pred p=pred; Nonlinear Least Squares • SAS Procedure: PROC NLIN proc nlin data=freeze ; parms I = 8 B = -4.5 C = .12; bounds B<0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model per2 = mu; output out=pred p=pred; proc nlin data=freeze ; parms I = 8 B = -4.5 C = .12; bounds B<0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model per2 = mu; output out=pred p=pred;
PROC NLIN Output The NLIN Procedure Dependent Variable per2 Method: Gauss-Newton Iterative Phase Sum of Iter I B C Squares 0 4.6000 -4.6000 0.2200 0.3767 1 4.6147 -4.6464 0.2268 0.3765 2 4.6119 -4.6366 0.2264 0.3765 3 4.6125 -4.6388 0.2265 0.3765 4 4.6123 -4.6383 0.2265 0.3765 5 4.6124 -4.6384 0.2265 0.3765 NOTE: Convergence criterion met. Estimation Summary Method Gauss-Newton Iterations 5 R 3.628E-6 PPC(B) 4.559E-6 RPC(B) 0.000021 Object 2.25E-10 Objective 0.37647 Observations Read 20 Observations Used 20 Observations Missing 0
PROC NLIN Output (cont) Sum of Mean Approx Source DF Squares Square F Value Pr > F Model 3 9.6353 3.2118 145.03 <.0001 Error 17 0.3765 0.0221 Uncorrected Total 20 10.0117 Approx Parameter Estimate Std Error Approximate 95% Confidence Limits I 4.6124 0.4511 3.6607 5.5640 B -4.6384 1.7176 -8.2622 -1.0147 C 0.2265 0.0723 0.0739 0.3791 Approximate Correlation Matrix I B C I 1.0000000 -0.4991207 0.6394756 B -0.4991207 1.0000000 -0.5126022 C 0.6394756 -0.5126022 1.0000000
1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 PROC NLIN Output (cont) Proportion Alive 0 1 2 3 Ln (temp)
PROC NLIN (cont) • NLIN Advantages • Not restricted to proportional data. • Maximum may be any value. • Response models not limited. • NLIN Limitations • Assumes normally distributed response. • Approximate tests. • Treatment comparisons not automatic.
Code: proc nlmixed data=freeze corr; parms I = 4.5 B = -4.8 C = .228; bounds B<0, I>0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model alive2wk ~ binomial(plants, mu); predict plants*mu out=pred1; predict mu out=pred2; Maximum Likelihood • SAS Procedure: PROC NLMIXED proc nlmixed data=freeze corr; parms I = 4.5 B = -4.8 C = .228; bounds B<0, I>0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model alive2wk ~ binomial(plants, mu); predict plants*mu out=pred1; predict mu out=pred2; proc nlmixed data=freeze corr; parms I = 4.5 B = -4.8 C = .228; bounds B<0, I>0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model alive2wk ~ binomial(plants, mu); predict plants*mu out=pred1; predict mu out=pred2; proc nlmixed data=freeze corr; parms I = 4.5 B = -4.8 C = .228; bounds B<0, I>0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model alive2wk ~ binomial(plants, mu); predict plants*mu out=pred1; predict mu out=pred2; proc nlmixed data=freeze corr; parms I = 4.5 B = -4.8 C = .228; bounds B<0, I>0; mu = C + (1-C)/(1 + exp(B*(ltemp-log(I)))); model alive2wk ~ binomial(plants, mu); predict plants*mu out=pred1; predict mu out=pred2;
PROC NLMIXED Output Dependent Variable alive2wk Distribution for Dependent Variable Binomial Optimization Technique Dual Quasi-Newton Dimensions Observations Used 20 Observations Not Used 0 Total Observations 20 Parameters 3 Iteration History Iter Calls NegLogLike Diff MaxGrad Slope 1 4 28.3612297 0.034097 1.623059 -27.1026 2 5 28.2115599 0.14967 0.572223 -3.97671 3 10 28.1847135 1.433E-8 0.000016 -2.67E-8 NOTE: GCONV convergence criterion satisfied. Fit Statistics -2 Log Likelihood 56.4 AIC (smaller is better) 62.4 AICC (smaller is better) 63.9 BIC (smaller is better) 65.4
PROC NLMIXED Output (cont) Parameter Estimates Standard Parm Estimate Error DF t Value Pr > |t| Alpha Lower Upper Gradient I 4.6337 0.4151 20 11.16 <.0001 0.05 3.7677 5.4996 4.629E-6 B -4.5094 1.1543 20 -3.91 0.0009 0.05 -6.9172 -2.1016 -2.62E-6 C 0.2258 0.06231 20 3.62 0.0017 0.05 0.09585 0.3558 0.000016 Correlation Matrix of Parameter Estimates Row Parameter I B C 1 I 1.0000 -0.5276 0.5848 2 B -0.5276 1.0000 -0.3896 3 C 0.5848 -0.3896 1.0000
10 9 8 7 6 5 4 3 2 1 0 1 2 3 Ln(temp) PROC NLMIXED Output (cont) Number of surviving plants
1.0 0.9 0.8 0.7 0.6 Probit 0.5 NLin NLMixed 0.4 0.3 0.2 0.1 Procedure Comparisons Proportion Alive 0 1 2 3 Ln(temp)
Procedure Comparisons • All three procedures can produce similar results. • Binomial or proportional data. • Maximum response of 1.0. • PROBIT limited in models and response types. • NLIN and NLMIXED provide nonlinear solutions. • NLMIXED most flexible for responses and models.
NLMIXED : Normal Data • NLMIXED probability distributions: • Binomial - yes/no data. • Normal - continuous data. • Poisson - discrete count data. • User defined - any data. • Example: Seefeldt, et al. 1995 • Wild oat resistance • Treated with fenoxaprop/2,4-D/MCPA(SAS: dose). • Dry weights at 2 weeks (SAS: adj_wt).
0.28 0.26 0.24 0.22 0.20 0.18 0.16 0.14 0.12 0.10 0.08 0.06 0.04 0.02 0.010 0.100 1.000 10.000 0.00 NLMIXED : Normal Data Biotype C Dry Weight (g) Dose (kg ai/ha)
NLMIXED : Normal Data • Assume dry weight to be normally distributed with • mean mu and variance sig2. • Must model sig explicitly. proc nlmixed data=seefeldt; parms C=.04 D=.2 B=3 I=.1 sig=.021; bounds C>0, D>0, B>0, sig>0; if dose = 0 then mu = D; else mu = C + (D-C)/(1 + exp(B*(ldose-log(I)))); model adj_wt ~ normal(mu, sig**2); predict mu out=fitted;
NLMIXED : Normal Data Iteration History Iter Calls NegLogLike Diff MaxGrad Slope 1 14 -63.769085 17.90341 925.7938 -4745664 2 19 -90.116746 26.34766 3490.083 -1912.68 3 66 -122.58572 4.511E-8 0.040557 -1E-7 NOTE: GCONV convergence criterion satisfied. Fit Statistics -2 Log Likelihood -245.2 AIC (smaller is better) -235.2 AICC (smaller is better) -233.9 BIC (smaller is better) -225.1 Parameter Estimates Standard Parm Estimate Error DF t Value Pr > |t| Lower Upper C 0.04823 0.01893 51 2.55 0.0137 0.01030 0.08616 D 0.1836 0.00794 51 23.12 <.0001 0.1677 0.1996 B 1.3283 0.4510 51 2.95 0.0047 0.4246 2.2321 I 1.1669 0.3539 51 3.30 0.0017 0.4576 1.8762 sig 0.02937 0.00280 51 10.49 <.0001 0.02376 0.03498
0.28 0.26 0.24 0.22 0.20 0.18 0.16 0.14 0.12 0.10 0.08 0.06 0.04 0.02 0.010 0.100 1.000 10.000 0.00 Dose (kg ai/ha) NLMIXED : Normal Data Biotype C Dry Weight (g)
NLMIXED : Poisson data • Data description • Simulated injury study. • Harmony sprayed on pea plants. • measured the number of branches/plant. • (SAS: branches). • Ten doses: • 0 to 0.125 lbs ai/A. • (SAS: trt).
100 90 80 70 60 Number of Branches 50 40 30 20 10 0 1.0000 0.0001 0.0010 0.0100 0.1000 NLMIXED : Poisson data Variety C Harmony Dose (lb ai/A)
NLMIXED : Poisson data • Assume the number of branches to be distributed as • a Poisson variable. • In the Poisson distribution, mean = variance = mu. proc nlmixed data=pea; parms D=10 C=70 B=.8254 I=.01; bounds D>0, B>0; if trt = 0 then mu = D; else mu = C + (D-C)/(1 + exp(B*(ltrt-log(I)))); model branches ~ poisson(mu); predict mu out=pred;
NLMIXED : Poisson data Parameter Estimates Standard Parm Estimate Error DF t Value Pr > |t| Alpha Lower Upper C 11.6834 1.0136 38 11.53 <.0001 0.05 9.6315 13.7352 I 0.01075 0.0013 38 8.41 <.0001 0.05 0.0081 0.0133 B 1.9860 0.3246 38 6.12 <.0001 0.05 1.3289 2.6432 D 69.1586 3.3894 38 20.40 <.0001 0.05 62.2972 76.0201
100 90 80 70 60 50 40 30 20 10 1.0000 0.0001 0.0010 0.0100 0.1000 0 NLMIXED : Poisson data Variety C Number of Branches Harmony Dose (lb ai/A)
Common Dose-response Models • Normal:yij = (1/2) exp((x-)2/2 • Logistic:yij = 1 / (1 + exp( -b1( dosei - b0 )) • Modified Logistic:yij = C + (D-C) / (1 + exp( -B(dosei - I))) • (e.g. Seefeldt et al. 1995) • Logistic:yij = 1 / (1 + exp( -b1( dosei - b0 )) • Modified Logistic:yij = C + (D-C) / (1 + exp( -B(dosei - I))) • (e.g. Seefeldt et al. 1995) • Gompertz: yij = b0 (1 - exp(exp(-b1(dose)))) • yij = b0 exp(-b1(dose)) • Exponential: • yij = b0 [1 - exp(-b1(dose))]
Log-logistic Model yij = C + (D - C) / (1 + exp(B(dosei - I))) • Exponential Model yij = (a-c) exp(-bdose) + c NLMIXED: Alternative Models Example:
Exponential Model for Pea Biomass • A linear pattern of data on a log scale. • Implies an exponential model, e.g. Biomass = (a-c) exp(-bdose) + c where a is an intercept term, c is a lower limit and b is a rate parameter. • The 50th percentile for this model is given by: • I50 = ln(((a/2) - c)/(a - c))/(-b)
NLMIXED: Alternative Models • Example: Pea Data • Fit log-logistic model to biomass measurements. proc nlmixed data=pea corr maxiter=2000; parms D=.5966 I=0.01 B=.51 C=.04 sig=.09; bounds D>0, B>0; if trt = 0 then mu = D; else mu = C + (D-C)/(1 + exp(B*(ltrt-log(I)))); model bio ~ normal(mu, sig**2);
Log-logistic Model for Pea Biomass Parameter Estimates Standard Parm Estimate Error DF t Value Pr > |t| Alpha Lower Upper D 1.2751 4.3499 38 0.29 0.7710 0.05 -7.5308 10.0810 I 0.02792 0.4771 38 0.06 0.9536 0.05 -0.9379 0.9937 B 0.1261 0.6405 38 0.20 0.8449 0.05 -1.1704 1.4227 C -0.7546 5.9311 38 -0.13 0.8994 0.05 -12.7614 11.2522 sig 0.09240 0.01060 38 8.72 <.0001 0.05 0.07094 0.1139 Correlation Matrix of Parameter Estimates Row Parameter D I B C sig 1 D 1.0000 0.6108 -0.9903 -0.9568 -0.00074 2 I 0.6108 1.0000 -0.7133 -0.8146 -0.00306 3 B -0.9903 -0.7133 1.0000 0.9873 0.001188 4 C -0.9568 -0.8146 0.9873 1.0000 0.001668 5 sig -0.00074 -0.00306 0.001188 0.001668 1.0000
0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 Log-logistic Model for Pea Biomass Biomass (g/plant) 0.0001 0.0010 0.0100 0.1000 Harmony Dose (lb ai/A)
Log-logistic Model yij = C + (D - C) / (1 + exp(B(dosei - I))) • Exponential Model yij = (a-c) exp(-bdose) + c NLMIXED: Alternative Models Example:
Exponential Model for Pea Biomass proc nlmixed data=pea corr; parms a=.5217 b=106.5 c = .2026 sig=.09; mu =(a-c)*exp(-b*trt) + c; model bio ~ normal(mu, sig**2); predict mu out=pred; estimate ’I50' log(((a/2)-c)/(a-c))/(-b);
Exponential Model for Pea Biomass Parameter Estimates Standard Parm Estimate Error DF t Value Pr > |t| Alpha Lower Upper a 0.5260 0.03375 38 15.59 <.0001 0.05 0.4577 0.5943 b 106.50 51.3184 38 2.08 0.0448 0.05 2.6113 210.39 c 0.2026 0.03622 38 5.59 <.0001 0.05 0.1293 0.2760 sig 0.09861 0.01131 38 8.72 <.0001 0.05 0.07571 0.1215 Correlation Matrix of Parameter Estimates Row Parameter a b c sig 1 a 1.0000 0.6473 0.2864 0.000060 2 b 0.6473 1.0000 0.6696 0.000091 3 c 0.2864 0.6696 1.0000 0.000062 4 sig 0.000060 0.000091 0.000062 1.0000 Additional Estimates Standard Label Estimate Error DF t Value Pr > |t| Alpha Lower Upper I50 0.01576 0.006829 38 2.31 0.0265 0.05 0.001937 0.02958
0.8 0.7 0.6 0.5 Biomass (g/plant) 0.4 0.3 0.2 0.1 0.0001 0.0010 0.0100 0.1000 Exponential Model for Pea Biomass Harmony Dose (lb ai/A)
Estimation: Extensions Reparameterization
Estimation: Reparameterization • The log-logistic model can be generalized to • estimate any percentile as (Schabenberger, 1999): yij = C + k(D - C) / (k + exp(B( dosei – I(1-Q)))) where I(1-Q) is the dose required to reach the Qth percentile, and k is given by : k = Q/(1 - Q)
Q = 0.9 k = Q/(1 - Q) = 0.9/(1.0 - 0.9) = 9.0 Estimation: Reparameterization • Example: • Seefeldt data, biotype C. • Estimate the 90th percentile, e.g. I10