This research is funded by U.S. EPA – Science To Achieve Results (STAR) Program Cooperative Agreement # CR-829095. State-Space Models for Within-Stream Network Dependence. William Coar, Department of Statistics, Colorado State University. Joint work with F. Jay Breidt.
Disclaimer • The work reported here was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA. The views expressed here are solely those of the presenter and STARMAP, the Program (s)he represents. EPA does not endorse any products or commercial services mentioned in this presentation.
Outline • Introduction to the problem • Evolution of state-space models • Likelihood • Missing data • Kalman recursions • EM algorithm • Simulation example • Future work
Consider a simple stream network • Two upstream reaches merge together to create downstream reaches. • Suggests a natural dependency on upstream reaches. • Autocorrelation can arise from water flowing from reach to reach. • Logical ordering in space. [Figure: stream network with reaches Y1, Y2, Y3, Y4 at the top, Y5 and Y6 below them, and Y7 furthest downstream]
The Beginnings • Express a measurement on a reach in terms of its upstream contributors. [equation and its accompanying condition not recovered]
The Beginnings • This is also the modified Cholesky decomposition of Σ⁻¹. • For any Y ~ (µ, Σ), there exists a unit lower triangular matrix T with corresponding diagonal matrix D such that T(Y − µ) = Z, where Z ~ (0, D). • Simplifying T can allow for dependencies similar to autoregressive structures in time series. • i.e., a measurement depends only on its two immediate upstream neighbors. • [equation not recovered] in the simple example. • Suggestive of a more general state-space model. [Figure: the seven-reach network Y1 to Y7]
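The modified Cholesky factors mentioned above can be computed numerically. The following is a minimal sketch, assuming NumPy; the function name `modified_cholesky` and the AR(1)-style covariance used to exercise it are illustrative choices, not part of the presentation. It returns a unit lower triangular T and the diagonal of D with T Σ Tᵀ = D, which is equivalent to Σ⁻¹ = Tᵀ D⁻¹ T.

```python
import numpy as np

def modified_cholesky(sigma):
    """Return (T, d): T unit lower triangular, d the diagonal of D,
    such that T @ sigma @ T.T equals diag(d)."""
    sigma = np.asarray(sigma, dtype=float)
    C = np.linalg.cholesky(sigma)      # sigma = C C', C lower triangular
    c = np.diag(C)
    L = C / c                          # divide column j by c[j]: unit diagonal
    T = np.linalg.solve(L, np.eye(len(c)))  # T = L^{-1}, still unit lower triangular
    return T, c ** 2                   # sigma = L diag(c^2) L'

# Illustrative covariance with AR(1)-type correlation (an assumption for the demo).
rho = 0.6
idx = np.arange(4)
sigma = rho ** np.abs(idx[:, None] - idx[None, :])
T, d = modified_cholesky(sigma)
print(np.round(T, 3))
print(np.round(d, 3))
```

Row t of T holds the negated coefficients from regressing the t-th measurement on the earlier ones, and d[t] is the corresponding conditional variance. Forcing most entries of T to zero is what yields the autoregressive-style dependence on immediate upstream neighbors.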
State-Space Model • Define a state-space representation by [equations not recovered] with {W(t)} ~ N(0, {R(t)}), {V(t)} ~ N(0, {Q(t)}), and V(s) uncorrelated with W(t) for all s and t. • Further assume that W(t) and V(t) are uncorrelated with all X(s1), where s1 is any first order reach. [Figure: reach t with its two immediate upstream reaches u1 and u2]
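As an illustration of the noise structure described above, here is a minimal simulation sketch. It assumes one scalar form consistent with those noise terms, namely X(t) = f1·X(u1) + f2·X(u2) + V(t) and Y(t) = X(t) + W(t), with constant variances q and r. That form, the seven-reach topology in `UPSTREAM`, the prior variance `p0`, and all names are assumptions made for the example, not the presenter's specification.

```python
import numpy as np

# Hypothetical topology: reach -> (u1, u2), its two immediate upstream reaches.
UPSTREAM = {5: (1, 2), 6: (3, 4), 7: (5, 6)}
FIRST_ORDER = [1, 2, 3, 4]

def simulate(f1, f2, q, r, p0, rng):
    """Simulate states x and observations y, moving downstream."""
    x, y = {}, {}
    for s in FIRST_ORDER:
        # First order reaches: independent N(0, p0) states (an assumption).
        x[s] = rng.normal(0.0, np.sqrt(p0))
    for t in sorted(UPSTREAM):  # labels increase downstream in this toy network
        u1, u2 = UPSTREAM[t]
        # State equation: weighted upstream states plus state noise V(t) ~ N(0, q).
        x[t] = f1 * x[u1] + f2 * x[u2] + rng.normal(0.0, np.sqrt(q))
    for t in x:
        # Observation equation: state plus measurement noise W(t) ~ N(0, r).
        y[t] = x[t] + rng.normal(0.0, np.sqrt(r))
    return x, y

x, y = simulate(f1=0.5, f2=0.4, q=0.1, r=0.05, p0=1.0,
                rng=np.random.default_rng(0))
```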
Downstream Filter • Best mean square predictors under Normality are [equations not recovered] • Predict X(t) given upstream information. • Update with observed information from Y(t). [Figure: reach t with its two immediate upstream reaches u1 and u2]
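The predict and update steps can be sketched for a single confluence. This is a minimal scalar sketch, assuming the state equation X(t) = f1·X(u1) + f2·X(u2) + V(t), the observation equation Y(t) = X(t) + W(t), and uncorrelated filtering errors on the two upstream branches. These assumptions and the name `downstream_step` are illustrative, not taken from the slides.

```python
def downstream_step(x1, p1, x2, p2, y, f1, f2, q, r):
    """One predict/update step at reach t.

    x1, p1 and x2, p2: filtered means and variances at upstream reaches u1, u2.
    y: observation at reach t, or None if missing.
    Returns (filtered mean, filtered variance, innovation, innovation variance).
    """
    # Predict X(t) from upstream information.
    x_pred = f1 * x1 + f2 * x2
    p_pred = f1 ** 2 * p1 + f2 ** 2 * p2 + q  # assumes uncorrelated branch errors
    s = p_pred + r                             # variance of the innovation
    if y is None:
        return x_pred, p_pred, None, s         # nothing to update with
    # Update with the observed Y(t).
    innovation = y - x_pred
    gain = p_pred / s
    return x_pred + gain * innovation, (1.0 - gain) * p_pred, innovation, s

x_filt, p_filt, innov, s = downstream_step(
    x1=1.0, p1=0.5, x2=2.0, p2=0.5, y=2.0, f1=0.5, f2=0.5, q=0.25, r=0.5)
```

Applying this step reach by reach in flow order produces the innovations and their variances for the whole network.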
Likelihood • Use the innovations and variances from the downstream filter. • In the case where data are available for every reach in the network, the likelihood is easily expressed in terms of these innovations [equation not recovered], where n is the total number of reaches in the stream network.
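For reference, the standard Gaussian log-likelihood written in terms of innovations can be sketched as follows. This assumes scalar observations, one innovation per reach, and independent Gaussian innovations. The function name is hypothetical, and the formula is the usual prediction error decomposition rather than a transcription of the slide.

```python
import math

def innovations_loglik(innovations, variances):
    """Gaussian log-likelihood from innovations e_t with variances s_t:
    -0.5 * [ n*log(2*pi) + sum(log s_t) + sum(e_t^2 / s_t) ]."""
    n = len(innovations)
    log_det = sum(math.log(s) for s in variances)
    quad = sum(e * e / s for e, s in zip(innovations, variances))
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)

# Illustrative values only.
ll = innovations_loglik([1.0, -2.0], [1.0, 4.0])
```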
EM Algorithm • The likelihood for missing data can be difficult to express. • E-Step • Predict, update, smooth based on current estimates of model parameters. • Form an approximation to the likelihood by filling in the missing values with smoothed estimates. • M-Step • Maximize the approximation to the likelihood in order to obtain new parameter estimates for the next iteration. • Iterate until revised parameter estimates stabilize. • Since the log-likelihood cannot decrease from one iteration to the next, estimates should converge to the MLE (or at least to a local maximum of the likelihood).
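The E-step and M-step alternation can be illustrated on a deliberately simple toy problem that is not the stream network model: estimating the mean and variance of i.i.d. normal data with some values missing. All names here are hypothetical. The sketch shows that the E-step fills in expected sufficient statistics, including the second moment, rather than only plugging in a point estimate.

```python
def em_normal(y, n_iter=200, tol=1e-12):
    """EM for the mean and variance of i.i.d. normal data; None marks missing."""
    obs = [v for v in y if v is not None]
    n, m = len(y), len(y) - len(obs)
    mu, var = 0.0, 1.0  # arbitrary starting values
    for _ in range(n_iter):
        # E-step: each missing value contributes E[y] = mu and E[y^2] = mu^2 + var.
        s1 = sum(obs) + m * mu
        s2 = sum(v * v for v in obs) + m * (mu * mu + var)
        # M-step: maximize the expected complete-data log-likelihood.
        new_mu = s1 / n
        new_var = s2 / n - new_mu ** 2
        converged = abs(new_mu - mu) + abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            break
    return mu, var

mu_hat, var_hat = em_normal([1.0, 3.0, None, 5.0, None, 7.0])
```

In this toy case the iterations settle at the mean and (divide-by-count) variance of the observed values, which is the MLE based on the observed data. In the network setting the filtered and smoothed quantities play the role of the expected statistics.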
Upstream Smoother • Start with the very last reach in the network. • Smooth two at a time using information from the filtered as well as smoothed downstream values. • Estimate based on observations from the entire network with the conditional expectation [equation not recovered]. • Recursive relationship results in smoothed estimates [equation not recovered] with variance [equation not recovered]. [Figure: reach t with its two immediate upstream reaches u1 and u2]
Other Tree Type Smoothers • Each reach as a parent that creates two children. • Existing work: Huang & Cressie (1997) and Chou (1994) for uptree filtering (fine-coarse) and downtree smoothing (coarse-fine). • Model different resolutions. • Assumption that children are independent conditioned on the parent. • Violated in the stream network model considered. [Figure: a parent node with two child nodes]
Example • [Figure: example stream network, with first order reaches up in the mountains and a fifth order reach on the plains; x = missing value]
Example • Consider a network that has 39 different reaches. • 20 first order, 19 higher order. • Let k be the Strahler order of reach t created by two reaches of order i and j. • State-space representation of [equation not recovered] with [condition not recovered]. • Assumptions about V(t) • Cov(V(s),V(t))=0 for s ≠ t • Cov(V(t),X(s1))=0 for any first order reach s1
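The Strahler order used above follows a simple rule: a first order reach has order 1, and a reach formed by reaches of order i and j has order i + 1 if i = j and max(i, j) otherwise. A minimal sketch follows, using a hypothetical seven-reach topology rather than the 39-reach example network.

```python
def strahler_orders(upstream, first_order):
    """Map each reach to its Strahler order.

    upstream: dict reach -> (u1, u2), its two immediate upstream reaches.
    first_order: iterable of reaches with no upstream contributors.
    """
    order = {s: 1 for s in first_order}

    def visit(t):
        if t not in order:
            i, j = (visit(u) for u in upstream[t])
            # Equal orders merge to the next order; otherwise the larger one persists.
            order[t] = i + 1 if i == j else max(i, j)
        return order[t]

    for t in upstream:
        visit(t)
    return order

# Hypothetical topology for illustration.
orders = strahler_orders({5: (1, 2), 6: (3, 4), 7: (5, 6)}, [1, 2, 3, 4])
```

In a network where every confluence joins exactly two reaches, 20 first order reaches imply 19 higher order reaches, which matches the counts in the example.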
Parameter Estimation • Total of 12 parameters to estimate based on 33 stream segments (6 missing values). • 6 different coefficient parameters to estimate in this model. • 5 different (conditional) variances to estimate. • 1 variance parameter from first order. • Most parameters will be estimated with few observations. • Only a few reaches will contribute to estimating each coefficient. • Suggests looking at parametric models for the coefficients. • Need a much larger stream network to achieve more reasonable parameter estimates.
Kalman Recursions • Downstream Filter (Y(t)=X(t)) • The filtered value is either the observed Y(t), or its conditional expectation given the two immediate upstream filtered values. • Variance is either 0 (if Y(t) is observed) or the prediction error variance of Y(t) given the two immediate upstream filtered values. • Upstream Smoother • Smooth two at a time, Y(u1) and Y(u2). • Either the observed value or the conditional expectation of Y(ui) given all reaches with observed measurements. • Need to know the logical order of flow.
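The downstream filter for the case Y(t) = X(t) can be sketched as one pass over the network in flow order. This minimal sketch assumes Y(t) = a1·Y(u1) + a2·Y(u2) + V(t) with a common variance q, independent first order reaches with prior mean `mean0` and variance `var0`, and uncorrelated errors on the two upstream branches. These choices, the topology, and all names are assumptions for the example.

```python
def downstream_filter(y, upstream, flow_order, a1, a2, q, mean0, var0):
    """Filtered values and variances when Y(t) = X(t); None marks a missing Y(t).

    flow_order must list every reach after its upstream reaches.
    """
    filt, var = {}, {}
    for t in flow_order:
        if y.get(t) is not None:
            filt[t], var[t] = y[t], 0.0          # observed: no uncertainty
        elif t in upstream:
            u1, u2 = upstream[t]
            # Conditional expectation given the two upstream filtered values.
            filt[t] = a1 * filt[u1] + a2 * filt[u2]
            var[t] = a1 ** 2 * var[u1] + a2 ** 2 * var[u2] + q
        else:
            filt[t], var[t] = mean0, var0        # missing first order reach
    return filt, var

upstream = {5: (1, 2), 6: (3, 4), 7: (5, 6)}     # hypothetical topology
y = {1: 1.0, 2: 2.0, 3: None, 4: 4.0, 5: None, 6: 3.0, 7: None}
filt, var = downstream_filter(y, upstream, [1, 2, 3, 4, 5, 6, 7],
                              a1=0.5, a2=0.5, q=0.25, mean0=0.0, var0=1.0)
```

The `flow_order` argument reflects the need to know the logical order of flow: each reach is processed only after both of its upstream reaches.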
Smoothed Data Values • [Figure: smoothed data values, with points labeled 1 to 6] • More iterations in the EM algorithm. • Better model for the coefficient parameters. • Plot estimates from regression against covariates (regressogram). • Re-compute MLE based on new parametric model suggested by the regressogram.
Future Work • Work with real data from larger networks. • Obtain better initial estimates. • Investigate EM convergence. • Use reach-specific covariate information such as location within a reach, inflow from upstream reaches, etc. • State space representations that allow for larger classes of models than the AR structure considered here. • Allow for upstream measurements on the same reach.