720 likes | 872 Views
NRCSE. Modelling non-stationarity in space and time for air quality data Peter Guttorp University of Washington peter@stat.washington.edu. Outline. Lecture 1: Geostatistical tools Gaussian predictions Kriging and its neighbours The need for refinement
E N D
NRCSE Modelling non-stationarity in space and time for air quality dataPeter GuttorpUniversity of Washingtonpeter@stat.washington.edu
Outline • Lecture 1: Geostatistical tools • Gaussian predictions • Kriging and its neighbours • The need for refinement • Lecture 2: Nonstationary covariance estimation • The deformation approach • Other nonstationary models • Extensions to space-time • Lecture 3: Putting it all together • Estimating trends • Prediction of air quality surfaces • Model assessment
Research goals in air quality modeling • Create exposure fields for health effects modeling • Assess deterministic air quality models • Interpret environmental standards • Enhance understanding of complex systems
The geostatistical setup • Gaussian process • m(s)=EZ(s) Var Z(s) < ∞ • Z is strictly stationary if • Z is weakly stationary if • Z is isotropic if weakly stationary and
The problem • Given observations at n sites Z(s1),...,Z(sn) • estimate • Z(s0) (the process at an unobserved site) • or (a weighted average of the process)
A Gaussian formula • If • then
Simple kriging • Let X = (Z(s1),...,Z(Sn))T, Y = Z(s0), so • mX=m1n, mY=m, • SXX=[C(si-sj)], SYY=C(0), and SYX=[C(si-s0)]. • Thus • This is the best linear unbiased predictor for known m and C (simple kriging). • Variants: ordinary kriging (unknown m) • universal kriging (m=Ab for some covariate A) • Still optimal for known C. • Prediction error is given by
The (semi)variogram • Intrinsic stationarity • Weaker assumption (C(0) need not exist) • Kriging can be expressed in terms of variogram
Estimation of covariance functions • Method of moments: square of all pairwise differences, smoothed over lag bins • Problem: Not necessarily a valid variogram
Least squares • Minimize • Alternatives: • fourth root transformation • weighting by 1/g2 • generalized least squares
Maximum likelihood • Z~Nn(m,S) S = a [r(si-sj;q)] = a V(q) • Maximize • and maximizes the profile likelihood
Effect of estimating covariance structure • Standard geostatistical practice is to take the covariance as known. When it is estimated, optimality criteria are no longer valid, and “plug-in” estimates of variability are biased downwards. • (Zimmerman and Cressie, 1992) • A Bayesian prediction analysis takes proper account of all sources of uncertainty (Le and Zidek, 1992)
General setup • Z(x,t) = m(x,t) + n(x)1/2E(x,t) + e(x,t) • trend + smooth + error • We shall assume that m is known or constant • t = 1,...,T indexes temporal replications • E is L2-continuous, mean 0, variance 1, independent of the error e • C(x,y) = Cor(E(x,t),E(y,t)) • D(x,y) = Var(E(x,t)-E(y,t)) (dispersion)
Geometric anisotropy • Recall that if we have an isotropic covariance (circular isocorrelation curves). • If for a linear transformation A, we have geometric anisotropy (elliptical isocorrelation curves). • General nonstationary correlation structures are typically locally geometrically anisotropic.
The deformation idea • In the geometric anisotropic case, write • where f(x) = Ax. This suggests using a general nonlinear transformation . Usually d=2 or 3. • G-plane D-space • We do not want f to fold.
Implementation • Consider observations at sites x1, ...,xn. Let be the empirical covariance between sites xi and xj. Minimize • where J(f) is a penalty for non-smooth transformations, such as the bending energy
SARMAP • An ozone monitoring exercise in California, summer of 1990, collected data on some 130 sites.
Transformation • This is for hr. 16 in the afternoon
Thin-plate splines Linear part
A Bayesian implementation • Likelihood: • Prior: • Linear part: • fix two points in the G-D mapping • put a (proper) prior on the remaining two parameters • Posterior computed using Metropolis-Hastings
Other applications • Point process deformation (Jensen & Nielsen, Bernoulli, 2000) • Deformation of brain images (Worseley et al., 1999)
Isotropic covariances on the sphere Isotropic covariances on a sphere are of the form where p and q are directions, gpq the angle between them, and Pi the Legendre polynomials. Example: ai=(2i+1)ri
A class of global transformations • Iteration between simple parametric deformation of latitude (with parameters changing with longitude) and similar deformations of longitude (changing smoothly with latitude). • (Das, 2000)
Global temperature • Global Historical Climatology Network 7280 stations with at least 10 years of data. Subset with 839 stations with data 1950-1991 selected.
Gaussian moving averages • Higdon (1998), Swall (2000): • Let x be a Brownian motion without drift, and . This is a Gaussian process with correlogram • Account for nonstationarity by letting the kernel b vary with location:
Kernel averaging • Fuentes (2000): Introduce orthogonal local stationary processes Zk(s), k=1,...,K, defined on disjoint subregions Sk and construct • where wk(s) is a weight function related to dist(s,Sk). Then • A continuous version has
Simplifying assumptions in space-time models • Temporal stationarity • seasonality • decadal oscillations • Spatial stationarity • orographic effects • meteorological forcing • Separability • C(t,s)=C1(t)C2(s)
SARMAP revisited • Spatial correlation structure depends on hour of the day (non-separable):
Bruno’s seasonal nonseparability • Nonseparability generated by seasonally changing spatial term • Z1 large-scale feature • Z2 separable field of local features • (Bruno, 2004)
A non-separable class of stationary space-time covariance functions • Cressie & Huang (1999): • Fourier domain • Gneiting (2001): f is completely monotone if (-1)n f (n) ≥ 0for all n. Bernstein’s theorem : for some non-decreasing F. • Combine a completely monotone function and a function y with completely monotone derivative into a space-time covariance
A particular case a=1/2,g=1/2 a=1/2,g=1 a=1,g=1/2 a=1,g=1
Uses for surface estimation • Compliance • exposure assessment • measurement • Trend • Model assessment • comparing (deterministic) model to data • approximating model output • Health effects modeling
Health effects • Personal exposure (ambient and non-ambient) • Ambient exposure • outdoor time • infiltration • Outdoor concentration model for individual i at time t
Seattle health effects study • 2 years, 26 10-day sessions • A total of 167 subjects: • 56 COPD subjects • 40 CHD subjects • 38 healthy subjects • (over 65 years old, non-smokers) • 33 asthmatic kids • A total of 108 residences: • 55 private homes • 23 private apartments • 30 group homes
Ogawa sampler PUF HPEM pDR
T/RH logger CO2 monitor Ogawa sampler CAT Nephelometer HI Quiet Pump Box