Groundwater. Notes on geostatistics. Monica Riva, Alberto Guadagnini, Politecnico di Milano, Italy. Key reference: de Marsily, G. (1986), Quantitative Hydrogeology. Academic Press, New York, 440 pp.
Modelling flow and transport in heterogeneous media: motivation and general idea. In practice, the random spatial variability of hydrogeologic medium properties, and the stochastic nature of the corresponding flow (hydraulic head, fluid flux and velocity) and transport (solute concentration, solute flux and velocity) variables, are often ignored. Instead, the common approach has been to analyse flow and transport in multiscale, randomly heterogeneous soils and rocks deterministically. Yet, with increasing frequency, the popular deterministic approach to hydrogeologic analysis is proving to be inadequate.
Understanding the role of heterogeneity. January 2000 editorial "It's the Heterogeneity!" (Wood, W.W., It's the Heterogeneity!, Editorial, Ground Water, 38(1), 1, 2000): heterogeneity of chemical, biological, and flow conditions should be a major concern in any remediation scenario. Many in the groundwater community either failed to "get" the message or were forced by political considerations to provide rapid, untested, site-specific active remediation technology. "It's the heterogeneity," and it is the Editor's guess that the natural system is so complex that it will be many years before one can effectively deal with heterogeneity on societally important scales. Panel of experts (DOE/RL-97-49, April 1997): as flow and transport are poorly understood, previous and ongoing computer modelling efforts are inadequate and based on unrealistic and sometimes optimistic assumptions, which render their output unreliable.
Flow and Transport in Multiscale Fields (conceptual). Field- and laboratory-derived conductivities and dispersivities appear to vary continuously with the scale of observation (conductivity support, plume travel distance). Anomalous transport. Recent theories attempt to link such scale dependence to the multiscale structure of Y = ln K. They predict the observed effect of domain size on the apparent variance and integral scale of Y, and the observed supra-linear growth rate of dispersivity with mean travel distance (time). Major challenge: develop more powerful and general stochastic theories and models for multiscale random media, and back them with laboratory and field observations.
Conceptual difficulty: the data are deduced by means of deterministic Fickian models from laboratory and field tracer tests in a variety of porous and fractured media, under varied flow and transport regimes. Linear regression: $a_{La} \approx 0.017\, s^{1.5}$, i.e. supra-linear growth of dispersivity with travel distance s. Some light is shed by Neuman S.P., On advective transport in fractal permeability and velocity fields, Water Resour. Res., 31(6), 1455-1460, 1995.
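A power law of this kind plots as a straight line in log-log space, with the exponent as its slope. The following is a minimal sketch of such a regression, not the procedure used in the cited work: the data points are synthetic, generated from the quoted relation with coefficient 0.017 and exponent 1.5, and the function name is hypothetical.

```python
import math

def fit_power_law(s, a):
    """Least-squares fit of a = c * s**p, done as a straight line in log-log space."""
    xs = [math.log(v) for v in s]
    ys = [math.log(v) for v in a]
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    p = sxy / sxx                 # slope in log-log space = exponent
    c = math.exp(my - p * mx)     # intercept in log-log space = log(coefficient)
    return c, p

# Synthetic, noise-free points (NOT tracer-test data), generated from the quoted relation
travel_distance = [1.0, 3.0, 10.0, 30.0, 100.0]
dispersivity = [0.017 * s ** 1.5 for s in travel_distance]
c, p = fit_power_law(travel_distance, dispersivity)
```

An exponent p larger than 1 is what "supra-linear growth" means here: dispersivity grows faster than the travel distance itself.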
Natural Variability. Geostatistics revisited • Introduction: Few field findings about spatial variability • Regionalized variables • Interpolation methods • Simulation methods
AVRA VALLEY Clifton and Neuman, 1982 Clifton, P.M., and S.P. Neuman, Effects of Kriging and Inverse Modeling on Conditional Simulation of the Avra Valley Aquifer in southern Arizona, Water Resour. Res., 18(4), 1215-1234, 1982. Regional Scale
Columbus Air Force Base [Adams and Gelhar, 1992]. Aquifer Scale
Mt. Simon aquifer Bakr, 1976 Local Scale
Summary: variability is present at all scales. But what happens if we ignore it? We will see in this class that ignoring it leads to interpretation problems in both groundwater flow and solute transport phenomena. Examples in transport: - Scale effects in dispersion - New processes arising. Heterogeneous parameters: ALL (T, K, φ, S, v (q), BC, ...). Most relevant one: T (2D) or K (3D), as they have been shown to vary by orders of magnitude in an apparently homogeneous aquifer.
Variability in T and/or K. Summary of data from many different places in the world. Careful, though: data are not always obtained with rigorous procedures and, as we will see throughout the course, they depend on the interpretation method and on the scale of regularization. Data are given in terms of mean and variance (dispersion around the mean value).
Variability in T and/or K. Almost always $\sigma_{\ln T}$ (or $\sigma_{\ln K}$) < 2 (and in most cases < 1). This can be questioned, but it is acceptable for now. Correlation scales (a very important concept later!).
But what is the correct treatment for natural heterogeneity? First of all, what do we know? - Real data at (few) selected points - Statistical parameters - A huge uncertainty related to the lack of data in most of the aquifer. If the parameters are continuous (of course they are), then the number of locations without data is infinite. Note: the value of K at any point DOES EXIST. The problem is that we do not know it (we could if we measured it, but we could never be exhaustive anyway). Stochastic approach: K at any given point is RANDOM, drawn from a predefined (maybe known, maybe not) pdf, and spatially correlated: a REGIONALIZED VARIABLE.
Regionalized Variables. T(x,ω) is a Spatial Random Function (SRF) if and only if: • If ω = ω0, then T(x,ω0) is a spatial function (continuity? differentiability?) • If x = x0, then T(x0) (actually T(x0, ω)) is a random variable. Thus, as a random variable, T(x0) has a univariate distribution (log-normal according to Law, 1944; Freeze, 1975).
Hoeksema & Kitanidis, 1985 Log-T normal, log-K normal Both consolidated and unconsolidated deposits
Now we look at T(x), so we are interested in the multivariate distribution of T(x1), T(x2), ..., T(xn). Most frequent hypothesis: Y = (Y(x1), Y(x2), ..., Y(xn)) = (ln T(x1), ln T(x2), ..., ln T(xn)) is multinormal, characterized by its mean vector and covariance matrix. But most important: NO INDEPENDENCE.
What if independent? Then the joint pdf is the product of the univariate pdfs, $f(y_1, \dots, y_n) = \prod_{i=1}^{n} f(y_i)$, and we are in classical statistics. But here we are not, so we need some way to characterize the dependency of one variable at some point on the SAME variable at a DIFFERENT point. This is the concept of the SEMIVARIOGRAM (or VARIOGRAM).
Classification of SRF • Second-order stationary: E[Z(x)] = const; C(x, y) is not a function of location (only of the separation vector, h). Particular case: isotropic SRF, $C(\mathbf{h}) = C(|\mathbf{h}|)$. Anisotropic covariance: different correlation scales along different directions. Most important property: if the distribution is multinormal, first- and second-order moments are enough to fully characterize the SRF multivariate distribution.
Relaxing the stationarity assumption. 1. The assumption of second-order stationarity with finite variance, C(0), might not be satisfied (the experimental variance tends to increase with domain size). 2. Less stringent assumption: INTRINSIC HYPOTHESIS. The variance of the first-order increments is finite AND these increments are themselves second-order stationary. Very simple example: hydraulic heads ARE non-intrinsic SRF. $E[Y(x+h) - Y(x)] = m(h)$ and $\mathrm{var}[Y(x+h) - Y(x)] = 2\gamma(h)$, both independent of x and only functions of h. Usually m(h) = 0; if not, just define a new function, Y(x) - m(x), which satisfies this condition. Definition of the variogram, $\gamma(h)$: with $E[Y(x+h) - Y(x)] = 0$, $\gamma(h) = \frac{1}{2}\,\mathrm{var}[Y(x+h) - Y(x)] = \frac{1}{2}\,E\{[Y(x+h) - Y(x)]^2\}$.
Variogram vs. Covariance. 1. The variogram is half the mean quadratic increment of Y between two points separated by h. 2. Compare the INTRINSIC HYPOTHESIS with SECOND-ORDER STATIONARITY, $E[Y(x)] = m$ = constant: $\gamma(h) = \frac{1}{2} E\{[Y(x+h) - Y(x)]^2\} = \frac{1}{2}\left( E[Y(x+h)^2] + E[Y(x)^2] - 2m^2 - 2E[Y(x+h)\,Y(x)] + 2m^2 \right) = C(0) - C(h)$. (Figure: variogram and covariance plotted versus h.)
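The variogram. The definition of the semi-variogram is usually given by the following probabilistic formula: $\gamma(\mathbf{h}) = \frac{1}{2} E\{[Z(\mathbf{x}+\mathbf{h}) - Z(\mathbf{x})]^2\}$. When dealing with real data, the semi-variogram is estimated by the experimental semi-variogram. For a given separation vector, h, there is a set of observation pairs that are approximately separated by this distance. Let the number of pairs in this set be N(h). The experimental semi-variogram is given by: $\gamma^*(\mathbf{h}) = \frac{1}{2N(\mathbf{h})} \sum_{i=1}^{N(\mathbf{h})} [Z(\mathbf{x}_i + \mathbf{h}) - Z(\mathbf{x}_i)]^2$.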
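The experimental semi-variogram is half the average squared difference over the N(h) pairs that fall in a lag class. A minimal sketch for data along a one-dimensional transect follows; the lag tolerance `tol`, the function name and the data values are assumptions made for illustration only.

```python
def experimental_semivariogram(x, z, lags, tol):
    """Return a list of (lag, gamma, n_pairs) for 1-D data.

    A pair (i, j) is assigned to lag h when | |x_i - x_j| - h | <= tol.
    """
    result = []
    for h in lags:
        sq_sum = 0.0
        n_pairs = 0
        for i in range(len(x)):
            for j in range(i + 1, len(x)):
                if abs(abs(x[i] - x[j]) - h) <= tol:
                    sq_sum += (z[i] - z[j]) ** 2
                    n_pairs += 1
        # gamma*(h) = sum of squared increments / (2 N(h)); undefined if no pairs
        gamma = sq_sum / (2.0 * n_pairs) if n_pairs > 0 else float("nan")
        result.append((h, gamma, n_pairs))
    return result

# Hypothetical ln T values along a transect (illustrative numbers only)
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 2.0, 4.0, 3.0, 5.0]
table = experimental_semivariogram(x, y, lags=[1.0, 2.0], tol=0.1)
```

The number of pairs drops as the lag grows, so estimates at large lags rest on few pairs and are correspondingly less reliable.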
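Some comments on the variogram. If Z(x) and Z(x+h) are totally independent, then $\gamma(h) = \sigma^2$, the variance of Z. If Z(x) and Z(x+h) are totally dependent, then $\gamma(h) = 0$. One particular case is when x = x+h, i.e. h = 0; therefore, by definition, $\gamma(0) = 0$. In the stationary case: $\gamma(h) = C(0) - C(h) = \sigma^2 - C(h)$.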
Variogram Models • DEFINITIONS: • Nugget • Sill • Range • Integral distance or correlation scale • Models: • Pure Nugget • Spherical • Exponential • Gaussian • Power
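The listed models can be written as simple functions of the separation distance h. The sketch below uses one common parameterization, which is an assumption and may differ from the one intended on the slide: `sill` is the plateau value, `a` is the range (spherical) or a scale parameter (exponential, Gaussian), and the power model has no sill.

```python
import math

def nugget(h, c0):
    """Pure nugget: zero at h = 0, constant c0 for any h > 0."""
    return 0.0 if h == 0 else c0

def spherical(h, sill, a):
    """Reaches the sill exactly at the range a."""
    if h >= a:
        return sill
    r = h / a
    return sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, sill, a):
    """Approaches the sill asymptotically; a is the scale parameter."""
    return sill * (1.0 - math.exp(-h / a))

def gaussian(h, sill, a):
    """Parabolic near the origin (very smooth fields); a is the scale parameter."""
    return sill * (1.0 - math.exp(-(h / a) ** 2))

def power(h, c, b):
    """No sill: grows without bound. Valid as a variogram for 0 < b < 2."""
    return c * h ** b
```

With this parameterization the exponential model reaches about 95% of its sill at h = 3a, which is why a "practical range" is often quoted for the asymptotic models.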
Correlation scales: Larger in T than in K. Larger in horizontal than in vertical. Fraction of the domain of interest
Additional comments • Second-order stationary: E[Z(x)] = constant; $\gamma(\mathbf{h})$ is not a function of location. Particular case: isotropic SRF, $\gamma(\mathbf{h}) = \gamma(|\mathbf{h}|)$. Anisotropic variograms: two types of anisotropy, depending on whether the correlation scale or the sill value changes with direction. Important property: $\gamma(h) = \sigma^2 - C(h)$. Most important property: if the distribution is multinormal, first- and second-order moments are enough to fully characterize the SRF multivariate distribution.
Estimation vs. Simulation. Problem: few data are available; maybe we know the mean, variance and variogram. Alternatives: (1) Estimation (interpolation) problems: KRIGING. Kriging is the BLUE (Best Linear Unbiased Estimator). Extremely smooth. Many possible krigings. Alternative: cokriging. http://www-sst.unil.ch/research/variowin/
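The kriging equations - 1 We want to predict the value, Z(x0), at an unsampled location, x0, using a weighted average of the observed values at N neighboring locations, {Z(x1), Z(x2), ..., Z(xN)}. Let Z*(x0) represent the predicted value; a weighted average estimator can be written as $Z^*(x_0) = \sum_{i=1}^{N} \lambda_i Z(x_i)$. The associated estimation error is $Z^*(x_0) - Z(x_0)$. In general, we do not know the (constant) mean, m, in the intrinsic hypothesis. We impose the additional condition of equivalence between the mathematical expectations of Z* and Z0: $E[Z^*(x_0)] = E[Z(x_0)]$.
The kriging equations - 2 With m the unknown mathematical expectation of the process Z: $E[Z^*(x_0)] = \sum_{i=1}^{N} \lambda_i E[Z(x_i)] = m \sum_{i=1}^{N} \lambda_i = m$, hence $\sum_{i=1}^{N} \lambda_i = 1$. This condition allows obtaining an unbiased estimator.
The kriging equations - 3 We wish to determine the set of weights. IMPOSE the condition that the variance of the estimation error be minimum: $E\{[Z^*(x_0) - Z(x_0)]^2\} = \min$.
The kriging equations - 4 We then use the definition of the variogram: $\gamma(x_i - x_j) = \frac{1}{2} E\{[Z(x_i) - Z(x_j)]^2\} = \frac{1}{2} E\{[(Z(x_i) - Z(x_0)) - (Z(x_j) - Z(x_0))]^2\}$. THEN: $E\{[Z(x_i) - Z(x_0)][Z(x_j) - Z(x_0)]\} = \gamma(x_i - x_0) + \gamma(x_j - x_0) - \gamma(x_i - x_j)$, which is used in: $E\{[Z^*(x_0) - Z(x_0)]^2\} = \sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j\, E\{[Z(x_i) - Z(x_0)][Z(x_j) - Z(x_0)]\}$, where the unbiasedness condition has been used to write $Z^*(x_0) - Z(x_0) = \sum_{i=1}^{N} \lambda_i [Z(x_i) - Z(x_0)]$.
The kriging equations - 5 By substitution: $E\{[Z^*(x_0) - Z(x_0)]^2\} = \sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j [\gamma(x_i - x_0) + \gamma(x_j - x_0) - \gamma(x_i - x_j)]$. Noting that $\sum_{i=1}^{N} \lambda_i = 1$, we finally obtain: $E\{[Z^*(x_0) - Z(x_0)]^2\} = 2\sum_{i=1}^{N} \lambda_i \gamma(x_i - x_0) - \sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j \gamma(x_i - x_j)$.
The kriging equations - 6 This is a constrained optimization problem. To solve it we use the method of Lagrange multipliers. One common way to write the Lagrangian objective function (the factor 1/2 and the sign of the multiplier $\mu$ are conventions) is $L(\lambda_1, \dots, \lambda_N, \mu) = \frac{1}{2} E\{[Z^*(x_0) - Z(x_0)]^2\} - \mu \left( \sum_{i=1}^{N} \lambda_i - 1 \right)$. To minimize this we take the partial derivative of the Lagrangian with respect to each of the weights and with respect to the Lagrange multiplier, and set the resulting expressions equal to zero, yielding a system of linear equations.
The kriging equations - 7 Minimize this: $L = \sum_{i=1}^{N} \lambda_i \gamma(x_i - x_0) - \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j \gamma(x_i - x_j) - \mu \left( \sum_{i=1}^{N} \lambda_i - 1 \right)$, and get (N+1) linear equations with (N+1) unknowns: $\sum_{j=1}^{N} \lambda_j \gamma(x_i - x_j) + \mu = \gamma(x_i - x_0)$ for $i = 1, \dots, N$, together with $\sum_{i=1}^{N} \lambda_i = 1$.
The kriging equations - 8 The complete system can be written as $\mathbf{A}\,\boldsymbol{\lambda} = \mathbf{b}$, with $\mathbf{A} = \begin{pmatrix} \gamma(x_1 - x_1) & \cdots & \gamma(x_1 - x_N) & 1 \\ \vdots & & \vdots & \vdots \\ \gamma(x_N - x_1) & \cdots & \gamma(x_N - x_N) & 1 \\ 1 & \cdots & 1 & 0 \end{pmatrix}$, $\boldsymbol{\lambda} = (\lambda_1, \dots, \lambda_N, \mu)^T$ and $\mathbf{b} = (\gamma(x_1 - x_0), \dots, \gamma(x_N - x_0), 1)^T$.
The kriging equations - 9 We finally get the variance of the estimation error: $\sigma^2(x_0) = \sum_{i=1}^{N} \lambda_i \gamma(x_i - x_0) + \mu$. Hint: just replace $\sum_{j=1}^{N} \lambda_j \gamma(x_i - x_j) = \gamma(x_i - x_0) - \mu$ into $E\{[Z^*(x_0) - Z(x_0)]^2\} = 2\sum_{i=1}^{N} \lambda_i \gamma(x_i - x_0) - \sum_{i=1}^{N}\sum_{j=1}^{N} \lambda_i \lambda_j \gamma(x_i - x_j)$.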
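The (N+1) linear equations with (N+1) unknowns can be assembled and solved numerically. The following is a minimal one-dimensional sketch, assuming the ordinary kriging system in variogram form, $\sum_j \lambda_j \gamma(x_i - x_j) + \mu = \gamma(x_i - x_0)$ with $\sum_i \lambda_i = 1$, and an estimation variance $\sum_i \lambda_i \gamma(x_i - x_0) + \mu$. The exponential variogram, its parameters, the data values and all function names are hypothetical; the small Gaussian-elimination solver only keeps the example free of external libraries.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                   # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def gamma(h, sill=1.0, scale=10.0):
    """Assumed exponential variogram (hypothetical parameters)."""
    return sill * (1.0 - math.exp(-abs(h) / scale))

def ordinary_kriging_1d(xs, zs, x0):
    """Return (estimate, estimation variance, weights) at location x0."""
    n = len(xs)
    # Rows 1..N: variogram values between data points, plus the multiplier column
    a = [[gamma(xs[i] - xs[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])                      # unbiasedness: weights sum to 1
    b = [gamma(xs[i] - x0) for i in range(n)] + [1.0]
    sol = solve(a, b)
    lam, mu = sol[:n], sol[n]
    estimate = sum(l * z for l, z in zip(lam, zs))
    variance = sum(l * g for l, g in zip(lam, b[:n])) + mu
    return estimate, variance, lam

# Two hypothetical observations; predict halfway between them
xs = [0.0, 10.0]
zs = [1.0, 3.0]
est, var, lam = ordinary_kriging_1d(xs, zs, x0=5.0)
```

Because the diagonal of the kriging matrix is zero (γ(0) = 0), the solver needs pivoting. Evaluating the estimator at a data location returns the datum itself with zero estimation variance, which is a quick sanity check for any implementation.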
Estimation vs. Simulation (ii). (2) Simulations: try to reproduce the "look" of the heterogeneous variable. Important when extreme values matter. Many (actually infinitely many) solutions, all of them equally likely (and each with zero probability of being the true field). Which of the two, estimation or simulation, is of interest depends on the application.
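One simple way to generate equally likely realizations of a multinormal field is to factor its covariance matrix. The sketch below is unconditional (it does not honour measured values) and assumes a second-order stationary field with an exponential covariance; the Cholesky approach, the parameter values and the function names are illustrative choices, not the method used in the studies cited in these notes.

```python
import math
import random

def cholesky(c):
    """Lower-triangular factor low such that low * low^T = c (c symmetric positive definite)."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def simulate_fields(xs, mean, variance, scale, n_real, rng):
    """Draw n_real realizations of Y at locations xs with C(h) = variance * exp(-|h| / scale)."""
    cov = [[variance * math.exp(-abs(xi - xj) / scale) for xj in xs] for xi in xs]
    low = cholesky(cov)            # factor once, reuse for every realization
    n = len(xs)
    fields = []
    for _ in range(n_real):
        w = [rng.gauss(0.0, 1.0) for _ in range(n)]   # independent standard normals
        fields.append([mean + sum(low[i][k] * w[k] for k in range(i + 1)) for i in range(n)])
    return fields

# Hypothetical set-up: 20 nodes, unit spacing, correlation scale of 5 units
nodes = [float(i) for i in range(20)]
fields = simulate_fields(nodes, mean=0.0, variance=1.0, scale=5.0, n_real=100, rng=random.Random(0))
```

Each realization keeps the full variance of Y, unlike a kriged map, which is smoother than reality; the ensemble of realizations, not any single one, carries the statistical information.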
Estimation. 1 AVRA VALLEY. Regional Scale - Clifton, P.M., and S.P. Neuman, Effects of Kriging and Inverse Modeling on Conditional Simulation of the Avra Valley Aquifer in southern Arizona, Water Resour. Res., 18(4), 1215-1234, 1982.
Estimation. 2 AVRA VALLEY. Regional Scale (Clifton and Neuman, 1982).
Estimation. 3 AVRA VALLEY. Regional Scale (Clifton and Neuman, 1982).
Estimation. 4 AVRA VALLEY. Regional Scale (Clifton and Neuman, 1982).
Estimation. 5 AVRA VALLEY. Regional Scale (Clifton and Neuman, 1982).
Monte Carlo approach. CONDITIONAL CROSS-CORRELATED FIELDS: 2000 simulations of Y = ln T give the head fields h1, h2, ..., h2000, from which the statistical CONDITIONAL moments (first and second order) are computed.
NUMERICAL ANALYSIS - MONTE CARLO. Steps: (1) evaluation of key statistics of medium parameters (K, porosity, ...); (2) synthetic generation of an ensemble of equally likely fields; (3) solution of flow/transport problems on each one of these; (4) ensemble statistics. Advantages: simple to understand; applicable to a wide range of linear and nonlinear problems; high heterogeneities; conditioning. Drawbacks: heavy calculations; fine computational grids; reliable convergence criteria (?).
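The Monte Carlo workflow can be illustrated on a deliberately small problem. The sketch below assumes steady one-dimensional flow through blocks in series between two fixed heads, with block conductivities drawn as independent log-normal values; the independence (no spatial correlation), the parameter values and the function names are simplifications made only to keep the example short.

```python
import math
import random

def heads_1d(k_blocks, h_left, h_right, dx):
    """Steady 1-D flow through blocks in series; returns heads at the block interfaces."""
    resistance = [dx / k for k in k_blocks]
    q = (h_left - h_right) / sum(resistance)   # Darcy flux, the same in every block
    heads = [h_left]
    for r in resistance:
        heads.append(heads[-1] - q * r)
    return heads

def monte_carlo_heads(n_real, n_blocks, sigma_y, rng):
    """Ensemble mean and variance of the head at the domain midpoint."""
    mid = n_blocks // 2
    samples = []
    for _ in range(n_real):
        # One realization of K = exp(Y), Y ~ N(0, sigma_y^2), independent per block
        k = [math.exp(rng.gauss(0.0, sigma_y)) for _ in range(n_blocks)]
        samples.append(heads_1d(k, 1.0, 0.0, 1.0)[mid])
    mean = sum(samples) / n_real
    var = sum((s - mean) ** 2 for s in samples) / (n_real - 1)
    return mean, var, samples

mean, var, samples = monte_carlo_heads(n_real=2000, n_blocks=20, sigma_y=1.0, rng=random.Random(1))
```

In a real application the independent draws would be replaced by spatially correlated (and possibly conditioned) fields, and the analytical series solution by a numerical flow solver, which is where the heavy calculations come from.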
Problems: reliable assessment of convergence (Ballio and Guadagnini [2004]). (Figure: hydraulic head variance versus number of Monte Carlo simulations.)
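A simple way to inspect convergence is to track the sample variance as realizations accumulate and see whether it stabilizes. The sketch below uses Welford's online update, chosen here for numerical stability; it is not the convergence criterion of the cited paper, and the function name and the stand-in samples are hypothetical.

```python
import random

def running_variance(samples):
    """Welford's online update: sample variance after each new realization."""
    mean = 0.0
    m2 = 0.0          # running sum of squared deviations from the current mean
    trace = []
    for n, s in enumerate(samples, start=1):
        delta = s - mean
        mean += delta / n
        m2 += delta * (s - mean)
        trace.append(m2 / (n - 1) if n > 1 else float("nan"))
    return trace

# Stand-in for head values from successive Monte Carlo runs (illustrative only)
rng = random.Random(2)
head_samples = [rng.gauss(0.5, 0.1) for _ in range(1000)]
trace = running_variance(head_samples)
```

Plotting `trace` against the number of simulations gives the kind of curve described on this slide; a curve that is still drifting at the last realization suggests that more runs are needed.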