Sampling and monitoring the environment Marian Scott Sept 2006
Outline • Variation • General sampling principles • Methods of sampling • Simple random sampling • Stratified sampling • Systematic sampling • How many samples (power calculations) • Spatial sampling • Grid, transect and cluster sampling
Variation • Natural variation in the attribute of interest might be due to feeding habits if measuring sheep, or rainfall patterns if measuring plants • There is also variation/uncertainty due to analytical measurement techniques • Natural variation may well exceed the analytical uncertainty • Expect, therefore, that if you measure a series of replicate samples they will vary, and if there are sufficient replicates you may be able to define the distribution of the attribute of interest.
What is statistical sampling? Statistical sampling is a process that allows inferences about properties of a large collection of things (commonly described as the population) to be made from observations on a relatively small number of individuals belonging to the population (the sample). In conducting statistical sampling, one is attempting to make inferences about the population.
Statistical sampling The use of valid statistical sampling techniques increases the chance that a set of specimens (the sample, in the collective sense) is collected in a manner that is representative of the population. Statistical sampling also allows a quantification of the precision with which inferences or conclusions can be drawn about the population.
Statistical sampling • The issue of representativeness is important because of the variability that is characteristic of environmental measurements. • Because of variability within the population, its description from an individual sample is imprecise, but this precision can be described in quantitative terms and improved by the choice of sampling design and sampling intensity (Peterson and Calvin, 1986).
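As a minimal sketch of how precision can be quantified and improved by sampling intensity, the base R code below draws simple random samples of two sizes from a simulated population and reports the estimated mean with its standard error. The population (log-normal values), the sample sizes and the function name srs_estimate are illustrative assumptions, not data or code from the course.

```r
# Hypothetical population: 10,000 simulated log-normal values (not real data).
set.seed(1)
population <- rlnorm(10000, meanlog = 0, sdlog = 0.5)

# Draw a simple random sample without replacement and estimate the mean.
srs_estimate <- function(pop, n) {
  s <- sample(pop, size = n, replace = FALSE)
  fpc <- 1 - n / length(pop)        # finite population correction
  se <- sqrt(fpc * var(s) / n)      # estimated standard error of the sample mean
  c(mean = mean(s), se = se)
}

small <- srs_estimate(population, 25)
large <- srs_estimate(population, 400)
print(rbind(n25 = small, n400 = large))
```

The larger sample should give a noticeably smaller standard error, which is the sense in which sampling intensity improves precision.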
Good books The general sampling textbooks by Cochran (1977) and Thompson (1992), the environmental statistics textbook by Gilbert (1987), and papers by Anderson-Sprecher et al. (1994), Crepin and Johnson (1993), Peterson and Calvin (1986), and Stehman and Overton (1994).
Know what you are setting out to do before you start • describing a characteristic of interest (usually the average) • describing the magnitude of variability of a characteristic • describing spatial patterns of a characteristic, mapping the spatial distribution • quantifying contamination above a background or specified intervention level • detecting temporal or spatial trends • assessing human health or environmental impacts of specific facilities, or of events such as accidental releases • assessing compliance with regulations
Rules • Rule 1: specify the objective
Use your scientific knowledge • the nature of the population, such as the physical or biological material of interest, its spatial extent, its temporal stability, and other important characteristics • the expected behaviour and environmental properties of the compound of interest in the population members • the sampling unit (i.e., individual sample or specimen) • the expected pattern and magnitude of variability in the observations
Rules • Rule 1: specify the objective • Rule 2: use your knowledge of the environmental context
Other approaches • Nearest neighbour methods - G-function: the empirical distribution function of event-to-event nearest neighbour distances, G(·) • Nearest neighbour methods - F-function: the empirical distribution function of point-to-event nearest neighbour distances, F(·) • Tests for CSR (complete spatial randomness): for a homogeneous Poisson process (i.e. CSR) with intensity λ, the theoretical distribution functions are G(s) = F(s) = 1 - exp(-λπs²)
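The comparison of an empirical G-function with its theoretical form under CSR can be sketched in base R as follows. The point pattern is simulated on the unit square, the intensity is estimated as the number of events per unit area, and no edge correction is applied; all of these are simplifying assumptions for illustration, and the object names are hypothetical.

```r
set.seed(42)
n <- 200
x <- runif(n); y <- runif(n)         # simulated CSR pattern on the unit square

d <- as.matrix(dist(cbind(x, y)))    # all pairwise distances between events
diag(d) <- Inf                       # exclude each event's distance to itself
nn <- apply(d, 1, min)               # event-to-event nearest neighbour distances

G_hat <- ecdf(nn)                    # empirical G-function (no edge correction)
lambda <- n / 1                      # estimated intensity: events per unit area
G_csr <- function(s) 1 - exp(-lambda * pi * s^2)   # theoretical G under CSR

s <- seq(0, 0.1, by = 0.02)
print(round(cbind(s, empirical = G_hat(s), theoretical = G_csr(s)), 3))
```

Large departures of the empirical curve above the theoretical one suggest clustering, and departures below it suggest regularity.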
Further models for Spatial point processes • Poisson cluster process • Inhomogeneous Poisson process • Cox process • Inhibition process
The problem of geostatistics Given observations at n sites, Z(s1), …, Z(sn), what is our estimate of Z(s0), where s0 is an unobserved site?
The autocovariance function The autocorrelation function
The (semi)variogram In terms of the autocovariance
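The standard textbook forms of these three quantities, assuming a second-order stationary process Z(s) so that they depend only on the separation h between sites (the symbols C, ρ, γ and h are notation assumed here), are:

```latex
\begin{align*}
C(h)      &= \operatorname{Cov}\{Z(s),\, Z(s+h)\}, \\
\rho(h)   &= \frac{C(h)}{C(0)}, \\
\gamma(h) &= \tfrac{1}{2}\operatorname{Var}\{Z(s+h) - Z(s)\} = C(0) - C(h).
\end{align*}
```

The last identity expresses the semivariogram in terms of the autocovariance.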
Kriging Assuming that the mean is zero And the prediction error is …
Kriging in terms of the covariance function Assuming that the mean is zero Meteorologists and oceanographers know this as optimal interpolation (OI) or objective analysis. They usually work relative to a ‘first guess’.
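A minimal base R sketch of simple kriging with zero mean: the weights solve Σw = c0, where Σ holds the covariances among the observed sites and c0 the covariances between those sites and s0; the prediction is the weighted sum of the observations and the prediction variance is C(0) minus the sum of w times c0. The exponential covariance, its parameter values, the four sites and the function names are illustrative assumptions.

```r
# Assumed covariance model: C(h) = sigma2 * exp(-h / phi) (illustrative values).
cov_exp <- function(h, sigma2 = 1, phi = 2) sigma2 * exp(-h / phi)

simple_krige <- function(coords, z, s0, covfun = cov_exp) {
  Sigma <- covfun(as.matrix(dist(coords)))           # covariances among observed sites
  c0 <- covfun(sqrt(colSums((t(coords) - s0)^2)))    # covariances between sites and s0
  w <- solve(Sigma, c0)                              # kriging weights
  list(pred = sum(w * z),                            # predicted Z(s0)
       var = covfun(0) - sum(w * c0),                # prediction (kriging) variance
       weights = w)
}

# Four hypothetical sites at the corners of a unit square.
coords <- cbind(x = c(0, 1, 0, 1), y = c(0, 0, 1, 1))
z <- c(0.3, -0.1, 0.4, 0.2)

centre <- simple_krige(coords, z, s0 = c(0.5, 0.5))  # unobserved site
at_site <- simple_krige(coords, z, s0 = c(0, 0))     # an observed site
print(centre$pred); print(centre$var)
```

With no nugget, predicting at an observed site returns the observation itself with zero prediction variance, which is a useful sanity check.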
A taxonomy of kriging • Simple kriging: mean known • Ordinary kriging: mean unknown • Universal kriging: mean a linear function of covariates • There are others
Kriging in R There are routines to do kriging in the R packages: • geoR • fields • gstat • sgeostat • spatstat • spatdat
Isotropy and Stationarity • An isotropic process is one whose properties (in particular the variogram) do not vary with direction • A stationary process is one whose properties do not vary with space • See Richard’s definition of stationarity in time series.
Steps in a geostatistical analysis • Exploration • Estimating the variogram • Kriging
Estimating the variogram • The ‘obvious’ estimator is • An alternative is the ‘robust’ estimator
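As a sketch of two commonly used estimators, the base R function below computes the classical method-of-moments semivariogram (half the mean squared difference within each distance bin) and the Cressie-Hawkins robust form (based on the mean square-root absolute difference). It is assumed here, not confirmed by the slide, that these are the estimators intended. The function name, the simulated data and the bin breaks are hypothetical.

```r
empirical_variogram <- function(coords, z, breaks) {
  idx <- which(upper.tri(diag(length(z))), arr.ind = TRUE)  # each pair of sites once
  h <- as.matrix(dist(coords))[idx]                         # separation distance per pair
  dz <- abs(z[idx[, 1]] - z[idx[, 2]])                      # absolute difference per pair
  bin <- cut(h, breaks)                                     # assign pairs to distance bins
  n_pairs <- tapply(dz, bin, length)
  classical <- tapply(dz^2, bin, mean) / 2
  robust <- tapply(sqrt(dz), bin, mean)^4 / (0.457 + 0.494 / n_pairs) / 2
  data.frame(bin = levels(bin),
             n = as.vector(n_pairs),
             classical = as.vector(classical),
             robust = as.vector(robust))
}

# Simulated example: 60 sites with a smooth trend in x plus noise.
set.seed(7)
coords <- cbind(x = runif(60, 0, 10), y = runif(60, 0, 10))
z <- sin(coords[, "x"] / 2) + rnorm(60, sd = 0.2)
vg <- empirical_variogram(coords, z, breaks = seq(0, 6, by = 1))
print(vg)
```

The robust form is less sensitive to a few unusually large differences, which is why it is often preferred when the data contain outliers.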
Fit a variogram model Rather than look at the empirical variogram, we can fit a model. See table. The previous examples are a spherical and an exponential variogram.
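The two model forms named here can be written out in base R as below. The parameterisation (nugget tau2, partial sill sigma2, range parameter phi) is one common convention and is an assumption; the parameter values are only illustrative starting values, and the function names are hypothetical.

```r
# Exponential model: rises towards the sill tau2 + sigma2 without ever reaching it.
vg_exponential <- function(h, tau2, sigma2, phi) {
  ifelse(h > 0, tau2 + sigma2 * (1 - exp(-h / phi)), 0)
}

# Spherical model: reaches the sill exactly at h = phi and is flat beyond it.
vg_spherical <- function(h, tau2, sigma2, phi) {
  r <- pmin(h / phi, 1)
  ifelse(h > 0, tau2 + sigma2 * (1.5 * r - 0.5 * r^3), 0)
}

h <- c(0, 5, 25, 100)
print(cbind(h,
            exponential = vg_exponential(h, tau2 = 0.005, sigma2 = 0.04, phi = 25),
            spherical = vg_spherical(h, tau2 = 0.005, sigma2 = 0.04, phi = 25)))
```

Both functions are zero at h = 0 and jump to the nugget for any positive distance, which is how a nugget effect appears in a fitted variogram.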
Kriging • Once we have an estimate of the variogram we can perform kriging
An example Wave period in the North Atlantic measured by the radar altimeter on TOPEX/POSEIDON We will concentrate on the area in red
Exploratory Phase
Either cut and paste from spat0.r or run source('spat0.r', echo=TRUE)
load('EMS.rda')
par(mfrow=c(2,2))
plot(periodsmall$lon, periodsmall$lat, pch='.')
hist(periodsmall$Tz)
hist(log(periodsmall$Tz))
# work with the log of Tz
periodsmall$lnTz <- log(periodsmall$Tz)
# thin the data: keep every 100th observation
tiny.period <- data.frame(periodsmall[seq(1, length(periodsmall$Tz), 100), ])
plot(tiny.period$lon, tiny.period$lat, pch='.')
Exploratory phase - 2 (spat1.r)
library(akima)
par(mfrow=c(2,1))
# interpolate the thinned log values onto a regular grid
int.Tz <- interp.old(tiny.period$lon, tiny.period$lat, tiny.period$lnTz)
image(int.Tz, xlim=range(tiny.period$lon), ylim=range(tiny.period$lat))
contour(int.Tz, add=TRUE)
persp(int.Tz, xlim=range(tiny.period$lon), ylim=range(tiny.period$lat), xlab='lon', ylab='lat', zlab='log period', phi=35)
Estimating the variogram (spat2.r)
library(geoR)
tiny.geo <- as.geodata(tiny.period, coords.col=c(3,2), data.col=9)
# create a variogram
tiny.var <- variog(tiny.geo, estimator.type='classical')
# the robust estimator
tiny.var.robust <- variog(tiny.geo, estimator.type='modulus')
par(mfrow=c(2,1))
plot(tiny.var)
plot(tiny.var.robust)
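Fitting a variogram model (spat3.r)
# initial values: partial sill 0.04, range 25, nugget 0.005 (nugget is estimated, not fixed)
tiny.var.fit <- variofit(tiny.var.robust, ini.cov.pars=c(0.04,25.0), cov.model='exponential', fix.nugget=FALSE, nugget=0.005)
# add the fitted model to the current variogram plot
lines(tiny.var.fit)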
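Kriging
par(mfrow=c(1,1))
# prediction grid: 21 x 21 points in unit steps
loci <- expand.grid(seq(0,20)-50, seq(0,20)+25)
# ordinary kriging ('ok') using the fitted variogram model
kc <- krige.conv(tiny.geo, locations=loci, krige=krige.control(type.krige='ok', obj.model=tiny.var.fit))
image(kc, locations=loci)
contour(kc, add=TRUE)
persp(kc, locations=loci, phi=45, xlab='lon', ylab='lat', zlab='log Tz')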
If you have time • Try different forms for the variogram: • gaussian • spherical • ?cov.spatial for details