Kriging, variograms, term project discussion/ definition Peter Fox GIS for Science ERTH 4750 (98271) Week 6, Tuesday, February 28, 2012
Contents • Reading review • More geostatistics • Kriging • Variograms • Projects • Continued lab on Friday, query, etc. • Next weeks
Reading for last week • Three papers: • Twelve Different Interpolation Methods • A Comparison of Thiessen-polygon, Kriging, and Spline Models of UV Exposure • Geostatistical interpolation of daily rainfall at catchment scale: the use of several variogram models in the Ourthe and Ambleve catchments, Belgium • Chapter 8 in MapInfo User Guide (10.5): Selecting and Querying Data (p. 193-228) -> Friday
Kriging • .. is a group of geostatistical techniques to interpolate the value of a random field (e.g., the elevation, z, of the landscape as a function of the geographic location) at an unobserved location from observations of its value at nearby locations. • Characterize spatial data • Optimize sampling
Many ‘types’ • Depending on the stochastic properties of the random field, different types of kriging apply. • The type of kriging determines the linear constraint on the weights $w_i$ implied by the unbiasedness condition, and hence the method for calculating the weights.
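As an illustration of how the unbiasedness condition becomes a linear constraint, consider ordinary kriging. This choice is an assumption for the example; the slide does not single out a type. If the mean $m$ is unknown but constant, then:

```latex
\hat{z}(x_0) = \sum_{i=1}^{n} w_i \, z(x_i), \qquad
E\left[\hat{z}(x_0) - z(x_0)\right]
  = m \left( \sum_{i=1}^{n} w_i - 1 \right) = 0
\;\Longrightarrow\; \sum_{i=1}^{n} w_i = 1 .
```

Other types of kriging make different assumptions about the mean and therefore lead to different constraints on the weights.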
Kriging computes the best linear unbiased estimator of z(x0) based on a stochastic model of the spatial dependence, quantified either by • the variogram γ(x,y) or by • the expectation m(x) = E[Z(x)] and the covariance function c(x,y) of the random field. • Note the contrast between stochastic (non-deterministic, ~ ‘random’) models and known, deterministic models, e.g. Gaussian
Notes • If the data locations are fairly dense and uniformly distributed throughout the study area, you will get fairly good estimates regardless of interpolation algorithm. • If the data locations fall in a few clusters with large gaps in between, you will get unreliable estimates regardless of interpolation algorithm. • Almost all interpolation algorithms will underestimate the highs and overestimate the lows; this is inherent to averaging and if an interpolation algorithm didn’t average we wouldn’t consider it reasonable
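The averaging effect in the last note can be seen with inverse distance weighting (IDW). The sketch below is a minimal illustration, not a production interpolator; the function name `idw`, the `power` parameter, and the sample data are all assumptions made for the example. Because the weights are positive and normalized, every estimate lies between the smallest and largest data value, so peaks are underestimated and troughs overestimated away from the data points.

```python
import math

def idw(points, values, target, power=2.0):
    """Inverse-distance-weighted estimate at `target` (hypothetical helper)."""
    num = 0.0
    den = 0.0
    for p, z in zip(points, values):
        d = math.dist(p, target)
        if d == 0.0:
            return z  # target coincides with a data point
        w = 1.0 / d ** power  # closer points get larger weights
        num += w * z
        den += w
    return num / den

# Made-up data: a high value (5.0) flanked by lower values
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
values = [1.0, 5.0, 2.0]
estimate = idw(points, values, (1.5, 0.0))
```

Here `estimate` falls strictly between 1.0 and 5.0, even though the true surface near the high point could well exceed 5.0.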
Advantages of Kriging • Helps to compensate for the effects of data clustering, assigning individual points within a cluster less weight than isolated data points (or, treating clusters more like single points) • Gives estimate of estimation error (kriging variance), along with estimate of the variable, Z, itself (but error map is basically a scaled version of a map of distance to nearest data point, so not that unique) • Availability of estimation error provides basis for stochastic simulation of possible realizations of z(x0)
Kriging • The property field is given by • $z(x, y) = m(x, y) + F'(x, y) + F''$ • where $m$ is the deterministic part of the field, $F'$ is the stochastic, spatially dependent part not in $m$, and $F''$ is the Gaussian part (spatially uncorrelated, mean = 0, variance = $s^2$). • If $m$ is taken as the average of the field then • $E[m(x, y) - m(x+dx, y+dy)] = 0$ • where $E$ is the expected value (the mean) and $(dx, dy) = h$ represents an offset in the position. In other words, a constant field does not change with position.
Letting $x$ represent the position vector $(x, y)$ and assuming the variance of the difference between any two measurements depends only on the distance between them: • $E[\{z(x) - z(x+h)\}^2] = E[\{F'(x) - F'(x+h)\}^2] = 2G(h)$. • $G(h)$ is known as the semivariance • $G(h) = \frac{1}{2n} \sum_{i=1}^{n} \{z(x_i) - z(x_i + h)\}^2$
Semivariance • This is half the mean squared difference over all n pairs of points that are a distance h apart. • Typically one calculates G(h) for a range of distances and plots G(h) vs. h.
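The calculation described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `empirical_semivariance`, the tolerance `tol` used to decide which pairs count as "a distance h apart", and the sample transect are all made up for the example.

```python
import math

def empirical_semivariance(points, values, lag, tol):
    """Estimate G(h) from all pairs whose separation is within `tol` of `lag`."""
    total = 0.0
    n_pairs = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if abs(d - lag) <= tol:
                total += (values[i] - values[j]) ** 2
                n_pairs += 1
    if n_pairs == 0:
        return None  # no pairs at this lag
    return total / (2 * n_pairs)  # (2n)^-1 * sum of squared differences

# Made-up transect: four samples at unit spacing along a line
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]

# G(h) for a range of distances, ready to plot against h
variogram = [(h, empirical_semivariance(points, values, h, 0.1))
             for h in (1.0, 2.0, 3.0)]
```

With real, irregularly spaced data the tolerance matters: too small and most lags have few pairs, too large and the curve is smoothed.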
Variograms • The shape of the variogram is diagnostic of the spatial variation present in the measurements. • The nugget is an estimate of $s^2$, the variance of the Gaussian (uncorrelated) noise in the data plus any spatial correlations at distances too small to be resolved. If the nugget is large, the data are noisy and largely Gaussian; interpolation is not indicated, just use m(x, y).
Variograms • Within the range (a distance) the closer points are more correlated and farther points are less correlated. • From this we learn how far out to extend our search area in IDW, for example. • In the region of the sill, the semivariance is independent of inter-site distances.
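To make the nugget, range, and sill concrete, here is a sketch of one commonly used model curve, the spherical variogram. The choice of the spherical model is an assumption for illustration (the slides do not name a model), as are the function and parameter names; `sill` here means the total sill, including the nugget.

```python
def spherical_variogram(h, nugget, sill, rng):
    """Spherical model: rises from the nugget and levels off at the sill at h = rng."""
    if h <= 0.0:
        return 0.0   # by convention, G(0) = 0
    if h >= rng:
        return sill  # beyond the range, semivariance no longer depends on distance
    r = h / rng
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

# Made-up parameters: nugget 0.2, sill 1.0, range 10 distance units
curve = [(h, spherical_variogram(h, 0.2, 1.0, 10.0)) for h in (0.0, 2.5, 5.0, 10.0, 20.0)]
```

In practice the three parameters would be fitted to the empirical semivariances rather than chosen by hand.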
Kriging interpolation • Kriging is a weighted-average type of interpolation where the weights are based on the semivariance estimates. • $z_j = \left[\sum_{i=1}^{n} w_i z_i\right] / \left[\sum_{i=1}^{n} w_i\right]$, • $w$ is related to $G(h)$ in a manner that we won't discuss in detail here. • Simply, the more strongly points at a given distance are correlated, the more weight points at that distance receive. • Weights are estimated by a minimization procedure, subject to the constraint that their sum is unity.
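For readers who want to see how the weights can come out of the semivariance, the following is a minimal sketch of ordinary kriging, one common variant and an assumption here since the slides leave the details out. The minimization with the sum-to-one constraint leads to a small linear system with a Lagrange multiplier. The linear variogram G(h) = h, the helper names, and the sample data are all made up for the example, and the data points are assumed to be distinct.

```python
import math

def linear_variogram(h):
    """Assumed model for illustration: G(h) = h."""
    return h

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ordinary_kriging(points, values, target, variogram=linear_variogram):
    """Return (estimate, weights) at `target`; weights sum to one."""
    n = len(points)
    # Rows 0..n-1: sum_j w_j G(d_ij) + mu = G(d_i0); last row: sum_j w_j = 1
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights = sol[:n]  # sol[n] is the Lagrange multiplier mu
    estimate = sum(w * z for w, z in zip(weights, values))
    return estimate, weights

# Made-up data: three samples, estimate at an unobserved location
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
values = [10.0, 20.0, 30.0]
estimate, weights = ordinary_kriging(points, values, (1.0, 1.0))
```

Because the constraint forces the weights to sum to one, the denominator in the weighted-average formula is 1 and the estimate is simply the weighted sum of the data values.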
Summary • Topics for GIS (for Science) • Kriging – weighting using aspects of the spatial characteristics of the data • Goal to begin to know when to use this technique.. • For learning purposes remember: • Demonstrate proficiency in using geospatial applications and tools (commercial and open-source). • Present verbally relational analysis and interpretation of a variety of spatial data on maps. • Demonstrate skill in applying database concepts to build and manipulate a spatial database, SQL, spatial queries, and integration of graphic and tabular data. • Demonstrate intermediate knowledge of geospatial analysis methods and their applications.
Projects • Projects should fall within the scope of GIS. • This means that the work must go beyond just making a map. • You should think about the spatial relationships of your data. • A good approach is to start with a hypothesis, think of a way to test the hypothesis, find or collect the necessary data, and do the analysis. • You do not have to come up with a positive result; disproving the hypothesis is just as good.
Projects • The topic of the project is up to you, within the guidelines above. • Grading will be according to the following: Introduction (10%); Data description, including uncertainties (15%); Analysis (20%); Error analysis (20%); Graphical presentation (20%); Conclusions and recommendations (15%)
Some example topics: • Rate of drug arrests vs. number of officers in different regions • Superfund sites vs. economic conditions • Lake George sedimentation and organic carbon • Crime areas on campus • Regional variation of DDT in fish • Toxic releases in Albany Co. • Analysis of earthquake insurance risks • Geology, radon, and lung cancer • Beach water quality in San Diego • Landfills vs. Income levels • Solar energy in NY state • Use of GIS in paleontology
Friday Mar. 2 • Lab session – continued – query, interpolation, etc. • Discussion of your initial project ideas • And MapInfo examples • Thematic maps MapInfo master: thematic maps while U wait
Reading for this week • Geostatistics References • Kriging (Wikipedia) • Bohling on Kriging • Bohling on Variograms • For Friday • Help with SQL (Structured Query Language) • Chapter 8 in MapInfo User Guide (10.5): Selecting and Querying Data (p. 193-228)
Next classes • Next Tuesday, March 6, Analysis of continuous surfaces (filtering, slopes, shading) • Friday, March 9 • Spring break…