
Statistics in WR: Session 20


Presentation Transcript


  1. Statistics in WR: Session 20
     Introduction to Spatial Statistics
     Ernest To

  2. Outline
     • Basics of spatial statistics
     • Kriging
     • Application of spatial-temporal statistics (gravity currents in Corpus Christi Bay, CCBay)
     Ernest To 20090408

  3. Basics

  4. Consider the following scenario
     • Two river stations, A and B, measure dissolved oxygen (DO).
     • At station A:
       • mean DO = µA = 5 mg/L
       • std dev at Station A = σA = 2 mg/L
     • At station B:
       • mean DO = µB = 5 mg/L
       • std dev at Station B = σB = 2 mg/L
     • Correlation between measurements at stations A and B = ρAB = 0.5.

  5. New data!
     • We collected a DO measurement of 2 mg/L at Station A.
     • What are the updated mean (µB|XA) and standard deviation (σB|XA) at Station B? (Assume that the DO distributions are normal.)
     • Station A: µA = 5 mg/L, σA = 2 mg/L, new sample XA = 2 mg/L
     • Station B: µB = 5 mg/L, σB = 2 mg/L, µB|XA = ?, σB|XA = ?

  6. Let’s sketch out the distributions
     • Distributions at A and B (assume normal)
     • Joint distribution at A and B
     • µA = 5 mg/L, σA = 2 mg/L
     • µB = 5 mg/L, σB = 2 mg/L
     [Figure: marginal pdfs f(xA) and f(xB), and the joint pdf f(xA,xB) over the axes XA and XB]

  7. Marginal and joint distributions
     • µA = 5 mg/L, σA = 2 mg/L
     • µB = 5 mg/L, σB = 2 mg/L
     [Figure: marginal pdfs f(xA) and f(xB) drawn along the XA and XB axes of the joint pdf f(xA,xB)]

  8. How does ρAB affect the shape of the joint distribution?
     [Figure: scatter plots of XA vs XB and the joint distribution f(xA,xB) of XA and XB, for ρAB = 0.99, ρAB = -0.99, ρAB = 0.5 and ρAB = 0]
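The effect of ρAB on the scatter can be explored numerically. The sketch below draws samples from a binormal distribution with the station parameters used here (µ = 5 mg/L, σ = 2 mg/L) for the four correlations on the slide. The function name, sample size and seed are illustrative assumptions, not part of the presentation.

```python
import numpy as np

def sample_joint(rho, mu=5.0, sigma=2.0, n=20000, seed=0):
    """Draw n samples of (XA, XB) from a binormal distribution with correlation rho."""
    cov = np.array([[sigma**2, rho * sigma**2],
                    [rho * sigma**2, sigma**2]])
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal([mu, mu], cov, size=n)

sample_corr = {}
for rho in (0.99, -0.99, 0.5, 0.0):
    xa, xb = sample_joint(rho).T
    # The sample correlation should land close to the rho used to build the covariance.
    sample_corr[rho] = np.corrcoef(xa, xb)[0, 1]
    print(f"rho = {rho:+.2f}, sample correlation = {sample_corr[rho]:+.3f}")
```

Plotting `xa` against `xb` for each case would show a tight line for |ρAB| near 1 and a round cloud for ρAB = 0.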

  9. Bayesian conditioning
     • PRIOR STAGE: the prior pdf is the joint distribution of XA and XB.
     • CONDITIONALIZATION STAGE: observed data (xA = 2 mg/L) is used to update the distribution.
     • POSTERIOR STAGE: a conditional pdf for XB is generated.

  10. Conditional pdf
      If the prior pdf is binormal, the conditional pdf of XB|XA is also normal, with:
      Mean: $\mu_{B|X_A} = \mu_B + \rho_{AB} \dfrac{\sigma_B}{\sigma_A} (x_A - \mu_A)$
      Variance: $\sigma^2_{B|X_A} = \sigma_B^2 (1 - \rho_{AB}^2)$
      • The variance is independent of the values of XA or XB (homoscedasticity).
      • The expected value of the conditional pdf is a linear function of the conditioning data.

  11. Back to the problem
      Updated mean and std. dev. at Station B:
      • Station A: µA = 5 mg/L, σA = 2 mg/L, new sample XA = 2 mg/L
      • Station B: µB = 5 mg/L, σB = 2 mg/L
      • Mean: µB|XA = 3.5 mg/L
      • Std. dev.: σB|XA = 1.7 mg/L
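The numbers on this slide can be checked with the standard result for a binormal prior: the conditional mean is µB + ρAB (σB/σA)(xA - µA) and the conditional standard deviation is σB sqrt(1 - ρAB²). A minimal sketch follows; the function name is an illustrative choice.

```python
import math

def condition_bivariate_normal(mu_a, sigma_a, mu_b, sigma_b, rho, x_a):
    """Mean and std. dev. of XB given XA = x_a for a binormal prior."""
    mean = mu_b + rho * (sigma_b / sigma_a) * (x_a - mu_a)
    std = sigma_b * math.sqrt(1.0 - rho**2)
    return mean, std

# Values from the two-station example: mu = 5 mg/L, sigma = 2 mg/L, rho = 0.5, xA = 2 mg/L
mean_b, std_b = condition_bivariate_normal(5.0, 2.0, 5.0, 2.0, 0.5, 2.0)
print(f"{mean_b:.1f} {std_b:.1f}")  # prints "3.5 1.7"
```

Note that the standard deviation does not depend on the observed value xA, only on ρAB.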

  12. Can we do the same for any two points on the river?
      Yes we can, but only under the following conditions:
      • Normality
      • 2nd-order stationarity:
        • Mean does not change with location
        • Variance does not change with location
      • We know the mean and variance (here µ = 5 mg/L and σ = 2 mg/L at both A and B).
      • We have a function that determines the correlation between two locations.

  13. Modeling correlation
      In spatial statistics, correlation is modeled as a function of the separation distance between two points:
      $\rho = f(h)$
      where h = separation distance (aka lag).
      Most of the time, correlation decreases with distance (things that are closer together tend to be more correlated with each other).

  14. Estimating correlation model from data
      Imagine the case where we have a smattering of data along an axis. Any given pair of data points, i and j, with measured values Zi and Zj, will have two properties:
      1. The semivariance: $\gamma_{ij} = \tfrac{1}{2} (Z_i - Z_j)^2$
      2. The separation distance: $h_{ij}$

  15. Estimating correlation model from data
      We can plot the semivariance, γ, of all possible pairs against the lag, h. This gives us a variogram.
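As a rough illustration of this step, the sketch below computes the lag and the semivariance 0.5 (Zi - Zj)² for every pair of points along one axis. The positions and measured values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def variogram_cloud(x, z):
    """Lag and semivariance for every unordered pair of 1-D data points."""
    x = np.asarray(x, dtype=float)
    z = np.asarray(z, dtype=float)
    i, j = np.triu_indices(len(x), k=1)       # each pair (i, j) with i < j, counted once
    lags = np.abs(x[i] - x[j])                # separation distance h_ij
    semivariances = 0.5 * (z[i] - z[j]) ** 2  # gamma_ij
    return lags, semivariances

# Hypothetical positions along the axis and measured values
x = [0.0, 1.0, 3.0, 6.0]
z = [5.0, 4.0, 6.0, 3.0]
lags, gammas = variogram_cloud(x, z)
for h, g in zip(lags, gammas):
    print(f"h = {h:.1f}, gamma = {g:.2f}")
```

Plotting `gammas` against `lags` gives the scatter that the variogram model is later fitted through.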

  16. Estimating correlation model from data
      We can fit a curve through the semivariogram to model the semivariance as a function of the lag. This is the variogram model.

  17. Estimating correlation model from data
      We can fit a curve through the semivariogram to model the semivariance as a function of the lag. This is the variogram model.
      • Sill: the value at which the fitted semivariance levels off.
      • Range: the lag at which the sill is reached.

  18. Estimating correlation model from data
      Assuming that mean and variance do not change with location (assumption of stationarity), the variogram model is related to the covariance model by the equation:
      $\gamma(h) = \sigma^2 - C(h)$
      where σ² is the variance and C(h) is the covariance at lag h.

  19. Estimating correlation model from data
      Assuming that variance does not change with location (assumption of stationarity), the correlation model is related to the covariance model by the equation:
      $\rho(h) = \dfrac{C(h)}{\sigma^2}$
      [Figure: ρ(h) plotted against the lag, with the vertical axis running from 0 to 1]
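Under stationarity, the chain from variogram to covariance to correlation is C(h) = σ² - γ(h) and ρ(h) = C(h)/σ². The sketch below applies it to a spherical variogram model. The choice of a spherical model without nugget and the range of 8 distance units are assumptions made for illustration; only σ² = 4 (mg/L)² comes from the running example.

```python
import numpy as np

def spherical_variogram(h, sill, vrange):
    """Spherical variogram without nugget: rises from 0 and equals the sill for h >= vrange."""
    s = np.minimum(np.asarray(h, dtype=float) / vrange, 1.0)
    return sill * (1.5 * s - 0.5 * s**3)

def covariance_from_variogram(gamma, variance):
    """C(h) = sigma^2 - gamma(h), valid under second-order stationarity."""
    return variance - gamma

def correlation_from_covariance(cov, variance):
    """rho(h) = C(h) / sigma^2."""
    return cov / variance

variance = 4.0                      # sigma^2 = (2 mg/L)^2
h = np.array([0.0, 2.0, 4.0, 8.0])  # hypothetical lags
gamma = spherical_variogram(h, sill=variance, vrange=8.0)
cov = covariance_from_variogram(gamma, variance)
rho = correlation_from_covariance(cov, variance)
print(np.round(rho, 4))
```

The correlation starts at 1 for h = 0 and drops to 0 once the lag reaches the range, mirroring the variogram climbing to its sill.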

  20. How does the correlation model affect the estimation?
      [Figure: for ρAB = 0, ρAB = 0.5 and ρAB = 0.99, scatter plots of XA vs XB, the joint distribution f(xA,xB) of XA and XB, and the conditional distribution of XB|XA; an arrow marks the direction of increasing h]

  21. Kriging

  22. Multivariable case
      What if we have more than one location providing conditioning data? (Assume distributions are STILL normal at all locations.)
      • At stations A1, A2, A3, A4:
        • µA1 = µA2 = µA3 = µA4 = 5 mg/L
        • σA1 = σA2 = σA3 = σA4 = 2 mg/L
      • At station B:
        • mean DO = µB = 5 mg/L
        • std dev at Station B = σB = 2 mg/L
      • ρ = f(h) = 0.0125h² - 0.225h + 1

  23. Modeling correlation
      ρ = f(h) = 0.0125h² - 0.225h + 1, with h the distance along the river (in hundred meters).
      The stations lie along the river in the order B, A4, A3, A2, A1, each 2 units (200 m) from the next.
      From the correlation model:
      • ρA1B = 0.0, ρA2B = 0.1, ρA3B = 0.3, ρA4B = 0.6
      • ρA1A2 = 0.6, ρA1A3 = 0.3, ρA1A4 = 0.1, ρA2A3 = 0.6, ρA2A4 = 0.3, ρA3A4 = 0.6
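The tabulated correlations can be reproduced by evaluating the model at the pairwise distances. The sketch below assumes the layout read off the slide: B at the origin and A4, A3, A2, A1 at 2, 4, 6 and 8 hundred meters. The dictionary and variable names are illustrative.

```python
import numpy as np

def correlation_model(h):
    """rho = f(h) = 0.0125 h^2 - 0.225 h + 1, with h in hundred meters."""
    h = np.asarray(h, dtype=float)
    return 0.0125 * h**2 - 0.225 * h + 1.0

# Assumed positions along the river (hundred meters), B taken as the origin
positions = {"A1": 8.0, "A2": 6.0, "A3": 4.0, "A4": 2.0, "B": 0.0}
names = list(positions)
coords = np.array([positions[n] for n in names])

lags = np.abs(coords[:, None] - coords[None, :])  # all pairwise separation distances
rho = correlation_model(lags)                      # full correlation matrix

for a in range(len(names)):
    for b in range(a + 1, len(names)):
        print(f"rho {names[a]}-{names[b]} = {abs(rho[a, b]):.1f}")
```

The quadratic is only used here over lags from 0 to 8; it is not a valid correlation model for arbitrarily large h.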

  24. Dealing with multiple variables
      Divide locations into two groups:
      • The vector, $\mathbf{X}_A = [X_{A1}, X_{A2}, X_{A3}, X_{A4}]^T$, representing the set of random variables at the locations contributing the conditioning data.
      • The variable, $X_B$, representing the random variable at the point of estimation.

  25. Concept
      1. If the individual distributions are normal, the joint pdf is taken to be multinormal.
      2. Group variables into two: one for points with data, one for the point of estimation.
      3. Intersect the prior pdf with the conditioning data to get the conditional pdf.
      [Figure: prior pdf over XB and XA1, XA2, XA3, XA4, and the resulting conditional pdf]

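  26. Dealing with multiple variables
      The updated mean and variance of the distribution at Station B are given by:
      Mean: $\mu_{B|\mathbf{X}_A} = \mu_B + \mathbf{C}_{BA} \mathbf{C}_{AA}^{-1} (\mathbf{x}_A - \boldsymbol{\mu}_A)$
      Variance: $\sigma^2_{B|\mathbf{X}_A} = \sigma_B^2 - \mathbf{C}_{BA} \mathbf{C}_{AA}^{-1} \mathbf{C}_{AB}$
      Where:
      • $\mathbf{x}_A$ = observed values at A1 to A4, and $\boldsymbol{\mu}_A$ = their prior means
      • $\mathbf{C}_{AA}$ = covariance matrix among the data locations
      • $\mathbf{C}_{AB} = \mathbf{C}_{BA}^T$ = covariances between the data locations and Station B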

  27. Equations in multivariable case are more generalized
      • Recall the two-variable case, where the conditional pdf depends on a single correlation, ρAB.
      • The multivariable case takes into account:
        • Correlation between data locations and estimated location
        • Correlation among data locations
      • This is the most fundamental form of kriging, i.e. Simple Kriging.

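  28. Plug and Chug
      • Recall that Cov(A,B) = ρAB σA σB
      • Compute data-to-data covariance from the correlations (rows and columns ordered A1, A2, A3, A4; σ² = 4 (mg/L)²):
        $\mathbf{C}_{AA} = 4 \begin{bmatrix} 1 & 0.6 & 0.3 & 0.1 \\ 0.6 & 1 & 0.6 & 0.3 \\ 0.3 & 0.6 & 1 & 0.6 \\ 0.1 & 0.3 & 0.6 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 2.4 & 1.2 & 0.4 \\ 2.4 & 4 & 2.4 & 1.2 \\ 1.2 & 2.4 & 4 & 2.4 \\ 0.4 & 1.2 & 2.4 & 4 \end{bmatrix}$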

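  29. Plug and Chug
      • Compute data-to-estimation-point covariance from the correlations (ordered A1, A2, A3, A4):
        $\mathbf{C}_{AB} = 4 \begin{bmatrix} 0.0 & 0.1 & 0.3 & 0.6 \end{bmatrix}^T = \begin{bmatrix} 0 & 0.4 & 1.2 & 2.4 \end{bmatrix}^T$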

  30. Plug and Chug
      The data are combined through a set of weights.
      Note: The weights attributed to each station are determined by the prior (joint distribution) among them.

  31. Plug and Chug
      Weights = [λ1, λ2, λ3, … λn]
      Note: The weights attributed to each station are determined by the prior (joint distribution) among them.
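A minimal NumPy sketch of the plug-and-chug steps for simple kriging along the river follows: build the covariances from the correlation model ρ = 0.0125h² - 0.225h + 1, solve for the weights, then update the mean and standard deviation at B. The station coordinates assume B at the origin with A4 to A1 at 2, 4, 6 and 8 hundred meters. The DO values at A1 to A4 are hypothetical placeholders, not the values used in the presentation, so the printed estimate is illustrative only.

```python
import numpy as np

def correlation_model(h):
    """rho = f(h) = 0.0125 h^2 - 0.225 h + 1, with h in hundred meters."""
    return 0.0125 * h**2 - 0.225 * h + 1.0

def simple_kriging(coords_data, values, coord_target, mean, sigma):
    """Simple kriging in 1-D with known, constant mean and standard deviation."""
    coords_data = np.asarray(coords_data, dtype=float)
    values = np.asarray(values, dtype=float)
    lag_dd = np.abs(coords_data[:, None] - coords_data[None, :])  # data-to-data lags
    lag_dt = np.abs(coords_data - coord_target)                   # data-to-target lags
    c_dd = sigma**2 * correlation_model(lag_dd)   # Cov = rho * sigma * sigma
    c_dt = sigma**2 * correlation_model(lag_dt)
    weights = np.linalg.solve(c_dd, c_dt)         # solves C_AA w = C_AB
    est_mean = mean + weights @ (values - mean)
    est_var = sigma**2 - weights @ c_dt
    return weights, est_mean, np.sqrt(est_var)

coords = [8.0, 6.0, 4.0, 2.0]  # A1, A2, A3, A4 (assumed layout, B at 0)
values = [2.0, 3.0, 4.0, 6.0]  # HYPOTHETICAL DO measurements in mg/L
weights, mean_b, std_b = simple_kriging(coords, values, 0.0, mean=5.0, sigma=2.0)
print("weights:", np.round(weights, 3))
print(f"updated mean = {mean_b:.2f} mg/L, updated std. dev. = {std_b:.2f} mg/L")
```

Solving the linear system with `np.linalg.solve` avoids forming the matrix inverse explicitly. As in the two-station case, the updated standard deviation depends only on the station geometry, not on the measured values.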


  34. Results from Simple Kriging
      The updated mean and standard deviation of the distribution at Station B are:
      Mean:
      Standard deviation:

  35. Other forms of kriging
      • Ordinary kriging (OK)
        • Does not require the mean to be known
        • Assumes that the mean is constant and is somewhere in the range of the conditioning data
      • Universal kriging (UK)
        • Does not require the mean to be known, nor require it to be constant
        • User specifies a model for the trend in the mean. UK will then fit the model to the data.
      • Indicator kriging (IK)
        • Handles binary variables (0 or 1)
        • Has the ability to take care of non-normality in data through iterative application
      • Co-kriging (CK)
        • Takes into account a related secondary variable to help estimate the primary variable

  36. Extension to 2D, 3D
      • The lag can be represented by the Euclidean distance between 2 points
        • So the covariance model of the form, C = f(h), can still be used
      • Variables may be more correlated in one direction than the other (anisotropy)
        • A linear transformation can be performed to transform the distances so the correlation distance is the same in all directions (isotropy)
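One common way to carry out such a linear transformation in 2D is to rotate the separation vector onto the direction of strongest correlation and then stretch the perpendicular component by the ratio of the two correlation ranges. The sketch below shows this under assumed parameters; the function name, the angle convention (counterclockwise from the x axis) and the ratio of 0.5 are illustrative, not taken from the presentation.

```python
import numpy as np

def isotropic_lag(p, q, angle_deg, ratio):
    """Equivalent isotropic lag between 2-D points p and q.

    angle_deg: direction of the major (longest-range) axis, counterclockwise from x.
    ratio: minor range divided by major range (0 < ratio <= 1).
    """
    theta = np.radians(angle_deg)
    rot = np.array([[np.cos(theta), np.sin(theta)],
                    [-np.sin(theta), np.cos(theta)]])
    d = rot @ (np.asarray(q, dtype=float) - np.asarray(p, dtype=float))
    d[1] /= ratio  # stretch the minor-axis component so both axes share one range
    return float(np.hypot(d[0], d[1]))

# With the major axis along x and ratio 0.5, one unit across the major axis
# counts as two units of lag, while one unit along it still counts as one.
print(isotropic_lag((0, 0), (1, 0), angle_deg=0.0, ratio=0.5))
print(isotropic_lag((0, 0), (0, 1), angle_deg=0.0, ratio=0.5))
```

The transformed lag can then be passed to a single isotropic model C = f(h).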

  37. Extension to space-time
      • For space and time, there is no standard space-time metric.
      • Folding the time lag into a single Euclidean-style space-time distance is not always correct, because the temporal and spatial axes are not always orthogonal to each other.
        • Processes that happen in time usually have some dependency on processes that happen in space. (They are not independent.)
      • A separate temporal lag term is usually used.
        • The covariance function takes on the form $C = f(h, \tau)$, where h is the spatial lag and τ is the temporal lag.

  38. Application (Gravity currents in Corpus Christi Bay)

  39. Sensors in Corpus Christi Bay
      [Figure: aerial photo from Google Earth showing Corpus Christi Bay, Oso Bay, Laguna Madre and the Gulf of Mexico, with TCOON stations, TCEQ stations, SERF stations, HRI stations and USGS gages marked]


  42. Selecting a study area
      [Figure: bathymetry map marking depressions, ridges and a channel, with candidate areas Oso Bay, West Laguna Madre and East Laguna Madre; legend (m above Mean High Water Level): -5.0, -4.5, -4.0, -3.5, -2.5, -2.0, -1.5, -1.0]

  43. Downstream of East Laguna Madre
      • Plume tracking survey, July 14 to 17, 2006 (while the gravity current was on the move): Ben Hodges, University of Texas at Austin
      • Water quality data, July 12 and 18, 2006 (at birth and demise of the gravity current): Paul Montagna, Texas A&M University, Corpus Christi

  44. Synthesis of data
      [Figure: salinity profiles (salinity vs depth) collected at various locations and times (t = 0, 1, 2, 3) are synthesized into a time history of the gravity current along the direction of flow]

  45. Data Preparation
      1. Salinity data from HRI are acquired using HydroGet (a GIS web service client) and combined with plume tracking data.
      2. Data locations are projected onto a reference line following the general direction of flow.
      • Space-time kriging is performed in 3 dimensions:
        • X = Longitudinal measure (meters from origin point)
        • Y = Time (days since 7/12/2006)
        • Z = Elevation (meters from water surface)
      [Figure: HydroGet interface, acquired data in the ArcHydro II Time Series Table, HRI stations, and the reference line with its origin at x = 0 m]
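The projection in step 2 can be sketched as follows, assuming for simplicity that the reference line is a single straight segment (the line used in the study may well have been a polyline). The function name, coordinates and direction vector are hypothetical.

```python
import numpy as np

def longitudinal_measure(points, origin, direction):
    """Signed distance of each point along a straight reference line.

    points: (n, 2) array of planar coordinates in meters.
    origin: point on the line where x = 0.
    direction: vector giving the general direction of flow (any length).
    """
    points = np.asarray(points, dtype=float)
    origin = np.asarray(origin, dtype=float)
    unit = np.asarray(direction, dtype=float)
    unit = unit / np.linalg.norm(unit)  # normalize so the result is in meters
    return (points - origin) @ unit     # orthogonal projection onto the line

# Hypothetical sampling locations (meters) and a reference line heading along +x
stations = [[3.0, 4.0], [0.0, 5.0], [10.0, -2.0]]
x_measure = longitudinal_measure(stations, origin=[0.0, 0.0], direction=[2.0, 0.0])
print(x_measure)
```

Each sample then carries three kriging coordinates: this longitudinal measure, the time in days, and the elevation.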

  46. Variogram along direction of flow
      Gaussian variogram model, where:
      • h = lag distance along direction of flow
      • C0 = nugget = 2 psu²
      • C1 = sill = 3.6 psu²
      • a = range = 6000 m

  47. Variogram along direction of flow
      Gaussian variogram model, where:
      • h = lag distance along direction of flow
      • C0 = nugget = 2 psu²
      • C1 = sill = 3.6 psu²
      • a = range = 6000 m
      [Figure labels: sill, nugget, range]

  48. Variogram along depth
      Gaussian variogram model, where:
      • h = lag distance along depth
      • C0 = nugget = 0 psu²
      • C1 = sill = 3.6 psu²
      • a = range = 1.7 m

  49. Variogram along time axis
      Spherical variogram model, where:
      • h = lag along the time axis
      • C0 = nugget = 0 psu²
      • C1 = sill = 3 psu²
      • a = range = 1 day
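The three fitted models can be evaluated with the parameters listed on these slides. The sketch below makes two assumptions that the transcript does not settle: C1 is treated as the contribution added on top of the nugget C0, and the Gaussian model uses the "practical range" convention with a factor of 3 in the exponent (other software omits this factor). The spherical form is the standard one.

```python
import numpy as np

def gaussian_variogram(h, nugget, sill, vrange):
    """ASSUMED form: gamma(h) = C0 + C1 * (1 - exp(-3 h^2 / a^2)) for h > 0, gamma(0) = 0."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + sill * (1.0 - np.exp(-3.0 * (h / vrange) ** 2))
    return np.where(h > 0, gamma, 0.0)

def spherical_variogram(h, nugget, sill, vrange):
    """gamma(h) = C0 + C1 * (1.5 h/a - 0.5 (h/a)^3) for 0 < h <= a, C0 + C1 beyond a."""
    h = np.asarray(h, dtype=float)
    s = np.minimum(h / vrange, 1.0)
    gamma = nugget + sill * (1.5 * s - 0.5 * s**3)
    return np.where(h > 0, gamma, 0.0)

# Parameters from the slides (psu^2; ranges in meters, meters and days respectively)
along_flow = gaussian_variogram(np.array([0.0, 3000.0, 6000.0]), nugget=2.0, sill=3.6, vrange=6000.0)
along_depth = gaussian_variogram(np.array([0.0, 0.85, 1.7]), nugget=0.0, sill=3.6, vrange=1.7)
along_time = spherical_variogram(np.array([0.0, 0.5, 1.0, 2.0]), nugget=0.0, sill=3.0, vrange=1.0)
print(np.round(along_flow, 3), np.round(along_depth, 3), np.round(along_time, 3))
```

The very different ranges (6000 m, 1.7 m, 1 day) are what make the three kriging axes strongly anisotropic.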

  50. Interpolation results
      [Figure: longitudinal profiles on 7/12/2006 18:00 and 7/13/2006 18:00; axes: x = distance to origin point, y = time, z = elevation; legend: 37 – 40 psu, 40 – 42 psu, 42 – 43 psu, 42 – 44 psu, 44 – 46 psu]
