Correlation Modeling

Correlation Modeling • Find a “Response” between predictor and response (field sample) variables • Environmental Modeling • Finding a response between environmental variables and a field measurement • Examples: • Habitat maps, biomass, board feet, etc. • Also applies to: • Social issues, economic questions, transportation, engineering, public health, security, disasters, and combinations

Linear Regression

Correlation Modeling • Creates a model in N-Dimensional “Hyper-Space” • Models vary by: • Predictor variables • Response variables • Mathematics used to create the model • Statistics used to optimize parameters • Options for model evaluation

Multiple Linear Regression

Linear Regression: 2 Predictors (figure source: Mathworks.com)
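A regression with two predictors fits a plane, y = b0 + b1*x1 + b2*x2, through the samples. The sketch below is one minimal way to do that with least squares and the normal equations, using only the Python standard library. The function names and the precipitation, temperature, and height numbers are hypothetical illustrations, not data from this presentation.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns [b0, b1, b2]."""
    n = len(y)
    # Columns of the design matrix X: intercept, first and second predictor.
    cols = [[1.0] * n, [float(v) for v in x1], [float(v) for v in x2]]
    # Normal equations: (X^T X) b = X^T y.
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    return solve_linear_system(xtx, xty)


# Hypothetical samples: height generated from a known plane, so the fit
# should recover coefficients close to 5, 0.02, and -0.5.
precip = [500, 800, 1200, 1500, 2000]
temp = [12, 10, 9, 7, 8]
height = [5 + 0.02 * p - 0.5 * t for p, t in zip(precip, temp)]
b0, b1, b2 = fit_two_predictors(precip, temp, height)
print(b0, b1, b2)
```

For real data a statistics package (R's lm, for example) would also report standard errors and residual diagnostics; this sketch only shows the fitting step.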

Non-Linear Regression

Correlation Methods • Continuous Regression: • Linear Regression • Generalized Linear Models (GLMs) • Generalized Additive Models (GAMs) • Categorical Regression (trees): • Regression Trees • Classification and regression trees (CART) • Machine Learning: • Maximum Entropy (Maxent) • NPMR, HEMI, BRTs, etc.

Brown Shrimp Size • Add graph from work

Exponential Phenomenon

Brown Shrimp in GOM

Spatial Modeling Process • Measurements (spreadsheets) • Predictor layers: temperature, precipitation • Modeling algorithm • Model parameters • Map generation • Habitat suitability map • Figure: Habitat Suitability Map for Purple Loosestrife by Catherine Jarnevich

Workflow diagram (labels) • Douglas-Fir sample data • Precip • Extract • To Points • Attributes • Text File • Create the Model • Model “Parameters” • Prediction • To Raster

ArcGIS Commands • Extract by Mask: Crop raster with polygon • Copy Raster: Change raster data type • Resample: Change raster resolution • Workflow diagram (labels): Douglas-Fir sample data, Precip, Extract Multi Values to Points, Attributes, Text File, Create the Model, Model “Parameters”, Prediction, Raster To Points, Point To Raster

Sample Data • Original: • Occurrences (Presence) • Measured value: continuous or categorical • Date & Time • Uncertainty • Processed (aggregated): • Min, Max, Mean, Std. Dev., Range • “Filtered”

Aggregating Sample Data • Occurrences to Density • Gridded? • Height or average height?
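As a rough illustration of turning occurrences into gridded summaries, the sketch below bins point samples into square cells and computes the statistics listed above (min, max, mean, standard deviation, range) plus a simple density. The function name, the (x, y, value) tuple layout, and the cell size are assumptions made for this example.

```python
import math
import statistics
from collections import defaultdict


def aggregate_to_grid(samples, cell_size):
    """Group (x, y, value) samples into square cells and summarize each cell."""
    cells = defaultdict(list)
    for x, y, value in samples:
        # Cell index = integer column and row of the cell containing the point.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append(value)

    summary = {}
    for key, values in cells.items():
        summary[key] = {
            "count": len(values),
            # Occurrences per unit area of the cell.
            "density": len(values) / cell_size ** 2,
            "min": min(values),
            "max": max(values),
            "mean": statistics.fmean(values),
            # A sample standard deviation needs at least two values.
            "std_dev": statistics.stdev(values) if len(values) > 1 else 0.0,
            "range": max(values) - min(values),
        }
    return summary


# Hypothetical tree heights (m) at projected coordinates (km), 1 km cells.
trees = [(0.5, 0.5, 10.0), (0.7, 0.2, 20.0), (1.5, 0.5, 30.0)]
for cell, stats in sorted(aggregate_to_grid(trees, cell_size=1.0).items()):
    print(cell, stats)
```

Whether to model the per-cell count (density) or a per-cell mean such as average height is the choice the slide raises; the summary keeps both so either can be used as the response.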

Doug-Fir Height vs. Precip.

Douglas Fir Height

Predictor Variables • Distance to: water, roads, cities? • Temperature, precipitation • Elevation, aspect, slope, absolute aspect • Soil types • Other species? • Distance to humans? • Census factors: income, age, etc.

Predictor Layers • Means, mins, maxes • Range of values • Heterogeneity • Spatial layers: • Distance to… • Topography: elevation, slope, aspect

Characterizing Uncertainty • Where did the data come from? • What process has it gone through? • Collection methods • Equipment • Protocol • Processing • Transcription errors • Investigate to develop uncertainty estimates: • Documentation, contact those involved http://museum.sdsmt.edu

Data Qualification • What is the nature of the data? • Is the data good enough for the task? • Data: • Samples of the phenomenon we are going to predict (i.e. the response variable) • Predictor variables • Tools: • Plotting: Scattergrams, histograms • Mapping: Visual inspection • Analysis: Lots!

Gross Errors • Lat/Lon: • Reversed • 0, names, dates, etc. • Dates: • Extended in databases • Measurements: • Inconsistent units • Inconsistent protocols • What can you expect from a field team?
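Some of these gross errors can be screened for automatically. The sketch below flags records whose coordinates are non-numeric, are exactly 0, 0, or fall outside valid latitude and longitude ranges, and it guesses at reversed values. The record layout (dictionaries with "lat" and "lon" keys) and the function name are assumptions for illustration; a check against the study area's bounding box would catch more reversals than this does.

```python
def flag_gross_errors(records):
    """Return (record, problems) pairs for records that fail basic checks."""
    flagged = []
    for rec in records:
        try:
            lat = float(rec["lat"])
            lon = float(rec["lon"])
        except (KeyError, TypeError, ValueError):
            # Names, dates, or blanks in a coordinate column end up here.
            flagged.append((rec, ["missing or non-numeric coordinate"]))
            continue

        problems = []
        if lat == 0.0 and lon == 0.0:
            problems.append("coordinates are 0, 0")
        if abs(lat) > 90.0:
            if abs(lon) <= 90.0:
                # The "longitude" would be a valid latitude: likely swapped.
                problems.append("latitude out of range; lat/lon possibly reversed")
            else:
                problems.append("latitude out of range")
        if abs(lon) > 180.0:
            problems.append("longitude out of range")
        if problems:
            flagged.append((rec, problems))
    return flagged


# Hypothetical records: one good, one reversed, one zeroed, one with a name.
records = [
    {"lat": 44.5, "lon": -123.3},
    {"lat": -123.3, "lon": 44.5},
    {"lat": 0, "lon": 0},
    {"lat": "Corvallis", "lon": -123.3},
]
for rec, problems in flag_gross_errors(records):
    print(rec, problems)
```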

Occurrences of Polar Bears From The Global Biodiversity Information Facility (www.gbif.org, 2011)

Temporal Issues • Divide data into months, seasons, years, decades. • Consistent between predictors and response • Extract predictors as close to sample location and dates as possible • Use the “best” predictor layers

Samples and Predictors • As close to field measurements as possible • Clean and aggregate data as needed • Documenting as you go • Estimate overall uncertainty • Answer the question: • What spatial, temporal, and measurement scales are appropriate to model at given the data?

What’s the Impact on Models?

Basic Tools • Histograms: What is the distribution of values (range and shape)? • Scattergrams: What is the relationship between response and predictor variables, and between predictor variables? • Q-Q plots: Are the residuals normally distributed?

CONUS Annual Precip.

Predictor Variables

Min Temp of Coldest Month

Histograms hist(Temp,breaks=400)
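The R call above draws the histogram directly. For readers working in Python, the sketch below shows roughly what happens underneath: values are counted into equal-width bins between the minimum and maximum. The function name and sample temperatures are hypothetical, and note that R treats a single number passed to breaks as a suggestion, so its bin edges may differ from these.

```python
def histogram(values, bins):
    """Count values into `bins` equal-width bins; returns (low, high, count)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: a single bin holds everything.
        return [(lo, hi, len(values))]
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return [(lo + i * width, lo + (i + 1) * width, c) for i, c in enumerate(counts)]


# Hypothetical minimum temperatures (degrees C) with one outlier.
temps = [1, 2, 2, 3, 10]
for low, high, count in histogram(temps, bins=3):
    print(f"{low:5.1f} to {high:5.1f}: {'#' * count}")
```

Printing the counts as text is enough to see range and shape, including outliers such as the single value in the last bin.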

Model Optimization & Selection • Modeling approach • Predictor Selection • Parameter estimation • Validation: • Against sub-sample of data • Against new dataset • Parameter sensitivity • Uncertainty estimation

Model Approach • Model Selection: • There are many different modeling methods, and some methods have many options • Run a wide variety and select the one with the best (lowest) AIC/AICc
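To make the AIC/AICc comparison concrete, the sketch below computes both for least-squares models and picks the lowest AICc. It assumes Gaussian errors, in which case AIC can be written (up to an additive constant shared by all models fit to the same data) as n*ln(RSS/n) + 2k. Here k is the number of estimated parameters; conventions differ on whether the error variance is counted, so use one convention for every candidate. The function names and candidate numbers are hypothetical.

```python
import math


def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit: n*ln(RSS/n) + 2k (constant dropped)."""
    return n * math.log(rss / n) + 2 * k


def aicc(aic, n, k):
    """Small-sample correction: AIC + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def best_by_aicc(candidates, n):
    """candidates maps model name -> (rss, k); returns the lowest-AICc name."""
    scores = {
        name: aicc(aic_least_squares(rss, n, k), n, k)
        for name, (rss, k) in candidates.items()
    }
    return min(scores, key=scores.get)


# Hypothetical candidates fit to the same 30 samples. The quadratic has a
# slightly smaller residual sum of squares but pays for an extra parameter.
candidates = {"linear": (12.0, 3), "quadratic": (11.5, 4)}
print(best_by_aicc(candidates, n=30))
```

AIC values are only comparable between models fit to the same response data, which is why the sample count n is shared across candidates here.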

Predictor Selection • Which predictors are the most important? • Jackknifing • Remove each predictor, rerun model • All combinations of possible predictors?
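Both strategies on this slide are loops around whatever model is being fit. The sketch below shows the loop structure only: it takes a user-supplied score function (a placeholder standing in for "fit the model with these predictors and return a performance measure, higher is better") and reports how much the score drops when each predictor is removed. The function names, the toy scoring function, and its weights are hypothetical.

```python
from itertools import combinations


def jackknife_importance(predictors, score):
    """Score drop when each predictor is left out of the full model."""
    full = score(tuple(predictors))
    drops = {}
    for dropped in predictors:
        remaining = tuple(p for p in predictors if p != dropped)
        drops[dropped] = full - score(remaining)
    return drops


def all_subsets(predictors):
    """Yield every non-empty combination of predictors."""
    for size in range(1, len(predictors) + 1):
        yield from combinations(predictors, size)


# Toy stand-in for a real model fit: each predictor adds a fixed amount.
weights = {"precip": 0.5, "temp": 0.3, "elev": 0.1}


def toy_score(subset):
    return sum(weights[p] for p in subset)


names = ["precip", "temp", "elev"]
print(jackknife_importance(names, toy_score))
print(max(all_subsets(names), key=toy_score))
```

Trying all combinations grows as 2^p - 1 model runs for p predictors, so it is practical only for small predictor sets; the jackknife needs just p + 1 runs.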

Parameter Estimation • Most methods estimate the parameters of the model for us • What can we modify to see what the effect is on parameter estimation: • Data set • Maxent: Regularization parameter

Validating Against Samples (Cross-Validation) • Optimal: • Completely separate dataset • Training/test: • Build model with 70% training • Randomly sampled • Test against 30% • Bootstrapping: • Remove samples randomly, model, repeat • Examine how well model predicts removed values
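The 70/30 training/test approach can be sketched as follows: shuffle the samples, fit on 70%, and measure prediction error on the held-out 30%. A simple one-predictor linear fit stands in for the model, and RMSE is used as the error measure. Both are choices made for this example, as are the function names and the synthetic data.

```python
import math
import random


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Randomly split samples into training and test lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope*x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(points, intercept, slope):
    """Root mean squared error of the line's predictions on points."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in points]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic samples: a known line plus Gaussian noise (hypothetical data).
noise = random.Random(1)
samples = [(x, 1.0 + 2.0 * x + noise.gauss(0.0, 0.5)) for x in range(20)]
train, test = train_test_split(samples)
intercept, slope = fit_line(train)
print("train RMSE:", rmse(train, intercept, slope))
print("test RMSE: ", rmse(test, intercept, slope))
```

A test error much larger than the training error suggests the model is fitting noise in the training samples. Repeating the split with different seeds gives a sense of how stable that comparison is.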

Parameter Sensitivity • Basic: • Modify parameters to expected bounds and re-run model • More: • Modify parameters based on statistical distribution and rerun repeatedly

Uncertainty Estimation • Document all potential uncertainties • If we know (or can guess at) the uncertainty in the sample data or predictors, we can estimate the uncertainty in the outputs: • “Jiggle” the sample data and/or predictors and re-run the model to see the effect

Monte Carlo Methods • Run models repeatedly changing samples, predictor variables, or model options • Provides insight into: • Uncertainty effects • Model sensitivity / robustness • Parameter estimation • Validation (sub-sampling)
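A minimal Monte Carlo sketch of "jiggling" the sample data: add random noise to the measured values, refit, and repeat, then look at how much the fitted parameter varies. It assumes the measurement uncertainty is Gaussian with a known standard deviation and uses the slope of a one-predictor linear fit as the parameter of interest. The function names and data are hypothetical.

```python
import random
import statistics


def fit_slope(points):
    """Least-squares slope of y on x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def monte_carlo_slopes(points, y_sigma, runs=500, seed=0):
    """Refit after perturbing y values; returns (mean, std dev) of the slopes."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(runs):
        # Jiggle each measured value by its assumed uncertainty.
        jiggled = [(x, y + rng.gauss(0.0, y_sigma)) for x, y in points]
        slopes.append(fit_slope(jiggled))
    return statistics.fmean(slopes), statistics.stdev(slopes)


# Hypothetical samples on an exact line; y_sigma is the assumed
# measurement uncertainty in the response.
points = [(float(x), 2.0 * x) for x in range(10)]
mean_slope, slope_sd = monte_carlo_slopes(points, y_sigma=1.0)
print(f"slope = {mean_slope:.3f} +/- {slope_sd:.3f}")
```

The same loop works for jiggling predictor values or for randomly dropping samples; only the line that builds the perturbed data set changes.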

Programming is Important • Python or R: • Subset sample data in different ways • Randomly • Select different predictors • Reject one, all combinations • Select models and options • ? • Repeat

Additional Slides

Process (flowchart labels) • Query Database • Field Data (Points) • To Points in ArcMap • Save to CSV • Grid? (Yes/No branches: Convert to raster, Convert to points) • Predictor Rasters • Add Predictor Values (ArcMap: Extract From Raster) • Analysis, Modeling in R • Predict in R • Point to Raster in ArcMap • Predicted Surface

Digital Elevation Model (DEM)

Ready? • Table of points, polylines, or polygons • Spatial data • Measured values • Predictor layers • Add predictor values to table • Time to model!
