140 likes | 295 Views
Lab #5: - handling missing georeferenced data: (1) estimating missing values (2) spatial interpolation. Spatial statistics in practice Center for Tropical Ecology and Biodiversity, Tunghai University & Fushan Botanical Garden. Today’s lab. estimating missing values spatial interpolation
E N D
Lab #5:- handling missing georeferenced data: (1) estimating missing values (2) spatial interpolation Spatial statistics in practice Center for Tropical Ecology and Biodiversity, Tunghai University & Fushan Botanical Garden
Today’s lab • estimating missing values • spatial interpolation • some comments about geographically weighted regression
Conventional EM estimation FILENAME ATTRIBUT 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-DEM&QUAD-DATA.TXT'; OPTIONS LINESIZE=72; TITLE 'SPATIAL FILTER MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; Y=LOG(MELEV+17.5); X=(SELEV-25)**0.5; ******************************************************************** * ARBITRATILY SUPPRESS A VALUE OF Y; * * FOR CROSSVALIDATION EACH VALUE OF Y WOULD BE SUPPRESSED, IN TURN * ********************************************************************; YO=Y; IF IDDEM=1 THEN Y=0; IF IDDEM=1 THEN IM1=-1; ELSE IM1=0; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; ******************************************************************* * THE MISSING VALUE IMPUTATON IS THE ESTIMATED COEFFICIENT OF IM1 * *******************************************************************; PROC REG; MODEL Y = X IM1; RUN; DATA ORIGINAL; SET STEP1; IF IM1=0 THEN DELETE; RUN; PROC PRINT; VAR YO; RUN;
Spatial EM: eigenvector selection FILENAME ATTRIBUT 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-DEM&QUAD-DATA.TXT'; FILENAME MC 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-MC-EIG.TXT'; FILENAME CONN 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-CON.TXT'; FILENAME EVEC 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-EIGENVECTORS.TXT'; TITLE 'SPATIAL FILTER MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; * Y=LOG(MELEV+17.5); Y=(SELEV-25)**0.5; Y0=Y; X0=1; IF IDDEM=1 THEN Y='.'; IF IDDEM=1 THEN X0='.'; RUN; PROC SORT DATA=STEP1 OUT=STEP1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; PROC STANDARD MEAN=0 STD=1 OUT=STEP1(REPLACE=YES); VAR Y0; RUN; DATA STEP1 (REPLACE=YES); SET STEP1; INFILE EVEC LRECL=2048; INPUT IDE E1-E73; RUN; ***************************************************************** * THE SET OF CANDIDATE EIGENVECTORS IS DETERMINED BY MCADJ >= 0 * *****************************************************************; PROC REG OUTEST=COEF; MODEL Y = E1-E18/SELECTION=STEPWISE SLE=0.10; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN; principal change
PROC TRANSPOSE DATA=COEF PREFIX=B OUT=COEF2; VAR E1-E18; RUN; DATA COEF(REPLACE=YES); INFILE MC; INPUT ID LAM_MCM MC MCADJ; IF MCADJ<0.25 THEN DELETE; RUN; DATA COEF(REPLACE=YES); SET COEF; SET COEF2; IF B1='.' THEN DELETE; IF B1=0 THEN DELETE; BSQ=B1**2; EMC=BSQ*MC; P=1; RUN; PROC PRINT; RUN; PROC MEANS SUM NOPRINT; VAR LAM_MCM P EMC BSQ; OUTPUT OUT=EMC SUM=SUML P SPANUM SPADEN; RUN; PROC STANDARD DATA=TEMP MEAN=0 STD=1 OUT=TEMP(REPLACE=YES); VAR YRESID; RUN; DATA STEP2(REPLACE=YES); INFILE CONN LRECL=1024; INPUT ID C1-C73; RUN; DATA STEP2(REPLACE=YES); SET STEP2; SET TEMP(KEEP=YRESID); SET STEP1(KEEP=Y0 X0); ARRAY CONN{73} C1-C73; ARRAY ZC{73} ZC1-ZC73; ARRAY Y0C{73} Y0C1-Y0C73; CSUM=0; DO I=1 TO 73; CSUM = CSUM + CONN{I}; ZC{I} = YRESID*CONN{I}; Y0C{I} = Y0*CONN{I}; END; Z=YRESID; RUN;
PROC MEANS SUM NOPRINT; VAR CSUM X0; OUTPUT OUT=CSUM SUM=CSUM N; RUN; PROC PRINT; VAR CSUM; RUN; PROC MEANS DATA=STEP2 NOPRINT; VAR Y0C1-Y0C73; OUTPUT OUT=Y0COUT1 SUM=Y0C1-Y0C73; RUN; PROC TRANSPOSE DATA=Y0COUT1 PREFIX=Y0C OUT=Y0COUT2; VAR Y0C1-Y0C73; RUN; PROC MEANS DATA=STEP2 NOPRINT; VAR ZC1-ZC73; OUTPUT OUT=ZCOUT1 SUM=ZC1-ZC73; RUN; PROC TRANSPOSE DATA=ZCOUT1 PREFIX=ZC OUT=ZCOUT2; VAR ZC1-ZC73; RUN; DATA STEP2(REPLACE=YES); SET STEP2(KEEP=X0 Z CSUM Y0); SET ZCOUT2(KEEP=ZC1); SET Y0COUT2(KEEP=Y0C1); Y0C=Y0C1; ZC=ZC1; DROP ZC1; RUN; PROC REG DATA=STEP2 OUTEST=DEN NOPRINT; MODEL CSUM=X0/NOINT; RUN; PROC REG DATA=STEP2 OUTEST=NUM0 NOPRINT; MODEL Y0C=Y0/NOINT; RUN; PROC REG DATA=STEP2 OUTEST=NUM NOPRINT; MODEL ZC=Z/NOINT; RUN; DATA STEP3; SET DEN; SET NUM0; SET NUM; SET CSUM(KEEP=CSUM N); SET EMC(KEEP=SUML SPANUM SPADEN P); MC0=Y0/X0; MC=Z/X0; EMC = -(1+(N/CSUM)*SUML)/(N-P-1); ZMC = (MC-EMC)/SQRT(2/CSUM); MCSPA=SPANUM/SPADEN; RUN; PROC PRINT; VAR MC0 MC EMC ZMC MCSPA; RUN; PROC STANDARD DATA=TEMP OUT=TEMP(REPLACE=YES) MEAN=0; VAR YHAT; RUN;
Spatial EM: imputation FILENAME ATTRIBUT 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-DEM&QUAD-DATA.TXT'; FILENAME EVEC 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-EIGENVECTORS.TXT'; TITLE 'SPATIAL FILTER MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; * Y=LOG(MELEV+17.5); Y=(SELEV-25)**0.5; YO=Y; IF IDDEM=1 THEN Y=0; IF IDDEM=1 THEN IM1=-1; ELSE IM1=0; RUN; PROC SORT DATA=STEP1 OUT=STEP1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; DATA STEP1 (REPLACE=YES); SET STEP1; INFILE EVEC LRECL=2048; INPUT IDE E1-E73; RUN; **************************** *SPATIAL FILTER IMPUTATION * ****************************; PROC REG OUTEST=COEF; MODEL Y = E3 E4 E5 E7 E12 E13 E14 E16 E17 E18 IM1; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN; DATA TEMP(REPLACE=YES); SET TEMP; IF _N_=1 THEN SET COEF; IF IDDEM>1 THEN DELETE; RUN; PROC PRINT; VAR IDDEM YO IM1 YRESID; RUN; for calculating the imputation
Spatial filter GWR: eigenvector selection FILENAME ATTRIBUT 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-DEM&QUAD-DATA.TXT'; FILENAME EVEC 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-EIGENVECTORS.TXT'; TITLE 'SPATIAL FILTER GWR MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; Y=LOG(MELEV+17.5); X=(SELEV-25)**0.5; RUN; PROC SORT DATA=STEP1 OUT=STEP1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; DATA STEP1 (REPLACE=YES); SET STEP1; INFILE EVEC LRECL=2048; INPUT IDE E1-E73; ARRAY EVECS{73} E1-E73; ARRAY XEVAR{73} XE1-XE73; DO I=1 TO 73; XEVAR{I} = X*EVECS{I}; END; RUN; ********************** * SPATIAL FILTER GWR * **********************; PROC REG OUTEST=COEF; MODEL Y = X E1-E18 XE1-XE18/SELECTION=STEPWISE SLE=0.10 INCLUDE=1; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN; computes the interaction terms keeps X in the equation
Spatial filter GWR: variable coefficient calculation FILENAME ATTRIBUT 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-DEM&QUAD-DATA.TXT'; FILENAME EVEC 'D:\JYU-SUMMERSCHOOL2006\LAB#1\PR-EIGENVECTORS.TXT'; FILENAME OUTFILE 'D:\JYU-SUMMERSCHOOL2006\LAB#5\SF-GWR.TXT'; OPTIONS LINESIZE=72; TITLE 'SPATIAL FILTER GWR MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; Y=LOG(MELEV+17.5); X=(SELEV-25)**0.5; RUN; PROC SORT DATA=STEP1 OUT=STEP1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; DATA STEP1 (REPLACE=YES); SET STEP1; INFILE EVEC LRECL=2048; INPUT IDE E1-E73; ARRAY EVECS{73} E1-E73; ARRAY XEVAR{73} XE1-XE73; DO I=1 TO 73; XEVAR{I} = X*EVECS{I}; END; RUN;
********************** * SPATIAL FILTER GWR * **********************; PROC REG OUTEST=COEF; MODEL Y = X E1-E18 XE1-XE18/SELECTION=STEPWISE SLE=0.10 INCLUDE=1; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN; DATA COEF(REPLACE=YES); SET COEF (KEEP=INTERCEPT E2 E5 E6 E7 E9 E11 E12 E13 E15 E18 X XE4 XE6 XE9 XE10); A=INTERCEPT; BE2=E2; BE5=E5; BE6=E6; BE7=E7; BE9=E9; BE11=E11; BE12=E12; BE13=E13; BE15=E15; BE18=E18; BX=X; BXE4=XE4; BXE6=XE6; BXE9=XE9; BXE10=XE10; DROP INTERCEPT E2 E5 E6 E7 E9 E11 E12 E13 E15 E18 X XE4 XE6 XE9 XE10; RUN; DATA FINAL; SET STEP1; IF _N_=1 THEN SET COEF; GWRA=A*1 + BE2*E2 + BE5*E5 + BE6*E6 + BE7*E7 + BE9*E9 + BE11*E11 + BE12*E12 + BE13*E13 + BE15*E15 + BE18*E18; GWRB=BX + BXE4*E4 + BXE6*E6 + BXE9*E9 + BXE10*E10; RUN; DATA _NULL_; SET FINAL; FILE OUTFILE; PUT IDDEM GWRAGWRB; RUN; GWR coefficient calculations
Kriging in ArcGIS • Step #1: load a *.shp file • Step #2: access the Geostatistical Wizard • Step #3: fit a semivariogram model: • Select a variable • Select the kriging option • Step #4: select “Prediction Map” under the “Ordinary Kriging” heading • Step #5: select a semivariogram model
Step #6: click FINISH and then OK • Step #7: right-click on the “Ordinary Kriging” Layer that appears & change the extent to “the current display extent” • Step #8: right-click on the “Ordinary Kriging” Layer that appears & select “Create Prediction Standard Error Map”
Step #9: right-click on “Layers” select “Activate” right-click on “Layers” select “Properties” check “Enable” click “Specify Shape” select the map • Step #10: in the Display window, move the base map to the top, and remove its coloring
What you should achieve after this lab: • Impute missing spatial data • Create krigged maps (that are clipped) • Perform a spatial statistical cross-validation • Perform geographically weighted regression with spatial filters