Geoprocessing for Animal Premises ID Luanne Hendricks State of Ohio OIT/GISSC Intern Columbus State Community College 2005 Ohio GIS Conference September 21-23, 2005 Marriott North Hotel Columbus, Ohio
Overview • Objective • Source Data & Desired Outputs • Timeline • Tools and Automation • Process • Statistics • Observations
Objective • Input: Source Data from County Auditors • Geoprocessing • Output: Normalized Parcel Data; Unique AG Owners
Output - Deliverables • Normalized Parcel/Point Geodata • agricultural ( 100 <= LUC <= 199) • dairy (LUC = 103, 113) • residential ( 510 <= LUC <= 520, LUC = 560) • Normalized Tabular Data (Access DB) • Table of unique ag owners with owner_id • Table of parcel data with owner_id • Time Estimate to regenerate data annually
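The land use code (LUC) ranges above can be written as a small classifier. This is only a sketch: the function name `classify_luc` is made up for illustration, and it assumes LUC values arrive as integers. Note that the dairy codes fall inside the agricultural range, so a parcel can belong to both groups.

```python
def classify_luc(luc: int) -> set[str]:
    """Return the deliverable categories a land use code (LUC) falls into."""
    categories = set()
    if 100 <= luc <= 199:
        categories.add("agricultural")
    if luc in (103, 113):
        # Dairy codes are a subset of the agricultural range.
        categories.add("dairy")
    if 510 <= luc <= 520 or luc == 560:
        categories.add("residential")
    return categories
```

For instance, `classify_luc(103)` yields both agricultural and dairy, while a code such as 530 falls outside every deliverable and yields an empty set.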
Source Data – Quantity/Quality • Large volume of data • approx. 5 million source records • some counties had 40-50 fields of data • approx. 5 GB of data • Multiple source files per county • Parcel, Point, CAMA data • Non-standardized data fields • Variable completeness
Processing – High Level View • Data Collection from Counties • Normalize Source Data • Generate Owner Ids for Parcel Records • Generate Owner Table • Match Dairy Addresses to Parcel Table • Create Project for User
Need Automation Strategy • Need to automate process for: • Repeatability • Ease of modification • Testability • Traceability • ...As well as speed
Processing Detail - Example Pre-normalization steps in ModelBuilder for a county with two source files, a shapefile and CAMA data, that need to be joined. After these steps the county is ready for normalization in Access. Slightly different steps are needed for point files and for counties with a single source parcel shapefile.
Processing Detail - Example Continued • Loop through county list • Make Field Map • Get Fields • Delete temporary table view and layer. ModelBuilder has limitations: you can't loop through these steps for a list of counties. But the model can be converted to a script and coded to process a list. Additional field-name mapping steps are needed due to the "coarse-grained" geoprocessing object.
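Once the model is a script, looping over a county list is an ordinary driver loop. The sketch below shows only that loop shape. The step functions are hypothetical placeholders (no real geoprocessing calls are made), their order is an assumption, and "Missing" is an invented county name used to show a failure.

```python
def process_counties(counties, steps):
    """Run every step for every county and record the outcome per county."""
    log = {}
    for county in counties:
        try:
            for step in steps:
                step(county)
        except Exception as exc:  # keep going so one bad county does not stop the run
            log[county] = f"failed in {step.__name__}: {exc}"
        else:
            log[county] = "ok"
    return log


# Hypothetical placeholder steps; real ones would call the geoprocessing tools.
def get_fields(county):
    if county == "Missing":
        raise FileNotFoundError("no source file")


def make_field_map(county):
    pass


def delete_temporaries(county):
    pass


log = process_counties(["Adams", "Missing", "Allen"],
                       [get_fields, make_field_map, delete_temporaries])
```

Keeping a per-county log is one way to get the traceability the automation strategy asks for: a failed county is recorded and can be rerun on its own.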
Example of Geoprocessing Tool Limitations When you join fields in the geoprocessing environment, and create a new Feature Layer shapefile, field names are [original layer name].[field name] truncated to 10 characters. Renaming is not done automatically for you as it is when you join and create a new layer manually in ArcMap.
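To see why the renaming has to be planned by hand, the sketch below simulates the truncation as described on this slide: the qualified name `[layer].[field]` cut to 10 characters. It is plain Python, not a geoprocessing call. The layer names `parcel` and `cama` and the helper `plan_renames` are made up for illustration, and the real tools may treat the separator character differently.

```python
from collections import defaultdict


def plan_renames(layer_fields, limit=10):
    """Group original field names by their truncated joined name.

    layer_fields maps a layer name to its list of field names.
    Returns (renames, collisions): renames maps each unambiguous truncated
    name back to the original field name; collisions lists truncated names
    shared by more than one field, which need manual handling.
    """
    groups = defaultdict(list)
    for layer, fields in layer_fields.items():
        for field in fields:
            groups[f"{layer}.{field}"[:limit]].append(field)
    renames = {short: srcs[0] for short, srcs in groups.items() if len(srcs) == 1}
    collisions = {short: srcs for short, srcs in groups.items() if len(srcs) > 1}
    return renames, collisions


renames, collisions = plan_renames({
    "parcel": ["OWNNAM1", "OWNADD1", "LUC"],
    "cama": ["MAILADD1", "SITEADD"],
})
```

With these inputs, `OWNNAM1` and `OWNADD1` both truncate to `parcel.OWN`, so a longer layer name quickly leaves too few characters to tell fields apart.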
Processing – Owner IDs • Data Collection from Counties • Normalize Source Data • Generate Owner Ids for Parcel Records • Generate Owner Table • Match Dairy Addresses to Parcel Table • Create Project for User
Owner ID Algorithm • Aggregate on Lastname, Firstname • Standardize addresses • For each Lastname,Firstname group, choose the address - OWNADD1, MAILADD1, or SITEADD, that produces the best set of matches
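The three steps can be sketched in Python as follows. Several things here are assumptions, not the presenter's implementation: the record keys `LASTNAME` and `FIRSTNAME`, the tiny abbreviation table, and above all the scoring rule. The slide does not define "best set of matches", so this sketch takes it to mean the address field whose most common non-empty standardized value covers the most records in the group.

```python
import re
from collections import Counter, defaultdict

ADDRESS_FIELDS = ("OWNADD1", "MAILADD1", "SITEADD")
# Illustrative abbreviation table only; a real one would be much larger.
ABBREVIATIONS = {"STREET": "ST", "ROAD": "RD", "AVENUE": "AVE", "DRIVE": "DR"}


def standardize(address):
    """Uppercase, strip punctuation, and abbreviate common street suffixes."""
    words = re.sub(r"[^A-Za-z0-9 ]", " ", address or "").upper().split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def assign_owner_ids(records):
    """Add an OWNER_ID to each record (a dict) and return the records."""
    # Step 1: aggregate on last name, first name.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["LASTNAME"].upper(), rec["FIRSTNAME"].upper())].append(rec)

    next_id = 1
    for name in sorted(groups):
        group = groups[name]

        # Assumed scoring: size of the largest cluster of identical,
        # non-empty standardized addresses in this field.
        def score(field):
            counts = Counter(standardize(r.get(field)) for r in group)
            counts.pop("", None)
            return max(counts.values(), default=0)

        # Step 3: pick the best address field for this name group.
        best = max(ADDRESS_FIELDS, key=score)

        # Same name + same standardized address in that field = same owner.
        ids = {}
        for rec in group:
            key = standardize(rec.get(best))  # step 2: standardized address
            if key not in ids:
                ids[key] = next_id
                next_id += 1
            rec["OWNER_ID"] = ids[key]
    return records
```

Under these assumptions, two records for the same name with `OWNADD1` values "12 Main Street" and "12 Main St." receive the same owner id, while a third record at a different address receives a new one. As a simplification, records in a group with no address in the chosen field are lumped together under one id.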
Statistics • ORIG_REC = Total AG + Total Residential • NOAD = # Records with no address information • ADD_REC = Total # of AG + Total Residential associated with more than 1 parcel • FINL_REC = Total # of AG + Total Residential associated with at least one AG parcel • OWNR = # of Records in the Owner Table • NMD_AG = Aggregate of OWNNAM1/MAILADD1 and OWNADD1/MAILADD1, used as a sanity check and to compare how effective the processing was
Testing • Use Statistics • Numbers make sense • Numbers add up, e.g.: • All records in Parcel table assigned an ownerid • # Records in Owner Table = # Aggregated on Owner Id in PCL table • Visual Inspection • Visually inspect how Owner Ids were assigned • Create shapefile and view data in project • Spot check source vs. processed data in shapefiles
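The two "numbers add up" checks lend themselves to an automated test. Here is a minimal sketch, assuming the parcel and owner tables have been read into lists of dicts keyed by `owner_id`. The function name `check_owner_tables` is hypothetical, and the project itself kept these tables in Access.

```python
def check_owner_tables(parcels, owners):
    """Return a list of problems; an empty list means both checks passed."""
    problems = []

    # Check 1: every parcel record has an owner id.
    missing = [p for p in parcels if p.get("owner_id") is None]
    if missing:
        problems.append(f"{len(missing)} parcel record(s) have no owner_id")

    # Check 2: owner table row count equals distinct owner ids in the parcel table.
    parcel_ids = {p["owner_id"] for p in parcels if p.get("owner_id") is not None}
    if len(owners) != len(parcel_ids):
        problems.append(
            f"owner table has {len(owners)} rows but parcel table has "
            f"{len(parcel_ids)} distinct owner ids"
        )
    return problems
```

Running a check like this after every regeneration gives a quick pass or fail signal before the slower visual inspection.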
Status • 53 counties normalized • 40 counties have owner ids/owner table • Dairy matching - to do • Final project – to do
Observations and Conclusions (1) • After initial development, Automation speeds process • For example, using Form Interface to normalize:
Observations and Conclusions (2) • Automation: • speeds process after initial development investment • enables repeatability of process • makes modification and redo less painful • increases data consistency • reduces errors • accurately documents process • increases future capability to do similar processing – tools are reusable • Automation is cost effective
Observations and Conclusions (3) • This job would be easier if: • Data was maintained in small standard components: • Last Name, First Name, MI as separate fields • Address components – SiteNum, SiteDir, SiteStr • There was a standard for field names of components