TerraStream: From Elevation Data to Watershed Hierarchies
Andrew Danner (Swarthmore), T. Moelhave (Aarhus), K. Yi (HKUST), P. K. Agarwal (Duke), L. Arge (Aarhus), H. Mitasova (NCSU)
Thursday, 08 November 2007
Current Problem: Large Point Data Sets
• LIDAR
  • NC Coastline: 200 million points – over 7 GB
  • Neuse River basin (NC): 500 million points – over 17 GB
• Grid elevation models
  • Neuse River basin:
    • 20 ft – 2.5 GB
    • 10 ft – 10 GB
    • 5 ft – 40 GB
• Data too big for RAM
  • Must reside on disk
  • Disk is slow
I/O-efficient Algorithms [AV88]
[Figure: Disk, RAM, CPU, with block size B and memory size M]
• Traditional algorithms optimize CPU computation
  • Not aware of performance penalty of disk access
  • Virtual memory, swap space can’t predict disk access
• I/O model
  • Memory is finite
  • Data is transferred in blocks
  • Complexity measured in disk blocks transferred
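The I/O model can be made concrete with a small cost calculator. This is a rough sketch rather than part of TerraStream: it counts block transfers for a sequential scan and for the standard sorting bound Sort(N) = Θ((N/B) log_{M/B}(N/B)), with constant factors dropped. The function names and the sample values of N, M and B are assumptions chosen for illustration.

```python
import math


def scan_ios(n, block):
    """Block transfers needed to read n items sequentially (N/B, rounded up)."""
    return math.ceil(n / block)


def sort_ios(n, memory, block):
    """Approximate Sort(N): (N/B) * ceil(log_{M/B}(N/B)), constants dropped.

    n, memory and block are all measured in items (hypothetical units).
    """
    blocks = scan_ios(n, block)
    fan = memory // block          # number of blocks that fit in memory
    if fan < 2:
        raise ValueError("memory must hold at least two blocks")
    passes, reach = 1, fan         # integer version of the base-(M/B) logarithm
    while reach < blocks:
        reach *= fan
        passes += 1
    return blocks * passes


# Hypothetical machine: 1,000 items per block, 1,000,000 items of memory.
print(scan_ios(500_000_000, 1_000))             # 500000
print(sort_ios(500_000_000, 1_000_000, 1_000))  # 1000000
```

With these sample values, sorting 500 million items costs only about two scans' worth of block transfers, which is why a Sort(N) bound is considered practical.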
TerraStream Goals
• Scalable – All stages must work for 100+ million points/cells
• General – Stages should work with either TIN or grid data
• Automated – No need for manual intervention/preprocessing
• Modular – Users only need to run the stages they want
• Adaptable – Allow each stage to support multiple models
Points to DEM
• Grid DEM
  • Interpolation
  • Use quad tree to automatically tile terrain
  • Use quad tree neighbors for smooth boundary transitions
• TIN
  • I/O-efficient Delaunay triangulation
  • Constrained Delaunay also possible if constraints (breaklines) fit in memory
• Height graph
  • View both grids and TINs as a height graph
  • Nodes, neighbors, and edges between neighboring nodes
  • Definition of node, neighbor different in TIN/Grid
  • Design algorithms to work on height graphs
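The height-graph idea can be illustrated with a minimal in-memory sketch. Both builders below return the same pair of dictionaries (node heights and neighbor lists), so later stages would not need to know whether the input was a grid or a TIN. The function names, the dictionary representation, and the choice of 8-connected grid neighbors are assumptions made for illustration, not TerraStream's actual data structures.

```python
def grid_height_graph(grid):
    """Height graph of a grid DEM: nodes are cells, neighbors are the
    (up to) 8 surrounding cells. 8-connectivity is an assumption here."""
    rows, cols = len(grid), len(grid[0])
    heights, neighbors = {}, {}
    for r in range(rows):
        for c in range(cols):
            heights[(r, c)] = grid[r][c]
            neighbors[(r, c)] = [
                (r + dr, c + dc)
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr or dc) and 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
    return heights, neighbors


def tin_height_graph(vertices, triangles):
    """Height graph of a TIN: nodes are vertices (x, y, z), neighbors are
    vertices that share a triangle edge."""
    heights = {i: z for i, (_x, _y, z) in enumerate(vertices)}
    adjacent = {i: set() for i in heights}
    for i, j, k in triangles:
        for a, b in ((i, j), (j, k), (k, i)):
            adjacent[a].add(b)
            adjacent[b].add(a)
    return heights, {i: sorted(n) for i, n in adjacent.items()}


grid_h, grid_n = grid_height_graph([[1, 2], [3, 4]])
tin_h, tin_n = tin_height_graph(
    [(0, 0, 5.0), (1, 0, 7.0), (0, 1, 6.0)], [(0, 1, 2)]
)
```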
Flow Modeling
• Identifying minima due to noise
• Removing noise from terrains
• Modeling flow directions, extracting river networks
Coping with Noisy Data
• Identifying minima likely due to noise
  • Topological persistence – computed in Sort(N) I/Os
  • Assign a significance score to each minimum (low score: likely noise)
• Provide mechanism for removing low-scoring sinks
  • User can select score threshold
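As a rough illustration of scoring minima by topological persistence, the sketch below sweeps the nodes of a height graph from low to high and merges components with a union-find structure. When two components meet, the one with the higher minimum "dies", and its score is the height of the merging node minus the height of that minimum. This is a plain in-memory version, not the Sort(N) I/O algorithm mentioned on the slide; the function name and the dictionary inputs are assumptions.

```python
import math


def minima_persistence(heights, neighbors):
    """Return {minimum node: persistence score} for a height graph.

    heights:   {node: elevation}
    neighbors: {node: iterable of adjacent nodes}
    The lowest minimum of each connected component never dies and
    gets an infinite score.
    """
    parent = {}   # union-find forest over nodes processed so far
    lowest = {}   # component root -> the minimum that created it
    scores = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for v in sorted(heights, key=lambda u: (heights[u], u)):
        roots = {find(u) for u in neighbors[v] if u in parent}
        parent[v] = v
        if not roots:
            lowest[v] = v                   # no lower neighbor: a new minimum
            continue
        survivor = min(roots, key=lambda r: (heights[lowest[r]], lowest[r]))
        for r in roots:
            if r != survivor:               # the higher minimum dies at v
                scores[lowest[r]] = heights[v] - heights[lowest[r]]
                parent[r] = survivor
        parent[v] = survivor

    for r in {find(v) for v in heights}:
        scores[lowest[r]] = math.inf
    return scores


# Hypothetical 1-D profile 3, 1, 4, 2, 5 stored as a path graph.
profile = {0: 3, 1: 1, 2: 4, 3: 2, 4: 5}
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = minima_persistence(profile, path)
```

In this example the minimum at node 3 (height 2) merges into the deeper basin at node 2 (height 4), so it scores 2. A user-chosen threshold above 2 would mark it as noise.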
Noise Removal
[Figures: noisy terrain; after noise removal]
• Flooding in Sort(N) I/Os
• Other mechanisms? Carving?
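To show what flooding does to a terrain, here is a small in-memory sketch using the priority-queue approach often called priority-flood. It is a swapped-in technique for illustration, not the Sort(N) I/O algorithm from the slide, and it fills every sink rather than only low-scoring ones. Treating grid boundary cells as outlets and using 8-connectivity are assumptions.

```python
import heapq


def flood(grid):
    """Raise every cell to the lowest level at which water can reach
    the grid boundary. Returns a new grid; the input is not modified."""
    rows, cols = len(grid), len(grid[0])
    filled = [row[:] for row in grid]
    seen = [[False] * cols for _ in range(rows)]
    heap = []
    # Seed the queue with the boundary cells (assumed outlets).
    for r in range(rows):
        for c in range(cols):
            if r in (0, rows - 1) or c in (0, cols - 1):
                heapq.heappush(heap, (grid[r][c], r, c))
                seen[r][c] = True
    # Grow inward from the lowest known water level.
    while heap:
        level, r, c = heapq.heappop(heap)
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr or dc) and 0 <= nr < rows and 0 <= nc < cols \
                        and not seen[nr][nc]:
                    seen[nr][nc] = True
                    filled[nr][nc] = max(grid[nr][nc], level)
                    heapq.heappush(heap, (filled[nr][nc], nr, nc))
    return filled


noisy = [[5, 5, 5],
         [5, 1, 5],
         [5, 3, 5]]
smooth = flood(noisy)   # the pit of height 1 is raised to 3
```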
From Elevation to River Networks
[Figure: example terrain with elevations 110, 90, 85, 95, 100, 80 and the sea]
• Where does water go?
  • From higher elevation to lower elevation
• Single flow directions form a tree
• Support for multiple flow directions
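A minimal sketch of single flow directions on a height graph: each node sends its water to its lowest strictly lower neighbor, and nodes with no lower neighbor become sinks. Because every pointer goes strictly downhill, the pointers cannot form a cycle, which is why they form a tree (or a forest with one tree per sink). The function name, the sample graph, and the "lowest neighbor" rule (which ignores distances between nodes) are assumptions for illustration.

```python
def single_flow_directions(heights, neighbors):
    """Map each node to the neighbor it drains to, or None for a sink."""
    downstream = {}
    for v, h in heights.items():
        lower = [u for u in neighbors[v] if heights[u] < h]
        downstream[v] = min(lower, key=lambda u: heights[u]) if lower else None
    return downstream


# Hypothetical four-node terrain.
heights = {"a": 110, "b": 95, "c": 90, "d": 80}
neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
flow = single_flow_directions(heights, neighbors)
```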
Drainage Area
[Figure: flow tree with the drainage area of each node]
• How much area is upstream of each node?
• Each node has initial drainage area (1)
• Drainage area of internal nodes depends on drainage area of children
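The drainage-area rule can be sketched as a single sweep from high to low elevation: every node starts with area 1 and, once all of its upstream nodes have been processed, passes its total to its downstream node. This is an in-memory illustration only. The function name and inputs (a height dictionary and single-flow `downstream` pointers) are assumptions.

```python
def drainage_area(heights, downstream):
    """Accumulate drainage area along single flow directions.

    Nodes are visited in decreasing height, so a node's area is final
    before it is added to its (strictly lower) downstream node.
    """
    area = {v: 1 for v in heights}
    for v in sorted(heights, key=heights.get, reverse=True):
        d = downstream[v]
        if d is not None:
            area[d] += area[v]
    return area


# Hypothetical flow tree: a and b drain into c, c drains into d (the outlet).
heights = {"a": 5, "b": 4, "c": 3, "d": 1}
downstream = {"a": "c", "b": "c", "c": "d", "d": None}
area = drainage_area(heights, downstream)
```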
Computing Flow Directions/Drainage
• Terraflow
  • Sort(N) I/Os on grids
• Modified to work on height graphs
  • Same I/O bound
  • Now works on TINs
• New implementation
  • More robust, portable
  • Incorporate new sink removal
  • Better handling of flat areas…
Flow Modeling Improvements
• Detection of flat areas
  • Improved method on grids if O(1) rows fit in memory
• Routing on flat areas
  • Soille extension of Garbrecht & Martz
• Flat areas are usually the result of hydrological conditioning with flooding
Watershed Hierarchies
• Decompose a river network into a hierarchy of hydrological units (HUs)
  • All water in an HU flows to a common outlet
  • Hierarchy provides tunable level of detail
• Method used: Pfafstetter
• Want a solution scalable to large modern hi-res terrains
Pfafstetter
[Figure: river network with drainage areas and Pfafstetter labels]
• Find main river
• Find four largest tributaries
• Label basins/interbasins
• Recurse until single path
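The four steps can be sketched for a single level of the hierarchy. The code below follows the usual Pfafstetter convention: the four largest tributaries get even digits 2, 4, 6, 8 from downstream to upstream, and the interbasins between them get odd digits 1, 3, 5, 7, 9. It is an in-memory illustration under stated assumptions, not the TerraStream implementation. The function name and inputs are hypothetical, a confluence node is assigned to the interbasin below it, and recursion into each unit is omitted.

```python
def pfafstetter_level(downstream, area, outlet):
    """Label every node of a flow tree with one Pfafstetter digit."""
    upstream = {v: [] for v in downstream}
    for v, d in downstream.items():
        if d is not None:
            upstream[d].append(v)

    # 1. Main river: from the outlet, always follow the largest upstream branch.
    main = [outlet]
    while upstream[main[-1]]:
        main.append(max(upstream[main[-1]], key=lambda u: area[u]))
    on_main = set(main)

    # 2. Four largest tributaries, then ordered from downstream to upstream.
    tribs = [(i, u) for i, v in enumerate(main)
             for u in upstream[v] if u not in on_main]
    largest = sorted(tribs, key=lambda t: area[t[1]], reverse=True)[:4]
    selected = {u: 2 * (k + 1) for k, (_i, u) in enumerate(sorted(largest))}

    # 3. Walk up the main river, labeling basins (even) and interbasins (odd).
    label, interbasin, stack = {}, 1, []
    for v in main:
        label[v] = interbasin
        joined = [u for u in upstream[v] if u not in on_main]
        for u in joined:
            label[u] = selected.get(u, interbasin)  # small tributaries stay in the interbasin
            stack.append(u)
        digits = [selected[u] for u in joined if u in selected]
        if digits:
            interbasin = max(digits) + 1

    # Every node in a tributary inherits the label of the tributary's root.
    while stack:
        v = stack.pop()
        for u in upstream[v]:
            label[u] = label[v]
            stack.append(u)
    return label


# Hypothetical network: main river 0..5, tributaries 10, 20, 21, 30, 40 (41 is above 40).
downstream = {0: None, 1: 0, 2: 1, 3: 2, 4: 3, 5: 4,
              10: 1, 20: 2, 21: 2, 30: 3, 40: 4, 41: 40}
area = {0: 30, 1: 29, 2: 24, 3: 18, 4: 12, 5: 6,
        10: 4, 20: 4, 21: 1, 30: 5, 40: 4, 41: 3}
labels = pfafstetter_level(downstream, area, outlet=0)
```

In the example, the small tributary 21 is not among the four largest, so it stays in interbasin 3 together with the stretch of main river it joins.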
Example Watershed Boundaries
[Figure: watershed boundaries labeled 1–9, 91–99, and 991–999]
• All levels computed in one run
• User selects level of detail with map algebra
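One plausible reading of "selects level of detail with map algebra" is that a coarser level is obtained by keeping only the leading digits of each label, so that 991 to 999 collapse to 99 and then to 9. The sketch below assumes this prefix structure. The function names are hypothetical, and plain Python lists stand in for a raster.

```python
def coarsen(label, level):
    """Keep at most the first `level` digits of a Pfafstetter label."""
    return int(str(label)[:level])


def coarsen_raster(raster, level):
    """Apply coarsen() to every cell of a label raster (list of rows)."""
    return [[coarsen(label, level) for label in row] for row in raster]


# Hypothetical label raster using labels that appear in the example figure.
raster = [[96, 997, 998],
          [94, 991, 8]]
level1 = coarsen_raster(raster, 1)
level2 = coarsen_raster(raster, 2)
```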
Implementation
• TPIE: C++ primitives for I/O-efficient algorithms
• Standalone command line apps with GDAL
• GRASS: Open Source GIS Plugins
• ArcGIS Plugins (soon)
• Test Data:
  • North Carolina LIDAR
    • Neuse river basin: 400 million points (NC Floodmaps)
    • Outer banks coastal data: 128 million points (NOAA CSC)
  • USGS 30m NED
Our Results
• Experimental Results
  • Scales to over 400 million points
  • Other software tools crash at 25 million points
  • Keeps memory usage low using I/O-efficient methods
Future Directions – Grid Construction
• Interpolate leaves in parallel
• Test other interpolation methods
• Test with more data sources
• Finding the “ideal” resolution
Future Directions – Noise Removal
• Bridge detection/removal
• Hydrological conditioning with carving
• Scoring of sinks based on volume
• Other flow routing methods
• Further flat routing improvements
Flow Routing and Bridges
• Use flooded terrain for connectivity, but use original terrain for routing