Spatial Big Data Challenges Intersecting Cloud Computing and Mobility. Shashi Shekhar, McKnight Distinguished University Professor, Department of Computer Science and Engineering, University of Minnesota. www.cs.umn.edu/~shekhar
Spatial Databases: Representative Projects • Shortest paths • Storing graphs in disk blocks • Evacuation route planning (figure legend: only in old plan, only in new plan, in both plans) • Parallelizing range queries
Why cloud computing for spatial data? • Geospatial Intelligence [Dr. M. Pagels, DARPA, 2006] • Estimated at 140 terabytes per day, 150 petabytes annually • Annual volume is 150x the historical content of the entire internet • Analyze daily data as well as historical data
Eco-Routing • U.P.S. Embraces High-Tech Delivery Methods (July 12, 2007): “The research at U.P.S. is paying off. … saving roughly three million gallons of fuel in good part by mapping routes that minimize left turns.” • Minimize fuel consumption and GHG emissions • rather than proxies, e.g., distance, travel-time • avoid congestion, idling at red lights, turns and elevation changes, etc.
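The contrast between minimizing fuel and minimizing a proxy such as distance can be illustrated with an ordinary shortest-path search in which only the edge weight changes. The following is a minimal sketch, not the method used in the talk: it assumes a static graph, uses textbook Dijkstra, and the road network `ROADS`, its `km` and `fuel_l` attributes, and the function name `cheapest_path` are all hypothetical.

```python
import heapq

def cheapest_path(graph, source, target, cost_key):
    """Dijkstra's algorithm over edges weighted by the attribute named cost_key."""
    best = {source: 0.0}
    prev = {}
    frontier = [(0.0, source)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, attrs in graph.get(node, {}).items():
            new_cost = cost + attrs[cost_key]
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(frontier, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    # Walk the predecessor links back from the target to rebuild the route.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]

# Hypothetical toy network: A-B-D is shorter but congested (more fuel),
# A-C-D is longer but free-flowing (less fuel).
ROADS = {
    "A": {"B": {"km": 2.0, "fuel_l": 0.40}, "C": {"km": 3.0, "fuel_l": 0.25}},
    "B": {"D": {"km": 2.0, "fuel_l": 0.45}},
    "C": {"D": {"km": 3.0, "fuel_l": 0.20}},
    "D": {},
}

shortest_route, _ = cheapest_path(ROADS, "A", "D", "km")
eco_route, _ = cheapest_path(ROADS, "A", "D", "fuel_l")
```

On this toy data the two objectives pick different routes, which is the point of the slide: the distance proxy and the fuel objective need not agree.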
Real-time and Historic Travel-time, Fuel Consumption, GPS Tracks
Eco-Routing Research Challenges • Frames of Reference • Absolute to moving-object-based (Lagrangian) • Data model of Lagrangian graphs • Conceptual – generalize time-expanded graph • Logical – Lagrangian abstract data types • Physical – clustering, index, Lagrangian routing algorithms • Flexible Architecture • Allow inclusion of new algorithms, e.g., GPS-track mining • Merge solutions from different algorithms • Geo-sensing of events, e.g., volunteered geographic information (e.g., OpenStreetMap), social unrest (Ushahidi), flash mobs, … • Geo-prediction, e.g., predict the track of a hurricane or a vehicle • Challenges: auto-correlation, non-stationarity • Geo-privacy
Cloud Computing and Spatial Big Data • Motivation • Case Study 1: Simpler to Parallelize • Case Study 2 – Harder • Case Study 3 – Hardest • Wrap up
Simpler: Land-cover Classification • Multiscale, multigranular image classification into land-cover categories • Figure: inputs; output at 2 scales
Parallelization Choice
1. Initialize parameters and memory
2. for each Spatial Scale
3.   for each Quad
4.     for each Class
5.       Calculate Quality Measure
6.     end for Class
7.   end for Quad
8. end for Spatial Scale
9. Post-processing
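One way to read the loop nest above is that each quad can be scored independently, so the quad loop is a natural place to parallelize. The sketch below assumes that choice; the slide does not say which loop was actually parallelized. The quality measure (negative squared distance between a quad's mean intensity and a class mean), the data layout, and all function names are hypothetical stand-ins. A thread pool is used only to keep the example portable; CPU-bound work in Python would need a process pool or a map-reduce framework to see a real speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def quality_measure(quad, class_mean):
    """Hypothetical quality measure (step 5): higher means a better fit."""
    mean = sum(quad) / len(quad)
    return -(mean - class_mean) ** 2

def classify_quad(task):
    """Steps 4-6 for one quad: score every class and keep the best one."""
    quad, class_means = task
    return max(class_means, key=lambda name: quality_measure(quad, class_means[name]))

def classify(quads_by_scale, class_means, workers=4):
    """Parallelize the quad loop (step 3); the scale loop stays sequential."""
    labels = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for scale, quads in quads_by_scale.items():
            tasks = [(quad, class_means) for quad in quads]
            # pool.map preserves input order, so labels line up with quads.
            labels[scale] = list(pool.map(classify_quad, tasks))
    return labels

# Hypothetical input: pixel intensities grouped into quads at two scales.
QUADS = {1: [[10, 12], [200, 210]], 2: [[11, 205]]}
CLASS_MEANS = {"water": 10.0, "forest": 100.0, "urban": 200.0}
LABELS = classify(QUADS, CLASS_MEANS)
```

Because no quad depends on another, the same structure maps directly onto a map step with one task per quad.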
Harder: Parallelizing Vector GIS • (1/30)-second response-time constraint on range queries • Parallel processing is necessary since the best sequential computer cannot meet the requirement • Blue rectangle = a range query; polygon colors show processor assignment • Figure labels: Local Terrain Database; Remote Terrain Databases; Set of Polygons; High Performance GIS Component; Graphics Engine; Display; 2 Hz; 8 km x 8 km bounding box; 25 km x 25 km bounding box; 30 Hz view graphics
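To make the range query concrete, here is a minimal sketch of its filter step only: keep the polygons whose bounding box overlaps the query rectangle. It assumes polygons are lists of (x, y) vertices and omits the exact geometric refinement that a real vector GIS would run on the surviving candidates. All names and the sample data are hypothetical.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def boxes_overlap(a, b):
    """True if two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def range_query_candidates(polygons, query_box):
    """Filter step: indices of polygons whose bounding box meets the query box."""
    return [
        index
        for index, polygon in enumerate(polygons)
        if boxes_overlap(bounding_box(polygon), query_box)
    ]

# Hypothetical data: three triangles and one query rectangle.
POLYGONS = [
    [(0, 0), (2, 0), (1, 2)],
    [(10, 10), (12, 10), (11, 13)],
    [(4, 4), (6, 4), (5, 7)],
]
CANDIDATES = range_query_candidates(POLYGONS, (1, 1, 5, 5))
```

Each polygon is tested independently, which is what allows the set of polygons to be split across processors.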
Data-Partitioning Approach • Initial Static Partitioning • Run-Time dynamic load-balancing (DLB) • Platforms: Cray T3D (Distributed), SGI Challenge (Shared Memory)
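The two-phase idea above can be sketched as a sequential simulation: deal work items out statically, then shift work from the most loaded processor to the least loaded one. This is an illustration under stated assumptions, not the scheme run on the Cray T3D or SGI Challenge. It assumes each polygon has a known positive cost (for example its vertex count), and the round-robin split and greedy rebalancing rule are hypothetical choices.

```python
def static_partition(costs, num_procs):
    """Initial static partitioning: deal work items out round-robin."""
    parts = [[] for _ in range(num_procs)]
    for index, cost in enumerate(costs):
        parts[index % num_procs].append(cost)
    return parts

def rebalance(parts):
    """Run-time DLB sketch: move one item at a time from the most loaded
    processor to the least loaded one while that narrows the gap."""
    parts = [list(p) for p in parts]
    while True:
        loads = [sum(p) for p in parts]
        src = loads.index(max(loads))
        dst = loads.index(min(loads))
        gap = loads[src] - loads[dst]
        # Only items strictly smaller than the gap leave both loads below
        # the old maximum, which also guarantees the loop terminates.
        movable = [c for c in parts[src] if 0 < c < gap]
        if not movable:
            return parts
        item = min(movable, key=lambda c: abs(c - gap / 2))
        parts[src].remove(item)
        parts[dst].append(item)

# Hypothetical per-polygon costs, split across two processors.
COSTS = [9, 1, 8, 1, 7, 1]
INITIAL = static_partition(COSTS, 2)
BALANCED = rebalance(INITIAL)
```

On this input the static split leaves loads of 24 and 3, and rebalancing brings them to 15 and 12. In a real system the costs are not known in advance, which is why the balancing has to happen at run time.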
Hardest – Location Prediction • Figure panels: nest locations, distance to open water, vegetation durability, water depth
Ex. 3: Hardest to Parallelize • Maximum Likelihood Estimation • Need cloud computing to scale up to large spatial datasets • However, computing the determinant of a large matrix is an open problem!
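The slide does not name the model. A common setting in which a determinant appears inside a maximum-likelihood estimate is the spatial autoregressive (SAR) model, whose log-likelihood contains the term ln|I - rho*W| for an n-by-n neighborhood matrix W. Assuming that setting, the sketch below computes the term by dense Gaussian elimination. It costs O(n^3) time and O(n^2) memory and each elimination step depends on the previous one, which hints at why the computation is hard to scale. The function names are hypothetical.

```python
import math

def log_abs_det(matrix):
    """ln|det(A)| by Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    total = 0.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return float("-inf")  # singular matrix
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return total

def sar_log_det(weights, rho):
    """ln|I - rho*W|, the determinant term assumed here for a SAR likelihood."""
    n = len(weights)
    shifted = [
        [(1.0 if i == j else 0.0) - rho * weights[i][j] for j in range(n)]
        for i in range(n)
    ]
    return log_abs_det(shifted)

# Hypothetical two-site neighborhood: each site neighbors the other.
W = [[0, 1], [1, 0]]
TERM = sar_log_det(W, 0.5)
```

For this two-site example the matrix I - 0.5*W has determinant 0.75, so the term equals ln(0.75). An estimator that searches over rho has to re-evaluate the term for every candidate value.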
Cloud Computing and Spatial Big Data • Motivation: Spatial Big Data in National Security & Eco-routing • Case Study 1: Simpler to Parallelize • Map-reduce is okay • Should it provide spatial declustering services? • Can query-compiler generate map-reduce parallel code? • Case Study 2 – Harder • Need dynamic load balancing beyond map-reduce • Case Study 3 – Hardest • Need new computer science, e.g., • Eco-routing algorithms • determinant of large matrix • Parallel formulation of evacuation route planning
Acknowledgments • HPC Resources, Research Grants • Army High Performance Computing Research Center-AHPCRC • Minnesota Supercomputing Institute - MSI • Spatial Database Group Members • Mete Celik, Sanjay Chawla, Vijay Gandhi, Betsy George, James Kang, Baris M. Kazar, QingSong Lu, Sangho Kim, Sivakumar Ravada • USDOD • Douglas Chubb, Greg Turner, Dale Shires, Jim Shine, Jim Rodgers • Richard Welsh (NCS, AHPCRC), Greg Smith • Academic Colleagues • Vipin Kumar • Kelley Pace, James LeSage • Junchang Ju, Eric D. Kolaczyk, Sucharita Gopal