260 likes | 570 Views
SPATIAL. SPATIAL DATA. RESEARCH. DATABASE. MINING. GROUP. AND. www.spatial.cs.umn.edu. Smarter Planet. Spatial Computing: Recent Trends. 4. SIG SPATIAL. Group Members Faculty Professor Shashi Shekhar Current Ph.D. Studnets Pradeep Mohan Mike Evans Dev Oliver Xun Zhou
E N D
SPATIAL SPATIAL DATA RESEARCH
DATABASE MINING GROUP
AND www.spatial.cs.umn.edu
SmarterPlanet Spatial Computing: Recent Trends 4 SIG SPATIAL
Group Members Faculty Professor Shashi Shekhar Current Ph.D. Studnets Pradeep Mohan Mike Evans Dev Oliver Xun Zhou Abdussalam Bannur KwangSoo Yang Viswanath Gunturi Zhe Jiang Jeff Wolff Changqing Zhou Others/Visitors Lydia Manikonda Ivan Brugere
Ongoing Projects Overview • Applications • Transportation, virtual environments, Earth science, epidemiology and cartography. • Spatial Data Mining • Flow anomalies • Teleconnection • Cascade pattern discovery • K-Main-Route (KMR) summarization • Pattern of life • Abrupt change detection • Spatial Database • Eco-Routing • Evacuation planning
Courses • Topics • Application Domains • Conceptual Data Models • Logical Data Models • Physical Data Models • Spatial Networks • Spatial Data Mining • Others • Course Website • http://www.spatial.cs.umn.edu/Courses/Fall11/8715 • Topics • Data Model • Representation & access • Architecture • Others CSCI 8715 – Spatial Databases and Applications National Research Council CSCI 5980 – GIS: a computational perspective
Flow Anomalies Sensor 5 Sensor 2 • Problem • Discover dominant time periods that exhibit anomalous behavior • Why is it hard? • A single dominant time period may have subsets that are not anomalous • No Dynamic Programming • Contributions • A SWEET (Smart Window Enumeration and Evaluation of persistent-Thresholds) Approach Sensor 3 http://www.esri.com/news/arcuser/0405/ss_crimestats2of2.html Sensor 1 Sensor 4 (Source: Shingle Creek, MN Study Site) Ex. An Oil Spill • J. M. Kang, S. Shekhar, C. Wennen, P. Novak, Discovering Flow Anomalies: A SWEET Approach, In the Eighth IEEE International Conference on Data Mining (ICDM '08), pp. 851-856, Pisa, Italy, December 15-19, 2008. (Source: http://www.sfgate.com/cgi-bin/news/oilspill/busan)
Teleconnections • Problem • Find remote connections • Example • El Niño in Pacific • Why is it hard? • Large spatial dataset • Long time series Global Influence of El Nino during the Northern Hemisphere Winter (D: Dry, W: Warm, R: Rainfall) Dead Zone, Gulf of Mexico
Cascading spatio-temporal patterns (CSTPs) Time T1 Time T2 > T1 Time T3>T2 Aggregate(T1,T2,T3) CSTP: P1 • Input: Crime reports with location and time. • Output: Cascading spatio-temporal patterns C {Bar Closing} B A Why are CSTPs important ? • Why is discovering CSTPs hard ? • Trade off between computational efficiency and statistical interpretation. • Pattern space exponential in number of event types. {Vandalism} {Assault} Assault(A) Drunk Driving (C) Bar Closing(B) • Why are CSTPs Novel/better ? • Current STDM literature ignores spatio-temporal semantics(e.g. partial order) a Bar closing a generator for crime related CSTP! • Courtsey: www.startribune.com Results: • Contributions • Interest measure: Cascade participation index lower bound on conditional probability. • Computational Structure • Compute measure efficiently • Avoid unnecessary measure computations Bar locations in Lincoln, NE CPI = 0.022; CPI-Downtown = 0.11 • Pradeep Mohan, Shashi Shekhar, James A. Shine, James P. Rogers. Cascading spatio-temporal pattern discovery: A summary of results. In Proc. of 10th SIAM International Data Mining (SDM) 2010, Columbus, OH, USA • Pradeep Mohan, Shashi Shekhar, James A. Shine, James P. Rogers. Cascading spatio-temporal pattern discovery. IEEE Transactions on Knowledge and data engineering(Accepted, In Press).
A K-Main Routes Approach to Spatial Network Activity Summarization Dev Oliver, Shashi Shekhar, James M. Kang, Renee Bousselaire, Abdussalam Bannur Problem Statement: The spatial network activity summarization (SNAS) problem: Given a spatial network and a collection of activities (e.g., crime reports, emergency requests), find a set of k paths to summarize the activities. K-Means Output K-Means Output KMR Output KMR Output Input Input KMR uses paths instead of ellipses in summarizing activities Importance: SNAS is important for crime analysis and disaster response. • Contribution • The K-Main Routes (KMR) algorithm • Discovers k paths to summarize activities. • Generalizes K-means for network space but uses paths instead of ellipses to summarize activities. • Improves performance by using a network voronoi technique to assign activities to summary paths and a divide and conquer method to recompute summary paths. • Challenge: • Computational Complexity • Choose(N,2) paths, given N nodes • Exponential number of k subsets of paths Related Work: • Results • Proposed two new algorithms for improving the performance of KMR: Network Voronoi activity Assignment (NOVA) and Divide and conquer Summary PAthREcomputation (D-SPARE). • Validation via case studies, experiments and analytical evaluation to verify correctness in context of real workloads. • Successfully transferred software for direct evaluation by the National Geospatial-Intelligence Agency.
Abrupt Change Interval Detection • Given: • A path S in a Spatiotemporal Dataset • A unit-interval change abruptness threshold a • A sameness degree threshold sd • Find: • Dominant ST sub-intervals of S with persistently abrupt change • Objective: • Reduce Computational Cost • Constraints: • Correctness & Completeness Vegetation cover in Africa, August 1-15, 1981. Abrupt vegetation cover change in Africa, August 1-15, 1981. • Results: • Temporal intervals of abrupt rainfall change in Sahel, Africa. Longitudinal spatial abrupt change of vegetation cover in Africa. Publication: Xun Zhou, Shashi Shekahr, Pradeep Mohan, Stefan Liess, Peter K. Snyder, Discovering Interesting Sub-paths in Spatiotemporal Datasets: A Summary of Results. In Proc. 19th Intl’ Conf. Advances on Geographical Information Systems (ACM GIS 2011), Nov 2011, Chicago, IL, USA.
Fuel Efficient Routing INPUT: Road network; a source and destination; a time interval OUTPUT: A path between source and destination for each start time OBJECTIVE: The path should be fuel efficient. Venkata M. V. Gunturi, Ernesto Nunes, KwangSoo Yang, and ShashiShekhar. 2011. A critical-time-point approach to all-start-time lagrangian shortest paths: a summary of results. In SSTD'11, pp 74--91
Evacuation Planning University of Minnesota 2006 Annual Report (http://www.research.umn.edu/communications/publications/documents/OVPRAnnualRpt06.pdf)
Evacuation Planning System in Cloud Environment Why Evacuation Planning? System Architecture for Cloud Environment Hurricane Andrew Florida and Louisiana, 1992 • Lack of effective evacuation plans • Traffic congestions on all highways • Great confusions and chaos "We packed up Morgan City residents to evacuate in the a.m. on the day that Andrew hit coastal Louisiana, but in early afternoon the majority came back home. The traffic was so bad that they couldn't get through Lafayette." Mayor Tim Mott, Morgan City, Louisiana ( http://i49south.com/hurricane.htm ) ( National Weather Services) Hurricane Rita Gulf Coast, 2005 ( www.washingtonpost.com) A Real Scenario (Monticello): Result Routes Hurricane Rita evacuees from Houston clog I-45. ( National Weather Services) ( FEMA.gov) Problem Statement • Given • Transportation network with capacity constraints • Initial number of people to be evacuated and their initial locations • Evacuation destinations • Output • Routes to be taken and scheduling of people on each route • Objective • Minimize total time needed for evacuation • Minimize computational overhead • Constraints • Capacity constraints: evacuation plan meets capacity of the network • Network data size is too large. (Data are stored into secondary storage) • Utilize cloud environment for scalability
Group Alumni • Academia: • Mete Celik (Erciyes Univ.) • Jin SoungYoo (IU-Purdue Univ. Indy) • HuiXiong (Rutgers Univ.) • Yan Huang (Univ. of North Texas) • Wei Li Wu (U. of Texas, Dallas) • Chang-Tien Lu (Virginia Polytechnic Univ) • Sanjay Chawla (Univ. of Sydney) • Du-Ren Liu (National Chiao Tung Univ.) • Andrew Yang (Univ. of Houston). • Government Agency: • James Kang (USDOD) • RangaRajuVatsavai(USDOE-ORNL) • Industry: • Betsy George (Oracle Spatial) • Qingsong Lu (Microsoft Research) • Sangho Kim (ESRI) • BarisKazar (Oracle Spatial) • Pusheng Zhang (Microsoft Virtual Earth) • Xuan Liu (IBM TJ Watson) • SivaRavada (Oracle) • Mark Coyle (Appirio) • BabakHamidzadeh (Boeing Research)
Location prediction: nesting sites Spatial outliers: sensor (#9) on I-35 Nest locations Distance to open water Vegetation durability Water depth Co-location Patterns Tele connections Spatial/Spatio-temporal Data Mining: Representative Project (Ack: In collaboration w/V. Kumar, M. Steinbach, P. Zhang)
Shortest Paths Storing graphs in disk blocks Evacutation Route Planning only in old plan Only in new plan In both plans Parallelize Range Queries Spatial Databases: Representative Projects
Co-location Patterns • Given: • A collection of different types of spatial event • Find: • Co-located subsets of event types • Objective: • Minimize computation time Yan Huang, Shashi Shekhar, and Hui Xiong, Discovering Co-location Patterns from Spatial Datasets: A General Approach, IEEE Transactions on Knowledge and Data Engineering (TKDE), 16(12), pp. 1472-1485, December 2004. (Earlier version appeared in SSTD ’01)
Spatial Outlier Detection • Given: • A spatial graph G={V,E} • A neighbor relationship (K neighbors) • An attribute function f : V -> R • An aggregation function : faggr :R k -> R • Confidence level threshold • Find: • O = {vi | vi V, vi is a spatial outlier} • Objective: • Correctness: The attribute values of vi is extreme, compared with its neighbors • Computational efficiency • Constraints: • Attribute value is normally distributed • Computation cost dominated by I/O op. S. Shekhar, C.T. Lu, and P. Zhang. A unified approach to detecting spatial outliers. GeoInformatica, 7(2), 2003 (Earlier version appeared in SIGKDD ’01).
Location Prediction: Spatial Auto-regression • Given: • Spatial Framework S={s1,…,sn} • Explanatory functions: fxi : S->R • A dependent class: fy : S->[0,1] • A family ζ of function mappings: R x…x R -> [0,1] • Find: • Classification model: f^yЄζ • Objective: • Maximize classification accuracy • Constraints: • Spatial Autocorrelation exists Nest locations Distance to open water S. Shekhar, P. Schrater, R. Vatsavai, W. Wu, and S. Chawla, Spatial Contextual Classification and Prediction Models for Mining Geospatial Data, In IEEE Transactions on Multimedia (special issue on Multimedia Dataabses) p174-188, 2002. Water depth Vegetation durability
Eco-Routing U.P.S. Embraces High-Tech Delivery Methods (July 12, 2007) By “The research at U.P.S. is paying off. ……..— saving roughlythree million gallons of fuel in good part bymapping routes thatminimize left turns.” Do you idle at green light during traffic congestion? • Minimize fuel consumption and GPG emission • rather than proxies, e.g. distance, travel-time • avoid congestion, idling at red-lights, turns and elevation changes, etc.
Evacuation Planning: A Real Scenario, New Plan Routes Experiment Result Total evacuation time: - Existing Plan: 268 min. - New Plan: 162 min. Monticello Power Plant Source cities Destination Routes used only by old plan Routes used only by result plan of capacity constrained routing Routes used by both plans Congestion is likely in old plan near evacuation destination due to capacity constraints. Our plan has richer routes near destination to reduce congestion and total evacuation time. Twin Cities