1 / 60

6350 Spatio -temporal Data Processing Course Overview

6350 Spatio -temporal Data Processing Course Overview. Yan Huang huangyan@unt.edu. Basic Information. Instructor: Yan Huang ( huangyan at unt.edu) Meeting place and time: M 2:30-:520pm B157 Office hours: M 12:30-2:30pm. Basic Information. TA : Sasi Koneru (SasiKoneru@my.unt.edu)

verna
Download Presentation

6350 Spatio -temporal Data Processing Course Overview

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 6350 Spatio-temporal Data Processing Course Overview Yan Huang huangyan@unt.edu

  2. Basic Information • Instructor: Yan Huang (huangyan at unt.edu) • Meeting place and time: M 2:30-:520pm B157 • Office hours: M 12:30-2:30pm

  3. Basic Information • TA: Sasi Koneru (SasiKoneru@my.unt.edu) • Office hours: Monday 10:00 AM to 2:00 PM, F208

  4. Evaluation • The evaluation scheme will be • class participation 10% • paper analysis and presentation - 25% • project - 40%. • Term paper – 30%

  5. Classroom policy • No computers or laptops unless told so.

  6. Paper Analysis I • Collect 5 or more papers in one sub-area • Write short summaries for 3 (100-200 words) • Make a 15 minutes presentation on what you learn on this topic • The presentation will take an integrated approach where you introduce the motivation of the three papers, give a precise problem definition, compare and contrast the ways the 3 papers approach the problem and how they validate their results, present conclusions, and point to some future directions if you can identify

  7. Paper Analysis II • Choose and present one paper from the reading list • Collect two questions from each group • Ask two questions yourself • Lead group discussion • Detail instructions are available from: • http://www.cse.unt.edu/~huangyan/6350/paperAnalysis.txt • One paper every week

  8. Find Related Work • Need to know the key words • May need to explore and refine during your search • Often you can find electronic version of the papers, especially for publications related to computer science • Author’s website • ACM digital library • IEEE xplore • Springer Online • Google scholar • You school typically subscribes to these publishers • Search from a computer with IP address belonging to your school

  9. Computer Science Bibliography Collections • CiteSeer • http://citeseer.ist.psu.edu/ • DBLP • http://www.informatik.uni-trier.de/~ley/db/ • Google Scholar • http://scholar.google.com/ • ACM Digital Library • http://portal.acm.org/dl.cfm • IEEE Xplore • http://portal.acm.org/dl.cfm

  10. One Way to Find Related Papers

  11. Term Project • ACMGIS CUP 2014 • Team of up-to 2 person • March 03, 10 minutes presentation on algorithm design and cost analysis • Score is based on normalized grade you get from submission.

  12. Term Paper • Two choices • Term paper • Survey paper

  13. Term paper • Research oriented • Key components: • Problem Statement, Significance of the problem • Related Work and Our Contributions • Proposed Approach • Validation of listed contributions (experimental, analytical) • Conclusions and Future Work

  14. Survey paper • Key components • Problem Statement, Significance of the problem • Our Contributions (usually it is the categorization/classification of the research literature) • A classification of the papers related to the problem. Use a concept hierarchy, figures, and diagrams if necessary. • Summarize, classify, contrast, and compare the research literature according to your classification scheme • A summary of the trend and future work of this line of research. • Conclusion.

  15. Spatial Databases (SDBMS) • Traditional (non-spatial) database management systems provide: • Persistence across failures • Allows concurrent access to data • Scalability to search queries on very large datasets which do not fit inside main memories of computers • Efficient for non-spatial queries, but not for spatial queries • Non-spatial queries: • List the names of all bookstore with more than ten thousand titles. • List the names of ten customers, in terms of sales, in the year 2001 • Use an index to narrow down the search • Spatial Queries: • List the names of all bookstores with ten miles of Minneapolis • List all customers who live in Tennessee and its adjoining states • List all the customers who reside within fifty miles of the company headquarter

  16. Value of SDBMS • Examples of non-spatial data • Names, phone numbers, email addresses of people • Examples of Spatial data • Census Data • NASA satellites imagery - terabytes of data per day • Weather and Climate Data • Rivers, Farms, ecological impact • Medical Imaging • Exercise: Identify spatial and non-spatial data items in • A phone book • A Product catalog

  17. User, Application domains • Many important application domains have spatial data and queries. Some Examples follow: • Army Field Commander: Has there been any significant enemy troop movement since last night? • Insurance Risk Manager: Which homes are most likely to be affected in the next great flood on the Mississippi? • Medical Doctor: Based on this patient's MRI, have we treated somebody with a similar condition ? • Molecular Biologist:Is the topology of the amino acid biosynthesis gene in the genome found in any other sequence feature map in the database ? • Astronomer:Find all blue galaxies within 2 arcmin of quasars. • Exercise: List two ways you have used spatial data. Which software did you use to manipulate spatial data?

  18. SDBMS • A SDBMS is a software module that • can work with an underlying DBMS • supports spatial data models, spatial abstract data types (ADTs) and a query language from which these ADTs are callable • supports spatial indexing, efficient algorithms for processing spatial operations, and domain specific rules for query optimization • Example: Oracle Spatial data cartridge, ESRI SDE • can work with Oracle DBMS • Has spatial data types (e.g. polygon), operations (e.g. overlap) callable from SQL3 query language • Has spatial indices, e.g. R-trees • IBM: Spatial Option • Informix: Spatial Datablade

  19. SDDMB vs. GIS • GIS is a software to visualize and analyze spatial data using spatial analysis functions such as • Search Thematic search, search by region, (re-)classification • Location analysis Buffer, corridor, overlay • Terrain analysis Slope/aspect, catchment, drainage network • Flow analysis Connectivity, shortest path • Distribution Change detection, proximity, nearest neighbor • Spatial analysis/Statistics Pattern, centrality, autocorrelation, indices of similarity, topology: hole description • Measurements Distance, perimeter, shape, adjacency, direction • GIS uses SDBMS • to store, search, query, share large spatial data sets

  20. SDBMS vs. GIS • SDBMS focuses on • Efficient storage, querying, sharing of large spatial datasets • Provides simpler set based query operations • Example operations: search by region, overlay, nearest neighbor, distance, adjacency, perimeter etc. • Uses spatial indices and query optimization to speedup queries over large spatial datasets. • SDBMS may be used by applications other than GIS • Astronomy, Genomics, Multimedia information systems, ...

  21. Issues in SDBMS • Spatial data model • Query language • Query processing • File organization and indices • Query optimization, etc.

  22. Spatio-temporal Databases • Add temporal dimension • Examples: • Trajectories • Evolving region • Moving points

  23. Geo-stream databases • Many data are generated continuously • Transaction data • Network monitoring • Financial application • Most recent data are commonly queried in a one-pass fashion • Monitoring • Aggregation • Database system provides abstractions and declarative languages that stream processing can benefit from

  24. Stream Application • Environmental monitoring • Patient monitoring • Finance • Network monitoring • Click-streams • Transaction monitoring • Traffic analysis • Moving object queries • Sensor network • RFID

  25. Sample Applications • Environmental monitoring • Notify me when UV is high, temperature is low • Traffic monitoring • Traffic jam: aggregated speed much below speed limit on a road segment for extended time • Accident: vehicle on unintended space, e.g. high way for longer than expected time • Click-streams • Find the school districts of the houses that the user browses the most.

  26. Geo-streams • Current streams systems lack native spatial support • Spatial stream queries are common in • traffic monitoring • environment monitoring • moving object databases

  27. Location Privacy

  28. Route prediction • Next position • Next stop • The entire route • Application: • Mobile commerce • Save energy • Traffic notification

  29. Location-based social networking • Social networking with location • Loopts • Google latitude • Geocache • Social dynamics • Iphone applications

  30. Volunteer Geographic Information System • OpenStreetMap, • Wikimapia • Foursquare • Trapster

  31. Spatio-temporal Analytics • The analysis of data with both spatial and temporal information • The data are spatially and/or temporally correlated "Everything is related to everything else, but near things are more related than distant things."

  32. Why do we need spatio-temporal analytics • Analytics help us to describe what happened in the past, understand what is happening now, predict what will happen in the future, and make decisions. • The proliferation of sensor devices makes spatio-temporal information a fundamental component for almost every analytical applications

  33. Types of Spatio-Temporal Analytics Methods • Visualization and exploratory analysis • Segmentation (classification and clustering) • Outlier analysis • Colocation mining • Dependency analysis • Trend discovery

  34. Data Visualization and Exploratory Analysis • Map querying task • Static query (one-time query using map tools available on the interface) • Dynamic query[36] (setup of event alert conditions) • Spatial constraints are expressed using the map, while temporal constraints are expressed as linear time moments[37] • Map animation[38] • Focusing, linking and arranging views[39] • Map iteration[40] • Existential changes[25] • Location changes • Attribute Changes

  35. Data Visualization and Exploratory Analysis: Example

  36. Segmentation methods • Classification[41] • Spatial classification: decision tree, Bayesian, ANN… • Temporal classification: decision tree, Bayesian, ANN… • Temporal extensions to spatial classification/ Spatial extension to temporal classification • Clustering[42] • Spatial clustering: partitioning method, hierarchical method, density based method, and grid-based method. • Temporal clustering • Interactive spatio-temporal clustering: perform clustering spatially or temporally and then test whether the cluster exist in both dimensions (EMM Test[43]) • Simultaneous spatio-temporal clustering: space-time scan[44]

  37. More on Spatio-Temporal Clustering

  38. More on Spatio-Temporal Clustering • Model-based clustering[46] • define a multivariate density distribution and look for a set of fitting parameters for the model. • Distance-based method • Moving object similarity search • Density-based method • DBSCAN extensions, OPTICS[47] • Flocks and convoy • Moving clusters[47] • Applications: movement data, cellular networks, environment data…

  39. Spatio-Temporal Clustering: Example

  40. Spatio-Temporal Outlier Analysis • Definition of outliers • “spatial-temporal object whose thematic attribute values are significantly different from those of other spatially and temporally referenced objects in its spatial or/and temporal neighborhoods”. • Methods[48] • Clustering-based approach • Distance based approach • Computational geometry based approach • Spatial scan based approach

  41. Spatio-Temporal Outlier Detection: Example

  42. Co-Location Mining • Colocation mining finds subset of Boolean features located in spatial proximity • Methods[50] • Data mining-based approach • Spatial statistical approach • Buffer-based model • Temporal extension: mixed-drove approach, weighted window-based model[51]

  43. Co-Location Mining: Example

  44. Other methods • Association rule mining • Spatial preprocessing is required to discretize spatial measurements • Methods[49] • Bayesian networks • Hieratical approach • Trend discovery • Regression • Sequence mining

  45. List of Current Spatio-Temporal Analytics Tools • Commercial • ESRI ArcGIS series • Microsoft SQL Spatial +StreamInsight • Other commercial tools • Open source/free software • Descartes and CommonGIS • MapServer • Other free tools

  46. ESRI ArcGIS Series • ArcGIS desktop and server provide most advanced and complete toolkit • Has many extensions for different domains • Can use APIs to develop extensions, web or desktop applications for customized needs. Many other commercial tools such as CUBE[9] are built on top of ArcGIS.

  47. ESRI ArcGIS Desktop and Server Extensions[1] • 3D Extension (Desktop and Server) • Analyze terrain data, model subsurface features, view and analyze impact zones, determine optimum facility placement, share 3D views, create a 3D virtual city. • Geostatistical Extension (Desktop and Server) • Visualize, model, and predict spatial relationships. • Link data, graphs, and maps dynamically. • Perform deterministic and geostatistical interpolation. • Evaluate models and predictions probabilistically

  48. ESRI ArcGIS Desktop and Server Extensions • Network Extension (Desktop and Server) • Dynamically model realistic network conditions and solve vehicle routing problems • Multipoint optimized routing, time-sensitive, turn-by-turn driving directions , allocation of service areas, determining the fastest fixed route to the closest facility  • Schematics Extension (Desktop and Server) • Rapid checking of network connectivity • Automatically generate schematics

  49. ESRI ArcGIS Desktop and Server Extensions • Spatial extension (Desktop and Server) • Comprehensive, raster-based spatial modeling and analysis. • Survey Extension (Desktop) • Capture, edit, and leverage land records using proven survey methodologies • Tracking Extension (Desktop) • Create time series visualizations so you can analyze information relative to time and location

  50. ESRI Domain-Specific Solutions • ESRI Business Analyst Online • Web-based solution that combines GIS technology with extensive demographic, consumer spending, and business data for the entire United States to deliver on-demand, boardroom-ready reports and maps • Perform drive-time analysis • Analyze trade areas • Evaluate sites • Identify most profitable customers and reach customers

More Related