Change Analysis in Spatial Data by Combining Contouring Algorithms with Supervised Density Functions. PAKDD 2009, Bangkok, Thailand. April 29, 2009 Chun Sheng Chen 1 , Vadeerat Rinsurongkawong 1 , Christoph F. Eick 1 , and Michael D. Twa 2
Change Analysis in Spatial Data by Combining Contouring Algorithms with Supervised Density Functions PAKDD 2009, Bangkok, Thailand. April 29, 2009 Chun Sheng Chen1 , Vadeerat Rinsurongkawong1, Christoph F. Eick1, and Michael D. Twa2 1 Department of Computer Science, University of Houston 2 College of Optometry, University of Houston
Abstract • Detecting changes in spatial datasets is important for many fields such as • early warning systems that monitor environmental conditions or sudden disease outbreaks, • epidemiology, • crime monitoring, and • automatic surveillance. • To address this need, this paper introduces a novel methodology and algorithms that discover patterns of change in spatial datasets.
Outline • Introduction • Contributions • Supervised Density Estimation • Contour Clustering Algorithm • Contour Polygons • Change Analysis Approaches • Change Analysis Predicates • Demonstration • Related Work • Summary and Future Work
1. Introduction We are interested in finding what patterns emerged between two datasets, Oold and Onew, sampled at different time frames. Change analysis centers on identifying changes concerning interesting regions with respect to Oold and Onew. The approach employs supervised density functions [Jiang 2007] that create density maps from spatial datasets. Regions (contiguous areas in the spatial subspace) where density functions take high (or low) values are considered interesting by this approach. Interesting regions are identified using contouring techniques.
2. Contributions • In general, our work is a first step towards analyzing complex change patterns. • The contributions of this paper include: 1) using density functions in contouring algorithms; 2) conducting change analysis by interestingness comparison; 3) computing degrees of change using polygon operations; 4) introducing a novel change analysis approach that compares clusters derived from supervised density functions.
3. Supervised Density Estimation In particular, the influence of object o ∈ O on a point v ∈ F is defined as: The overall influence of all data objects o_i ∈ O for 1 ≤ i ≤ n on a point v ∈ F is measured by the density function ψ_O(v), which is defined as follows:
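A minimal sketch of such a density function, assuming a Gaussian-weighted influence in which each object contributes its variable of interest z(o), damped by its distance to v. The Gaussian form, the bandwidth `sigma`, and the function names are assumptions made for illustration; they are not taken from the slides.

```python
import math

def influence(v, o, sigma=1.0):
    """Assumed influence of object o = ((x, y), z) on point v = (x, y)."""
    (ox, oy), z = o
    squared_dist = (v[0] - ox) ** 2 + (v[1] - oy) ** 2
    # z(o) scales a Gaussian kernel, so objects with negative z lower the density.
    return z * math.exp(-squared_dist / (2.0 * sigma ** 2))

def density(v, objects, sigma=1.0):
    """Overall influence of all objects on point v: the sum of the influences."""
    return sum(influence(v, o, sigma) for o in objects)

# Hypothetical objects: ((x, y), z)
objects = [((0.0, 0.0), 1.0), ((1.0, 0.0), -0.5), ((5.0, 5.0), 2.0)]
value_at_origin = density((0.0, 0.0), objects)
```

Because z(o) can be negative, the resulting density map has both hills and valleys, which is what lets high and low regions both count as interesting.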
4. DCONTOUR: A Contour Clustering Algorithm We have developed a contour clustering algorithm named DCONTOUR that combines contouring algorithms and density estimation techniques.
5. Contour Polygons In our approach, interesting regions (clusters) are represented by polygons. Our change analysis operates on sets of polygons using polygon operations such as intersection, union, difference, and size (area).
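Of the polygon operations listed, size (area) is the simplest to illustrate. The sketch below uses the standard shoelace formula and assumes a simple (non-self-intersecting) polygon given as a list of vertices. Intersection, union, and difference need a polygon clipping routine or a geometry library and are not shown. The function name is illustrative.

```python
def polygon_area(vertices):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x1 * y2 - x2 * y1
    # The sign depends on vertex orientation, so take the absolute value.
    return abs(twice_area) / 2.0

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (4, 0), (0, 3)]
```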
7. Change Analysis Predicates • We introduce basic predicates that capture different relationships for change analysis. • Change analysis predicates operate on polygons. • Agreement between r and r’ can be computed as follows: • Agreement(r,r’) = |r ∩ r’| / |r ∪ r’| • The most similar region r’ in X’ with respect to r in X is the region r’ for which Agreement(r,r’) has the highest value.
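The agreement predicate is the ratio of intersection area to union area. As a rough sketch, the code below approximates each polygon by the set of unit grid cells it covers, so that set intersection and union stand in for the polygon operations. This rasterization and the helper names (`cells`, `agreement`, `most_similar`) are assumptions for illustration, not the polygon-based computation used in the talk.

```python
def cells(x0, y0, x1, y1):
    """Hypothetical helper: grid cells covered by an axis-aligned rectangle."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

def agreement(r, r_prime):
    """|r ∩ r'| / |r ∪ r'| on rasterized regions (sets of grid cells)."""
    union = r | r_prime
    if not union:
        return 0.0  # two empty regions: define agreement as 0
    return len(r & r_prime) / len(union)

def most_similar(r, new_regions):
    """Region of the new clustering with the highest agreement with r."""
    return max(new_regions, key=lambda candidate: agreement(r, candidate))

r = cells(0, 0, 4, 4)
candidates = [cells(2, 2, 6, 6), cells(1, 0, 5, 4), cells(10, 10, 12, 12)]
best = most_similar(r, candidates)  # the rectangle shifted by one cell
```

Agreement is 1 for identical regions and 0 for disjoint ones, so it can be read directly as a degree of overlap.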
7. Change Analysis Predicates • In addition to agreement, we also define the predicates novelty, relative-novelty, disappearance and relative-disappearance below. • Novelty(r’) = r’ − (r1 ∪ … ∪ rk) • Relative-Novelty(r’) = |r’ − (r1 ∪ … ∪ rk)| / |r’| • Disappearance(r) = r − (r’1 ∪ … ∪ r’k) • Relative-Disappearance(r) = |r − (r’1 ∪ … ∪ r’k)| / |r| • We claim that the above and similar measurements are useful to identify what is new in a changing environment. • Moreover, the predicates we introduced so far can be used as building blocks to define more complex predicates.
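These four predicates can be sketched in the same spirit. The code below assumes each region is approximated by a non-empty set of grid cells, so that set difference and union replace the polygon operations. The rasterization and all function names are illustrative assumptions.

```python
def cells(x0, y0, x1, y1):
    """Hypothetical helper: grid cells covered by an axis-aligned rectangle."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

def union_all(regions):
    """Union of a collection of rasterized regions."""
    merged = set()
    for region in regions:
        merged |= region
    return merged

def novelty(r_new, old_regions):
    """Part of a new region that no old region covers."""
    return r_new - union_all(old_regions)

def relative_novelty(r_new, old_regions):
    """Fraction of the new region that is novel (r_new must be non-empty)."""
    return len(novelty(r_new, old_regions)) / len(r_new)

def disappearance(r_old, new_regions):
    """Part of an old region that no new region covers."""
    return r_old - union_all(new_regions)

def relative_disappearance(r_old, new_regions):
    """Fraction of the old region that disappeared (r_old must be non-empty)."""
    return len(disappearance(r_old, new_regions)) / len(r_old)

old_regions = [cells(0, 0, 4, 4)]
new_regions = [cells(2, 0, 6, 4)]  # the old region shifted two cells to the right
```

In this example half of the new region is novel and half of the old region has disappeared, which matches the intuition of a region that moved.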
8. Demonstration • We uniformly sampled earthquakes • Oold: January 1986 to November 1991 • Onew: December 1991 to January 1996 • Each dataset contains 4132 earthquakes. • We analyze changes in strong positive or negative correlations between the depth of the earthquake and the severity of the earthquake. • The variable of interest, z(o) is defined as follows:
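One plausible way to build a variable of interest that is strongly positive where depth and severity deviate in the same direction, and strongly negative where they deviate in opposite directions, is the product of their z-scores. The sketch below assumes that definition purely for illustration. The function names and sample values are hypothetical.

```python
import statistics

def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mu) / sd for v in values]

def variable_of_interest(depths, severities):
    """Assumed z(o): product of the depth and severity z-scores of each object."""
    return [zd * zs for zd, zs in zip(z_scores(depths), z_scores(severities))]

# Hypothetical earthquakes where depth and severity rise together
depths = [10.0, 20.0, 30.0, 40.0]
severities = [4.0, 5.0, 6.0, 7.0]
z = variable_of_interest(depths, severities)
```

Under this assumption the mean of z(o) over a dataset equals the Pearson correlation between depth and severity, so regions of high or low density of z correspond to locally strong positive or negative correlation.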
8. Demonstration Contour polygons generated by DCONTOUR for Oold (upper-left figure) and Onew (lower-right figure).
8. Demonstration Overlap of contour polygons of Oold and Onew • Agreement(r,r’) = |r ∩ r’| / |r ∪ r’| Novel polygons of Onew with respect to Oold • Novelty(r’) = r’ − (r1 ∪ … ∪ rk)
8. Demonstration Contour polygons generated by DCONTOUR for Oold (left figure) and Onew (right figure). Overlap of contour polygons of Oold and Onew. Novel polygons of Onew with respect to Oold.
9. Related Work • Our change analysis approach relies on clustering analysis. • The advantage of our change analysis approaches over the previous work [Asur 2007], [Fleder 2006], [Spiliopoulou 2006] is that we can detect various types of changes in data with continuous attributes and unknown object identity. • Existing contour plotting algorithms can be seen as variations of two basic approaches: • Level curve tracing algorithms [Watson 1992] scan a grid and mark grid-cell boundaries that are passed by the level curve. Contour polygons are created by connecting the marked edges. • Recursive subdivision algorithms [Bruss 1977] start with a coarse initial grid and recursively divide grid cells that are passed by the level curve. • DCONTOUR uses level curve tracing.
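The edge-marking step of level curve tracing can be sketched as follows: scan the grid, and wherever the density values at the two ends of a grid edge lie on opposite sides of the contour level, record the crossing point by linear interpolation. This is a simplified illustration under assumed names (`crossed_edges`, `grid`, `level`), not the algorithm of [Watson 1992] or the DCONTOUR implementation. Connecting the marked edges into polygons is omitted.

```python
def crossed_edges(grid, level):
    """Return (row, col) points where the level curve crosses grid edges.

    grid[i][j] holds the density value at grid node (i, j).
    """
    crossings = []
    rows, cols = len(grid), len(grid[0])
    for i in range(rows):
        for j in range(cols):
            a = grid[i][j]
            # Check the edge to the right neighbor and the edge to the lower neighbor.
            for di, dj in ((0, 1), (1, 0)):
                ni, nj = i + di, j + dj
                if ni >= rows or nj >= cols:
                    continue
                b = grid[ni][nj]
                # The edge is crossed when its endpoints lie on opposite sides of the level.
                if (a < level) != (b < level):
                    t = (level - a) / (b - a)  # a != b is guaranteed here
                    crossings.append((i + t * di, j + t * dj))
    return crossings

# A single density peak in the middle of a 3 x 3 grid
grid = [[0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0]]
points = crossed_edges(grid, level=0.5)
```

For this grid the four crossing points surround the central node, outlining the contour that would enclose the peak.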
10. Summary Developing techniques for discovering change in spatial datasets is important. The work we describe here has the advantage of detecting change for continuous attributes and for objects that are not identified a priori. In this paper, change analysis techniques are proposed that compare clusters for the old and new data based on a set of change predicates. A novel contour clustering algorithm named DCONTOUR that combines supervised density functions with contouring algorithms has been introduced.
10. The Ultimate Vision of this Research • Development of change analysis systems that automatically detect important changes in spatial datasets • The change analysis system provides reusable components that can be used for any problem that requires continuous monitoring of spatio-temporal events • Embedding the change analysis system itself into bigger systems that solve critical problems of our society, such as automatic surveillance systems, early warning systems and diagnostic tools • Mining the patterns of changes themselves to detect complex patterns such as the progression of pollution and diseases • Contributing to important scientific disciplines, such as epidemiology, that require the analysis of complex patterns of change
Thank you for your attention. Questions?