160 likes | 386 Views
6. Spatial Mining. Spatial Data and Structures Images Spatial Mining Algorithms. Definitions. Spatial data is about instances located in a physical space Spatial data has location or geo-referenced features Some of these features are: Address, latitude/longitude (explicit)
E N D
6. Spatial Mining Spatial Data and Structures Images Spatial Mining Algorithms Data Mining by H. Liu, ASU
Definitions • Spatial data is about instances located in a physical space • Spatial data has location or geo-referenced features • Some of these features are: • Address, latitude/longitude (explicit) • Location-based partitions in databases (implicit) Data Mining by H. Liu, ASU
Applications and Problems • Geographic information systems (GIS) store information related to geographic locations on Earth • Weather, climate monitoring, community infrastructure needs, disaster management, and hazardous waste • Homeland security issues such as prediction of unexpected events and planning of evacuation • Remote sensing and image classification • Biomedical applications include medical imaging and illness diagnosis Data Mining by H. Liu, ASU
Use of Spatial Data • Map overlay – merging disparate data • Different views of the same area: (Level 1) streets, power lines, phone lines, sewer lines, (Level 2) actual elevations, building locations, and rivers • Spatial selection – find all houses near ASU • Spatial join – nearest for points, intersection for areas • Other basic spatial operations • Region/range query for objects intersecting a region • Nearest neighbor query for objects closest to a given place • Distance scan asking for objects within a certain radius Data Mining by H. Liu, ASU
Spatial Data Structures • Minimum bounding rectangles (MBR) • Different tree structures • Quad tree • R-Tree • kd-Tree • Image databases Data Mining by H. Liu, ASU
MBR • Representing a spatial object by the smallest rectangle [(x1,y1), (x2,y2)] or rectangles (x2,y2) (x1,y1) Data Mining by H. Liu, ASU
Tree Structures • Quad Tree: every four quadrants in one layer forms a parent quadrant in an upper layer • An example Data Mining by H. Liu, ASU
R6 R8 R1 R7 R2 R3 R4 R5 R-Tree • Indexing MBRs in a tree • An R-tree order of m has at most m entries in one node • An example (order of 3) R8 R6 R7 R1 R2 R3 R4 R5 Data Mining by H. Liu, ASU
kd-Tree • Indexing multi-dimensional data, one dimension for a level in a tree • An example Data Mining by H. Liu, ASU
Common Tasks dealing with Spatial Data • Data focusing • Spatial queries • Identifying interesting parts in spatial data • Progress refinement can be applied in a tree structure • Feature extraction • Extracting important/relevant features for an application • Classification or others • Using training data to create classifiers • Many mining algorithms can be used • Classification, clustering, associations Data Mining by H. Liu, ASU
Spatial Mining Tasks • Spatial classification • Spatial clustering • Spatial association rules Data Mining by H. Liu, ASU
Spatial Classification • Use spatial information at different (coarse/fine) levels (in different indexing trees) for data focusing • Determine relevant spatial or non-spatial features • Perform normal supervised learning algorithms • e.g., Decision trees, NBC, etc. Data Mining by H. Liu, ASU
Spatial Clustering • Use tree structures to index spatial data • Cluster locally • DBSCAN: R-tree • CLIQUE: Grid or Quad tree Data Mining by H. Liu, ASU
Spatial Association Rules • Spatial objects are of major interest, not transactions • A B • A, B can be either spatial or non-spatial (3 combinations) • What is the fourth combination? • Association rules can be found w.r.t. the 3 types Data Mining by H. Liu, ASU
Summary • Spatial data can contain both spatial and non-spatial features. • When spatial information becomes dominant interest, spatial data mining should be applied. • Spatial data structures can facilitate spatial mining. • Standard data mining algorithms can be modified for spatial data mining, with a substantial part of preprocessing to take into account of spatial information. Data Mining by H. Liu, ASU
Bibliography • M. H. Dunham. Data Mining – Introductory and Advanced Topics. Prentice Hall. 2003. • R.O. Duda, P.E. Hart, D.G. Stork. Pattern Classification, 2nd edition. Wiley-Interscience. • J. Han and M. Kamber. Data Mining – Concepts and Techniques. 2001. Morgan Kaufmann. Data Mining by H. Liu, ASU