Chapter 5 : Query Processing and Optimization Group 4: Nipun Garg, Surabhi Mithal http://www-users.cs.umn.edu/~smithal/
Chapter Organization
OLD Organization
5.1 Evaluation of Spatial Operations
5.2 Query Optimization
5.3 Analysis of Spatial Index Structures
5.4 Distributed Spatial Database Systems
5.5 Parallel Spatial Database Systems
5.6 Summary
New Organization
5.1 Evaluation of Spatial Operations
- Parallel spatial joins
- Top k spatial joins
5.2 Query Optimization
5.3 Analysis of Spatial Index Structures
5.4 Distributed Spatial Database Systems
5.5 Parallel Spatial Database Systems
5.6 Introduction to query models
5.7 Spatial Query types
• Reverse nearest neighbour queries (RNN)
• Skyline queries
5.8 Trends : Spatial Query Evaluation on Hadoop
5.9 Summary
New Learning Objectives
• Learning Objectives (LO)
• LO2 : Learn about alternative algorithms to process spatial queries
• LO6 : Introduction to query models
• LO7 : Understanding new spatial query types
• LO7.1 : Understanding concept of RNN queries
• LO7.2 : Understanding concept of skyline queries
• LO8 : Trends : Spatial queries on Hadoop MapReduce
• Mapping Sections to learning objectives
• LO2 - 5.1.6
• LO6 - 5.7
• LO7 - 5.8
• LO8 - 5.9
Parallel spatial joins
Concept
• In a parallel architecture, work is distributed among several processors.
• For a spatial join, the work can be distributed in both the filtering and the refinement stages.
Top k spatial joins
Concept
• A spatial join finds all pairs of objects that satisfy a given spatial relation.
• Given two data sets A and B, the top-k spatial join retrieves the k objects in data set A or B that intersect the maximum number of objects from the other data set.
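The top-k spatial join definition above can be sketched in a few lines of Python. This is a minimal brute-force illustration, not an algorithm from the slides: it assumes objects are axis-aligned rectangles given as (xmin, ymin, xmax, ymax), and the names `intersects` and `top_k_spatial_join` are hypothetical.

```python
from typing import List, Tuple

# Assumption: each object is an axis-aligned rectangle (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def intersects(a: Rect, b: Rect) -> bool:
    """True if the two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def top_k_spatial_join(A: List[Rect], B: List[Rect], k: int):
    """Return the k objects from A or B that intersect the most objects
    of the other data set, as (data set name, index, count) triples."""
    counts = []
    for name, objs, others in (("A", A, B), ("B", B, A)):
        for i, r in enumerate(objs):
            n = sum(intersects(r, o) for o in others)
            counts.append((n, name, i))
    # Highest count first; ties broken by data set name, then index.
    counts.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(name, i, n) for n, name, i in counts[:k]]


A = [(0, 0, 10, 10), (20, 20, 21, 21)]
B = [(1, 1, 2, 2), (5, 5, 6, 6), (9, 9, 12, 12), (20.5, 20.5, 30, 30)]
print(top_k_spatial_join(A, B, 2))
```

The brute-force count costs |A| x |B| intersection tests; a practical implementation would likely use a spatial index to prune pairs.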
Example – Parallel spatial join
• Steps
• Task creation: create a set of tasks to be executed in parallel.
• Task assignment
• Task execution
Src: Thomas Brinkhoff, Hans-Peter Kriegel, Bernhard Seeger, "Parallel Processing of Spatial Joins Using R-trees"
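The three steps can be illustrated with a small Python sketch. It is not the R-tree based method of the cited paper: as a simplifying assumption, tasks are created by cutting data set A into fixed-size chunks, and a thread pool stands in for the processors. The names `create_tasks`, `execute_task`, and `parallel_spatial_join` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

# Assumption: each object is represented by its MBR (xmin, ymin, xmax, ymax).
Rect = Tuple[float, float, float, float]


def mbr_intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def create_tasks(A: List[Rect], B: List[Rect], chunk: int):
    """Task creation: one task per chunk of A, joined against all of B."""
    return [(start, A[start:start + chunk], B)
            for start in range(0, len(A), chunk)]


def execute_task(task):
    """Task execution: nested-loop join of one chunk of A with B."""
    start, part, B = task
    return [(start + i, j)
            for i, a in enumerate(part)
            for j, b in enumerate(B)
            if mbr_intersects(a, b)]


def parallel_spatial_join(A, B, workers: int = 4, chunk: int = 2):
    tasks = create_tasks(A, B, chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Task assignment: the pool hands each task to a free worker.
        results = pool.map(execute_task, tasks)
        return sorted(pair for part in results for pair in part)


A = [(0, 0, 1, 1), (2, 2, 3, 3), (10, 10, 11, 11)]
B = [(0.5, 0.5, 2.5, 2.5), (10.5, 10.5, 12, 12)]
print(parallel_spatial_join(A, B))
```

Threads are used only to keep the sketch short; in CPython they do not run CPU-bound work truly in parallel, so a process pool or separate machines would be the closer analogue of the parallel architecture described here.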
LO6: Introduction to query models
Concept
Overview of query models for Oracle Spatial & ArcSDE
Oracle Spatial provides a SQL schema and functions that facilitate the storage, retrieval, update, and query of collections of spatial features in an Oracle database.
Oracle Spatial uses a two-tier query model to resolve spatial queries and spatial joins. It implements the filter-refine paradigm; the two operations are referred to as the primary and secondary filter operations.
• The primary filter permits fast selection of candidate records to pass along to the secondary filter.
• The secondary filter is expensive, but it yields an accurate answer to a spatial query.
Example
• The primary filter checks whether the minimum bounding rectangles (MBRs) of the candidate objects interact, not whether the objects themselves interact.
• The secondary filter ensures that only candidate objects that actually interact are selected.
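The two filters can be illustrated with a generic Python sketch. This is not Oracle Spatial's API; it only mirrors the filter-refine idea. As an assumption for illustration, the exact geometries are circles (cx, cy, r), so the MBR test is cheap and the exact test is a distance check. All function names are hypothetical.

```python
import math
from typing import List, Tuple

# Assumption: exact geometries are circles (cx, cy, r).
Circle = Tuple[float, float, float]


def mbr(c: Circle):
    """Minimum bounding rectangle of a circle: (xmin, ymin, xmax, ymax)."""
    cx, cy, r = c
    return (cx - r, cy - r, cx + r, cy + r)


def mbr_intersects(a, b) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def primary_filter(query: Circle, objects: List[Circle]) -> List[Circle]:
    """Cheap test on MBRs only; may return false positives."""
    q = mbr(query)
    return [o for o in objects if mbr_intersects(q, mbr(o))]


def secondary_filter(query: Circle, candidates: List[Circle]) -> List[Circle]:
    """Exact test on the real geometry of each candidate."""
    qx, qy, qr = query
    return [o for o in candidates
            if math.hypot(o[0] - qx, o[1] - qy) <= qr + o[2]]


query = (0.0, 0.0, 1.0)
objects = [(1.5, 0.0, 1.0), (1.8, 1.8, 1.0), (5.0, 5.0, 1.0)]
candidates = primary_filter(query, objects)
print(candidates)                           # MBRs interact with the query MBR
print(secondary_filter(query, candidates))  # geometries actually interact
```

In this data the circle at (1.8, 1.8) passes the primary filter because its MBR overlaps the query's MBR, but the secondary filter rejects it because the circles themselves do not touch.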
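LO7.1 : Understanding concept of RNN queries
Reverse Nearest Neighbor Queries
• Concept: focuses on inverse relations among points. The reverse nearest neighbors (RNNs) of a point q are the points that have q as their nearest neighbor.
• Example: 5 data points, labelled 1 to 5 [figure]
• What are the RNNs of point 1?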
Algorithm
• Step 1: For each point $p \in S$, determine the distance to the nearest neighbor of p in S, denoted N(p): $N(p) = \min_{q \in S \setminus \{p\}} d(p, q)$. For each $p \in S$, generate a circle (p, N(p)) with center p and radius N(p).
• Step 2: For any query point q (for example, a Target store), determine all the circles (p, N(p)) that contain q and return their centers p.
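The two steps translate almost directly into Python. The sketch below is a brute-force illustration that assumes distinct 2-D points and Euclidean distance; the names `nn_distances` and `rnn` are hypothetical, and a practical implementation would index the circles rather than scan them all.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def nn_distances(S: List[Point]) -> Dict[Point, float]:
    """Step 1: radius N(p) of the circle around each p in S
    (distance from p to its nearest neighbor in S)."""
    return {p: min(math.dist(p, q) for q in S if q != p) for p in S}


def rnn(q: Point, S: List[Point]) -> List[Point]:
    """Step 2: centers of all circles (p, N(p)) that contain q."""
    radii = nn_distances(S)
    return [p for p in S if math.dist(p, q) <= radii[p]]


# Assumption: S holds at least two distinct points; q is a new query point.
S = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
print(rnn((1.5, 0.0), S))
print(rnn((5.5, 0.0), S))
print(rnn((3.0, 0.0), S))
```

A point p is returned exactly when q is at least as close to p as p's current nearest neighbor, which is what "the circle around p contains q" expresses.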
LO7.2 : Understanding concept of skyline queries
Example: You have to attend a conference and are looking for a good hotel for your stay. You want to optimize the hotel search so that both the distance from the conference centre and the price of the booking are low.
Concept
Domination: a point A dominates another point B if and only if the coordinate of A on every axis is not larger than the corresponding coordinate of B (in the usual definition, A must also be strictly smaller than B on at least one axis).
Example
Given a set of points, the skyline query returns the subset of points (referred to as the skyline points) such that no skyline point is dominated by any other point in the dataset.
Example contd.
[Figure: hotels h1 to h13 and regions S1 to S4; axes: Price, Distance from conference center]
Result
[Figure: skyline points h1, h2, and h4; axes: Price, Distance from conference center]
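The domination test and the skyline query can be sketched as follows. The hotel names and their (distance, price) values are made up for illustration and are not the coordinates from the figures; the domination test uses the usual definition (no worse on every axis, strictly better on at least one). The function names are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: a is no larger on every axis and smaller on at least one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def skyline(points: Dict[str, Point]) -> List[str]:
    """Names of the points that no other point dominates (brute force)."""
    return sorted(
        name for name, p in points.items()
        if not any(dominates(q, p)
                   for other, q in points.items() if other != name)
    )


# Hypothetical hotels: (distance from conference center in km, price).
hotels = {
    "a": (1, 200),
    "b": (3, 120),
    "c": (6, 60),
    "d": (4, 150),  # dominated by b
    "e": (7, 90),   # dominated by c
    "f": (2, 220),  # dominated by a
}
print(skyline(hotels))
```

Each skyline hotel represents a different trade-off: none of the others is both closer and cheaper than it.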
Spatial Query Evaluation on Hadoop
• Hadoop
• HDFS : Hadoop Distributed File System
• MapReduce : programming paradigm
Parallel Databases vs. MapReduce
• Parallel DBMS or MapReduce (Hadoop)?
• Conclusion: Hadoop/MapReduce cannot replace a DBMS
• Combination of MapReduce and SQL: Aster Data
A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt, S. Madden & M. Stonebraker, "A comparison of approaches to large-scale data analysis," SIGMOD '09
Spatial Query Evaluation
Map Stage
1) Homogenize data
2) Map to tiles
3) Merge tiles into buckets
Reduce Stage
• Filter to find overlapping MBRs
• Refine results
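The map and reduce stages can be simulated in plain Python, without Hadoop, to show how the pieces fit together. The sketch rests on several assumptions that the slide does not spell out: "homogenize" is taken to mean converting every record to a common (data set tag, id, MBR) form, tiles are cells of a uniform grid, and tiles are merged into buckets by a simple hash. `TILE`, `N_BUCKETS`, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import product

TILE = 10.0    # tile side length (hypothetical)
N_BUCKETS = 4  # number of reduce buckets (hypothetical)


def mbr_intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def tiles_for(mbr):
    """Grid tiles (tx, ty) that an MBR overlaps."""
    xmin, ymin, xmax, ymax = mbr
    xs = range(int(xmin // TILE), int(xmax // TILE) + 1)
    ys = range(int(ymin // TILE), int(ymax // TILE) + 1)
    return product(xs, ys)


def map_stage(tag, records):
    for obj_id, mbr in records:
        # 1) Homogenize: one common record layout for both data sets.
        rec = (tag, obj_id, tuple(float(v) for v in mbr))
        seen = set()
        for tx, ty in tiles_for(rec[2]):               # 2) map to tiles
            bucket = (tx + 31 * ty) % N_BUCKETS        # 3) merge tiles into buckets
            if bucket not in seen:
                seen.add(bucket)
                yield bucket, rec


def reduce_stage(records, exact_test):
    left = [r for r in records if r[0] == "A"]
    right = [r for r in records if r[0] == "B"]
    for a in left:
        for b in right:
            # Filter on MBRs first, then refine with the exact test.
            if mbr_intersects(a[2], b[2]) and exact_test(a, b):
                yield (a[1], b[1])


def spatial_join(A, B, exact_test=lambda a, b: True):
    shuffled = defaultdict(list)  # stands in for the shuffle between stages
    for tag, recs in (("A", A), ("B", B)):
        for bucket, rec in map_stage(tag, recs):
            shuffled[bucket].append(rec)
    result = set()  # a set removes pairs found in more than one bucket
    for recs in shuffled.values():
        result.update(reduce_stage(recs, exact_test))
    return sorted(result)


A = [("a1", (0, 0, 5, 5)), ("a2", (18, 18, 25, 25))]
B = [("b1", (4, 4, 12, 12)), ("b2", (24, 2, 26, 4)), ("b3", (24, 24, 30, 30))]
print(spatial_join(A, B))
```

Two MBRs that intersect always share at least one tile, so they meet in at least one bucket; the default `exact_test` accepts every candidate because the example objects are rectangles, and a real refinement step would test the exact geometries instead.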