120 likes | 487 Views
Multidimensional Search Structures. Spatial Databases. Spatial Objects Points : location (x,y) Lines : pairs of points (roads, coastal lines) Polygons : list of points (states, countries) Data Types Point : a data type with no extension (area)
E N D
Spatial Databases • Spatial Objects • Points: location (x,y) • Lines: pairs of points (roads, coastal lines) • Polygons: list of points (states, countries) • Data Types • Point: a data type with no extension (area) • Region: has location and boundary that defines the extension • Spatial Queries • Range queries • “Find all Italian restaurants within 20 miles from UTA” • Nearest neighbor queries • “Find the 10 Italian restaurants that are nearest to UTA” • “Find the nearest fire station to Neddermann Hall” • Spatial join queries • “Find pairs of cities within 20 miles of each other” • “Find restaurants that are adjacent to university campuses”
Applications • Geographical Information Systems (GIS) • Map systems • Resource management systems • Computer-aided design and manufacturing (CAD/CAM) • VLSI design (avoiding overlaps, routing wires) • Multimedia databases • Various dimensions (color, shape, texture)
Representation of Spatial Objects • It is very expensive to work on real boundary lines • Minimum Bounding Rectangle (MBR) • Testing for intersection: • Test if MBRs intersect • If they do, test if boundary lines intersect
Grid-Tree (G-Tree) • Mainly for point data of any dimension • Based on hypercubes insert point E E A C F C A B B F D D A C C B A B D D E E 0 1 0000 0010 10 00 11 0001 0011 01 10 1110 000 111 001 01 110 110 1111 0001 0000 0010 1111 0011 1110
G-Tree Organization • G-Trees are organized in B+-trees • Key is the binary string x • Rotate vertical/horizontal split • When split, add 0/1 at the end x0 x0 x1 x0 x1 x1 01 1110 0 1 00 11 01 0000 0001 0010 01 10 110 1110 1111 10 000 111 001 110 0001 0000 0010 1111 0011 1110
Searching G-Trees • Search: find point P=(x1,x2,…,xn) • Let m be the number of bits of the largest bitstring • Find the m-bit region that contains P: • Start with si=0 and ti=1for all 1<=i<=n • For k=0 to m-1: Slice over j=(k mod n)+1 dimension The kth bit of the bitstring is 0 iff xj < (sj+tj)/2; then set tj= (sj+tj)/2 Otherwise, it is 1 and set sj= (sj+tj)/2 • For n=2, m=5, and P=(0,1,0.7): 0.1 < ½ bit0 is 0 0.7 > ½ bit1 is 1 0.1 < ¼ bit2 is 0 0.7 < ½+¼ bit3 is 0 0.1 < 1/8 bit4 is 0 Bitstring is 00010 • Use the bitstring as the key for the G-Tree
Point Quadtrees • Divide the space into four quadrants • Represented as an unbalanced tree where each node has 4 children • Internal node: point • Leaf: empty • Node: (x,y,NE,NW,SW,SE) E B F A G H NE NW C D SW SE A B C D E F G H SE NE SW NW
Spatial Operations on Quadtrees • Search for a point P=(x,y) if current node T in quadtree is leaf, then not found if T=P, then found else find quadrant of T that includes P and search recursively: e.g., if x>T.X and y>T.Y then search SE quadrant • Given a rectangle (x1,y1,x2,y2) find all points in the rectangle • Need to keep a set of points • The rectangle may intersect more than one quadrants • Insert a point P=(x,y) If current node is leaf, then replace leaf with (x,y,nil,nil,nil,nil) Else determine the quadrant and apply the algorithm recursively
R*-Trees • It is like a B+-tree with key the MBR, but • There is no order in the rectangles in each node • Sibling rectangles may overlap • Restrictions on R*-tree nodes: • The root has at least two children (unless it is the only node) • Every non-root node has m children where M/4<=m<=M • The tree is balanced A A C E D B C D E F B F
Spatial Operations on R*-Trees Find all objects that intersect rectangle 9: 21 2 21 22 23 1 2 3 4 8 5 6 7 9 1 3 22 5 4 6 7 8 23 Insert rectangle 9: 26 25 2 26 27 24 25 22 23 1 9 2 3 4 8 5 6 7 24 1 3 27 22 9 5 4 6 7 8 23
Insert a New Object Rectangle S • At each node (starting from root) choose only one bounding rectangle R to insert the rectangle S so that the insertion of S into R has the least overlap enlargement. (Overlap enlargement of a rectangle is the total overlapping area of the rectangle with the other rectangles in the tree.) • If there are many with the same overlap enlargement, choose the one with the smallest area • If overflow, split the node into two sets of nodes using a split axis so that there is no underflow and the perimeters of the two bounding boxes are minimized.