Data Structures for Orthogonal Range Queries

Data Structures for Orthogonal Range Queries A New Data Structure and Comparison to Previous Work. Application to Contact Detection in Solid Mechanics. Sean Mauch Caltech April, 2003

Terminology: Orthogonal Range Queries • An orthogonal range query, (ORQ), determines the points in a set which lie within a given window (bounding box). • To the right is a set of points representing the location and population of cities. • We do an orthogonal range query on the cities We show the window and its projections onto the coordinate planes. 2

Example Application: Contact Detection • Algorithm for solid mechanics simulation with contact detection: • Search for potential contacts between nodes and surfaces. Which nodes are close to each face? • Do detailed contact check among potential contacts. • Enforce contacts by computing force required to remove overlap. 3 Contact search. Contact check.

Contact Detection Depends on ORQ’s • Searches for potential contacts are done with orthogonal range queries. • The capture box contains the nodes that may contact the face. • An ORQ finds the nodes in the capture box. • Detecting possible contacts is computationally expensive, typically half of total solid simulation execution time. • Good performance depends on choosing an efficient data structure and search algorithm for the orthogonal range queries. 4

Data Structures for ORQ’s • Octrees and kd-trees. • Tried and tested data structures from computational geometry. • Point in box method, (Swegle, et al.). • Developed at Sandia National Laboratories for doing parallel contact. • Cell arrays. • A bucket sort in 3-D. • Optimal computational complexity when properly tuned. • Dense arrays may have a high storage overhead. • Cell arrays coupled with binary and forward searching. • New data structure. 5

Kd-trees • Kd-trees recursively divide the domain by choosing the median in the chosen coordinate direction. • Depth is determined only by the number of points. 6

Octrees • Quadtrees (2-D) and Octrees (3-D) recursively divide space into quadrants or octants. • Depth is determined by the number of points and point spacing. 7

Point-in-box Method (Swegle et. al.) • Sorts and ranks the points in each coordinate direction. • There are three slices for a window. Choose the slice with the least points and see if those points are in the window. • The rank array allows integer instead of floating point comparisons which is much faster on some processors. 8

Cell Arrays • The computational domain is spanned by an array of cells. Each cell contains a set of points. • Constant time access to cells, but perhaps large storage overhead because of empty cells. 9

Sparse Cell Arrays • Array is sparse in one dimension. • Store only non-empty cells. • Trade reduced storage for increased cell access time. 10

Cell Arrays Coupled with Searching • Cell array does not span one dimension. Produces long, thin cells. • Compared to sparse cell array, we search on records instead of on cells. • Binary search on records is similar to sparse cell array. • If we sort the queries we can do forward searching in each cell. 11

List of Methods 12

Parameters • N: The number of records. • K: The records are in K-dimensional space. • M: The number of cells. • I: The number of records in the query. • R: The ratio of the length of a query range to the length of a cell. • Q: The number of queries. 13

Computational Complexity Table 14

Test Problem: Randomly Distributed Points • Records are randomly distributed points in the unit cube. • Number of records varies from 100 to 1,000,000. • ORQ on a box containing approximately 10 records is performed around each point. 15

Execution Time – Random Points 16

Memory Usage – Random Points 17

Conclusions • Projection methods like Point-In-Box perform poorly. • The kd-tree does not yield the best performance, but it performs fairly well over a wide range of problems. • Cell methods offer the best performance when they are well suited to the problem at hand. • Dense cell arrays work well for uniformly distributed data. • Sparse cell arrays and cell arrays coupled with binary searching offer good performance for most problems. • Cell arrays coupled with forward searching offers the best performance when doing multiple queries. 18

Data Structures for Orthogonal Range Queries

Data Structures for Orthogonal Range Queries

Presentation Transcript

Conjunctive, Subset, and Range Queries on Encrypted Data

How range queries work

Dynamic Structures for Top- k Queries on Uncertain Data

Space Efficient Data Structures for Dynamic Orthogonal Range Counting

Data Structures: Range Queries - Space Efficiency

Parallel Orthogonal Range Searching

Access Structures for Angular Similarity Queries

Orthogonal Range Searching

Orthogonal Range Reporting: Optimal 3-d Structures and Query Lower Bounds

Conjunctive, Subset, and Range Queries on Encrypted Data

Orthogonal Range Reporting

How range queries work

Orthogonal Range Searching

Queries for data modification: Action queries

Orthogonal Range Search

I/O-Efficient Structures for Orthogonal Range Max and Stabbing Max Queries

Flexible Data Cube for Range-Sum Queries in Dynamic OLAP Data Cubes

Optimal Planar Orthogonal Skyline Counting Queries

Parallel Orthogonal Range Searching

Orthogonal Range Searching

Mercury: Scalable Routing for Range Queries