Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency • Konstantinos Tsakalidis • PhD Defense • 23 September 2011
Κωνσταντίνος Τσακαλίδης • 2000-2006 B. Eng. • Computer Engineering and Informatics Dpt., University of Patras, Greece • Summer 2007 Intern • Google Inc., Mountain View, California, USA • 2007-2009 Ph. D. Student (Part A) • MADALGO, Aarhus University, Denmark • Summer 2010 Visiting Prof. Ian Munro • D. Cheriton School of Computer Science, University of Waterloo, Canada • 2009-2011 Ph. D. Student (Part B)
Overview • Dynamic Planar Orthogonal 3-Sided Range Reporting Queries • [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time” • [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees” • Dynamic Planar Orthogonal Range Maxima Reporting Queries • [ICALP ’11] “Dynamic Planar Range Maxima Queries” • Multi-Versioned Indexed Databases • [SODA ’12] “Fully Persistent B-Trees”
Databases and Geometry • A tuple with D attributes (Name, Age, Date, Salary, …, Phone) is a point in D dimensions; N tuples are N points • Planar (D=2) Euclidean Space • Query Operation: question about stored data • Update Operation/Transaction: insert/delete tuple, change value
Models of Computation • Pointer Machine: memory of records with O(1) fields each; Space = #occupied records; Time = #arithmetic operations + #pointer traversals • word-RAM: memory of cells with w bits/cell; Space = #occupied cells; Time = #arithmetic operations + #cell READ/WRITEs • I/O Model [Aggarwal, Vitter ’88]: memory of size M < N holding M/B blocks, disk holding the input of size N in N/B blocks of B words each; an I/O operation transfers one block between disk and memory; Space = #occupied blocks; Time = #I/O operations
Orthogonal Range Reporting Queries (example: Employees plotted with Salary on the x-axis and Age on the y-axis) • Contour Query: report all points with Salary > 1000 • Dominance Query: report all points with Salary > 1000 and Age > 35 • 3-Sided Query: report all points with 2000 > Salary > 1000 and Age > 35
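The three query types on this slide can be written down as plain filters over a list of points. The following is only a baseline sketch: it scans all N points per query, which is what the dynamic structures in the rest of the talk are designed to avoid. The record layout `(name, salary, age)` and the function names are hypothetical, chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical record layout (not from the slides): (name, salary, age).
Employee = Tuple[str, float, float]


def contour_query(emps: List[Employee], salary_lo: float) -> List[Employee]:
    """Report all points with Salary > salary_lo (one-sided range)."""
    return [e for e in emps if e[1] > salary_lo]


def dominance_query(emps: List[Employee], salary_lo: float,
                    age_lo: float) -> List[Employee]:
    """Report all points with Salary > salary_lo and Age > age_lo."""
    return [e for e in emps if e[1] > salary_lo and e[2] > age_lo]


def three_sided_query(emps: List[Employee], salary_lo: float,
                      salary_hi: float, age_lo: float) -> List[Employee]:
    """Report all points with salary_hi > Salary > salary_lo and Age > age_lo."""
    return [e for e in emps if salary_lo < e[1] < salary_hi and e[2] > age_lo]


if __name__ == "__main__":
    staff = [("Ann", 1500, 40), ("Bob", 900, 50),
             ("Cid", 2500, 36), ("Dee", 1200, 30)]
    print(contour_query(staff, 1000))
    print(dominance_query(staff, 1000, 35))
    print(three_sided_query(staff, 1000, 2000, 35))
```

Each query is a special case of the next: a contour query is a dominance query with no Age bound, and a dominance query is a 3-sided query with no upper Salary bound.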
Average-Case Efficient Dynamic 3-Sided Range Reporting / Worst-Case Efficient Dynamic 3-Sided Range Reporting • Models: word-RAM, Pointer Machine • Input distribution assumptions: X, Y: μ-random; X: smooth, Y: restricted; X: smooth (table of bounds not preserved)
Probabilistic Distributions • μ-Random: unknown, non-changing probabilistic distribution • (f,g)-Smooth: the probability mass of a subinterval does not exceed a specific bound, no matter how small the subinterval; includes regular and uniform distributions; any distribution is (f,Θ(n))-smooth • Restricted class of distributions: few elements occur very often, many elements occur rarely; e.g. Zipfian and Power Law distributions
Priority Search Tree [McCreight ’85] • Pointer Machine • Move Up Maximum Y • Space: O(n) • Update: O(log n)
Query by X-Coordinate: O(log n + t), where t is the number of reported points • Pointer Machine • Path: O(log n) • Subtrees in X
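A minimal static sketch of the priority search tree idea from these slides: the point with maximum y is moved up to the root of each subtree, and the remaining points are split by x. A 3-sided query [xl, xr] × [yb, +∞) then prunes by x using the split value and by y using the stored maximum. This is an illustrative simplification with hypothetical names (`Node`, `build_pst`, `query_3sided`); it omits the O(log n) update operation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    point: Point                  # point with maximum y in this subtree
    split: float                  # left subtree has x <= split, right has x >= split
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def _build(pts: List[Point]) -> Optional[Node]:
    # pts is sorted by x; "move up" the point with maximum y.
    if not pts:
        return None
    i = max(range(len(pts)), key=lambda k: pts[k][1])
    rest = pts[:i] + pts[i + 1:]          # still sorted by x
    if not rest:
        return Node(point=pts[i], split=pts[i][0])
    mid = len(rest) // 2
    return Node(pts[i], rest[mid][0], _build(rest[:mid]), _build(rest[mid:]))


def build_pst(points: List[Point]) -> Optional[Node]:
    return _build(sorted(points))


def query_3sided(root: Optional[Node], xl: float, xr: float,
                 yb: float) -> List[Point]:
    """Report all points with xl <= x <= xr and y >= yb."""
    out: List[Point] = []
    stack = [root]
    while stack:
        u = stack.pop()
        if u is None or u.point[1] < yb:
            continue                      # max y below yb: nothing to report here
        if xl <= u.point[0] <= xr:
            out.append(u.point)
        if xl <= u.split:
            stack.append(u.left)
        if xr >= u.split:
            stack.append(u.right)
    return out
```

Nodes that are visited without being reported lie on the two search paths for xl and xr or are direct children of reported nodes, which is where the log n + t shape of the bound comes from.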
Query by Y-Coordinate: O(log n + t) • Pointer Machine; word-RAM [Alstrup, Brodal, Rauhe ’00] • 1D Range Maximum Queries (Children) • Find next point to be reported in O(1) time (figure: node u with children u_l, u_r)
[ISAAC ’09] • word-RAM • Update: O(log log n) expected amortized • Query: O(log log n + t) expected w.h.p. • Space: O(n) • Weight_i = Θ(2^(2^i)) [Andersson, Thorup ’07] • RMQ: O(1) expected amortized • O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis ’93, Kaporis et al. ’06]
Average-Case Efficient Dynamic 3-Sided Range Reporting • word-RAM • X: smooth
Orthogonal Range MAXIMA Reporting Queries, or “Generalized Planar SKYLINE Operator” (example: Employees plotted with Salary on the x-axis and Age on the y-axis) • Dominates: is “above” (at least as large in both coordinates) • Maximal Point: is NOT dominated; the interesting points, e.g. oldest and best paid • Dominance Maxima Queries: report all maximal points among points with x in [xl,+∞) and y in [yb,+∞) • Contour Maxima Queries: report all maximal points among points with x in (-∞,xl] • 3-Sided Maxima Queries: report all maximal points among points with x in [xl,xr] and y in [yb,+∞) • 4-Sided Maxima Queries: report all maximal points among points with x in [xl,xr] and y in [yb,yt]
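To make the query definitions concrete, here is a brute-force sketch: filter the points to the query range, then report the maximal points of what remains by sweeping from right to left. It costs O(N log N) per query and is not the dynamic structure from the talk; the function names and the use of infinite defaults for missing sides are assumptions made for illustration.

```python
from math import inf
from typing import List, Tuple

Point = Tuple[float, float]


def maxima(points: List[Point]) -> List[Point]:
    """Maximal (non-dominated) points, in order of decreasing x."""
    result: List[Point] = []
    best_y = -inf
    # Sweep by decreasing x (ties: higher y first); a point is maximal
    # exactly when its y exceeds every y seen so far.
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def range_maxima(points: List[Point], xl: float = -inf, xr: float = inf,
                 yb: float = -inf, yt: float = inf) -> List[Point]:
    """Maximal points among those with xl <= x <= xr and yb <= y <= yt.

    Leaving sides at their infinite defaults gives the other query types:
    dominance (xl, yb), contour (xr only), 3-sided (xl, xr, yb).
    """
    inside = [p for p in points if xl <= p[0] <= xr and yb <= p[1] <= yt]
    return maxima(inside)


if __name__ == "__main__":
    pts = [(1, 9), (2, 4), (3, 7), (4, 2), (5, 5), (6, 1)]
    print(maxima(pts))
    print(range_maxima(pts, xl=2, xr=5, yb=3))
    print(range_maxima(pts, xl=2, xr=5, yb=3, yt=6))
```

Note that a point can be maximal inside a 4-sided range without being maximal in the whole set, because the upper bound yt removes the points that dominated it.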
Tournament Tree • Pointer Machine • Copy Up Maximum Y • Y-Winning Paths
Tournament Tree • Pointer Machine • Find next point to be reported in O(1) time: MAX(Right(u)) at node u
3-Sided Range Maxima Queries • Pointer Machine • Query Time: O(log n + t) • MAX(Subtrees(Paths)) • Paths: O(log n)
Update Operation • Pointer Machine • Previous Update: O(log² n)
Update Operation • Pointer Machine • Priority Queue with Attrition [Sundar ’89] • MAX(Right(u)) at node u is obtained from MAX(Right(u_L)) and MAX(Right(u_R)) at its children u_L and u_R in O(1) time
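The interface of a priority queue with attrition can be sketched with a sorted deque: inserting an element also deletes ("attrites") every stored element that is greater than or equal to it. The version below is a simple amortized O(1) illustration of the operations, not the worst-case structure of [Sundar ’89]; the class and method names are hypothetical.

```python
from collections import deque
from typing import Deque, Optional


class AttritionQueue:
    """Priority queue with attrition: amortized sketch of the interface."""

    def __init__(self) -> None:
        # Invariant: elements are in strictly increasing order, minimum at the left.
        self._items: Deque[float] = deque()

    def find_min(self) -> Optional[float]:
        return self._items[0] if self._items else None

    def delete_min(self) -> Optional[float]:
        return self._items.popleft() if self._items else None

    def insert_and_attrite(self, x: float) -> None:
        # Remove every element >= x, then insert x. Each element is popped
        # at most once over its lifetime, hence amortized O(1) per operation.
        while self._items and self._items[-1] >= x:
            self._items.pop()
        self._items.append(x)

    def __len__(self) -> int:
        return len(self._items)


if __name__ == "__main__":
    q = AttritionQueue()
    for value in [5, 7, 9, 6, 8]:
        q.insert_and_attrite(value)
    print(list(q._items))   # contents after attrition
```

Attrition matches the maxima problem because a newly considered point makes every point it dominates irrelevant, so those points can be discarded at insertion time.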
Update Operation • Pointer Machine • Space: O(n) • Update: O(log n) • Partially Persistent Priority Queue with Attrition • Reconstruct, Rollback • O(1) time and space overhead per update step: amortized [Driscoll et al. ’89], worst case [Brodal ’96]
Rectangular Visibility Queries • Proximity Queries/Similarity Search • 4× 4-Sided Range Maxima Queries, one per quadrant: toward (-∞,+∞), (+∞,+∞), (+∞,-∞) and (-∞,-∞)
Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries
B-Trees [Bayer, McCreight ’72] • Space: O(N/B) blocks • Update: O(log_B N) I/Os • Access: O(log_B N) I/Os • Indexed Database • Multi-Versioned Databases • Btrfs • Data Platform
Fully Persistent B-Trees • n: elements in one version • m: update operations = #versions • B: block size
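Full persistence means that every version can be both queried and updated, and each update creates a new version, so the m versions form a tree rather than a line. The sketch below shows only these semantics by copying the whole key set on every update; it is a deliberately naive illustration with hypothetical names, costing O(n) time and space per update, which is the cost a fully persistent B-tree is meant to avoid.

```python
from bisect import bisect_left, bisect_right, insort
from typing import List


class NaiveFullyPersistentSet:
    """Fully persistent sorted set by full copying (semantics only)."""

    def __init__(self) -> None:
        self._versions: List[List[int]] = [[]]   # version 0 is the empty set

    def insert(self, version: int, key: int) -> int:
        """Insert key into a copy of `version`; return the new version id."""
        keys = list(self._versions[version])
        insort(keys, key)
        self._versions.append(keys)
        return len(self._versions) - 1

    def delete(self, version: int, key: int) -> int:
        """Delete key (if present) from a copy of `version`; return the new id."""
        keys = list(self._versions[version])
        i = bisect_left(keys, key)
        if i < len(keys) and keys[i] == key:
            del keys[i]
        self._versions.append(keys)
        return len(self._versions) - 1

    def range_query(self, version: int, lo: int, hi: int) -> List[int]:
        """Report the keys k of `version` with lo <= k <= hi."""
        keys = self._versions[version]
        return keys[bisect_left(keys, lo):bisect_right(keys, hi)]


if __name__ == "__main__":
    s = NaiveFullyPersistentSet()
    v1 = s.insert(0, 5)
    v2 = s.insert(v1, 3)      # first branch from v1
    v3 = s.insert(v1, 8)      # second branch from v1: the versions form a tree
    print(s.range_query(v2, 0, 10), s.range_query(v3, 0, 10))
```

Updating v1 twice leaves v1 itself unchanged and produces two sibling versions, which is exactly what distinguishes full persistence from partial persistence, where only the newest version may be updated.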
[SODA ’12] • I/O-Efficient Full Persistence: input is a pointer-based structure; node occupies O(1) blocks; node has indegree O(1); interface of primitive operations: READ, WRITE, NEW_VERSION, ACCESS, NEW_NODE; O(1) I/O-overhead per access to a block; O(log_2 B) I/O-overhead per change to a block; Node-Splitting Method [Driscoll et al. ’89] • Incremental B-Trees: lazy updates; O(log_B N) READs; O(1) WRITEs that make O(1) changes to a block • Result: Space O(N/B); Query O(log_B N + t/B) I/Os; Update O(log_B N + log_2 B) I/Os
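One way to read the update bound in the Result is as the incremental B-tree's operation counts multiplied by the persistence overheads listed on the same slide. This is a reading of the slide under the assumption that the overheads compose per operation, not a restatement of the paper's proof:

```latex
\begin{align*}
\text{Update}
  &= \underbrace{O(\log_B N)}_{\text{READs}} \cdot
     \underbrace{O(1)}_{\text{overhead per access}}
   + \underbrace{O(1)}_{\text{WRITEs}} \cdot
     \underbrace{O(\log_2 B)}_{\text{overhead per change}} \\
  &= O(\log_B N + \log_2 B) \ \text{I/Os}
\end{align*}
```

Keeping the number of WRITEs per update constant is what stops the O(log_2 B) factor from multiplying the O(log_B N) term.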
Tsakalidis K., et al. • [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time” • [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees” • [ICALP ’11] “Dynamic Planar Range Maxima Queries” • [SODA ’12] “Fully Persistent B-Trees” • Many Thanks • Konstantinos Tsakalidis, Ph.D. Student • tsakalid@madalgo.au.dk