Efficient Range Queries in Non-blocking k-ary Search Trees

Range Queries in Non-blockingk-ary Search Trees Trevor Brown Hillel Avni

Problem Statement Want to store keys in a dynamic data structure supporting insertion, deletion, and: RangeQuery(a, b): returns all keys of the data structure in range [a, b]

Previous Solutions Software transactional memory Locks Persistence root pointer e e Insert(c) b b f a d d c

k-ary Search Tree (k-ST) Add or remove keys by replacing node(s) Related to persistent data structures [Brown, Helga]

The Range Query Algorithm RangeQuery(a, b): Traverse the tree, skipping sub-trees which cannot contain a key in [a, b] During this traversal, save a pointer to each leaf that contains a key in [a, b] […] Problem: how to efficiently tell if a key was added or removed during this traversal?

Extending the Data Structure Add a dirty bit to each leaf Each leaf has its dirty bit set just before it is replaced Consequence: If a leaf's dirty bit is not set, then it has not been replaced

The Range Query Algorithm RangeQuery(a, b): Traverse the tree, skipping sub-trees which cannot contain a key in [a, b] During this traversal, save a pointer to each leaf that contains a key in [a, b] After this traversal, check the dirty bits of these leaves, one by one If no dirty bit is set, then return “the result” Otherwise, retry Reading dirty bit is farfaster than re-traversing

Example: RangeQuery(3, 14) RangeQuery sees 4 is dirty… Retry! 8, 13, 25 2, 3, 5 8, 9, 12 14, 19, 21 29, 35 1 4 5, 7 13 15, 16, 18 23, 24 3, 4 Insert(3) Saved pointers

Retrying RangeQuery(3, 14) Success! Return the result… 8, 13, 25 2, 3, 5 8, 9, 12 14, 19, 21 29, 35 1 4 5, 7 13 15, 16, 18 23, 24 3, 4 Saved pointers

Ctrie Taking a “snapshot:” atomically replace root Old tree no longer changes Future searches and updates copy nodes from the old tree [Prokopec, Bronson, Bagwell, Odersky] root pointer | | | | | | 00 01 01 10 10 11 11 … … … …

When our Algorithm is Good When workloads contain range queries over small ranges (i.e., where snapshots are bad) Example: database applications such asairline database of flights When it might not be Very large ranges increase the chance that a range query will have to retry Our experiments explore how much this matters In extreme cases Ctrieor Snap might be better

Experiment: compare performance of k-ST: k=16, 32, 64 Snap Ctrie Java’s Concurrent Skip List (SL) NOT LINEARIZABLE!

Experiment Throughput vs. number of concurrent threads Each thread repeatedly chooses a random operation (Search, Insert, Delete, RangeQuery) with arguments chosen uniformly randomly in [0, 10^6) Each experiment ran with a fixed amount of memory, for a fixed, sufficiently long amount of time

Hardware Intel 4-chip, 40-core, 80-thread Sun 2-chip, 16-core, 128-thread

Many queries with small ranges 50% search, 5% insert, 5% delete, 40% range query size 100 Throughput (millions) 64-ST 32-ST 16-ST SL Ctrie Snap Number of threads

Many queries with bigger ranges 50% search, 5% insert, 5% delete, 40% range query size 10,000 Throughput (hundred thousands) 64-ST 32-ST 16-ST SL Ctrie Snap Number of threads

Few queries (with small ranges) 59% search, 20% insert, 20% delete, 1% range query size 100 64-ST Throughput (ten millions) 32-ST 16-ST SL Ctrie Snap Number of threads

Throughput versus P(range query) Throughput (ten millions) 1:10000 operations is RQ KST64 KST32 KST16 SL Ctrie Snap Probability of range query

Throughput versus arity 20i-20d-1r-size100 Throughput (ten millions) 5i-5d-40r-size100 20i-20d-1r-size10000 5i-5d-40r-size10000 Degree of tree

Conclusion Provably correct algorithm Searches can ignore concurrent updates Although dirty bits invalidate range queries,they do not invalidate searches Range queries are invisible No CAS, don’t change data structure Avoids excessive duplication of nodes Appears to be practical when workloads contain queries over small ranges

Future work Adding balance (a, b)-tree, chromatic tree, relaxed AVL tree Wait-freedom?

Efficient Range Queries in Non-blocking k-ary Search Trees

Efficient Range Queries in Non-blocking k-ary Search Trees

Presentation Transcript

Binary Trees, Binary Search Trees

Succinct Dynamic -ary Trees

How range queries work

Search Trees

Efficient Evaluation of k-Range Nearest Neighbor Queries in Road Networks

k - Ary Search on Modern Processors

N - ary Trees

Range Queries in Distributed Networks

Non-blocking I/O

Non-Blocking Communications

Search Trees

How range queries work

Supporting Range Queries on Web Data Using k -Nearest Neighbor Search

Search Trees

m -ary Trees

Non-blocking I/O

Blocking / Non-Blocking Send and Receive Operations

Non-blocking k- ary Search Trees

Pertemuan 10 Non Blocking

Non-blocking Caches

Search Trees

Non-Blocking Communications