
Cache-Oblivious Query Processing

Bingsheng He, Qiong Luo ({saven, luo}@cse.ust.hk), Department of Computer Science & Engineering, Hong Kong University of Science & Technology.

Presentation Transcript


  1. Cache-Oblivious Query Processing. Bingsheng He, Qiong Luo ({saven, luo}@cse.ust.hk), Department of Computer Science & Engineering, Hong Kong University of Science & Technology

  2. Cache-Oblivious Algorithms [Frigo et al., FOCS 1999] • Assume no knowledge of cache parameter values, e.g., cache size • Achieve optimal cache complexity • On an ideal cache model (two-level hierarchy with cache on top of memory; automatic, optimal cache replacement; fully associative) • On more realistic cache models as well

  3. Motivation • Relational database systems have too many knobs to tune for performance. • Tuning may be difficult, ineffective, and sometimes infeasible. • The memory hierarchy becomes increasingly complex.

  4. Our focus: CPU caches [Figure: memory hierarchy with levels CPU, registers, L1, L2, main memory, disk]

  5. Cache-Conscious (CC) Techniques • Aware of the cache parameters of a target level in a specific memory hierarchy • Cache block size, e.g., B+-trees • Cache capacity, e.g., blocked nested-loop join (NLJ) • Achieve high performance when the parameter values are set correctly
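To make the cache-capacity bullet concrete, here is a minimal sketch of a blocked nested-loop join. The function name `blocked_nlj`, the `match` predicate, and the `block_size` parameter are illustrative assumptions, not taken from the slides; `block_size` stands in for the tuning knob that a cache-conscious algorithm would derive from the cache capacity.

```python
def blocked_nlj(outer, inner, block_size, match):
    """Blocked nested-loop join over two in-memory lists (illustrative sketch).

    block_size is the cache-conscious knob: ideally the number of outer
    tuples that fit in the target cache level. The join result is the same
    for any block_size >= 1; only the memory access pattern changes.
    """
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    result = []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]  # outer block to keep cache-resident
        for s in inner:                          # scan the inner relation once per block
            for r in block:                      # reuse the block for every inner tuple
                if match(r, s):
                    result.append((r, s))
    return result


pairs = blocked_nlj([1, 2, 3, 4], [2, 4, 6], block_size=2,
                    match=lambda r, s: r == s)
```

The correctness of the join does not depend on `block_size`, but its cache behavior does, which is why the value has to be tuned per platform.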

  6. Tuning the parameter is difficult • The best parameter value varies with the platform. • It may match none of the platform's cache parameters. • It may vary with data and algorithmic characteristics.

  7. Our Goal: to automatically and consistently achieve good performance on various memory hierarchies at all times

  8. Challenges • How to optimize query processing cache-obliviously? (Divide-and-conquer methodology; amortization methodology) • How to achieve overall performance comparable to fine-tuned cache-conscious algorithms? (Work complexity; recursion overhead)

  9. Divide-and-conquer [Figure labels: fit into the cache; reuse]
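The divide-and-conquer idea can be sketched as a recursive nested-loop join that keeps halving the larger input, so that at some recursion depth the subproblem fits into each cache level without the code knowing any cache size. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `co_nlj`, `match`, and `base` are hypothetical, and `base` is only a small recursion cutoff to limit call overhead, not a cache parameter.

```python
def co_nlj(outer, inner, match, base=1):
    """Recursive nested-loop join in the cache-oblivious style (sketch).

    The larger of the two ranges is split in half until both ranges hold
    at most `base` tuples, at which point they are joined directly.
    """
    if base < 1:
        raise ValueError("base must be at least 1")
    result = []

    def join(o_lo, o_hi, i_lo, i_hi):
        n_o, n_i = o_hi - o_lo, i_hi - i_lo
        if n_o == 0 or n_i == 0:
            return
        if n_o <= base and n_i <= base:
            # Base case: join the two small ranges with plain nested loops.
            for r in outer[o_lo:o_hi]:
                for s in inner[i_lo:i_hi]:
                    if match(r, s):
                        result.append((r, s))
            return
        if n_o >= n_i:
            mid = o_lo + n_o // 2      # split the outer range
            join(o_lo, mid, i_lo, i_hi)
            join(mid, o_hi, i_lo, i_hi)
        else:
            mid = i_lo + n_i // 2      # split the inner range
            join(o_lo, o_hi, i_lo, mid)
            join(o_lo, o_hi, mid, i_hi)

    join(0, len(outer), 0, len(inner))
    return result


pairs = co_nlj(list(range(10)), [3, 3, 7, 11], match=lambda r, s: r == s)
```

The recursion performs the same comparisons as a plain nested-loop join; what it adds is recursion overhead, which is why a base-case cutoff is a natural place to trade off the two.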

  10. Amortization • Reduce the average cost for a set of operations • A buffer hierarchy • Buffer sizes are recursively defined. [Figure labels: R, Partitioner, Buffer, Partition]
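The amortization idea, paying one batched write for many inserts, can be illustrated with a single level of buffering in front of each partition. This is only a sketch under simplifying assumptions: the class `BufferedPartitioner`, its fixed `capacity`, and the modulo partitioning of integer keys are all hypothetical, and the slide's buffer hierarchy with recursively defined sizes is not reproduced here.

```python
class BufferedPartitioner:
    """Partition integer keys through small per-partition buffers (sketch).

    Each key is appended to a buffer; a buffer is moved to its partition
    in one batch when it fills, so the cost of touching the partition is
    amortized over `capacity` inserts.
    """

    def __init__(self, num_partitions, capacity):
        if num_partitions < 1 or capacity < 1:
            raise ValueError("num_partitions and capacity must be at least 1")
        self.num_partitions = num_partitions
        self.capacity = capacity
        self.buffers = [[] for _ in range(num_partitions)]
        self.partitions = [[] for _ in range(num_partitions)]
        self.flushes = 0  # number of batched writes performed

    def add(self, key):
        p = key % self.num_partitions  # assumed partitioning function
        self.buffers[p].append(key)
        if len(self.buffers[p]) >= self.capacity:
            self._flush(p)

    def _flush(self, p):
        if self.buffers[p]:
            self.partitions[p].extend(self.buffers[p])
            self.buffers[p].clear()
            self.flushes += 1

    def close(self):
        """Flush any partially filled buffers and return the partitions."""
        for p in range(self.num_partitions):
            self._flush(p)
        return self.partitions


bp = BufferedPartitioner(num_partitions=2, capacity=4)
for k in range(16):
    bp.add(k)
parts = bp.close()
```

Note that a fixed `capacity` is itself a tunable parameter; defining buffer sizes recursively, as the slide describes, is what removes that knob in the cache-oblivious design.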

  11. EaseDB: System Architecture

  12. Limitations • Employ sophisticated data structures and mechanisms. • Require some automatic and machine-independent optimization to improve their efficiency.

  13. Opportunities • Storage models • Transactions • New architectural features (CMP/SMT; GPUs; transactional memory)

  14. Conclusion • First cache-oblivious query processor • Complexity results for our CO algorithms • Empirical results for our CO algorithms on three hardware platforms, compared with their CC counterparts • http://www.cse.ust.hk/cactus/
