1 / 17

Database Architecture Optimized for the New Bottleneck: Memory Access

Database Architecture Optimized for the New Bottleneck: Memory Access. Peter Boncz Data Distilleries B.V. Amsterdam The Netherlands P.Boncz@ddi.nl. Stefan Manegold Martin Kersten CWI Amsterdam The Netherlands {S.Manegold,M.Kersten}@cwi.nl. Contents. How Memory Access works

ziv
Download Presentation

Database Architecture Optimized for the New Bottleneck: Memory Access

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Architecture Optimized for the New Bottleneck: Memory Access Peter Boncz Data Distilleries B.V. Amsterdam The Netherlands P.Boncz@ddi.nl Stefan Manegold Martin Kersten CWI Amsterdam The Netherlands {S.Manegold,M.Kersten}@cwi.nl

  2. Contents • How Memory Access works • Simple Scan Experiment • Consequences for DBMS • Data Structures: vertical decomposition • Algorithms: tune random memory access • Partitioned Join Algorithms • Monet Experiments • Accurate Cost Models • Conclusion

  3. CPU Speed vs. Memory Speed Moore’s Law: CPU speed doubles every 3 years

  4. Memory Access in Hierarchical Systems

  5. Simple Scan Experiment

  6. Consequences for DBMS • Memory access is a bottleneck • Prevent cache & TLB misses • Cache lines must be used fully • DBMS must optimize • Data structures • Algorithms (focus: join)

  7. Vertical Decomposition in Monet

  8. Partitioned Joins • Cluster both input relations • Create clusters that fit in memory cache • Join matching clusters • Two algorithms: • Partitioned hash-join • Radix-Join (partitioned nested-loop)

  9. Partitioned Joins: Straightforward Clustering • Problem: Number of clusters exceeds number of • TLB entries ==> TLB trashing • Cache lines ==> cache trashing • Solution: Multi-pass radix-cluster

  10. Partitioned Joins: Multi-Pass Radix-Cluster • Multiple clustering passes • Limit number of clusters per pass • Avoid cache/TLB trashing • Trade memory cost for CPU cost • Any data type (hashing)

  11. Monet Experiments: Setup • Platform: • SGI Origin2000 (MIPS R10000, 250 MHz) • System: • Monet DBMS • Data sets: • Integer join columns • Join hit-rate of 1 • Cardinalities: 15,625 - 64,000,000 • Hardware event counters • to analyze cache & TLB misses

  12. Monet Experiments: Radix-Cluster (64,000,000 tuples)

  13. Accurate Cost Modeling: Radix-Cluster

  14. Monet Experiments: Partitioned Hash-Join

  15. Monet Experiments: Radix-Join

  16. Monet Experiments: Overall Performance (64,000,000 tuples)

  17. Conclusion • Problem: • Memory access is increasingly the most important bottleneck for database performance • Solutions: • Vertical decomposition improves column-wise data access • Radix-algorithms optimize join performance • General: • Algorithms can be tuned to achieve optimal memory access • Detailed and accurate estimation of memory cost is possible Monet homepage: www.cwi.nl/~monet

More Related