
Compressing Relations And Indexes

Compressing Relations And Indexes. Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft. Department of Computer Sciences, University of Wisconsin-Madison. June 18, 1997. Agenda: Introduction; Compressing A Relation; Compression Applied to Rectangle Based Indexes; Performance Evaluation.


Presentation Transcript


  1. Compressing Relations And Indexes. Jonathan Goldstein, Raghu Ramakrishnan, Uri Shaft. Department of Computer Sciences, University of Wisconsin-Madison. June 18, 1997

  2. Agenda • Introduction • Compressing A Relation • Compression Applied to Rectangle Based Indexes • Performance Evaluation • Questions and Remarks

  3. Introduction • Page level Compression • Performance Study • Application to B-trees and R-trees • Multidimensional bulk loading algorithm

  4. Introduction

  5. Introduction

  6. Compressing A relation • Frames Of Reference • Non numeric attributes • File level compression

  7. Frames of Reference
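
The slide gives only its title. As a minimal sketch, assuming that frame-of-reference compression here means storing each page's per-attribute minimum once and encoding every value as an offset from that minimum, in as few bits as the page's value range requires, the idea could look like this in Python. The names `encode_page` and `decode_tuple` are hypothetical and do not come from the presentation.

```python
def encode_page(tuples):
    """Encode a page of integer tuples relative to per-attribute minimums.

    Assumption: every tuple has the same number of integer attributes.
    Returns (lows, bits, offsets), where bits[d] is the number of bits
    needed to store any offset of attribute d on this page.
    """
    dims = len(tuples[0])
    lows = [min(t[d] for t in tuples) for d in range(dims)]
    highs = [max(t[d] for t in tuples) for d in range(dims)]
    # An attribute that is constant on the page needs zero bits per value.
    bits = [(hi - lo).bit_length() for lo, hi in zip(lows, highs)]
    offsets = [tuple(t[d] - lows[d] for d in range(dims)) for t in tuples]
    return lows, bits, offsets


def decode_tuple(lows, offset):
    """Rebuild one tuple from the page minimums and that tuple's offsets."""
    return tuple(lo + off for lo, off in zip(lows, offset))


page = [(100, 7), (103, 7), (101, 9)]
lows, bits, offsets = encode_page(page)
restored = [decode_tuple(lows, off) for off in offsets]
```

In this sketch a single tuple can be decoded from the page minimums and its own offsets alone, without reading the rest of the page.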

  8. Lossy Compression Point approximation in lossy compression

  9. Compressing an indexing structure • Compressing a B-tree • Compressing a rectangle based indexing structure • Compression oriented Bulk Loading

  10. Rectangle Based indexing qualities

  11. Changing the frame of reference

  12. Bulk-Loading Algorithm • Input. A set of points in some n-dimensional space. • Output. A partition of the input into subsets. • Requirements. The partition should group points that are close to each other in the same group as much as possible.
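
The input/output contract above can be illustrated with a small Python sketch. This is not the GB-Pack algorithm from the following slides; it is a plain recursive median split, swapped in only to show one way of partitioning points so that nearby points tend to share a group. The function name `partition` and the fixed `capacity` parameter are assumptions made for the example.

```python
def partition(points, capacity):
    """Split points into groups of at most `capacity` nearby points.

    Assumption: `capacity` >= 1 and all points have the same dimensionality.
    Each step cuts the dimension with the widest spread at its median.
    """
    if len(points) <= capacity:
        return [points]
    dims = len(points[0])

    def spread(d):
        values = [p[d] for p in points]
        return max(values) - min(values)

    # Cutting the widest dimension keeps the resulting groups compact.
    axis = max(range(dims), key=spread)
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return partition(ordered[:mid], capacity) + partition(ordered[mid:], capacity)


points = [(0, 0), (1, 1), (0, 1), (100, 100), (101, 100), (100, 101)]
groups = partition(points, capacity=3)
```

With these six points and a capacity of three, the two clusters end up in separate groups. A fixed capacity is a simplification; slide 14 notes that in GB-Pack the number of entries per page is data-dependent.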

  13. GB-Pack compression oriented bulk loading

  14. GB-Pack compression oriented bulk loading • Qualities: • Trades off some tree quality for increased compression. • The number of entries per page is data-dependent. • Cuts a dimension at a value boundary in the data.

  15. GB-Pack compression oriented bulk loading

  16. GB-Pack compression oriented bulk loading

  17. GB-Pack compression oriented bulk loading

  18. Performance Evaluation • Relational Compression Experiments. • CPU vs. I/O Costs. • Comparison With Techniques in commercial systems. • Importance of Tuple-Level Decompression. • R-tree Compression Experiments.

  19. Synthetic Data Sets • Size: The number of tuples in the relation. • Dimensionality: The number of attributes of the relation. • Range: The range of values for the attributes. • Distribution: uniform (worst case) / exponential. • Partition Strategy. • Page size.

  20. Sales Data Set Sales data set. Compression Achieved versus dimensionality

  21. CPU vs. I/O Costs

  22. R-tree Compression Experiments Testing the quality of R-trees on Sales Data Set.

  23. Questions And Remarks
