Bitmap Index Design and Evaluation. By: Chee-Yong Chan, Yannis E. Ioannidis. Ariel Noy, Data representation and retrieval seminar.
Introduction: Query performance issues • Online Transaction Processing (OLTP): read-write databases. • Decision Support Systems (DSS): read-mostly environments, with a high selectivity factor.
Bitmap Index in Simple Form: Value-List Index. Every value has its own column, and that column is a bitmap.
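A minimal sketch of a value-list index, assuming each bitmap is held in a Python integer where bit r stands for row r. The function names and the sample column are made up for illustration and do not come from the slides.

```python
def build_value_list_index(column):
    """Map each distinct value to a bitmap; bit r is set when column[r] == value."""
    index = {}
    for row, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row)
    return index


def rows_of(bitmap):
    """Return the row numbers whose bits are set in the bitmap."""
    return [r for r in range(bitmap.bit_length()) if (bitmap >> r) & 1]


# Made-up sample data: one attribute, six rows.
column = ["red", "blue", "red", "green", "blue", "red"]
index = build_value_list_index(column)

print(rows_of(index["red"]))  # [0, 2, 5]
```

The index holds one bitmap per distinct value, so its size grows with the attribute cardinality.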
Advantages • Compact size. • Efficient hardware support for bitmap operations (AND, OR, XOR, NOT). • Fast search. • Multiple different bitmap indexes for different kinds of queries.
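The bitmap operations listed above can be sketched with plain integer operators. This is only an illustration: the two bitmaps, their names, and the row count are invented, and a real system would work on much longer bit vectors.

```python
def bitmap(rows):
    """Build a bitmap (Python int) with the given row numbers set."""
    out = 0
    for r in rows:
        out |= 1 << r
    return out


def count(bm):
    """Number of rows selected by a bitmap."""
    return bin(bm).count("1")


NUM_ROWS = 8
ALL_ROWS = (1 << NUM_ROWS) - 1

# Hypothetical bitmaps from two different indexed attributes.
city_is_paris = bitmap([0, 2, 3, 6])
status_is_gold = bitmap([2, 3, 5, 7])

both = city_is_paris & status_is_gold         # AND
either = city_is_paris | status_is_gold       # OR
exactly_one = city_is_paris ^ status_is_gold  # XOR
not_paris = ALL_ROWS & ~city_is_paris         # NOT, masked to the table size

print(count(both), count(either), count(exactly_one), count(not_paris))
```

The mask in `not_paris` is needed because Python integers are unbounded, so a bare `~` would set bits beyond the last row.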
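Selection Queries • Queries of the form “A op v”, where A refers to the indexed attribute, op is a comparison operator, and v is a value. • Range predicates: op is one of <, <=, >, >=. • Equality predicates: op is = or !=.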
Space-time tradeoff of bitmap indexes for selection queries • Space-optimal bitmap index. • Time-optimal bitmap index under a given space constraint. • Bitmap index with optimal space-time tradeoff. • Time-optimal bitmap index.
Bitmap Encoding Schemes • Equality Encoding: bi bits, one for each possible value; all bits are 0 except the bit for vi, which is 1. • Range Encoding: the vi rightmost bits are 0, the rest are 1.
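The two encodings can be sketched for a single digit as follows. The helper names are hypothetical, and the bit strings are written with the highest position on the left, which is an assumption about how the slide counts "rightmost".

```python
def equality_encode(digit, base):
    """base bits, all 0 except the bit at position `digit`."""
    return "".join("1" if j == digit else "0" for j in reversed(range(base)))


def range_encode(digit, base):
    """Bit j is 1 when digit <= j, so the `digit` rightmost bits are 0."""
    return "".join("1" if j >= digit else "0" for j in reversed(range(base)))


# Digit value 3 in a component of base 5.
print(equality_encode(3, 5))  # 01000
print(range_encode(3, 5))     # 11000
```

In the range encoding the leftmost bit is 1 for every digit, so a range-encoded component can get by with one bitmap fewer than its base.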
Evaluation Algorithms for Range-Encoded Bitmap Indexes • RangeEval (O’Neil and Quass) • RangeEval-Opt: • reduces the number of bitmap operations by about 50% • needs fewer bitmap scans for range predicate evaluation • calculates only the requested bitmap • avoids the intermediate equality predicate evaluation by expressing each range predicate in terms of <= only, based on: • A < v == A <= v-1 • A > v == !(A <= v) • A >= v == !(A <= v-1) • Works with only one bitmap B, whereas RangeEval works with at least two [Beq and (Blt or Bge)]
Example: • A <= 864 using a 3-component base-10 index. • RangeEval-Opt: 4 operations, 5 scans • RangeEval: 10 operations, 6 scans
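The evaluation of a `<=` predicate on a range-encoded index can be sketched as below. This is a reconstruction under stated assumptions, not the authors' code: bitmaps are Python integers, `bases` lists the component bases starting from the least significant digit, and all function names and the sample column are invented. It counts one scan per stored bitmap read and one operation per AND or OR.

```python
from math import prod


def digits(value, bases):
    """Decompose value into digits, least significant first."""
    out = []
    for b in bases:
        out.append(value % b)
        value //= b
    return out


def build_range_index(column, bases):
    """index[i][j] has bit r set when digit i of column[r] is <= j.

    Only j = 0 .. b_i - 2 is stored; the bitmap for j = b_i - 1 is all ones.
    """
    index = [[0] * (b - 1) for b in bases]
    for r, value in enumerate(column):
        for i, d in enumerate(digits(value, bases)):
            for j in range(d, bases[i] - 1):
                index[i][j] |= 1 << r
    return index


def range_eval_opt_le(index, bases, v, num_rows):
    """Evaluate A <= v; returns (bitmap, scans, operations)."""
    all_ones = (1 << num_rows) - 1
    if v < 0:
        return 0, 0, 0
    v = min(v, prod(bases) - 1)  # clamp to the largest representable value
    vd = digits(v, bases)
    scans = ops = 0
    result = all_ones
    if vd[0] < bases[0] - 1:
        result = index[0][vd[0]]
        scans += 1
    for i in range(1, len(bases)):
        if vd[i] != bases[i] - 1:       # digit_i <= v_i AND lower digits qualify
            result &= index[i][vd[i]]
            scans += 1
            ops += 1
        if vd[i] != 0:                  # OR digit_i <= v_i - 1
            result |= index[i][vd[i] - 1]
            scans += 1
            ops += 1
    return result, scans, ops


bases = [10, 10, 10]               # 3-component base-10 index
column = list(range(0, 1000, 37))  # made-up attribute values
index = build_range_index(column, bases)
bitmap, scans, ops = range_eval_opt_le(index, bases, 864, len(column))
print(scans, ops)  # 5 4
```

The other range operators follow from the rewriting rules: evaluate `A <= v` or `A <= v-1` and complement the result where the rule calls for a negation.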
Cost Model for Space-Time Tradeoff Analysis • Space(I): the space metric is in terms of the number of bitmaps stored. • Time(I): the time metric is in terms of the expected number of bitmap scans for a selection query evaluation.
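A rough sketch of the two metrics for a range-encoded index. It makes several simplifying assumptions that are not in the slides: only `A <= v` predicates are counted, v is uniform over every value the base can represent, and the scan count follows a one-bitmap-accumulator evaluation in which the least significant component needs at most one scan and every other component at most two. Function names are hypothetical; `bases` is least significant first.

```python
from math import prod


def space_range_encoded(bases):
    """Number of stored bitmaps: b_i - 1 per component."""
    return sum(b - 1 for b in bases)


def scans_for_le(v, bases):
    """Bitmap scans needed to evaluate A <= v under the assumptions above."""
    scans = 0
    for i, b in enumerate(bases):
        d = v % b
        v //= b
        if i == 0:
            scans += d < b - 1                 # skip the all-ones bitmap
        else:
            scans += (d != b - 1) + (d != 0)   # AND scan and OR scan
    return scans


def expected_scans_le(bases):
    """Average scan count over all representable values of v."""
    total = prod(bases)
    return sum(scans_for_le(v, bases) for v in range(total)) / total


for bases in ([1000], [32, 32], [10, 10, 10]):
    print(bases, space_range_encoded(bases), expected_scans_le(bases))
```

Adding components shrinks the number of stored bitmaps but raises the expected number of scans, which is the tradeoff the cost model is meant to expose.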
Comparison of Bitmap Encoding Schemes • Equality-encoded: S(I) ~ C, T(I) ~ n*b/2 • Range-encoded: S(I) ~ C-n, T(I) ~ 2n
Space Optimal: • number of bitmaps in an n-component space-optimal index = n(b-2) b~ • space efficiency is a non-decreasing function of the number of components. • The ultimate optimum is at n=log(C). • Time Optimal: • the base of the n-component time-optimal index is <2,2,…,2,C/2^(n-1)>, i.e. n-1 components of base 2, so that the product of the bases covers C. • time efficiency is a non-increasing function of the number of components. • The ultimate optimum is at n=1.
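The space claim can be checked numerically with a small search. This sketch does not use the closed-form expression from the slide; it searches directly for the fewest stored bitmaps, assuming range encoding (b_i - 1 bitmaps per component), bases of at least 2, and a product of bases that must reach C. The time-optimal base is built on the assumption that n-1 components have base 2 and the last one is rounded up. All names are hypothetical.

```python
from functools import lru_cache
from math import prod


@lru_cache(maxsize=None)
def min_bitmaps(cardinality, n):
    """Fewest stored bitmaps for an n-component range-encoded index."""
    if n == 1:
        return max(2, cardinality) - 1
    best = None
    for b in range(2, max(2, cardinality) + 1):
        remaining = -(-cardinality // b)  # ceil(cardinality / b)
        cost = (b - 1) + min_bitmaps(remaining, n - 1)
        if best is None or cost < best:
            best = cost
    return best


def time_optimal_base(cardinality, n):
    """n - 1 components of base 2 plus one large component (assumed form)."""
    return [2] * (n - 1) + [-(-cardinality // 2 ** (n - 1))]


C = 1000
for n in (1, 2, 3, 10):
    print(n, min_bitmaps(C, n), time_optimal_base(C, n))
```

For C = 1000 the bitmap count falls from 999 with one component to 10 with ten components, in line with the statement that space efficiency does not decrease as components are added.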
Optimal Space-Time Tradeoff (knee). Based on experiments, guesswork, and gut feeling. 2-component index. The base of the most time-efficient 2-component space-optimal index is given by:
Bitmap Index Storage Schemes • Bitmap-Level Storage (BS): each bitmap is stored in its own file. • Component-Level Storage (CS): each index component has its own file. • Index-Level Storage (IS): all bitmaps are stored together in one file.
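As a small illustration of how the three schemes differ, the sketch below counts the files each one would create. It assumes a range-encoded index with b_i - 1 stored bitmaps per component; the function name and the scheme labels as strings are made up for the example.

```python
def file_count(bases, scheme):
    """Number of files used to store a range-encoded index."""
    bitmaps_per_component = [b - 1 for b in bases]  # assumes range encoding
    if scheme == "BS":   # one file per bitmap
        return sum(bitmaps_per_component)
    if scheme == "CS":   # one file per component
        return len(bases)
    if scheme == "IS":   # one file for the whole index
        return 1
    raise ValueError(f"unknown storage scheme: {scheme}")


for scheme in ("BS", "CS", "IS"):
    print(scheme, file_count([10, 10, 10], scheme))
```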
Compression of each file • CS has the best Space(I) tradeoff after compression. • BS has the best Time(I) tradeoff after compression.