Context-based Data Compression
Xiaolin Wu, Polytechnic University, Brooklyn, NY
Part 3. Context Modeling
Context model – estimated symbol probability
• Variable length coding schemes need estimates of the probability of each symbol – a model
• The model can be
  • Static – fixed global model for all inputs
    • English text
  • Semi-adaptive – computed for the specific data being coded and transmitted as side information
    • C programs
  • Adaptive – constructed on the fly (see the sketch below)
    • Any source!
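To make the adaptive case concrete, here is a minimal sketch (the class and method names are illustrative, not from the slides): the model starts from uniform counts and is updated after every coded symbol, so the decoder can rebuild the same model without any side information.

```python
from collections import Counter

class AdaptiveModel:
    """Adaptive symbol-frequency model, built on the fly in a single pass."""

    def __init__(self, alphabet):
        # Start every symbol at count 1 so no probability is ever zero.
        self.counts = Counter({s: 1 for s in alphabet})
        self.total = len(alphabet)

    def probability(self, symbol):
        # Estimate handed to the entropy coder before coding this symbol.
        return self.counts[symbol] / self.total

    def update(self, symbol):
        # Encoder and decoder both apply the same update after coding,
        # so the two models stay synchronized with no side information.
        self.counts[symbol] += 1
        self.total += 1
```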
Adaptive vs. Semi-adaptive
• Advantages of semi-adaptive
  • Simple decoder
• Disadvantages of semi-adaptive
  • Overhead of specifying the model can be high
  • Two passes over the data are required
• Advantages of adaptive
  • One-pass, universal, as good if not better
• Disadvantages of adaptive
  • Decoder is as complex as the encoder
  • Errors propagate
Adaptation with Arithmetic and Huffman Coding
• Huffman coding – manipulate the Huffman tree on the fly. Efficient algorithms are known, but they remain complex.
• Arithmetic coding – update the cumulative probability distribution table. Efficient data structures and algorithms are known; the rest stays essentially the same.
• The main advantage of arithmetic over Huffman coding is the ease with which the former can be used in conjunction with adaptive modeling techniques.
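A sketch of the cumulative-table update (the naive linear scan is only for clarity; efficient implementations use a structure such as a Fenwick tree so queries and updates are O(log N)):

```python
class CumulativeFreqTable:
    """Naive cumulative-frequency table for an adaptive arithmetic coder."""

    def __init__(self, alphabet_size):
        self.freq = [1] * alphabet_size      # start uniform, never zero

    def total(self):
        return sum(self.freq)

    def interval(self, symbol):
        # (low, high) cumulative counts: the sub-interval assigned to `symbol`.
        low = sum(self.freq[:symbol])
        return low, low + self.freq[symbol]

    def update(self, symbol):
        # A single increment after each coded symbol keeps the model adaptive;
        # everything else in the arithmetic coder stays essentially the same.
        self.freq[symbol] += 1
```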
Context models
• If the source is not i.i.d., there are complex dependences between symbols in the sequence.
• In most practical situations, the pdf of a symbol depends on neighboring symbol values, i.e., its context.
• Hence we condition the encoding of the current symbol on its context.
• How to select contexts? A rigorous answer is beyond our scope.
• Practical schemes use a fixed neighborhood.
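As an illustration of a fixed-neighborhood context (an order-2 rule chosen only for the example), the previous two symbols select which conditional distribution the current symbol is coded under:

```python
from collections import defaultdict, Counter

def context_counts(sequence, order=2):
    """Collect symbol-occurrence counts conditioned on the previous `order` symbols."""
    counts = defaultdict(Counter)               # context -> counts of the next symbol
    for i in range(order, len(sequence)):
        ctx = tuple(sequence[i - order:i])      # fixed neighborhood of past symbols
        counts[ctx][sequence[i]] += 1
    return counts

counts = context_counts("abracadabra")
# P(symbol | context) is then estimated from counts[ctx], e.g. with add-one smoothing.
```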
Context dilution problem
• The minimum code length of a sequence x_1 x_2 … x_n achievable by arithmetic coding is -log₂ P(x_1 x_2 … x_n) bits, if the distribution P is known.
• The difficulty of estimating the conditional probabilities P(x_i | context), due to insufficient sample statistics, prevents the use of high-order Markov models.
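As a rough illustration (the numbers are chosen only for the example): an order-k Markov model over an alphabet of size m has m^k contexts. For 8-bit samples and k = 3 that is 256³ ≈ 16.8 million contexts, while a 512×512 image supplies only 262,144 samples, so most contexts are never observed at all and their conditional probabilities cannot be estimated reliably. This is the context dilution problem.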
Estimating probabilities in different contexts
• Two approaches:
  • Maintain symbol occurrence counts within each context
    • the number of contexts needs to be modest to avoid context dilution
  • Assume the pdf shape within each context is the same (e.g., Laplacian), with only the parameters (e.g., mean and variance) differing (see the sketch below)
    • estimation may not be as accurate, but a much larger number of contexts can be used
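A sketch of the second approach (the zero-mean Laplacian shape and all names are assumptions for the example): each context keeps only running statistics from which its scale parameter is estimated, so a large number of contexts costs very little.

```python
import math

class LaplacianContextModel:
    """Parametric per-context model: common pdf shape, context-specific scale."""

    def __init__(self):
        self.n = 0
        self.sum_abs = 0.0                       # running sum of |residual| in this context

    def update(self, residual):
        self.n += 1
        self.sum_abs += abs(residual)

    def probability(self, residual):
        # Zero-mean Laplacian with scale b estimated from the running mean of |residual|;
        # the pdf value times a unit bin width approximates the symbol probability.
        b = max(self.sum_abs / self.n, 1e-6) if self.n else 1.0
        return math.exp(-abs(residual) / b) / (2.0 * b)
```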
Entropy (Shannon 1948)
• Consider a random variable X
  • source alphabet: A
  • probability mass function: P(x), x ∈ A
• Self-information of the event X = x is I(x) = -log P(x)
  • measured in bits if the log is base 2
  • an event of lower probability carries more information
• Self-entropy is the weighted average of self-information: H(X) = -Σ_x P(x) log P(x)
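A short numerical sketch of the definition (the binary pmfs are arbitrary examples):

```python
import math

def entropy(pmf):
    """Self-entropy H(X) = -sum_x P(x) log2 P(x), in bits per symbol."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

print(entropy({"0": 0.5, "1": 0.5}))   # 1.0 bit: maximum uncertainty
print(entropy({"0": 0.9, "1": 0.1}))   # about 0.469 bits: a skewed source is more predictable
```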
Conditional Entropy
• Consider two random variables X and Y
  • alphabet of X: A; alphabet of Y: B
• Conditional self-information of the event X = x given Y = y is I(x|y) = -log P(x|y)
• Conditional entropy is the average value of the conditional self-information: H(X|Y) = -Σ_y P(y) Σ_x P(x|y) log P(x|y)
Entropy and Conditional Entropy
• The conditional entropy H(X|Y) can be interpreted as the amount of uncertainty remaining about X, given that we know the random variable Y.
• The additional knowledge of Y should reduce the uncertainty about X: H(X|Y) ≤ H(X).
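A small numerical sketch (the joint distribution is made up for illustration) of H(X|Y) ≤ H(X): knowing Y leaves less uncertainty about X.

```python
import math

def H(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint pmf P(x, y) for two dependent binary variables.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

h_x = H([sum(p for (x, _), p in joint.items() if x == v) for v in (0, 1)])

h_x_given_y = 0.0
for y in (0, 1):
    p_y = sum(p for (_, yy), p in joint.items() if yy == y)
    h_x_given_y += p_y * H([joint[(x, y)] / p_y for x in (0, 1)])

print(h_x, h_x_given_y)   # 1.0 vs. about 0.722 bits: conditioning reduces entropy
```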
Context Based Entropy Coders
• Consider a sequence of symbols to code; each symbol's context, drawn from the context space C, selects which entropy coder (EC) it is sent to.
[Figure: a binary sequence is split by its context into three entropy coders with probability models A(0.2, 0.8), B(0.4, 0.6), and C(0.9, 0.1).]
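A sketch of the routing idea in the figure (the context rule and initial counts are illustrative): each bit's context selects one of three adaptive probability models, and that model's estimate is what the entropy coder would use for that bit.

```python
import math
from collections import Counter

def classify(prev_bits):
    # Illustrative rule: the previous two bits pick coder A, B, or C.
    if prev_bits == (0, 0):
        return "A"
    if prev_bits == (1, 1):
        return "C"
    return "B"

models = {c: Counter({0: 1, 1: 1}) for c in "ABC"}   # per-context adaptive counts

bits = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1]
code_len = 0.0
for i in range(2, len(bits)):
    ctx = classify(tuple(bits[i - 2:i]))
    m = models[ctx]
    p = m[bits[i]] / (m[0] + m[1])        # probability the entropy coder (EC) would use
    code_len += -math.log2(p)             # ideal code length contributed by this bit
    m[bits[i]] += 1                       # update only the model of this context
```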
Decorrelation techniques to exploit sample smoothness
• Transforms
  • DCT, FFT
  • wavelets
• Differential Pulse Code Modulation (DPCM) – see the sketch below
  • predict the current symbol from past observations
  • code the prediction residual rather than the symbol
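A minimal DPCM sketch (previous-sample prediction is just the simplest choice of predictor): the encoder transmits residuals, which typically have much lower entropy than the samples themselves, and the decoder inverts the prediction exactly.

```python
def dpcm_encode(samples):
    residuals, prev = [], 0
    for s in samples:
        residuals.append(s - prev)   # code the prediction residual, not the sample
        prev = s
    return residuals

def dpcm_decode(residuals):
    samples, prev = [], 0
    for r in residuals:
        prev += r                    # add the residual back onto the prediction
        samples.append(prev)
    return samples

x = [100, 102, 105, 105, 103]
r = dpcm_encode(x)                   # [100, 2, 3, 0, -2]: small, peaked residuals
assert dpcm_decode(r) == x
```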
Benefits of prediction and transform
• A priori knowledge is exploited to reduce the self-entropy of the source symbols
• Higher coding efficiency due to
  • Fewer parameters to be estimated adaptively
  • Faster convergence of adaptation
Further Reading
• Text Compression. T. Bell, J. Cleary and I. Witten. Prentice Hall. Good coverage of statistical context modeling, though with a focus on text.
• Articles in IEEE Transactions on Information Theory by Rissanen and Langdon.
• Digital Coding of Waveforms: Principles and Applications to Speech and Video. Jayant and Noll. Good coverage of predictive coding.