Operational Rate-Distortion information theory in optimization of advanced digital video codec

Presentation Transcript


  1. Operational Rate-Distortion information theory in optimization of advanced digital video codec
     Dragorad A. Milovanović (DragoAM@Gmail.com), Zoran S. Bojković (z.bojkovic@yahoo.com)
     University of Belgrade

  2. CONTENTS
     1. Rate-Distortion theory
        1.1 Source coding and R-D function
        1.2 Operational R-D framework
        1.3 Formulation of efficient video coding
     2. Operational control of standard-based encoder
        2.1 Operational MPEG framework
        2.2 Performance/efficiency of digital video codec
        2.3 Bitrate control and joint optimization

  3. 1. Rate-Distortion theory
     • Information transmission system (message, symbol encoding, entropy).
     • Source coding: perceptual signals and distortion criterion $D \le D_{\max}$.
     • Average distortion D is the expected value of the distortion measure between source symbols and reconstructed symbols.
     • Rate-distortion theory calculates the minimum transmission bitrate R for a required video quality D.
     • Mutual information is the information that source symbols and reconstructed symbols convey about each other.
     • Average mutual information I is the expectation of this quantity over all symbol pairs.
     • Channel coding: channel capacity C is the maximum of the mutual information I between source and destination.
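In the usual textbook notation (the symbols $S$, $S'$, $d$, $p$ and $D_{\max}$ are assumptions here, not taken from the slide), the quantities named on this slide can be written as follows. This is a sketch of the standard definitions rather than a reproduction of the original formulas.

```latex
\begin{align}
  D &= E\{d(S,S')\} = \sum_{s}\sum_{s'} p(s,s')\, d(s,s')
      && \text{average distortion} \\
  I(S;S') &= \sum_{s}\sum_{s'} p(s,s')\,
      \log_2 \frac{p(s,s')}{p(s)\,p(s')}
      && \text{average mutual information} \\
  R(D_{\max}) &= \min_{p(s'|s)\,:\; D \le D_{\max}} I(S;S')
      && \text{rate-distortion function} \\
  C &= \max_{p(s)} I(S;S')
      && \text{channel capacity}
\end{align}
```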

  4. 1.1 Source coding and R-D function
     • For a given maximum average distortion $D_{\max}$, the rate-distortion function R(D) is a lower bound on the transmission bitrate.
     • The Shannon lower bound $R_L(D)$ assumes statistical independence between distortion and reconstruction.
     • R(D) is a non-increasing and convex function of D.
     • For a continuous source S, R(D) approaches infinity as D approaches zero.
     • For a discrete source S, the minimum rate required for lossless transmission equals the entropy rate, $R(0) = H(S)$ (lossless coding).
     • Stochastic model of a Gauss-Markov source (correlation $0 < \rho < 0.9$): $D_L(R) = (1-\rho^2)\,\sigma^2\,2^{-2R}$
     • Stochastic model of a source with Laplacian pdf (variance $\sigma^2 = 1$): $D_L(R) = (e/\pi)\,\sigma^2\,2^{-2R}$
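The two bounds above are easy to evaluate numerically. The following is a minimal sketch; the function names and the SNR helper are illustrative assumptions, and only the two $D_L(R)$ formulas come from the slide.

```python
import math


def gauss_markov_dl(rate_bits, rho, sigma2=1.0):
    """Slide formula for a Gauss-Markov source: (1 - rho^2) * sigma^2 * 2^(-2R)."""
    return (1.0 - rho ** 2) * sigma2 * 2.0 ** (-2.0 * rate_bits)


def laplacian_dl(rate_bits, sigma2=1.0):
    """Slide formula for a Laplacian pdf: (e / pi) * sigma^2 * 2^(-2R)."""
    return (math.e / math.pi) * sigma2 * 2.0 ** (-2.0 * rate_bits)


def snr_db(distortion, sigma2=1.0):
    """Signal-to-noise ratio in dB for a given distortion (helper, assumed)."""
    return 10.0 * math.log10(sigma2 / distortion)


if __name__ == "__main__":
    for rate in (1, 2, 3, 4):
        print(rate,
              round(snr_db(gauss_markov_dl(rate, rho=0.5)), 2),
              round(snr_db(laplacian_dl(rate)), 2))
```

In both models each additional bit per sample divides the bound by 4, which is about 6.02 dB of SNR.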

  5. 1.2 Operational (R,D) framework
     • In a practical coding framework, the structure of the coder is fixed and a finite set of encoding modes is defined. In addition, it is usually difficult or simply impossible to find closed-form expressions for the R(D) and D(R) functions of general sources.
     • Each choice of encoding parameters then leads to a pair of rate and distortion values, an operational point in the R-D plane. The lower bound of all these rate-distortion pairs is referred to as the operational rate-distortion (ORD) function.
     • Block diagram for a typical lossy source coding system:
        • block code $Q_N = \{\alpha_N, \beta_N, \gamma_N\}$ (blocks of N consecutive input samples are coded independently)
        • bitrate R (average number of bits per source symbol)
        • additive distortion measure D (MSE between source and reconstructed symbols)

  6. Operational R-D function
     • For a given source S and code Q, the operational point (R,D) is defined by $R = r(Q)$ and $D = \delta(Q)$.
     • A rate-distortion point (R,D) is achievable if there is a code Q with $r(Q) \le R$ and $\delta(Q) \le D$. The achievable points form a region of the operational R-D plane, and the function R(D) that describes the fundamental bound of this region for a given source S is the ORD function.
     • The ORD boundary of the region of achievable rate-distortion points specifies:
        • the minimum rate R required to represent the source S with a distortion less than or equal to a given value D or, alternatively,
        • the minimum distortion D that can be achieved if the source S is coded at a rate less than or equal to a given value R.
     [Figure: region of achievable rate-distortion points (R,D) bounded by the operational R(D) function; axis marks Rmin, Dmin, R=Max, D=Max]

  7. Quantization
     • Uniform scalar quantizer ($\Delta$ = const, $D \approx \Delta^2/12$, optimal $\gamma$)
     • Non-uniform optimal quantizer (Lloyd–Max, centroids of the pdf)
     • Asymptotic performance: $D_L(R) = \varepsilon_S^2\,\sigma^2\,2^{-2R}$ (Shannon lower bound)
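The approximation $D \approx \Delta^2/12$ for the uniform quantizer can be checked with a few lines of code. This is a minimal sketch assuming a mid-tread quantizer and inputs spread evenly over an interval; the function names and the input range are illustrative assumptions.

```python
import math


def uniform_quantize(x, step):
    """Mid-tread uniform quantizer: map x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)


def mse_uniform_input(step, n_samples=100_000, lo=-1.0, hi=1.0):
    """Mean squared quantization error for inputs spread evenly over [lo, hi)."""
    width = (hi - lo) / n_samples
    total = 0.0
    for k in range(n_samples):
        x = lo + (k + 0.5) * width          # midpoint of the k-th sub-interval
        err = x - uniform_quantize(x, step)
        total += err * err
    return total / n_samples


if __name__ == "__main__":
    step = 0.125
    print(mse_uniform_input(step), step ** 2 / 12.0)
```

The agreement is close here because the input density is flat within every quantization cell, which is the assumption behind the $\Delta^2/12$ rule; for other densities it holds only at fine step sizes.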

  8. Entropy coding ($\gamma$)
     • Variable length code (VLC):
        • A Huffman code minimizes the average code length $L_s = \sum_i p(s_i)\,\mathrm{length}(s_i)$ [bits/symbol]
        • An optimal code for probabilities $p^*(s_i)$ approaches the first-order entropy $H_s = -\sum_i p^*(s_i)\,\log_2 p^*(s_i)$ [bits/symbol]
        • K = 2: $p(s_1) = P_1$, $p(s_2) = 1 - P_1$, so $H_s = -P_1 \log_2 P_1 - (1-P_1)\log_2(1-P_1)$ bits/symbol
        • $P_1 = 0.5$: maximum $H_s = 1$, redundancy $\log_2 K - H_s = 0$
     • Arithmetic encoder (CABAC):
        • adaptive estimation of the statistical distribution $p(s_i)$
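The relation between average Huffman code length and first-order entropy can be illustrated as follows. This is a minimal sketch with hypothetical function names; it computes codeword lengths only, not the codewords themselves.

```python
import heapq
import math


def entropy_bits(probs):
    """First-order entropy H = -sum p * log2(p), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def huffman_lengths(probs):
    """Codeword lengths of a binary Huffman code for the given probabilities."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, unique tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:                 # merged symbols move one level deeper
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def average_length(probs, lengths):
    """Average code length L = sum p(s_i) * length(s_i)."""
    return sum(p * n for p, n in zip(probs, lengths))


if __name__ == "__main__":
    probs = [0.5, 0.25, 0.125, 0.125]
    lengths = huffman_lengths(probs)
    print(lengths, average_length(probs, lengths), entropy_bits(probs))
```

For dyadic probabilities such as the ones in the usage example the average length equals the entropy; in general the Huffman code stays within one bit per symbol of it.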

  9. Predictive coding
     • Differential coder
     • Predictive coder (DPCM)
     • Linear prediction $\hat{S}_n$:
        • prediction coefficients $p_i$
        • prediction error $U_n$
        • reconstruction error $U'_n$
     • Optimal linear prediction ($U_n$ orthogonal to $\hat{S}_n$)
     • Prediction error variance: $\sigma'^2 = \varepsilon_\alpha^2\,\sigma^2 \ge \gamma_S^2\,\varepsilon_\alpha^2\,\sigma_S^2$, where $\gamma_S^2$ is the spectral flatness measure (sfm)
     • Asymptotic performance: coding gain $CG = 1/\gamma_S^2$
     • N = 1: $p_{1,opt} = \rho_1$, $CG = 1/(1-\rho_1^2)$
     • N = 2: $p_{1,opt} = \rho_1(1-\rho_2)/(1-\rho_1^2)$, $p_{2,opt} = (\rho_2-\rho_1^2)/(1-\rho_1^2)$
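The optimal coefficients for N = 2 follow from the normal (Yule-Walker) equations with the correlation coefficients $\rho_1$ and $\rho_2$. A minimal sketch, assuming a stationary source and ignoring quantization inside the prediction loop; the function names are hypothetical.

```python
def optimal_predictor_n2(rho1, rho2):
    """Solve [[1, rho1], [rho1, 1]] [p1, p2]^T = [rho1, rho2]^T."""
    det = 1.0 - rho1 ** 2
    p1 = rho1 * (1.0 - rho2) / det
    p2 = (rho2 - rho1 ** 2) / det
    return p1, p2


def prediction_gain(coeffs, rhos):
    """Variance ratio sigma_S^2 / sigma_U^2 for optimal coefficients.

    With optimal coefficients the error variance is
    sigma_S^2 * (1 - sum_i p_i * rho_i).
    """
    return 1.0 / (1.0 - sum(p * r for p, r in zip(coeffs, rhos)))


if __name__ == "__main__":
    rho1 = 0.9
    # First-order Markov source: rho2 = rho1^2, so the second tap vanishes.
    p1, p2 = optimal_predictor_n2(rho1, rho1 ** 2)
    print(p1, p2, prediction_gain((p1, p2), (rho1, rho1 ** 2)))
```

For a first-order Markov source the second coefficient is zero and the gain equals the N = 1 value $1/(1-\rho_1^2)$, so a longer predictor brings no further gain in that case.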

  10. Transform coding
     • Linear transformation:
        • A: transform matrix
        • B: inverse transform matrix
        • A orthogonal: $A^{-1} = A^T$, $A^T A = A A^T = I$
        • B orthonormal: $B = A^{-1} = A^T$ (the mean of the N coefficient variances equals the variance of s)
     • Optimal linear transformation: KLT (eigenvectors and eigenvalues of the autocovariance matrix $R_{SS}$)
     • Asymptotic performance: coding gain $CG = 1/\gamma_S^2$
     • Optimal bitrate allocation R between N quantizers:
        • N = 2
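For N = 2 the KLT can be written in closed form, which makes the coding gain easy to verify. A minimal sketch, assuming a stationary source with variance $\sigma^2$ and correlation coefficient $\rho$ between the two samples of a block; the function names are hypothetical, and the gain is taken as the ratio of arithmetic to geometric mean of the coefficient variances (the usual high-rate definition).

```python
import math


def klt_2x2(rho, sigma2=1.0):
    """KLT of the covariance sigma2 * [[1, rho], [rho, 1]].

    Returns the orthonormal basis (rows are eigenvectors) and the
    coefficient variances (eigenvalues).
    """
    a = 1.0 / math.sqrt(2.0)
    basis = [[a, a], [a, -a]]
    variances = [sigma2 * (1.0 + rho), sigma2 * (1.0 - rho)]
    return basis, variances


def transform(basis, block):
    """Apply the transform matrix to one block of samples."""
    return [sum(row[j] * block[j] for j in range(len(block))) for row in basis]


def coding_gain(variances):
    """Arithmetic mean over geometric mean of the coefficient variances."""
    n = len(variances)
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic / geometric


if __name__ == "__main__":
    basis, variances = klt_2x2(rho=0.6)
    print(variances, coding_gain(variances))
    print(transform(basis, [3.0, 1.0]))
```

Under these assumptions the gain reduces to $1/\sqrt{1-\rho^2}$, and the transform preserves block energy because the basis is orthonormal.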

  11. 1.3 Formulation of efficient video coding
     • A standard-based codec requires an optimization procedure over a set of allowed operating parameters, as well as additional criteria that arise from real-time operation (complexity, delay).
     • The goal of operational information theory is to find a set of encoder operating parameters that is optimal in the R(D) sense. An efficient optimization procedure based on fast algorithms, instead of a full search of the parameter space, is also required.
     • The practical trade-off between the allowed distortion D and the available bitrate R in encoder design is based on a discrete optimization procedure that finds locally optimal operational (R, D) points.

  12. Lagrange multiplier method
     • Formulation of the R-D problem:
        • Cost function with constraint
        • Necessary condition for the existence of a minimum
        • The solution
     • Unconstrained Lagrangian cost function:
        • Necessary condition for the existence of a minimum
        • The solution is a simultaneous iteration of R and λ
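The steps listed on this slide can be sketched for one concrete case. The derivation below assumes the high-rate model $D(R) = \varepsilon^2 \sigma^2 2^{-2R}$ used on the earlier slides and the cost $J = D + \lambda R$; the symbols $J$, $R_{\max}$, $R^*$ and $D^*$ are assumptions introduced here for illustration.

```latex
\begin{align}
  &\min_{R} D(R) \quad \text{subject to} \quad R \le R_{\max}
      && \text{constrained problem} \\
  &J(R,\lambda) = D(R) + \lambda R
      && \text{Lagrangian cost} \\
  &\frac{\partial J}{\partial R} = \frac{dD}{dR} + \lambda = 0
      \;\Rightarrow\; \lambda = -\frac{dD}{dR}
      && \text{necessary condition} \\
  &\frac{dD}{dR} = -2\ln 2 \cdot D
      \;\Rightarrow\; D^* = \frac{\lambda}{2\ln 2}
      && \text{for } D(R) = \varepsilon^2\sigma^2 2^{-2R} \\
  &R^* = \frac{1}{2}\log_2\frac{2\ln 2\;\varepsilon^2\sigma^2}{\lambda}
      && \text{rate at the optimum}
\end{align}
```

The multiplier $\lambda$ is then adjusted until $R^*(\lambda) = R_{\max}$, which is why R and $\lambda$ are iterated together.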

  13. Geometrical interpretation
     • The operational R-D function is the convex border that connects a subset of locally optimal operational points (the connected operational points are sub-optimal solutions of the Lagrange method).
     • The optimal operational point (D,R), the solution of the Lagrange method min(D + λR) for constant λ, is the operational point on the convex border touched by the line whose slope is set by λ.
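The geometric picture can be reproduced on a small set of operational points. This is a minimal sketch with made-up (R, D) pairs and hypothetical function names: `best_point` minimizes D + λR for a fixed λ, and `lower_convex_hull` extracts the decreasing part of the convex border with a monotone-chain scan.

```python
def best_point(points, lam):
    """Return the (R, D) pair that minimizes the Lagrangian cost D + lam * R."""
    return min(points, key=lambda rd: rd[1] + lam * rd[0])


def lower_convex_hull(points):
    """Points on the decreasing part of the lower convex hull, by increasing R."""
    hull = []
    for r, d in sorted(set(points)):
        # Drop the last hull point while it lies on or above the chord
        # from its predecessor to the new point.
        while len(hull) >= 2:
            (r1, d1), (r2, d2) = hull[-2], hull[-1]
            cross = (r2 - r1) * (d - d1) - (d2 - d1) * (r - r1)
            if cross > 0:
                break
            hull.pop()
        hull.append((r, d))
    # Keep only the part where distortion still decreases with rate.
    border = [hull[0]]
    for point in hull[1:]:
        if point[1] < border[-1][1]:
            border.append(point)
    return border


if __name__ == "__main__":
    # Illustrative operational points (rate, distortion); not from the slides.
    points = [(1, 10), (2, 6), (3, 5), (4, 2), (5, 1), (3, 7), (5, 4)]
    print(lower_convex_hull(points))
    for lam in (0.5, 1.5, 3.0, 6.0):
        print(lam, best_point(points, lam))
```

In this example every minimizer returned by `best_point` lies on the hull, while the point (3, 5) is never selected for any λ although no other point is better in both rate and distortion. This is the known limitation of the Lagrangian approach on non-convex point sets.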

  14. Optimal bit allocation
     • Formulation: optimal bit allocation with constraint
     • Unconstrained Lagrangian cost function
     • Necessary condition for the existence of a minimum
     • The solution is a simultaneous iteration of $R_i$ and λ
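Under the high-rate model $D_i = \sigma_i^2\,2^{-2R_i}$ the allocation problem has a closed-form solution, which gives a feel for what the iteration converges to. A minimal sketch with hypothetical function names; it assumes a common constant factor for all components and does not clip negative rates, which a practical allocator would have to handle.

```python
import math


def allocate_bits(variances, avg_rate):
    """R_i = R + 0.5 * log2(sigma_i^2 / geometric mean of the variances)."""
    n = len(variances)
    log_gm = sum(math.log2(v) for v in variances) / n
    return [avg_rate + 0.5 * (math.log2(v) - log_gm) for v in variances]


def component_distortions(variances, rates):
    """Distortion of each component under the model D_i = sigma_i^2 * 2^(-2 R_i)."""
    return [v * 2.0 ** (-2.0 * r) for v, r in zip(variances, rates)]


if __name__ == "__main__":
    variances = [16.0, 4.0, 1.0, 1.0]
    rates = allocate_bits(variances, avg_rate=2.0)
    print(rates, sum(rates))
    print(component_distortions(variances, rates))
```

At the optimum all components have the same distortion, which is the equal-slope condition of the Lagrangian solution.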

  15. Joint hierarchical optimization
     • Optimal image decomposition and bitrate allocation:
        • discrete version of the Lagrange multiplier method,
        • deterministic dynamic programming (forward/backward).
     • The solution:
        • The image is decomposed to a pre-specified number of levels.
        • For the adopted value of the quality parameter λ = const, the operational point min(D + λR) is calculated at each level of decomposition for each partition and the specified set of quantizers.
        • At each level of decomposition a split/merge decision is made (principle of optimality) by comparing the Lagrangian costs of successive levels of decomposition.
        • Binary search (Newton method) determines the optimal λ* for a given bitrate $R_{\max}$ and the initial search interval.
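The outer search for λ* can be sketched for the simpler case of independent coding units, each with its own set of quantizer choices. The code below is an illustration under stated assumptions: the function names, the example (R, D) pairs and the search interval are hypothetical, plain bisection is used for the search, and the split/merge decisions of the hierarchical decomposition are not modeled.

```python
def pick(options, lam):
    """Operational point of one unit that minimizes D + lam * R."""
    return min(options, key=lambda rd: rd[1] + lam * rd[0])


def allocate(units, lam):
    """Independent per-unit minimization; returns choices, total rate, total distortion."""
    choice = [pick(options, lam) for options in units]
    total_rate = sum(r for r, _ in choice)
    total_dist = sum(d for _, d in choice)
    return choice, total_rate, total_dist


def search_lambda(units, r_max, lo=0.0, hi=1e6, iterations=60):
    """Bisection on lambda; assumes the allocation at `hi` already meets r_max."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        _, rate, _ = allocate(units, mid)
        if rate > r_max:
            lo = mid      # over budget: penalize rate more strongly
        else:
            hi = mid      # within budget: try a smaller multiplier
    return hi


if __name__ == "__main__":
    # Two units with illustrative (rate, distortion) options per quantizer.
    units = [
        [(1, 10), (2, 6), (4, 2), (5, 1)],
        [(1, 8), (2, 3), (3, 2)],
    ]
    lam = search_lambda(units, r_max=6)
    print(lam, allocate(units, lam))
```

The bisection works because the total rate selected by the per-unit minimization does not increase as λ grows.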

  16. 2. Operational control of standard-based encoder
     • A digital video encoder exploits the statistical redundancy of the source as well as the perceptual irrelevancy for the user.
     • Block-adaptive hybrid transform-entropy encoder with motion estimation and compensation.
     [Figure label: Scope of standardization]

  17. 2.1 Operational MPEG framework
     • ITU/MPEG process of standardization
     • Encoding techniques and operational parameters

  18. Set of operational parameters
     • The task of encoder control is to determine the values of the standardized syntax elements, and thus the bitstream b, for a given input sequence so that the distortion between the input sequence and its reconstruction is minimized subject to a set of constraints on average and maximum bit rate.
     • Let $B_c$ be the set of all conforming bitstreams that obey the given set of constraints. For a distortion measure D, the optimal bitstream in the rate–distortion sense is the bitstream in $B_c$ that minimizes D.
     • Due to the huge parameter space and encoding delay, it is impossible to apply this minimization directly. Instead, the overall minimization problem is split into a series of K smaller minimization problems (p is a subset of the operational parameters).
     • The constrained minimization problem can be reformulated as an unconstrained minimization, where Q denotes the quantization step size, which is controlled by the quantization parameter QP.
     [Figure: R(QP) and D(QP)]

  19. 2.2 Performance/efficiency of digital video codec
     HD720 test sequences, H.265 vs. H.263 at BR = 512 kbps:
     • Sequence 1: H.265 QP=30, PSNR = 39.66 dB; H.263 QP=20, PSNR = 34.00 dB
     • Sequence 2: H.265 QP=30, PSNR = 39.36 dB; H.263 QP=31, PSNR = 30.94 dB
     • Sequence 3: H.265 QP=30, PSNR = 39.24 dB; H.263 QP=25, PSNR = 32.78 dB

  20. Coding gain BRCG, PSNR = const
     • Three test sequences (1/2/3) with typical video conferencing content were selected for the experiments (Vidyo, 1280x720, 60 fps, 10 s).
     • Each test sequence was coded at 12 different bitrates. The ORD functions $PSNR_{YUV}(BR)$ are shown for bitrates BR = 0.256, 0.384, 0.512, 0.850, 1.500 Mbps.
     • The combined $PSNR_{YUV}$ is calculated as the weighted sum of the per-picture PSNR of the individual components, $PSNR_{YUV} = (6 \cdot PSNR_Y + PSNR_U + PSNR_V)/8$, where each component is computed as $PSNR = 10 \log_{10}\left((2^B-1)^2/MSE\right)$, B = 8.
     [Figure: bitrate reduction of HEVC vs. AVC based on subjective MOS performance for typical video conferencing bitrates, sequences 1/2/3]
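The two PSNR formulas on this slide translate directly into code. A minimal sketch with hypothetical function names; the MSE values per component are assumed to be computed elsewhere.

```python
import math


def psnr(mse, bit_depth=8):
    """PSNR = 10 * log10((2^B - 1)^2 / MSE) in dB."""
    peak = (2 ** bit_depth - 1) ** 2
    return 10.0 * math.log10(peak / mse)


def psnr_yuv(psnr_y, psnr_u, psnr_v):
    """Combined PSNR with the 6:1:1 weighting of luma and chroma."""
    return (6.0 * psnr_y + psnr_u + psnr_v) / 8.0


if __name__ == "__main__":
    # Illustrative MSE values for the Y, U and V components of one picture.
    y, u, v = psnr(6.5), psnr(3.0), psnr(2.5)
    print(round(y, 2), round(u, 2), round(v, 2), round(psnr_yuv(y, u, v), 2))
```

The weighting is applied to the per-picture PSNR values of the components, as stated on the slide, not to the underlying MSE values.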

  21. Coding gain PSNRCG, BR = const
     • Variability of $PSNR_Y$ per frame (time) for BR = const (BR ≈ 0.512 Mbps: $QP_{HEVC}$ = 30, $QP_{AVC}$ = 32, $QP_{H.263}$ = 20/31/25 for sequences 1/2/3)

  22. Complexity of encoder/decoder
     • The encoding and decoding times for the representative HD720 sequences (60 fps, 10 s) are shown for sequences 1/2/3. Times are recorded in units of 10 s, the sequence duration, so that they directly show the ratio to real-time operation:
        • the HEVC encoding time exceeds 1000 times real-time,
        • the decoding time exceeds 4 times real-time on an Ultrabook x86-64 Core i5 2/4@1.7GHz 4GB RAM.

  23. 2.3 Bitrate control
     • The objective of rate control is to regulate the MPEG coded bit stream so that it satisfies given conditions (variable/constant bit budget constraints, prevention of buffer overflow and underflow).
     • Variable/constant (VBR/CBR) bitrate is obtained with a constant/variable quantization parameter QP in an open/closed loop.
     • A typical rate-control scheme consists of two basic operations:
        • bit allocation (R-D model), and
        • bit rate control (buffer occupancy measure).
     • To achieve the target bit rate R, the rate control scheme chooses an appropriate quantization parameter Q. An accurate rate-quantization (R-Q) model is therefore important. Together with the distortion-quantization (D-Q) function, the R-Q function characterizes the rate-distortion (R-D) behavior of video encoding.
     • The first step in deriving a rate control formula is to approximate the rate-quantization function R-Q by an inversely proportional curve, as shown in the figure.
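The inversely proportional R-Q approximation leads to a very simple control rule. The following is a minimal sketch, assuming the model R = X / Q with a single complexity constant X estimated from the previously coded picture; the function names, the clamping range for Q and the example numbers are hypothetical and not taken from any standard or reference encoder.

```python
def estimate_complexity(bits_used, q_used):
    """Estimate X in the model R = X / Q from one coded picture."""
    return bits_used * q_used


def choose_q(complexity, target_bits, q_min=1.0, q_max=100.0):
    """Pick the quantizer value that the model predicts will hit the target."""
    q = complexity / target_bits
    # Clamp to an assumed valid quantizer range.
    return max(q_min, min(q_max, q))


def predicted_bits(complexity, q):
    """Bits predicted by the model for a given quantizer value."""
    return complexity / q


if __name__ == "__main__":
    # Previous picture: 40000 bits at Q = 10; next target: 20000 bits.
    x = estimate_complexity(bits_used=40000, q_used=10.0)
    q = choose_q(x, target_bits=20000)
    print(q, predicted_bits(x, q))
```

A closed-loop scheme would additionally correct the target using the buffer occupancy before calling `choose_q`, which this sketch leaves out.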

  24. Joint encoding (Det/StatMux)
     • Deterministic multiplex of L video sequences, CBR encoded with constant bitrate $R_i$ (variable $D_i$ and picture quality) in a fixed channel capacity $R_c$.
     • Statistical multiplex of L+SMCG video sequences, VBR encoded with variable bitrate $R_i$ (constant $D_i$ and picture quality). The criterion is a joint buffer occupancy measure.

  25. References
     [1] K.R. Rao, Z.S. Bojkovic, D.A. Milovanovic, Introduction to multimedia communications: applications – middleware - networking, Wiley, 2005.
     [2] K.R. Rao, Z.S. Bojkovic, D.A. Milovanovic, Multimedia communication systems: techniques, standards, and networks, Prentice Hall, 2002.
     [3] Y. Shoham, A. Gersho, "Efficient bit allocation for an arbitrary set of quantizers," IEEE Trans. ASSP, vol. 36, pp. 1445-1453, Sep. 1988.
     [4] T. Berger, Rate-Distortion theory: A mathematical theory for data compression, Prentice-Hall, 1971.
     [5] D.P. Bertsekas, Constrained optimization and Lagrange multiplier methods, Athena Scientific, 1996.
     [6] R. Bellman, Dynamic Programming, Princeton University Press, 1957.
     [7] D.A. Milovanovic, Z.S. Bojkovic, From information theory to standard codec optimization for digital visual multimedia, Seminar on Computer science and Applied mathematics, June 2013, Mathematical institute of the Serbian Academy of science and arts, and IEEE Chapter Computer Science (CO-16), Belgrade, Serbia.
     [8] D. Milovanović, Z. Milićević, Z. Bojković, MPEG video deployment in digital television: HEVC vs. AVC codec performance study, 11th International Conference on Telecommunications in Modern Satellite, Cable and Broadcasting Services TELSIKS2013, Nis, Serbia, Oct. 2013.
