Outline

Outline • Introduction on Multimedia Coding • Motion Estimation • Discrete Cosine Transform • Video Coding Standards

Multimedia Concepts • What is multimedia? • Combination of audio, video, image, graphic, and text. • Coverage of all human I/O’s. • Why does multimedia need to be coded?

Multimedia Coding for Different Applications • Mobile devices • Low data rate, error resilience, scalability • Streaming service • Scalability, low to medium data rate, interactivity • On-disk distribution (DVD) • Interactivity • Broadcast • On-demand services

System Architecture • Compression Layer • Streams from as low as bps to Mbps • Media aware • Delivery unaware • System Layer • Manages Elementary Streams, their synchronization and hierarchical relations • Media aware • Delivery aware • Delivery Layer • Provides transparent access and delivery of content irrespective of delivery technologies • Media unaware • Delivery aware

Coding of Audiovisual Objects • An audiovisual scene is composed of “objects” • Different objects are mixed on the screen • Visual • Video • Animated face & body • 2D and 3D animated meshes • Text and Graphics • Audio • General audio: mono, stereo, and multichannel • Speech • Synthetic sounds (“Structured audio”) • Environmental spatialization

Example of MPEG-4 Video Objects • Arbitrary shape video object • Animated face • Rectangular shape video object • From Olivier Avaro

The Scene Tree

Composition • Description & Synchronization • Delivery of streaming data • Interaction with media objects • Management and identification of intellectual property

Major Components

Scene Graph Media Objects Composition Rendering

Adding or Removing Objects (1) – = +

Adding or Removing Objects (2) From Igor S. Pandžić

Adding or Removing Objects (3) • Applications • Video conferencing • Real-time, automatic • Separate foreground (communication partner) from background • Object tracking in video • May allow off-line and semi-automatic • Separate moving object from others

Coding Techniques • Video objects • Shape • Motion vectors • Texture • Audio objects • MPEG • AAC (Advanced Audio Coding) • TTS (Text-To-Speech) • Face and Body • Animation parameters • 2D Mesh • Triangular patches • Motion vector

Encoding of Visual Objects • Binary alpha block • Motion vector • Context-based arithmetic encoding • Texture • Motion vector • DCT

Natural Audio Coder • General audio (AAC, TwinVQ) • Parametric audio (HILN) • Parametric speech (HVXC) • High quality speech (CELP) • Figure axes: quality (Cellular, Telephone, AM, FM, CD) versus bit rate (2, 4, 8, 16, 32, 64 kbit/s) • From Olivier Dechazal

Facial Animation From Eine Übersicht

Object Mesh • Useful for animation, content manipulation, content overlay, merging natural and synthetic video... • Tessellate with triangular patches

Sprite Coding • Represent the background as an image larger than a single frame. • Useful for camera motion

Multiview Video

Outline • Introduction on Multimedia Coding • Discrete Cosine Transform • Motion Estimation • Video Coding Standards

Outline • What Is DCT And Why Use DCT • How to Compute DCT • Program The DCT • Conclusion

An Image-Transform Coding System • Encoder: Input samples → Forward transform → Quantizer (÷10) → Binary encoder (e.g. Huffman coding, zip, RAR) → Network • Decoder: Network → Binary decoder → Inverse quantizer (×10) → Inverse transform → Output samples
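The ÷10 and ×10 boxes in the diagram act as a uniform quantizer and its inverse. The following is a minimal sketch of that pair, assuming a step size of 10 and round-to-nearest (the slide shows only the division, so the rounding rule is an assumption). The names QSTEP, quantize and dequantize are hypothetical.

```c
#define QSTEP 10  /* step size taken from the "/10" and "x10" boxes */

/* Quantizer: divide by the step and round to the nearest integer.
   The rounding rule is an assumption, not stated on the slide. */
int quantize(int coeff)
{
    if (coeff >= 0)
        return (coeff + QSTEP / 2) / QSTEP;
    return -((-coeff + QSTEP / 2) / QSTEP);
}

/* Inverse quantizer: multiply the level back by the step. */
int dequantize(int level)
{
    return level * QSTEP;
}
```

A coefficient of 127 becomes level 13 and is reconstructed as 130. The difference of 3 is the quantization error, which is where the lossy part of the system comes from.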

Introduction (1/5) – Representation of An Image • How to code an image? • Spatial domain (pixel-based) • Transform domain • Transformation methods • KLT, DFT, DWT, DCT...

Introduction (2/5) – Why Use DCT? Properties of DCT • Uses cosine functions as its basis functions • Performance approaches KLT • Fast algorithms exist • Most popular in image compression applications • Adopted in JPEG, M-JPEG, MPEG, H.26x

Introduction (3/5) - Does Transform Really Make Sense ? • Energy compaction • De-correlation: dependency elimination

Introduction (4/5) - Examples • 8 × 8 block

Introduction (5/5) - Examples • Spatial domain: pixel values, each pixel expressed by its value • Transform domain: DCT coefficients, e.g. the coefficient of the basis vector (0,0) • DCT maps pixel values to coefficients; IDCT maps them back

Definition of Basis Function • Basis function of the 1-D N-point DCT • For N = 8
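The formula on this slide did not survive extraction. The following is a reconstruction, assuming the orthonormal normalization used by the C listing later in these slides (C[0] = 0.35355339 and C[i] = 0.5 for N = 8). The symbol names \phi_u and c(u) are illustrative.

```latex
\[
\phi_u(x) = c(u)\,\cos\frac{(2x+1)\,u\,\pi}{2N},
\qquad x, u = 0, 1, \dots, N-1,
\qquad
c(u) =
\begin{cases}
\sqrt{1/N}, & u = 0 \\
\sqrt{2/N}, & u \neq 0
\end{cases}
\]

% For N = 8:
\[
\phi_u(x) = c(u)\,\cos\frac{(2x+1)\,u\,\pi}{16},
\qquad
c(0) = \frac{1}{2\sqrt{2}} \approx 0.3536,
\qquad
c(u \neq 0) = \frac{1}{2}
\]
```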

Basic diagram of DCT Discrete cosine transform and Inverse DCT (1) (2)

The basis of 2D-DCT with 8x8 block

Again – Do You Know What DCT Means? • Spatial domain: pixel values, each pixel expressed by its value • Transform domain: DCT coefficients, e.g. the coefficient of the basis vector (0,0) • DCT maps pixel values to coefficients; IDCT maps them back

How to Compute: 1-D vs. 2-D • [1-D] For an M × N 2-D block, we can use the 1-D N-point DCT in the row direction, then the 1-D M-point DCT in the column direction to get the 2D-DCT • [2-D] If 8 × 8 blocks are applied, the 2D-DCT will be
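The 8 × 8 formula itself was lost in extraction. The following reconstruction is consistent with the "Normal form" C listing later in these slides, which uses the same cosine terms and the factors 0.35355339 and 0.5. The symbols f(x, y) for pixel values and F(u, v) for coefficients are assumed names.

```latex
\[
F(u,v) = c(u)\,c(v) \sum_{x=0}^{7} \sum_{y=0}^{7}
f(x,y)\,\cos\frac{(2x+1)\,u\,\pi}{16}\,\cos\frac{(2y+1)\,v\,\pi}{16},
\qquad u, v = 0, 1, \dots, 7
\]
\[
c(0) = \frac{1}{2\sqrt{2}}, \qquad c(k) = \frac{1}{2} \quad (k \neq 0)
\]
```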

DCT matrix is orthonormal • The above equation is zero if u ≠ v → orthogonal • The basis vectors of the DCT have unit norm • From the above two properties, we know the DCT matrix is orthonormal • The same applies to the 2D-DCT

Properties of Orthonormal Transforms • Energy is conserved • The transform matrix can be factored → separable

Energy conservation of orthonormal transform
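The derivation on this slide was not extracted. A short version of the standard argument follows, with assumed notation: x is the input vector, C the orthonormal transform matrix, and y = Cx the coefficient vector.

```latex
\[
C^{T} C = I
\quad\Longrightarrow\quad
\|y\|^{2} = y^{T} y = (Cx)^{T}(Cx) = x^{T} C^{T} C\, x = x^{T} x = \|x\|^{2}
\]
% i.e. \sum_{u} y(u)^{2} = \sum_{n} x(n)^{2}:
% the total energy is the same in both domains.
```

The transform only redistributes energy among coefficients. Compression gain comes from most of that energy landing in a few coefficients.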

Separable Transform (1/2)

Separable Transform (2/2)
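The equations on the two "Separable Transform" slides were lost. In matrix form, with assumed notation (X the N × N pixel block, C the N-point DCT matrix, Y the coefficient block), separability can be written as:

```latex
\[
Y = C\,X\,C^{T},
\qquad
Y(u,v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} C_{u,x}\, X(x,y)\, C_{v,y}
\]
% Cost in multiply-accumulate operations:
%   direct double sum:            N^2 outputs x N^2 terms = N^4   (4096 for N = 8)
%   two passes of N 1-D DCTs:     N^3 + N^3 = 2N^3                (1024 for N = 8)
```

Computing one matrix product first (a 1-D DCT on every row or column) and then the other gives the same result as the direct double sum at a quarter of the cost for 8 × 8 blocks.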

Fast DCT algorithm (1/2)

Fast DCT algorithm (2/2)

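How to program (1/3) - Normal form
/***************************************************************************/
/* 2D 8*8 DCT                                                              */
/* Input                                                                   */
/*   int argSource[8][8]: one block in the original image                  */
/* Output                                                                  */
/*   int argDCT[8][8]: the block in the frequency domain corresponding     */
/*   to argSource[8][8] (coefficients are truncated to integers)           */
/***************************************************************************/
#include <math.h>

#define PI 3.14159265358979323846

void DCT(int argDCT[8][8], int argSource[8][8])
{
    float C[8], Cos[8][8];
    float temp;
    int i, j, u, v;

    /* Cos[i][j] = cos((2i+1) j pi / 16): sample i of basis function j */
    for (i = 0; i < 8; i++)
        for (j = 0; j < 8; j++)
            Cos[i][j] = cos((2 * i + 1) * j * PI / 16);

    /* normalization factors: 1/(2*sqrt(2)) for index 0, 1/2 otherwise */
    C[0] = 0.35355339;
    for (i = 1; i < 8; i++)
        C[i] = 0.5;

    for (u = 0; u < 8; u++)
        for (v = 0; v < 8; v++) {
            temp = 0.0;
            for (i = 0; i < 8; i++)
                for (j = 0; j < 8; j++)
                    /* subtract 128 to center 8-bit pixel values around zero */
                    temp += Cos[i][u] * Cos[j][v] * (argSource[i][j] - 128);
            temp *= C[u] * C[v];
            argDCT[u][v] = (int)temp;
        }
}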

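How to program (2/3) - Fast algorithm -1
/***************************************************************************/
/* 2D 8*8 DCT                                                              */
/* Input                                                                   */
/*   int argSource[8][8]: one block in the original image                  */
/* Output                                                                  */
/*   int argDCT[8][8]: the block in the frequency domain corresponding     */
/*   to argSource[8][8]                                                    */
/***************************************************************************/
/* Not shown on the slide; assumed to be defined elsewhere: */
extern float C[8][8];   /* 8x8 DCT matrix */
extern float Ct[8][8];  /* transpose of C */
#define ROUND(x) ((int)((x) >= 0 ? (x) + 0.5 : (x) - 0.5))  /* assumed definition */

void DCT(int argDCT[8][8], int argSource[8][8])
{
    float temp[8][8], temp1;
    int i, j, k;

    /* first pass: temp = (argSource - 128) * Ct */
    for (i = 0; i < 8; i++)
        for (j = 0; j < 8; j++) {
            temp[i][j] = 0.0;
            for (k = 0; k < 8; k++)
                temp[i][j] += ((int)argSource[i][k] - 128) * Ct[k][j];
        }

    /* second pass: argDCT = C * temp */
    for (i = 0; i < 8; i++)
        for (j = 0; j < 8; j++) {
            temp1 = 0.0;
            for (k = 0; k < 8; k++)
                temp1 += C[i][k] * temp[k][j];
            argDCT[i][j] = ROUND(temp1);
        }
}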

How to program (3/3) - Algorithm suitable for hardware implementation
#include <stdio.h>

#define RS(r,s) ((r) >> (s))
#define SCALE(exp) RS((exp),10)

void DCT(short int *input, short int *output)
{
    short int jc, i, j, k;
    short int b[8];
    short int b1[8];
    short int d[8][8];
    int c0 = 724; /* left shift 10 */
    int c1 = 502;
    int c2 = 474;
    int c3 = 426;
    int c4 = 362;
    int c5 = 284;
    int c6 = 196;
    int c7 = 100;

    for (i = 0, k = 0; i < 8; i++, k += 8) {
        for (j = 0; j < 8; j++) {
            b[j] = input[k+j];
        }
        /* row transform */
        for (j = 0; j < 4; j++) {
            jc = 7 - j;
            b1[j] = b[j] + b[jc];
            b1[jc] = b[j] - b[jc];
        }
        b[0] = b1[0] + b1[3];
        b[1] = b1[1] + b1[2];
        b[2] = b1[1] - b1[2];
        b[3] = b1[0] - b1[3];
        b[4] = b1[4];
        b[5] = SCALE((b1[6] - b1[5]) * c0);
        b[6] = SCALE((b1[6] + b1[5]) * c0);
        b[7] = b1[7];
        d[i][0] = SCALE((b[0] + b[1]) * c4);
        d[i][4] = SCALE((b[0] - b[1]) * c4);
        d[i][2] = SCALE(b[2] * c6 + b[3] * c2);
        d[i][6] = SCALE(b[3] * c6 - b[2] * c2);
        b1[4] = b[4] + b[5];
        b1[7] = b[7] + b[6];
        b1[5] = b[4] - b[5];
        b1[6] = b[7] - b[6];
        d[i][1] = SCALE(b1[4] * c7 + b1[7] * c1);
        d[i][5] = SCALE(b1[5] * c3 + b1[6] * c5);
        d[i][7] = SCALE(b1[7] * c7 - b1[4] * c1);
        d[i][3] = SCALE(b1[6] * c3 - b1[5] * c5);
    }

    /* column transform */
    for (i = 0; i < 8; i++) {
        for (j = 0; j < 4; j++) {
            jc = 7 - j;
            b1[j] = d[j][i] + d[jc][i];
            b1[jc] = d[j][i] - d[jc][i];
        }
        b[0] = b1[0] + b1[3];
        b[1] = b1[1] + b1[2];
        b[2] = b1[1] - b1[2];
        b[3] = b1[0] - b1[3];
        b[4] = b1[4];
        b[5] = SCALE((b1[6] - b1[5]) * c0);
        b[6] = SCALE((b1[6] + b1[5]) * c0);
        b[7] = b1[7];
        d[0][i] = SCALE((b[0] + b[1]) * c4);
        d[4][i] = SCALE((b[0] - b[1]) * c4);
        d[2][i] = SCALE(b[2] * c6 + b[3] * c2);
        d[6][i] = SCALE(b[3] * c6 - b[2] * c2);
        b1[4] = b[4] + b[5];
        b1[7] = b[7] + b[6];
        b1[5] = b[4] - b[5];
        b1[6] = b[7] - b[6];
        d[1][i] = SCALE(b1[4] * c7 + b1[7] * c1);
        d[5][i] = SCALE(b1[5] * c3 + b1[6] * c5);
        d[7][i] = SCALE(b1[7] * c7 - b1[4] * c1);
        d[3][i] = SCALE(b1[6] * c3 - b1[5] * c5);
    }

    for (i = 0; i < 8; i++) {
        /* store 2-D array (8*8) data into a 1-D array (64) */
        for (j = 0; j < 8; j++) {
            *(output + i*8 + j) = (d[i][j]);
        }
    }
}

Conclusion • The DCT provides another way to represent an image, one that exploits the properties of the image • Fast algorithms exist, including one suitable for hardware implementation.

Outline • Introduction on Multimedia Coding • Motion Estimation • Discrete Cosine Transform • Video Coding Standards

Outline • What are motions in videos • The importance of motions • Motion representation • How to find the motion of a block • Block matching • Residual • Fast block matching algorithm • Intra frame and inter frame
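Block matching, listed in the outline above, can be sketched as an exhaustive (full) search. For one block of the current frame, every displacement within a search range is tried in the reference frame, and the one with the smallest sum of absolute differences (SAD) is kept. The names MotionVector, block_sad and full_search, the SAD criterion, and the 8-bit row-major frame layout are all assumptions made for illustration. This is not one of the fast block matching algorithms the outline mentions.

```c
#include <limits.h>
#include <stdlib.h>

typedef struct { int dx, dy; } MotionVector;

/* SAD between the bsize x bsize block at (bx, by) in cur and the block
   displaced by (dx, dy) in ref. Frames are 8-bit, row-major, width wide. */
static long block_sad(const unsigned char *cur, const unsigned char *ref,
                      int width, int bx, int by, int dx, int dy, int bsize)
{
    long sad = 0;
    for (int y = 0; y < bsize; y++)
        for (int x = 0; x < bsize; x++) {
            int a = cur[(by + y) * width + bx + x];
            int b = ref[(by + dy + y) * width + bx + dx + x];
            sad += abs(a - b);
        }
    return sad;
}

/* Try every displacement in [-range, range] in both directions and
   return the one with the smallest SAD (first minimum found wins). */
MotionVector full_search(const unsigned char *cur, const unsigned char *ref,
                         int width, int height,
                         int bx, int by, int bsize, int range)
{
    MotionVector best = {0, 0};
    long best_sad = LONG_MAX;
    for (int dy = -range; dy <= range; dy++)
        for (int dx = -range; dx <= range; dx++) {
            /* skip candidates that fall outside the reference frame */
            if (bx + dx < 0 || by + dy < 0 ||
                bx + dx + bsize > width || by + dy + bsize > height)
                continue;
            long sad = block_sad(cur, ref, width, bx, by, dx, dy, bsize);
            if (sad < best_sad) {
                best_sad = sad;
                best.dx = dx;
                best.dy = dy;
            }
        }
    return best;
}
```

The residual for the block is then the pixel-wise difference between the current block and the reference block at the returned displacement. The encoder codes that difference together with the motion vector.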

Motions in Video Clips • Local motions • Global motions Background
