Audio Video coding Standard (AVS) of China
Submitted by Swaminathan Sridhar
EE 5359 Multimedia Processing Project
Video coding standards [4], [5]
MPEG-2 (DVD, SDTV, HDTV)
• More than 10 years old
• Compression efficiency
– 4.7GB DVD: 2-hour movie (5.3Mbps)
– 18GB: 2-hour high-definition movie (20Mbps)
MPEG-4 AVC/H.264 (multimedia applications)
• Advanced coding techniques
– Multiple-reference frame prediction
– Context-based adaptive binary arithmetic coding (CABAC)
• High compression efficiency
– 1.5~2Mbps for SD, 6~8Mbps for HD
• Saves storage space, channel bandwidth, and frequency spectrum
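The MPEG-2 bit rates quoted above follow from dividing disc capacity by playing time. The sketch below is a minimal check of that arithmetic. It assumes decimal gigabytes (1 GB = 10^9 bytes) and ignores audio and container overhead; the function name `average_bitrate_mbps` is illustrative, not from the slides.

```python
def average_bitrate_mbps(capacity_gb: float, duration_hours: float) -> float:
    """Average bit rate in Mbps needed to fill capacity_gb in duration_hours.

    Assumes 1 GB = 1e9 bytes and 1 Mbps = 1e6 bits per second.
    """
    total_bits = capacity_gb * 1e9 * 8
    total_seconds = duration_hours * 3600
    return total_bits / total_seconds / 1e6


dvd = average_bitrate_mbps(4.7, 2)   # single-layer DVD, 2-hour movie
hd = average_bitrate_mbps(18, 2)     # 18 GB, 2-hour high-definition movie
print(f"DVD: {dvd:.2f} Mbps, HD: {hd:.2f} Mbps")
```

Under these assumptions the DVD case comes out slightly above 5.2 Mbps, close to the 5.3 Mbps on the slide, and the 18GB case comes out at 20 Mbps.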
Development stages of AVS [3]
December 2003
• In the 7th AVS meeting, AVS-video (part-2) and AVS-system (part-1) were finalized.
December 2004
• In the 11th AVS meeting, AVS-M (part-7) was finalized.
March 2005
• Authentication of ‘AVS101’, a high-definition decoding chip.
May 2005
• AVS Industry Alliance was set up.
June 2005
• Joint AVS/ISMA workshop on IPTV standard and industry forum.
February 2006
• AVS part-2 was announced as a national standard.
Applications of the commonly used parts of AVS China [3]
AVS Part-2: HD/SD video
• Jizhun Profile & Zengqiang Profile
• HD broadcasting
• High-density storage media
• Video surveillance
• Video on demand
AVS Part-7: Mobile video
• Jiben Profile
• Record and local playback on mobile devices
• Multimedia Message Service (MMS)
• Streaming and broadcasting
• Real-time video conversation
Major and minor coding tools used in AVS part 2 [1]
Major tools
• Interlace handling
– Picture-level adaptive frame/field coding (PAFF)
– Macroblock-level adaptive frame/field coding (MBAFF)
• Intra prediction: 5 modes for luma and 4 modes for chroma
• Motion compensation: 16x16/16x8/8x16/8x8 block size
• Resolution of MV: 1/4-pel, 4-tap interpolation filter
• Transform: 16-bit-implemented 8x8 integer cosine transform
• Quantization and scaling: scaling only in the encoder
• Entropy coding: 2D-VLC and arithmetic coding
• In-loop deblocking filter
Minor tools
• Motion vector prediction
• Adaptive scan
Different picture types [2]
• AVS defines three types of picture:
– Intra pictures (I)
– Predicted pictures (P): at most two reference frames (P or I)
– Interpolated pictures (B): two reference frames (I or P or both)
MB-level adaptive frame/field coding (MBAFF) [2]
• The frame/field encoding decision is made independently for each vertical pair of macroblocks in a frame.
• A frame consisting of both moving and non-moving regions is coded more efficiently by using:
– frame mode for the non-moving regions
– field mode for the moving regions
• MBAFF is much more complicated than PAFF, affecting:
– zig-zag scanning
– motion vector prediction
– intra prediction
– deblocking
– context modeling in entropy coding
• Advantage compared with MBAFF in H.264
– A field-coded MB belonging to the bottom field can use the top field of the same frame as a reference for motion prediction.
Intra Prediction [2] • Five different modes for luma
Luma intra prediction: differences between AVS and H.264 [6]
AVS
• Block size: 8x8
• 5 modes
• Reference pixels are low-pass filtered
• Advantage: low complexity with fewer modes
H.264
• Block size: 4x4, 8x8 or 16x16
• 9 modes for 4x4 & 8x8, 4 modes for 16x16
• Advantage: better prediction
• Disadvantage: more complex
Intra prediction modes for Chroma [2] • 4 Prediction modes for Chroma
Inter Prediction and Motion Compensation [1]
• At most 2 frames can be stored as references for motion prediction.
• Block sizes for motion prediction and compensation: 16x16, 16x8, 8x16 and 8x8
• In each MB, the number of MV pairs can be 1, 2 or 4, depending on the block size of MC.
• The MVD, the difference between the real MV and the predicted MV, is coded.
• Resolution of MV
– 1/4-pixel for luma
– 1/8-pixel for chroma
• Motion prediction modes
– Forward
– Backward (only applicable to B frames)
– Bi-directional (only applicable to B frames): skip, direct and symmetric
Reference Frame [1]
• At most 2 reference frames are used.
• When PAFF or MBAFF is used:
– if the current MB is frame-coded, 2 frames can be used as references for motion prediction.
– if the current MB is field-coded, 4 fields can be used.
• A reference index is coded with every MC block to indicate which reference picture is used.
Motion Vector Prediction [3]
• The MVs of blocks A, B, C and D (MVA, MVB, MVC and MVD) are used to predict the MV of block E (PredMVE).
• Reason: reduce the bits needed for coding the MV
• Method: geometrical median of MVA, MVB, MVC
– VAB = Dist(MVA, MVB)
– VBC = Dist(MVB, MVC)
– VCA = Dist(MVC, MVA)
– FMV = Median(VAB, VBC, VCA), where Dist(MV1, MV2) = |x1-x2| + |y1-y2|
• Determine PredMVE
– If FMV equals VAB, PredMVE = MVC.
– If FMV equals VBC, PredMVE = MVA.
– If FMV equals VCA, PredMVE = MVB.
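The selection rule above can be sketched in a few lines of Python. This is only an illustration of the steps as listed on the slide. The names `dist` and `predict_mv` are hypothetical, and the tie-breaking order (checking VAB, then VBC, then VCA) is an assumption taken from the order in which the slide lists the cases.

```python
from typing import Tuple

MV = Tuple[int, int]  # (x, y) motion vector components


def dist(mv1: MV, mv2: MV) -> int:
    """Dist(MV1, MV2) = |x1 - x2| + |y1 - y2|, as defined on the slide."""
    return abs(mv1[0] - mv2[0]) + abs(mv1[1] - mv2[1])


def predict_mv(mva: MV, mvb: MV, mvc: MV) -> MV:
    """Pick PredMVE from the neighbours' MVs using the median distance."""
    vab = dist(mva, mvb)
    vbc = dist(mvb, mvc)
    vca = dist(mvc, mva)
    fmv = sorted((vab, vbc, vca))[1]  # median of the three distances
    # Assumption: ties are resolved in the order the slide lists the cases.
    if fmv == vab:
        return mvc
    if fmv == vbc:
        return mva
    return mvb


pred = predict_mv((4, 0), (5, 1), (20, -8))
real = (6, 1)                                  # hypothetical MV found by motion search
mvd = (real[0] - pred[0], real[1] - pred[1])   # difference that would be coded
print(pred, mvd)
```

In this example MVC is far from the other two vectors, the median distance is VBC, and the predictor becomes MVA, so only a small difference remains to be coded.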
Interpolation for Luma [3]
• Resolution: quarter-pixel
• Filter
– Half-pixel samples (blue): [-1, 5, 5, -1]
– Quarter-pixel samples (white): [1, 7, 7, 1]
– Quarter-pixel samples (red): bilinear
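As a rough illustration of the half-pixel filter, the sketch below applies the taps [-1, 5, 5, -1] to four neighbouring integer samples along one dimension. The slide does not give the normalization, rounding, or clipping. Dividing by the tap sum (8) with rounding and clipping to 8 bits are assumptions here, and the standard's handling of two-dimensional positions and intermediate precision is not modelled.

```python
from typing import Sequence

HALF_PEL_TAPS = (-1, 5, 5, -1)  # taps from the slide; they sum to 8


def clip_pixel(value: int, bit_depth: int = 8) -> int:
    """Clamp a value to the valid sample range (assumed 8-bit by default)."""
    return max(0, min((1 << bit_depth) - 1, value))


def half_pel(samples: Sequence[int]) -> int:
    """Interpolate the half-pixel sample between samples[1] and samples[2]."""
    if len(samples) != 4:
        raise ValueError("expected exactly four integer-position samples")
    acc = sum(t * s for t, s in zip(HALF_PEL_TAPS, samples))
    # Assumed normalization: add half of the tap sum, then divide by 8.
    return clip_pixel((acc + 4) >> 3)


print(half_pel([10, 20, 30, 40]))  # sample halfway between 20 and 30
```

On a linear ramp the result is the midpoint of the two centre samples, and on a flat region the filter returns the input value unchanged.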
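Interpolation for Chroma [3]
PredMatrix[x,y] = [(8-dx)·(8-dy)·A + dx·(8-dy)·B + (8-dx)·dy·C + dx·dy·D] / 64
• A, B, C and D are the four integer-position chroma samples surrounding the target position (top-left, top-right, bottom-left and bottom-right, as implied by the weights).
• dx and dy are the horizontal and vertical offsets in 1/8-pixel units.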
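The chroma formula is a bilinear weighting whose four weights always sum to 64. A minimal sketch follows. The function name is illustrative, and the use of floor division with no rounding offset is an assumption, since the slide only shows a division by 64.

```python
def chroma_interpolate(a: int, b: int, c: int, d: int, dx: int, dy: int) -> int:
    """Bilinear chroma prediction at a 1/8-pixel offset (dx, dy).

    a, b, c, d are the four surrounding integer-position samples.
    """
    if not (0 <= dx < 8 and 0 <= dy < 8):
        raise ValueError("dx and dy must be in 0..7 (1/8-pixel units)")
    total = ((8 - dx) * (8 - dy) * a
             + dx * (8 - dy) * b
             + (8 - dx) * dy * c
             + dx * dy * d)
    # Assumption: plain floor division, as written on the slide.
    return total // 64


print(chroma_interpolate(10, 20, 30, 40, 0, 0))  # offset (0, 0) returns sample A
print(chroma_interpolate(0, 64, 0, 64, 4, 0))    # halfway between A and B
```

With dx = dy = 0 the result is A itself, and a flat region is reproduced exactly for any offset because the weights sum to 64.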
Forward and Backward Prediction [1]
• Forward prediction
– The MV points only to the previous frame.
– The reference block is taken only from the previous frame.
• Backward prediction
– The MV points only to the next frame.
– The reference block is taken only from the next frame.
Bi-directional Prediction [1]
• Skip mode
– Block size of MC: 16x16
– No transform coefficients are coded, since they are all zero.
– No MV is coded, since it can be calculated.
• Direct mode
– Block size of MC: 16x16 or 8x8
– The transform coefficients are not all zero, so they have to be coded.
– No MV is coded, since it can be calculated in the same way as for skip mode.
• Symmetric mode
– Block size of MC: 16x16, 16x8, 8x16, 8x8
– The transform coefficients are not all zero, so they have to be coded.
– Only the forward MV is coded; the backward MV is calculated from the forward one.
Context-based Adaptive 2D Variable Length Coding (CA-2D-VLC) [1]
• Each (level, run) pair is mapped to a CodeNum using VLC tables
– level>0: CodeNum is the number given in the VLC table
– level<0: CodeNum is the number given in the VLC table plus 1
• Example
– level = 2, run = 1: CodeNum = 11
– level = −2, run = 1: CodeNum = 12
• Each CodeNum is mapped to a bit string using Exp-Golomb coding
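The last step, mapping a CodeNum to a bit string, can be illustrated with a generic k-th order Exp-Golomb encoder. The slide does not say which order each VLC table uses, so the default of order 0 below is an assumption, as is the conventional zero-prefixed bit layout. The function name `exp_golomb` is illustrative.

```python
def exp_golomb(code_num: int, k: int = 0) -> str:
    """Return the k-th order Exp-Golomb code word for code_num as a bit string."""
    if code_num < 0 or k < 0:
        raise ValueError("code_num and k must be non-negative")
    value = code_num + (1 << k)
    binary = format(value, "b")
    # Prefix zeros tell the decoder how many bits follow the leading 1.
    prefix_zeros = len(binary) - k - 1
    return "0" * prefix_zeros + binary


# CodeNum values from the example above (level = 2 and level = -2, run = 1).
print(exp_golomb(11))  # order 0
print(exp_golomb(12))  # order 0
```

At order 0 the two CodeNums from the example, 11 and 12, both produce 7-bit code words that differ only in the last bit.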
Deblocking Filter [3]
• Block size: 8x8
• Three steps
– Choose the boundary strength (BS) according to the prediction modes and the MVs
– Decide whether to filter according to the quantization parameter (QP) and the BS
– Apply the filter to the boundary
AVS Part-2 vs H.264/AVC [4], [6] # BS: Boundary strength
References
[1] L. Yu et al., “An Overview of AVS-Video: tools, performance and complexity”, Visual Communications and Image Processing 2005, Proc. of SPIE, vol. 5960, pp. 596021, July 31, 2006.
[2] L. Yu et al., “An area-efficient VLSI architecture for AVS intra frame encoder”, Visual Communications and Image Processing 2007, Proc. of SPIE-IS&T Electronic Imaging, SPIE vol. 6508, pp. 650822, Jan. 29, 2007.
[3] W. Gao et al., “AVS - The Chinese Next-Generation Video Coding Standard”, NAB, Las Vegas, 2004.
[4] T. Wiegand et al., “Overview of the H.264/AVC Coding Standard”, IEEE Trans. Circuits Syst. Video Technol., vol. 13, pp. 560-576, July 2003.
[5] J. Wang et al., “An AVS-to-MPEG2 Transcoding System”, Proceedings of 2004 International Symposium on Intelligent Multimedia, Video and Speech Processing, Hong Kong, pp. 302-305, Oct. 20-22, 2004.
[6] X. Wang et al., “Performance comparison of AVS and H.264/AVC video coding standards”, J. Comput. Sci. & Technol., vol. 21, no. 3, pp. 310-314, May 2006.
[7] B. Tang et al., “AVS Encoder Performance and Complexity Analysis Based on Mobile Video Communication”, WRI International Conference on Communications and Mobile Computing, CMC ’09, vol. 3, pp. 102-107, 6-8 Jan. 2009.
Web References: AVS China software
[A] ftp://159.226.42.57/public/avs_doc/avs_software