Fast Mode Decision for H.264/AVC Based on Rate-Distortion Clustering. IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 14, NO. 3, JUNE 2012. Yu-Huan Sung, Jia-Ching Wang, Senior Member, IEEE
Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion
Introduction • The up-to-date video coding standard H.264/AVC offers • twice the compression ratio of other video coding standards. • while maintaining nearly the same visual quality. • However, the performance gains come at the cost of extremely high computational complexity, which is a concern for applications such as: • Video conferencing • Live TV broadcasting • Mobile computing
Introduction • H.264/AVC adopts many features that enhance coding performance. • Variable block-size motion compensation (MC) • Sub-pixel motion estimation (ME) • Multiple reference picture selection • Directional intra prediction • In-loop deblocking filtering, etc. • These features impose a heavy computational burden on the encoding process.
Introduction • Reducing the computational time has received considerable attention recently. • Reducing the encoding time involves two main parts: • Inter-mode decision • Intra-mode decision • Both decisions are made according to an RD cost optimization scheme.
Introduction • The proposed method is a Multi-Phase Classification (MPC) scheme that • uses a nearest mean criterion. • determines both inter-modes and intra-modes. • MPC is a hierarchical classification scheme in which an MB is classified into a category phase by phase.
Introduction • MPC is a three-phase classification scheme. • each phase consists of several categories. • each category of the current phase is partitioned into categories of the next phase. • the categories of a phase are subsets of the categories of the upper phase. • Each category within a phase is represented as a feature point in the feature space. • an MB is assigned to the category at minimum distance.
Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion
Related Works • Previous works develop fast mode decision algorithms in four ways. • The first approach is SKIP-mode detection • identifies early whether an MB can be skipped. • Kannangara et al. [3] and Zhao et al. [4]. [3] C. Kannangara et al., “Low-complexity skip prediction for H.264 through Lagrangian cost estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 16, no. 2, pp. 202–208, Feb. 2006. [4] Y. Zhao, M. Bystrom, and I. E. G. Richardson, “A MAP framework for efficient skip/code mode decision in H.264,” in Proc. ICIP 2006, Atlanta, GA, Oct. 8–11, 2006.
Related Works • The second approach is mode prediction • directly or indirectly predicts the best mode for the current MB. • Wu et al. [5], Ri et al. [6] and Paul et al. [17]. • The third approach is mode classification • classifies the current MB into a specific category. • only the candidate modes of that category are checked to find the best one. • Kim et al. [7], Yu et al. [8], Liu et al. [9], Zeng et al. [10] and Zhao et al. [11]. [5] D. Wu, F. Pan, K. P. Lim, and S. Wu et al., “Fast intermode decision in H.264/AVC video coding,” IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 7, pp. 953–958, Jul. 2005. [6] S. H. Ri, Y. Vatis, and J. Ostermann, “Fast inter-mode decision in an H.264/AVC encoder using mode and Lagrangian cost correlation,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 2, pp. 302–306, Feb. 2009. [17] M. Paul, W. Lin, C. T. Lau, and B. S. Lee, “Direct inter-mode selection for H.264 video coding using phase correlation,” IEEE Trans. Image Process., vol. 20, no. 2, pp. 461–473, Feb. 2011.
Related Works • The last approach redefines the optimization cost function • so that the number of operations needed for mode selection is reduced. [7] C. Kim and C. C. Jay Kuo, “Feature-based intra-/inter coding mode selection for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 4, pp. 441–453, Apr. 2007. [8] A. C. W. Yu, G. R. Martin, and H. Park, “Fast inter-mode selection in the H.264/AVC standard using a hierarchical decision process,” IEEE Trans. Circuits Syst. Video Technol., vol. 18, no. 2, pp. 186–195, Feb. 2008. [9] Z. Liu, L. Shen, and Z. Zhang, “An efficient intermode decision algorithm based on motion homogeneity for H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 1, pp. 128–132, Jan. 2009. [10] H. Zeng, C. Cai, and K.-K. Ma, “Fast mode decision for H.264/AVC based on macroblock motion activity,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 4, pp. 491–499, Apr. 2009. [11] T. Zhao, H. Wang, S. Kwong, and C.-C. Jay Kuo, “Fast mode decision based on mode adaptation,” IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 5, pp. 697–705, May 2010.
Outline • Introduction • Related Works • Feature Selection • Feature Vector • Feature Space and Classifier • Proposed Fast Mode Decision • Experiment Results • Conclusion
Feature Vector • The RD cost of the best mode correlates strongly with the RD costs of the temporally and spatially neighboring MBs. • A three-dimensional feature vector composed of RD costs of neighboring MBs is therefore used to discriminate between the different modes during mode decision.
Feature Vector • RD costs vary widely across coding modes and motion content, so a raw RD cost should not be used directly as a universal criterion. • Using a three-dimensional feature vector • ensures that an MB is assigned to the most probable category accurately. • adapts properly to the varying motion content of different video sequences.
Feature Vector • The three components of a feature vector, f_skip, f_spat, and f_temp, are expressed as:
Feature Vector • RD cost is expressed as:
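For reference, the Lagrangian RD cost commonly used for mode decision in H.264/AVC reference encoders has the form sketched below. The symbols are the conventional ones and are assumed here rather than taken from the slide: s is the original block, c the reconstructed block, SSD the sum of squared differences between them, R the number of bits needed to code the MB in the given mode, and \lambda_{MODE} the Lagrange multiplier, which depends on QP.

```latex
J(s, c, \mathrm{MODE} \mid QP, \lambda_{\mathrm{MODE}})
  = \mathrm{SSD}(s, c, \mathrm{MODE} \mid QP)
  + \lambda_{\mathrm{MODE}} \cdot R(s, c, \mathrm{MODE} \mid QP)
```

The mode with the smallest J is taken as the best mode, which is why an exhaustive search over all modes is expensive.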
Feature Space and Classifier • The 3D feature space
Feature Space and Classifier • Feature Space and Voronoi Diagram • [Figure: Voronoi diagram of the feature space; axes f_temp and f_skip]
Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion
Fast Mode Decision • Nearest Mean Criterion • assigns each MB to a specific category. • classifies MBs by Euclidean distance in the feature space. • predicts the best mode of an MB by finding the nearest mean M_i (cluster center).
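The nearest mean criterion can be illustrated with a minimal Python sketch. It assumes each category is summarized by one mean in the three-dimensional feature space; the category labels, the numeric means, and the function name `nearest_mean` are hypothetical and chosen only for illustration.

```python
import math

def nearest_mean(feature, means):
    """Return the label whose mean (cluster center) is closest to `feature`.

    `feature` is a 3-tuple (f_skip, f_spat, f_temp); `means` maps a
    category label to its mean in the same feature space.
    """
    return min(means, key=lambda label: math.dist(feature, means[label]))

# Hypothetical cluster centers; real ones would come from training data.
means = {
    "large": (1000.0, 1200.0, 1100.0),
    "small": (6000.0, 5500.0, 5800.0),
}

mb_feature = (1500.0, 1300.0, 1250.0)
category = nearest_mean(mb_feature, means)
```

Because only one distance per category is computed, the cost of the classification is negligible compared with evaluating the RD cost of every mode.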
Fast Mode Decision • Category Organization • directly assigning the mode with minimum distance to the given MB gives unsatisfactory prediction accuracy. • modes with similar characteristics are therefore grouped into a category. • this reduces the probability of a false prediction.
Fast Mode Decision • Multi-Phase Classification • an MB passes through multiple phases. • this avoids assigning an MB to a category too hastily. • Phase-I identifies • Large-Middle category (SKIP/DIRECT, 16×16, 16×8, 8×16, I16×16) • Middle-Small category (16×8, 8×16, P8×8, I4×4) • Phase-II and Phase-III then divide each motion category into much smaller categories.
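The phase-by-phase descent can be sketched as a walk down a small category tree, picking the nearest mean at each phase. Only the two Phase-I categories come from the slide; the Phase-II sub-categories, every numeric mean, and the names `Category` and `classify` are hypothetical, and the sketch stops after two phases for brevity.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Category:
    name: str
    mean: tuple                 # center in (f_skip, f_spat, f_temp) space
    modes: tuple = ()           # candidate modes to check at a leaf
    children: list = field(default_factory=list)

def classify(feature, categories):
    """Descend one phase at a time, choosing the nearest mean in each phase."""
    if not categories:
        raise ValueError("at least one category is required")
    path, level, leaf = [], categories, None
    while level:
        leaf = min(level, key=lambda c: math.dist(feature, c.mean))
        path.append(leaf.name)
        level = leaf.children
    return path, leaf.modes

# Hypothetical tree: the Phase-II splits and all means are made up.
tree = [
    Category("Large-Middle", (2000.0, 2000.0, 2000.0), children=[
        Category("LM-a", (1000.0, 1000.0, 1000.0), ("SKIP/DIRECT", "16x16")),
        Category("LM-b", (3000.0, 3000.0, 3000.0), ("16x8", "8x16", "I16x16")),
    ]),
    Category("Middle-Small", (7000.0, 7000.0, 7000.0), children=[
        Category("MS-a", (6000.0, 6000.0, 6000.0), ("16x8", "8x16")),
        Category("MS-b", (9000.0, 9000.0, 9000.0), ("P8x8", "I4x4")),
    ]),
]

path, candidate_modes = classify((900.0, 1000.0, 950.0), tree)
```

Only the modes returned for the final category would then need a full RD cost evaluation, which is where the saving over an exhaustive search comes from.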
Fast Mode Decision • The mode decision process can be further accelerated by Early Termination. • activated if f_skip is below a specific threshold T_skip. • SKIP mode is then taken as the best mode. • The initial threshold is set to the average RD cost of SKIP MBs in the training sequences, and is dynamically updated during encoding. • [Equation: update rule for T_skip in terms of f_skip]
Error Propagation and Performance Degradation Control • A performance control process is incorporated into the proposed method. • It avoids serious performance degradation caused by repeated use of wrongly predicted results or by accidental false predictions. • The idea is to inspect the coding result of each MB produced by the fast mode decision algorithm.
Error Propagation and Performance Degradation Control • An adaptive RD cost inspection is proposed; everything it needs is already available: • temporal RD costs • spatial RD costs • After a fast mode decision is made and the corresponding RD cost is obtained, an inspection is performed by:
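A minimal sketch of the early-termination test, assuming the threshold is initialized as the average RD cost of SKIP MBs from training. The function names and the sample costs are hypothetical, and the dynamic threshold update is left out because its rule is not given here.

```python
from statistics import mean

def initial_skip_threshold(training_skip_costs):
    """Initial T_skip: average RD cost of SKIP-coded MBs in the training data."""
    return mean(training_skip_costs)

def decide_mode(f_skip, t_skip, full_decision):
    """Choose SKIP at once when f_skip is below T_skip.

    Otherwise fall back to `full_decision`, a callable standing in for
    the slower classification-based mode decision.
    """
    if f_skip < t_skip:
        return "SKIP"
    return full_decision()

# Hypothetical RD costs of SKIP MBs collected from training sequences.
t_skip = initial_skip_threshold([800.0, 1000.0, 1200.0])

mode_low = decide_mode(900.0, t_skip, lambda: "16x16")
mode_high = decide_mode(1500.0, t_skip, lambda: "16x16")
```

Passing the fallback as a callable means the expensive path runs only when the early test fails.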
Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion
Training and Test Conditions • The means of each category and the related statistics are generated with JM 17.0 [15]. • Ten video sequences are used: Silent, Ice, Hall, Highway, Miss-America, Carphone, Tempete, Soccer, Bus, and Table Tennis. The video format is QCIF. • QP values are 20, 24, 28, 32, and 36. • Two GOP structures (IPPP and IBBP) are used for training.
Training and Test Conditions • The number of frames to be encoded is set to 100. • The search range of motion estimation is 16, and the search strategy is full search. • The number of reference frames is 1, and the intra-period is set to 4.
Outline • Introduction • Related Works • Feature Selection • Proposed Fast Mode Decision • Experiment Results • Conclusion
Conclusion • Experimental results indicate that the quality loss and the bitrate increase are only 0.02 dB and 1.65%, respectively. • Encoding time is reduced by 67.5% on average over the 12 video sequences with different GOP structures. • The sequences encompass a wide variety of motion content and different resolutions.