
Pre-analysis of video for its advanced coding: application to HDTV coding in H.264 streams

Pre-analysis of video for its advanced coding: application to HDTV coding in H.264 streams. Olivier Brouard. 20 July 2010. Supervisors: Dominique Barba and Vincent Ricordel. École Doctorale Sciences et Technologie de l’Information et Mathématiques (EDSTIM)


Presentation Transcript


  1. Pre-analysis of video for its advanced coding: application to HDTV coding in H.264 streams. Olivier Brouard, 20 July 2010. Supervisors: Dominique Barba and Vincent Ricordel. École Doctorale Sciences et Technologie de l’Information et Mathématiques (EDSTIM). Specialty: Automatic Control, Robotics, Signal Processing and Applied Computer Science

  2. Pre-analysis of video for its advanced coding: application to HDTV coding in H.264 streams. Olivier Brouard, July 20th 2010. Supervisors: Dominique Barba and Vincent Ricordel. École Doctorale Sciences et Technologie de l’Information et Mathématiques (EDSTIM). Specialty: Automatic Control, Robotics, Signal Processing and Applied Computer Science

  3. Introduction: Motivations • Emergence of HDTV • New displays • From SDTV to HDTV • SDTV: 720x576 pixels • HDTV: 1920x1080 pixels • from 4% to 20% of the visual field • better immersion for the users • 5x more pixels • Need for a new video coding standard • H.264 (or MPEG-4 AVC) 20 October, 2014 Olivier Brouard Slide 3/47

  4. Introduction: H.264 (figure: reference frames) • Advanced video coder (asymmetrical coding) + richness of prediction modes + advanced entropy coding • higher bit rate reduction (up to 50% compared with MPEG-2) • But • short-term decisions, based on the « low level » signal • no coding consistency

  5. Introduction: Human as the final observer • Needs • Control the perceptual quality • Ensure the temporal coherence of the coding of objects • the rendering of an object has to be temporally consistent • avoid perceptible distortions • blocking effects • flickering effects

  6. Introduction: Objectives & proposals • How? • medium/long-term decisions • « high level » considerations • no such tools within current encoders • Solution • perform a video pre-analysis before the encoding step • guide the encoder in its decisions

  7. Outline • 1. Video pre-analysis 1.1 Advanced motion estimation 1.2 Spatio-temporal segmentation 1.3 Visual attention modeling • 2. Applications: H.264 video coding 2.1 GOP structure adaptation 2.2 Adaptive quantization

  8. 1- Video pre-analysis • Based on HVS properties • provides « high level » information to the encoder • The Human Visual System (HVS) • Luminance perception • Color perception • Contrast sensitivity • Masking effects • Visual attention • Bottom-up: guided by saliency • Top-down: guided by the task

  9. 1- Video pre-analysis: Visual attention • Attributes guiding the deployment of visual attention [Wolfe 04] • Contrast, Motion, Color, Orientation, … • Visual attention modeling [Itti 01; Le Meur 07; Marat 10], based on the Koch and Ullman model [Koch 85] • Perceptually important regions: most salient objects (physically and semantically) • Shapes of regions (saliency maps)  shape of objects [Milanese 1993] • moving objects attract our visual attention

  10. 1- Video pre-analysis

  11. 1- Video pre-analysis – Advanced motion estimation: Spatio-temporal tube (1) • Visual fixation time in the HVS ~ 200 ms • Next generation of HDTV • 1920x1080 in progressive mode at 50 Hz • temporal segment of 9 frames: 180 ms [Péchard 2007] • Assumption • uniform motion • spatio-temporal tube • coherence of the motion along a perceptually significant duration • more homogeneous motion vector field

  12. 1- Video pre-analysis – Advanced motion estimation: Spatio-temporal tube (2) • The spatio-temporal tube minimizes MSEG, computed from the MSEk with k = -4, -2, +2, +4 • MSEk based on the 3 YUV components • Implementation • spatial down-sampling • temporal down-sampling - central frame: current frame - 4 reference frames
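
A minimal sketch of the tube search described on this slide, under stated assumptions: frames are luma-only lists of rows, the search is exhaustive, and the global cost is the plain sum of the four MSEk (the slide uses the three YUV components and down-sampled frames). The function names, block size, and search range are illustrative and do not come from the presentation. As a side note on the previous slide, a frame lasts 20 ms at 50 Hz, so the 9-frame segment spans 9 × 20 = 180 ms.

```python
def block_mse(ref, cur, bx, by, size, dx, dy):
    """MSE between the block at (bx, by) in `cur` and the block displaced by (dx, dy) in `ref`."""
    height, width = len(ref), len(ref[0])
    inside = (0 <= bx + dx and bx + dx + size <= width
              and 0 <= by + dy and by + dy + size <= height)
    if not inside:
        return float("inf")  # candidate leaves the frame: reject it
    total = 0.0
    for y in range(size):
        for x in range(size):
            diff = cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
            total += diff * diff
    return total / (size * size)


def tube_motion_vector(frames, bx, by, size=4, search=2, offsets=(-4, -2, 2, 4)):
    """Return ((vx, vy), cost) for the block at (bx, by) of the central frame.

    Uniform-motion assumption: a block moving by (vx, vy) per frame is found
    at displacement (k * vx, k * vy) in the frame located k frames away.
    """
    centre = len(frames) // 2
    cur = frames[centre]
    best_cost, best_vec = None, None
    for vy in range(-search, search + 1):
        for vx in range(-search, search + 1):
            # Assumed global cost: sum of the MSE over the 4 reference frames.
            cost = sum(block_mse(frames[centre + k], cur, bx, by, size, k * vx, k * vy)
                       for k in offsets)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (vx, vy)
    return best_vec, best_cost
```

Because a single vector must explain all four reference frames at once, isolated matching errors in one frame have less influence than in pairwise block matching, which is consistent with the smoother vector field the slides aim for.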

  13. 1- Video pre-analysis – Spatio-temporal segmentation: Global motion • Apparent motions due to • moving objects • camera motion • Motion segmentation • based on the residual motion • Affine model • a1, a2, a3, a4: deformation parameters • tx, ty: translation parameters • Vx, Vy: horizontal and vertical components of each MV (spatio-temporal tube)
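
The equation of the affine model did not survive in the transcript. A common six-parameter form that matches the symbols listed on the slide is sketched below; the assignment of a1 to a4 to the individual terms, and the use of (x, y) for the position of the tube, are assumptions.

```latex
\begin{aligned}
V_x(x, y) &= t_x + a_1\,x + a_2\,y \\
V_y(x, y) &= t_y + a_3\,x + a_4\,y
\end{aligned}
```

With all deformation parameters at zero the model reduces to a pure translation (tx, ty), which is the quantity accumulated in the 2-D histogram of the following slides.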

  14. 1- Video pre-analysis – Spatio-temporal segmentation: Global motion parameters estimation • Motion vector fields: parameters estimation [Coudray 2005] • Global motion estimation in 2 steps: 1. For each MV (tube): calculation of the derivatives • accumulation of the parameter hypotheses • localization of the main peak 2. Accumulation of the residual MVs (tubes) in a 2-D histogram (tx, ty)

  15. 1- Video pre-analysis – Spatio-temporal segmentation: Motion segmentation • 2-D histogram of the translation parameters • residual MVs (tx, ty) • Each histogram peak => a moving object • analysis of all the peaks • Iterative approach 1. Initialisation: detection of the main peak, greedy approach (local gradient) 2. Detection of the other peaks, greedy approach • Figure labels: accumulation histogram, main peak, secondary peak, segmented space
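
The histogram-based segmentation can be sketched as follows. This is an illustrative reading of the slide, not the author's implementation: residual vectors are rounded to integer bins, and every occupied bin climbs greedily to its best 8-connected neighbour until it reaches a local maximum, which is taken as a peak (one moving object per peak). The function name and the 8-neighbourhood are assumptions.

```python
from collections import Counter


def segment_by_histogram(vectors):
    """Cluster residual motion vectors (tx, ty) around the peaks of their 2-D histogram.

    Returns (peaks, labels): peaks sorted by decreasing count, and a dict
    mapping every occupied bin to the peak it climbs to.
    """
    hist = Counter((round(tx), round(ty)) for tx, ty in vectors)

    def climb(bin_):
        # Greedy ascent: move to the best neighbour while it is strictly higher.
        while True:
            x, y = bin_
            neighbours = [(x + i, y + j)
                          for i in (-1, 0, 1) for j in (-1, 0, 1)
                          if (i, j) != (0, 0)]
            best = max(neighbours, key=lambda n: hist.get(n, 0))
            if hist.get(best, 0) <= hist[bin_]:
                return bin_
            bin_ = best

    labels = {bin_: climb(bin_) for bin_ in hist}
    peaks = sorted(set(labels.values()), key=lambda p: -hist[p])
    return peaks, labels
```

For example, votes concentrated around (0, 0) and (5, 5) yield two peaks, with the first one (the main peak) holding the largest count.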

  16. 1- Video pre-analysis – Spatio-temporal segmentation: Motion segmentation results • need for a spatial and temporal regularization

  17. 1- Video pre-analysis

  18. 1- Video pre-analysis – Spatio-temporal segmentation: Spatio-temporal regularization • Motion-based segmentation: some blocks are misclassified • more criteria to improve the segmentation • connectivity • color • texture • motion • Markovian approach

  19. 1- Video pre-analysis – Spatio-temporal segmentation: Markovian approach • Markovian property • The Hammersley-Clifford theorem [Besag 1974] • Gibbs distribution <=> Markov Random Field • the optimal label configuration minimizes a global energy function • E: label field • O: observation field • U(o, e): sum of potential functions defined on cliques • site: spatio-temporal tube
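
The formulas of this slide are missing from the transcript. In the standard formulation that the Hammersley-Clifford theorem leads to, they would read as below; the partition function Z, the clique set C, and the clique potentials V_c are notation introduced here for illustration.

```latex
P(E = e \mid O = o) = \frac{1}{Z}\,\exp\bigl(-U(o, e)\bigr),
\qquad
U(o, e) = \sum_{c \in \mathcal{C}} V_c(o, e),
\qquad
\hat{e} = \arg\min_{e} U(o, e)
```

Maximizing the posterior probability of the label field is therefore equivalent to minimizing the energy U, which is why the segmentation is posed as an energy minimization.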

  20. 1- Video pre-analysis – Spatio-temporal segmentation: Spatial regularization • Spatial connectivity • Segmented region • locally homogeneous • Color features • color distributions • Bhattacharyya coefficient: discrete densities • Texture features • texture distributions: 2 spatial gradients (Sobel filters) • Bhattacharyya coefficient
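
For reference, the Bhattacharyya coefficient between two discrete densities p and q is the sum over bins of sqrt(p_i * q_i); it equals 1 for identical distributions and 0 for distributions with no common bin. A minimal sketch follows, assuming the inputs are raw histograms (for example color or gradient histograms of two regions) that are normalized inside the function; the function name is illustrative.

```python
import math


def bhattacharyya_coefficient(hist_p, hist_q):
    """Bhattacharyya coefficient of two histograms with the same number of bins."""
    if len(hist_p) != len(hist_q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(hist_p), sum(hist_q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must contain at least one sample")
    # Normalize each histogram to a discrete density before comparing.
    return sum(math.sqrt((p / total_p) * (q / total_q))
               for p, q in zip(hist_p, hist_q))
```

A block whose color and texture distributions give a coefficient close to 1 with a neighbouring region is a good candidate for carrying the same label.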

  21. 1- Video pre-analysis – Spatio-temporal segmentation: Temporal regularization • Motion features • distance between the MVs • Temporal connectivity • Segmented region => temporally homogeneous • segmentation map of the previous temporal segment • Region tracking • criteria - color, texture, overlap • video object tracking

  22. 1- Video pre-analysis – Spatio-temporal segmentation: Energy minimization • The global energy function - potential functions - weighting factors • Sequential site processing • instability stack

  23. 1- Video pre-analysis – Spatio-temporal segmentation: Results • motion segmentation only • regularized spatio-temporal segmentation

  24. 1- Video pre-analysis

  25. 1- Video pre-analysis – Visual attention modeling: Spatial saliency • Spatial saliency based on color contrast [Aziz 2008] • color transformation: YUV to HSV • color features influencing visual attention 1- Saturation contrast 2- Intensity contrast 3- Hue contrast 4- Opponent contrast 5- Warm and cold colors contrast 6- Dominance of warm colors 7- Dominance of luminance and saturation • Spatial saliency SSP => combination of these 7 features

  26. 1- Video pre-analysis – Visual attention modeling: Temporal saliency • Temporal saliency based on the relative motion • relative motion of a site s: MV of s taken relative to the dominant motion • maximum velocity of smooth pursuit of the eye [Daly 1998]: 80°/s • => temporal saliency ST

  27. 1- Video pre-analysis – Visual attention modeling: Spatio-temporal saliency • Fusion of the spatial and temporal saliency maps • Observers => focus on the center of the screen [Le Meur 2005] • weighting by a 2-D Gaussian function
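
The fusion rule itself is not given in the transcript. The sketch below only illustrates the idea of the slide under explicit assumptions: the two maps have the same size and values in [0, 1], they are fused by a weighted average (the `alpha` weight is hypothetical), and the result is multiplied by a 2-D Gaussian centred on the screen whose standard deviations are free parameters.

```python
import math


def centre_weight(width, height, sigma_x, sigma_y):
    """2-D Gaussian weighting map, equal to 1 at the centre of the screen."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    return [[math.exp(-((x - cx) ** 2 / (2 * sigma_x ** 2)
                        + (y - cy) ** 2 / (2 * sigma_y ** 2)))
             for x in range(width)]
            for y in range(height)]


def fuse_saliency(spatial, temporal, sigma_x, sigma_y, alpha=0.5):
    """Weighted average of the two maps (assumed rule), then centre weighting."""
    height, width = len(spatial), len(spatial[0])
    gauss = centre_weight(width, height, sigma_x, sigma_y)
    return [[gauss[y][x] * (alpha * spatial[y][x] + (1 - alpha) * temporal[y][x])
             for x in range(width)]
            for y in range(height)]
```

The Gaussian factor lowers the saliency of sites near the borders, reflecting the tendency of observers to look at the centre of the screen.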

  28. 1- Video pre-analysis – Visual attention modeling: Results

  29. 1- Video pre-analysis: Possible applications • Video pre-analysis • information • moving object segmentation, object tracking • color, texture • salient regions • applications • advanced video coding • video transmission with priority (saliency maps) • video summarization, indexing • … • ArchiPEG (ANR Project) • HD MPEG-4 AVC real-time compression • pre-analysis video resource

  30. Outline • 1. Video pre-analysis 1.1 Advanced motion estimation 1.2 Spatio-temporal segmentation 1.3 Visual attention modeling • 2. Applications: H.264 video coding 2.1 GOP structure adaptation 2.2 Adaptive quantization

  31. 2- Applications: H.264 video coding – GOP structure adaptation: GOP structure • Three kinds of frames: I, P, B • a GOP begins with an I frame: intra coded • P frames at regular intervals: predicted • B frames between P frames: bi-predicted • Fixed interval between I frames • not adapted to changing scenes and temporal variations of the video => more bits • dynamic GOP size: irregular I-frame insertion • Typically: number of B frames = 1 or 2, a good trade-off between bitrate and quality • low motion or camera panning • increase the number of B frames

  32. 2- Applications: H.264 video coding – GOP structure adaptation: B frames adaptation (1) • Analysis of the video sequences • x264 encoder • different fixed numbers of B frames: 0, 1, 2, 3 • optimal number of B frames => content dependent • classify videos according to their content

  33. 2- Applications: H.264 video coding – GOP structure adaptation: B frames adaptation (2) • Spatio-temporal characterization: 2 indices to evaluate the spatio-temporal activity - IT: temporal activity => MVs - IS: spatial activity => MSEG • For each temporal segment • For the entire sequence

  34. 2- Applications: H.264 video coding – GOP structure adaptation: B frames adaptation (3) • Classification space as a function of IT and IS • class Ci => i B frames between P-P or I-P frames • IT constant between P-P or I-P frames • same rule for IS

  35. 2- Applications: H.264 video coding – GOP structure adaptation: GOP size adaptation (1) • Change detection within a video shot • high motion • significant changes • reduce the interval • low motion • little variation • increase the interval • mid-range motion • classical approach => fixed GOP size • 2 thresholds to detect critical changes - sh => high motion - sb => low motion
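
The decision rule of this slide can be sketched as a three-way test on the temporal activity index IT. The thresholds are passed in because their values are not given in the transcript; the base size of 25 is the x264 setting quoted later in the slides, while the shorter and longer sizes (12 and 50) are purely hypothetical placeholders.

```python
def adapt_gop_size(it_value, s_b, s_h, base_size=25, short_size=12, long_size=50):
    """Choose a GOP size from the temporal activity index IT.

    s_b: low-motion threshold, s_h: high-motion threshold (s_b < s_h).
    short_size and long_size are illustrative values, not from the slides.
    """
    if s_b >= s_h:
        raise ValueError("s_b must be smaller than s_h")
    if it_value > s_h:
        return short_size   # high motion: insert I frames more often
    if it_value < s_b:
        return long_size    # low motion: I frames can be spaced out
    return base_size        # mid-range motion: keep the fixed GOP size
```

For instance, with s_b = 1.0 and s_h = 5.0, an IT value of 8.0 would shorten the GOP and an IT value of 0.2 would lengthen it.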

  36. 2- Applications: H.264 video coding – GOP structure adaptation: GOP size adaptation (2) • Analysis of IT evolution: 3 cases • Mid-range motion • High motion • Low motion

  37. 2- Applications: H.264 video coding – GOP structure adaptation: Performances • 8 video sequences • 4 different bitrates, defined by an expert group • Comparison between • x264 encoder: GOP size = 25, 2 B frames • a modified version => GOP structure adaptation

  38. 2- Applications: H.264 video coding – GOP structure adaptation: Results • Rate – Distortion (PSNR) [Bjontegaard 2001]
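
The Bjontegaard metric summarizes two rate-distortion curves by the average PSNR gap between them. A minimal sketch follows, assuming exactly four (bitrate, PSNR) points per coder as in the comparison above: each curve is the cubic through its four points in the log10(bitrate) domain, and its mean over the common bitrate range is obtained with Simpson's rule, which is exact for cubics. Function names are illustrative and any numbers used to call it would be made up, since the measured values are not in the transcript.

```python
import math


def _lagrange(xs, ys, x):
    """Value at x of the polynomial passing through the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def bd_psnr(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average PSNR difference (test minus reference) over the common rate range."""
    log_ref = [math.log10(r) for r in rates_ref]
    log_test = [math.log10(r) for r in rates_test]
    low = max(min(log_ref), min(log_test))
    high = min(max(log_ref), max(log_test))
    if high <= low:
        raise ValueError("the two curves share no bitrate range")

    def mean_psnr(xs, ys):
        mid = (low + high) / 2
        # Simpson's rule divided by the interval length gives the mean value.
        return (_lagrange(xs, ys, low) + 4 * _lagrange(xs, ys, mid)
                + _lagrange(xs, ys, high)) / 6

    return mean_psnr(log_test, psnr_test) - mean_psnr(log_ref, psnr_ref)
```

A positive result means the tested coder reaches a higher PSNR than the reference on average at equal bitrate.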

  39. 2- Applications: H.264 video coding – GOP structure adaptation: Subjective tests • Setup • display: resolution 1920x1080 • standardized room [BT.500-11] • ~30 naïve observers • 72 (= 8x4x2 + 8) video sequences • Methodology: ACR • for each sequence, observers have to assess the quality
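
The count of 72 sequences is 8 contents × 4 bitrates × 2 coders = 64, plus 8 more (presumably the reference versions, which is an assumption). The mean opinion score (MOS) of one sequence and its 95% confidence interval, computed as 1.96 × standard deviation / sqrt(N) in the manner of BT.500, can be sketched as follows; the function name is illustrative.

```python
import math
import statistics


def mos_with_ci(scores):
    """Return (MOS, half-width of the 95% confidence interval) for ACR scores (1 to 5)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample standard deviation (N - 1)
    return mean, 1.96 * std / math.sqrt(n)
```

Two coders are usually considered different on a sequence only when their confidence intervals do not overlap, which is how a small MOS gap can still be reported as not significant.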

  40. 2- Applications: H.264 video coding – GOP structure adaptation: Results • QGOP: MOS of the modified coder • Qx264: MOS of the x264 coder • sequences with a high IT value: high motion • GOP structure adaptation

  41. 2- Applications: H.264 video coding – Adaptive quantization • Objective • control the distribution of bit resources using the saliency maps • increase the perceived visual quality • Modification of the saliency maps: quantization and morphological filtering • Modification of the coder
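
One way to picture saliency-driven quantization is a per-macroblock QP map derived from the quantized saliency map. The sketch below is hypothetical: the linear law, the base QP, the maximum offset, and the number of saliency levels are assumptions made for illustration, and the morphological filtering step of the slide is omitted. The only fixed fact is the H.264 QP range of 0 to 51.

```python
def qp_map_from_saliency(saliency, base_qp=26, max_offset=4, levels=4):
    """Map saliency values in [0, 1] (one per macroblock) to H.264 QP values.

    Salient blocks get a lower QP (finer quantization), non-salient blocks a
    higher one, following an assumed linear law around base_qp.
    """
    if levels < 2:
        raise ValueError("at least two saliency levels are needed")
    qp_rows = []
    for row in saliency:
        qp_row = []
        for s in row:
            s = min(max(s, 0.0), 1.0)
            # Quantize the saliency to a small number of levels.
            level = min(int(s * levels), levels - 1)
            s_quantized = level / (levels - 1)
            offset = round(max_offset * (1.0 - 2.0 * s_quantized))
            # Keep the result inside the valid H.264 QP range.
            qp_row.append(min(max(base_qp + offset, 0), 51))
        qp_rows.append(qp_row)
    return qp_rows
```

With the default parameters, a saliency of 0 gives QP 30 and a saliency of 1 gives QP 22, so the bit budget shifts toward the regions that are expected to attract the eye.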

  42. 2- Applications: H.264 video coding – Adaptive quantization: Results (1) • Rate – Distortion (PSNR) [Bjontegaard 2001]

  43. 2- Applications: H.264 video coding – Adaptive quantization: Subjective assessments • Results • QQA: MOS of the modified coder (adaptive quantization) • Qx264: MOS of the x264 coder • no specific content is suitable: unsuitable for coding and broadcasting of HDTV at high bitrate • overhead, linear law?

  44. Conclusion (1) • Video pre-analysis • spatio-temporal segmentation • detection of moving objects • object tracking • visual attention modeling • saliency maps • Applications • advanced video coding • video transmission with priority based on the saliency maps [Boulos 2010] • video summarization, indexing • …

  45. Conclusion (2) • Applications of the video pre-analysis • GOP structure adaptation • dynamic variation of the number of B frames • temporal segment classification • IT and IS • GOP size adaptation • I frame insertion: change detection based on IT • Adaptive quantization based on the saliency maps

  46. Conclusion (3) • Subjective quality assessment tests • GOP structure adaptation: no significant differences • +0.18 (on a scale of 1 to 5) • well suited for sequences with high motion • Adaptive quantization • no clear content suitability: seems unsuitable for coding and broadcasting of HDTV at high bitrate … the adaptation law could be modified …

  47. Conclusion: Perspectives • Better performance evaluation of our visual attention model • eye-tracking experiments • Psychophysical experiments to optimize the model parameters: improve the fusion process [Marat 2010] • Add high-level visual information: faces, flesh hue, …

  48. Thank you. Questions?
