Advanced Video Coding Techniques for Quality Optimization in Machine Learning Era

Explore the latest developments in video coding through machine learning for content-based optimization and complexity reduction. Discover efficient methods to enhance video quality assessment and optimize encoding recipes, all aimed at boosting subjective testing accuracy and codec performance. Dive deep into the world of content characterization and feature extraction to further advance video quality metrics in the realm of coding optimization.

Presentation Transcript


  1. On Perceptual Coding: Quality, Content Features and Complexity. Patrick Le Callet, Université de Nantes

  2. Video Coding @ Machine (Deep) Learning Era (1). STANDARD PLAYGROUND / DISRUPTIVE PLAYGROUND. Full deep autoencoder, GANs, …, geometric deep learning • Hybrid approach: symbolic, statistical and/or deep learning, fully compatible with existing codecs • Models that can predict: • Optimal/ad hoc transform: S. Puri, S. Lasserre, and P. Le Callet, "Annealed learning based block transforms for HEVC video coding", ICASSP 2016 • Optimal syntax prediction/signalling: S. Puri, S. Lasserre, and P. Le Callet, "CNN-based transform index prediction in multiple transforms framework to assist entropy coding", EUSIPCO 2017, pp. 798‑802 • Ad hoc video quality measures…

  3. Video Coding @ Machine (Deep) Learning Era (2). Hyperspace of possibilities: codec complexity, content diversity, viewing experience. Content type (PGC, UGC…). New viewing experience (technology push): HDR/WCG, VR/AR, FTV. New distortions (e.g. codecs). Distortion level.

  4. Video Coding @ Machine (Deep) Learning Era (3): STANDARD PLAYGROUND. OUR CURRENT FOCUS: Rate/Distortion/Complexity Optimisation (RDCO); UGC encoding recipe; Pre-Processing/Coding Optimisation (PPCO); ad hoc testing methodologies boosted by AI; video quality measure for context-resilient coding. • Hybrid approach: symbolic, statistical and deep learning • Models that can predict: • Optimal/ad hoc transform • Optimal syntax prediction/signalling • Ad hoc video quality measures… Characterize content.

  5. Local vs Global. Optimize a system. Video Quality Assessment? …a matter of use case. Quality range. Use case: how the media is exploited. Benchmark systems: "codec A vs codec B". Display proc. 1, Display proc. 2.

  6. VIDEO QUALITY ASSESSMENT @ MACHINE LEARNING ERA. BOOSTING SUBJECTIVE TESTS: active sampling. IMPROVING metric learning: active sampling. SMART DATA vs BIG DATA: - data augmentation - Full Reference metric as annotation => global or local

  7. CONTENT CHARACTERIZATION TOWARDS RATE DISTORTION COMPLEXITY OPTIMISATION (RDCO)

  8. Content influence [motion search, CU size, depth …]. A. Aldahdooh, M. Barkowsky, and P. Le Callet, “The impact of complexity in the rate-distortion optimization: A visualization tool,” IWSSIP 2015

  9. Learning Content Features. 144 features: spatial and temporal, luma and chroma. Motion range prediction (HM/QP 32). Predicting block size (x.265/QP 32). A. Aldahdooh, M. Barkowsky, and P. Le Callet, “The impact of complexity in the rate-distortion optimization: A visualization tool,” IWSSIP 2015

  10. CONTENT CHARACTERISATION TOWARDS USER GENERATED CONTENT (UGC) ENCODING RECIPES

  11. Exploring the Characteristics of UGC. Map any uploaded UGC that has the same encoding characteristics (similar R-D curves obtained using certain codecs) to already known UGC. Can UGC encoding characteristics (R-D category) be predicted from content characteristics? Distance between R-D curves: BD-Rate/Distortion/Quality.

  12. BD-Rate/PSNR based clustering. Different colors represent different R-D related categories. AVC/H.264 • Encoded using AVC/H.264 with fixed QP values of 20, 22, 25, 27, 32, 36, 41, 46, and 51 • Cluster results based on BD-rate. VP9 • Encoded using VP9 with fixed CRF values of 20, 22, 25, 27, 32, 36, 41, 46, and 51 • Cluster results based on BD-rate
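As an illustration of the R-D curve distance named on slides 11 and 12, here is a minimal sketch of a Bjøntegaard-delta rate computation and a pairwise distance matrix built from it. It uses the common cubic fit of log-rate against PSNR; the slides do not say which BD-rate variant or clustering method was used, so the function names (`bd_rate`, `rd_distance_matrix`) and the symmetrization step are assumptions made for this example only.

```python
import numpy as np

def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bitrate difference (%) of the test curve vs. the reference
    curve at equal PSNR, using a cubic fit of log-rate against PSNR."""
    log_ref = np.log(np.asarray(rate_ref, dtype=float))
    log_test = np.log(np.asarray(rate_test, dtype=float))
    psnr_ref = np.asarray(psnr_ref, dtype=float)
    psnr_test = np.asarray(psnr_test, dtype=float)

    # Fit log(rate) as a cubic polynomial of quality for each curve.
    p_ref = np.polyfit(psnr_ref, log_ref, 3)
    p_test = np.polyfit(psnr_test, log_test, 3)

    # Integrate both fits over the overlapping quality interval.
    lo = max(psnr_ref.min(), psnr_test.min())
    hi = min(psnr_ref.max(), psnr_test.max())
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)

    # Mean log-rate difference, converted back to a percentage.
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_diff) - 1.0) * 100.0

def rd_distance_matrix(curves):
    """Symmetric distance between R-D curves (assumption: mean of the two
    absolute BD-rates, since BD-rate itself is not symmetric).
    `curves` is a list of (rates, psnrs) tuples, one per video."""
    n = len(curves)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            a = abs(bd_rate(*curves[i], *curves[j]))
            b = abs(bd_rate(*curves[j], *curves[i]))
            dist[i, j] = dist[j, i] = 0.5 * (a + b)
    return dist
```

The resulting matrix could be passed to any distance-based clustering method to obtain R-D categories; which method fits best depends on the data and is not specified in the slides.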

  13. Clustering based on BDR/quality + feature selection. Explainable AI. [Pipeline diagram. Blocks: Feature Extraction, Content feature, Feature Selection, Selected Feature Set, Selected content feature, BD-Rate/Quality based clustering, R-D labels (Facebook), R-D labels (YouTube), RD-Curve related classification, Predicted R-D related labels.]


  16. THE NEED FOR CONTENT CHARACTERISATION …and GOOD (NEW?) VIDEO QUALITY MEASURE FOR PREPROCESSING/CODING OPTIMIZATION (PPCO)

  17. Pre-Processing (content based): coding evaluation. [Diagram labels: Source video, HVSPP, Encode, Encoded bit-stream, Transmit, Network, Decode, Reconstructed HVSPP video, Reconstructed original video, Compare performance (HOW?).] Need for an accurate quality estimator at given operating points (not global), plus a check of the pre-processing/encoding efficiency for multiple viewing conditions. Subjective tests as ground truth: paired comparison (PC) has higher discriminatory power. Image quality estimators: often not optimized for different viewing conditions. M. Bhat, JM Thiesse, P. Le Callet, “HVS based perceptual pre-processing for video coding”, EUSIPCO 2019. M. Bhat, JM Thiesse, P. Le Callet, “On Accuracy of Objective Metrics For Assessment of Perceptual Pre-Processing for Video Coding”, ICIP 2019

  18. Bit-rate savings at the same quality for observers at 3H and 4.5H. [Plot labels: -ΔRmax, -ΔRmean, -ΔRmin.]

  19. THE NEED FOR CONTENT CHARACTERISATION …and GOOD (NEW?) VIDEO QUALITY MEASURE FOR DYNAMIC CODING

  20. CODECs and Dynamic Coding. Selecting optimal resolution & frame rate: content dependency. [Plot axes: Quality, Bit Rate.]

  21. CODECs and Dynamic Coding. [Plot axes: Quality, Bit Rate; labels: -ΔRmax, -ΔRmean, -ΔRmin.] • Confidence in subjective score? Confidence in video quality metric? a.k.a. metric resolution

  22. DEVELOPING MEANINGFUL VIDEO QUALITY MEASURES

  23. Objective Video Quality Measure native spaces? Michel Saad, Patrick Le Callet and Phil Corriveau, “Blind Image Quality Assessment: Unanswered Questions and Future Directions in the light of consumers needs”, 2nd VQEG eLetter, 2015. VQMs predict a quality score on a scale and are validated with subjective data also obtained on a scale (ACR, DSIS, SSCQE, DSCQS, SAMVIQ …). Figure of merit of a VQM: correlation coefficient?

  24. MEANINGFUL METRIC? We want a metric to… • …say if A is of better quality than B • => PAIRWISE problem that should be addressed as such. [Example pairs: OM = 0.80 vs OM = 0.75, |ΔOM| = 0.05; OM = 0.80 vs OM = 0.45, |ΔOM| = 0.35. Labels: ΔOM = +0.35; ΔOM is irrelevant.]

  25. VIDEO CODING and QUALITY ASSESSMENT: What do we need? GETTING MORE RELIABLE DATA IN NARROW QUALITY RANGE => improving confidence of GROUND TRUTH

  26. MEANINGFUL METRIC? We want a metric to… • …give closer scores for qualitatively similar pairs and distant scores for significantly different pairs • …give a higher score to the significantly preferred stimulus. Left image preferred with statistical significance. No significant difference in preferences. [Example pairs: OM = 0.80 vs OM = 0.75, |ΔOM| = 0.05; OM = 0.80 vs OM = 0.45, |ΔOM| = 0.35. Labels: ΔOM = +0.35; ΔOM is irrelevant.]

  27. PAIR COMPARISON (A/B) test design and analysis. Ground truth for video quality measure development: conversion to scale values is possible using Bradley-Terry or Thurstone-Mosteller models. The goal: mapping probabilities of preference to a scale => linear models of paired comparisons. [Figure labels: A1, A2, A3, A4.] Each stimulus Ai has a merit "Vi": in psychophysics, a sensation magnitude on a scale
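To make the conversion from preference counts to scale values concrete, here is a minimal sketch of a Bradley-Terry fit. The slide names the model but not a solver, so the minorization-maximization iteration used here is a swapped-in, standard choice, and the function name `bradley_terry` and the fixed iteration count are assumptions. `wins[i][j]` is the number of times stimulus Ai was preferred over Aj.

```python
import numpy as np

def bradley_terry(wins, iters=200):
    """Estimate Bradley-Terry scale values V_i = log(p_i) from a matrix of
    pairwise preference counts, using a simple MM iteration.
    Note: a stimulus that never wins gets p_i = 0, i.e. V_i = -inf."""
    wins = np.asarray(wins, dtype=float)
    n_compared = wins + wins.T          # number of trials for each pair
    total_wins = wins.sum(axis=1)       # total wins of each stimulus
    p = np.ones(len(wins))              # start from equal merits
    for _ in range(iters):
        # The diagonal of n_compared is zero, so self-pairs add nothing.
        denom = (n_compared / (p[:, None] + p[None, :])).sum(axis=1)
        p = total_wins / denom
        p /= p.sum()                    # fix the arbitrary scale
    return np.log(p)

# Usage: three stimuli, ten comparisons per pair.
scores = bradley_terry([[0, 8, 9],
                        [2, 0, 7],
                        [1, 3, 0]])
```

Under the model, the probability that Ai is preferred over Aj is p_i / (p_i + p_j), so differences of the returned values are log-odds of preference.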

  28. Boosting PC tests: Adaptive Square Design (ASD). ITU and IEEE standard. For the scenario where the ranking order of the test stimuli is not available: 1. Initialize the square matrix randomly. 2. Run paired comparisons according to the rules of square design. 3. Calculate the estimated scores: from the current paired comparison results, calculate the scores and sort them. 4. Update the square matrix: the adjacent pairs are arranged along a spiral. 5. Repeat steps 2 to 4 until certain conditions are satisfied (e.g., 40 observers). [Figure labels: B-T model; matrix A6 A1 A5 A2 A4 A3 A8 A9 A7; Run pair comparison; Rearrange the matrix; matrix A6 A5 A1 A4 A2 A3 A8 A9 A7; Final result (PoE).]
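The matrix update and pair selection in steps 2 and 4 can be sketched as follows. The slide does not give the exact spiral used in the standard, so this example assumes a clockwise spiral from the top-left corner inwards; `spiral_matrix` and `square_design_pairs` are hypothetical helper names. The point of the layout is that stimuli with neighbouring ranks end up in the same row or column, and square design compares exactly those pairs.

```python
from itertools import combinations

def spiral_matrix(ranked, size):
    """Place stimuli, already sorted by estimated score, along a clockwise
    inward spiral in a size x size grid (assumed spiral direction)."""
    if len(ranked) != size * size:
        raise ValueError("need exactly size*size stimuli")
    grid = [[None] * size for _ in range(size)]
    items = iter(ranked)
    top, bottom, left, right = 0, size - 1, 0, size - 1
    while top <= bottom and left <= right:
        for c in range(left, right + 1):          # top row, left to right
            grid[top][c] = next(items)
        top += 1
        for r in range(top, bottom + 1):          # right column, downwards
            grid[r][right] = next(items)
        right -= 1
        if top <= bottom:
            for c in range(right, left - 1, -1):  # bottom row, right to left
                grid[bottom][c] = next(items)
            bottom -= 1
        if left <= right:
            for r in range(bottom, top - 1, -1):  # left column, upwards
                grid[r][left] = next(items)
            left += 1
    return grid

def square_design_pairs(grid):
    """Pairs to compare: every two stimuli sharing a row or a column."""
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    return {tuple(sorted(pair)) for line in lines for pair in combinations(line, 2)}

# Usage: nine stimuli identified by their current rank 1..9.
grid = spiral_matrix(list(range(1, 10)), 3)
pairs = square_design_pairs(grid)
```

For nine stimuli this selects 18 of the 36 possible pairs while still covering every pair of rank neighbours.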

  29. Active sampling for pairwise comparison (NIPS 2018). Batch selection: active learning according to Bayesian theory, KL divergence, Expected Information Gain (EIG), Minimum Spanning Tree (MST). A minimum spanning tree (selection of batch). J. Li et al., “Hybrid-MST: A Hybrid Active Sampling Strategy for Pairwise Preference Aggregation”, NIPS 2018
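The batch-selection idea can be illustrated with a small sketch. It assumes the expected information gain of every pair has already been computed (that Bayesian step is not shown) and picks a spanning tree over the stimuli that favours high-EIG pairs, which is the same as a minimum spanning tree on negated EIG. The edge weighting used in the cited paper is not reproduced here; `spanning_tree_batch` is a hypothetical name and the Prim-style search is just one simple way to build the tree.

```python
def spanning_tree_batch(eig):
    """Select n-1 pairs forming a spanning tree that greedily maximizes
    expected information gain (Prim's algorithm on negated weights).
    `eig` is a symmetric n x n matrix of per-pair EIG values (assumed given)."""
    n = len(eig)
    in_tree = {0}                 # grow the tree from stimulus 0
    batch = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j in in_tree:
                    continue
                if best is None or eig[i][j] > eig[best[0]][best[1]]:
                    best = (i, j)
        batch.append(best)        # this pair goes into the next batch
        in_tree.add(best[1])
    return batch

# Usage with made-up EIG values for four stimuli.
eig = [[0.0, 0.9, 0.1, 0.2],
       [0.9, 0.0, 0.8, 0.3],
       [0.1, 0.8, 0.0, 0.7],
       [0.2, 0.3, 0.7, 0.0]]
batch = spanning_tree_batch(eig)
```

A tree keeps the comparison graph connected, so every stimulus is compared at least once per batch and the pairs can be shown to several observers in parallel.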

  30. VIDEO CODING and QUALITY ASSESSMENT: What do we need? GETTING MORE RELIABLE DATA IN NARROW QUALITY RANGE => improving confidence of GROUND TRUTH. VALIDATING OBJECTIVE IMAGE QUALITY PREDICTORS … REVISIT BENCHMARKING

  31. Objective Video Quality Measure native spaces? VQMs predict a quality score on a scale and are validated with subjective data also obtained on a scale (ACR, DSIS, SSCQE, DSCQS, SAMVIQ …). Figure of merit of a VQM: RMSE? => need for mapping

  32. Objective Video Quality Measure native spaces? VQMs predict a quality score on a scale and are validated with subjective data also obtained on a scale (ACR, DSIS, SSCQE, DSCQS, SAMVIQ …). Figure of merit of a VQM: RMSE? => need for mapping => meaning with regard to the accuracy of the MOS?

  33. VQM native spaces? Most VQMs predict a quality score on a scale and are validated with subjective data also obtained on a scale (ACR, DSIS, SSCQE, DSCQS, SAMVIQ …) => alternative: Pair Comparison • Dedicated framework: Hanhart et al., QoMEX 2016 …not a native space of VQM

  34. Alternative analysis in a native space (Krasula et al., QoMEX 2016). Takes care of the confidence of subjective scores. No mapping of objective scores to a common subjective scale
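One way to read "analysis in a native space" is to ask two pairwise questions of a metric without mapping its scores: do larger score differences go with significantly different pairs, and does the sign of the difference agree with the preferred stimulus? The sketch below is a simplified illustration of that idea, not the authors' published procedure. The function names and the encoding of `preference` (+1 if the first stimulus is significantly better, -1 if significantly worse, 0 if not significantly different) are assumptions.

```python
import numpy as np

def auc(pos, neg):
    """Area under the ROC curve: probability that a random positive sample
    scores above a random negative sample (ties count as one half)."""
    pos = np.asarray(pos, dtype=float)
    neg = np.asarray(neg, dtype=float)
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

def different_vs_similar_auc(delta_om, preference):
    """How well |ΔOM| separates significantly different pairs from similar ones."""
    magnitude = np.abs(np.asarray(delta_om, dtype=float))
    significant = np.asarray(preference) != 0
    return auc(magnitude[significant], magnitude[~significant])

def better_vs_worse_accuracy(delta_om, preference):
    """Fraction of significantly different pairs whose ΔOM sign matches
    the subjective preference; non-significant pairs are ignored."""
    delta_om = np.asarray(delta_om, dtype=float)
    preference = np.asarray(preference)
    significant = preference != 0
    return float(np.mean(np.sign(delta_om[significant]) == preference[significant]))

# Usage with made-up metric differences for four pairs.
delta_om = [0.35, 0.05, -0.30, 0.02]
preference = [1, 0, -1, 0]
print(different_vs_similar_auc(delta_om, preference))
print(better_vs_worse_accuracy(delta_om, preference))
```

An AUC near 0.5 means the metric's score differences carry no information about whether observers saw a significant difference, which is the situation the per-quality-range slides report for narrow MOS ranges.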

  35. CODECs and Dynamic Coding. [Plot axes: Quality, Bit Rate; labels: -ΔRmax, -ΔRmean, -ΔRmin.] • Confidence in subjective score? Confidence in video quality metric? a.k.a. metric resolution

  36. Per Quality Range Analysis Native output of VQM/OM (no mapping) MOS RANGE [1-2] MOS RANGE [2-3] MOS RANGE [3-4] MOS RANGE [4-5]

  37. Per Quality Range Analysis. All VQMs challenged, AUC ≈ 0.5 (HDR-VDP 2.2 slightly better). VIF-PU 68%, HDR-VDP 59%, PSNR/SSIM <50%. All VQM ≈ 50%. All VQM ≈ 50% (PSNR worst, 45%). HDR-VDP 73%, VIF-PU 66%, SSIM 64%, PSNR 60%. MOS RANGE [1-2], MOS RANGE [2-3], MOS RANGE [3-4], MOS RANGE [4-5]. BROADCAST QUALITY

  38. Training objective metrics on multiple databases • 20 databases • 3,606 videos • 6 different subjective procedures • ACR, DSIS, SSCQE, SAMVIQ, and PC • Proof of concept • Shallow NN with the methodology as a cost function • Conservative training strategy • Significant improvement in overall metric’s performance Krasula et al. – Training Objective Image and Video Quality Estimators Using Multiple Databases, IEEE Transactions on Multimedia 2019

  39. Video Quality Measure for Context-Resilient Coding

  40. Local vs Global. Optimize a system. Video Quality Assessment? …a matter of use case. Quality range. Use case: how the media is exploited. Benchmark systems: "codec A vs codec B". Display proc. 1, Display proc. 2.

  41. A Hyperspace of possibilities: more (meta)dimensions. Content type. IDIOSYNCRASY. New viewing experience (technology push). Distortion level. CONTEXT?

  42. Context: Viewing distance & Visual Angle
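As a worked example of how viewing distance translates into visual angle, the sketch below computes the number of pixels per degree for a viewer sitting a given number of picture heights (H) from the display, using the 3H and 4.5H distances mentioned on slide 18. The 1080-line display height and the function name `pixels_per_degree` are assumptions for illustration.

```python
import math

def pixels_per_degree(distance_in_heights, vertical_resolution):
    """Pixels covered by one degree of visual angle at the screen centre,
    for a viewing distance expressed in picture heights."""
    # Viewing distance measured in pixel units.
    distance_px = distance_in_heights * vertical_resolution
    # Visual angle subtended by a single pixel, in degrees.
    pixel_angle_deg = 2.0 * math.degrees(math.atan(0.5 / distance_px))
    return 1.0 / pixel_angle_deg

# Usage: a 1080-line display viewed at 3H and 4.5H.
ppd_3h = pixels_per_degree(3.0, 1080)
ppd_45h = pixels_per_degree(4.5, 1080)
```

Moving from 3H to 4.5H raises the pixel density on the retina by a factor of about 1.5, so distortions of a given pixel size occupy a smaller visual angle and tend to be less visible, which is why a quality estimator tuned for one distance may misjudge another.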

  43. TMO and Preference. [Figure labels: LDR, HDR, LDR.] L. Krasula, M. Narwaria, K. Fliegel and P. Le Callet, “Influence of HDR Reference on Observers Preference in Tone Mapped Images Evaluation,” in Seventh International Workshop on Quality of Multimedia Experience (QoMEX), May 2015.

  44. Context: The importance of viewing conditions. B. Watson: “Viewing distance should be an input parameter of the metric” & importance of display. M. Carnec, P. Le Callet and D. Barba, “Objective quality assessment of color images based on a generic perceptual reduced reference”, Signal Processing: Image Communication, 2006

  45. Perceptual based approach: a closer look. Perceptual error maps: error normalized by visibility threshold. [Diagram labels: Viewing Distance, Model of Display, Reference Image, From bits to Visual Unit, Distorted Image, From bits to Visual Unit, Visual differences & Normalization, Pooling, Quality Score.] • Pooling: - Component (color) - Orientation - Frequency - Spatial - Temporal • Modeling the annoyance / the formation of quality judgement • From quantified visible errors to impact on visual quality

  46. On Perceptual Coding: Quality, Content Features and Complexity. Patrick Le Callet, Université de Nantes

  47. IMAGE ENHANCEMENT USE CASE. The No-Reference case: ad hoc methods and testing methodologies. Range effect. Overshoot effect. Michel Saad, Patrick Le Callet and Phil Corriveau, “Blind Image Quality Assessment: Unanswered Questions and Future Directions in the light of consumers needs”, 2nd VQEG eLetter, 2015
