Parallel Scalability and Efficiency of HEVC Parallelization Approaches. Chi Ching Chi, Mauricio Alvarez-Mesa, Ben Juurlink, Gordon Clare, Félix Henry, Stéphane Pateux and Thomas Schierl. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Outline • Introduction • Video codec parallelization approaches • Coding efficiency analysis • Experimental evaluation • Conclusions
Introduction • While a single-core processor can decode 1080p H.264/AVC video in real time, it is very unlikely that single-core performance will be enough to decode 2160p50 HEVC video in real time. • To obtain real-time HEVC decoding performance, parallelism is no longer an option but a necessity.
Introduction • H.264/AVC supports slice parallelization. • A decoder relying on slices may not achieve real-time performance if it receives a video with only one or a few slices per frame. • The main parallelization approaches currently included in the HEVC draft are Tiles and Wavefront Parallel Processing (WPP). • This paper presents an approach called Overlapped Wavefront (OWF).
Previous parallelization strategies • Frame-level parallelism • Slice-level parallelism • Macroblock-level parallelism
Frame-level parallelism • Frame-level parallelism consists of processing multiple frames at the same time. • Frame-level parallelism is sufficient for multicore systems with just a few cores. • If motion vectors are long due to fast motion, there is little parallelism.
Slice-level parallelism • Each frame can be partitioned into one or more slices. • Slices in a frame are completely independent from each other and can therefore be processed in parallel. • This is useful for a frame with a few slices, but offers no parallelism when there is only one slice per frame.
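The slice-level idea above can be sketched in a few lines of Python. This is only an illustration of the structure, not a decoder: `decode_slice` and `decode_frame` are hypothetical names, and the "decoding" is a placeholder that sums bytes. Python threads are used here only to show how independent slices map onto workers.

```python
from concurrent.futures import ThreadPoolExecutor

def decode_slice(slice_data):
    # Hypothetical stand-in for real slice decoding: just sum the bytes.
    return sum(slice_data)

def decode_frame(slices, max_workers=4):
    # Slices are independent, so they can be handed to a worker pool.
    # Usable parallelism is capped by the number of slices in the frame.
    workers = min(max_workers, len(slices))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(decode_slice, slices))
    return results, workers

# Two slices keep two workers busy; one slice per frame leaves only one.
two_slices = decode_frame([b"\x01\x02", b"\x03"])
one_slice = decode_frame([b"\x01"])
```

With one slice per frame, `workers` is 1 no matter how many cores are available, which is the limitation noted on the slide.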
Parallelization Strategies in HEVC • Tiles • Wavefront Parallel Processing (WPP) • Overlapped Wavefront (OWF)
Tiles • The number of tiles and the location of their boundaries can be defined for the entire sequence or changed from picture to picture. • Compared to slices, tiles have better coding efficiency. • The rate-distortion loss increases with the number of tiles.
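As a rough illustration of how tile boundaries partition a picture, the sketch below splits a picture into a grid of nearly equal rectangular tiles, measured in CTBs. The function names, the integer split, and the 64x64 CTB size are assumptions made for this example; the slides do not specify how the boundaries are chosen.

```python
def uniform_boundaries(size_in_ctbs, num_tiles):
    # Integer split of size_in_ctbs into num_tiles nearly equal spans.
    return [(i * size_in_ctbs) // num_tiles for i in range(num_tiles + 1)]

def tile_grid(width_ctbs, height_ctbs, num_cols, num_rows):
    # One (x0, y0, x1, y1) rectangle per tile, in CTB units,
    # with x1 and y1 exclusive.
    xs = uniform_boundaries(width_ctbs, num_cols)
    ys = uniform_boundaries(height_ctbs, num_rows)
    return [(xs[c], ys[r], xs[c + 1], ys[r + 1])
            for r in range(num_rows) for c in range(num_cols)]

# Assumed example: a 1920x1080 picture with 64x64 CTBs is
# 30 CTBs wide and 17 CTBs high.
tiles = tile_grid(30, 17, num_cols=4, num_rows=2)
```

Each rectangle can be given to a separate thread. Adding more tiles adds more boundaries, which is where the rate-distortion loss mentioned above comes from.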
Overlapped Wavefront (OWF) • When a thread has finished a CTB row in the current picture and no more rows are available, it can start processing the next picture instead of waiting for the current picture to finish. • To support this approach, the motion vector is constrained to ¼ of the picture height.
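The role of the motion vector constraint can be sketched as a simple readiness check. The code below is a simplified model under stated assumptions: the ¼ limit is interpreted as a bound on downward vertical motion, interpolation and in-loop filter margins are ignored, and the function names and the 64-pixel CTB size are hypothetical. It estimates how many CTB rows of the reference picture must be complete before a given row of the next picture can start.

```python
def ref_rows_needed(row, ctb_size, pic_height, mv_fraction=0.25):
    # Largest downward displacement allowed by the assumed constraint.
    max_mv = int(pic_height * mv_fraction)
    # Lowest reference pixel row that CTB row `row` could touch.
    lowest_pixel = (row + 1) * ctb_size - 1 + max_mv
    lowest_pixel = min(lowest_pixel, pic_height - 1)
    # Number of reference CTB rows (counted from the top) that must be done.
    return lowest_pixel // ctb_size + 1

def can_start_row(row, completed_ref_rows, ctb_size, pic_height):
    # True if enough of the reference picture is already decoded.
    return completed_ref_rows >= ref_rows_needed(row, ctb_size, pic_height)

# Assumed example: 1080 pixels high, 64x64 CTBs (17 CTB rows).
first_row_needs = ref_rows_needed(0, 64, 1080)
ready = can_start_row(0, completed_ref_rows=6, ctb_size=64, pic_height=1080)
```

In this model the first row of the next picture needs only the top part of the reference picture, so a thread that runs out of rows in the current picture can move on early. Without a bound on the motion vector, every row would have to wait for the whole reference picture.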
Experimental evaluation • Environment
Conclusions • We present a detailed performance comparison of the main approaches, namely WPP, Tiles and OWF. • Tiles perform 7% better than WPP on average at 12 cores. • The proposed OWF performs 28% better than Tiles on average. • Real-time performance is achieved for 1080p50 videos, but “only” 25.4 fps for 2160p.
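To put the 2160p result in perspective, the short calculation below estimates the gap to real time. It assumes a 50 fps target (as in the 2160p50 case from the introduction) and the usual 1920x1080 and 3840x2160 resolutions; neither the helper names nor these resolutions appear in the slides.

```python
def required_speedup(target_fps, achieved_fps):
    # Factor by which decoding must get faster to reach the target rate.
    return target_fps / achieved_fps

def pixel_ratio(width_a, height_a, width_b, height_b):
    # How many times more pixels picture A has than picture B.
    return (width_a * height_a) / (width_b * height_b)

# Assumed target: 50 fps for 2160p; reported result: 25.4 fps.
speedup = required_speedup(50, 25.4)

# Assumed resolutions: 3840x2160 for 2160p, 1920x1080 for 1080p.
ratio = pixel_ratio(3840, 2160, 1920, 1080)
```

Under these assumptions a 2160p picture has four times the pixels of a 1080p picture, and the decoder would need to be roughly twice as fast to reach 50 fps.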