
CE6.d: Parallel Intra Coding JCTVC-F605 Jie Zhao and Andrew Segall



Presentation Transcript


  1. CE6.d: Parallel Intra Coding, JCTVC-F605, Jie Zhao and Andrew Segall

  2. Parallel Prediction Unit
     • Parallel Intra Prediction
       • Intra prediction is a serial bottleneck for high-resolution applications
       • Prediction is based on reconstructed pixels from the left and top neighbors
       • Small blocks (4x4) are important for visual and R-D performance at high resolutions
       • We seek to improve parallelism to reduce the complexity of worst-case intra decoding (high resolution with a large number of 4x4 blocks)
     • Our solution: Parallel Prediction Unit (PPU)
       • Defines a CU size within an LCU that can be encoded/decoded in parallel
       • For CUs larger than the defined PPU size, traditional sequential prediction is performed
       • Benefit: enables parallelism as dependencies increase within an LCU
     [Figure: an LCU divided into CUs and CU/PU/TU blocks, with one region marked as a PPU]
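The size rule on this slide can be written as a one-line check. The following is only a minimal sketch: the function name `prediction_mode` and the string labels are hypothetical, and treating a CU exactly equal to the PPU size as parallel is read from the "larger than" wording, not confirmed by the proposal.

```python
def prediction_mode(cu_size: int, ppu_size: int) -> str:
    """Pick the prediction scheme for a CU, given the signalled PPU size.

    Both sizes are block widths in pixels (e.g. 8 for an 8x8 block).
    Names and return labels are illustrative assumptions.
    """
    # CUs larger than the PPU size keep traditional sequential prediction.
    if cu_size > ppu_size:
        return "sequential"
    # CUs at or below the PPU size fall inside a PPU and may be
    # predicted in parallel.
    return "parallel"


if __name__ == "__main__":
    for cu in (4, 8, 16, 32):
        print(cu, prediction_mode(cu, ppu_size=8))
```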

  3. Parallel Intra Prediction
     • Goal of CE6.d
       • Compare the performance of three configurations of parallel intra prediction
       • Configuration #1: Checker-board partition (2X parallelism)
       • Configuration #2: Stripe partition (2X parallelism)
       • Configuration #3: No 4x4 intra prediction
         • Suggested at the previous meeting for comparison
         • 8x8 intra prediction combined with 4x4 residual
     • Note: We originally used bi-directional prediction for block 0 in the first configuration. We have removed it to ‘seek a simple/design structure’, as requested at the last meeting.
     [Figures: Fig. 1 Checker-board partition; Fig. 2 Stripe partition; Fig. 3 No 4x4 prediction. Legend: first set blocks, second set blocks]
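To make the two partition shapes concrete, here is a rough sketch that splits the blocks of a square PPU into a first and a second set. The figures are not available in this transcript, so the details are assumptions: the function name `partition_ppu` is hypothetical, blocks are addressed as (row, col), even parity is arbitrarily assigned to the first set, and stripes are taken to be horizontal rows.

```python
def partition_ppu(blocks_per_side: int, mode: str):
    """Split the blocks of a square PPU into (first_set, second_set).

    Which blocks land in which set, and the stripe orientation, are
    illustrative assumptions, not the layout from the proposal's figures.
    """
    first, second = [], []
    for row in range(blocks_per_side):
        for col in range(blocks_per_side):
            if mode == "checkerboard":
                # Alternate sets like the squares of a checkerboard.
                in_first = (row + col) % 2 == 0
            elif mode == "stripe":
                # Alternate sets row by row (assumed horizontal stripes).
                in_first = row % 2 == 0
            else:
                raise ValueError(f"unknown partition mode: {mode}")
            (first if in_first else second).append((row, col))
    return first, second


if __name__ == "__main__":
    # An 8x8 PPU made of four 4x4 blocks has 2 blocks per side.
    print(partition_ppu(2, "checkerboard"))
    print(partition_ppu(2, "stripe"))
```

In both modes each set holds half of the blocks, which is consistent with the 2X parallelism quoted for configurations #1 and #2.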

  4. Result (PIP Stripe)
     • Results
       • “Stripe” partition performed best
       • All resolutions
         • Intra (HE/LC): 1.4%
         • RA (HE/LC): 0.5%
         • LD-B (HE/LC): 0.2%
       • HD only
         • Intra (HE/LC): 1.0%
         • RA (HE/LC): 0.5%
         • LD-B (HE/LC): 0.2%

  5. Results
     • Results (for reference): No 4x4 prediction
     [Tables: Stripe partition; Disable 4x4 prediction]
     • Parallel intra prediction outperforms disabling 4x4 prediction by 2.6% and 3.8% (AI-HE and AI-LC, respectively).

  6. Conclusions
     • Conclusions
       • Parallel intra prediction to reduce serial dependencies within the current design
       • Provides parallelism for both encoder and decoder
       • Parallelism is achieved by partitioning a PPU into two sets
         • First set predicted from the boundaries of the PPU
         • Second set predicted from all available pixels
       • Impact on average BD rate (Stripe configuration)
         • Intra (both HE and LC): 1.4%
         • Random access (both HE and LC): 0.5%
         • Low delay B (both HE and LC): 0.2%
     • Related Documents
       • Verification by Toshiba, Qualcomm and ETRI (JCTVC-F328, F583, F628)
       • Request to improve parallelization in current design from Zoran (JCTVC-F736)
     • Propose to adopt this technique into HM.
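The effect of the two-set split on serial depth can be estimated with simple step counting. This is a back-of-the-envelope sketch, not the authors' complexity analysis: the function names are hypothetical, one "step" means predicting and reconstructing one block, and blocks within the same set are assumed not to depend on one another.

```python
import math


def sequential_steps(num_blocks: int) -> int:
    """Serial steps when each block waits for its predecessor."""
    return num_blocks


def two_set_steps(first_set_size: int, second_set_size: int, workers: int) -> int:
    """Serial steps when a PPU is processed as two sets, one after the other.

    Assumes blocks inside a set are mutually independent, so a set of n
    blocks needs ceil(n / workers) steps; the second set starts only
    after the first set is fully reconstructed.
    """
    return math.ceil(first_set_size / workers) + math.ceil(second_set_size / workers)


if __name__ == "__main__":
    # Assumed example: an 8x8 PPU holding four 4x4 blocks, two per set.
    print(sequential_steps(4))                  # 4 steps
    print(two_set_steps(2, 2, workers=2))       # 2 steps
```

Under these assumptions a four-block PPU drops from 4 serial steps to 2 with two parallel units. With a single unit the count stays at 4, so the split changes dependency depth rather than total work.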
