
Parallel Design Methodology for Video Codec LSI with High-level Synthesis and FPGA-based Platform

DAC50, Designer Track, 156-VB543. Kazuya YOKOHARI, Koyo NITTA, Mitsuo IKEDA, and Atsushi SHIMIZU, NTT Media Intelligence Laboratories.


Presentation Transcript


  1. DAC50, Designer Track, 156-VB543. Parallel Design Methodology for Video Codec LSI with High-level Synthesis and FPGA-based Platform. Kazuya YOKOHARI, Koyo NITTA, Mitsuo IKEDA, and Atsushi SHIMIZU, NTT Media Intelligence Laboratories.

  2. Outline • Introduction • Proposed Design Methodology • Case Study: 4K HEVC Intra Codec • Evaluation • Conclusion

  3. Video Codec LSI • MPEG-2 and H.264/AVC are major video coding standards. • We have developed an MPEG-2 video codec LSI (VASA) and an H.264/AVC codec LSI (SARA). • Developing a video codec LSI requires many simulations. [Figure labels: Test data; Codec LSI: VASA (MPEG-2), SARA (H.264/AVC); Bit Stream (Coded Image); objective evaluation examples: BD-Bitrate, SSIM, PSNR] • Coded images should be evaluated both subjectively and objectively. • Some degradations in coded images are not detected by objective evaluation. • Real-time subjective evaluation is important for finding these degradations.
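As an illustration of the objective evaluation named on this slide, here is a minimal sketch of PSNR between a reference image and a coded image. The function name `psnr`, the list-of-rows image layout, and the 8-bit peak value of 255 are assumptions made for this example; the slides do not say how the authors computed their metrics.

```python
import math


def psnr(reference, coded, max_value=255):
    """Return PSNR in dB between two equally sized images.

    Images are lists of rows of integer pixel values (an assumed layout).
    max_value is the peak pixel value; 255 assumes 8-bit samples.
    """
    if len(reference) != len(coded):
        raise ValueError("images must have the same number of rows")

    squared_error = 0
    count = 0
    for ref_row, coded_row in zip(reference, coded):
        if len(ref_row) != len(coded_row):
            raise ValueError("rows must have the same length")
        for ref_pixel, coded_pixel in zip(ref_row, coded_row):
            squared_error += (ref_pixel - coded_pixel) ** 2
            count += 1

    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images: no distortion
    return 10 * math.log10(max_value ** 2 / mse)
```

A single number like this averages the error over the whole picture, which is one reason a locally visible artifact can leave the score almost unchanged and why the slide also calls for subjective evaluation.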

  4. Existing LSI Design Flow • In the existing design flow, even behavioral design, which is the fastest simulation environment, takes about 100 times longer to simulate than real time. • A fast simulation environment is important, since video codec LSI design requires many simulations. [Figure: existing design flow with its architecture exploration loop and simulation speed at each level. Behavioral design (SystemC source codes): X100 on CPU. RTL design (Verilog-RTL codes, produced by behavioral synthesis): X1,000 on CPU, X100 on emulator. Gate-level design (produced by logic synthesis from already verified Verilog-RTL codes and a technology library): X10,000 on CPU, X1,000 on emulator. Other labels: Stimulus; Verification (Pass/Fail); P & R; FPGA; ASIC; IP core.]
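To make the slowdown factors on this slide concrete, the sketch below converts a clip length into simulation wall-clock time. The factors are the ones listed on the slide; reading them as multiples of real time, and the names `SLOWDOWN` and `simulation_seconds`, are assumptions made for this example.

```python
# Slowdown factors from the slide, keyed by (design level, simulation engine).
# Interpreting each factor as a multiple of real time is an assumption.
SLOWDOWN = {
    ("behavioral", "cpu"): 100,
    ("rtl", "cpu"): 1_000,
    ("rtl", "emulator"): 100,
    ("gate", "cpu"): 10_000,
    ("gate", "emulator"): 1_000,
}


def simulation_seconds(clip_seconds, level, engine):
    """Estimate wall-clock seconds needed to simulate a clip of the given length."""
    return clip_seconds * SLOWDOWN[(level, engine)]


if __name__ == "__main__":
    clip = 10  # hypothetical 10-second test sequence
    for (level, engine), factor in SLOWDOWN.items():
        hours = simulation_seconds(clip, level, engine) / 3600
        print(f"{level:10s} on {engine:8s}: X{factor:<6d} -> {hours:.2f} h")
```

Under these assumptions, a hypothetical 10-second clip takes 1,000 seconds at the behavioral level on a CPU and 100,000 seconds (more than a day) at gate level on a CPU.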

  5. The Problems of Video Codec LSI Development • Developing a video codec LSI requires many simulations. • In the existing LSI design flow, simulation takes 100 times longer. • To resolve these problems, both the simulation environment and the circuit design environment must allow codec LSI performance to be checked and improved smoothly. • Simulation environment: FPGA-based platform. Using FPGAs makes real-time simulation possible. • Circuit design environment: high-level synthesis. Using high-level synthesis makes rapid prototyping possible.

  6. Video Codec Design Platform • The video codec design platform can run large-scale circuit simulation in real time using many FPGAs. • The proposed platform inputs and outputs image data in real time through several SDI interfaces. [Figure labels: FPGA1, FPGA2, FPGA (Center), FPGA3, FPGA4, SDI interface] • The proposed platform has many FPGAs because a product-level video codec LSI is very large. • With many FPGAs, this platform can simulate a product-level circuit.

  7. Proposed Video Codec Design Flow (1/2) • Good: the proposed design flow enables rapid prototyping using high-level synthesis. • Good: the proposed design flow enables real-time simulation using the proposed platform. • Not good: when a single architecture exploration loop is used, every design step is repeated on each iteration, which adds feedback time. [Figure: the design-flow diagram from slide 4, with the proposed architecture exploration loop shown alongside the existing one and a simulation speed of X1 on the video codec design platform.]

  8. Proposed Video Codec Design Flow (2/2) • The circuit design is subdivided and the parts are designed in parallel, in order to reduce the feedback time caused by repeating each design step. • Parallel design makes architecture exploration fast. [Figure: the design-flow diagram from slide 4, with the proposed architecture exploration loop shown alongside the existing one and a simulation speed of X1 on the video codec design platform.]

  9. Summary of the Proposed Design Methodology The proposed parallel design methodology has three features. • High-level synthesis: the target circuit architecture can be changed and tuned more easily than with an RTL design methodology. • Video codec design platform: subjective image evaluation can be performed, since the proposed platform simulates in real time. • Parallel design: combined with high-level synthesis, functions can be added in smaller units, which reduces feedback time. By combining these three features, the effect of each function on subjective image quality can be evaluated and used for architecture exploration.

  10. Case Study: 4K HEVC Intra Codec • HEVC (High Efficiency Video Coding) is a next-generation video coding standard. • The HEVC intra codec consists of three blocks: intra prediction, transform and quantization, and entropy coding. [Figure: Video Coding. Input Data, Intra Prediction, Transform and Quantization, Entropy Coding, Output Stream] • Intra Prediction generates a prediction difference image from the input data and the predicted image data. • Transform and Quantization generates quantized values from the transformed difference image, and a reconstruction image from the quantized values. • Entropy Coding generates a bit stream from the quantized values.
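The data flow between the first two blocks can be sketched as follows. This is a deliberately simplified illustration, not the authors' circuit and not a conforming HEVC implementation: the prediction is a plain DC mean of neighboring samples, the transform stage is omitted (the residual is quantized directly), and entropy coding is left out. All function names and the quantization step are assumptions made for this example.

```python
def dc_predict(top, left, size):
    """Predict a size x size block as the rounded mean of the neighbor samples."""
    neighbors = list(top) + list(left)
    dc = (sum(neighbors) + len(neighbors) // 2) // len(neighbors)
    return [[dc] * size for _ in range(size)]


def difference(block, prediction):
    """Prediction difference image: input block minus predicted block."""
    return [[b - p for b, p in zip(b_row, p_row)]
            for b_row, p_row in zip(block, prediction)]


def quantize(values, step):
    """Scalar quantization that truncates toward zero (transform omitted here)."""
    return [[(abs(v) // step) * (1 if v >= 0 else -1) for v in row]
            for row in values]


def reconstruct(prediction, levels, step):
    """Reconstruction image: prediction plus the dequantized difference."""
    return [[p + q * step for p, q in zip(p_row, q_row)]
            for p_row, q_row in zip(prediction, levels)]


if __name__ == "__main__":
    block = [[101, 105], [97, 101]]           # hypothetical 2x2 input block
    prediction = dc_predict([100, 100], [100, 104], size=2)
    levels = quantize(difference(block, prediction), step=4)
    print(levels)                              # values that would go to entropy coding
    print(reconstruct(prediction, levels, step=4))
```

The reconstruction is built from the quantized values rather than from the input, so the encoder predicts from the same samples a decoder would have, matching the slide's statement that Transform and Quantization produces the reconstruction image.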

  11. The Specifications of the HEVC Intra Codec [Specification table not recovered; part of it is marked "This slide's scope."] • Prediction Mode [Figure labels: 18, 26, 34, 10, 0: Planar, 1: DC, 2] *CU stands for Coding Unit. *PU stands for Prediction Unit. *TU stands for Transform Unit. *HM is the HEVC reference software.

  12. Evaluation (1/2) Circuit Performance and Design Period [Figure labels: STEP1; STEP2 LOOP#1; STEP2 LOOP#2; STEP2 LOOP#3; Subjective Evaluation Period; Feedback data is available] • Main changes to each block: LOOP#1: upgrade of the base algorithm of each block. LOOP#2: functional expansion of IPD. LOOP#3: functional expansion of each block. • The circuit performance of each expanded function is evaluated at STEP2. • Feedback data from the other design loops is available at STEP2.

  13. Evaluation (2/2) • Using the proposed parallel design methodology, three design loops were completed in only seven months. • Using the proposed parallel design methodology, the cycle * area product was reduced to 1/5 within four months of the preliminary design of LOOP#1, and to 1/4 within three months of the preliminary design of LOOP#2. [Figure labels: STEP1; STEP2; LOOP#1: 80% down (four months); 90% down; LOOP#2: 75% down (three months)]
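The "reduced to 1/5" and "reduced to 1/4" figures in the text correspond to the "80% down" and "75% down" labels in the figure. The short sketch below shows the conversion; the function name `percent_down` is an assumption made for this example, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def percent_down(remaining_ratio):
    """Convert 'reduced to remaining_ratio of the original' into a percentage decrease."""
    return (1 - Fraction(remaining_ratio)) * 100


if __name__ == "__main__":
    print(percent_down(Fraction(1, 5)))  # LOOP#1: cycle * area reduced to 1/5
    print(percent_down(Fraction(1, 4)))  # LOOP#2: cycle * area reduced to 1/4
```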

  14. Conclusion • We proposed a new design methodology for video codec LSIs. With it, we can reduce feedback time, run simulations, and evaluate coded images in real time. • Using the proposed design methodology, three design loops were completed in only seven months. • Using the proposed design methodology, the cycle * area product was reduced to 1/5 within four months of the preliminary design of LOOP#1, and to 1/4 within three months of the preliminary design of LOOP#2. • To realize an HEVC codec, we need to add or expand some functional tools while checking each tool by subjective evaluation.
