Platform-based Design for MPEG-4 Video Encoder. Presenter: Yu-Han Chen.
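Video Coding Standards • [Figure: video coding standards plotted by data rate (10K, 100K, 1M, 10M bps) against resolution/quality (QCIF, CIF, SDTV, HDTV). Standards shown: H.261, MPEG1, MPEG2, H.263, MPEG4. Years shown: 1990, 1992, 1994, 1995, 1999. Application labels: Telecomm, Storage, Broadcasting, Multimedia]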
Introduction • Multimedia applications are emerging • Video-phone, camcorder, surveillance, and video streaming • MPEG-4 provides a total solution for these applications • High compression ratio for limited bandwidth • Error robustness for error-prone environments • Content interactivity for more functionality beyond ‘seeing’
Proposed MPEG-4 Encoder • MPEG-4 video encoding • Platform-based system architecture • Motion encoding module • Texture encoding module
Complexity Analysis of Optimized Software Model • SPL3 foreman sequence at 30 fps • ME – full search with half-stop algorithm • DCT/IDCT – row-column decomposition
Implementation Demands • Required computational power is up to 12 GIPS • ME is the most demanding component • DCT/IDCT is the second • Dedicated hardware accelerators are employed • Implementation is matched to algorithm characteristics • Software for irregular and sequential algorithms • Hardware for algorithms with high processing rates • HW/SW co-design is the most promising way to achieve a cost-effective system
Platform for MPEG-4 Video Coding • Platform-based system includes • HYRISC, RBUS and DBUS, DMA, MEMIF • Hardware accelerators include • ME, MC, BE (DCT/IDCT, Q, IQ, ACDCP), Bitstream Unit, Shared Memory (CG, CB)
Summary of ME • Low cost and high performance hybrid motion estimation is proposed • Dynamic modes for various applications • Applications of real-time and low power • PDS (Predictive Diamond Search) mode • Applications of high compression quality • FFS (Fast Full Search) mode • Spiral full search with PDE (Partial Distortion Elimination)
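The FFS mode above (spiral full search with PDE) can be sketched in software as follows. This is a minimal illustration, not the presented hardware architecture: the function names, the SAD cost, the small default block size and search range, and the ring-ordered approximation of a spiral scan are all assumptions made for the example.

```python
def sad_with_pde(cur, ref, bx, by, dx, dy, n, best):
    """Row-by-row SAD of an n x n block.

    Partial Distortion Elimination: stop as soon as the partial sum
    reaches the best SAD found so far, since this candidate cannot win.
    """
    total = 0
    for y in range(n):
        cur_row = cur[by + y]
        ref_row = ref[by + dy + y]
        for x in range(n):
            total += abs(cur_row[bx + x] - ref_row[bx + dx + x])
        if total >= best:
            return None  # candidate eliminated early
    return total


def spiral_offsets(search_range):
    """Displacements ordered ring by ring outward from (0, 0).

    Assumption: a ring ordering stands in for a true spiral scan. Visiting
    the centre first tends to find a small SAD early, which makes PDE
    reject later candidates sooner.
    """
    r = search_range
    offsets = [(dx, dy) for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
    return sorted(offsets, key=lambda d: (max(abs(d[0]), abs(d[1])), abs(d[0]) + abs(d[1])))


def full_search_pde(cur, ref, bx, by, n=4, search_range=2):
    """Return ((dx, dy), sad) of the best match for the block at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best_sad = float("inf")
    best_mv = (0, 0)
    for dx, dy in spiral_offsets(search_range):
        # Skip candidates that fall outside the reference frame.
        if bx + dx < 0 or by + dy < 0 or bx + dx + n > width or by + dy + n > height:
            continue
        sad = sad_with_pde(cur, ref, bx, by, dx, dy, n, best_sad)
        if sad is not None:
            best_sad, best_mv = sad, (dx, dy)
    return best_mv, best_sad
```

PDE never changes the result of the full search. It only skips the remaining rows of candidates that have already lost, so the saving depends on how early a good match is found.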
Texture Encoding Module • Interleaving DCT/IDCT schedule • DCT and IDCT are interleaved for the same block • Sub-structure sharing technique • Applied to the AC/DC prediction datapath and Q/IQ by extracting the common formula term
Sub-structure Sharing of Q/IQ and ACDC Prediction • Scaling operation: (QAC × QPA) / QPX • Share partial result (QAC × QPA = M) in IQ module • Share datapath of Q for M / QPX
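The scaling above can be written out as a two-step computation, which makes the shared terms visible: a multiply (the part reusable from the IQ datapath) followed by a divide (the part reusable from the Q datapath). This is a sketch only. The function names are hypothetical, and the rounding rule (nearest integer, halves away from zero) is an assumption about the intended integer division, not something stated on the slide.

```python
def div_round(num, den):
    """Integer division rounded to nearest, halves away from zero.

    Assumes den > 0. The rounding rule itself is an assumption.
    """
    sign = -1 if num < 0 else 1
    return sign * ((2 * abs(num) + den) // (2 * den))


def scale_ac_predictor(qac, qp_a, qp_x):
    """Rescale a neighbouring block's quantized AC coefficient.

    qac  : quantized AC coefficient taken from the neighbouring block
    qp_a : quantizer parameter of the neighbouring block (QPA)
    qp_x : quantizer parameter of the current block (QPX)
    """
    m = qac * qp_a              # partial result M, a multiply like the one in IQ
    return div_round(m, qp_x)   # M / QPX, a divide like the one in Q
```

For example, `scale_ac_predictor(10, 8, 16)` computes M = 80 and returns 5. When both blocks use the same quantizer parameter, the coefficient passes through unchanged.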
Subjective View • FFS (QP = 16, PSNR_Y = 32.4012, Bits = 9537) • PDS (QP = 16, PSNR_Y = 32.0256, Bits = 9465) • Worst case of PSNR drop (0.3962 dB) at the 69th frame
Conclusion • A cost-effective MPEG-4 video encoder is proposed • Hardware accelerators • A novel hybrid motion estimation architecture • A cost-effective texture block engine architecture • Platform-based system backbone • Compromise between flexibility and high performance • HW/SW co-design flow and tools
DCT/IDCT Coefficient Matrix • N=8 • Even-indexed rows are even symmetric • Odd-indexed rows are odd symmetric
1-D DCT and IDCT • 1-D DCT (Y = AX): preprocessing, then data reordering • 1-D IDCT (Y = AᵀX): data reordering, then postprocessing • 8 MAC operations per output reduced to 4
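The reduction from 8 to 4 MAC operations follows from the row symmetry of the coefficient matrix: A[k][7-n] = (-1)^k · A[k][n]. The sketch below illustrates this with an orthonormal 8-point DCT-II matrix in floating point. The function names are hypothetical, and the floating-point arithmetic is a simplification. The hardware described here would use fixed-point constants.

```python
import math

N = 8


def dct_matrix():
    """Orthonormal 8-point DCT-II matrix A, so that Y = A X."""
    rows = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        rows.append([scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                     for n in range(N)])
    return rows


A = dct_matrix()


def dct_direct(x):
    """Reference 1-D DCT: 8 MACs per output."""
    return [sum(A[k][n] * x[n] for n in range(N)) for k in range(N)]


def dct_even_odd(x):
    """1-D DCT with preprocessing butterflies: 4 MACs per output."""
    s = [x[n] + x[N - 1 - n] for n in range(N // 2)]  # feeds even-indexed rows
    d = [x[n] - x[N - 1 - n] for n in range(N // 2)]  # feeds odd-indexed rows
    y = []
    for k in range(N):
        half = s if k % 2 == 0 else d
        y.append(sum(A[k][n] * half[n] for n in range(N // 2)))
    return y


def idct_even_odd(y):
    """1-D IDCT (X = A^T Y) with postprocessing butterflies.

    Each pair of outputs shares 8 MACs, which is 4 MACs per output.
    """
    x = [0.0] * N
    for n in range(N // 2):
        even = sum(A[k][n] * y[k] for k in range(0, N, 2))
        odd = sum(A[k][n] * y[k] for k in range(1, N, 2))
        x[n] = even + odd            # postprocessing: add
        x[N - 1 - n] = even - odd    # postprocessing: subtract
    return x
```

The forward transform adds and subtracts mirrored inputs before multiplying, while the inverse multiplies first and then adds and subtracts. That is why one side needs preprocessing and the other postprocessing around the same multiply-accumulate core.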
DCT/IDCT Architecture • DRU (Data Reordering Unit) • Two parallel MACs • Multiplexing of the two 1-D operations • Preprocessing for DCT, postprocessing for IDCT
Multiplication of Constant Coefficients • Only 7 constant coefficients are used • Signed-digit representation • Minimum number of nonzero digits (1, -1) • Shift and add • Avoids a dedicated multiplier
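A signed-digit multiplier of this kind can be sketched as follows. The example assumes the canonical signed-digit (CSD) form, which has the minimum number of nonzero digits and never places two nonzero digits next to each other. The slide does not say which signed-digit encoding or fixed-point word length was used, so both the CSD choice and the function names here are assumptions.

```python
def to_csd(value):
    """Canonical signed-digit form of a non-negative integer.

    Returns digits in {-1, 0, 1}, least significant first.
    """
    digits = []
    while value != 0:
        if value & 1:
            # value % 4 == 1 gives +1, value % 4 == 3 gives -1.
            # Either way the remainder becomes divisible by 4,
            # which forces the next digit to be 0.
            digit = 2 - (value & 3)
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def multiply_shift_add(x, digits):
    """Multiply x by the constant encoded in digits using shifts and adds only."""
    acc = 0
    for shift, digit in enumerate(digits):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc
```

For example, the constant 7 is encoded as 8 - 1, so `multiply_shift_add(x, to_csd(7))` costs one shift and one subtraction instead of the two additions needed by the plain binary form 4 + 2 + 1.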
Power Consumption Estimation • Case 1 – 0.18μm • Case 2 – 0.18μm, 1/8 computational power • Case 3 – 0.18μm, 1/8 computational power, gated clock