Wavelet-based Region-of-Interest (RoI) Video Coding

Wavelet-based Region-of-Interest (RoI) Video Coding. Derek Pang, Huizhong Chen, Sherif Halawa EE398A Course Project Winter 2010 Stanford University. Outline. Existing work in IRoI video application Overview of our proposed solution Two coding schemes : 2D+T and T+2D


Presentation Transcript


  1. Wavelet-based Region-of-Interest (RoI) Video Coding. Derek Pang, Huizhong Chen, Sherif Halawa. EE398A Course Project, Winter 2010, Stanford University.

  2. Outline
     • Existing work in IRoI video application
     • Overview of our proposed solution
     • Two coding schemes: 2D+T and T+2D
     • RoI cropping and DWT filter selection
     • Modified entropy coders
        • Modified Zero-tree
        • Modified EBCOT
     • Experimental results
     • Future work / conclusion

  3. Related Work
     • “Optimal Slice Size for streaming regions of high resolution video with virtual pan/tilt/zoom functionality,” [Mavlankar et al., 2007]
        • High bitrate
     • “Compression-Aware Digital Pan/Tilt/Zoom,” [Makar et al., 2008]
        • High complexity
        • Low user scalability
     • ClassX (formerly viewXtreme) Project [Agrawal et al., 2008]
        • High complexity

  4. Proposed Solution: Wavelet-based RoI video coding
     • Spatial resolution scalability
     • Fast and direct RoI cropping support in the compressed domain
     • Low to moderate complexity
     • Scalable to a large number of users
     • R-D performance advantage over the current ClassX system on RoI video transmission

  5. Wavelet-Based RoI Video Coding
     Two coding schemes:
     • T+2D: RoI video → temporal prediction → residual → DWT → coder
     • 2D+T: RoI video → DWT → in-band prediction → residual → coder

  6. T+2D GOP Structure
     [Figure labels: Background; wavelet-based residual coding of frames A A P P P …; H.264 thumbnail frames T T T T T; prediction with motion vectors; prediction without motion vectors]

  7. T+2D System Overview
     [Block diagram labels: Input Video; Predictor; Residual; Transform / Quantization / Entropy Coder; Stored Bit Stream; Video Storage; Thumbnail Store; Anchor; Background Frame Buffer; Prediction; Entropy Decoder / Inverse Quantization / Inverse Transform (two instances); RoI Request; RoI Bit Stream Assembler; RoI Bit Stream; Decoded RoI Video]

  8. 2D+T GOP Structure
     [Figure labels: Background; wavelet-based residual coding of frames A A P P P …; H.264 LL band frames T T T T T; in-band prediction with motion vectors; in-band prediction without motion vectors]

  9. RoI Extraction in the Wavelet Domain
     • An RoI mask is generated for RoI extraction
     • Mask size depends on the length of the synthesis filters and the number of DWT levels
     [Figure: mask generation, mapping the RoI from the spatial domain to the wavelet domain]
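The dependence of mask size on filter length and DWT depth can be sketched in one dimension. The function name `roi_band_ranges`, its parameters, and the symmetric padding rule for filters longer than Haar are assumptions made for illustration and are not taken from the slides. For the 2-tap Haar filter the padding is zero and the ranges are exact; for longer filters the padding is only a rough margin, not a derived bound.

```python
def roi_band_ranges(x0, x1, size, levels, filter_len=2):
    """Per-level coefficient ranges [lo, hi) needed to rebuild samples [x0, x1).

    size       -- length of the full signal (assumed divisible by 2**levels)
    filter_len -- synthesis filter length; 2 corresponds to Haar
    """
    # Assumption: a longer filter needs (filter_len - 2) // 2 extra
    # coefficients on each side at every level. For Haar this is 0.
    pad = max(0, (filter_len - 2) // 2)
    ranges = []
    lo, hi = x0, x1
    for level in range(1, levels + 1):
        band_len = size >> level            # band length at this level
        lo = max(0, lo // 2 - pad)          # floor of the left edge
        hi = min(band_len, (hi + 1) // 2 + pad)  # ceil of the right edge
        ranges.append((lo, hi))
    return ranges


# A 256-sample RoI centred in a 512-sample row, 3 Haar levels.
print(roi_band_ranges(128, 384, 512, 3))
```

A 2D mask would apply the same range computation to rows and columns separately and take the product region in each subband. Because the padding is added at every level, the mask grows with both filter length and decomposition depth, which is the behaviour the slide describes.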

  10. DWT Filter Selection
      • Mathematical analysis (refer to report)
      • Experimental analysis
      Filter selected: Haar
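One property that makes Haar convenient for RoI cropping is that each pair of output samples depends on exactly one lowpass and one highpass coefficient, so a cropped coefficient range reconstructs its samples without any neighbours. The sketch below uses an unnormalized average/difference form of the Haar transform; that normalization and the function names are assumptions, since the slides only name the filter.

```python
def haar_forward(x):
    """One-level Haar analysis of an even-length sequence (average/difference form)."""
    half = len(x) // 2
    low = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(half)]
    high = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(half)]
    return low, high


def haar_inverse(low, high):
    """One-level Haar synthesis: each (low, high) pair yields two samples."""
    out = []
    for l, h in zip(low, high):
        out.extend([l + h, l - h])
    return out


signal = list(range(16))
low, high = haar_forward(signal)

# Samples 4..11 map to coefficient indices 2..5 in both bands.
roi = haar_inverse(low[2:6], high[2:6])
print(roi == signal[4:12])
```

With a longer synthesis filter the same crop would need extra coefficients on both sides of the range, which is why the mask size depends on the filter length.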

  11. Entropy Coder - Modified EBCOT
      • Uniform quantizer
      • MQ-Coder
      • Bitplane coder with the normal 3 coding passes (significance propagation, magnitude refinement, clean-up)
      Four major modifications:
      • Post-compression RoI extraction
      • Variable block-size coding
      • On-demand block size selection
      • Fast zero-block coding
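Two of the listed ingredients, the uniform quantizer and fast zero-block coding, can be illustrated with a small sketch. The slides do not give the exact quantizer or the zero-block signalling, so the truncating quantizer, the helper names, and the idea of skipping the bitplane passes for an all-zero block are assumptions. The MQ-coder and the three coding passes are not modelled here.

```python
def quantize(block, step):
    """Uniform quantizer; truncation toward zero is an assumption."""
    return [[int(c / step) for c in row] for row in block]


def is_zero_block(qblock):
    """True when every quantized coefficient in the block is zero."""
    return all(v == 0 for row in qblock for v in row)


def bitplanes(qblock):
    """Magnitude bitplanes in raster order, most significant plane first."""
    mags = [abs(v) for row in qblock for v in row]
    n_planes = max(mags).bit_length()
    return [[(m >> p) & 1 for m in mags] for p in range(n_planes - 1, -1, -1)]


def code_block(block, step):
    """Hypothetical front end: one flag for zero blocks, bitplanes otherwise."""
    q = quantize(block, step)
    if is_zero_block(q):
        return {"zero_block": True, "planes": []}
    return {"zero_block": False, "planes": bitplanes(q)}


print(code_block([[0.4, -0.3], [0.2, 0.1]], 1.0))
print(code_block([[5.2, -3.0], [0.0, 1.9]], 1.0))
```

The intended saving is that a block of small residual coefficients costs a single flag instead of a full set of coding passes.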

  12. Modified EBCOT – Block Structure

  13. Modified Zero-tree
      • Exploits redundancy between decomposition levels by clustering quantized coefficients into “trees”
      • Zero-trees (ZTs): trees whose coefficients are all zero
      • ZTs are abundant in wavelet-decomposed residuals
      • ZTs encode a large number of zeros using very few bits
      [Figure: pyramid view of quantized coefficients; tree of base-length 4 (mode-4); division into 4 mode-2 trees]
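The tree structure can be sketched with the usual parent/child relation between decomposition levels, in which the coefficient at (r, c) on one level has four children at (2r, 2c), (2r, 2c+1), (2r+1, 2c) and (2r+1, 2c+1) on the next finer level. The pyramid representation (a list of 2D lists, coarsest first) and the function name are assumptions; the slides do not specify the data layout.

```python
def is_zero_tree(pyramid, r, c, depth=0):
    """True if the coefficient at (r, c) on level `depth` and all of its
    descendants on finer levels are zero.

    pyramid -- list of 2D lists for one subband orientation, coarsest first;
               each level is assumed to be twice the size of the previous one.
    """
    if pyramid[depth][r][c] != 0:
        return False
    if depth + 1 == len(pyramid):
        return True  # finest level reached, nothing below
    return all(
        is_zero_tree(pyramid, 2 * r + dr, 2 * c + dc, depth + 1)
        for dr in (0, 1)
        for dc in (0, 1)
    )


pyramid = [
    [[0, 0]],                      # coarsest level, 1 x 2
    [[0, 0, 0, 0], [0, 0, 0, 1]],  # next finer level, 2 x 4
]
print(is_zero_tree(pyramid, 0, 0))  # left root: every descendant is zero
print(is_zero_tree(pyramid, 0, 1))  # right root: one non-zero descendant
```

A coder can then signal a whole zero-tree with one symbol instead of coding each zero separately, which is where the bit saving comes from.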

  14. Joint Entropy-Quantizer RD Optimization
      [Flow diagram: the input coefficient tree is passed through each of N candidate quantizers (Quantizer 1, Quantizer 2, …, Quantizer N); each branch performs Huffman codebook selection and computes a Lagrangian cost; a final decision stage chooses among the branches]
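The joint selection can be sketched as an exhaustive search that minimizes a Lagrangian cost J = D + λR over every (quantizer, codebook) pair. In this sketch the distortion is the squared reconstruction error, the rate is the sum of Huffman code lengths, and each codebook is a plain mapping from quantized symbol to code length. The function name, the codebook representation, and the example values are hypothetical; the slides do not state how distortion or rate were measured.

```python
def rd_select(coeffs, steps, codebooks, lam):
    """Return (cost, quantizer_index, codebook_index) with the lowest D + lam * R.

    steps     -- candidate quantizer step sizes
    codebooks -- list of dicts mapping quantized symbol -> code length in bits
    """
    best = None
    for qi, step in enumerate(steps):
        q = [round(c / step) for c in coeffs]
        dist = sum((c - v * step) ** 2 for c, v in zip(coeffs, q))
        for ci, book in enumerate(codebooks):
            if any(v not in book for v in q):
                continue  # this codebook cannot represent the symbols
            rate = sum(book[v] for v in q)
            cost = dist + lam * rate
            if best is None or cost < best[0]:
                best = (cost, qi, ci)
    return best


coeffs = [0.0, 4.0, 0.0, -4.0]
steps = [1.0, 4.0]
codebooks = [
    {-4: 6, -1: 3, 0: 1, 1: 3, 4: 6},  # wide-range codebook
    {-1: 2, 0: 1, 1: 2},               # short codebook for small symbols
]
print(rd_select(coeffs, steps, codebooks, lam=1.0))
```

Here the coarse quantizer paired with the short codebook wins because it reaches the same distortion at a lower rate.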

  15. M-ZT – RD Optimized Mode Selection
      [Flow diagram: the input coefficient tree (mode-32) goes through R-D optimized quantizer/codebook (Q/CB) selection, followed by mode selection. The tree is also divided into mode-16 trees, each with its own R-D optimized Q/CB selection and mode selection, and each mode-16 tree is divided into mode-8 trees with R-D optimized Q/CB selection]
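The mode hierarchy in the diagram resembles a recursive split decision: code a tree whole, or divide it into four smaller trees if that is cheaper. The sketch below is one plausible reading, not the authors' algorithm. The function `select_modes`, the square-block representation, and the toy cost function are hypothetical, and any bits needed to signal the chosen mode are ignored.

```python
def select_modes(block, cost_fn, min_size=8):
    """Return (cost, layout) for a square block.

    layout is the block size if the block is coded whole, or a list of four
    sub-layouts (top-left, top-right, bottom-left, bottom-right) if split.
    cost_fn stands in for the R-D optimized Q/CB selection of one tree.
    """
    n = len(block)
    whole = cost_fn(block)
    if n <= min_size:
        return whole, n
    h = n // 2
    quads = [[row[c:c + h] for row in block[r:r + h]]
             for r in (0, h) for c in (0, h)]
    results = [select_modes(q, cost_fn, min_size) for q in quads]
    split = sum(cost for cost, _ in results)
    if split < whole:
        return split, [layout for _, layout in results]
    return whole, n


def toy_cost(block):
    """Toy cost: 1 for an all-zero block, else one unit per coefficient."""
    if all(v == 0 for row in block for v in row):
        return 1
    return len(block) * len(block[0])


block = [[0] * 32 for _ in range(32)]
block[0][0] = 7  # a single non-zero coefficient in the top-left corner
print(select_modes(block, toy_cost))
```

With this toy cost, only the quadrant containing the non-zero coefficient is split down to mode-8, while the three all-zero quadrants stay at mode-16.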

  16. Experiment Setup
      • 512 x 512 Osgood sequence
      • 90 frames
      • Reference benchmark
      • ClassX system
      • One resolution layer
      • Four 256 x 256 slices
      • 256 x 256 RoI
      • Our system
      • 3 decomposition levels
      • M-EBCOT block sizes: 4 to 64
      • M-ZT block sizes: 8 to 32
      [Figure: Osgood sequence]

  17. Entropy Coders Performance Comparison

  18. R-D Performance for Full Video Frame
      [Plot; annotated gap: ~2 dB]

  19. R-D Performance for 256 x 256 RoI
      [Plot; annotated gap: ~2 dB]

  20. Visual Quality Comparison
      • Low quality (T+2D EBCOT): 262 kbps, 33.27 dB
      • High quality (T+2D EBCOT): 583 kbps, 40.02 dB

  21. Conclusion
      • Proposed a novel RoI video coding scheme:
         • based on wavelet coding
         • fast RoI cropping in the compressed domain
         • improved entropy coders for RoI applications
         • achieved moderate performance gain
      • Possible future work
         • Detailed analysis and testing
         • Modified zero-tree bit plane coding
         • Complexity evaluation

  22. Acknowledgement
      • Aditya Mavlankar
      • Mina Makar
      • Piyush Agrawal
      • Ngai-Man Cheung
      • Professor Bernd Girod

  23. Questions? Thank you.

  24. 2D+T System Overview
      [Block diagram labels: Input Video; Transform; Highpass Bands; Lowpass Band; Residual; Quantization; Entropy Encoder; Anchor; Background Frame Buffer; H.264; Entropy Decoder; Stored Bit Stream; Video Storage; Entropy Decoder / Inverse Quantization; RoI Bit Stream Assembler; RoI Bit Stream; Inverse Transform; RoI Request; Decoded RoI Video]

  25. Modified EBCOT Coder
