A Pragmatic Spatial-Random-Access-Enabled Video Coding Scheme Piyush Agrawal EE398A – Project Presentation
High resolution video - challenges • Rise of high resolution videos • Better digital imaging sensors • Increasing storage capacity • Algorithms and systems for stitching ultra-high resolution videos using multiple cameras • Challenges in using such videos • Lack of network bandwidth • Lack of high resolution display screens • Solution: Interactive Region-of-Interest video streaming
Agenda • Spatial-random-access enabled video coding • Related work • Proposed schemes • Experimental results • Discussion on pros and cons of different schemes
Spatial-random-access enabled video coding – why? • One way of providing interactive region-of-interest streaming • Decode entire high resolution video on the fly, for each user • Crop relevant part of the high resolution frame • Encode the relevant part again and transmit • Drawbacks • Multiple encodings required • Not scalable with increasing no. of simultaneous viewers • Required • A scheme which performs encoding only once and then can serve any user, any no. of times
Related work – Eusipco’07 Source: Aditya Mavlankar, Pierpaolo Baccichet, David Varodayan, and Bernd Girod. Optimal slice size for streaming regions of high resolution video with virtual pan/tilt/zoom functionality. In Proc. 15th European Signal Processing Conference (EUSIPCO07), pages 1275–1279, 2007.
Related work – PCS’09 Source: Aditya Mavlankar, Peer-to-Peer video streaming with interactive region-of-interest, PhD Dissertation (unpublished) • 85% reduction in storage requirements compared to the Eusipco’07 scheme
Drawbacks • Not compliant with any video coding standard • Require custom encoder and decoder • Decoder complexity • Cross-resolution-layer dependencies • Scaling operation for rendering each frame • Difficult to implement multi-thread/process/CPU parallel encoder for real-time encoding • Entire frame processed as a whole
How to detect static segments? • Consecutive frame differencing • Calculate mean pixel difference value • If below a fixed threshold, declare as static • Smoothing • Video shot in bad lighting conditions – too much noise • Leads to high frame difference even with no “actual” motion • Apply Gaussian smoothing filter to each tested frame • How to find the fixed threshold • MSE of 1 gives PSNR ≈ 48 dB (8-bit samples) • Consider two consecutive frames as original and reconstructed signal respectively • PSNR of 48 dB means the two signals look alike, i.e., no motion between the two frames • Other ideas • Structural Similarity Index Measure (SSIM)
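The detection steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the presenter's implementation: the 3x3 kernel, the edge handling, the use of MSE as the difference measure, and the function names (`smooth`, `mse`, `psnr`, `is_static`) are all assumptions. Frames are plain 2-D lists of 8-bit luma values.

```python
import math

# Assumed 3x3 Gaussian kernel; the slides do not state the filter size.
KERNEL = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
KERNEL_SUM = 16


def smooth(frame):
    """Blur a 2-D list of luma values with the 3x3 kernel, replicating edges."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to frame rows
                    xx = min(max(x + dx, 0), w - 1)  # clamp to frame columns
                    acc += KERNEL[dy + 1][dx + 1] * frame[yy][xx]
            out[y][x] = acc / KERNEL_SUM
    return out


def mse(a, b):
    """Mean squared difference between two equally sized frames."""
    h, w = len(a), len(a[0])
    total = sum((a[y][x] - b[y][x]) ** 2 for y in range(h) for x in range(w))
    return total / (h * w)


def psnr(mse_value, peak=255.0):
    """PSNR in dB for the given MSE; infinite when the frames are identical."""
    if mse_value == 0:
        return math.inf
    return 10 * math.log10(peak * peak / mse_value)


def is_static(prev_frame, cur_frame, mse_threshold=1.0):
    """Treat the pair as static when the smoothed MSE is below the threshold."""
    return mse(smooth(prev_frame), smooth(cur_frame)) < mse_threshold
```

With 8-bit samples, `psnr(1.0)` evaluates to roughly 48.13 dB, which matches the threshold reasoning on the slide. Smoothing before differencing means a single noisy pixel is spread over its neighbours and contributes far less to the MSE than real motion over a larger area.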
Experimental setup • Compare 4 schemes • ViewXtreme • ViewXtreme Adaptive Skip • UpwardPredictionOnly (EUSIPCO’07) • BE-LTMMCP (PCS’09) • Test video: 600 frames, classroom scene • Results only for highest resolution layer (1920x1080) • Slice size: 480x270 pixels • QP for base layer = 27 • Affects performance of BE-LTMMCP and UpwardPredictionOnly schemes • GOP size = 30 frames • Affects performance of ViewXtreme and ViewXtreme Adaptive Skip schemes • Encoded video: 30 frames per second
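With 480x270 slices on a 1920x1080 layer, each frame splits into a 4x4 grid of 16 slices. As a rough sketch of how a client might work out which slices cover a requested region of interest, the hypothetical helper below maps a pixel rectangle to slice indices. The function name and the (column, row) indexing are assumptions for illustration; the slides do not describe the lookup.

```python
def slices_for_roi(x, y, w, h, frame_w=1920, frame_h=1080,
                   slice_w=480, slice_h=270):
    """Return (col, row) indices of every slice overlapping the ROI rectangle."""
    if w <= 0 or h <= 0:
        raise ValueError("ROI width and height must be positive")
    # Clip the ROI to the frame so out-of-range requests do not index past it.
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, frame_w), min(y + h, frame_h)
    if x0 >= x1 or y0 >= y1:
        return []  # ROI lies entirely outside the frame
    first_col, last_col = x0 // slice_w, (x1 - 1) // slice_w
    first_row, last_row = y0 // slice_h, (y1 - 1) // slice_h
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]
```

An ROI aligned to the slice grid touches a single slice, while the same-sized ROI placed off-grid, for example `slices_for_roi(400, 200, 480, 270)`, touches four. This is the usual trade-off behind the choice of slice size: smaller slices waste fewer pixels outside the ROI but cost more in coding overhead.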
Encoding speed • Encoding done on a quad-core machine with 4 GB RAM
Pros and cons of proposed schemes • Pros • Standard-compliant encoder and decoder • Simplified decoder • Highly parallel encoder possible using off-the-shelf encoding tools • Significantly better (~66%) coding efficiency, reducing the network bandwidth required • Cons • Expected to provide lower degree of spatial-temporal random access as compared to other schemes • Use of motion-compensated prediction coding • Can we confirm this?
A deeper dive into random access • Logical operations performed to render a random frame • Step 1: Download bits required to decode the single random frame • Step 2: Decode bits and create the reconstructed frame in memory • Step 3: Render reconstructed frame on the client’s display • Rendering of reconstructed frame (step 3) – independent of coding scheme – can be ignored • ViewXtreme and BE-LTMMCP schemes differ in steps 1 and 2
Differences • BE-LTMMCP • Each frame independent of every other frame (on same resolution layer) • Only the encoded bits of the single random frame need to be downloaded • Decoding of single frame needed • Mean size of a random frame can be estimated from bits per pixel for different quality levels • ViewXtreme • No. of bits to be downloaded depends on GOP structure
ViewXtreme – dependence on GOP • Frames to be downloaded • Frame 1: 1 I-frame • Frame 4: 1 I-frame + 2 P-frames + 1 B-frame • Frame 6: 1 I-frame + 4 P-frames + 1 B-frame • All frames of a GOP equally likely to be requested • Estimate no. of bits to be downloaded if median frame is requested • No B-frames used in experiments, GOP size = 30 frames • For 15th frame: 1 I-frame + 14 P-frames required • Mean size of a single I-frame and P-frame measured in experiments
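The GOP dependence for the no-B-frame case can be written down directly: decoding the frame at position k needs the I-frame plus the k-1 P-frames before it. The sketch below is a hedged illustration only. The function names are hypothetical, and `i_bits` and `p_bits` stand in for the mean I-frame and P-frame sizes measured in the experiments, whose values are not given in the slide text.

```python
def frames_needed(position):
    """Frames to fetch for the frame at 1-based `position` in an I-P-P-... GOP."""
    if position < 1:
        raise ValueError("position is 1-based")
    return {"I": 1, "P": position - 1}


def bits_needed(position, i_bits, p_bits):
    """Download cost for one random frame, given mean I- and P-frame sizes."""
    need = frames_needed(position)
    return need["I"] * i_bits + need["P"] * p_bits


def mean_bits(gop_size, i_bits, p_bits):
    """Average download cost when every GOP position is equally likely."""
    total = sum(bits_needed(k, i_bits, p_bits) for k in range(1, gop_size + 1))
    return total / gop_size
```

For a GOP of 30 frames, `frames_needed(15)` gives 1 I-frame and 14 P-frames, as on the slide. Averaging over all 30 positions gives 14.5 P-frames per request, so the 15th-frame estimate is a close stand-in for the mean.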
Effect of decoding multiple frames • For a random frame to be displayed • On average, 15 frames to be decoded (GOP size = 30) • Benchmark on a single-core (2.4 GHz) client • Decoding rate up to 500 fps for a 480x270 pixel video encoded using H.264 • Time needed to decode 15 frames = 15 × 1/500 seconds = 30 msec • Less than inter-frame interval of 33 msec (for playing video at 30 fps) • Decoding time negligible compared to data download time • Conclusion: ViewXtreme scheme provides higher degree of spatial-temporal random access
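The decoding-time argument is simple arithmetic and can be checked with a small sketch. The helper names are hypothetical; the 500 fps decode rate and 30 fps playback rate are the figures quoted on the slide.

```python
def decode_time_ms(frames, decode_fps):
    """Time in milliseconds to decode `frames` frames at `decode_fps`."""
    return 1000.0 * frames / decode_fps


def frame_interval_ms(playback_fps):
    """Time in milliseconds between displayed frames at `playback_fps`."""
    return 1000.0 / playback_fps


def fits_in_frame_interval(frames, decode_fps, playback_fps):
    """True when the decode burst finishes within one playback interval."""
    return decode_time_ms(frames, decode_fps) < frame_interval_ms(playback_fps)
```

`decode_time_ms(15, 500)` is 30 ms, inside the roughly 33 ms interval at 30 fps. By the same arithmetic, a request for the last frame of a 30-frame GOP would take 60 ms at this rate, so the single-interval bound holds for the average case rather than for every position.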
Conclusions • Proposed 2 new coding schemes for spatial-random-access • Compared with two state-of-the-art schemes • Showed that the proposed schemes outperform other schemes in terms of • Coding efficiency • Standard compliance • Encoder and decoder complexity • Degree of spatial-temporal random access • Future work • Better ways of detecting static segments • Better architectural designs for encoders running on commodity machines
Acknowledgements • Prof. Bernd Girod • Mina Makar • Aditya Mavlankar • Derek Pang Thank You!