This paper explores a practical and scalable approach for transmitting segmented video sequences to multiple players using H.264, with a focus on encoding and decoding techniques. The goal is to efficiently distribute personalized viewports to a large number of players, enabling applications such as omnidirectional video, geospatial data visualization, and more.
Practical and Scalable Transmission of Segmented Video Sequences to Multiple Players using H.264
Fabian Di Fiore, Panagiotis Issaris
Expertise Centre for Digital Media, Hasselt University, Belgium
{fabian.difiore, takis.issaris}@uhasselt.be
Outline: Introduction • Approach • Measurement Results • Conclusions

Introduction: motivation
• The integration of multimedia streams of various kinds is an important feature of many of today's games
• It should not be limited to inter-person communication only
• Many other interesting applications are waiting to be served:
  • omnidirectional video: you control the camera and, hence, the viewport
  • replacing traditional computer-generated background images with video sequences: the viewport depends on your position in a platform game
  • geospatial data visualization applications (battlefield, Google Earth)
• Hence the need for distributing personal viewports on the same video sequence to large numbers of players
Introduction: contribution
• Video consists of sequences of frames
• Let's focus on one such frame
• We'd like to have multiple viewports ...
• ... of possibly different sizes
• ... and possibly moving
• ... and then distribute them (to heterogeneous clients)
How can we accomplish this? (overview)
• System serving video
  1) Decode, then for each client:
     • crop the required viewing area
     • encode it
     • send it in compressed form to the client
  2) Encode once (H.264), then for each client:
     • cut out the required viewing area
     • send it to the client
Encode for each client
• Extremely CPU-intensive and thus not scalable
• Possible loss of quality (if the source was already compressed)
H.264: general
• Video encoder
  • input: a sequence of frames
  • output: a smaller bitstream conforming to a specification
• Block-based video codecs
  • divide each input frame into a grid structure
  • each individual area is called a macroblock
  • the resulting output bitstream consists of a highly compressed representation of these macroblocks
  • the compressed representation of each macroblock has variable length: a bit error in, for example, the first macroblock can make every macroblock in the frame undecodable
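The grid structure described above can be illustrated with a small sketch. It assumes 16 x 16 macroblocks numbered in raster-scan order (left to right, top to bottom); the function names and the 1920 x 1080 frame size are chosen for illustration only and do not come from the slides.

```python
MB_SIZE = 16  # a macroblock covers 16 x 16 luma samples


def macroblock_grid(width, height, mb_size=MB_SIZE):
    """Return (columns, rows) of the macroblock grid covering a frame."""
    cols = -(-width // mb_size)   # ceiling division
    rows = -(-height // mb_size)
    return cols, rows


def macroblock_address(x, y, width, mb_size=MB_SIZE):
    """Raster-scan index of the macroblock that contains pixel (x, y)."""
    cols = -(-width // mb_size)
    return (y // mb_size) * cols + (x // mb_size)


# Illustrative frame size (an assumption, not taken from the slides).
print(macroblock_grid(1920, 1080))        # (120, 68)
print(macroblock_address(16, 16, 1920))   # 121: second row, second column
```

Because every macroblock is addressed by its position in this raster order, any operation that changes the grid (for example cropping) changes the addresses as well.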
H.264
• also known as MPEG-4 AVC
• most recent ITU & MPEG video coding standard
• widely used (Blu-ray, DVB-S2, QuickTime, ...)
• provides a feature called a 'slice', which allows macroblocks to be grouped
  • each slice becomes individually encodable and decodable
  • a coded video bitstream can have a very simple structure using one slice per frame
  • however, having multiple slices is advantageous for parallel processing and for network transmission: errors in one slice will not prevent the correct decoding of the other slices
Cut out the selected area: regular slice subdivision
• Cut out the video ...
• ... by regarding the frame as being composed of many independent areas ...
  • use regular slice subdivision over all frames
• ... and then, for each client, select the areas enclosing the desired viewport
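The selection step can be sketched as follows. This is a minimal illustration, assuming a regular grid in which every slice is one macroblock row high and `slice_w_mbs` macroblocks wide, with slices numbered in raster order; the function name, parameters and frame dimensions are hypothetical.

```python
MB = 16  # macroblock edge length in pixels


def slices_for_viewport(frame_w, frame_h, slice_w_mbs, x, y, w, h):
    """Return raster-ordered indices of the slices overlapping a viewport.

    The viewport is the pixel rectangle with top-left corner (x, y),
    width w and height h. It is clamped to the frame first.
    """
    cols = -(-frame_w // MB)             # macroblock columns (ceiling)
    per_row = -(-cols // slice_w_mbs)    # slices in one macroblock row
    slice_px = slice_w_mbs * MB          # slice width in pixels

    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(frame_w, x + w), min(frame_h, y + h)
    if x0 >= x1 or y0 >= y1:
        return []                        # viewport lies outside the frame

    first_col, last_col = x0 // slice_px, (x1 - 1) // slice_px
    first_row, last_row = y0 // MB, (y1 - 1) // MB
    return [r * per_row + c
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]


# Hypothetical 768 x 64 frame cut into 16 x 192 pixel slices (4 per row).
print(slices_for_viewport(768, 64, 12, 200, 10, 100, 30))  # [1, 5, 9]
```

Only the returned slices need to be forwarded to that client; what happens to the remaining slices is the subject of the following slides.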
Cut out the selected area: removing unnecessary slices
• Deleting one slice affects all others
  • macroblock numbers must be renumbered, so the remaining slices need to be altered
  • this is dangerous because of variable-length coding: it implies bit shifts
• Moving view frustum
  • the renumbering of macroblocks caused by cropping differs before and after the cropping window moves, so the motion compensation process in the decoder starts using the wrong image data to reconstruct the image
• The video frame size could change, which causes decoder issues
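The renumbering problem can be made concrete with a small sketch. It assumes macroblocks are numbered in raster order within whatever frame the decoder sees, so a cropped frame numbers its macroblocks relative to the crop window; the function name and the grid sizes are hypothetical.

```python
def cropped_address(mb_col, mb_row, win_col, win_row, win_cols):
    """Raster-scan address of a macroblock inside a cropped frame.

    (mb_col, mb_row) is the macroblock's position in the full frame,
    (win_col, win_row) is the top-left macroblock of the crop window,
    and win_cols is the window width in macroblocks.
    """
    return (mb_row - win_row) * win_cols + (mb_col - win_col)


# The same macroblock of the full frame, seen through a 4-column window.
before = cropped_address(5, 1, win_col=2, win_row=0, win_cols=4)
after = cropped_address(5, 1, win_col=3, win_row=0, win_cols=4)
print(before, after)  # 7 6
```

The same image content receives address 7 before the window moves and address 6 afterwards, which is why a decoder predicting from the earlier cropped frame would fetch the wrong image data.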
Cut out the selected area: replacing unnecessary slices
• Replace each unneeded slice by an artificially generated slice of minimal size
  • no renumbering of macroblocks
  • no issues of bit shifting
  • no header modifications needed
  • larger decoded picture buffer size
• Replacement slices are generated on the fly
  • intra-frames (~ I frames): gray blocks
  • inter-frames (~ P frames): only skip bits
• Macroblocks of consecutive 'empty' slices are grouped together into a single empty slice to avoid the overhead of the slices' headers
Cut out the selected area: replacing slices
• Replacing slices empties part of a frame
• Grouping empty slices
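The grouping of consecutive empty slices can be sketched as a simple run-length pass over the slice indices of one frame. This is an illustration only: it plans which slices to keep and which runs to replace, and does not generate any H.264 bitstream. The function name and the tuple layout are hypothetical.

```python
def plan_frame(num_slices, needed):
    """Plan one frame: keep needed slices, merge runs of unneeded ones.

    Returns a list of (kind, first, last) tuples in raster order, where
    kind is "keep" for an original slice and "empty" for a single
    artificial slice covering the original slices first..last.
    """
    plan = []
    run_start = None                     # start of the current empty run
    for i in range(num_slices):
        if i in needed:
            if run_start is not None:
                plan.append(("empty", run_start, i - 1))
                run_start = None
            plan.append(("keep", i, i))
        elif run_start is None:
            run_start = i
    if run_start is not None:
        plan.append(("empty", run_start, num_slices - 1))
    return plan


# 16 slices per frame, of which slices 1, 5 and 9 cover the viewport.
for entry in plan_frame(16, {1, 5, 9}):
    print(entry)
```

In this example the 13 unneeded slices collapse into 4 empty slices, so the client receives 7 slice headers per frame instead of 16.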
Concerns: frame dependency
• Motion vectors can point to a replaced slice (i.e., a gray area) in a previous frame
• Solution: restrict motion vectors to the current slice area
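The restriction can be pictured as a bound on each motion vector. The sketch below is a simplification under stated assumptions: it uses whole-pixel vectors and clamps a candidate vector after the fact, whereas a real encoder would more likely limit its motion search range and would also have to account for sub-pixel interpolation. The function name and rectangle layout are hypothetical.

```python
def clamp_motion_vector(bx, by, bsize, mv, slice_rect):
    """Clamp a motion vector so the referenced block stays in the slice.

    (bx, by) is the top-left pixel of the current block, bsize its edge
    length, mv = (dx, dy) the candidate vector in whole pixels, and
    slice_rect = (sx, sy, sw, sh) the slice area in pixels.
    """
    sx, sy, sw, sh = slice_rect
    dx, dy = mv
    min_dx, max_dx = sx - bx, sx + sw - bsize - bx
    min_dy, max_dy = sy - by, sy + sh - bsize - by
    return (max(min_dx, min(dx, max_dx)),
            max(min_dy, min(dy, max_dy)))


# A 16 x 16 block in a 192 x 16 pixel slice starting at (192, 16).
print(clamp_motion_vector(208, 16, 16, (-40, 3), (192, 16, 192, 16)))
# (-16, 0)
```

With slices that are only one macroblock row high, the vertical component is always forced to zero, so only horizontal motion inside the slice remains available for prediction.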
Concerns: grid structure
• H.264 forces the macroblocks grouped in a slice to be consecutive in raster-scan order
• To remain rectangular areas of the grid, slices therefore cannot span more than one row of macroblocks
  • macroblock: 16 x 16
  • rectangular slices of 16 x ... (e.g., 16 x 192)
• FMO (Flexible Macroblock Ordering)
  • not yet supported by real-time decoders
Concerns: prediction artifacts
• Moving the view frustum might cause motion compensation to use data we cropped out in previous frames
• Sudden large movements
  • we receive predicted slices for which we do not have the basis of the prediction (only gray artificial slices)
• Solution: send out more slices, located around the viewport
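Sending extra slices around the viewport amounts to growing the selected slice range by a margin. The sketch below assumes the viewport has already been mapped to a range of slice columns and rows on a regular grid; the function name, the `margin` parameter and its default of one slice are hypothetical choices, not values from the slides.

```python
def expand_selection(first_col, last_col, first_row, last_row,
                     per_row, rows, margin=1):
    """Grow a slice selection by `margin` slices on every side.

    Columns and rows are slice-grid coordinates; per_row and rows give
    the grid size. The result is clamped to the grid and returned as
    raster-ordered slice indices.
    """
    c0 = max(0, first_col - margin)
    c1 = min(per_row - 1, last_col + margin)
    r0 = max(0, first_row - margin)
    r1 = min(rows - 1, last_row + margin)
    return [r * per_row + c
            for r in range(r0, r1 + 1)
            for c in range(c0, c1 + 1)]


# 4 x 4 slice grid; the viewport covers column 1, rows 0 to 2.
print(expand_selection(1, 1, 0, 2, per_row=4, rows=4))
# [0, 1, 2, 4, 5, 6, 8, 9, 10, 12, 13, 14]
```

A larger margin tolerates faster viewport movement at the cost of bandwidth. Because slices are much wider than they are high, a margin of one slice protects far more horizontal than vertical movement, so separate horizontal and vertical margins may be worth considering.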
Measurement results and comparison: settings
• Evaluation of replacing unneeded slices by empty ones
• Comparisons are made against the official reference encoder (JM)
• Our implementation is built on the reference encoder, as it is the only encoder fully implementing H.264
• No specific decoder needed
Measurement results and comparison
• Panoramic video sequence
• Input stream
• 2 clients
• Actual video stream sent to a third client
Measurement results and comparison: grid structure overhead
• Grid structure overhead caused by using 512 slices per frame
Measurement results and comparison: bitrate reduction
• Bitrate reduction using slice replacement
  • purple: one slice per frame (i.e., full frame)
  • green: 512 slices
  • red: slices replaced
• Using FMO and an optimized encoder is expected to have an even bigger impact
Measurement results and comparison: bitrate reduction
• Bitrate reduction using slice replacement (low vs. high quality)
  • green: unaltered
  • blue: inter slices replaced
  • orange/yellow: all slices replaced
  • red: replaced and grouped
Conclusion and future work
• Practical way to distribute parts of a video sequence to large numbers of users
• Conforming to the H.264 standard
• Frame extraction directly from the compressed bitstream
• Eliminates the need for additional decoding and encoding
• The overhead associated with the slice structure is mitigated by the bandwidth reduction in transmitting high-quality video streams
• Additional benefit will be gained when
  - FMO is practically usable (encoder and decoder)
  - an optimized encoder is used (instead of the reference encoder)