This document provides a response to proposals for removing FMO and ASO from the Baseline. It includes an analysis of the cost, implementation, and memory requirements of FMO and ASO, as well as the arguments for their inclusion.
Our Response to Proposals to Remove FMO and Arbitrary Slices (ASO) from Baseline
Stephan Wenger (Teles AG) stewe@cs.tu-berlin.de
Michael Horowitz (Polycom Inc.) mhorowitz@austin.polycom.com
Relevant Documents
• All documents enjoy the support of Broadcom, Motorola, CableLabs, Scientific-Atlanta, and LSI Logic
• JVT-D115r1.doc (by Iole Moccagatta)
  • Good description, though no concrete results
• JVT-D121.doc (by Yasser Syed)
  • A collection of reflector emails
  • We do not feel that this contribution adds any value beyond the reflector comments, which were addressed in a timely manner.
• JVT-D133.doc (by Yasser Syed, late submission)
  • One-page proposal to move the Error Resilience Tools (including FMO) to a new Profile, with no arguments given
• The technical arguments are all included in JVT-D115r1.doc, hence we comment on that one.
Stephan Wenger / KBS / TU Berlin
Why Error Resilience in Baseline
• Interoperability aspects
  • Gateway design
  • Trend to multi-functional devices
  • Trend to homogeneous network structures (IP based)
• Painful to add later
  • A lesson learned with H.263 Annex K
• Implementing FMO/ASO now is not that difficult
• Cost burden relatively low
  • as will be shown momentarily
• Royalty-free Baseline attribute
Arguments made in JVT-D115r1.doc
• We concentrate here on the proposal to move FMO/ASO to an “Error Resilience” Profile
  • ASO across frame boundaries can be discussed later
• Key arguments
  • A) There is no need for them (few/no errors)
  • B) FMO/ASO are too expensive for Broadcast
• To A) we answer as follows:
  • The error-free property of broadcast networks is not undisputed, and in practice it is not completely achieved.
  • FMO is very flexible and could be used for purposes other than error resilience. Examples were shown, e.g., by Miska.
Similarity of FMO and ASO
• From a computational complexity point of view the two tools are roughly comparable
• Both allow MBs completely out of scan order
  • ASO by using one-MB-sized slices
• When scan-order reconstruction is chosen, in both cases bit buffer handling is required
  • 8 buffers for FMO
  • 1 buffer for ASO
• FMO may require per-macroblock change of CABAC/CA-VLC contexts
• FMO requires some signaling in the Parameter Sets
• The rest of the slides use FMO as the example
How expensive is FMO really?
• Detailed implementation description in JVT-D063
  • Sneak preview was available to many of the companies opposing FMO in Baseline
  • Technical concepts presented there seem to be undisputed
• Two alternative implementations
  • Out-of-order reconstruction (low delay)
    • No CABAC/CA-VLC context switches necessary
    • Loop filtering after reconstruction
    • Incurs more cache misses (depending on architecture)
    • (Potentially) requires twice the bus bandwidth for pixel transfers
    • This is seen as too big a burden
  • Scan-order reconstruction (broadcast)
    • Needs CABAC/CA-VLC context switching on a per-MB basis
    • Needs bit buffer management
Scan order reconstruction: Recap
• Collect all slices of a picture in buffers
  • Need buffer space for one coded picture
  • 8 bit buffer bins with pointers
• Reconstruct MBs in scan order
• Two Slice Group example (SG 0 red / SG 1 blue)
[Figure: MBAmap and the bit buffers of the two slice groups]
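The recap above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure from JVT-D063: the function name `reconstruct_in_scan_order`, the 4x2 picture, the checkerboard-like `mba_map`, and the string payloads standing in for coded macroblocks are all hypothetical. It shows the core idea: once every slice group's data sits in its own buffer, the decoder walks the macroblock addresses in scan order and uses the MBAmap to decide which buffer to read next.

```python
from collections import deque

def reconstruct_in_scan_order(mba_map, bit_bins):
    """Return (mb_addr, slice_group, coded_mb) tuples in raster-scan order.

    mba_map  -- list where mba_map[mb_addr] is the slice group of that MB
    bit_bins -- dict mapping slice group -> deque of coded MBs, each deque
                holding that group's MBs in increasing address order
    """
    order = []
    for mb_addr, group in enumerate(mba_map):
        # The MBAmap tells us which buffer holds the next MB in scan order.
        coded_mb = bit_bins[group].popleft()
        order.append((mb_addr, group, coded_mb))
    return order

# Hypothetical 4x2 picture with two slice groups in a checkerboard pattern.
mba_map = [0, 1, 0, 1,
           1, 0, 1, 0]
bit_bins = {
    0: deque(["A0", "A1", "A2", "A3"]),  # placeholder payloads for group 0
    1: deque(["B0", "B1", "B2", "B3"]),  # placeholder payloads for group 1
}
order = reconstruct_in_scan_order(mba_map, bit_bins)
```

Each step consumes exactly one macroblock from one buffer, which is why the decoder has to switch between buffers (and entropy-coding contexts) on a per-MB basis in this mode.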
Cost of CABAC/CA-VLC context switch
• 89 contexts, one 6-bit int plus 1 flag per context
  • -> 89 bytes
• Need to store 8 context sets: 8 × 89 = 712 bytes
• Memory amount for CABAC >> CA-VLC
• When implementing CABAC in software, no real cost
  • Pointer switch
• When implementing CABAC in a register-based solution
  • Is this possible/advisable?
  • If yes, need
    • Either store/retrieve 89 bytes per MB
    • Or have 8 register banks with 89 ints (6 bits) plus flag each
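The memory figures and the software "pointer switch" can be checked with a small Python sketch. The numbers 89, 6 bits plus 1 flag, and 8 come from the slide; everything else is an assumption made for illustration: that each context is rounded up to one byte, that there is one context set per slice group, and the names `context_sets` and `switch_context`.

```python
NUM_CONTEXTS = 89       # figure taken from the slide
BYTES_PER_CONTEXT = 1   # 6-bit state + 1 flag = 7 bits, rounded up to a byte
MAX_CONTEXT_SETS = 8    # assumption: one context set per slice group

bytes_per_set = NUM_CONTEXTS * BYTES_PER_CONTEXT
total_bytes = bytes_per_set * MAX_CONTEXT_SETS

# All sets stay resident in memory; nothing is copied on a switch.
context_sets = [bytearray(NUM_CONTEXTS) for _ in range(MAX_CONTEXT_SETS)]

def switch_context(slice_group):
    """Software 'pointer switch': return a reference to the group's set."""
    return context_sets[slice_group]

active = switch_context(1)
active[5] = 0b0101011        # decoder adapts a (hypothetical) context in group 1
active = switch_context(0)   # next MB belongs to group 0
active[5] = 0b0000001
active = switch_context(1)   # back to group 1: its state is still there
resumed_state = active[5]
```

In software the switch costs one reference assignment per macroblock, which is the sense in which the slide calls it "no real cost"; the 712 bytes are the price of keeping all sets resident.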
Bit Buffer Management 1/3
• Need to handle up to 8 bit bins
  • One for ASO
• Each bit bin is a chained list of NALUs
• NALUs are “inserted” into the bit bin at a position determined by the MB address
  • Getting the slice group from the MB address is very simple
  • Inserting into the list (for out-of-order slices) is also very simple
• Note: NALUs are byte aligned
  • No need for bit-oriented processing or copying of data
• Required memory: one coded picture
Bit Buffer Management 2/3
[Figure: slice NALUs (Slice 01, 17, 55, 34, 00, 14) being inserted into the chained lists of the bit bins]
Bit Buffer Management 3/3
[Figure: bit bins holding Slice 01, 17, 55 and Slice 00, 14, with a read pointer for SliceGroup 0 and a read pointer for SliceGroup 1]
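A minimal Python sketch of the bit bin idea follows. It is an illustration, not the implementation described in the slides: a sorted Python list with `bisect` stands in for the chained list, the mapping `slice_group_of` (even macroblocks in group 0, odd in group 1) is a hypothetical MBAmap, the payloads are placeholders, and the slice numbers are treated as first-macroblock addresses, which is an assumption. The NALUs are stored by reference, so no bit-level processing or copying of the coded data takes place.

```python
import bisect

def slice_group_of(mb_addr):
    # Hypothetical MBAmap: even macroblocks in group 0, odd ones in group 1.
    return mb_addr % 2

class BitBin:
    """NALUs of one slice group, kept ordered by first macroblock address."""

    def __init__(self):
        self._first_mbs = []   # sort keys
        self._nalus = []       # references to byte-aligned NALU payloads
        self.read_pos = 0      # read pointer for this slice group

    def insert(self, first_mb, nalu):
        # Out-of-order slices land at the position given by their MB address.
        i = bisect.bisect_left(self._first_mbs, first_mb)
        self._first_mbs.insert(i, first_mb)
        self._nalus.insert(i, nalu)

    def next_nalu(self):
        # Advance the read pointer; None means the bin is exhausted.
        if self.read_pos >= len(self._nalus):
            return None
        nalu = self._nalus[self.read_pos]
        self.read_pos += 1
        return nalu

bins = {0: BitBin(), 1: BitBin()}
for first_mb in [1, 17, 55, 34, 0, 14]:      # slices arriving out of order
    payload = b"slice-%02d" % first_mb       # placeholder NALU bytes
    bins[slice_group_of(first_mb)].insert(first_mb, payload)

group0 = [bins[0].next_nalu() for _ in range(3)]
group1 = [bins[1].next_nalu() for _ in range(3)]
```

The sketch assumes all slices of a picture have been collected before reading starts, so inserts never land behind a read pointer that has already moved.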
Disposable B Pictures and Memory Requirements
• This seems to be the key argument of JVT-D115
  • Similar discussions on the reflector
• We admit that one needs an additional coded frame memory compared to MPEG-2 architectures. However:
  • In JVT, B-pictures are not always disposable, hence
  • The RAM argument made in JVT-D115 doesn’t hold
• Considering the number of frame buffers typically used in JVT, this is a moderate cost (20% more memory or so)