Coding Efficiency and Quality Improvement for MPEG Surround Encoding Jun. 2, 2008 Student: Shang-Yu Yeh (葉尚諭) Advisor: Dr. Hsueh-Ming Hang (杭學鳴)
My Work • Design MPEG Surround Encoding Algorithms • Subset coding mode • Parameter band stride • Parameter sets • Adaptive smoothing • Implementation in the Reference Software
Outline • MPEG Surround Introduction • Proposed Procedures and Experimental Results • Conclusion and Future Work • Demo
Outline • MPEG Surround Introduction • Spatial Hearing • MPEG Surround Encoder • MPEG Surround Decoder • Proposed Procedures and Experimental Results • Conclusion and Future Work • Demo
Spatial Hearing • Describes how humans locate sound sources in the horizontal plane • Interaural Level Difference (ILD) • Interaural Time Difference (ITD) • Interaural Coherence (IC)
MPEG Surround • Low-bitrate parametric coding technology for multi-channel audio signals • Backward compatibility with stereo equipment • Standardization • Call for Proposals (CfP) on Spatial Audio Coding (SAC) in March 2004 • Finalized in July 2006 (ISO/IEC 23003-1)
MPEG Surround Encoder • Capture the spatial image of multi-channel audio • Generate a mono/stereo downmix
MPEG Surround Decoder • Synthesize the multi-channel output signal • Backward compatibility
Downmix and Parameter Extraction • Two elementary blocks construct hierarchical structures • R-OTT box (Reverse One-To-Two box) • R-TTT box (Reverse Two-To-Three box)
Parameter Sets and Bands • Parameter sets: grouping of time slots • Parameter bands: grouping of subbands
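The two groupings can be illustrated with a minimal sketch. The helper name `group_by_borders` and the border positions are hypothetical and chosen only for illustration; they are not taken from the reference software.

```python
def group_by_borders(n_items, borders):
    """Split range(n_items) into consecutive groups.

    `borders` holds the exclusive end index of each group, in ascending
    order; the last border is expected to equal n_items.
    """
    groups = []
    start = 0
    for end in borders:
        groups.append(list(range(start, end)))
        start = end
    return groups


# Hypothetical frame: 8 time slots grouped into 2 parameter sets,
# and 6 subbands grouped into 3 parameter bands.
parameter_sets = group_by_borders(8, [3, 8])
parameter_bands = group_by_borders(6, [1, 3, 6])
```

One spatial parameter value is then associated with each (parameter set, parameter band) pair rather than with each individual time slot and subband.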
R-OTT Box • Create a mono downmix from a stereo input • Extract relevant spatial parameters • Channel Level Difference (CLD) • Inter-Channel Coherence (ICC)
R-TTT Box • Create a stereo downmix from three input channels • Two ways to reconstruct the third signal • Prediction mode: 2 CPCs and ICC • Energy mode: 2 CLDs
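As a rough illustration of what an R-OTT box extracts, the sketch below computes a CLD (energy ratio in dB) and an ICC (normalized real cross-correlation) for one parameter band. These are the commonly used definitions and are assumed here, since the slide's formulas are not available. The plain averaging downmix and the function name `r_ott` are simplifications for illustration, not the standard's energy-compensated downmix.

```python
import math


def r_ott(x1, x2, eps=1e-12):
    """Return (downmix, cld_db, icc) for two equally long sample lists.

    x1, x2: complex (or real) subband samples of one parameter band.
    eps guards against division by zero and log of zero (assumption).
    """
    e1 = sum(abs(a) ** 2 for a in x1)
    e2 = sum(abs(b) ** 2 for b in x2)
    cross = sum(a * b.conjugate() for a, b in zip(x1, x2))
    cld_db = 10.0 * math.log10((e1 + eps) / (e2 + eps))
    icc = cross.real / math.sqrt((e1 + eps) * (e2 + eps))
    # Simplified downmix: plain average of the two channels.
    downmix = [0.5 * (a + b) for a, b in zip(x1, x2)]
    return downmix, cld_db, icc


# The second channel is a scaled copy of the first: ICC is 1 and the
# CLD reflects the 4:1 energy ratio (about 6 dB).
mono, cld, icc = r_ott([1 + 0j, 2 + 0j], [0.5 + 0j, 1 + 0j])
```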
Quantization and Entropy Coding Schemes • Quantization - fine and coarse • Entropy coding - Differential coding + Huffman tables
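The differential coding step can be sketched as follows. This is a simplified illustration: a uniform quantizer with a hypothetical step size stands in for the standard's quantization tables, and the Huffman stage is omitted. DF denotes differences along frequency and DT differences along time, matching the DF and DT data discussed later in the slides.

```python
def quantize(values, step):
    """Map parameter values to integer indices (uniform step, assumption)."""
    return [round(v / step) for v in values]


def diff_freq(indices):
    """DF: keep the first band, then code differences between neighboring bands."""
    if not indices:
        return []
    return [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]


def diff_time(indices, previous):
    """DT: code differences to the same band in the previous parameter set."""
    return [c - p for c, p in zip(indices, previous)]


# Hypothetical CLD values (dB) of two successive parameter sets.
prev_idx = quantize([3.1, 2.9, -1.2, -4.0], step=1.0)
cur_idx = quantize([3.0, 3.2, -0.8, -4.1], step=1.0)
df = diff_freq(cur_idx)            # [3, 0, -4, -3]
dt = diff_time(cur_idx, prev_idx)  # [0, 0, 0, 0]
```

When parameters change slowly over time, the DT symbols cluster around zero, which is what makes the subsequent Huffman coding cheap.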
Outline • MPEG Surround Introduction • Proposed Procedures and Experimental Results • Subset coding mode • Parameter band stride • Parameter sets • Adaptive smoothing • Conclusion and Future Work • Demo
New Encoder Structure • 4 Additional modules:
Subset coding mode • 4 coding modes for each parameter subset: • Default(0) • Keep(1) • Interpolation(2) • Lossless(3) • Ref S/W implements only the Lossless mode
Subset coding mode • Flow chart • Search each mode for the least error • Compare with a threshold • Exploit correlation along the time axis
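The search described above might look like the following sketch. It is an assumption-laden illustration, not the flow chart from the slide: the mean squared error measure, the midpoint interpolation, and the names `choose_coding_mode`, `default_value` and `threshold` are all hypothetical.

```python
DEFAULT, KEEP, INTERPOLATION, LOSSLESS = 0, 1, 2, 3


def mean_sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def choose_coding_mode(current, previous, following, default_value, threshold):
    """Pick the cheapest mode whose reconstruction error is acceptable.

    current / previous / following: parameter values of the current,
    previous and next parameter subset (equal, non-zero length).
    """
    # What the decoder would reconstruct without any transmitted data.
    candidates = {
        DEFAULT: [default_value] * len(current),
        KEEP: list(previous),
        # Midpoint interpolation is a simplification (assumption).
        INTERPOLATION: [0.5 * (p + n) for p, n in zip(previous, following)],
    }
    errors = {mode: mean_sq_error(current, rec) for mode, rec in candidates.items()}
    best = min(errors, key=errors.get)
    # Fall back to transmitting the data when no cheap mode is close enough.
    return best if errors[best] <= threshold else LOSSLESS


mode = choose_coding_mode([3.0, 3.0], [2.0, 2.0], [4.0, 4.0],
                          default_value=0.0, threshold=0.1)
```

In this example the current subset lies exactly between its neighbors, so the Interpolation mode reproduces it without spending bits on parameter data.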
Experimental Results • Only the Lossless mode costs bits • The bitrate reduction can be estimated:
Experimental Results • Comparisons:
Experimental Results • 2 phenomena: • Theoretical results are larger than experimental results • Reason: differential coding schemes • As the number of parameter sets increases, both theoretical and experimental results decrease • Reason: probability distributions
Experimental Results • Distributions of DT data:
Parameter Band Stride • The parameter bands themselves cannot be adjusted • The frequency resolution is adjusted by the “parameter band stride” • 4 strides for each parameter subset
Parameter Band Stride • Exploit correlation in frequency • Combined with the pairing decision • Flow chart: • 2 successive lossless subsets • 1 single subset
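A stride decision for a single subset could be sketched as below. The pairing decision is left out. The candidate strides (1, 2, 5, 28), the use of the group mean as the transmitted value, the squared error measure and the threshold are assumptions made for this illustration.

```python
def apply_stride(params, stride):
    """Replace each run of `stride` neighboring bands by the run's mean."""
    out = []
    for start in range(0, len(params), stride):
        group = params[start:start + stride]
        mean = sum(group) / len(group)
        out.extend([mean] * len(group))
    return out


def choose_stride(params, strides, threshold):
    """Return the largest stride whose mean squared error stays within threshold.

    params must be non-empty; stride 1 is the fallback.
    """
    best = 1
    for stride in sorted(strides):
        approx = apply_stride(params, stride)
        err = sum((p - a) ** 2 for p, a in zip(params, approx)) / len(params)
        if err <= threshold:
            best = stride
    return best


# Neighboring bands are pairwise equal, so stride 2 is lossless here,
# while larger strides would merge the 1.0 and 5.0 regions.
stride = choose_stride([1.0, 1.0, 1.0, 1.0, 5.0, 5.0], (1, 2, 5, 28), 0.01)
```

A larger stride means fewer values to transmit per subset, at the cost of coarser frequency resolution.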
Parameter Band Stride • 4 possible results: • 2 successive subsets in a pair with the same stride (>1) • 2 successive subsets using different strides (>1) • 2 successive subsets in a pair with stride=1 • 1 subset coded individually
Experimental Results • The bitrate can be estimated by:
Experimental Results • 2 phenomena: • Theoretical results are larger than experimental results • Reason: differential coding schemes • As the number of parameter sets increases, both theoretical and experimental results decrease • Reason: probability distributions
Experimental Results • Distributions of DF data
Comparisons of the 2 modules • The coding mode module is more efficient than the pbstride module • Compare the DT and DF data
Comparisons of the 2 modules • The theoretical estimate for the pbstride module is more overestimated than that for the coding mode module • Differential coding schemes
Experimental Results: Combined with Coding Mode • Bitrate reduction percentage: 25~55% • Complexity: 0.13%
Time Resolution • Describes the number of parameters for each parameter band • 2 kinds of framing: • Fixed framing: divided into equal parts • Variable framing: arbitrary divisions • 1~8 parameter sets • Requires a dynamic decision
Time Resolution • A border exists • Large difference of parameters • Calculate the differences of backward and forward extractions • Division at time slots with larger differences
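The border search can be sketched as follows, assuming the compared parameter is a CLD computed from per-slot channel energies. The window length, the threshold and the function names are hypothetical. "Backward" and "forward" extraction are interpreted here as extraction over the slots just before and just after a candidate border.

```python
import math


def cld_db(e1, e2, eps=1e-12):
    """CLD in dB from two lists of per-slot energies."""
    return 10.0 * math.log10((sum(e1) + eps) / (sum(e2) + eps))


def border_scores(energy1, energy2, window):
    """Score slot t by |CLD(forward window) - CLD(backward window)|."""
    n = len(energy1)
    scores = [0.0] * n
    for t in range(window, n - window + 1):
        back = cld_db(energy1[t - window:t], energy2[t - window:t])
        fwd = cld_db(energy1[t:t + window], energy2[t:t + window])
        scores[t] = abs(fwd - back)
    return scores


def pick_borders(scores, max_borders, threshold):
    """Keep up to max_borders slots with the largest scores above threshold."""
    ranked = sorted(range(len(scores)), key=lambda t: scores[t], reverse=True)
    return sorted(t for t in ranked[:max_borders] if scores[t] > threshold)


# Channel 1 jumps in energy at slot 4, so the border is placed there.
e1 = [1.0, 1.0, 1.0, 1.0, 100.0, 100.0, 100.0, 100.0]
e2 = [1.0] * 8
borders = pick_borders(border_scores(e1, e2, window=2), max_borders=1, threshold=3.0)
```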
Experimental Results • waveforms
Experimental Results • Additional bitrate: • Complexity:
Parameter Smoothing • Compensate for artifacts caused by coarse quantization • Performed at the decoder side • 1st order IIR filter
Parameter Smoothing • Flow chart • Compare smoothed coarsely quantized parameters with finely quantized parameters • Choose the configuration with the least error
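A minimal sketch of this decision is given below. It assumes a uniform quantizer, a squared error measure, and a first-order IIR filter of the form y[n] = alpha * x[n] + (1 - alpha) * y[n-1]. The step sizes, the candidate `alphas` and the function names are hypothetical and do not come from the reference software.

```python
def smooth(params, alpha):
    """First-order IIR: y[n] = alpha * x[n] + (1 - alpha) * y[n-1]."""
    out = []
    prev = params[0] if params else 0.0  # start from the first value
    for x in params:
        prev = alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out


def quantize(params, step):
    """Uniform quantization to multiples of `step` (simplification)."""
    return [step * round(p / step) for p in params]


def sq_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def choose_smoothing(params, coarse_step, fine_step, alphas):
    """Return the alpha with the least error, or None if smoothing does not help."""
    reference = quantize(params, fine_step)
    coarse = quantize(params, coarse_step)
    best_alpha, best_err = None, sq_error(coarse, reference)
    for alpha in alphas:
        err = sq_error(smooth(coarse, alpha), reference)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha


# Coarse quantization exaggerates the fluctuation; smoothing reduces the error.
choice = choose_smoothing([1.0, 3.0, 1.0, 3.0], 4.0, 1.0, alphas=[0.5])
```

Because the filter introduces lag, smoothing can also increase the error on steadily rising parameters, which is why the unsmoothed configuration is kept as a candidate.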
Experimental Results • waveforms
Experimental Results • Bitrate variations: • Complexity:
Outline • MPEG Surround Introduction • Proposed Procedures and Experimental Results • Conclusion and Future Work • Demo
Conclusion • Implementation of several encoding procedures in the reference software • Exploit correlation along the time axis and the frequency axis • Bitrate reduction: 25~55% • Theoretical estimation • Adaptive time resolution and parameter smoothing
Future Work • Modify error measures • Different band weightings • Different parameter weightings • Find a more precise quality evaluation for fine-tuning • Some other tools • Residual coding, temporal shaping, etc.
Outline • MPEG Surround Introduction • Proposed Procedures and Experimental Results • Conclusion and Future Work • Demo
Filter Banks • 2 stages
OTT Box • Synthesize two output channels from a mono downmix using the spatial parameters
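The level part of this synthesis can be sketched as below. The gains are derived from the CLD so that the two outputs have the transmitted energy ratio and their energies sum to the downmix energy. The decorrelator and the ICC-dependent mixing are deliberately omitted, so this is a simplified illustration and not the full OTT box; the function names are hypothetical.

```python
import math


def ott_gains(cld_db):
    """Channel gains with g1**2 / g2**2 == 10**(cld_db / 10) and g1**2 + g2**2 == 1."""
    ratio = 10.0 ** (cld_db / 10.0)
    g1 = math.sqrt(ratio / (1.0 + ratio))
    g2 = math.sqrt(1.0 / (1.0 + ratio))
    return g1, g2


def ott_upmix(downmix, cld_db):
    """Scale the mono downmix into two channels (no decorrelation, assumption)."""
    g1, g2 = ott_gains(cld_db)
    return [g1 * m for m in downmix], [g2 * m for m in downmix]


# A CLD of about 6 dB puts four times as much energy into the first channel.
left, right = ott_upmix([1.0, -2.0], 10.0 * math.log10(4.0))
```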
R-TTT Box (2/2) • Prediction mode: • 2 CPCs and 1 ICC • Energy mode: • 2 CLDs
TTT Box • Prediction mode: • With residual signal: 2 CPCs • Without residual signal: use the ICC to compensate for the energy loss • Energy mode: • Energy reconstruction