Audio Coding • Team Members: ChungMing Yan, Chun Tong
Overview • Mp3, AAC, Ogg Vorbis • Technical specifications • Test Results • Sample clips • Conclusion
Mp3 • MPEG-1 Layer 3 audio coding • Began as a research project within EUREKA Digital Audio Broadcasting (DAB) in 1987 • A powerful data-reduction algorithm • Standardized as ISO-MPEG Audio Layer 3
Mp3 (Continued) • Pros: • Fast decoding • Excellent hardware support • ISO standard • Cons: • Quality varies widely between encoders • Some quality loss remains even at the highest quality settings
Mp3 (Continued) • Bit rate: typically 128 kbps or 192 kbps • Sampling frequency: 16-24 kHz (MPEG-2 Layer 3), 32-48 kHz (MPEG-1 Layer 3) • Bitrate modes: 1. CBR (Constant Bitrate) 2. VBR (Variable Bitrate) 3. ABR (Average Bitrate)
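The bit rate figures above translate directly into file size: size is roughly bit rate times duration. A minimal Python sketch of that arithmetic follows. The 60-second clip length and the function name `estimated_size_kb` are assumptions made for illustration, not values from the project, and container overhead is ignored.

```python
def estimated_size_kb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate encoded size in KB (1 KB = 1024 bytes).

    Assumes the average bit rate holds over the whole clip and
    ignores headers and tags.
    """
    total_bits = bitrate_kbps * 1000 * duration_s
    return total_bits / 8 / 1024


if __name__ == "__main__":
    clip_seconds = 60  # hypothetical clip length, not from the slides
    for rate in (128, 192):
        size = estimated_size_kb(rate, clip_seconds)
        print(f"{rate} kbps for {clip_seconds} s: about {size:.1f} KB")
```

For CBR the estimate is close to exact; for VBR and ABR it only holds for the average rate the encoder actually achieves.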
Mp3 (Continued) • Compression techniques • Huffman coding • Non-linear quantization • M/S matrixing (mid/side matrixing) • Intensity stereo • MDCT
Mp3 (Continued) • Encoders: Lame, Audio Catalysis • Decoders: Winamp, Windows Media Player, etc.
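To illustrate the M/S matrixing bullet above, here is a minimal Python sketch of the idea: the left and right channels are replaced by their average (mid) and half their difference (side), and the decoder inverts the transform. This is a simplified time-domain illustration with a 1/2 scaling chosen for readability; real encoders apply the matrixing to frequency-domain coefficients and may use a different scale factor. The function names are hypothetical.

```python
def ms_encode(left, right):
    """Convert left/right samples to mid/side samples."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Invert ms_encode: recover left/right from mid/side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    # Nearly identical channels, as in a mostly centered recording.
    left = [0.5, 0.25, -0.75]
    right = [0.5, 0.25, -0.5]
    mid, side = ms_encode(left, right)
    print("mid: ", mid)
    print("side:", side)  # mostly zeros, so it is cheap to encode
    print("round trip:", ms_decode(mid, side))
```

When the two channels are similar, the side signal is close to zero and needs few bits, which is where the saving comes from.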
AAC • MPEG-2/MPEG-4 Advanced Audio Coding • Developed by the MPEG group (Dolby, Fraunhofer, AT&T, Sony, etc.) • An improvement over mp3
AAC (Continued) • Pros: • Competitive at low and mid bitrates against other formats • Decoders/encoders work on all platforms • Cons: • All high-quality implementations of AAC encoding are non-free and closed source • Relatively CPU intensive
AAC (Continued) • Bit rate: 96 kbps, 128 kbps, 196 kbps • Channels and sampling frequency: up to 48 full-bandwidth channels (sampling rates up to 96 kHz), plus Low Frequency Enhancement (LFE) channels limited to 120 Hz
AAC (Continued) • Profiles • LC (Low Complexity) • Main • Main LTP
AAC (Continued) • Compression techniques • Huffman coding • Non-linear quantization and scaling • Vector quantization • M/S matrixing (mid/side channels) for high bitrates • Intensity stereo for low bitrates • TNS (temporal noise shaping) • LTP (long-term prediction; MPEG-4 profile 2, reduces redundancy between successive frames) • MDCT • PNS (perceptual noise substitution)
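The "non-linear quantization" bullet can be sketched as a power-law quantizer: values are compressed with an exponent below 1 before rounding, so large coefficients get coarser steps than small ones. The 0.75 exponent used here is the power law commonly described for MP3 and AAC; the slides do not state it, so treat it as an assumption. The rounding details of real encoders are omitted, and the function names are hypothetical.

```python
def quantize(x: float, step: float) -> int:
    """Power-law quantization: sign(x) * round((|x| / step) ** 0.75)."""
    sign = -1 if x < 0 else 1
    return sign * int(round((abs(x) / step) ** 0.75))


def dequantize(q: int, step: float) -> float:
    """Approximate inverse: sign(q) * |q| ** (4/3) * step."""
    sign = -1 if q < 0 else 1
    return sign * (abs(q) ** (4 / 3)) * step


if __name__ == "__main__":
    step = 1.0
    for x in (1.0, 16.0, 100.0, 1000.0, -1000.0):
        q = quantize(x, step)
        print(f"x={x:8.1f}  q={q:5d}  reconstructed={dequantize(q, step):9.2f}")
```

The reconstruction error grows with the magnitude of the input, which suits audio because louder components mask larger errors.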
AAC (Continued) • Encoders: Psytel AacEnc, Nero • Decoders: Winamp (with an AAC plug-in), QuickTime 6
Ogg Vorbis • Open source project • Free, open, and unpatented, unlike other audio coding formats
Ogg Vorbis (Continued) • Pros: • Open source and patent free • No royalties, even in commercial products • Cons: • No commercial hardware players • High bitrates not fully tuned
Ogg Vorbis (Continued) • Bit rate: ~64 kbps • Sampling frequency: from 8 kHz (telephony) to 192 kHz (digital masters)
Ogg Vorbis (Continued) • Compression techniques • Huffman coding • MDCT (Cosine + Sine) • Wavelet in Vorbis II to improve quality
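Huffman coding appears in the technique lists of all three codecs. The sketch below builds a Huffman code for a short sequence of quantized values in Python, to show how frequent symbols end up with shorter codewords. It is a generic textbook construction: the actual codecs use predefined or codebook-based tables rather than building a tree per input, and the sample data is made up.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return {symbol: bitstring} for the given sequence of symbols."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit codeword.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)
        w2, _, codes2 = heapq.heappop(heap)
        # Prefix the two lightest subtrees with 0 and 1, then merge them.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


if __name__ == "__main__":
    quantized = [0, 0, 0, 0, 1, 1, 2, -1]  # made-up quantized coefficients
    table = huffman_code(quantized)
    for symbol, bits in sorted(table.items(), key=lambda kv: len(kv[1])):
        print(f"{symbol:3d} -> {bits}")
    total = sum(len(table[s]) for s in quantized)
    print(f"{total} bits instead of {2 * len(quantized)} with fixed 2-bit codes")
```

Because quantized audio spectra contain many zeros and small values, giving those the shortest codewords is what makes this step effective after quantization.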
Ogg Vorbis (Continued) • Encoders: BeSweet, OggDrop • Decoders: Winamp (with an Ogg Vorbis plug-in)
Test Results • Three music clips used • Orchestra • Music with voice • Voice only • Different bitrate settings (switches) • High bitrate • Medium variable bitrate • Low bitrate • Additional switches (voice, pns)
Sample clips • Wav • Mp3 • Ogg • AAC
File sizes (KB)

mp3                     track 1   track 2   track 3
original                   2470      2587       318
cbr 256 (160 voice)         451       472       291
r3mix (96)                  283       381       174
abr 48                       83        91        90
abr 32                       54        60     60/60

aac                     track 1   track 2   track 3
original                   2470      2587       318
cbr 256                     446       467       232
abr 96                      217       231       145
abr 48 (tape, 40-59)        104       102        90
cbr 32                       57      60.4      59.2
cbr 32 resampled           56.9      59.8      53.6

ogg                     track 1   track 2   track 3
original                   2470      2587       318
cbr 256 (quality 8)         389       427       304
abr 96 (quality 2)          140       167       131
abr 48 (quality -1)        63.9      98.7      80.7
abr 32                  the encoder cannot encode at a lower bitrate
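The file sizes above are easier to compare as compression ratios (original size divided by encoded size). A short Python sketch using the track 1 figures from the slide follows. The dictionary layout and function name are illustrative choices; only the numbers come from the test results.

```python
ORIGINAL_KB = 2470  # track 1, original wav

# Track 1 encoded sizes in KB, copied from the file size slide.
TRACK1_KB = {
    "mp3": {"cbr 256": 451, "r3mix (96)": 283, "abr 48": 83, "abr 32": 54},
    "aac": {"cbr 256": 446, "abr 96": 217, "abr 48": 104, "cbr 32": 57},
    "ogg": {"cbr 256": 389, "abr 96": 140, "abr 48": 63.9},
}


def compression_ratio(original_kb: float, encoded_kb: float) -> float:
    """How many times smaller the encoded file is than the original."""
    return original_kb / encoded_kb


if __name__ == "__main__":
    for codec, sizes in TRACK1_KB.items():
        for setting, kb in sizes.items():
            ratio = compression_ratio(ORIGINAL_KB, kb)
            print(f"{codec:3s}  {setting:11s} {kb:6.1f} KB  ratio {ratio:5.1f}:1")
```

The ratios describe size only; they say nothing about perceived quality, which the listening comparison addresses separately.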
Switches used • Mp3 • The GUI automatically writes the proper command line
• cbr 256: "c:\EE3414\lame\lame.exe" -m s -b 256 -k "C:\EE3414\input.wav" "C:\EE3414\output.mp3"
• r3mix: "c:\EE3414\lame\lame.exe" --nspsytune --vbr-mtrh -V1 -mj -h -b96 --lowpass 19.5 --athtype 3 --ns-sfb21 2 -Z --scale 0.98 -X0 "C:\EE3414\input.wav" "C:\EE3414\output.mp3"
• abr 48: "c:\EE3414\lame\lame.exe" --abr 48 -b 32 -B 320/160 "C:\EE3414\input.wav" "C:\EE3414\output.mp3"
• abr 32: "c:\EE3414\lame\lame.exe" --abr 32 -b 32 -B 320/160 "C:\EE3414\input.wav" "C:\EE3414\output.mp3"
• abr 32 voice: "c:\EE3414\lame\lame.exe" --voice --abr 32 -b 32 -B 160 "C:\EE3414\voice.wav" "C:\EE3414\voice(abr32-voice).mp3"
Switches used (Continued) • AAC • cbr 256: -production -low_ath -profile 0 -br 256 • abr 96: -production -profile 0 -br 96 -vbrhi • abr 48: -tape • abr 32: -br 32 • abr 32 (resampled): -br 32 -resample 22050
Switches used (Continued) • Ogg Vorbis • No control parameters besides "quality" • cbr 256: GUI quality set to 8 • abr 96: GUI quality set to 2 • abr 48: GUI quality set to -1 • abr 32: this bitrate is not possible with the given tools, even with manual bitrate tweaking
Conclusion • AAC = Ogg > MP3 • The differences are very small and hard to hear • The best choice depends on the application • Alternative audio coding • Lossless encoding • Monkey's Audio • Speech specific • Speex
Future research or improvements • As technology improves, there will be newer coding schemes to examine • More extensive research into the parameters and encoding procedures • Matlab waveform analysis (objective analysis) • Alternative implementations
Resources • http://www.audiocoding.com • http://lame.sourceforge.net • http://www.vorbis.com • Team website • http://chii.servehttp.com:10240/ee3414