Emerging Technologies in Multimedia Communications
Hsueh-Ming Hang (杭學鳴), Dean of the EECS College
Taipei University of Technology (台北科技大學)
Contents
• Audio and video standards
• Video standards evolution
• Emerging techniques in video coding
• Audio standards evolution
hmhang/EECS, NTUT
Video Coding Standards

Standards                  Typical rates      Applications
ITU-T (CCITT) H.261        128–384 kbits/s    Videophone over ISDN
ISO MPEG-1 (11172-2)       1.2 Mbits/s        Video CD
ISO MPEG-2 (13818-2)       4–10 Mbits/s       Digital TV/HDTV
  (ITU-T H.262)            20 Mbits/s         Over air/networks
ITU-T H.263                < 64 kbits/s       Videophone
ISO MPEG-4 (14496-2)       Low/high rates     Object-oriented
ISO MPEG-7 (15938)         -                  Database content description
ITU-T H.263 v2             < 64 kbits/s       PSTN/wireless videophone
ITU-T H.264 (JVT, AVC)     < 40 kbits/s       Net/wireless videophone
ITU-T H.264 ext (SVC)      Multi-layer        Net/wireless streaming

ISDN: Integrated Services Digital Network
MPEG Audio Standards
Image/Video Standards
• ISO/IEC JTC1 SC29: ISO and IEC Joint Technical Committee (on Information Technology), Subcommittee 29 (Coding of audio, picture, multimedia and hypermedia)
– Working Group (WG) 1:
  JBIG (Joint Bi-level Image Experts Group): 1-bit to 4/5-bit still pictures
  JPEG (Joint Photographic Experts Group): 8-bit or more still pictures
– WG 11: MPEG (Moving Picture Experts Group): motion pictures
– WG 12: MHEG (Multimedia-Hypermedia Experts Group): multi/hyper-media exchange format
Standards Organizations
• CCITT – Comité Consultatif International Télégraphique et Téléphonique (International Telegraph and Telephone Consultative Committee)
• ITU – International Telecommunication Union
• ISO – International Organization for Standardization
• IEC – International Electrotechnical Commission
MPEG Committee
• Convener: Leonardo Chiariglione
• Standards:
-- MPEG-1: done
-- MPEG-2: done
-- MPEG-4: done?!
-- MPEG-7: done?!
-- MPEG-21: done?
-- MPEG-A, B, C, D, E: ongoing
MPEG Chair Dr. Chiariglione at NCTU (2003.12) http://www.chiariglione.org
MPEG-A, B, C
• MPEG-A (ISO/IEC 23000): Multimedia Application Formats
  Part 1 Purpose for Multimedia Application Formats
  Part 2 Music Player Application Format
  Part 3 Photo Player Application Format
  … Part 12
• MPEG-B (ISO/IEC 23001): MPEG Systems Technologies
  Part 1 Binary MPEG format for XML
  … Part 5
• MPEG-C (ISO/IEC 23002): MPEG Video Technologies
  Part 1 Accuracy specification for implementation of integer-output IDCT
  Part 2 Fixed-point implementation of DCT/IDCT
  Part 3 Auxiliary Video Data Representation
  Part 4 Video Tool Library
MPEG-D, E
• MPEG-D (ISO/IEC 23003): MPEG Audio Technologies
  Part 1 MPEG Surround
  Part 2 Spatial Audio Object Coding
  Part 3 Unified Speech and Audio Coding
• MPEG-E (ISO/IEC 23004): MPEG Multimedia Middleware
  Part 1 Architecture
  Part 2 Multimedia API
  Part 3 Component Model
  Part 4 Resource and Quality Management
  Part 5 Component Download
  Part 6 Fault Management
  Part 7 System Integrity Management
  Part 8 Reference Software and Conformance
How Did I Get Involved?
1984: Joined AT&T Bell Labs, Visual Comm. Dept.; the H.261 video standard started
1988.1: MPEG started
1991.12: Joined NCTU; discontinued standards activities
1999.9: NCTU formed a small group to participate in MPEG activities
NCTU MPEG Activity
• Tihao Chiang (蔣迪豪), C.J. Tsai (蔡淳仁), Wen Peng (彭文孝) and H.-M. Hang (杭學鳴) attend MPEG meetings regularly
• Tihao Chiang: Co-editor, MPEG-4 Part 7 Optimised Reference Software (done)
• C.J. Tsai: Co-editor, MPEG-21 Part 12 Multimedia Test Bed for Resource Delivery (done)
• 107 contributions (input and output documents) in the past 5 years (2002 to 2007) [source: Dr. Y.-S. Tung, NTU]
• Example: Call for Proposals on Scalable Video Coding (Feb. 2004), 2 out of 14 proposals
Image & Video Compression: JPEG to AVC (H.264)
Progress of Image/Video Coding
• H.261 (CCITT/ITU; 1984, 88, 90) – video (videoconf.)
• JPEG (1986, 89, 92) – image (digital camera)
• MPEG-1 (1988 – 92) – video (VCD)
• MPEG-2 (1990 – 94) – video (DVD, DTV)
• MPEG-4 part 2 (1992 – 99) – video (Internet, WL)
• H.263 (1993 – 95; ver. 3: 2000) – video (WL)
• JPEG2000 (1996 – 2001) – image
• H.264 (MPEG-4 part 10) AVC (1998 – 2003) – video (WL, HD-DVD)
• AVC Amd. 1 (2003 – 2008) – scalable video coding
Scalable Bitstream
Progressive approximation:
• 300 kbps: PSNR = 32.2 dB
• 500 kbps: PSNR = 34.6 dB
• 1000 kbps: PSNR = 38.2 dB
[Figure: bitstream segments labeled GOP Header, Motion Info., Image Data]
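The PSNR figures quoted for each truncation point follow the usual definition, 10·log10(peak² / MSE). A minimal sketch, assuming 8-bit samples (peak value 255) and frames flattened into plain lists; the function name `psnr` is illustrative and not taken from the slides:

```python
import math


def psnr(original, decoded, peak=255):
    """Return the PSNR in dB between two equal-length sequences of samples."""
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same number of samples")
    # Mean squared error between the reference and the decoded samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)
```

Decoding a longer prefix of the scalable bitstream lowers the MSE, which is why PSNR rises with the bit rate on this slide.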
Spatial/SNR Scalability
• 176x144, 256 kb/s
• 352x288, 750 kb/s
• 704x576, 6 Mb/s
• 704x576, 1.5 Mb/s
Scalable Video Coding
• Why scalable video coding?
  – Reliably deliver video to diverse clients over heterogeneous networks using available system resources
• Types of scalability
  – SNR scalability (quality)
  – Spatial scalability (frame resolution)
  – Temporal scalability (frame rate)
  – Combined scalability
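One way to picture "diverse clients over heterogeneous networks" is a server that picks, per client, the best operating point the client can actually use. The sketch below reuses the four resolution/rate pairs from the Spatial/SNR Scalability slide; the selection rule and all names (`OperatingPoint`, `pick_operating_point`) are assumptions made for illustration, not part of any standard.

```python
from typing import NamedTuple, Optional


class OperatingPoint(NamedTuple):
    width: int
    height: int
    kbps: int


# Resolution/rate pairs taken from the Spatial/SNR Scalability slide.
POINTS = [
    OperatingPoint(176, 144, 256),
    OperatingPoint(352, 288, 750),
    OperatingPoint(704, 576, 1500),
    OperatingPoint(704, 576, 6000),
]


def pick_operating_point(points, max_kbps, max_width, max_height) -> Optional[OperatingPoint]:
    """Pick the largest, then highest-rate, point that fits the client limits."""
    feasible = [
        p for p in points
        if p.kbps <= max_kbps and p.width <= max_width and p.height <= max_height
    ]
    if not feasible:
        return None  # even the base layer does not fit
    # Prefer spatial resolution first, then quality (rate) at that resolution.
    return max(feasible, key=lambda p: (p.width * p.height, p.kbps))
```

For example, a client with a 1000 kb/s link and a large display would receive the 352x288, 750 kb/s sub-stream, while a CIF-sized display on a fast link is capped by its resolution.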
MPEG SVC Activity
2003.10: Call for Proposals (CfP)
2004.2: Proposals received (14 submitted) (M10737); NCTU submitted two proposals
2004.3: Evaluations in two categories (M10480)
  Category 1: MCTF + Wavelet (10)
  Category 2: AVC based (incl. AVC/MCTF) (4)
2004.3/7/10: Proposals and refinements evaluated
2005.1: SVC became Amd 1 of MPEG-4 Part 10 (AVC)
2008: Became a standard
H.264/SVC Encoder
Hierarchical B Pictures
• Lower temporal layers are generated first
• Use reconstructed frames for prediction
[Figure: two groups of pictures (GOPs) of 8 frames each]
Picture type:   I/P B B B B B B B I/P B B B B B B B I/P
Temporal level: 0 3 2 3 1 3 2 3 0 3 2 3 1 3 2 3 0
Display order:  0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
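The temporal-level pattern in the figure (0 3 2 3 1 3 2 3 0 for a GOP of 8) can be derived from the display index alone. The sketch below assumes a dyadic GOP whose size is a power of two; `temporal_layer` and `coding_order` are illustrative names, and the layer-by-layer coding order shown is only one valid order that respects "lower temporal layers first".

```python
def temporal_layer(display_index, gop_size=8):
    """Temporal layer of a frame in a dyadic hierarchical-B GOP."""
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")
    pos = display_index % gop_size
    if pos == 0:
        return 0  # key picture (I or P)
    layer = gop_size.bit_length() - 1  # log2(gop_size) is the highest layer
    # Each factor of two in the position moves the frame one layer down.
    while pos % 2 == 0:
        pos //= 2
        layer -= 1
    return layer


def coding_order(gop_size=8):
    """Display indices 1..gop_size sorted so lower layers are coded first."""
    return sorted(
        range(1, gop_size + 1),
        key=lambda i: (temporal_layer(i, gop_size), i),
    )
```

With this ordering every B picture is coded after both of its nearest lower-layer neighbours, so the reconstructed frames it predicts from are already available. Dropping the highest layer halves the frame rate, which is how the structure gives temporal scalability.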
Spatial Scalability
• Same concepts as in MPEG-2/4 and H.263
-- Each spatial layer is coded with texture/motion refinement
[Figure: three coding layers connected by scaling and prediction stages, combined into one scalable stream]
Fast Algorithm: Intra Prediction
• Base layer coded with good quality (small Qp): in the enhancement layer, IntraBL dominates
• Base layer coded with poor quality (large Qp): in the enhancement layer, Intra4x4/IntraBL
• H.-C. Lin, W.-H. Peng, and H.-M. Hang, IEEE ICIP07; IEEE ICME08.
Examples of Research Topics: Interframe Wavelet, Contourlet Coding
Interframe Wavelet
• Algorithm proposed and improved by Profs. Jens Ohm (Aachen U.) and John Woods (RPI)
• Motion compensated temporal filtering (MCTF) + wavelet zero-tree coding
• Key advantage: “full” scalability (temporal + spatial + SNR)
• Disadvantage: long delay (storage)
• High bit rates: good (comparable to advanced video coding)
• Low rates: needs improvement
• Many variations now
MCTF + Wavelet
Encoder: Input Video -> MCTF (analysis) -> Spatial Analysis -> Entropy Coding -> Packetizer; Motion Estimation -> Motion Info. Encoding
Decoder: Depacketizer -> Entropy Decoding -> Spatial Synthesis -> MCTF (synthesis) -> Output Video; Motion Info. Decoding
MCTF: Motion Compensated Temporal Filtering
MCTF
MCTF = Motion Compensated Temporal Filtering
[Figure: frames 1 to 5]
Temporal Subband Decomposition
[Figure: a GOP (Group of Pictures) of the video sequence passes through repeated MCTF stages, corresponding to a temporal level = 4 decomposition. Legend: temporal low-pass frame; temporal high-pass frame; frames that remain after temporal decomposition]
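The temporal decomposition can be sketched with a Haar lifting filter applied along the time axis. This is a simplified illustration that assumes zero motion (no motion compensation at all), frames stored as flat lists of samples, and unnormalized filters; the function names are hypothetical.

```python
def haar_analysis(frames):
    """One temporal level: split frame pairs into low-pass and high-pass frames."""
    lows, highs = [], []
    for a, b in zip(frames[0::2], frames[1::2]):
        high = [y - x for x, y in zip(a, b)]        # predict step (no motion)
        low = [x + h / 2 for x, h in zip(a, high)]  # update step: pair average
        lows.append(low)
        highs.append(high)
    return lows, highs


def haar_synthesis(lows, highs):
    """Invert haar_analysis and interleave the recovered frames."""
    frames = []
    for low, high in zip(lows, highs):
        a = [l - h / 2 for l, h in zip(low, high)]
        b = [x + h for x, h in zip(a, high)]
        frames.extend([a, b])
    return frames


def decompose_gop(frames, levels):
    """Apply `levels` temporal levels; returns (remaining lows, highs per level)."""
    if len(frames) % (2 ** levels):
        raise ValueError("GOP length must be a multiple of 2**levels")
    highs_per_level = []
    lows = frames
    for _ in range(levels):
        lows, highs = haar_analysis(lows)
        highs_per_level.append(highs)
    return lows, highs_per_level


def reconstruct_gop(lows, highs_per_level):
    """Undo decompose_gop, starting from the coarsest temporal level."""
    for highs in reversed(highs_per_level):
        lows = haar_synthesis(lows, highs)
    return lows
```

A 16-frame GOP with 4 levels leaves one low-pass frame plus 8 + 4 + 2 + 1 high-pass frames. Discarding the finest high-pass level and stopping the synthesis one level early yields half the frame rate.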
Spatial Scalability
• Wavelet decomposition provides spatial scalability (similar to JPEG 2000)
• Rate control!
• Bit-plane coder
R-D Optimization in Interframe Wavelet Video
• Multiple R-D operation points!!
• Inter-scaled hybrid coding!!
• Open-loop: no feedback path!!!
[Figure: wavelet coding structure; block labels: block-based predictor, motion coder, subband-based quantizer, entropy coding, bits, R-D]
• C.-Y. Tsai and H.-M. Hang, “rho-GGD source modeling for wavelet coefficients in image/video coding,” in IEEE ICME, 2008
Contourlet Transform
• Inefficiency of separable transform
Contourlet Representation
• Image decomposition using Directional Filter Bank (DFB)
An Example of DFB
• Decomposed by a DFB with 4 levels, which leads to 16 subbands
[Figure: original image; image after DFB with 4 levels]
DFB-Based Coding
• One example of mixed 2D wavelet decomposition
• C.-H. Hung and H.-M. Hang, “Image Coding Using Short Wavelet-based Contourlet Transform,” IEEE ICIP, 2008
Audio Compression: MP3 to MPEG Surround
MPEG Audio Standards
MPEG Audio Standards (2)
MPEG Audio Standards (3)
Spectral Band Replication: SBR
• Typical audio signal spectrum
SBR (2)
• The high frequencies are reconstructed and adjusted
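The idea of "reconstructed and adjusted" can be shown with a toy model: the encoder keeps only the low half of a magnitude spectrum plus a coarse envelope of the high half, and the decoder copies the low band upward and rescales it to match that envelope. This is a rough sketch under those assumptions, not the standardized SBR tool; `sbr_encode`, `sbr_decode`, and the band layout are hypothetical.

```python
def sbr_encode(magnitudes, num_bands=2):
    """Keep the low half of the spectrum and a per-band mean of the high half."""
    if len(magnitudes) % (2 * num_bands):
        raise ValueError("spectrum length must be a multiple of 2 * num_bands")
    half = len(magnitudes) // 2
    low, high = list(magnitudes[:half]), magnitudes[half:]
    band = half // num_bands
    # Coarse envelope: mean magnitude of each high-frequency band.
    envelope = [sum(high[b * band:(b + 1) * band]) / band for b in range(num_bands)]
    return low, envelope


def sbr_decode(low, envelope):
    """Rebuild the high half by copying low-band bins and matching the envelope."""
    band = len(low) // len(envelope)
    high = []
    for b, target in enumerate(envelope):
        patch = low[b * band:(b + 1) * band]   # replicated low-band content
        mean = sum(patch) / band
        gain = target / mean if mean > 0 else 0.0
        high.extend(gain * x for x in patch)   # envelope adjustment
    return list(low) + high
```

Only the low band and a handful of envelope values need to be transmitted, which is where the bit-rate saving comes from; the fine structure of the rebuilt high band is borrowed from the low band rather than coded.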
Spatial Hearing
• Three parameters describe how humans locate a sound source in the horizontal plane:
  – Interaural Level Difference (ILD)
  – Interaural Time Difference (ITD)
  – Interaural Coherence (IC)
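The three cues can be estimated from a pair of ear signals with basic signal statistics. The sketch below assumes equal-length, non-silent signals given as lists of samples and uses common textbook definitions (energy ratio in dB, lag of the cross-correlation peak, normalized correlation at that lag); the function names and the sign convention (positive ITD means the right signal lags the left) are choices made here for illustration.

```python
import math


def ild_db(left, right):
    """Interaural level difference: left/right energy ratio in dB."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    if e_left == 0 or e_right == 0:
        raise ValueError("both signals must have non-zero energy")
    return 10 * math.log10(e_left / e_right)


def _xcorr(left, right, lag):
    """Sum of left[i] * right[i + lag] over the overlapping samples."""
    n = len(left)
    return sum(left[i] * right[i + lag] for i in range(max(0, -lag), min(n, n - lag)))


def itd_and_ic(left, right, max_lag, sample_rate):
    """Return (ITD in seconds, IC) from the cross-correlation peak."""
    norm = math.sqrt(sum(x * x for x in left) * sum(x * x for x in right))
    if norm == 0:
        raise ValueError("both signals must have non-zero energy")
    # The lag with the strongest correlation is the time-difference estimate.
    best_lag = max(range(-max_lag, max_lag + 1), key=lambda lag: _xcorr(left, right, lag))
    coherence = _xcorr(left, right, best_lag) / norm
    return best_lag / sample_rate, coherence
```

For a source off to the left, the right-ear signal is quieter and delayed, so ILD and ITD are both positive under this convention, while IC stays near 1 as long as both ears receive the same waveform.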
MPEG Surround
• Low-bitrate parametric coding technology for multi-channel audio signals
  – 64 kb/s or less
  – Backward compatible with stereo equipment
• Standardization
  – CfP on Spatial Audio Coding (SAC) in March 2004
  – Reference Model 0 (RM0) defined in 2005
  – Renamed “MPEG Surround” in 2005
  – Finalized in July 2006 (ISO/IEC 23003-1)
MPEG Surround Encoder
• Capture the spatial image of a multi-channel audio signal
• Generate a mono/stereo downmixed signal
MPEG Surround Decoder
• Synthesize the multi-channel output signal
• Backward compatibility
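The encoder/decoder split can be illustrated with a two-channel toy: the encoder sends a mono downmix plus one level-difference parameter per frame, and the decoder redistributes the downmix so that the output channels have the same energy ratio as the inputs. This is only a sketch of the parametric principle, not the MPEG Surround algorithm; a real codec would estimate such parameters per time and frequency band and would also need a coherence cue, which this toy omits. All names are hypothetical.

```python
import math

EPS = 1e-12  # avoids log(0) and division by zero on silent frames


def energy(x):
    return sum(v * v for v in x)


def encode_frame(left, right):
    """Return (mono downmix, channel level difference in dB) for one frame."""
    downmix = [(l + r) / 2 for l, r in zip(left, right)]
    cld_db = 10 * math.log10((energy(left) + EPS) / (energy(right) + EPS))
    return downmix, cld_db


def decode_frame(downmix, cld_db):
    """Spread the downmix over two channels with the transmitted level ratio."""
    ratio = 10 ** (cld_db / 10)                      # left/right energy ratio
    gain_left = math.sqrt(2 * ratio / (1 + ratio))   # gains chosen so that
    gain_right = math.sqrt(2 / (1 + ratio))          # gl^2 + gr^2 = 2
    return [gain_left * m for m in downmix], [gain_right * m for m in downmix]
```

A legacy player can simply play `downmix` and ignore the side parameter, which is the backward-compatibility property listed on the slide.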
Multimedia Communication System
Server-Client Structure
[Figure: Server, Network, Client]
NCTU Multimedia Test Bed
• Designed for performance evaluation of scalable media streaming systems
• Also included: MPEG-21 digital item adaptation, MPEG-4 IPMP, …
• 2002.12: NCTU donated the source code to MPEG (M9182)
• 2005.4: ISO/IEC JTC 1/SC 29, Information Technology – Multimedia Framework (MPEG-21) – Part 12: Test Bed for MPEG-21 Resource Delivery
MPEG-21 Testbed
[Figure labels: data flow, control path, IPMP, DIA]
MPEG Integrated Project
Main Project: MPEG Integrated Multimedia Platform and Applications
• SP#1: Pre- and Post-processing Techniques of Scalable Video Streaming
• SP#2: Video Streaming Server and Video Database Integration
• SP#3: MPEG Multimedia Transport, Protocols, & Network Simulator
• SP#4: Advanced Fine Granularity Scalability
• SP#5: MPEG IPMP and Robust Video Decoder Design and Simulation
• SP#6: Research in Multipoint Videoconferencing Technologies
[Figure: system diagram; blocks: video source, Pre-processor, MPEG-4/21 Scalable codec, Multimedia Server, Multimedia Database, Video Database, MPEG-21 DIA Engine, Streaming Module, MPEG IPMP, N-way Client/Server Module, Network Simulator, Multimedia/Conference Client, Client GUI, N-Way Conference Client, Post-processor]
Test Bed Demo Setup
What Next in MPEG?
• Reconfigurable Video Coding
-- Many codecs share common/similar tools
-- A collection of tools (functional units): each tool has a single, clear functionality
• Free viewpoint TV (FTV)
-- Watch (synthesize) video from “any” viewpoint
-- Scenes are captured by multiple cameras
-- Key: reduce information; simple/fast synthesis
• Scalable Speech and Audio Coding