Overview: Video standardization concepts; History; Recent events; Standardization projects; H.263 v1
1. Advanced Video Compression Standards Gary Sullivan
(GarySull@microsoft.com)
Microsoft Corp. Software Design Engineer
ITU-T Rapporteur of Advanced Video Coding
ITU-T Recommendation H.263 Editor
Stanford, February 15, 2001
3. Formal Standards Specification available to all at little or no cost
Anyone allowed to implement
Agreement officially by consensus, not decided by a single organization’s interests
Relatively open committee with variety of participants (including hostile competitors, with no contract to support a common agenda, often meeting with formal government approval)
In practice, each standards organization tends to have its own “personality”
4. Video Coding Standardization Organizations Two organizations dominate video compression standardization:
ITU-T Video Coding Experts Group (VCEG)
International Telecommunication Union – Telecommunication Standardization Sector (ITU-T, a United Nations Organization, formerly CCITT), Study Group 16, Question 6
ISO/IEC Moving Picture Experts Group (MPEG)
International Organization for Standardization and International Electrotechnical Commission, Joint Technical Committee Number 1, Subcommittee 29, Working Group 11
5. Dynamics of the Video Standardization Process VCEG is older and more focused on conventional (esp. low-delay) video coding goals (e.g. good compression and packet-loss/error resilience)
MPEG is larger and takes on more ambitious goals (e.g. “object oriented video”, “synthetic-natural hybrid coding”, and digital cinema)
Sometimes the major organizations team up (e.g. ISO, IEC and ITU teamed up for both MPEG-2 and JPEG)
Relatively little industry consortium activity (DV and organizations that tweak the video coding standards in minor ways, such as DVD, 3GPP, 3GPP2, SMPTE, IETF, etc.)
Growing activity for internet streaming media outside of formal standardization (e.g., Microsoft, Real Networks, QuickTime)
6. The Scope of Picture and Video Coding Standardization Only the Syntax and Decoder are standardized:
Permits optimization beyond the obvious
Permits complexity reduction for implementability
Provides no quality guarantees – only interoperability
7. H.261: The Basis of Modern Video Compression ITU-T (ex-CCITT) Rec. H.261: The first widespread practical success
First design (late ‘90) embodying typical structure that dominates today: 16x16 macroblock motion compensation, 8x8 DCT, scalar quantization, and variable-length coding
Key aspects later dropped by other standards: loop filter, integer motion comp., 2-D VLC, header overhead
v2 (early ‘93) added a backward-compatible high-resolution graphics trick mode
Operated at 64-2048 kbps
Still in use, although mostly as a backward-compatibility feature – overtaken by H.263
8. Typical MC+DCT Video Coder
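A minimal sketch of the transform-and-quantize path inside a hybrid MC+DCT coder, assuming the motion-compensated prediction is supplied by the caller (motion search and variable-length coding are omitted). The function names and the plain uniform quantizer without a dead zone are illustrative assumptions, not details from any of the standards discussed here.

```python
import math

N = 8  # block size used by the 8x8 DCT designs described above


def dct_matrix(n=N):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(r) for r in zip(*m)]


C = dct_matrix()
CT = transpose(C)


def code_block(current, prediction, step):
    """Code one 8x8 block; return (quantized levels, decoder reconstruction)."""
    # Prediction residual: what motion compensation failed to explain.
    residual = [[c - p for c, p in zip(cr, pr)]
                for cr, pr in zip(current, prediction)]
    coeffs = matmul(matmul(C, residual), CT)          # 2-D forward DCT
    levels = [[int(round(v / step)) for v in row]     # uniform scalar quantizer
              for row in coeffs]
    dequant = [[lv * step for lv in row] for row in levels]
    rec_res = matmul(matmul(CT, dequant), C)          # 2-D inverse DCT
    recon = [[p + r for p, r in zip(pr, rr)]
             for pr, rr in zip(prediction, rec_res)]
    return levels, recon
```

The encoder keeps `recon`, not `current`, as the reference for the next picture, so that encoder and decoder predict from identical data.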
9. Video Coding Efficiency
10. MPEG-1: Practicality at Higher Bit Rates Formally ISO/IEC 11172-2 (‘93), developed by ISO/IEC JTC1 SC29 WG11 (MPEG) – use is fairly widespread, but mostly overtaken by MPEG-2
Superior quality to H.261 when operated at higher bit rates (≥ 1 Mbps for CIF 352x288 resolution)
Can provide approximately VHS quality between 1-2 Mbps using SIF 352x240/288 resolution
Technical features: Adds bi-directional motion prediction and half-pixel motion to H.261 design
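The two features named above can be sketched as follows, assuming bilinear half-pixel interpolation with round-half-up integer arithmetic and a simple average of the forward and backward predictions. The function names are hypothetical, and the caller is assumed to keep coordinates inside the picture.

```python
def half_pel_sample(ref, x2, y2):
    """Sample `ref` (list of rows of ints) at position (x2/2, y2/2).

    x2 and y2 are coordinates in half-pixel units; odd values fall
    between integer sample positions.
    """
    x, y = x2 // 2, y2 // 2
    fx, fy = x2 % 2, y2 % 2
    a = ref[y][x]
    if not fx and not fy:
        return a                                  # integer position
    if fx and not fy:
        return (a + ref[y][x + 1] + 1) // 2       # horizontal half position
    if fy and not fx:
        return (a + ref[y + 1][x] + 1) // 2       # vertical half position
    # Diagonal half position: average of the four surrounding samples.
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4


def bidirectional(fwd, bwd):
    """Average a forward and a backward prediction sample."""
    return (fwd + bwd + 1) // 2
```

For `ref = [[10, 20], [30, 40]]`, the diagonal half-pixel position `(1, 1)` evaluates to `(100 + 2) // 4 = 25`.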
11. MPEG-2/H.262: Even Higher Bit Rates and Interlace Formally ISO/IEC 13818-2 & ITU-T H.262, developed (‘94) jointly by ITU-T and ISO/IEC SC29 WG11 (MPEG) – Now in wide use for DVD and standard and high-definition DTV (the most commonly used video coding standard)
Primary new technical features: support for interlaced-scan pictures and scalability
Essentially the same as MPEG-1 for progressive-scan pictures, and MPEG-1 forward compatibility required
Not especially useful below 4 Mbps (range of use normally 5-30 Mbps)
12. H.263: The Next Generation ITU-T Rec. H.263 (v1: 1995): The next generation of video coding performance, developed by ITU-T – the current best standard for practical video telecommunication (has overtaken H.261 as dominant videoconferencing codec)
Superior to H.261 at all bit rates
Wins by a factor of two at very low rates
Version 2 (late 1997/early 1998) and version 3 (2000) developed later
13. MPEG-4: Baseline H.263 and Many Creative Extras MPEG-4 (v1: early 1999), formally ISO/IEC 14496-2: Contains the H.263 design and adds all prior features and various creative new extras
Includes segmented coding of shapes, zero-tree wavelet coding of still textures, coding of synthetic and semi-synthetic content, etc.
v2 (early 2000) & v3 (early 2001) later added
14. MPEG-4 and H.263 Standardization Dynamics MPEG-4 project launched soon after H.263 completed
MPEG-4 project was very ambitious and was planned to be significantly different from H.263
Compatibility with H.263 was not initially planned in MPEG-4 (although it eventually turned out to be significantly compatible!)
ITU-T decided to extend H.263 quickly and compatibly to get the features it wanted, rather than join the longer, more ambitious, potentially incompatible MPEG-4 effort
Much cross-fertilization of ideas and people in projects
15. Detailed Recent History In Video Coding Standardization ITU-T Events
H.263v1 completed late ‘95
H.263+ project (H.263 v2) technically final Sept ‘97
H.263++ project (H.263 v3) technically final July ‘00
H.26L project underway (test version available)
ISO/IEC Events
MPEG-4 v1 completed early ’99
MPEG-4 v2 completed early ’00
MPEG-4 v3 completed early ‘01
Potential for new work under evaluation
16. H.263++ New Version 3 Features, Part 1 of 2 Annex U: Fidelity enhancement by macroblock and block-level reference picture selection – a significant improvement in compression quality
Annex V: Packet Loss & Error Resilience using data partitioning with reversible VLCs (roughly similar to MPEG-4 data partitioning, but improved by using reversible coding of motion vectors rather than coefficients)
17. H.263++ New Version 3 Features, Part 2 of 2 Annex W: Additional Supplemental Enhancement Information
IDCT Mismatch Elimination (specific fixed-point fast IDCT)
Arbitrary binary user data
Text messages (arbitrary, copyright, caption, video description, and URI)
Error Resilience:
Picture header repetition (current, previous, next+TR, next-TR)
Spare reference pictures for error concealment
Interlaced field indications (top & bottom)
20. MPEG-4 Version 3 Just Completed (part 1 of 2) “Studio Profile”
Various additions oriented toward professional use of video within specialized studio environments
Adds 4:2:2 and 4:4:4 sampling structures
Adds more MPEG-2 elements to MPEG-4
“Fine Granularity Scalability Streaming Video Profile”, a new form of scalable video coding
Uses a scalable enhancement layer
Temporal prediction in enhancement layer is stopped to prevent temporal error propagation
Enhancement layer coded by bit-planes to form a “progressive-transmission” bitstream
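The bit-plane idea behind the enhancement layer can be sketched as below: coefficient magnitudes are sent most significant plane first, so a decoder that receives only the leading planes still gets a coarser version of every coefficient. This is a simplified illustration with hypothetical function names; signs are passed separately here, and the actual MPEG-4 FGS syntax is not reproduced.

```python
def to_bit_planes(coeffs, num_planes):
    """Split coefficient magnitudes into bit-planes, most significant first."""
    return [[(abs(c) >> p) & 1 for c in coeffs]
            for p in range(num_planes - 1, -1, -1)]


def from_bit_planes(planes, signs, num_planes):
    """Rebuild coefficients from however many leading planes were received."""
    mags = [0] * len(signs)
    for idx, plane in enumerate(planes):
        shift = num_planes - 1 - idx          # weight of this plane
        for i, bit in enumerate(plane):
            mags[i] |= bit << shift
    return [s * m for s, m in zip(signs, mags)]
```

Truncating the list of planes at any point yields a valid, lower-fidelity reconstruction, which is what makes the bitstream "progressive".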
21. MPEG-4 Version 3 Just Completed (part 2 of 2) “Advanced Simple Profile”, a combination of v1 features, containing:
“Simple Profile” features
B pictures
MPEG-2-style quantization
Interlace features (at higher levels only)
¼-pel motion
Global motion comp
Single stream support in new “level 0”
22. ITU-T VCEG H.26L Project Goals (Completion 2002) Compression beyond capability of H.263vN
Real-time low-cost complexity
Delay reduction
Enhanced error and packet loss resilience
Bit-rate adaptivity (e.g. scalability & BR reduction)
Spatio-temporal resolution adaptivity
Robustness to source material behavior
23. H.26L Status Test Model Long-Term Number 6: Designed January ’01 (Eibsee), description and software soon available on the ‘net
TML-5 software and spec available (Geneva, November ’00)
Gain goal over 1999 standards: 50% savings in bits for same fidelity! (at all bit rates)
24. The H.26L TML-6 Design, Part 1 of 4 Still using a hybrid of DPCM and transform coding as in prior standards. Common elements include:
16x16 macroblocks
Conventional sampling of chrominance and association of luminance and chrominance data
Block motion displacement
Block transforms (not wavelets or fractals)
Scalar quantization
Variable-length coding
25. The H.26L TML-6 Design, Part 2 of 4 Motion Compensation:
Multiple reference pictures (per H.263++ Annex U)
B picture support (per several prior standards)
Multihypothesis concept being evaluated
1/4 sample accuracy motion (sort of per MPEG-4, could possibly go to 1/8 pel)
6x6 tap filtering to 1/2 sample accuracy, bilinear filtering to 1/4 sample accuracy
Various block sizes and shapes for motion compensation (7 segmentations of the macroblock)
“Funny position” with heavier filtering
Affine motion under consideration
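The two-stage interpolation described above (a longer filter to reach 1/2 sample positions, then bilinear averaging to reach 1/4 sample positions) can be sketched in one dimension. The tap values (1, -5, 20, 20, -5, 1)/32 are an assumption: they are the taps commonly associated with this family of designs and are not stated in the slides. Function names are hypothetical.

```python
TAPS = (1, -5, 20, 20, -5, 1)  # assumed 6-tap filter; taps sum to 32


def clip8(v):
    """Clamp to the 8-bit sample range."""
    return max(0, min(255, v))


def half_sample(row, i):
    """Value halfway between row[i] and row[i + 1].

    Requires indices i - 2 through i + 3 to be valid.
    """
    acc = sum(t * row[i - 2 + k] for k, t in enumerate(TAPS))
    return clip8((acc + 16) >> 5)      # divide by 32 with rounding


def quarter_sample(row, i):
    """Value a quarter of the way from row[i] toward row[i + 1]."""
    return (row[i] + half_sample(row, i) + 1) >> 1
```

A two-dimensional version would apply the same filter separably along rows and columns; only the one-dimensional case is shown here.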
26. The H.26L TML-6 Design, Part 3 of 4 Intra Coding Structure:
Directional spatial prediction (6 types for luma, one for chroma)
Alterations under consideration
Transform
Variable block size for intra (16x16, 8x8, 4x4)
Technically not exactly a DCT, but an integer transform closely approximating a DCT
Based primarily on 4x4 transform size (all prior standards used 8x8)
Expanded to 8x8 for chroma by 2x2 DC transform
Adaptive block size under consideration
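A sketch of what "an integer transform closely approximating a DCT" can look like at the 4x4 size. The matrix entries (13, 17, 7) are an assumption chosen for illustration and are not taken from the slides. The property that matters is that the rows are mutually orthogonal with equal norm (each row has squared norm 676), so the inverse is the transpose up to a fixed scale and encoder and decoder can match exactly in integer arithmetic.

```python
# Assumed 4x4 integer matrix with DCT-like row shapes (illustrative only).
A = [[13, 13, 13, 13],
     [17, 7, -7, -17],
     [13, -13, -13, 13],
     [7, -17, 17, -7]]

NORM = 676  # squared norm shared by every row of A


def matmul(a, b):
    """Plain integer matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(r) for r in zip(*m)]


AT = transpose(A)


def forward(block):
    """2-D forward transform: A * X * A^T, all in integers."""
    return matmul(matmul(A, block), AT)


def inverse(coeffs):
    """2-D inverse transform: A^T * Y * A divided by NORM squared."""
    scaled = matmul(matmul(AT, coeffs), A)
    return [[v // (NORM * NORM) for v in row] for row in scaled]
```

Without quantization the round trip is exact, which is the contrast with a floating-point DCT, where different inverse implementations can drift apart. In a real codec the scale factor would be folded into the quantizer rather than divided out separately.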
27. The H.26L TML-6 Design, Part 4 of 4 Two inverse scan patterns
Logarithmic step size control
Smaller step size for chroma (per H.263 Annex T)
Universal variable-length coding (configurability under consideration)
Adaptive arithmetic coding under strong consideration
In-loop deblocking filter
Distinct Network Adaptation Layer (NAL) design for network transport
Inter-sequence transitional pictures under consideration
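To illustrate what a "universal" variable-length code means, the sketch below uses an order-0 Exp-Golomb code: one regular codeword structure serves every syntax element, so no per-element code tables are needed. This is an assumption made for illustration; the exact bit layout of the UVLC in TML-6 may differ, and the function names are hypothetical.

```python
def ue_encode(n):
    """Order-0 Exp-Golomb codeword for a non-negative integer, as a bit string."""
    value = n + 1
    prefix_len = value.bit_length() - 1   # number of leading zeros
    return "0" * prefix_len + format(value, "b")


def ue_decode(bits, pos=0):
    """Decode one codeword starting at bits[pos]; return (value, next_pos)."""
    zeros = 0
    while bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1             # prefix zeros + (zeros + 1) info bits
    return int(bits[pos + zeros:end], 2) - 1, end
```

Small values get short codewords (`0` maps to `1`, `1` maps to `010`), and the decoder finds the codeword length by counting leading zeros.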
28. Future Work in MPEG MPEG to assess new video technology and address digital cinema needs
Calls for proposals issued
Tests to be conducted in next few months
ITU-T VCEG bringing H.26L as reference
Exploring potential future cooperative work between VCEG and MPEG
32. Windows Media Technologies Video-Related Features WMV 8 Codec: A big step forward in compression performance
Screen Codec: Outstanding compression
(Near) lossless!
640 x 480, 10 fps < 20 kbps (modem)
800 x 600, 15 fps < 45 kbps (ISDN/LAN)
Advanced Streaming Format (ASF) file format
Digital Rights Management (DRM): Critical for Content Providers
33. Future Trends Prediction is difficult - especially of the future.
– Bohr (1885-1962)
If we do not succeed, then we run the risk of failure.
– Quayle (Phoenix Rep. Forum, 1990)
34. Principles of Rate-Distortion Theory Errors using inadequate data are much less than those using no data at all.
– Charles Babbage (1791-1871)
A little inaccuracy sometimes saves tons of explanation.
– Saki (H.H. Munro, 1870-1916, The Comments of Moung Ka)
35. On Rate-Distortion Optimization Rate-distortion optimization and searching techniques will increase in importance
Most enhancements take the form of an expanded range of choices
More choices implies more need for searching and optimization
Lagrange multiplier optimization provides an understandable, straightforward framework
Recent understanding of coupling of step size and Lagrange multiplier makes it straightforward
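The Lagrangian framework can be sketched as a mode decision that minimizes J = D + λR over the available choices. The coupling λ = 0.85·Q², with Q an H.263-style quantizer parameter, is an assumption used for illustration; the candidate names and the distortion and rate figures in the usage note are made up.

```python
def lagrange_multiplier(q):
    """Assumed coupling between quantizer parameter and multiplier."""
    return 0.85 * q * q


def choose_mode(candidates, q):
    """Pick the candidate with the lowest Lagrangian cost.

    candidates: iterable of (name, distortion, rate_bits) tuples, where
    distortion is e.g. a sum of squared differences.
    """
    lam = lagrange_multiplier(q)
    return min(candidates, key=lambda c: c[1] + lam * c[2])[0]
```

With hypothetical candidates `("skip", 1200, 1)`, `("inter16x16", 400, 30)` and `("intra", 150, 120)`, a fine quantizer (`q = 1`) favors the low-distortion intra mode, while a coarse one (`q = 10`) makes bits expensive and favors skip. Tying λ to the step size removes a free parameter from the search.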
36. Some Future Projections Coding Efficiency will continue to improve (Proof by existence):
4x4 coding
Long-term memory
Enhanced motion accuracy
Enhanced motion models
Enhanced intra coding
People continue to come up with good ideas (and relatively predictable ones!)
37. What Area will Yield the Most Improvement? Although “prediction is difficult”, it is the area that will yield the most performance improvement
Today’s coded motion model is primitive
Several motion model improvement areas have yet to be fully exploited
Waveform difference coding gain is limited
38. Won’t This be Unnecessary when Megabits become free? The need for better compression will not be reduced
Got more bits? Give me higher resolution.
Got more bits? Give me more channels.
Is improvement worth the effort? 20% of a lot is a lot.
Bit rates have a slower doubling time than computing power.
39. Increasing “Layers” of Standardization In olden days: Design a system for a network with a video coder as part of that system design.
Now:
Standardize a “language” of syntax with maximum flexibility and a rich feature set
Standardize how to configure the standard
Standardize how to encapsulate the standard data on a network
Standardize digital rights management for the data
Standardize the system to carry the data
40. Other Kinds of Layers Continuing interest in Layered coding:
Scalability in MPEG-2, H.263+, and MPEG-4
Layered coding ongoing work (Microsoft, MPEG Enhanced FGS)
Mixed success toward products
Motivation 1: The bit-rate scalability dream
Motivation 2: The limitations of resolution
Motivation 3: The error resilience need
41. Conclusions There will be plenty of need for further work.
There will be plenty of need for more processing power.
There will be plenty of need for more bits.
There will be plenty of need for good ideas.
And those good ideas will come.
Dream no small dreams, for they have no power to move the hearts of men.
– Goethe (1749-1832)