Roadmap
• Introduction
• Intra-frame coding
• Inter-frame coding
• Object-based and scalable video coding*
  • Why object-based?
  • Motion segmentation, shape coding, R-D optimization
  • Scalability issues
  • Spatial/temporal/quality scalabilities
EE569 Digital Video Processing
Object-based Video Coding
• The waveform-based coding discussed so far uses a simple source model (e.g., H.261/263/264, MPEG-1/-2)
  • It does not consider the semantic content of the video (e.g., objects and their shapes)
• Object-based video coding identifies objects (or regions) in a video and encodes them. Potential benefits include:
  • Improved coding efficiency
  • Improved visual quality (e.g., no blocking artifacts)
  • Content description
  • Content-based interactivity
• Also called “content-dependent video coding”
• The buzzword for MPEG-4, but less successful than expected (so the important question is why it does not work so well)
Essential Tasks in Object-based Video Coding
• Object/region segmentation
  • Separate pixels based on their color, texture, and motion characteristics
  • Closely related to motion detection and segmentation
  • Intrinsically ill-defined and still waiting for a breakthrough
• 2D shape modeling and coding
  • Not all shapes are equally probable
  • Subtle implications for video coding (hidden pitfalls)
• 2D texture modeling and coding
  • Extension of existing block-based MCP to region-based MCP
  • Deformable textures (tradeoff between spatial and temporal prediction)
Object/Region Segmentation
• The major challenge in content/object-based coding
• Common approaches for segmentation in a still image: gray-level thresholding, clustering, edge detection, region growing, splitting and merging
• Object segmentation in video
  • Motion information can be utilized, but how?
  • Should we place more trust in motion cues or in spatial cues?
Motion-based Segmentation
• Motion-based segmentation: segment an image using motion information
• We can first estimate the motion field and then segment the motion field
• However, estimation and segmentation are two sides of the same coin: a good motion estimate needs the region boundaries, and a good segmentation needs the motion field
A Puzzling Example (Frame 1 vs. Frame 2)
• It is easy to convince yourself that the tree branches are moving, but how do we know the sky is still?
• What if the sky were also moving at the same speed? Because the sky is a smooth region, we would observe the same intensity patterns either way.
Implications for Video Coding
• A true motion representation might be useful for computer vision and motion perception, but it is not indispensable in video coding
• The fundamental question lies in the relationship between motion representation and video coding: how do we tolerate the uncertainty in motion?
• The same issue remains in object-based image coding: how do we tolerate the uncertainty in shape? (we will discuss this in more detail later)
Simplified Segmentation: Change Detection
• To detect the changing parts in a video from time $t_i$ to time $t_j$, we compute a difference image and threshold the difference by $T$:
$$d_{ij}(x,y) = \begin{cases} 1, & |f(x,y,t_j) - f(x,y,t_i)| > T \\ 0, & \text{otherwise} \end{cases}$$
• $d_{ij}(x,y)$ can be further processed, e.g., to remove isolated 1’s or to group 1’s that are close to each other
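The thresholded difference image and the "remove isolated 1's" post-processing can be sketched in a few lines of NumPy. This is only an illustration: the function names, the 8-neighbour rule for "isolated", and the synthetic frames are assumptions, not part of the lecture.

```python
import numpy as np

def change_mask(frame_i, frame_j, T):
    """Return d_ij: 1 where |f(x,y,t_j) - f(x,y,t_i)| > T, else 0."""
    # Cast to a signed type so that uint8 subtraction cannot wrap around.
    diff = np.abs(frame_j.astype(np.int32) - frame_i.astype(np.int32))
    return (diff > T).astype(np.uint8)

def remove_isolated(mask):
    """Clear every 1 that has no 1 among its 8 neighbours (assumed rule)."""
    h, w = mask.shape
    padded = np.pad(mask, 1)  # zero border so edge pixels have 8 neighbours
    neighbours = np.zeros((h, w), dtype=np.int32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                neighbours += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return (mask.astype(bool) & (neighbours > 0)).astype(np.uint8)

# Synthetic frames: a 2x2 "object" appears, plus one isolated noisy pixel.
f_i = np.zeros((6, 6), dtype=np.uint8)
f_j = f_i.copy()
f_j[1:3, 1:3] = 200
f_j[5, 5] = 90
d = change_mask(f_i, f_j, T=50)
cleaned = remove_isolated(d)
```

Here `d` marks five changed pixels and `cleaned` keeps only the four that belong to the 2x2 block.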
Change Detection: Pros and Cons
• Simple to implement; fast
• Detects all changes
• Detects even unwanted changes
• Positive and negative changes detected (occlusion)
• Difficult to quantify motion
• Requires a static reference frame
Change Detection: An Example
• Traffic monitoring
Without a Static Reference Frame
• Background extraction methods
  • Ad-hoc median detector (your CA#6)
  • To eliminate the impact of (small) moving objects, use the “robust estimator” approach to iteratively remove the outliers
  • More sophisticated approaches model the background with a mixture of Gaussian distributions and use graph-cut based optimization
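A minimal sketch of the two simpler background extractors listed above, assuming the frames are stacked in an array of shape (T, H, W). The outlier rule (k times a MAD-based scale) and all names are illustrative assumptions; the slide does not specify which robust estimator is meant.

```python
import numpy as np

def median_background(frames):
    """Per-pixel temporal median over a stack of shape (T, H, W)."""
    return np.median(frames, axis=0)

def robust_background(frames, n_iter=3, k=2.5):
    """Iteratively re-estimate the background while excluding outlier samples."""
    frames = np.asarray(frames, dtype=np.float64)
    bg = np.median(frames, axis=0)
    for _ in range(n_iter):
        resid = np.abs(frames - bg)
        # Robust per-pixel scale from the median absolute deviation (MAD).
        scale = 1.4826 * np.median(resid, axis=0) + 1e-6
        inlier = resid <= k * scale
        count = inlier.sum(axis=0)
        mean_in = (frames * inlier).sum(axis=0) / np.maximum(count, 1)
        # Keep the previous estimate wherever no sample qualifies as an inlier.
        bg = np.where(count > 0, mean_in, bg)
    return bg

# Synthetic clip: constant background of 100 with a small bright moving object.
clip = np.full((9, 4, 4), 100.0)
for t in range(9):
    clip[t, t % 4, 0] = 255.0
bg_median = median_background(clip)
bg_robust = robust_background(clip)
```

Both estimators recover the constant background here because the object covers any given pixel in fewer than half of the frames; the median breaks down once that no longer holds.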
Simplified Segmentation: Global Motion Estimation
• Planar homography (feature-based)
  • Homogeneous coordinates
  • Conditions for planar homography
  • Homography estimation from feature correspondences
• Hierarchical model-based GME (feature-less)
  • Directly minimize an energy function (the MSE of MCP errors)
  • Solve the optimization problem in a coarse-to-fine fashion (more robust and efficient)
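Homography estimation from feature correspondences is commonly done with the direct linear transform (DLT). The sketch below is one such instance, not necessarily the method used in the lecture; the function names and test points are assumptions, and the coordinate normalization that is usually recommended for noisy data is omitted for brevity.

```python
import numpy as np

def estimate_homography(src, dst):
    """DLT estimate of the 3x3 matrix H with dst ~ H @ src in homogeneous
    coordinates. src, dst: (N, 2) arrays, N >= 4, no three points collinear."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.asarray(rows, dtype=np.float64)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is not (near) zero

def apply_homography(H, pts):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Synthetic check with a known homography (values chosen for illustration).
H_true = np.array([[1.1, 0.02, 5.0], [-0.03, 0.95, -2.0], [1e-4, 2e-4, 1.0]])
src = np.array([[0, 0], [100, 0], [100, 80], [0, 80], [50, 30]], dtype=float)
dst = apply_homography(H_true, src)
H_est = estimate_homography(src, dst)
held_out = np.array([[20.0, 60.0]])
```

With noise-free correspondences the estimate reproduces the mapping, including at the held-out point that was not used for fitting.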
Planar Homography
Model-based GME
• Target function for minimization: the MSE of the motion-compensated prediction errors
• Solution: Gauss-Newton method
• Reference: Bergen, J. R., Anandan, P., Hanna, K. J., and Hingorani, R., “Hierarchical Model-Based Motion Estimation,” in Proc. of the Second European Conference on Computer Vision, pp. 237-252, 1992
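The slide's own equations did not survive extraction. As a hedged reminder of what a Gauss-Newton solution to this kind of problem generally looks like, here is one common formulation; the symbols $I_1$, $I_2$, $\mathbf{u}(\mathbf{x};\mathbf{a})$ and $\mathbf{g}$ are assumed notation and may differ from the lecture's.

```latex
% Energy: sum of squared motion-compensated prediction errors
E(\mathbf{a}) = \sum_{\mathbf{x}} r(\mathbf{x};\mathbf{a})^2,
\qquad
r(\mathbf{x};\mathbf{a}) = I_2\bigl(\mathbf{x} + \mathbf{u}(\mathbf{x};\mathbf{a})\bigr) - I_1(\mathbf{x})

% Linearize the residual around the current parameter estimate
r(\mathbf{x};\mathbf{a} + \delta\mathbf{a}) \approx r(\mathbf{x};\mathbf{a}) + \mathbf{g}(\mathbf{x})^{T}\,\delta\mathbf{a},
\qquad
\mathbf{g}(\mathbf{x}) = \left(\frac{\partial \mathbf{u}}{\partial \mathbf{a}}\right)^{T} \nabla I_2\bigl(\mathbf{x} + \mathbf{u}(\mathbf{x};\mathbf{a})\bigr)

% Gauss-Newton update from the normal equations
\Bigl(\sum_{\mathbf{x}} \mathbf{g}\,\mathbf{g}^{T}\Bigr)\,\delta\mathbf{a} = -\sum_{\mathbf{x}} \mathbf{g}\, r,
\qquad
\mathbf{a} \leftarrow \mathbf{a} + \delta\mathbf{a}
```

In a hierarchical scheme the update is iterated at the coarsest pyramid level first, and the resulting parameters initialize the next finer level.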
Multi-resolution GME
Numerical Example
Summary for Change Detection and Global Motion Estimation
• Motion segmentation becomes relatively easy to solve when either the camera is still or the background objects lie on a plane
• Recent advances include joint motion segmentation and estimation using level-set methods (PDE-based formulation)
• Reference: Mansouri, A.-R. and Konrad, J., “Multiple motion segmentation with level sets,” IEEE Transactions on Image Processing, vol. 12, no. 2, pp. 201-220, Feb. 2003
2-D Shape Modeling and Coding
• Bitmap coding: a binary map specifying whether or not a pixel belongs to an object
  • A special case of the general alpha-map
• Contour coding: code only the contour of the object or the region
  • Chain codes
  • Polygon approximation
  • Spline approximation
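To make the chain-code idea concrete, here is a small sketch of an 8-direction (Freeman-style) chain code for an ordered, 8-connected contour. The direction numbering, the image-style coordinates (y pointing down), and the function names are assumptions chosen for illustration.

```python
# Code k maps to a (dx, dy) step; 0 = east, numbered counter-clockwise,
# with y pointing down as in image coordinates (assumed convention).
DIRECTIONS = [(1, 0), (1, -1), (0, -1), (-1, -1),
              (-1, 0), (-1, 1), (0, 1), (1, 1)]

def chain_encode(points):
    """Encode an ordered 8-connected contour as (start point, list of codes)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Raises ValueError if consecutive points are not 8-neighbours.
        codes.append(DIRECTIONS.index((x1 - x0, y1 - y0)))
    return points[0], codes

def chain_decode(start, codes):
    """Rebuild the contour points from the start point and the codes."""
    pts = [start]
    for c in codes:
        dx, dy = DIRECTIONS[c]
        x, y = pts[-1]
        pts.append((x + dx, y + dy))
    return pts

# Closed contour of a 2x2-pixel square, traversed once.
contour = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2),
           (1, 2), (0, 2), (0, 1), (0, 0)]
start, codes = chain_encode(contour)
```

Each step costs 3 bits before entropy coding, instead of a full coordinate pair, which is why contour coding can be much cheaper than a bitmap for simple shapes.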
Image Matting (Soft Segmentation)
• Not for coding but for interactive editing
2-D Texture Modeling and Coding*
• Shape-adaptive DCT
• Shape-adaptive wavelet transform
Roadmap
• Introduction
• Intra-frame coding
  • Review of JPEG
• Inter-frame coding
  • Conditional Replenishment (CR)
  • Motion Compensated Prediction (MCP)
• Scalable video coding
  • 3D subband/wavelet coding and recent trends
Scalable vs. Multicast
• What is scalable coding?
• Multicast: foreman.yuv is encoded into one bitstream per target rate (foreman128k.cod, foreman256k.cod, foreman512k.cod, foreman1024k.cod)
• Scalable coding: foreman.yuv is encoded into a single bitstream (foreman.cod) that can be decoded at rates 128, 256, 512, and 1024
Spatial Scalability
Temporal Scalability
• 30 Hz: frames 0, 1, 2, 3, 4, 5, …
• 15 Hz: frames 0, 2, 4, 6, 8, …
• 7.5 Hz: frames 0, 4, 8, 12, …
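The dyadic frame-rate hierarchy on this slide amounts to assigning each frame index to a temporal layer. A minimal sketch, with layer numbering and function names assumed for illustration:

```python
def temporal_layer(frame_index):
    """Layer 0: frames kept at 7.5 Hz; layer 1: frames added at 15 Hz;
    layer 2: frames added at 30 Hz."""
    if frame_index % 4 == 0:
        return 0
    if frame_index % 2 == 0:
        return 1
    return 2

def frames_at(max_layer, n_frames):
    """Indices of the frames a decoder shows when it receives layers 0..max_layer."""
    return [i for i in range(n_frames) if temporal_layer(i) <= max_layer]
```

For example, `frames_at(0, 13)` returns `[0, 4, 8, 12]` and `frames_at(1, 9)` returns `[0, 2, 4, 6, 8]`, matching the 7.5 Hz and 15 Hz rows.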
SNR (Rate) Scalability
• The same sequence decoded at three quality levels: PSNRavg = 30 dB, 35 dB, and 40 dB
• PSNRi: PSNR of frame i
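Scalability via Bit-Plane Coding
• Sign-magnitude representation: $A = \pm\,(a_0 + a_1 2 + a_2 2^2 + \dots + a_7 2^7)$, with a sign bit, $a_0$ the least significant bit (LSB), and $a_7$ the most significant bit (MSB)
• Example: sign = +, $a_0 a_1 a_2 \dots a_7 = 10000001$ gives $A = +(1 + 128) = 129$
• Example: sign = -, $a_0 a_1 a_2 \dots a_7 = 00110011$ gives $A = -(4 + 8 + 64 + 128) = -204$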
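The sketch below splits a coefficient into a sign and bit-planes (a0 first, as on the slide) and reconstructs it from only the most significant planes, which is what makes a bit-plane coded stream truncatable. Function names and the `planes_received` parameter are assumptions for illustration.

```python
def to_bitplanes(value, n_bits=8):
    """Return (sign, [a0, ..., a_{n-1}]) with a0 the least significant bit."""
    sign = -1 if value < 0 else 1
    mag = abs(value)
    if mag >= (1 << n_bits):
        raise ValueError("magnitude does not fit in n_bits")
    return sign, [(mag >> k) & 1 for k in range(n_bits)]

def from_bitplanes(sign, bits, planes_received=None):
    """Reconstruct using only the `planes_received` most significant planes
    (all planes by default); missing low planes are treated as zero."""
    n = len(bits)
    if planes_received is None:
        planes_received = n
    mag = sum(bits[k] << k for k in range(n - planes_received, n))
    return sign * mag
```

With the slide's second example, `from_bitplanes(-1, [0, 0, 1, 1, 0, 0, 1, 1])` returns -204, while keeping only the two most significant planes gives -192: a coarser value obtained from a prefix of the same bitstream.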
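Why Is DPCM Bad for Scalability?
Frame number:         1      2  3  …
Base layer:           Ibase  P  P  P
Enhancement layer 1:  Ienh1  P  P  P
Enhancement layer 2:  Ienh2  P  P  P
• If each layer predicts from its own reconstruction, a decoder that receives fewer layers than the encoder assumed suffers from the drifting problem
• If every layer predicts only from the base layer, there is no drift, but the scheme suffers from coding efficiency loss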
Efficiency Gap: Fine Granular Scalability (FGS)
• Base layer: 20 kbps; enhancement layer: variable bit-rate
• H.264 with/without FGS option, Foreman sequence (5 fps)
• ~2 dB gap
3D Wavelet/Subband Coding
• 2D spatial WT (x, y) + 1D temporal WT (t)
Wavelet Video Coder
• Figure: original video frames 0 to 7 are decomposed by a temporal wavelet transform into subbands H, LH, LLH, and LLL, followed by a spatial wavelet transform and embedded quantization & entropy coding
• [Taubman & Zakhor, 1994] [Ohm, 1994] [Choi & Woods, 1999] [Hsiang & Woods, VCIP ’99] . . . and others
Motion-Adaptive 3D Wavelet Transform
• Recall the Haar transform and its lifting-based implementation
• Motion-adaptive Haar transform
• $W$, $W^{-1}$: forward and backward motion vectors
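The slide's equations were lost in extraction, so the following is a hedged sketch of how a motion-compensated Haar lifting step is usually written: a predict step produces the high band and an update step produces the low band, with the motion operator applied inside each step. The global integer shift used as the motion model, the unnormalized 1/2 update weight, and all names are illustrative assumptions.

```python
import numpy as np

def warp(frame, mv):
    """Hypothetical motion model W: a global integer shift (dy, dx) with wrap-around."""
    return np.roll(frame, shift=mv, axis=(0, 1))

def mc_haar_analysis(even, odd, mv):
    """One temporal lifting stage: returns (low band, high band)."""
    inv = (-mv[0], -mv[1])
    high = odd - warp(even, mv)          # predict odd frame from warped even frame
    low = even + 0.5 * warp(high, inv)   # update even frame with back-warped residual
    return low, high

def mc_haar_synthesis(low, high, mv):
    """Invert the lifting steps in reverse order."""
    inv = (-mv[0], -mv[1])
    even = low - 0.5 * warp(high, inv)
    odd = high + warp(even, mv)
    return even, odd

rng = np.random.default_rng(0)
even = rng.random((8, 8))
odd = warp(even, (1, 2))                 # odd frame is a pure shift of the even frame
low, high = mc_haar_analysis(even, odd, (1, 2))
rec_even, rec_odd = mc_haar_synthesis(low, high, (1, 2))
# Deliberately wrong motion: the transform is still invertible, only less efficient.
low_bad, high_bad = mc_haar_analysis(even, odd, (0, 0))
bad_even, bad_odd = mc_haar_synthesis(low_bad, high_bad, (0, 0))
```

When the motion is correct the high band is zero; when it is wrong the high band carries more energy, but the lifting structure still guarantees perfect reconstruction.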
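Lifting with Motion Compensation
• Analysis: even frames and odd frames pass through predict (P) and update (U) steps with motion compensation to give the low band and the high band
• Synthesis: the low band and the high band pass through the U and P steps in reverse to recover the even frames and the odd frames
• [Secker & Taubman, 2001] [Popescu & Bottreau, 2001]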
MC Wavelet Coding vs. H.264/AVC
• Figure: luminance PSNR (dB, 20 to 38) vs. bit-rate (Mbps, 0.2 to 2.0) for non-scalable H.264/AVC and scalable MC 5/3 wavelet [Taubman & Secker, VCIP 2003], courtesy D. Taubman
• Sequence: Mobile CIF
• H.264/AVC
  • High-complexity RD control
  • CABAC
  • PBBPBBP . . .
  • 5 prev/3 future reference frames
  • Data courtesy of M. Flierl
Wavelet Synthesis with Lossy Motion Vector
• Texture path: video in, MC wavelet transform, embedded encoding, decoder, inverse wavelet transform, video out
• Motion path: motion estimator, embedded encoding, decoder
• Each embedded encoder minimizes $J = D + \lambda R$
• [Taubman & Secker, ICIP03]
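Minimizing J = D + λR over a set of candidate truncation points is a one-liner once rate and distortion are known for each candidate. The sketch below uses made-up (rate, distortion) pairs purely for illustration; it is not the optimization procedure of the cited paper.

```python
def best_operating_point(candidates, lam):
    """candidates: iterable of (rate, distortion) pairs.
    Returns the pair that minimizes the Lagrangian cost J = D + lam * R."""
    return min(candidates, key=lambda rd: rd[1] + lam * rd[0])

# Hypothetical operating points: more bits buy lower distortion.
points = [(100, 50.0), (200, 30.0), (400, 22.0), (800, 20.0)]
choice_mid = best_operating_point(points, lam=0.01)
choice_high_lambda = best_operating_point(points, lam=0.1)
choice_low_lambda = best_operating_point(points, lam=0.001)
```

A larger λ penalizes rate more heavily and selects a cheaper point; a smaller λ moves the choice toward the high-rate, low-distortion end of the curve.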
R-D Performance with Lossy Motion Vector
• Figure: video PSNR (dB, 24 to 40) vs. bit rate (kbps, 0 to 1200) for CIF Foreman [Taubman & Secker, VCIP 2003], courtesy D. Taubman
• Curves: non-embedded single-rate; embedded wavelet coefficients with lossless motion; embedded wavelet coefficients with lossy motion
Surprising Success of ITU-T Rec. H.263
• What H.263 was developed for: analog videophone
• . . . and what it was used for: Internet video streaming
What is Streaming Video?
• Download mode: no delay bound
• Streaming mode: delay bound
• Figure: a source (cnn.com) sends data over the Internet (Domains A, B, C) through access switches to Receiver 1 and Receiver 2 (RealPlayer)
Outline
• Challenges for quality video transport
• An architecture for video streaming
  • Video compression
  • Application-layer QoS control
  • Continuous media distribution services
  • Streaming server
  • Media synchronization mechanisms
  • Protocols for streaming media
• Summary
Time-varying Available Bandwidth
• No bandwidth reservation
• Figure: data path from the source (cnn.com) through Domain A and Domain B and the access switches to the receiver (RealPlayer, 56 kb/s)
• The available rate R along the path is sometimes R >= 56 kb/s and sometimes R < 56 kb/s
Time-varying Delay
• Figure: data path from the source (cnn.com) through Domain A and Domain B and the access switches to the receiver (RealPlayer, 56 kb/s)
• Delayed packets are regarded as lost
Effect of Packet Loss
• Figure: data path from the source through Domain A and Domain B and the access switches to the receiver, shown with no packet loss and with loss of packets
• No retransmission
Unicast vs. Multicast
• Pros and cons?
Heterogeneity for Multicast
• Network heterogeneity
• Receiver heterogeneity
• Figure: a source multicasts over the Internet (Domains A, B, C) through access switches, a gateway, Ethernet, and telephone networks to Receiver 1 (64 kb/s), Receiver 2 (256 kb/s), and Receiver 3; a 1 Mb/s rate is also shown
• What quality should each receiver get?
Architecture for Video Streaming
Video Compression
• Layered coder: Layer 0 at 64 kb/s, Layer 1 at 256 kb/s, Layer 2 at 1 Mb/s
• Figure: layered video encoding/decoding, where D denotes the decoder and the decoded layers are added together
Application of Layered Video
• IP multicast
• Figure: a source multicasts layered video over the Internet (Domains A, B, C) through access switches, a gateway, Ethernet, and telephone networks to Receiver 1 (64 kb/s), Receiver 2 (256 kb/s), and Receiver 3; a 1 Mb/s rate is also shown
Application-layer QoS Control
• Congestion control (using rate control):
  • Source-based, which requires rate-adaptive compression or rate shaping
  • Receiver-based
  • Hybrid
• Error control:
  • Forward error correction (FEC)
  • Retransmission
  • Error-resilient compression
  • Error concealment
Congestion Control
• Window-based control vs. rate control (pros and cons?)
Source-based Rate Control