A Proxy Smoothing Service for Variable-Bit-Rate Streaming Video Jennifer Rexford AT&T Labs - Research Florham Park NJ http://www.research.att.com/~jrex Joint work with Subhabrata Sen, Don Towsley, and Andrea Basso
Outline • Background and motivation • Burstiness of compressed video streams • Smoothing techniques for stored video • Online smoothing of variable-bit-rate video • Sliding-window smoothing algorithm • Performance evaluation on MPEG traces • Integration of smoothing with prefix caching • Caching initial frames of popular video streams • Resource allocation across multiple streams • Prototype proxy smoothing service • Software design of proxy service in Windows NT • MPEG-2 PC-based video streaming testbed • Conclusions and ongoing work
Video Streaming Applications • Live, interactive video • Video teleconferencing, video phones, etc. • Tight delay constraints to support interactivity • Stored, non-interactive video • Movies, distance learning, Web videos, etc. • Video recorded in advance; loose delay constraints • Live, non-interactive video • Course lectures, news, sporting events, conferences • Video not recorded in advance; loose delay constraints
Challenges of Video Streaming • High bandwidth requirements of compressed video • 4-6 Megabits/second for high quality MPEG-2 streams • Burstiness of frame sizes on several time scales • MPEG group-of-pictures structure (I, P, B frames) • Differences in action and detail within/across scenes • Bandwidth limitations on clients and links • 10 or 100 Mbps shared local area network • 27 Mbps cable channel, 1.5 Mbps ADSL • Lack of end-to-end control of path from source • Poor delay, throughput, and loss in the Internet
Approaches to Handling Variability • Constant-bit-rate encoding of each stream • Adjust quality of encoding to stay at constant rate • Quality degradation during scenes with action & detail • Statistical multiplexing of variable rate streams • Rely on mixing to reduce the aggregate peak rate • Limited effectiveness on access links • Selective discard of packets/frames in stream • Discard packets/frames during transient congestion • Noticeable degradation in video quality • Transcoding or layered encoding to reduce bit rate • Re-encode the video stream at different quality at proxy • Quality degradation; hard to transcode at link speeds
Smoothing Stored Video For prerecorded video streams: • All video frames stored in advance at server • Prior knowledge of all frame sizes (f_i, i = 1, 2, ..., n) • Prior knowledge of client buffer size (b) • Workahead transmission into client buffer [Figure: server sending frames 1, 2, ..., n into a b-byte client buffer]
Smoothing Constraints Given frame sizes {f_i} and buffer size b: • Buffer underflow constraint (L_k = f_1 + f_2 + … + f_k) • Buffer overflow constraint (U_k = min(L_k + b, L_n)) • Find a schedule S_k between the constraints (L_k <= S_k <= U_k) • O(n) algorithm minimizes peak and variability [Figure: number of bytes vs. time (in frames), showing curves L and U and a schedule S with its rate changes]
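The two constraint curves above can be sketched in a few lines. This is a minimal illustration, not the O(n) smoothing algorithm itself: the function names and the sample frame sizes are hypothetical, and the code only builds L and U and checks whether a given cumulative schedule stays between them.

```python
from itertools import accumulate

def smoothing_constraints(frame_sizes, b):
    """Return (L, U), the cumulative underflow and overflow curves."""
    L = list(accumulate(frame_sizes))        # L_k = f_1 + ... + f_k
    total = L[-1]                            # L_n, size of the whole video
    U = [min(lk + b, total) for lk in L]     # U_k = min(L_k + b, L_n)
    return L, U

def is_feasible(schedule, L, U):
    """True if the cumulative schedule S_k satisfies L_k <= S_k <= U_k for all k."""
    return all(l <= s <= u for s, l, u in zip(schedule, L, U))

# Hypothetical frame sizes (bytes) and client buffer size, for illustration only.
frames = [3, 1, 1, 7, 2]
L, U = smoothing_constraints(frames, b=4)

# A workahead schedule that sends 3 bytes per slot, then 2 in the last slot.
smooth = [3, 6, 9, 12, 14]
print(is_feasible(smooth, L, U))
```

In this toy trace the unsmoothed stream peaks at 7 bytes per slot, while the feasible schedule never sends more than 3.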
Limitations of Smoothing Model • Assumes prerecorded stored video • but… need to support live and prerecorded video • Assumes smoothing is performed by server • but… server is in the domain of another provider • Assumes end-to-end control of the network • but… the Internet is decentralized • Assumes server knows the client buffer size • but… the client may be in a different domain
Online Smoothing Source or proxy can delay the stream by w time units. Larger window w reduces burstiness, but requires: • Larger buffer at the source/proxy • Larger processing load to compute schedule • Larger playback delay at the client [Figure: streaming video enters the source/proxy, which sends the stream with delay w to a client with a b-byte buffer]
Online Smoothing Model • Arrival of A_i bits to proxy by time i (in frames) • Smoothing buffer of B bits at proxy • Smoothing window (playout delay) of w frames • Playout of D_{i-w} bits by client by time i • Playout buffer of b bits at client • Transmission of S_i bits by proxy by time i [Figure: proxy-client model labeled with A_i, B, S_i, b, and D_{i-w}]
Online Smoothing • Must send enough to avoid underflow at client • S_i must be at least D_{i-w} • Cannot send more than the client can store • S_i must be at most D_{i-w} + b • Cannot send more than the data that has arrived • S_i must be at most A_i • Must send enough to avoid overflow at proxy • S_i must be at least A_i - B • Combined constraint: max{D_{i-w}, A_i - B} <= S_i <= min{D_{i-w} + b, A_i}
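The four conditions above combine into one lower and one upper bound per time slot. The sketch below evaluates them for a cumulative arrival vector. It assumes, for illustration only, that the client plays out at time i exactly what had arrived at the proxy by time i-w (so D_{i-w} is taken as A_{i-w}, and zero before playback starts). The function name and sample numbers are hypothetical.

```python
def bounds_over_time(A, w, B, b):
    """Return the (lower, upper) bounds on S_i for each time slot i.

    A : cumulative arrivals at the proxy, A[i] = bits arrived by time i
    w : smoothing window (playout delay) in frames
    B : proxy smoothing buffer, b : client playout buffer
    """
    bounds = []
    for i, a in enumerate(A):
        # Assumption: cumulative playout by time i equals arrivals by time i - w.
        d = A[i - w] if i >= w else 0
        lower = max(d, a - B)      # avoid client underflow and proxy overflow
        upper = min(d + b, a)      # avoid client overflow; cannot send unarrived data
        if lower > upper:
            raise ValueError(f"no feasible S_i at time {i}: buffers too small")
        bounds.append((lower, upper))
    return bounds

# Hypothetical cumulative arrivals, for illustration only.
arrivals = [4, 6, 7, 12, 14]
limits = bounds_over_time(arrivals, w=2, B=8, b=5)
print(limits)
```

The `ValueError` branch covers the case where A_i - B exceeds D_{i-w} + b, that is, where the proxy and client buffers together cannot hold the data still awaiting playout.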
Online Smoothing Constraints Source/proxy knows only the w frames ahead of the current time t; the future beyond t+w-1 is unknown. The smoothing constraints are modified as more frames arrive. [Figure: number of bytes vs. time (in frames), with curves L and U drawn from t to t+w-1 and a "?" beyond]
Smoothing Star Wars • MPEG-1 Star Wars, 12-frame group-of-pictures • Max frame 23160 bytes, mean frame 1950 bytes • Client buffer b = 512 kbytes [Figure: three rate plots titled "GOP averages", "30-second window", and "2-second window"]
Reducing Computational Complexity • No need to compute schedule at every time unit • Limited information from new frame arrivals • Limited impact on trajectory of the schedule • Execute online algorithm every a time units • Perform O(w) work every a time units • Limit number of rate changes • Performance implications • Very small increases in peak and variance of rates • Setting a = w/2 performs almost as well as a = 1
Parameters in Smoothing Model • Algorithm parameters • Window w (in number of frame slots) • Client buffer size b (in bytes) • Source/proxy buffer size B (in bytes) • Computation interval a (in frames) • Frame-size prediction interval p (in frames) • Performance metrics • Peak rate of the smoothed stream • Coefficient of variation (standard-deviation/mean) • Effective bandwidth (given buffer and loss rate)
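Two of the performance metrics listed above are simple to compute from a sequence of per-slot transmission rates. The sketch below is illustrative: the function names and the two sample rate sequences are hypothetical, and it uses the population standard deviation, which is one reasonable reading of "standard-deviation/mean". Effective bandwidth is not covered.

```python
from statistics import mean, pstdev

def peak_rate(rates):
    """Largest per-slot transmission rate in the stream."""
    return max(rates)

def coefficient_of_variation(rates):
    """Standard deviation of the rates divided by their mean."""
    return pstdev(rates) / mean(rates)

# Hypothetical per-slot rates before and after smoothing (same total bytes).
raw = [3, 1, 1, 7, 2]
smoothed = [3, 3, 3, 3, 2]

for name, rates in (("raw", raw), ("smoothed", smoothed)):
    print(name, peak_rate(rates), round(coefficient_of_variation(rates), 3))
```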
Peak Rate vs. Window Size (varying client buffer size for MPEG-1 Wizard of Oz) • Dramatic decrease in bandwidth variability • Online algorithm approaches offline scheme • Ten-second window gives most of the gain
Peak Rate vs. Client Buffer (varying window size for MPEG-1 Wizard of Oz) • Significant reductions with a few Mbytes of buffer • Diminishing returns for larger client buffer sizes • Window size w should scale with buffer size b
Proxy vs. Client Buffer (varying prediction under 512-kbyte total buffer & 30-frame window) • Need buffer at each end for good performance • Even split of buffer for large p, more at proxy for small p • Simple prediction schemes are very effective
Prefix Caching to Avoid Start-Up Delay • Avoid start-up delay for prerecorded streams • Proxy caches initial part of popular video streams • Proxy starts satisfying client request more quickly • Proxy requests remainder of the stream from server • Smooth over large window without large delay • Use prefix caching to hide other Internet delays • TCP connection from browser to server • TCP connection from player to server • Dejitter buffer at the client to tolerate jitter • Retransmission of lost packets • Applies to “point-and-click” Web video streams
New Questions • Video streaming protocol • How to get the proxy in the path? • How to receive an initial copy of the prefix? • How to retrieve the remaining frames of the video? • Smoothing model • What changes in the smoothing constraints? • What changes in the basic performance properties? • Proxy resource allocation • How much prefix is needed to hide Internet delays? • How to allocate between caching and smoothing? • How to allocate resources across multiple streams?
Protocol Issues • Ensuring that requests go through the proxy • Configuration of proxy in client browser or player • Placement of transparent proxy in the path • Caching of the initial frames of the video • Server replication of the prefix • Proxy prefetching of the prefix • Proxy caching of prefix after first request • Transparent retrieval of remaining frames • Range request operation in HTTP 1.1 • Absolute positioning in RTSP
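As one illustration of retrieving the remaining frames with an HTTP/1.1 range request, the sketch below builds (but does not send) a request for everything after a cached prefix. The helper name, the URL, and the prefix length are hypothetical, and mapping a frame boundary to a byte offset is assumed to be handled elsewhere.

```python
from urllib.request import Request

def remainder_request(url, prefix_bytes):
    """Build an HTTP request for the bytes that follow a cached prefix.

    An open-ended byte range "bytes=N-" asks the server for everything
    from offset N to the end of the resource.
    """
    return Request(url, headers={"Range": f"bytes={prefix_bytes}-"})

# Hypothetical video URL and a 1-Mbyte cached prefix, for illustration only.
req = remainder_request("http://example.com/video.mpg", 1_048_576)
print(req.get_header("Range"))
```

A server that honors the range replies with status 206 (Partial Content), so the proxy can append the response to the prefix it already holds.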
Changes to Smoothing Model • Separate parameter s for client start-up delay • Prefix cache stores the first w-s frames • Arrival vector A_i includes cached frames • Prefix buffer does not empty after transmission • Send entire prefix before overflow of b_s • Frame sizes may be known in advance (cached) [Figure: proxy-client model labeled with A_i, b_p, b_s, S_i, b_c, and D_{i-s}]
Performance Evaluation • Comparison to original online smoothing model • Pro: can have large window and small start-up delay • Pro: performance is virtually indistinguishable • Con: storing prefix nearly doubles buffer requirement • Con: may be difficult to smooth at beginning of video • Allocation of prefix and smoothing buffers • Small prefix buffer limits size of smoothing window • Small window w restricts workahead smoothing • Large prefix buffer limits size of smoothing buffer • Small b_s requires aggressive transmission schedule
Peak Rate vs. Window Size (varying total proxy buffer size for MPEG-1 Wizard of Oz) • Convex, cup-shaped curve of peak rate vs. buffer • Simple binary search for optimal allocation • Heuristic: pick largest w that does not constrain b_s
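Because the curve is convex and cup-shaped, its minimum can be located by comparing neighboring points rather than scanning every allocation. The sketch below shows one such binary search over integer allocations. The function name and the quadratic stand-in for the measured peak-rate curve are hypothetical.

```python
def argmin_convex(f, lo, hi):
    """Return an integer x in [lo, hi] that minimizes a convex function f.

    If f(mid) <= f(mid + 1) the curve is already rising at mid, so a
    minimum lies at or to the left of mid; otherwise it lies to the right.
    """
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) <= f(mid + 1):
            hi = mid
        else:
            lo = mid + 1
    return lo

# Hypothetical cup-shaped peak-rate curve over prefix-buffer allocations 0..20.
def toy_peak_rate(prefix_units):
    return (prefix_units - 6) ** 2 + 10

best = argmin_convex(toy_peak_rate, 0, 20)
print(best, toy_peak_rate(best))
```

Each evaluation of the real curve would mean running the smoothing algorithm for one allocation, so the logarithmic number of probes is the point of the search.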
Peak Rate vs. Prefix Buffer Size (varying total proxy buffer size for MPEG-1 Wizard of Oz)
Allocating Resources Across Streams • Performance issues • Limited buffer (M) and/or bandwidth (B) at proxy • Collection of V videos with different popularity • Videos with different sequences of frame sizes • Optimization problem • Allocate prefix buffer b_p for each video v = 1, …, V • Allocate smoothing buffer b_s for each of n_v requests • Obey constraint on buffer (M) or bandwidth (B) • Minimize the usage of the other resource (M or B)
Simplifying the Problem • Complex resource allocation problem • Assign b_p, b_s, and w for each video v • Buffer requirement: sum_v {b_p(v) + n_v * b_s(v)} • Bandwidth requirement: sum_v {n_v * peak(v)} • Reduce problem to selecting w for each video • Select same b_s and w across all requests for v • Select prefix buffer b_p as first w-s frames • Select b_s as max smoothing buffer for window w
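Once everything depends only on the window chosen for each video, the buffer-constrained version of the problem can be written compactly. The notation below is a restatement rather than something taken from the slides: w_v denotes the window selected for video v, and the prefix buffer, smoothing buffer, and peak rate are treated as functions of w_v.

```latex
\min_{w_1,\dots,w_V} \; \sum_{v=1}^{V} n_v \,\mathrm{peak}_v(w_v)
\quad \text{subject to} \quad
\sum_{v=1}^{V} \Bigl[\, b_p^{(v)}(w_v) + n_v \, b_s^{(v)}(w_v) \Bigr] \le M
```

The bandwidth-constrained version swaps the roles: minimize the buffer sum subject to the bandwidth sum being at most B.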
Greedy Algorithm • Further simplifying the problem • Selecting w determines b_p(v), b_s(v), and peak(v) • Consider the n_v * peak(v) vs. b_p(v) + n_v * b_s(v) curve • Curve is piecewise-linear, convex, non-increasing • Greedy algorithm for buffer constraint M • Select the video with steepest initial slope • Assign buffer space to this video for max gain • Repeat until reaching the buffer constraint M • Greedy algorithm for bandwidth constraint B • Repeat until the bandwidth no longer exceeds constraint B
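The buffer-constrained greedy procedure can be sketched with a priority queue over each video's next curve segment. This is an illustrative reading of the slide, not the authors' implementation: the function name, the breakpoint representation, and the sample curves are hypothetical, and the loop stops at the first segment that no longer fits within M.

```python
import heapq

def greedy_allocate(curves, M):
    """Greedy buffer allocation across videos.

    curves[v] is a list of (buffer, bandwidth) breakpoints for video v,
    with buffer increasing and bandwidth non-increasing and convex.
    Returns (choice, used): the breakpoint index picked per video and
    the total buffer consumed.
    """
    choice = [0] * len(curves)
    used = sum(c[0][0] for c in curves)      # every video starts at its first point
    if used > M:
        raise ValueError("buffer M cannot cover the minimum allocations")

    heap = []

    def push_next_segment(v):
        k = choice[v]
        if k + 1 < len(curves[v]):
            (b0, r0), (b1, r1) = curves[v][k], curves[v][k + 1]
            gain = (r0 - r1) / (b1 - b0)     # bandwidth saved per unit of buffer
            heapq.heappush(heap, (-gain, v)) # max-heap via negated gain

    for v in range(len(curves)):
        push_next_segment(v)

    while heap:
        _, v = heapq.heappop(heap)           # steepest remaining segment
        k = choice[v]
        extra = curves[v][k + 1][0] - curves[v][k][0]
        if used + extra > M:
            break                            # reached the buffer constraint
        used += extra
        choice[v] = k + 1
        push_next_segment(v)
    return choice, used

# Hypothetical curves for two videos, for illustration only.
curves = [
    [(0, 10), (2, 6), (4, 5)],   # video 0: slopes 2.0, then 0.5
    [(0, 8), (1, 5), (3, 4)],    # video 1: slopes 3.0, then 0.5
]
choice, used = greedy_allocate(curves, M=4)
total_bandwidth = sum(curves[v][k][1] for v, k in enumerate(choice))
print(choice, used, total_bandwidth)
```

Convexity is what makes the heap sufficient: within one video the segments get flatter as buffer grows, so only the next segment of each video ever needs to be considered.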
Illustration of Greedy Algorithm [Figure: bandwidth vs. buffer curves for video 1 and video 2, with segments labeled #1 through #6]
Building a Smoothing Proxy • Performance results • Memory: a few megabytes of RAM is sufficient • CPU: 1-2 msec to smooth 30 sec (300 MHz PC) • Bandwidth: 2-4 Mbps feasible on personal computer • Solution with off-the-shelf components • 300 MHz Pentium Pro with 192 megabytes of RAM • Input and output on 10 megabit/second Ethernet • Windows NT operating system with WinSock 2.0
Reality Sets In • Video stream is packetized, not a fluid • Smoothing constraints must be applied to packets • Proxy cannot transmit the stream at arbitrary rates • System does not have support for traffic shaping • Cannot control the inter-packet spacing at fine scale • E.g., 2 msec spacing for 15-packet frames (30 fps) • Interrupt latency, timer jitter, and data copying • Limited control over timer expiration times • Latency in processing I/O and timer operations • Need to avoid extra copying of video frames
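The 2 msec figure follows from spreading each frame's packets evenly over one frame time. A small check, with a hypothetical function name:

```python
def inter_packet_spacing_ms(frames_per_second, packets_per_frame):
    """Even spacing between packets, in milliseconds."""
    packets_per_second = frames_per_second * packets_per_frame
    return 1000.0 / packets_per_second

# 30 frames/s with 15 packets per frame gives 450 packets/s.
spacing = inter_packet_spacing_ms(30, 15)
print(f"{spacing:.2f} ms")
```

At roughly 2.2 ms between packets, a timer with a granularity of tens of milliseconds cannot pace the stream on its own.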
Time-Sharing the Processor • Reception of incoming packets • Smooth over more frames by receiving often • Avoid double-copy from kernel to user space • Avoid the worst-case scenario of overflow • Computation of smooth schedule • Must run often enough to maximize smoothing • Fortunately, does not need to read or write data • Transmission of packets according to schedule • Must run often enough to control packet spacing • Avoid the bad case of sending a large burst • Avoid the worst case of client underflow
Key Design Decisions • Single thread of control • No operating system control over fine-grain sharing • High-performance counter for timing operations • Timers are too inaccurate (tens of milliseconds) • How often should the counter be checked? • Overlapped I/O to avoid double copying • Receive and send directly to/from the user-space buffer • How many outstanding sends and receives? • Explicit pacing of packet transmissions • How often should the send routine be invoked?
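The combination of a single thread, a high-resolution counter, and explicit pacing can be illustrated with a busy-polling loop. This is only a sketch of the idea and not the Windows NT implementation: Python's `time.perf_counter` stands in for the platform's high-performance counter, and `paced_send`, `send`, and the packet contents are hypothetical.

```python
import time

def paced_send(packets, spacing_s, send, clock=time.perf_counter):
    """Send packets at evenly spaced deadlines by polling a fine-grained clock.

    packets   : iterable of packet payloads
    spacing_s : target inter-packet spacing in seconds
    send      : callable invoked once per packet (hypothetical transmit hook)
    clock     : monotonic high-resolution time source
    """
    start = clock()
    for k, packet in enumerate(packets):
        deadline = start + k * spacing_s
        while clock() < deadline:
            pass                  # spin; a real proxy would receive packets here
        send(packet)

# Illustration: record send times for three dummy packets at 2 ms spacing.
log = []
paced_send([b"p0", b"p1", b"p2"], 0.002,
           lambda pkt: log.append((pkt, time.perf_counter())))
print([pkt for pkt, _ in log])
```

How often the loop polls the counter is the open question the slide raises: polling too rarely lets packets bunch into bursts, while polling constantly starves reception and schedule computation.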
LiveNet MPEG-2 Testbed (developed by Andrea Basso, Glenn Cash, and Reha Civanlar) • MPEG-2 encoder • MPEG-2 encoder board (MPEGXpress) • Software to read into buffers and stream into network • Real-time packetizer • Parses MPEG-2 stream and divides frames into slices • Packs slices into Real-Time Protocol (RTP) packets • MPEG-2 decoder • Software for packet reception and error concealment • MPEG-2 decoder board (DarimVision)
Conclusions • Online smoothing model • Applicable to many non-interactive applications • Significantly lowers burstiness of compressed video • Enables high-quality video across access networks • Prefix caching • Hides start-up delay for smoothing and other operations • Effective resource allocation schemes at the proxy • Practical application • Transparent to the origin video source/server • Implementation with commercial off-the-shelf parts • Integration with MPEG-2 and Real-Time Protocol
Ongoing Work • Prototyping the proxy smoothing service • Completion of implementation of proxy service • Performance evaluation of parameterized system • Combining smoothing with other mechanisms • Discard, transcoding, feedback, and retransmission • Exploiting prefix cache to hide additional latency • Measurement of Web-initiated video streaming • Collection of video packet traces in AT&T WorldNet • Study of potential for (partial) caching at the proxy