ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks ACM SIGCOMM 2012 2012.10.15 Cing-Yu Chu
Motivation • Live streaming is a major Internet application today • Evaluating live streaming: lab/testbed, simulation, modeling • Hard to get both scalability and realism • Live testing
Challenge • Protection • Real viewers' QoE • Masking failures from real viewers • Orchestration • Orchestrating desired experimental scenarios (e.g., flash crowd) • Without disturbing QoE
Modern Live Streaming • Complex hybrid systems • Peer-to-peer network • Content delivery network • BitTorrent-like • Tracker introduces peers watching the same channel, forming the overlay network topology • Basic unit: pieces
Modern Live Streaming • Modules • P2P topology management • CDN management • Buffer and playpoint management • Rate allocation • Download/upload scheduling • Viewer interfaces • Shared-bottleneck management • Flash-crowd admission control • Network friendliness
Metrics • Piece missing ratio: fraction of pieces not received by their playback deadline • Channel supply ratio: total bandwidth capacity (CDN + P2P) over total streaming bandwidth demand
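As formulas (my notation, not from the slides: B_CDN and B_P2P are the CDN and aggregate peer upload capacities, N the number of viewers, r the streaming rate):

```latex
\text{piece missing ratio} \;=\;
  \frac{\#\,\text{pieces missing at their playback deadlines}}
       {\#\,\text{pieces that should have been played}},
\qquad
\text{supply ratio} \;=\;
  \frac{B_{\mathrm{CDN}} + B_{\mathrm{P2P}}}{N \cdot r}
```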
Misleading Results: Small Scale • EmuLab: 60 clients vs. 600 clients • Supply ratio: 1.67 (small) vs. 1.29 (large) • Content bottleneck!
Misleading Results: Small Scale • With a connection limit, the CDN server's neighbor connections are exhausted by the clients that join earlier
Misleading Results: Missing Realistic Features • Network diversity • Network connectivity • Amount of network resources • Network protocol implementations • Router policies • Background traffic
Misleading Results: Missing Realistic Features • LAN-like networks vs. ADSL-like networks • Hidden buffers • ADSL has a larger buffer but limited upload bandwidth
Streaming Machine • A self-complete set of algorithms for downloading and uploading pieces • A client can run multiple streaming machines • Experiment (E) • Play buffer
R+E to Mask Failures • Another streaming machine for protection: Repair (R)
R+E to Mask Failures • Virtual playpoint: introduce a slight delay to hide failures from real viewers • R = rCDN: dedicated CDN resources • rCDN capacity can become a bottleneck
R = production • Production streaming engine • Fine-tuned algorithms (hybrid architecture) • Larger resource pool, so more scalable protection • Already serving clients before the experiment starts
Problem of R = production • Systematic bias: competition between experiment and production • Protecting QoE gives production higher priority, which underestimates the experiment's performance
PCE • R = P + C • C: rCDN with resource bounded by δ • P: production
PCE • rCDN acts as a filter: it lowers the experiment's piece missing ratio curve, as seen by production, by δ
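One way to write this filtering effect (notation mine: m_exp is the experiment's piece missing ratio, m_prod the missing ratio production actually observes, δ the bounded rCDN capacity as a fraction of streaming demand):

```latex
m_{\text{prod}}(t) \;=\; \max\bigl(0,\; m_{\text{exp}}(t) - \delta\bigr)
```

So as long as the experiment's failures stay below δ, production sees no extra load and introduces no systematic bias.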
Implementation • Modular process for streaming machines • Sliding window to partition downloading tasks
Streaming Hypervisor • Task window management: sets up the sliding window • Data distribution control: copies data among streaming machines • Network resource control: bandwidth scheduling among streaming machines • Experiment transition
Task Window Management • Informs a streaming machine about the pieces that it should download (see the sketch below)
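A minimal sketch of the idea, assuming a window is just a contiguous range of piece indices that slides with the playpoint; the class and method names here are illustrative, not ShadowStream's actual API:

```python
# Hypothetical task-window management: the hypervisor tells each streaming
# machine which piece range it is responsible for downloading, and slides
# that range forward as playback progresses.

class TaskWindow:
    def __init__(self, start: int, size: int):
        self.start = start          # first piece the machine may fetch
        self.size = size            # window length in pieces

    def pieces(self):
        return range(self.start, self.start + self.size)

    def advance(self, playpoint: int):
        # Slide the window forward with the playpoint, so the machine
        # always works on pieces near their playback deadlines.
        self.start = max(self.start, playpoint)

# e.g., the experiment machine downloads ahead of its (delayed) playpoint
experiment_window = TaskWindow(start=1000, size=64)
experiment_window.advance(playpoint=1010)
print(list(experiment_window.pieces())[:5])   # [1010, 1011, 1012, 1013, 1014]
```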
Data Distribution Control • Data store • Shared data store • Each streaming machine keeps a pointer into the shared store
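A sketch of why the pointer matters, assuming one shared store per client so the hypervisor can make pieces visible to several machines without duplicating buffers; all names here are illustrative assumptions:

```python
# Hypothetical data distribution control: one shared data store per client
# holds downloaded pieces; each streaming machine holds only a reference
# into it, so pieces cross machines without byte copies.

class SharedDataStore:
    def __init__(self):
        self.pieces = {}                    # piece id -> payload bytes

    def put(self, piece_id: int, payload: bytes):
        self.pieces[piece_id] = payload     # written once, visible to all machines

class StreamingMachine:
    def __init__(self, name: str, store: SharedDataStore):
        self.name = name
        self.store = store                  # pointer to the shared store
        self.playpoint = 0                  # machine-local playback position

store = SharedDataStore()
experiment = StreamingMachine("experiment", store)
repair = StreamingMachine("repair", store)

store.put(42, b"...")                       # a piece downloaded by experiment
assert 42 in repair.store.pieces            # repair sees it without a copy
```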
Network Resource Control • Production traffic bears higher priority • LEDBAT-style bandwidth estimation • Avoids congesting hidden buffers in the network
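A sketch of one way such prioritization could work, under assumptions of mine: production demand is served first, and the experiment's leftover budget is cut back when a LEDBAT-style one-way-delay estimate suggests a hidden buffer is filling. The names, thresholds, and backoff factor below are all illustrative, not ShadowStream's actual scheduler:

```python
# Hypothetical priority-based upload scheduling with LEDBAT-style backoff.

BASE_DELAY_MS = 40        # assumed minimum one-way delay observed so far
TARGET_QUEUE_MS = 100     # assumed queuing-delay target (LEDBAT-like)

def experiment_budget(link_capacity_kbps: float,
                      production_demand_kbps: float,
                      current_delay_ms: float) -> float:
    """Return the upload budget (kbps) left for the experiment machine."""
    # Production gets bandwidth first; the experiment only takes leftovers.
    leftover = max(0.0, link_capacity_kbps - production_demand_kbps)
    queue_ms = current_delay_ms - BASE_DELAY_MS
    if queue_ms > TARGET_QUEUE_MS:
        # Rising delay hints a hidden buffer is filling: back off sharply.
        leftover *= 0.5
    return leftover

print(experiment_budget(1000, 600, current_delay_ms=180))  # 200.0
```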
Experiment Orchestration • Triggering • Arrival • Experiment Transition • Departure
Specification and Triggering • Testing behavior pattern • Multiple classes • Each class: arrival rate function λ(t) during an interval, duration function L • Triggering condition t_start
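A sketch of what such a specification might look like as data; the field names and the extra trigger condition are illustrative assumptions, not ShadowStream's actual specification language:

```python
# Hypothetical experiment specification: a trigger plus a list of client
# classes, each with an arrival-rate function and a duration function.

experiment_spec = {
    "trigger": {
        "t_start": "2012-10-15T20:00:00Z",   # when the test may begin
        "min_online_clients": 500,           # assumed extra trigger condition
    },
    "classes": [
        {   # flash-crowd class: arrivals ramp up, viewers stay 10 minutes
            "arrival_rate": lambda t: 5.0 * t,   # lambda(t), clients/sec
            "duration": lambda: 600.0,           # L, seconds per client
        },
        {   # background class: steady trickle of short-lived viewers
            "arrival_rate": lambda t: 1.0,
            "duration": lambda: 120.0,
        },
    ],
}
```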
Arrival • Independent per-client arrivals achieve the global arrival pattern • Network-wide common parameters: t_start, t_exp and λ(t) • Included in keep-alive messages
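A sketch of how independent arrivals can realize a global pattern: every client receives the same (t_start, t_exp, λ(t)) and samples its own arrival time from λ, so the aggregate follows the desired pattern without a central coordinator. The 1-second discretization and inverse-CDF sampling are my assumptions for the sketch:

```python
# Hypothetical independent-arrival sampling from a shared rate function.

import random

def sample_arrival(t_start: float, t_exp: float, rate) -> float:
    """Sample one arrival offset in [t_start, t_start + t_exp] from rate(t)."""
    bins = [rate(t_start + i) for i in range(int(t_exp))]   # 1 s bins
    total = sum(bins)
    u = random.random() * total
    acc = 0.0
    for i, w in enumerate(bins):
        acc += w
        if u <= acc:
            return t_start + i + random.random()   # uniform within the bin
    return t_start + t_exp

# e.g., a flash crowd whose arrival rate ramps up over a 60 s experiment
flash_crowd = lambda t: t
print(sample_arrival(t_start=0.0, t_exp=60.0, rate=flash_crowd))
```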
Experiment Transition • At current time t0, a client scheduled to join at a_e,i stays in production during [t0, a_e,i] • Connectivity transition • Production neighbors not in the test keep their production connections • Production machines rejoin the production overlay
Experiment Transition • Playbuffer state transition • Legacy removal (discarding buffer state inherited from production that the experiment should not have)
Departure • Early departure: capture a client state snapshot using the disconnection message • Substitution: run the arrival process again • Experiment departures can only be equal to or more frequent than the real viewer departure pattern
Evaluation • Software Framework • Experimental Opportunities • Protection and Accuracy • Experiment Control • Deterministic Replay
Software Framework • Compositional Run-time • Block-based architecture • Total ~8000 lines of code • Flexibility
Experimental Opportunities • Real traces from two live streaming test channels (impossible in a testbed) • Flash crowd • No client departs
Protection and Accuracy • EmuLab (to expose the weakness) • Multiple experiments with the same settings • 300 clients • δ ≈ 4% • Buggy code!
Experiment Control • Trace-driven simulation • Accuracy of distributed arrivals • Impact of clock synchronization: up to 3 seconds of skew
Deterministic Replay • Minimize logged data • Hypervisor logging • Protocol packets: whole payload • Data packets: header only