Fault Tolerant Video-On-Demand Services Tal Anker, Danny Dolev, Idit Keidar, The Transis Project
VoD Service [Diagram: clients send requests to the VoD service provider, which streams video from the movies disk(s) to each client] • VoD: full VCR control • 1 video stream per client
High Availability • Multiple servers • at different sites • Fault tolerance: • servers can crash • Managing the load: • new servers can be brought up / down • load should be re-distributed “on the fly” • migration of clients
The Challenges [Diagram: clients C1 and C2 connected to a multi-server VoD service; when a server fails, its clients migrate transparently to the remaining servers] • Low overhead • Transparency • How do clients know whom to connect to? • "abstract" service • Clients should be unaware of migration
Buffer Management and Flow Control • Overcome jitter, message re-ordering and migration periods • Re-fill buffers quickly after migration • avoid buffer overflow • Minimize buffers • minimize pre-fetch bandwidth • Dynamically adjust transmission rate to client capabilities • Re-negotiation of QoS
Features of our solution • Use group communication in the control plane • connection establishment • fault tolerance and migration • Flow control explicitly handles migration • Low overhead • ~1/1000 of the bandwidth • Negligible memory and CPU overhead • Commodity hardware and publicly available network technologies
Environment • Implementation • UDP/IP over 10 Mbit/s switched Ethernet • Transis • Sun SPARC and BSDI PCs as video servers • Win NT machines as video clients • MPEG-1 & 2 hardware decoders • Machine and network failures
Implementing the abstract service • Use group communication • clients communicate with a well-known group name (logical entity) • unaware of the number and identity of the servers in the group • Servers periodically share information about clients (every 1/2 sec) • If a server crashes (or is overloaded), another server transparently takes over
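The takeover on this slide can be sketched in C. The data layout, the membership-callback shape, and the round-robin re-assignment rule are illustrative assumptions, not the project's actual code; the key point is that every surviving server runs the same deterministic rule on the same shared state, so all of them agree on the new assignment without extra coordination.

```c
#include <assert.h>

/* Sketch: each client is owned by one server; servers share this
 * table periodically over the group.  MAX_CLIENTS and the
 * round-robin policy are assumptions for illustration. */
#define MAX_CLIENTS 8

struct service_state {
    int nservers;           /* server ids 0..nservers-1          */
    int owner[MAX_CLIENTS]; /* owner[c] = server serving client c */
};

/* Membership callback: server `failed` crashed.  Re-assign its
 * clients round-robin over the surviving servers.  Because every
 * server sees the same membership event (virtual synchrony) and
 * the same table, all compute identical new assignments. */
void on_server_failure(struct service_state *s, int failed)
{
    int next = 0;
    for (int c = 0; c < MAX_CLIENTS; c++) {
        if (s->owner[c] != failed)
            continue;               /* client unaffected */
        if (next == failed)         /* skip the dead server's id */
            next = (next + 1) % s->nservers;
        s->owner[c] = next;         /* orphaned client adopted */
        next = (next + 1) % s->nservers;
    }
}
```

The client never sees this step: it keeps multicasting to the same logical group name, and only the server answering it changes.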
Group Communication • Reliable Group Multicast (Group Abstraction) • Message Ordering • Dynamic Reconfiguration • Membership with Strong Semantics (Virtual Synchrony) Systems: Transis, Horus, Ensemble, Totem, Newtop, RMP, ISIS, Psync, Relacs
Transis Allows Simple Design Group abstractionforconnection establishment and transparent migration Reliable group multicast allows servers to consistently share information Membership services detects conditions for migration Reliable messages for control • Server takes ~2500 C++ code lines • Client takes ~4000 C code lines (excluding GUI and display)
Flow Control • Feedback-based flow control (sparse): • FC messages are sent to the logical server (session group) • The client determines the required changes in the flow
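The client-side rule on this slide can be sketched as a watermark check on the playout buffer. The watermark values and the fixed ±2 fps delta are assumptions for illustration; the talk only states that FC messages are sparse and that the client determines the change in the flow.

```c
#include <assert.h>

/* Assumed watermarks on the client's playout buffer (percent full). */
#define LOW_WATERMARK  25  /* below this: ask the servers for more  */
#define HIGH_WATERMARK 75  /* above this: ask the servers for fewer */

/* Returns the fps change the client multicasts to the session
 * group; 0 means no FC message is sent, keeping feedback sparse. */
int fc_delta(int buffer_fill_percent)
{
    if (buffer_fill_percent < LOW_WATERMARK)
        return +2;  /* buffer draining: speed the stream up */
    if (buffer_fill_percent > HIGH_WATERMARK)
        return -2;  /* buffer filling: slow the stream down */
    return 0;       /* within bounds: stay silent */
}
```

Sending the delta to the session group rather than to a specific server is what lets flow control survive migration: whichever server currently streams to the client receives the feedback.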
Emergency Flow Control • When the server receives an emergency message: • The server changes the fps rate: fps = latest-known-fps + emergency quantity • The emergency quantity decays every second (by a factor) • While the quantity is above zero, the server ignores FC messages from the client
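The three rules above fit in a few lines of C. The decay factor value and the field names are assumptions; the slide only says the quantity "decays every second (by a factor)".

```c
#include <assert.h>

#define DECAY_FACTOR 0.5   /* assumed per-second decay factor */

struct server_fc {
    double fps;                /* current frames-per-second rate */
    double emergency_quantity; /* pending emergency adjustment   */
};

/* Rule 1: on an emergency message, jump the rate immediately:
 * fps = latest-known-fps + emergency quantity. */
void on_emergency(struct server_fc *s, double quantity)
{
    s->emergency_quantity = quantity;
    s->fps += quantity;
}

/* Rule 2: once per second, decay the emergency quantity. */
void on_tick(struct server_fc *s)
{
    s->emergency_quantity *= DECAY_FACTOR;
    if (s->emergency_quantity < 1e-9)  /* treat tiny residue as zero */
        s->emergency_quantity = 0.0;
}

/* Rule 3: while the quantity is above zero, ordinary FC
 * messages from the client are ignored. */
int fc_message_accepted(const struct server_fc *s)
{
    return s->emergency_quantity <= 0.0;
}
```

The decaying quantity acts as a hold-off: right after an emergency (e.g. during a migration-induced buffer refill) the server trusts the emergency correction and suppresses the sparse feedback loop until the transient has died out.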
Performance Measurements • On the HUJI network (LAN) • Servers at TAU and clients at HUJI (WAN) • The measurements show the system is robust and supports our transparency claims
Summary • Scalable VoD service • Load balancing • Tolerating machine and network failures • All of the above achieved practically for free: • ~1/1000 of the total bandwidth • Negligible memory and CPU overhead
Thanks to ... • Gregory Chockler • The other members of the Transis project