A Server-less Architecture for Building Scalable, Reliable, and Cost-Effective Video-on-demand Systems Jack Lee Yiu-bun, Raymond Leung Wai Tak Department of Information Engineering The Chinese University of Hong Kong
Contents • 1. Introduction • 2. Challenges • 3. Server-less Architecture • 4. Performance • 5. Conclusion
1. Introduction • Traditional Client-server Architecture • Clients connect to the server and request streaming • Server capacity limits the system capacity • Cost increases with system scale • Server-less Architecture • Motivated by the availability of powerful user devices • Each user node contributes resources to the system • Memory • Network bandwidth • Storage • Costs shared by users
1. Introduction • The server-less system is composed of clusters of user nodes • Each node serves as a mini server
2. Challenges • Video Data Storage • Retrieval and Transmission Scheduling • Fault Tolerance • Distributed Directory Service • Heterogeneous User Nodes • System Adaptation – node joining/leaving
3. Server-less Architecture • Storage Policy • Video data is divided into fixed-size blocks and then distributed among nodes in the cluster (data striping) • Low storage requirement, load balanced • Capable of fault tolerance using redundant blocks (discussed later)
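The storage policy above can be illustrated with a minimal sketch. It assumes round-robin placement (block i goes to node i mod N); the slides only say that blocks are distributed among the nodes in the cluster, so the placement rule and the function names `stripe_blocks` and `locate_block` are hypothetical.

```python
def stripe_blocks(video: bytes, block_size: int, num_nodes: int) -> list[list[bytes]]:
    """Split video into fixed-size blocks and place them round-robin on nodes.

    Round-robin placement is an assumption made for illustration.
    The final block may be shorter than block_size.
    """
    if block_size <= 0 or num_nodes <= 0:
        raise ValueError("block_size and num_nodes must be positive")
    nodes: list[list[bytes]] = [[] for _ in range(num_nodes)]
    for offset in range(0, len(video), block_size):
        block_index = offset // block_size
        nodes[block_index % num_nodes].append(video[offset:offset + block_size])
    return nodes


def locate_block(block_index: int, num_nodes: int) -> tuple[int, int]:
    """Return (node id, position within that node's local block list)."""
    return block_index % num_nodes, block_index // num_nodes
```

With this placement every node stores roughly 1/N of the video, which is where the low per-node storage requirement and the balanced load come from.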
3. Server-less Architecture • Retrieval and Transmission Scheduling • Round-based scheduler • Retrieval is scheduled in macro rounds, each composed of GSS (Grouped Sweeping Scheduling) groups, i.e. micro rounds • Transmission lasts for one macro round
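A rough sketch of the round structure, under stated assumptions: streams are assigned to GSS groups round-robin, and each group is served in its own micro round of equal length. The slides do not give the assignment rule or round lengths, so `assign_gss_groups`, `macro_round_schedule`, and `micro_round_seconds` are hypothetical.

```python
def assign_gss_groups(stream_ids: list[int], num_groups: int) -> list[list[int]]:
    """Assign streams to GSS groups round-robin (an illustrative assumption)."""
    groups: list[list[int]] = [[] for _ in range(num_groups)]
    for k, stream in enumerate(stream_ids):
        groups[k % num_groups].append(stream)
    return groups


def macro_round_schedule(groups: list[list[int]], micro_round_seconds: float):
    """Return (start offset, group) pairs for one macro round.

    Group g is retrieved during micro round g, so one macro round lasts
    len(groups) * micro_round_seconds.
    """
    return [(g * micro_round_seconds, group) for g, group in enumerate(groups)]
```

For example, five streams in two groups with 0.5-second micro rounds give a 1-second macro round in which streams 1, 3, 5 are retrieved first and streams 2, 4 second.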
3. Server-less Architecture • Fault Tolerance • Recovers not only from a single node failure but also from multiple simultaneous node failures • Redundancy by Forward Error Correction (FEC) Code • e.g. Reed-Solomon Erasure Code (REC)
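To show the idea of recovering lost blocks from redundant ones, here is a minimal sketch using a single XOR parity block. This is a deliberately simpler stand-in, not the Reed-Solomon erasure code named above: XOR parity tolerates only one missing block per group, whereas Reed-Solomon codes tolerate several. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks byte by byte (used to build the parity block)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def recover_missing(surviving_blocks: list[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(surviving_blocks + [parity])
```

Usage: compute `parity = xor_blocks(data_blocks)` when storing; if one node fails, pass the remaining data blocks and the parity block to `recover_missing` to rebuild its block.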
4. Performance Evaluation • Reliability Analysis • Determine the system mean time to failure (MTTF) • Assumes independent node failure and repair rates • System tolerates up to h node failures through redundancy • Analyzed with a Markov chain model
4. Performance Evaluation • Redundancy Level • Defined as the proportion of nodes serving redundant data • Redundancy level required to achieve the target system MTTF, as a function of the number of nodes
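A minimal sketch of such a reliability calculation, assuming a birth-death Markov chain whose state is the number of failed nodes: with i nodes failed, failures occur at rate (N - i)λ and repairs at rate iμ, and the system fails on reaching h + 1 failed nodes. The slides do not spell out the exact chain, so this model and the name `system_mttf` are assumptions. It uses the standard recurrence for mean first-passage times, m_i = (1 + iμ · m_{i-1}) / ((N - i)λ), and sums m_0 through m_h.

```python
def system_mttf(num_nodes: int, max_failures: int,
                failure_rate: float, repair_rate: float) -> float:
    """Mean time until max_failures + 1 nodes are down, starting from none.

    Assumed model: birth-death chain with independent per-node
    failure_rate and repair_rate (same time unit for both).
    """
    if not 0 <= max_failures < num_nodes:
        raise ValueError("need 0 <= max_failures < num_nodes")
    mttf = 0.0
    passage = 0.0  # mean time to move from (failed - 1) to failed nodes down
    for failed in range(max_failures + 1):
        fail = (num_nodes - failed) * failure_rate
        repair = failed * repair_rate
        passage = (1.0 + repair * passage) / fail
        mttf += passage
    return mttf
```

With no redundancy (h = 0) this reduces to 1 / (Nλ), the mean time to the first failure among N nodes.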
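One way to explore this relationship is to search for the smallest number of tolerated failures h that meets a target MTTF and report h / N as the redundancy level. The sketch below assumes a birth-death failure/repair chain and that h redundant nodes allow h failures to be tolerated (as with a Reed-Solomon style code); both are assumptions, and the function names are hypothetical.

```python
def mttf(num_nodes, max_failures, failure_rate, repair_rate):
    """MTTF of an assumed birth-death chain that fails at max_failures + 1."""
    total = passage = 0.0
    for failed in range(max_failures + 1):
        passage = (1.0 + failed * repair_rate * passage) / (
            (num_nodes - failed) * failure_rate)
        total += passage
    return total


def min_redundancy_level(num_nodes, target_mttf, failure_rate, repair_rate):
    """Return (h, h / num_nodes) for the smallest h meeting target_mttf.

    Returns None if no h below num_nodes reaches the target.
    """
    for h in range(num_nodes):
        if mttf(num_nodes, h, failure_rate, repair_rate) >= target_mttf:
            return h, h / num_nodes
    return None
```

Calling `min_redundancy_level` for a range of cluster sizes with fixed rates gives a redundancy-level-versus-nodes curve of the kind this slide describes; the rates themselves would have to come from measurement.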
4. Performance Evaluation • System Response Time • Sum of the scheduling delay and the prefetch delay • Prefetch Delay • Time required to receive the first group of blocks from all nodes • Increases linearly with system scale – not scalable • Ultimately limits the cluster size • What is the Solution? • Multiple parity groups
4. Performance Evaluation • Multiple Parity Groups • Instead of a single parity group, the redundancy is encoded in multiple parity groups • Playback begins after receiving the data of the first parity group
4. Performance Evaluation • Multiple Parity Groups • Performance gain: shorter prefetch delay • Drawback: higher redundancy level needed to maintain the same system MTTF • Tradeoff between response time and redundancy level
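The prefetch-delay gain can be sketched as follows, assuming the delay grows linearly with the number of nodes whose blocks must arrive before playback (as stated for the single-group case) and that nodes are split evenly across groups. The per-node delay constant and both function names are hypothetical.

```python
def split_into_parity_groups(node_ids: list[int], num_groups: int) -> list[list[int]]:
    """Split nodes into num_groups contiguous groups of near-equal size."""
    if not 1 <= num_groups <= len(node_ids):
        raise ValueError("need 1 <= num_groups <= number of nodes")
    base, extra = divmod(len(node_ids), num_groups)
    groups, start = [], 0
    for g in range(num_groups):
        size = base + (1 if g < extra else 0)  # first `extra` groups get one more
        groups.append(node_ids[start:start + size])
        start += size
    return groups


def prefetch_delay(group: list[int], per_node_delay: float) -> float:
    """Assumed linear model: wait for one block from each node in the group."""
    return len(group) * per_node_delay
```

For 8 nodes and a hypothetical 0.25 seconds per node, a single group gives a 2.0-second prefetch delay, while two groups let playback begin after the first group's 1.0 second. Each group then needs its own redundancy, which is the source of the higher overall redundancy level.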
4. Performance Evaluation • System Response Time • Increases with cluster size • Shortened by using multiple parity groups
4. Performance Evaluation • System Dimensioning • Which system configurations • achieve an MTTF of 10,000 hours, and • keep the response time under 5 seconds?
5. Conclusion • Server-less Architecture • Scalable • Acceptable redundancy level while achieving a reasonable response time within a cluster • Further scale-up by forming new autonomous clusters • Reliable • Fault tolerance by redundancy • Reliability comparable to that of a high-end server, according to the Markov chain analysis • Cost-Effective • Costs shared by all users
5. Conclusion • Future Work • Distributed Directory Service • Heterogeneous User Nodes • Dynamic System Adaptation • Node joining/leaving • Data re-distribution
End of Presentation • Thank you.