This paper explores a server-less architecture for building scalable, reliable, and cost-effective video-on-demand systems. The architecture eliminates the need for dedicated servers and instead utilizes user nodes, serving as both clients and mini-servers. Various challenges, such as data placement, retrieval and transmission scheduling, fault tolerance, and system adaptation, are addressed. The performance evaluation includes assessing storage capacity, network capacity, disk access bandwidth, and system response time.
A Server-less Architecture for Building Scalable, Reliable, and Cost-Effective Video-on-demand Systems Raymond Leung and Jack Y.B. Lee Department of Information Engineering The Chinese University of Hong Kong
Contents • Introduction • Server-less Architecture • Performance Evaluation • System Scalability • Summary
Introduction Client-Server Architecture • Traditional client-server architecture • clients connect to server for streaming • system capacity limited by server capacity
Introduction Motivation • Limitation of client-server system • system capacity limited by server capacity • high-capacity server is very expensive • Availability of powerful client-side devices, known as set-top boxes (STBs) • home entertainment center - VCD/DVD player, digital music jukebox, etc. • relatively high processing capability and local hard-disk storage • Server-less architecture • eliminates the dedicated server • each user node (STB) serves both as a client and as a mini-server • fully distributed storage, processing, and streaming
Architecture Server-less Architecture • Basic principles • dedicated server is eliminated • users are divided into clusters • video data is distributed to nodes in a cluster
Architecture Challenges • Data placement policy • Retrieval and transmission scheduling • Fault tolerance • Distributed directory service • System adaptation and dynamic reconfiguration • etc.
Architecture Data Placement Policy • Block-based striping • video data is divided into fixed-size blocks and then distributed among nodes in the cluster • low storage requirement, load balanced • capable of fault tolerance using redundant unit(s)
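The block-based striping described above can be sketched as follows. This is a minimal illustration that assumes a simple round-robin mapping (block b goes to node b mod n); the slides do not state the exact mapping, and the names `split_into_blocks` and `place_blocks` are hypothetical.

```python
from collections import Counter


def split_into_blocks(data: bytes, block_size: int) -> list:
    """Cut the video data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, num_nodes: int) -> list:
    """Assumed round-robin placement: block b is stored on node b mod num_nodes."""
    return [b % num_nodes for b in range(num_blocks)]


# Toy data: 10 bytes of "video", 4-byte blocks, a 2-node cluster.
video = bytes(range(10))
blocks = split_into_blocks(video, block_size=4)
placement = place_blocks(len(blocks), num_nodes=2)
load = Counter(placement)  # number of blocks held by each node
```

Under this assumed mapping the per-node block counts differ by at most one, which is the load-balancing property the slide refers to.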
Architecture Retrieval and Transmission Scheduling • Round-based schedulers • data blocks are retrieved in each micro-round • transmission starts at the end of the micro-round
Architecture Retrieval and Transmission Scheduling • Disk retrieval scheduling • Grouped Sweeping Scheme¹ (GSS) • able to control the tradeoff between disk efficiency and buffer requirement • Transmission scheduling • Macro round length • time required for every node to send out one data block of Q bytes • depends on system scale, data block size and video bit-rate • Tf = nQ / Rv, where Tf is the macro round length, n the number of nodes within a cluster, Q the data block size, and Rv the video bit-rate
¹ P.S. Yu, M.S. Chen & D.D. Kandlur, “Grouped Sweeping Scheduling for DASD-based Multimedia Storage Management”, ACM Multimedia Systems, vol. 1, pp. 99–109, 1993
Architecture Retrieval and Transmission Scheduling • Transmission scheduling • Micro round length • under GSS scheduling, the duration of one GSS group within each macro round • depends on macro round length and number of GSS groups • Tg = Tf / g, where Tg is the micro round length, Tf the macro round length, and g the number of GSS groups
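The two round lengths can be computed with a short sketch. It assumes the relations Tf = nQ/Rv (every one of the n nodes sends one Q-byte block per macro round at the video bit-rate) and Tg = Tf/g, which agree with the 0.027s micro round quoted later in the deck; the function names and the 200-node example are illustrative only.

```python
KB = 1024  # bytes per KB, assumed binary prefix


def macro_round_length(n: int, q_bytes: float, rv_bytes_per_s: float) -> float:
    """Tf: time for all n nodes to each send one Q-byte block at rate Rv."""
    return n * q_bytes / rv_bytes_per_s


def micro_round_length(t_f: float, g: int) -> float:
    """Tg: one GSS group's share of the macro round."""
    return t_f / g


# Illustrative cluster: 200 nodes, 4KB blocks, 150KB/s video, g = 200 groups.
t_f = macro_round_length(n=200, q_bytes=4 * KB, rv_bytes_per_s=150 * KB)
t_g = micro_round_length(t_f, g=200)
```

With these inputs the macro round is about 5.33s and the micro round about 0.027s; choosing a smaller g lengthens each micro round proportionally.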
Architecture Fault Tolerance • Node characteristics • lower reliability than high-end server • shorter mean time to failure (MTTF) • system fails if any one of the nodes fails • Fault tolerance mechanism • erasure correction code to implement fault tolerance • Reed-Solomon Erasure code² (RSE) • retrieve and transmit coded data at higher data rate • recover data blocks at the receiver node
² A. J. McAuley, “Reliable Broadband Communication Using a Burst Erasure Correcting Code”, in Proc. ACM SIGCOMM 90, Philadelphia, PA, September 1990, pp. 287–306.
Architecture Fault Tolerance • Redundancy • encode redundant data from video data • recover lost data in case of node failure(s)
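The encode-then-recover idea can be illustrated with a much simpler code than the one the slides name. The sketch below uses a single XOR parity block, which tolerates exactly one lost block; it is a stand-in for illustration, not the Reed-Solomon Erasure code used in the architecture, and all names are hypothetical.

```python
def xor_blocks(blocks: list) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def encode_parity(data_blocks: list) -> bytes:
    """Redundant unit: XOR of all data blocks in the stripe."""
    return xor_blocks(data_blocks)


def recover_missing(surviving_blocks: list, parity: bytes) -> bytes:
    """Rebuild the single missing block from the survivors plus the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# One stripe spread over three data nodes plus one parity node.
stripe = [b"abcd", b"efgh", b"ijkl"]
parity = encode_parity(stripe)
# Suppose the node holding stripe[1] fails; the receiver rebuilds its block.
rebuilt = recover_missing([stripe[0], stripe[2]], parity)
```

A Reed-Solomon erasure code generalizes this to several redundant units, so that more than one simultaneous node failure can be tolerated.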
Performance Evaluation Performance Evaluation • Storage capacity • Network capacity • Disk access bandwidth • Buffer requirement • System response time
Performance Evaluation Storage Capacity • What is the minimum number of nodes required to store a given amount of video data? • For example: • video bitrate: 150 KB/s • video length: 2 hours • storage required for 100 videos: 102.9GB • If each node can allocate 1GB for video storage, then • 103 nodes are needed (without redundancy); or • 108 nodes are needed (with 5 nodes added for redundancy) • This sets the lower limit on the cluster size.
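The storage example above can be reproduced with a few lines of arithmetic. The sketch assumes binary prefixes (1GB = 1024 × 1024 KB); the function name is hypothetical.

```python
import math


def min_nodes_for_storage(bitrate_kb_per_s: float, length_s: float,
                          num_videos: int, per_node_gb: float,
                          redundant_nodes: int = 0) -> int:
    """Smallest cluster size that can hold the video library."""
    total_kb = bitrate_kb_per_s * length_s * num_videos
    total_gb = total_kb / (1024 * 1024)  # assumed binary GB
    return math.ceil(total_gb / per_node_gb) + redundant_nodes


# 100 two-hour videos at 150KB/s, 1GB of storage contributed per node.
plain = min_nodes_for_storage(150, 2 * 3600, 100, per_node_gb=1)
with_redundancy = min_nodes_for_storage(150, 2 * 3600, 100, per_node_gb=1,
                                        redundant_nodes=5)
```

This gives 103 nodes without redundancy and 108 nodes with five redundant nodes, matching the slide.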
Performance Evaluation Network Capacity • How many nodes can be connected given a certain network switching capacity? • For example: • video bitrate: 150KB/s • If the network switching capacity is 32Gbps, and assume 60% utilization • up to 8388 nodes (without redundancy) • Network switching capacity is not a bottleneck.
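The slide does not show how the 8388 figure is derived. One reading that reproduces it is sketched below, under two assumptions that are not stated in the deck: binary prefixes (1Gbps = 2^30 bps, 1KB = 1024 bytes), and each node consuming switching capacity for both its outgoing and its incoming stream, roughly 2 × Rv in total.

```python
def max_nodes_for_switch(capacity_bps: float, utilization: float,
                         rv_bytes_per_s: float) -> int:
    """Nodes supported when each node uses about Rv in and Rv out (assumed)."""
    usable_bps = capacity_bps * utilization
    per_node_bps = 2 * rv_bytes_per_s * 8  # send + receive, in bits per second
    return int(usable_bps // per_node_bps)


# 32Gbps switch at 60% utilization, 150KB/s video.
nodes = max_nodes_for_switch(32 * 2**30, 0.60, 150 * 1024)
```

Under these assumptions the result is 8388 nodes, far above the cluster sizes discussed elsewhere in the deck.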
Performance Evaluation Disk Access Bandwidth • Recall the retrieval and transmission scheduling: • Continuous data transmission constraint: • must finish retrieval before transmission in each micro-round • need to quantify the disk retrieval round length and verify against the above constraint
Performance Evaluation Disk Access Bandwidth • Disk retrieval round length • time required to retrieve data blocks for transmission • depends on seeking overhead, rotational latency and data block size • suppose k requests per GSS group • quantities involved: the maximum retrieval round length, the fixed overhead, the maximum seek time for k requests, the rotational latency W⁻¹, the minimum transfer rate rmin, and the data block size Q • Continuous data transmission constraint: the maximum retrieval round length must not exceed the micro round length
Performance Evaluation Disk Access Bandwidth • Example: • Disk: Quantum Atlas 10K³ • Data block size (Q): 4KB • Video bitrate (Rv): 150KB/s • Number of nodes: N • GSS group number (g): N (reduces to FCFS scheduling) • Micro round length: 0.027s • Disk retrieval round length: 0.017s < 0.027s • Therefore the constraint is satisfied even if an FCFS scheduler is used.
³ G. Ganger and J. Schindler, “Database of Validated Disk Parameters for DiskSim”, http://www.ece.cmu.edu/~ganger/disksim/diskspecs.html
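The check in this example can be written out as a small sketch. It assumes that with g = N the micro round reduces to Q/Rv (from a macro round of NQ/Rv divided into N groups), and it takes the 0.017s retrieval round length directly from the slide rather than deriving it from disk parameters; the function names are hypothetical.

```python
KB = 1024  # bytes per KB, assumed binary prefix


def micro_round_fcfs(q_bytes: float, rv_bytes_per_s: float) -> float:
    """Micro round length when g = N, i.e. one request per group (FCFS)."""
    return q_bytes / rv_bytes_per_s


def meets_continuity(retrieval_round_s: float, micro_round_s: float) -> bool:
    """Retrieval must finish before the micro round ends."""
    return retrieval_round_s <= micro_round_s


t_g = micro_round_fcfs(4 * KB, 150 * KB)       # about 0.027s
ok = meets_continuity(0.017, t_g)              # 0.017s is the slide's figure
```

Because the micro round here does not depend on N, the same check holds at any cluster size under FCFS.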
Performance Evaluation Buffer Requirement • Receiver buffer requirement • double-buffering scheme: • one for storing data received from the network plus locally retrieved data blocks • another one for video decoder • Sender buffer requirement • under GSS scheduling:
Performance Evaluation Buffer Requirement • Total buffer requirement versus system scale • Data block size: 4KB, Number of GSS groups: g=N
Performance Evaluation System Response Time • System response time • time from sending out a request until playback begins • scheduling delay + prefetch delay • Scheduling delay under GSS • time from sending out a request until data retrieval starts • can be analyzed using an urn model • detailed derivation available elsewhere⁴ • Prefetch delay • time from the start of data retrieval until playback begins • one micro round to retrieve a data block and one macro round to transmit the whole block to the client node
⁴ Lee, J.Y.B., “Concurrent push-A scheduling algorithm for push-based parallel video servers”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 9, no. 3, April 1999, pp. 467–477
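The prefetch delay described above (one micro round plus one macro round) can be sketched numerically. The code assumes Tf = NQ/Rv and Tg = Tf/g, which are consistent with the numbers quoted in the deck, and it does not model the scheduling delay; the function name is hypothetical.

```python
KB = 1024  # bytes per KB, assumed binary prefix


def prefetch_delay(n: int, q_bytes: float, rv_bytes_per_s: float, g: int) -> float:
    """One micro round to retrieve a block plus one macro round to transmit it."""
    t_f = n * q_bytes / rv_bytes_per_s  # assumed macro round length
    t_g = t_f / g                       # assumed micro round length
    return t_g + t_f


# 4KB blocks, 150KB/s video, FCFS (g = N), at two cluster sizes.
delay_200 = prefetch_delay(200, 4 * KB, 150 * KB, g=200)
delay_400 = prefetch_delay(400, 4 * KB, 150 * KB, g=400)
```

At 200 nodes this gives about 5.36s of prefetch delay; the 5.615s response time quoted for that scale also includes the scheduling delay. Doubling the cluster adds roughly another 5.33s, which illustrates the linear growth that limits scalability.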
Performance Evaluation System Response Time • For example: • Data block size: 4KB
System Scalability System Scalability • Not limited by network or disk bandwidth • prefers FCFS disk scheduler over SCAN • Limited by system response time • prefetch delay increases linearly with system scale • example: response time of 5.615s at a scale of 200 nodes • Solution • forms new clusters to expand system scale • uses smaller block size (limited by disk efficiency)
Summary Summary • Server-less architecture proposed for VoD • dedicated server is eliminated • each node serves as both a client and a mini-server • inherently scalable • Challenges addressed: • data placement policy • retrieval and transmission scheduling • fault tolerance • Performance evaluation • acceptable storage and buffer requirement • scalability limited by system response time
End of Presentation Thank you Question & Answer Session
Appendix Reliability • Higher reliability achieved by redundancy • each node has independent failure and recovery rates, λ and μ respectively • let state i be the system state where i out of the N nodes have failed • at state i, the transition rates to states (i+1) and (i-1) are (N-i)λ and iμ respectively • assume the system can tolerate up to h failures using redundancy • the system state diagram is shown as follows:
Appendix Reliability • System mean time to failure (MTTF) • can be analyzed by continuous time Markov Chain model • solving the expected time from state 0 to state (h+1) in previous diagram,
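The expected time from state 0 to state (h+1) can be computed with the standard first-passage recursion for a birth-death chain. The sketch assumes independent nodes with a per-node failure rate and recovery rate, so that state i moves up at rate (N-i) × failure rate and down at rate i × recovery rate; these rate symbols and the function name are assumptions for illustration, and the closed form on the original slide is not reproduced here.

```python
def system_mttf(n: int, h: int, failure_rate: float, recovery_rate: float) -> float:
    """Expected time to go from 0 failed nodes to h+1 failed nodes (h < n)."""
    total = 0.0
    prev_step = 0.0  # expected time to climb from state i-1 to state i
    for i in range(h + 1):
        up = (n - i) * failure_rate   # rate of one more node failing
        down = i * recovery_rate      # rate of one failed node recovering
        # First-step analysis: T_i = (1 + down * T_{i-1}) / up
        step = (1.0 + down * prev_step) / up
        total += step
        prev_step = step
    return total


# Illustrative rates only (not taken from the slides).
no_redundancy = system_mttf(n=2, h=0, failure_rate=1.0, recovery_rate=1.0)
one_redundant = system_mttf(n=2, h=1, failure_rate=1.0, recovery_rate=1.0)
```

With no redundancy the result reduces to 1/(N × failure rate), the time until the first node fails; tolerating additional failures lengthens the system MTTF.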
Appendix Impact of Redundancy • Bandwidth requirement (without redundancy) • (N-1) received from network and one locally retrieved from disk • Bandwidth requirement (with h redundancy) • additional network bandwidth will be needed for transmitting the redundant data Rv – video bit-rate
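The effect on network bandwidth can be sketched under stated assumptions. Without redundancy, a node receives (N-1) of every N blocks over the network, i.e. a rate of Rv(N-1)/N. With h redundant units, the sketch assumes that only (N-h) of the N nodes carry video data, so each node streams at Rv/(N-h) and a receiver takes in (N-1) such streams; this second expression is an assumption, since the original formula is not recoverable from the slide.

```python
def network_receive_rate(n: int, rv: float, h: int = 0) -> float:
    """Assumed per-node network receive rate with h redundant units (h < n)."""
    per_node_stream = rv / (n - h)   # each node's share of the coded stream
    return (n - 1) * per_node_stream  # one share comes from the local disk


# 100-node cluster, 150KB/s video.
plain_rate = network_receive_rate(100, 150.0)         # 148.5 KB/s
redundant_rate = network_receive_rate(100, 150.0, 5)  # about 156.3 KB/s
```

Under this model the overhead of redundancy is the factor N/(N-h), which stays small when h is much smaller than N.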
Appendix Impact of Redundancy • Data block size (without redundancy) • block size: Q bytes • Data block size (with h redundancy) • block size: