Athena: A fault-tolerant, efficient and applicable routing mechanism for data centers. Lijun Lyu, Junjie Xie, Yuhui Deng, Yongtao Zhou. Department of Computer Science, Jinan University, Guangzhou, P.R. China. Agenda. Motivation Challenges Related work Our idea System architecture
Athena: A fault-tolerant, efficient and applicable routing mechanism for data centers Lijun Lyu, Junjie Xie, Yuhui Deng, Yongtao Zhou Department of Computer Science, Jinan University, Guangzhou, P.R. China
Agenda • Motivation • Challenges • Related work • Our idea • System architecture • Evaluation • Conclusion
Motivation • The explosive growth of data • IDC: 1,800 EB of data in 2011, 40-60% annual increase • Larger data centers • Google: 19 data centers, more than 1 million servers • Higher traffic • Cisco forecasts that annual traffic in global data centers will nearly triple over the next 5 years and reach 7.7 ZB by the end of 2017 [Figure: Google data center]
Challenges • Data Center Network • Node increment ⇒ Scalability? • Failures are common ⇒ Fault tolerance? • Google MapReduce in a 4,000-node cluster: • 5 nodes fail during a job • 1 disk fails every 6 hours • Bandwidth-hungry services ⇒ Network capacity? • Infrastructure services: MapReduce, GFS, … • Network applications: Cloud disk, Video, …
Related work • Tree-based Structure • Traditional tree • Bandwidth bottleneck, single points of failure, expensive • Modified tree: Fat-tree • High capacity • Limited scalability [Figures: Fat-tree; Traditional Tree-based Structure]
Related work • Other novel, hybrid network structures • Physical topology • Level-based, but not tree-based • Recursively defined • Routing mechanism • No routers, no traditional internet routing mechanism • Put routing intelligence on servers • Take advantage of structural properties • Typical structures • DCell, FiConn, BCube, Totoro… Our paper focuses on the routing mechanisms of hybrid structures!
Related work • Physical structures • DCell • FiConn • BCube • Totoro
Related work • Routing mechanisms
Our idea: ARM • What we achieve: Athena Routing Mechanism • Routing algorithm • Based on Dynamic Programming • Find the shortest path with lower complexity than classic algorithms • Support Multi-path • Path probing mechanism • Bypass the failed nodes & links • Traffic-aware • Properties • More resilient, shorter latency, higher capacity, Lower complexity
System architecture • Athena Routing Mechanism • Implemented on the structure of Totoro • Compared with the original Totoro Fault-tolerant Routing Algorithm (TFR) and the Shortest Path Algorithm (SPA, based on Floyd-Warshall) • Applicable to DCell, FiConn, BCube… • Similar topology: level-based, recursively defined… • Put routing intelligence on servers
System architecture • Totoro • Two-port servers • Low-end switches • Level-based • Recursively defined [Figure: Totoro structure of one level, with two-port NICs]
System architecture • Building Totoro • Connect N servers to an N-port switch • Here, N=4 • Basic partition: Totoro0 • Intra-switch • A Totoro0 Structure
System architecture • Building Totoro • Available ports in Totoro0: c. Here, c=4 • Connect n Totoro0s to n-port switches by using c/2 ports • Inter-switch • A Totoro1 structure consists of n Totoro0s.
System architecture • Building Totoro • Connect n Totoro_(i-1)s to n-port switches to build a Totoro_i • Recursively defined • Only half of the available ports are used at each level ⇒ Open & scalable • The number of paths among Totoro_is is n/2 times the number of paths among Totoro_(i-1)s ⇒ Multi-redundant links ⇒ High network capacity
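The recursive construction above implies a simple size formula: a Totoro0 holds N servers and each higher level joins n copies of the level below. The sketch below only restates that counting rule; the function name `totoro_servers` is a hypothetical helper, not part of the original work, and it assumes every level is fully populated.

```python
def totoro_servers(N: int, n: int, K: int) -> int:
    """Count servers in a fully populated level-K structure.

    Assumption (from the construction rule on the slides): a level-0
    partition connects N servers to one N-port switch, and each level i
    is built from n copies of level i-1.
    """
    if N < 1 or n < 1 or K < 0:
        raise ValueError("N and n must be positive, K must be non-negative")
    servers = N              # level 0: N servers on one switch
    for _ in range(K):       # each extra level multiplies the size by n
        servers *= n
    return servers


if __name__ == "__main__":
    # Parameters used in the slides' example figure: N = 4, n = 4, K = 2.
    print(totoro_servers(4, 4, 2))
```

The loop makes the recursion explicit; the closed form is N multiplied by n to the power K.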
System architecture • Please refer to [7] for details. [7] Xie, J., Deng, Y., Zhou, K.: Totoro: A scalable and fault-tolerant data center network by using backup port. In: Network and Parallel Computing. Springer (2013) 94–105 [Figure: Totoro2 structure with N = 4, n = 4, K = 2]
System architecture • Athena Routing Algorithm (ARA) • Based on Dynamic Programming (DP) • Applicable to problems which exhibit the properties of • Overlapping subproblems • Optimal substructure • Recursively calculate
System architecture • Steps of ARA: • Suppose src and dst belong to two partitions. • Get all paths connecting these two partitions. • For each path, recursively calculate it. • Store all paths. • Sort all paths by length. • Remove the extra paths. Callouts: the function that gets the connecting paths is based on the corresponding structural properties; sub-paths are combined by Cartesian product.
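The steps above can be sketched as a memoized recursion. This is a minimal illustration, not the authors' implementation: server addresses are modeled as tuples of digits, and `toy_links` is a made-up stand-in for the structure-specific function that lists the links between two sub-partitions (here, servers 0 and 2 of each level-0 partition are assumed to be wired to their counterparts). The names `make_router`, `route` and `keep` are likewise hypothetical.

```python
from functools import lru_cache
from itertools import product


def toy_links(prefix, a, b):
    """Hypothetical wiring: server j of sub-partition a reaches server j of
    sub-partition b, for j in (0, 2). Real structures define this differently."""
    return [(prefix + (a, j), prefix + (b, j)) for j in (0, 2)]


def make_router(inter_links, keep=1):
    """Build a route(src, dst) function returning up to `keep` shortest paths."""

    @lru_cache(maxsize=None)          # overlapping subproblems are solved once
    def route(src, dst):
        if src == dst:
            return ((src,),)
        k = 0                         # length of the common address prefix
        while src[k] == dst[k]:
            k += 1
        if k == len(src) - 1:         # same level-0 partition: one switch hop
            return ((src, dst),)
        paths = []
        # Steps 1-2: src and dst sit in different sub-partitions; list the links.
        for exit_s, entry_s in inter_links(src[:k], src[k], dst[k]):
            # Step 3: solve both sides recursively, join by Cartesian product.
            for head, tail in product(route(src, exit_s), route(entry_s, dst)):
                paths.append(head + tail)   # Step 4: store every full path
        paths.sort(key=len)                 # Step 5: sort by length
        return tuple(paths[:keep])          # Step 6: drop the extra paths

    return route


if __name__ == "__main__":
    route = make_router(toy_links, keep=2)
    for path in route((0, 1), (1, 2)):
        print(len(path) - 1, "hops:", path)
```

Setting `keep=1` returns only the shortest path, while a larger value keeps alternatives for multi-path use.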
System architecture • Case study of ARA • work out the path from src to dst
System architecture • Case study of ARA • Step. 1: src and dst belong to two different sub-partitions respectively
System architecture • Case study of ARA • Step. 2: there exist two paths between these two sub-partitions
System architecture • Case study of ARA • Step. 3: for Path 1, recursively work out the sub-paths in these sub-partitions, and join them for a full path
System architecture • Case study of ARA • Step. 4: similarly, work out the full path for Path 2
System architecture • Case study of ARA • Step. 5: add all paths into the result set
System architecture • Case study of ARA • Step. 6: sort the paths by length
System architecture • Case study of ARA • Step. 7: remove the extra paths (here, we suppose the size of the set to return is 1, i.e., only the shortest path is kept)
System architecture • Path Probing Mechanism • Source host sends the probing request packets • Destination host sends probing reply packets • Intermediate servers record the link capacities in the probing packets and forward them
System architecture • Path Probing Mechanism • Detect the failed paths ⇒ No extra rerouting technique is required • Detect the link capacity ⇒ Supports load balancing…
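The probing idea can be illustrated with a small simulation. This is a sketch under stated assumptions, not the protocol itself: link state is modeled as a dictionary of capacities (a capacity of 0 or a missing entry means the link is down), and choosing the path with the largest bottleneck capacity is one plausible traffic-aware policy that the slides do not specify. The names `probe` and `pick_path` are hypothetical.

```python
import math


def probe(path, capacity):
    """Simulate one probing round trip along `path`.

    Each hop appends the capacity of the link it forwards on. If a link is
    down the probe is lost and None is returned, which is how the source
    learns that the path has failed.
    """
    recorded = []
    for u, v in zip(path, path[1:]):
        c = capacity.get((u, v), 0)
        if c <= 0:
            return None
        recorded.append(c)
    return recorded


def pick_path(paths, capacity):
    """Return the live path with the largest bottleneck capacity, or None."""
    best, best_bottleneck = None, -1.0
    for path in paths:
        recorded = probe(path, capacity)
        if recorded is None:          # failed path: skipped, no rerouting step
            continue
        bottleneck = min(recorded, default=math.inf)
        if bottleneck > best_bottleneck:
            best, best_bottleneck = path, bottleneck
    return best


if __name__ == "__main__":
    candidates = [("A", "B", "D"), ("A", "C", "D")]
    links = {("A", "B"): 10, ("B", "D"): 0, ("A", "C"): 5, ("C", "D"): 8}
    print(pick_path(candidates, links))
```

Because failed paths simply never produce a reply, the same mechanism covers both failure detection and capacity measurement.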
System architecture • Protocol Implementation • ARM Packet format • Path-probing packet • Data packet
System architecture • Protocol Implementation • Protocol • 2.5-layer protocol • How does an intermediate server determine the next hop? • A fact: two adjacent servers in a path differ at only one “bit” • Hence, we only store the differing “bit”s in the vector.
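The one-differing-“bit” observation suggests a compact path encoding, sketched below. The actual ARM packet layout is not given in the slides, so the representation here is an assumption: addresses are tuples of digits, and each hop is stored as a hypothetical `(position, new_value)` pair. The function names are illustrative only.

```python
def encode_path(path):
    """Encode a path as the list of single-digit changes between neighbours."""
    hops = []
    for cur, nxt in zip(path, path[1:]):
        diff = [i for i, (a, b) in enumerate(zip(cur, nxt)) if a != b]
        if len(cur) != len(nxt) or len(diff) != 1:
            raise ValueError("adjacent servers must differ in exactly one digit")
        pos = diff[0]
        hops.append((pos, nxt[pos]))
    return hops


def next_hop(current, hop):
    """Apply one stored (position, new_value) entry to the current address."""
    pos, value = hop
    return current[:pos] + (value,) + current[pos + 1:]


def decode_path(src, hops):
    """Rebuild the full path from the source address and the hop vector."""
    path = [src]
    for hop in hops:
        path.append(next_hop(path[-1], hop))
    return path


if __name__ == "__main__":
    full = [(0, 1), (0, 2), (1, 2)]
    vector = encode_path(full)
    print(vector)
    print(decode_path(full[0], vector) == full)
```

An intermediate server only needs its own address and the next entry of the vector to compute where to forward the packet.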
Evaluation • Evaluating Path Failure & Average Path Lengths • ARM vs. TFR vs. SPA • TFR: the original Totoro Fault-tolerant Routing algorithm • SPA: Shortest Path Algorithm (Floyd-Warshall), used as the performance bound • Evaluating Resource Usage
Evaluation • Evaluating Path Failure & Average Path Lengths • Experimental parameters
Evaluation • Evaluating Path Failure • Path failure ratio vs. server/rack failure ratio • The performance of both ARM and TFR is almost identical to that of SPA!
Evaluation • Evaluating Path Failure • Path failure ratio vs. switch failure ratio • The performance of ARM is almost identical to that of SPA! • But TFR isn’t.
Evaluation • Evaluating Path Failure • Path failure ratio vs. link failure ratio • When the link failure ratio is high: • ARM achieves slightly better capacity than TFR. • A performance gap between ARM and SPA still exists! • SPA traverses all feasible links in the whole structure until it finds a valid path! • This is a tradeoff that ARM makes to reduce algorithmic complexity and save computation resources.
Evaluation • Evaluating Average Path Lengths • ARM: • Better than TFR. • Almost identical to SPA. • Where ARM is shorter than SPA, this is because the path failure ratio of ARM is a bit higher than that of SPA, so ARM's total path length is shorter.
Evaluation • Evaluating Resource Usage • Experimental parameters
Evaluation • Evaluating Resource Usage • CPU: • Nodes increase by 10 per second • Peak value of 28% at 18 s • Benefits from the cache • Memory: • Each host uses at most 164 KB [Figure: CPU usage over time, rising from 0% to a peak of 28% at 18 s with +10 nodes/s]
Conclusion • More resilient • Shorter latency • Higher capacity • Lower complexity • In future work, we will focus on the implementation of ARM in DCell, FiConn and other structures!
Athena: A fault-tolerant, efficient and applicable routing mechanism for data centers Thanks!