
Totoro: A Scalable and Fault-Tolerant Data Center Network by Using Backup Port

Junjie Xie (1), Yuhui Deng (1), Ke Zhou (2)
(1) Department of Computer Science, Jinan University
(2) School of Computer Science & Technology, Huazhong University of Science & Technology


Presentation Transcript


  1. Totoro: A Scalable and Fault-Tolerant Data Center Network by Using Backup Port • Junjie Xie (1), Yuhui Deng (1), Ke Zhou (2) • (1) Department of Computer Science, Jinan University • (2) School of Computer Science & Technology, Huazhong University of Science & Technology

  2. Agenda • Motivation • Challenges • Related work • Our idea • System architecture • Evaluation • Conclusion

  3. Motivation • The Explosive Growth of Data ⇒ Large Data Center • Industrial manufacturing, e-commerce, social networks... • IDC: 1,800 EB of data in 2011, with a 40-60% annual increase • YouTube: 72 hours of video are uploaded per minute. • Facebook: 1 billion active users upload 250 million photos per day. Image from http://www.buzzfeed.com

  4. Feb. 2011, Science: On the Future of Genomic Data. • Feb. 2011, Science: Climate Data Challenges in the 21st Century. • Jim Gray: The global amount of information would double every 18 months (1998).

  5. Challenges • IDC report: Most of the data would be stored in data centers. • Large Data Center ⇒ Scalability • Google: 19 data centers, > 1 million servers • Facebook, Microsoft, Amazon…: > 100k servers • Large Data Center ⇒ Fault Tolerance • Google MapReduce: • 5 nodes fail during a job • 1 disk fails every 6 hours • Therefore, the data center network has to be very scalable and fault-tolerant. (Image: Google Data Center)

  6. Related work • Tree-based Structure • Bandwidth bottleneck, single points of failure, expensive • Fat-tree • High capacity • Limited scalability (Figures: Fat-tree; Tree-based Structure)

  7. DCell is a level-based, recursively defined interconnection structure. • It requires multiport (e.g., 3, 4 or 5) servers. • DCell scales doubly exponentially with the server node degree. • It is also fault-tolerant and supports high network capacity. • Downside: It trades expensive core switches/routers for multiport NICs and higher wiring cost. • DCell: scalable, fault-tolerant, high capacity, complex, expensive. C. Guo, H. Wu, K. Tan, L. Shi, Y. Zhang, and S. Lu. DCell: A Scalable and Fault-Tolerant Network Structure for Data Centers. In: Proc. of the ACM SIGCOMM'08, Aug. 2008.

  8. FiConn uses servers with two built-in ports and low-end commodity switches to form the structure. • FiConn has a lower wiring cost than DCell. • Routing in FiConn balances the use of links at different levels and is traffic-aware, so it makes better use of link capacity. • Downside: It has lower aggregate network capacity. • FiConn: scalable, fault-tolerant, low capacity. D. Li, C. Guo, H. Wu, K. Tan, and S. Lu. FiConn: Using Backup Port for Server Interconnection in Data Centers. In: Proc. of the IEEE INFOCOM, 2009. Other architectures: PortLand, VL2, CamCube…

  9. Our idea: Totoro • What we achieve: • Scalability: Millions of servers • Fault-tolerance: Structure & Routing • Low cost: Commodity devices • High capacity: Multi-redundant links (Figure: Totoro Structure of One Level)

  10. A Totoro structure with N = 4, n = 4, K = 2.

  11. Architecture: • Two-port servers • Low-end switches • Recursively defined • Building Algorithm (Figure labels: two-port NIC; k-level Totoro)

  12. Connect N servers to an N-port switch • Here, N = 4 • Basic partition: Totoro0 • Intra-switch (Figure: A Totoro0 Structure)

  13. Available ports in Totoro0: c. Here, c = 4 • Connect n Totoro0s to n-port switches by using c/2 ports • Inter-switch • A Totoro1 structure consists of n Totoro0s.

  14. Connect n Totoro_(i-1)s to n-port switches to build a Totoro_i • Recursively defined • Only half of the available ports are used at each level ⇒ Open & Scalable • The number of paths among Totoro_i s is n/2 times the number of paths among Totoro_(i-1)s ⇒ Multi-redundant links ⇒ High network capacity

  15. Totoro Interconnection Network: Building Algorithm

      TotoroBuild(N, n, K) {
        Define t_K = N * n^K
        Define server = [a_K, a_(K-1), …, a_i, …, a_1, a_0]
        For tid = 0 to (t_K - 1)
          For i = 0 to (K - 1)
            a_(i+1) = (tid / (N * n^i)) mod n
          a_0 = tid mod N
          Define intra-switch = (0 - a_K, a_(K-1), …, a_1, a_0)
          Connect(server, intra-switch)
          For i = 1 to K
            If ((tid - 2^(i-1) + 1) mod 2^i == 0)
              Define inter-switch (u - b_(K-u), …, b_i, …, b_0)
              u = i
              For j = i to (K - 1)
                b_j = (tid / (N * n^(j-1))) mod n
              b_0 = (tid / 2^u) mod (N / n * (n/2)^u)
              Connect(server, inter-switch)
      }

      The key: work out the level of the outgoing link of this server.
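The two per-server computations in the building algorithm above, the address digits and the level of the outgoing link, can be sketched in Python. This is an illustration rather than the authors' implementation: the function names `address` and `link_level` are hypothetical, the exponents are read as n^i, 2^(i-1) and 2^i, and the switch-address part (the b_j values) is left out.

```python
def address(tid, N, n, K):
    """Return [a_K, ..., a_1, a_0] for server `tid` (hypothetical helper).

    a_0 = tid mod N and a_(i+1) = (tid / (N * n^i)) mod n, as in the
    building algorithm, with "/" read as integer division.
    """
    digits = [tid % N]                      # a_0: index inside the Totoro0
    for i in range(K):                      # a_(i+1) for i = 0 .. K-1
        digits.append((tid // (N * n**i)) % n)
    return digits[::-1]                     # most significant digit first


def link_level(tid, K):
    """Level of the inter-switch that the second port connects to.

    Returns None when no level matches, i.e. the port stays free
    (assumption: unmatched servers keep their backup port available).
    """
    for i in range(1, K + 1):
        if (tid - 2**(i - 1) + 1) % 2**i == 0:
            return i
    return None


if __name__ == "__main__":
    # Structure from slide 10: N = 4, n = 4, K = 2, so 64 servers.
    levels = [link_level(tid, 2) for tid in range(64)]
    print(address(37, 4, 4, 2))
    print(levels.count(1), levels.count(2), levels.count(None))
```

Under this reading, half of the 64 servers attach to a level-1 inter-switch, a quarter to a level-2 inter-switch, and a quarter keep a free port, which agrees with the "half of available ports" rule on slide 14.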

  16. Building Algorithm Millions of servers
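As a rough check of the scalability claim, the server count t_K = N * n^K from the building algorithm can be evaluated directly. The sketch below is illustrative only: `server_count` and `free_ports` are hypothetical names, the N = n = 32, K = 3 configuration is an assumed example, and `free_ports` assumes that a Totoro0 exposes N free ports and that every level consumes exactly half of the ports still available.

```python
def server_count(N, n, K):
    """t_K = N * n^K servers in a K-level Totoro."""
    return N * n**K


def free_ports(N, n, K):
    """Backup ports still unused after K levels (assumes half are used per level)."""
    return server_count(N, n, K) // 2**K


if __name__ == "__main__":
    # The first two configurations are the evaluation platforms on slide 31;
    # the third is an assumed example that reaches about one million servers.
    for N, n, K in [(48, 48, 1), (16, 16, 2), (32, 32, 3)]:
        print(N, n, K, server_count(N, n, K), free_ports(N, n, K))
```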

  17. Totoro Routing • Totoro Routing Algorithm (TRA) • Basic routing, not fault-tolerant • Totoro Broadcast Domain (TBD) • Detect & share link states • Totoro Fault-tolerant Routing (TFR) • TRA + Dijkstra algorithm (based on TBD)

  18. Totoro Routing Algorithm (TRA) • Divide & Conquer algorithm • Path from src to dst?

  19. Totoro Routing Algorithm (TRA) • Step 1: src and dst belong to two different partitions respectively

  20. Totoro Routing Algorithm (TRA) • Step 2: Take a link between these two partitions

  21. Totoro Routing Algorithm (TRA) • m and n are the intermediate servers • The intermediate path is from m to n

  22. Totoro Routing Algorithm (TRA) • Step 3: If src (dst) and m (n) are in the same basic partition, return the direct path

  23. Totoro Routing Algorithm (TRA) • Step 3: Otherwise, return to Step 1 to work out the path from src (dst) to m (n)

  24. Totoro Routing Algorithm (TRA) • Step 4: Join P(src, m), P(m, n) and P(n, dst) to form the full path
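The divide-and-conquer recursion in Steps 1 to 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: servers are address tuples (a_K, ..., a_1, a_0), the function names are hypothetical, and the choice of the inter-partition link is delegated to a caller-supplied `pick_link`. The `toy_pick_link` used here simply picks the servers whose low digits are zero; it stands in for Totoro's real wiring, which the slides do not spell out.

```python
def tra(src, dst, pick_link):
    """Return a server-level path from src to dst (switches are omitted)."""
    if src == dst:
        return [src]
    # Highest differing digit decides which level separates the two servers.
    first_diff = next(i for i, (s, d) in enumerate(zip(src, dst)) if s != d)
    level = len(src) - 1 - first_diff
    if level == 0:
        # Same basic partition (Totoro0): direct path through the intra-switch.
        return [src, dst]
    # Step 2: take a link (m, n) between the two partitions.
    m, n = pick_link(src, dst, level)
    # Steps 3 and 4: solve both sub-problems recursively and join the paths.
    return tra(src, m, pick_link) + tra(n, dst, pick_link)


def toy_pick_link(src, dst, level):
    """Hypothetical link choice: keep digits a_K..a_level, zero the rest."""
    keep = len(src) - level
    m = src[:keep] + (0,) * level
    n = dst[:keep] + (0,) * level
    return m, n


if __name__ == "__main__":
    for hop in tra((0, 1, 2), (1, 3, 1), toy_pick_link):
        print(hop)
```

Because m shares every digit down to a_level with src, each recursive call works at a strictly lower level, so the recursion always terminates at a basic partition.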

  25. Totoro Routing Algorithm (TRA) • The performance of TRA is close to that of the shortest-path (SP) algorithm across different network sizes. • Simple & Efficient (Table: the mean value and standard deviation of path length for TRA and the SP algorithm in Totoro_u of different sizes. M_u is the maximum distance between any two servers in Totoro_u; t_u is the total number of servers.)

  26. Totoro Broadcast Domain (TBD) • Fault-tolerance ⇒ Detect and share link states • Time cost & CPU load ⇒ Global strategy is impossible • Divide Totoro into several TBDs (Figure legend: green = inner-server, yellow = outer-server)

  27. Totoro Fault-tolerant Routing (TFR) • Two strategies: • Dijkstra algorithm within TBD • TRA between TBDs • Proxy: a temporary destination • Next hop: the next server on P(src, proxy/dst)

  28. Totoro Fault-tolerant Routing (TFR) • If the proxy is unreachable

  29. Totoro Fault-tolerant Routing (TFR) • Reroute the packet to another proxy by using local redundant links
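The TFR behaviour inside one broadcast domain, shortest-path routing over known link states with a fallback to another proxy, can be sketched as below. It is an assumption-laden illustration rather than the authors' implementation: the graph is a plain adjacency dict, every link has unit cost, failed links are given as a set of `frozenset({u, v})` pairs, and the names `shortest_path` and `next_hop` are hypothetical.

```python
import heapq


def shortest_path(graph, src, dst, failed=frozenset()):
    """Dijkstra with unit link costs that skips links listed in `failed`."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue                        # stale heap entry
        for v in graph.get(u, ()):
            if frozenset((u, v)) in failed:
                continue                    # link reported as down
            nd = d + 1
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None                         # dst unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def next_hop(graph, src, proxies, failed=frozenset()):
    """Try proxies in order; return (proxy, next server) or None."""
    for proxy in proxies:
        path = shortest_path(graph, src, proxy, failed)
        if path and len(path) > 1:
            return proxy, path[1]
    return None


if __name__ == "__main__":
    tbd = {"s": ["a", "b"], "a": ["s", "p1"], "b": ["s", "p2"],
           "p1": ["a"], "p2": ["b"]}
    print(next_hop(tbd, "s", ["p1", "p2"]))
    print(next_hop(tbd, "s", ["p1", "p2"], {frozenset(("a", "p1"))}))
```

In the second call the link to the first proxy is marked as failed, so the packet is redirected towards the second proxy over the remaining local link.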

  30. Evaluation • Evaluating Path Failure • Totoro vs. Shortest Path Algorithm (Floyd-Warshall) • Evaluating Network Structure • Totoro vs. Tree-based structure, Fat-Tree, DCell & FiConn

  31. Evaluating Path Failure • Types of failures • Link, Node, Switch & Rack failures • Comparison • TFR vs. SP • Platform • Totoro1 (N=48, n=48, K=1, tK=2,304 servers) • Totoro2 (N=16, n=16, K=2, tK=4,096 servers) • Failure ratios • 2% - 20% • Communication mode • All-to-all • Simulation runs • 20

  32. Evaluating Path Failure • Path failure ratio vs. node failure ratio. • The performance of TFR is almost identical to that of SP • Maximize the usage of redundant links when a node failure occurs

  33. Evaluating Path Failure • Path failure ratio vs. link failure ratio. • TFR performs well when the link failure ratio is small (i.e., <4%). • The performance gap between TFR and SP widens as the link failure ratio grows. • TFR is not globally optimal • It is not guaranteed to find an existing path • There is substantial room for improvement

  34. Evaluating Path Failure • Path failure ratio vs. switch failure ratio. • TFR performs almost as well as SP in Totoro1 • The performance gap between TFR and SP becomes larger and larger in Totoro2

  35. Evaluating Path Failure • Path failure ratio vs. switch failure ratio. • Path failure ratio of SP is lower in a higher-level Totoro • More redundant high-level switches help bypass the failure

  36. Evaluating Path Failure • Path failure ratio vs. rack failure ratio. • In a low-level Totoro, TFR achieves results very close to SP. • The performance of TFR in a relatively high-level Totoro can be improved.

  37. Evaluating Network Structure • Low degree • Approaches but never reaches 2 • Lower degree ⇒ Lower deployment and maintenance overhead. • N: the number of ports on an intra-switch • n: the number of ports on an inter-switch • T: the total number of servers • For Totoro, there is: [formula not recoverable from the transcript]
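The claim that the degree approaches but never reaches 2 can be illustrated with a short calculation. The expression used here is derived from the building rule on slide 14, not taken from the slide itself: it assumes every server uses one port for its intra-switch and that a fraction 1/2^K of the servers keep their second port free after K levels. The function name `average_server_degree` is hypothetical.

```python
from fractions import Fraction


def average_server_degree(K):
    """Average number of used ports per server in a K-level Totoro.

    Assumption: one intra-switch port per server, plus a second port on
    every server except the 1/2^K fraction whose backup port stays free.
    """
    return Fraction(2) - Fraction(1, 2**K)


if __name__ == "__main__":
    for K in range(1, 6):
        print(K, float(average_server_degree(K)))
```

For the N = 4, n = 4, K = 2 structure this gives 7/4, matching a direct count of 64 intra-switch ports plus 48 inter-switch ports over 64 servers.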

  38. Evaluating Network Structure • Relatively large diameter • Smaller diameter ⇒ More efficient routing mechanism • In practice, the diameter of a Totoro3 with 1M servers is only 18. • This can be improved.

  39. Evaluating Network Structure • Large bisection width • Large bisection width ⇒ Fault-tolerant & Resilient • For small k, the bisection width is large. • BiW = T/4, T/8, T/16 when k = 1, 2, 3.
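To make the bisection width figures concrete, the three values quoted above can be applied to the two evaluation platforms. The sketch only encodes the k = 1, 2, 3 cases stated on the slide and does not extrapolate beyond them; `BISECTION_DIVISOR` and `bisection_width` are hypothetical names.

```python
# Divisors taken from the slide: BiW = T/4, T/8, T/16 for k = 1, 2, 3.
BISECTION_DIVISOR = {1: 4, 2: 8, 3: 16}


def bisection_width(T, k):
    """Bisection width of a k-level Totoro with T servers (k in 1..3 only)."""
    return T // BISECTION_DIVISOR[k]


if __name__ == "__main__":
    print(bisection_width(2304, 1))   # Totoro1 platform from slide 31
    print(bisection_width(4096, 2))   # Totoro2 platform from slide 31
```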

  40. Conclusion • Scalability: • Millions of servers & Open structure • Fault-tolerance: • Structure & Routing mechanism • Low cost: • Two-port servers & Commodity switches • High capacity: • Multi-redundant links • Totoro is a viable interconnection solution for data centers!

  41. Future Work • Fault-tolerance: • Structure • How to be more resilient? • Routing under complex failures: • More robust rerouting techniques? • Network capacity • Data locality: • Mapping between servers and switches? • Data storage allocation policies?

  42. Totoro: A Scalable and Fault-Tolerant Data Center Network by Using Backup Port Thanks!
