
The Little Engine(s) That Could: Scaling Online Social Networks

B99106017, Library and Information Science, third year, 謝宗昊


Presentation Transcript


  1. The Little Engine(s) That Could: Scaling Online Social Networks • B99106017, Library and Information Science, third year, 謝宗昊

  2. Outline • Background • SPAR • Evaluation and Comparison • Conclusion

  3. Outline • Background • SPAR • Evaluation and Comparison • Conclusion

  4. New challenge for system design • Online Social Networks (OSNs) are hugely interconnected • OSNs can grow rapidly in a short period of time • Twitter grew by 1382% between Feb and Mar 2009 • Rapid growth forces costly re-architecting of the service • Conventional vertical scaling is not a good solution • Horizontal scaling runs into the interconnection problem: tightly linked users end up on different servers • The result is a performance bottleneck

  5. Full Replication

  6. Random Partition (DHT)

  7. Random Partition (DHT) with replication of the neighbors
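The scheme named on slide 7 can be sketched in a few lines. This is a minimal illustration, not the presenter's code: `build_servers` and its data layout are hypothetical, and `user % num_servers` is only a stand-in for a DHT hash function. Each user gets a master on one server, and every friend whose master lives elsewhere is copied there as a replica so that a user's neighbourhood can be read from a single server.

```python
def build_servers(users, friendships, num_servers):
    """Randomly partition users, then replicate remote neighbours.

    Hypothetical helper: `user % num_servers` stands in for a DHT hash.
    """
    master_of = {u: u % num_servers for u in users}
    servers = [{"masters": set(), "replicas": set()} for _ in range(num_servers)]
    for user, server in master_of.items():
        servers[server]["masters"].add(user)
    for u, v in friendships:
        if master_of[u] != master_of[v]:
            # Each endpoint needs a copy of the other on its own server.
            servers[master_of[u]]["replicas"].add(v)
            servers[master_of[v]]["replicas"].add(u)
    return master_of, servers


users = range(6)
friendships = [(0, 1), (0, 2), (0, 3), (1, 4)]
master_of, servers = build_servers(users, friendships, num_servers=3)
total_replicas = sum(len(s["replicas"]) for s in servers)
print(master_of, total_replicas)
```

Because the placement ignores the friendship graph, most edges cross servers and each crossing edge can cost up to two replicas, which is the overhead SPAR tries to reduce.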

  8. SPAR

  9. Designer’s Dilemma • Option 1: commit resources to developing features (+ appealing features attract new users; - risk of “death by success”) • Option 2: ensure scalability first (+ low risk of “death by success”; - hard to compete with more creative competitors)

  10. Outline • Background • SPAR • Evaluation and Comparison • Conclusion

  11. SPAR • A Social Partitioning And Replication middleware for social applications

  12. What does SPAR do and not do? • Does: solves the Designer’s Dilemma for early-stage OSNs; avoids performance bottlenecks in established OSNs; minimizes the effect of provider lock-in • Does not: it is not designed for distributing content such as pictures and videos; it is not a solution for storage or for batch data analysis of the kind done with Hadoop

  13. How does SPAR do it? • Provides local semantics • Handles node and edge dynamics with minimal overhead • Serves more requests while reducing network traffic

  14. Problem Statement • Maintain local semantics • Balance loads • Be resilient to machine failures • Be amenable to online operation • Be stable • Minimize replication overhead

  15. Why not graph/social partitioning? • Not incremental • Community detection is too sensitive to small changes in the graph • Reducing inter-partition edges ≠ reducing the number of replicas
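The last point on slide 15 can be checked on a toy graph. The sketch below is an illustration built for this transcript, not an example from the slides: the graph, the two partitionings, and the helper names `cut_edges` and `replica_count` are all assumptions. A replica of node `v` is counted once per server that hosts a neighbour of `v` but not `v`'s master.

```python
def cut_edges(masters, edges):
    """Number of edges whose endpoints have masters on different servers."""
    return sum(1 for u, v in edges if masters[u] != masters[v])


def replica_count(masters, edges):
    """Number of (node, server) replicas needed so every node sees its neighbours locally."""
    replicas = set()
    for u, v in edges:
        if masters[u] != masters[v]:
            replicas.add((u, masters[v]))
            replicas.add((v, masters[u]))
    return len(replicas)


# Hypothetical 8-node graph, split across two servers in two different ways.
EDGES = [(1, 3), (1, 4), (2, 3), (2, 4), (1, 5), (2, 6), (3, 7), (5, 6), (7, 8)]
PARTITION_A = {1: 0, 2: 0, 3: 0, 4: 0, 5: 1, 6: 1, 7: 1, 8: 1}
PARTITION_B = {1: 0, 2: 0, 5: 0, 6: 0, 3: 1, 4: 1, 7: 1, 8: 1}

for name, part in (("A", PARTITION_A), ("B", PARTITION_B)):
    print(name, cut_edges(part, EDGES), replica_count(part, EDGES))
```

Partitioning A cuts 3 edges but needs 6 replicas, while B cuts 4 edges and needs only 4, because B's cut edges share endpoints and one replica serves several of them.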

  16. Description • Node addition/removal • Edge addition/removal • Server addition/removal

  17. Node addition/removal • Node addition: the new node's master goes to the partition with the fewest master replicas • Node removal: the master and all slave replicas of the node are removed, and the nodes that had an edge with it are updated
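The two rules on slide 17 can be sketched as follows. This is a simplified illustration under stated assumptions: `add_node`, `remove_node`, and the dictionaries `masters` (node to server), `replicas` (server to set of replicated nodes), and `adjacency` (node to set of neighbours) are hypothetical names, and cleaning up neighbours' replicas that become unnecessary after a removal is left out.

```python
def add_node(masters, node, servers):
    """Place the new node's master on the server with the fewest masters."""
    load = {s: 0 for s in servers}
    for server in masters.values():
        if server in load:
            load[server] += 1
    target = min(servers, key=lambda s: load[s])  # ties go to the first server listed
    masters[node] = target
    return target


def remove_node(masters, replicas, adjacency, node):
    """Drop the node's master and slaves, then update its former neighbours."""
    masters.pop(node, None)
    for held in replicas.values():
        held.discard(node)
    for neighbour in adjacency.pop(node, set()):
        adjacency[neighbour].discard(node)


masters = {1: 0, 2: 0, 3: 1}
replicas = {0: {3}, 1: {1}}
adjacency = {1: {3}, 2: set(), 3: {1}}

chosen = add_node(masters, 4, servers=[0, 1])
adjacency[4] = set()
remove_node(masters, replicas, adjacency, 3)
print(chosen, masters, replicas, adjacency)
```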

  18. Edge addition/removal • Edge addition between u and v has three possible configurations: no movement of masters; the master of u moves to the partition containing the master of v; the master of v moves to the partition containing the master of u • Edge removal: remove the replica of u from the partition holding the master of v if no other node there requires it
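One way to read the edge-addition rule is as a comparison of the three configurations by the number of replicas each would require. The sketch below takes that reading as an assumption: it recounts replicas from scratch for each candidate and ignores load balancing, so it is a simplification rather than the incremental procedure a real system would use. The names `count_replicas` and `place_edge` are hypothetical.

```python
def count_replicas(masters, edges):
    """Replicas needed so that every node finds all its neighbours on its own server."""
    replicas = set()
    for u, v in edges:
        if masters[u] != masters[v]:
            replicas.add((u, masters[v]))
            replicas.add((v, masters[u]))
    return len(replicas)


def place_edge(masters, edges, u, v):
    """Add edge (u, v) and pick the configuration with the fewest replicas.

    Simplification: load balance is not considered; ties favour no movement.
    """
    new_edges = edges + [(u, v)]
    candidates = {
        "no_move": dict(masters),
        "move_u_to_v": {**masters, u: masters[v]},
        "move_v_to_u": {**masters, v: masters[u]},
    }
    best = min(candidates, key=lambda name: count_replicas(candidates[name], new_edges))
    return best, candidates[best], new_edges


masters = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
edges = [(2, 3), (4, 5), (4, 6)]
choice, new_masters, new_edges = place_edge(masters, edges, 1, 4)
print(choice, count_replicas(new_masters, new_edges))
```

In this toy input node 1 has no other friends, so moving its master next to node 4 needs no replicas at all, whereas moving node 4 would separate it from nodes 5 and 6.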

  19. Edge addition

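  20. Server addition/removal • Server addition has two solutions: force an immediate redistribution of masters so that load is balanced at once, or let masters migrate gradually through the normal node/edge addition processes • Server removal: the masters on the removed server are reassigned, and the most highly connected nodes choose their new server first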
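The server-removal rule can be sketched as a greedy reassignment. The slide only states that highly connected nodes choose first; the selection criterion used here (the server already hosting the most of the node's neighbours, subject to a per-server cap) is an assumption made for illustration, as are the function name `remove_server` and the `capacity` parameter.

```python
def remove_server(masters, adjacency, removed, remaining, capacity):
    """Reassign masters from `removed`, most-connected nodes first.

    Assumptions: each displaced node picks the remaining server that hosts
    most of its neighbours, and no server may exceed `capacity` masters.
    """
    load = {s: 0 for s in remaining}
    for server in masters.values():
        if server in load:
            load[server] += 1
    displaced = [n for n, s in masters.items() if s == removed]
    displaced.sort(key=lambda n: len(adjacency.get(n, ())), reverse=True)

    new_masters = dict(masters)
    for node in displaced:
        open_servers = [s for s in remaining if load[s] < capacity]
        if not open_servers:
            raise ValueError("not enough capacity on the remaining servers")
        target = max(
            open_servers,
            key=lambda s: sum(1 for nb in adjacency.get(node, ()) if new_masters.get(nb) == s),
        )
        new_masters[node] = target
        load[target] += 1
    return new_masters


masters = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2, 6: 2}
adjacency = {1: {5}, 2: {5}, 3: {6}, 4: set(), 5: {1, 2, 6}, 6: {3, 5}}
print(remove_server(masters, adjacency, removed=2, remaining=[0, 1], capacity=3))
```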

  21. Implementation • SPAR is a middleware (MW) layer between the datacenter and the application • SPAR includes 4 components: Directory Service (DS), Local Directory Service (LDS), Partition Manager (PM), and Replication Manager (RM)

  22. Implementation

  23. Outline • Background • SPAR • Evaluation and Comparison • Conclusion

  24. Evaluation methodology • Metrics: replication overhead; K-redundancy requirement • Datasets: Twitter (12M tweets generated by 2.4M users); Facebook (60,290 nodes and 1,545,686 edges); Orkut (3M nodes and 223M edges)
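The two metrics can be combined in a small calculation. This sketch assumes that replication overhead means the average number of slave replicas per user, and that the K-redundancy requirement means every user keeps at least K slave replicas even when local semantics alone would need fewer. Both readings, and the function name `replication_overhead`, are assumptions made for illustration.

```python
from collections import defaultdict


def replication_overhead(masters, edges, k=0):
    """Average slave replicas per user under local semantics and K-redundancy.

    Assumes k is no larger than the number of servers minus one.
    """
    needed = defaultdict(set)  # node -> servers that must hold a slave of it
    for u, v in edges:
        if masters[u] != masters[v]:
            needed[u].add(masters[v])
            needed[v].add(masters[u])
    total = sum(max(len(needed[node]), k) for node in masters)
    return total / len(masters)


masters = {1: 0, 2: 0, 3: 1, 4: 2}
edges = [(1, 3), (1, 4), (2, 3)]
print(replication_overhead(masters, edges, k=0))
print(replication_overhead(masters, edges, k=2))
```

With these toy inputs the overhead is 1.25 without a redundancy requirement and 2.0 with K=2, since three of the four users must be padded up to two replicas.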

  25. Evaluation methodology • Algorithms for comparison • Random Partitioning: the solution used by Facebook and Twitter • Graph Partitioning (METIS): minimizes inter-partition edges • Modularity Optimization (MO+) algorithm: community detection

  26. Evaluation of replication overhead

  27. SPAR Versus Random for K=2

  28. Dynamic operations and SPAR

  29. Dynamic operations and SPAR

  30. Adding/Removing Server • Adding a server has two policies: • Wait for new arrivals to fill up the server • Redistribute existing masters from the other servers onto the new server

  31. Adding/Removing Server • Removing a server • Average number of movements: 485k • Overhead increases from 2.74 to 2.87 • Overhead drops to 2.77 with an additional 180k transmissions • Painful, but scaling down is not common

  32. SPAR in the wild • Testbed: 16 low-end commodity servers (Pentium Duo CPU 2.33GHz, 2GB of RAM, single hard drive) • Evaluation with Cassandra • Evaluation with MySQL

  33. Evaluation with Cassandra

  34. Evaluation with MySQL

  35. Outline • Background • SPAR • Evaluation and Comparison • Conclusion

  36. Conclusion • Preserving local semantics has many benefits • SPAR achieves it with low replication overhead • SPAR deals gracefully with the dynamics experienced by an OSN • Evaluation with an RDBMS (MySQL) and a key-value store (Cassandra) shows that SPAR offers significant gains in throughput (req/s) while reducing network traffic • To sum up, SPAR would be a good solution for scaling OSNs

  37. Reference • Arman Idani, http://www.cl.cam.ac.uk/~ey204/teaching/ACS/R202_2011_2012/presentation/S6/Arman_SPAR.pdf

  38. Q&A time

  39. Thanks for Listening
