
Online Friends


Presentation Transcript


  1. Online Friends. Presented by Dipannita Dey and Andy Vuong. Scribed by Ratish Garg.

  2. Social Hash: An Assignment Framework for Optimizing Distributed Systems on Social Networks. Presented by Dipannita Dey (ddey2)

  3. Background: Consistent Hashing. (Ring diagram: objects/requests K1, K2, … and nodes/servers N1, N2, N3 are hashed onto the same ring; a coordinator routes each client read/write, e.g. for K1, to the responsible node. N = nodes/servers storing data, K = objects/requests.)
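
To make the background concrete, here is a minimal consistent-hashing sketch in Python (illustrative only; the class and function names are my own, not from the presentation). Keys and nodes are hashed onto the same ring, and each key is served by the first node clockwise from its hash:

```python
import bisect
import hashlib

def _ring_hash(value: str) -> int:
    # Hash a node name or key onto the ring (any stable hash works here).
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes):
        # Each node gets one position on the ring (real systems add virtual nodes).
        self._points = sorted((_ring_hash(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        # The first node clockwise from the key's hash owns the key.
        hashes = [h for h, _ in self._points]
        idx = bisect.bisect_right(hashes, _ring_hash(key)) % len(self._points)
        return self._points[idx][1]

ring = ConsistentHashRing(["N1", "N2", "N3"])
print(ring.node_for("K1"))  # the coordinator would route K1's read/write here
```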

  4. Assignment Problem. (Figure taken from the original slides.)

  5. Assignment Problem Optimization. Place the data records likely to be accessed by a single query in a single storage component; group similar user requests together. (Adapted from the original slides.)

  6. Requirements and Challenges. Requirements: map similar objects to one cluster; assignment stability; adaptivity; minimal response time; load balancing. Challenges: effects of similarity on load balance; addition and removal of objects; scale; dynamic workload; heterogeneous components in the infrastructure. (Slide annotations: change at a modest rate, enormous, predictable.)

  7. Social Hash Framework (Conceptual Model). (Diagram adapted from the paper: conceptual entities are partitioned into groups G, and groups are mapped onto components C, with group-to-component ratio N := |G|/|C|; N = 1 means one group per component, N > 1 means several groups per component.)
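
As a rough sketch of the two-level conceptual model (hypothetical names, not the paper's implementation): a large static table maps objects to groups, and a much smaller table maps groups to components; only the second table changes when load shifts.

```python
# Hypothetical sketch of the two-level assignment (not the paper's implementation).
# Level 1: object -> group, computed offline (e.g. by graph partitioning) and stable.
# Level 2: group -> component, a small table adjusted dynamically for load balance.

object_to_group = {"u1": 0, "u2": 1, "u3": 2, "u4": 3}                  # |G| = 4 groups
group_to_component = {0: "web-A", 1: "web-A", 2: "web-B", 3: "web-B"}   # |C| = 2, so N = 2

def assign(obj: str) -> str:
    # Look up the object's group, then the component currently serving that group.
    return group_to_component[object_to_group[obj]]

print(assign("u3"))                 # -> "web-B"
group_to_component[2] = "web-A"     # rebalancing touches only the small table
print(assign("u3"))                 # -> "web-A"; the object->group map never changed
```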

  8. Key Contributions • Forming groups of relatively cohesive objects in the social graph • Separating optimization over the social network from adaptation to changes in workload and infrastructure components • Use of graph partitioning for static assignment • Use of query history to construct a bipartite graph (on which graph partitioning is applied) • Reduced the cache miss rate by 25% • Cut the average response latency in half

  9. Social Hash Framework (Actual Model). (Figure taken from the paper.)

  10. Static Assignment

  11. Dynamic Assignment • Adapt to maintain workload balance by changing the group-to-component mapping • The group-to-component ratio (N) controls the trade-off between static and dynamic assignment • N >> 1 favors dynamic assignment • Factors affecting load-balancing strategies: accuracy in predicting future load, dimensionality of the load, group transfer overhead, and assignment memory (a toy rebalancing sketch follows below)
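
A toy sketch of the dynamic side (my own illustration, not the paper's algorithm): rebalancing only rewrites the small group-to-component table, greedily moving one group from the hottest component to the coldest one.

```python
# Hypothetical greedy rebalancing step over the group->component table.
def rebalance_once(group_to_component, group_load):
    """Move one group from the most loaded component to the least loaded one."""
    comp_load = {}
    for g, c in group_to_component.items():
        comp_load[c] = comp_load.get(c, 0) + group_load[g]
    hot = max(comp_load, key=comp_load.get)
    cold = min(comp_load, key=comp_load.get)
    if hot == cold:
        return
    # Pick the lightest group on the hot component to limit transfer overhead.
    movable = [g for g, c in group_to_component.items() if c == hot]
    group_to_component[min(movable, key=group_load.get)] = cold

mapping = {0: "A", 1: "A", 2: "B"}
load = {0: 10, 1: 4, 2: 3}
rebalance_once(mapping, load)
print(mapping)  # group 1 moves from A to B
```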

  12. HTTP Request Routing Optimization • Static assignment: a unipartite graph representing friendships; maximize edge locality • Dynamic assignment: existing consistent hashing • Trade-off between edge locality and the number of groups • Production results: 21k groups; edge locality maintained above 50%; assignments updated every week (1% degradation). (Plot from the paper.)
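
For illustration, edge locality can be measured as the fraction of friendship edges whose two endpoints fall in the same group; this small sketch (not the paper's partitioner) just evaluates the objective for a given assignment:

```python
# Hypothetical sketch: compute edge locality for a given user->group assignment.
def edge_locality(edges, group_of):
    """Fraction of edges whose two endpoints are assigned to the same group."""
    same = sum(1 for u, v in edges if group_of[u] == group_of[v])
    return same / len(edges)

friend_edges = [("a", "b"), ("b", "c"), ("c", "d")]
assignment = {"a": 0, "b": 0, "c": 1, "d": 1}
print(edge_locality(friend_edges, assignment))  # 2/3 of the edges are local
```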

  13. Experimental Observations. (Plots from the paper.)

  14. Storage Sharding Optimization • Static assignment: a bipartite graph representing recent queries; minimize fanout; group-to-component ratio = 8 • Dynamic assignment: based on historical load patterns • Production results: 2% increase in fanout; static assignments recomputed every few months; average latency decreased by 50%; CPU utilization decreased by 50%. (Plot from the paper.)
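
Similarly, the fanout objective on the query history can be sketched as the average number of distinct storage components a query must contact (an illustration under assumed record-to-group and group-to-component maps, not the paper's code):

```python
# Hypothetical sketch: average fanout of a query history for a given assignment.
def average_fanout(queries, group_of, component_of):
    """queries: list of lists of data-record ids touched by each query."""
    total = 0
    for records in queries:
        components = {component_of[group_of[r]] for r in records}
        total += len(components)   # number of distinct shards this query must hit
    return total / len(queries)

history = [["r1", "r2"], ["r2", "r3", "r4"]]
group_of = {"r1": 0, "r2": 0, "r3": 1, "r4": 1}
component_of = {0: "shard-0", 1: "shard-1"}
print(average_fanout(history, group_of, component_of))  # (1 + 2) / 2 = 1.5
```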

  15. Experimental Observations. (Plots from the paper.)

  16. Doubts / Questions 1. Fanout vs. parallelism? - Utilize machine parallelism while keeping fanout low. 2. Why is a custom graph used? - A graph of queries -> data records best represents the distributed problem. 3. How frequently do dynamic assignments change, and how does that impact performance? - It is application-specific (depends on the group-to-component ratio). 4. How do dynamic assignments affect the TAO cache miss rate? - The cache warms up faster because similar requests are routed together.

  17. Thoughts 1. Application-specific. 2. Assumes workload and infrastructure change at a modest rate. 3. Uses historical data patterns in the dynamic assignment algorithm. 4. No details on the customization built on top of Apache Giraph. 5. Unclear how dynamic assignment handles replicas. 6. Some experimental results on network bandwidth vs. fanout would be nice.

  18. Takeaways • Patterns in the assignment problem • Proposed the Social Hash framework for solving the assignment problem on social networks • A two-level scheme decouples optimization from workload and infrastructure changes • Lower fanout may provide better performance

  19. TAO: Facebook's Distributed Data Store for the Social Graph. Nathan Bronson, Zach Amsden, George Cabrera, Prasad Chakka, Peter Dimov, Hui Ding, Jack Ferris, Anthony Giardullo, Sachin Kulkarni, Harry Li, Mark Marchukov, Dmitri Petrov, Lovro Puzar, Yee Jiun Song, Venkat Venkataramani. Facebook, Inc. (USENIX ATC 2013). Presented by Andy Vuong

  20. TAO: The Associations and Objects • Geographically distributed graph system used in production at Facebook • Paper published in 2013 • Three contributions: efficient and available read-mostly access to a changing graph; the objects and associations model; TAO itself and its evaluation

  21. The Social Graph. (Image: http://www.freshminds.net/wp-content/uploads/2012/02/Picture1.png)

  22. The Social Graph: Old Stack. (Diagram-only slide showing the pre-TAO stack.)

  23. Problems with Memcached? • Key-value store • Distributed control logic • Expensive read-after-write consistency

  24. The Social Graph: New Stack. (Diagram-only slide showing TAO in the stack.)

  25. TAO arrives • A read-efficient distributed graph caching system that helps serve the social graph • Built on top of an "associations and objects" model

  26. The Associations and Objects Model

  27. Facebook focuses on people, actions, and relationships • Objects are typed nodes: (id) -> (otype, (key -> value)*) • Associations are typed directed edges: (id1, atype, id2) -> (time, (key -> value)*) • Associations may be coupled with an inverse edge; edges may or may not be symmetric
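
The model above can be sketched directly as data structures; the field names here are illustrative, not TAO's actual schema:

```python
# Sketch of the objects-and-associations model (illustrative field names only).
from dataclasses import dataclass, field

@dataclass
class TaoObject:
    id: int                                     # object id
    otype: str                                  # object type, e.g. "USER", "POST"
    data: dict = field(default_factory=dict)    # (key -> value)*

@dataclass
class Association:
    id1: int                                    # source object
    atype: str                                  # association type, e.g. "LIKES"
    id2: int                                    # destination object
    time: int                                   # creation time, used by range queries
    data: dict = field(default_factory=dict)    # (key -> value)*

alice = TaoObject(id=1, otype="USER", data={"name": "Alice"})
post = TaoObject(id=2, otype="POST", data={"text": "hello"})
likes = Association(id1=1, atype="LIKES", id2=2, time=1700000000)
```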

  28. (Image-only slide.)

  29. Actions are either objects or associations • Repeatable actions are best suited as objects • Associations model actions that happen at most once or actions that record state transitions

  30. TAO Data API • Provides a simple object API for creating, retrieving, updating, and deleting an object • Provides a simple association API for adding, deleting, and editing an association • Association query API: assoc_get, assoc_count, assoc_range, assoc_time_range (a usage sketch follows below)
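
A hedged usage sketch of the association query API listed above; the `tao` client handle and argument names are hypothetical, but the four calls mirror the slide:

```python
# Hypothetical client-side usage of the association query API (names assumed).
def example_queries(tao, alice_id, bob_id, post_id):
    # Fetch a specific association (edge) between two objects, if it exists.
    friendship = tao.assoc_get(alice_id, "FRIEND", [bob_id])

    # Count Alice's friends without materializing the edge list.
    n_friends = tao.assoc_count(alice_id, "FRIEND")

    # Page through the 50 most recent comments on a post.
    recent = tao.assoc_range(post_id, "COMMENT", pos=0, limit=50)

    # Fetch likes created inside a time window.
    window = tao.assoc_time_range(post_id, "LIKED_BY",
                                  high=1700000000, low=1690000000, limit=100)
    return friendship, n_friends, recent, window
```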

  31. Architecture & Implementation

  32. Architecture: Storage Layer • MySQL is the persistent store • Data is divided into logical shards • Each database server is responsible for one or more shards • Every object has a shard_id that identifies its hosting shard • An association is stored on the shard of its source object (a placement sketch follows below)
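
A minimal sketch of the placement rule described above (helper names and shard count are hypothetical): the shard is derived from the object id, and an association lives on the shard of its source id1, so an object's outgoing edges can be read from a single shard:

```python
# Hypothetical sketch of the shard-placement rule (not TAO's actual scheme).
NUM_SHARDS = 1024  # illustrative shard count, not Facebook's value

def shard_for_object(object_id: int) -> int:
    # In TAO the shard_id is part of the object's identity; here we simply
    # derive a shard deterministically from the id for illustration.
    return object_id % NUM_SHARDS

def shard_for_association(id1: int, atype: str, id2: int) -> int:
    # Associations are stored on the shard of their source object (id1).
    return shard_for_object(id1)
```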

  33. Architecture: Caching Layer • Multiple cache servers form a tier • TAO further divides the cache layer into two levels: leaders and followers • Leaders are cache coordinators responsible for direct communication with persistent storage (read misses, writes) • Clients communicate with the closest follower tier (a read-path sketch follows below)
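
A sketch of the read path implied by the two-level cache (my own illustration, not TAO's code): a client asks its closest follower; on a miss the follower asks its leader, and only the leader talks to the persistent store.

```python
# Hypothetical sketch of the follower -> leader -> storage read path.
class Leader:
    def __init__(self, db):
        self.db = db          # stands in for the persistent store (MySQL)
        self.cache = {}

    def get(self, key):
        if key not in self.cache:          # leader miss -> persistent store
            self.cache[key] = self.db[key]
        return self.cache[key]

class Follower:
    def __init__(self, leader):
        self.leader = leader
        self.cache = {}

    def get(self, key):
        if key not in self.cache:          # follower miss -> ask the leader
            self.cache[key] = self.leader.get(key)
        return self.cache[key]

db = {("user", 1): {"name": "Alice"}}
leader = Leader(db)
follower = Follower(leader)               # clients talk to the closest follower
print(follower.get(("user", 1)))
```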

  34. Architecture: Cache Per-Region Tier Setup. (Diagram: clients talk to follower tiers, followers talk to a leader tier, and leaders talk to MySQL; the followers and leaders form the cache layer, MySQL the storage layer.)

  35. Architecture: Master / Slave

  36. Architecture: Master / Slave Replication. (Diagram: writes are sent to the master leader; the master region sends read misses and writes to the master DB; read misses in the replica region go to the replica DB; consistency messages are delivered to the replica region.)

  37. Consistency • TAO provides eventual consistency • Cache maintenance messages are sent asynchronously from the leader to the followers • Changesets / inverse edges

  38. Fault Tolerance • Per-destination timeouts • Database failures • Leader failures • Follower failures

  39. Optimizations • Shard loading • High-degree objects

  40. Evaluation

  41. 6.5 million requests. (Evaluation figure from the paper.)

  42. (Plots from the paper: distribution of the return values from assoc_count; distribution of the number of associations returned by range queries. Annotation: 1% > 500k.)

  43. (Plots from the paper: distribution of the data sizes for TAO query results, where 39.5% of associations queried by clients contained no data; throughput of an individual follower, where the peak query rate rises with the hit rate.)

  44. (Plots from the paper: client-observed TAO latency for read requests; write latency across two data centers, with the remote data center 58.1 ms away in average RTT.)

  45. Related Work • Shares features with: Trinity, an in-memory graph datastore; Neo4j, an open-source graph database with ACID semantics; Twitter's FlockDB for its social graph • Akamai groups clusters into regional groups, similar to Facebook's follower and leader tiers • The "Scaling Memcache at Facebook" paper

  46. Questions?

  47. Discussion - TAO • Could backing stores other than MySQL be more efficient for a read-heavy workload? Why use MySQL as the underlying DB instead of a graph DB like Neo4j? The data is naturally in graph form, so why not a graph DB? • Is this design applicable to other data-serving services? Are there other large graph-based or non-graph-based datasets this could be extended to?

  48. Discussion - TAO • Why have a leader cache? Follow-up: how large is the leader cache? Is it a large overhead, given that it provides data to the follower caches? • It would be interesting to see how this system behaves in "viral" social-media scenarios; by definition, these are global in nature and spread quickly. Do such interactions perform poorly?

  49. Discussion - TAO • How can the leader's consistency be restored after a failure?

  50. Discussion - Social Hash • What kinds of applications are suited to smaller or larger values of N (the group-to-component ratio)? • How should one trade off between static and dynamic assignment, or choose an optimal number of groups per component? I could not find a clear description of this.
