Shruti: A Self-Tuning Hierarchical Aggregation System

Shruti: A Self-Tuning Hierarchical Aggregation System. Praveen Yalagandula HP Labs Mike Dahlin University of Texas at Austin. Motivation. Distributed information aggregation Building block for many large-scale applications Examples: Resource scheduling, File location, Multicast, etc.

Presentation Transcript


  1. Shruti: A Self-Tuning Hierarchical Aggregation System Praveen Yalagandula HP Labs Mike Dahlin University of Texas at Austin

  2. Motivation • Distributed information aggregation • Building block for many large-scale applications • Examples: Resource scheduling, File location, Multicast, etc. • An important issue: when do you aggregate? • Proactive (push): Aggregate on updates/writes • E.g.: Astrolabe, Ganglia • Reactive (pull): Aggregate on probes/reads • E.g.: MDS-2, Sophia • Hybrid: Aggregate partially on writes and complete the aggregation on reads • E.g.: DHT-based systems • SDIMS: First system to offer this flexibility • But applications need to know read-write patterns a priori

  3. Contributions • Shruti: Self-tuning aggregation system • Tune the aggregation aggressiveness • Based on observed read/write patterns • Goal: Minimize communication costs • Lease based technique • Maintain lease invariants for correct answers • Handle node and network failures • Optimization: Default up-lease initial state

  4. Outline • Motivation for self-tuning aggregation • Background: SDIMS • Shruti: Architecture • Leases • Leasing policy • Default lease state • Reconfigurations • Evaluation • Summary

  5. SDIMS • Hierarchical aggregation system • Physical machines are leaves • Virtual nodes → groups • (Attribute, value) pairs • E.g., (CPU,3Mhz), (Mem,2GB) • Aggregation function (f) • E.g., MAX, MIN, AVG, CONCAT • DHT-based system for constructing multiple trees [Figure: aggregation tree with leaf values a, b, c, d at level A0; level A1 holds f(a,b) and f(c,d); the level A2 root holds f(f(a,b), f(c,d))] Reference: Praveen Yalagandula and Mike Dahlin, “SDIMS: A Scalable Distributed Information Management System”, SIGCOMM 2004
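
A minimal Python sketch of the tree aggregation described on this slide. It assumes a small binary tree written as nested lists; the `aggregate` helper and the sample leaf values are hypothetical illustrations and are not part of SDIMS.

```python
from functools import reduce

def aggregate(tree, f):
    """Recursively aggregate a nested-list tree with a binary function f.

    Leaves are plain values (physical machines). Inner lists stand for
    virtual nodes, whose value is f folded over their children's values.
    """
    if not isinstance(tree, list):
        return tree  # leaf: its local value
    child_values = [aggregate(child, f) for child in tree]
    return reduce(f, child_values)

# Hypothetical leaf values a, b, c, d (illustration only)
a, b, c, d = 3, 7, 2, 5
tree = [[a, b], [c, d]]  # root over two groups, i.e. f(f(a,b), f(c,d))

root_max = aggregate(tree, max)                 # f = MAX
root_sum = aggregate(tree, lambda x, y: x + y)  # f = SUM
```

The same tree shape serves any associative function f, which is why the slide can list MAX, MIN, AVG, and CONCAT side by side (AVG would need to carry a sum and a count rather than a single number).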

  6. SDIMS: UP and DOWN knobs • Policy: Update-Local, Setting: Up=0, Down=0 • Policy: Update-Up, Setting: Up=all, Down=0 • Policy: Update-All, Setting: Up=all, Down=all

  10. SDIMS: UP and DOWN knobs • Policy: Update-Local, Setting: Up=0, Down=0 • Policy: Update-Up, Setting: Up=all, Down=0 • Policy: Update-All, Setting: Up=all, Down=all • Application developer: Up=u, Down=d

  11. Shruti • Self-tunes aggregation aggressiveness • Goal: minimize communication cost • Number of messages for updates and probes • Tracks updates and probes at each node • Decides when to send updates up or down • Employs a lease-based architecture

  12. Leases Update-Up Up=all Down=0 A B

  13. Leases • A lease from node A to node B for an aggregate • A will forward all future updates to B • So, B need not contact A on probes • Until B relinquishes OR B dies OR A revokes the lease [Figure: nodes A and B under the Update-Up setting (Up=all, Down=0)]

  14. Lease: Invariants for correctness • A node can lease if and only if it already gets updates • Upward Path: A node can lease its local aggregate iff • It is a leaf or • It has leases from all its children • Downward Path: A node can lease to a child iff • It has a lease from the parent [Figure: examples of incorrect and correct lease states]
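
The two invariants can be checked mechanically. The following is a rough Python sketch under stated assumptions: the `Node` class, its `up_lease` and `down_lease` flags, and the `violations` helper are hypothetical names invented here, and the root (which has no parent) is simply exempted from the downward-path check because the slide does not cover that case.

```python
class Node:
    """Hypothetical tree node carrying two lease flags (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.children = []
        self.up_lease = False    # this node has granted a lease to its parent
        self.down_lease = False  # this node holds a lease from its parent

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child


def violations(node):
    """Return a list of invariant violations in the subtree rooted at node."""
    found = []
    # Upward path: a node may lease its aggregate only if it is a leaf
    # or holds leases from all of its children (all([]) is True for a leaf).
    if node.up_lease and not all(c.up_lease for c in node.children):
        found.append(f"{node.name}: up-lease without leases from all children")
    for child in node.children:
        # Downward path: a node may lease to a child only if it holds a
        # lease from its own parent. Assumption: the root is exempt.
        if child.down_lease and node.parent is not None and not node.down_lease:
            found.append(f"{node.name}: down-lease to {child.name} without a lease from parent")
        found.extend(violations(child))
    return found
```

Calling `violations(root)` on a tree returns an empty list when every lease currently set is backed by the leases it depends on.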

  15. Leasing policy • When to grant and when to relinquish • Intuition: Useful to grant only when probes would cost more messages than forwarding updates • Costs per operation on a link • probe = 2 messages (request + response) • update = 1 message • Grant a lease if probe rate >= 0.5*update rate • Else, relinquish
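
The 0.5 factor follows directly from the per-link costs. A short worked derivation, where the symbols p (probe rate) and u (update rate) on a single link are notation introduced here for illustration:

```latex
\begin{align*}
C_{\text{no lease}} &= 2p && \text{(each probe: request + response)} \\
C_{\text{lease}} &= u && \text{(each update: one forwarded message)} \\
C_{\text{lease}} \le C_{\text{no lease}} &\iff u \le 2p \iff p \ge 0.5\,u
\end{align*}
```

For example, with p = 3 probes and u = 5 updates per unit time, the link carries 6 messages without a lease and 5 with one, so granting is worthwhile. With p = 2 and u = 5, it carries 4 without a lease and 5 with one, so the lease should be relinquished.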

  16. Shruti: Leasing Policy • Set and release based on number of messages observed • Policy defined with two knobs • setThresh • relThresh • Set a lease • If #probes since last update >= setThresh • Relinquish a lease • If #updates since last probe >= relThresh • In each case: AND only when invariants allow
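
A minimal Python sketch of this two-knob policy for a single link. The class name `LeasePolicy`, its method names, and the `invariants_allow` flag are hypothetical; the sketch keeps both counters in one place and ignores which endpoint actually observes each message, so it illustrates the counting rule rather than Shruti's implementation.

```python
class LeasePolicy:
    """Counts probes and updates on one link and toggles a lease."""

    def __init__(self, set_thresh, rel_thresh):
        self.set_thresh = set_thresh
        self.rel_thresh = rel_thresh
        self.probes_since_update = 0
        self.updates_since_probe = 0
        self.leased = False

    def on_probe(self, invariants_allow=True):
        # A probe extends the probe run and ends any run of updates.
        self.probes_since_update += 1
        self.updates_since_probe = 0
        if (not self.leased and invariants_allow
                and self.probes_since_update >= self.set_thresh):
            self.leased = True  # set the lease
        return self.leased

    def on_update(self, invariants_allow=True):
        # An update extends the update run and ends any run of probes.
        self.updates_since_probe += 1
        self.probes_since_update = 0
        if (self.leased and invariants_allow
                and self.updates_since_probe >= self.rel_thresh):
            self.leased = False  # relinquish the lease
        return self.leased


# Hypothetical event trace with setThresh=1, relThresh=2
policy = LeasePolicy(set_thresh=1, rel_thresh=2)
events = ["update", "probe", "probe", "update", "probe", "update", "update"]
states = [policy.on_probe() if e == "probe" else policy.on_update()
          for e in events]
```

In this trace the lease is set on the first probe after an update and released only after two updates arrive with no probe in between. Larger thresholds make the policy slower to react but less prone to flapping.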

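  17. Example with two nodes • setThresh=1, relThresh=2 • After an update, the first probe makes the number of probes since the last update = 1 == setThresh, so the response carries a lease (Response + LEASE) • Two consecutive updates make the number of updates since the last probe = 2 == relThresh, so the lease is relinquished [Figure: message timeline between nodes A and B showing Update, Probe, Response, and Relinquish messages]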

  18. Default initial lease state • Default = no leases → Initial probes cost O(N) • Common case: Sparse attributes • Only a few nodes are interested in an attribute • Examples: File location, Multicast, etc. • Default initial state: Start with leases up to the root • Initial updates and probes incur O(log N) msgs
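
To make the O(N) versus O(log N) contrast concrete, here is a back-of-the-envelope Python sketch. It assumes a complete binary tree whose N leaves are the machines (N a power of two) and a probe issued at a leaf; the function names are hypothetical, and real SDIMS trees are DHT-based rather than binary.

```python
import math

def probe_cost_no_leases(n_leaves):
    """With no leases, a probe must gather values from the whole tree:
    every link carries one request and one response."""
    links = 2 * n_leaves - 2  # a complete binary tree has 2N-1 nodes
    return 2 * links

def probe_cost_up_leases(n_leaves):
    """With leases up to the root, the root already holds the aggregate:
    the probe climbs to the root and the response comes back down."""
    depth = int(math.log2(n_leaves))
    return 2 * depth

def update_cost_up_leases(n_leaves):
    """With leases up to the root, an update at a leaf is forwarded
    once per level on its way to the root."""
    return int(math.log2(n_leaves))

n = 512
costs = (probe_cost_no_leases(n), probe_cost_up_leases(n),
         update_cost_up_leases(n))
```

Under these assumptions the first probe in a 512-leaf tree costs 2044 messages with no leases, against 18 for a probe and 9 for an update when leases run up to the root.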

  19. Handling failures • Goal: revert to an invariant-satisfying state • Losing a child or the parent → OK (no violations) • Acquiring a child → OK (default lease state assumption) • Acquiring a new parent → Can violate invariants • A solution: Revoke leases [Figure: a tree reconfiguration marked as violating invariants]

  20. Outline • Motivation for self-tuning aggregation • Background: SDIMS • Shruti: Architecture • Leases • Leasing policy • Default lease state • Reconfigurations • Evaluation • Summary

  21. Evaluation • Simulation experiments • 512-node system initialized with [attribute=“dummy”, value=0] • Aggregation • Function: summation operation • Update: increments the value of the attribute • Probe: global aggregate, i.e., the sum of values at all nodes • Cases • Uniform read-write ratio across nodes • Spatial heterogeneity: Zipf-like distribution across nodes • Temporal heterogeneity: varying read-write rates with time • Failure handling

  22. Uniform RW ratio across nodes • Simulation with 512 nodes [Figure: average message count vs. read-to-write ratio, with curves for Update-None, Update-All, (Up=all, Down=3), (Up=3, Down=0), Update-Up, and Shruti]

  23. Uniform RW ratio across nodes [Figure: curves for Update-None, (Up=3, Down=0), Shruti, Update-Up, (Up=all, Down=3), and Update-All]

  24. Uniform RW ratio across nodes [Figure: curves for Update-None, (Up=3, Down=0), Update-All, Update-Up, (Up=all, Down=3), and Shruti]

  25. Spatial heterogeneity • Zipf-like distribution across nodes [Figure: curves for Update-None, Update-All, (Up=all, Down=3), (Up=3, Down=0), Update-Up, and Shruti]

  26. Temporal heterogeneity • Three phases (reads:writes): 100:1, 1:100, 1:100 [Figure: curves for Shruti (reads), Static (reads), Static (writes), and Shruti (writes)]

  27. Temporal heterogeneity • Three phases (reads:writes): 1:100, 1:100, 100:1

  28. Failure handling (1024-node system) • Start with NO leases set • Root node fails • All leases set towards the prober

  29. Summary • Shruti: Self-tuning hierarchical aggregation system • Goal: Minimize communication costs • Lease-based mechanism • Satisfy invariants to ensure consistency in the results • Default lease state for sparse attributes • Revert to invariant-satisfying state on failures • Applicable to more general aggregation over spanning trees • [Plaxton et al., IPDPS’07] prove the competitive ratio against the optimal offline algorithm, as well as consistency properties

  30. Multiple attributes: Uniform Writes, Zipf-like Distribution of Reads
