Commensal Cuckoo : Secure Group Partitioning for Large-Scale Services

Siddhartha Sen and Mike Freedman, Princeton University.


Presentation Transcript


  1. Commensal Cuckoo: Secure Group Partitioning for Large-Scale Services. Siddhartha Sen and Mike Freedman, Princeton University

  2. Scalable peer-to-peer service (diagram labels: clients, peer-to-peer service, untrusted participants, shard data/functionality)

  3. Scalable peer-to-peer service • How do we make it reliable? • Mask failures with replication: Byzantine Fault Tolerant (BFT) groups, each needing f < 1/3 • Observe: F ≤ f • Want small groups (diagram labels: clients, untrusted participants, F < 1/4, f < 1/3 per group)
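
The f < 1/3 requirement can be turned into a concrete fault budget per group. The sketch below is illustrative only: the helper names `max_faulty` and `group_failed` are hypothetical, and it assumes f is simply the number of faulty members divided by the group size.

```python
def max_faulty(group_size: int, num: int = 1, den: int = 3) -> int:
    """Largest number of faulty members b with b / group_size < num / den."""
    # b/g < num/den  <=>  b*den < g*num  <=>  b <= (g*num - 1) // den
    return (group_size * num - 1) // den


def group_failed(faulty: int, group_size: int, num: int = 1, den: int = 3) -> bool:
    """True once the faulty fraction reaches num/den (i.e. f >= num/den)."""
    return faulty * den >= group_size * num


if __name__ == "__main__":
    # A group of 64 nodes can mask at most 21 faulty members under f < 1/3.
    print(max_faulty(64))          # 21
    print(max_faulty(64, 1, 2))    # 31 under f < 1/2
```

Integer arithmetic is used instead of floating-point division so the strict inequality is evaluated exactly at the boundary.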

  4. Prior work using many small groups • Systems: • [Rampart95], [SecureRing98], [OceanStore00], [Farsite02], [CastroDGRW02], [Rosebud03], [Myrmic06], [Fireflies06], [Salsa06], [SinghNDW06], [Halo08], [Flightpath08], [Shadowwalker09], [Census09] • Theory: • [HildrumK03], [NaorW07] Problem: Assume randomly or perfectly distributed faults (i.e., static)

  5. Rosebud [RL03] (diagram: BFT groups on a consistent hashing ring over [0,1))

  6. Rosebud [RL03] 1 0 • Unrealistic: • Don’t know faulty nodes • Best case is uniformly random •  (1) faults per group • Real adversary is dynamic! BFT group F = f < 1/3

  7. Join-leave attack • Adversary's nodes repeatedly leave and rejoin until some group has f > 1/3 • Vanish system compromised by join-leave attack (2010)
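
A join-leave attack against an undefended system can be simulated in a few lines. This is a minimal sketch under stated assumptions, not a model of any real system: groups are equal-width arcs of the [0,1) ring, every join lands at a uniformly random ID, and the adversary simply rejoins any faulty node that is outside its target group. The function name and all parameter defaults are hypothetical.

```python
import random


def join_leave_attack(n=1024, num_groups=16, F=0.05, target=0, seed=0,
                      max_rejoins=100_000):
    """Return (rejoins, faulty fraction of target group), or None if the attack stalls."""
    rng = random.Random(seed)

    def group_of(x):
        # Assumption: group i owns the arc [i/num_groups, (i+1)/num_groups).
        return min(int(x * num_groups), num_groups - 1)

    n_faulty = int(F * n)
    # Correct nodes are placed once and never move.
    good = sum(group_of(rng.random()) == target for _ in range(n - n_faulty))
    faulty_ids = [rng.random() for _ in range(n_faulty)]

    for rejoins in range(max_rejoins + 1):
        bad = sum(group_of(x) == target for x in faulty_ids)
        if 3 * bad > good + bad:                 # target group has f > 1/3
            return rejoins, bad / (good + bad)
        outside = [i for i, x in enumerate(faulty_ids) if group_of(x) != target]
        if not outside:
            break                                # nothing left to rejoin
        faulty_ids[outside[0]] = rng.random()    # leave, then rejoin at a fresh ID
    return None


if __name__ == "__main__":
    print(join_leave_attack())
```

Even with a small global fraction F, the adversary only needs enough rejoin attempts to concentrate its nodes in one group, which is why a static fault distribution is not a safe assumption.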

  8. Prior work tolerating join-leave attacks • [FiatSY05], [AwerbuchS04], [Scheideler05] • State-of-the-art is cuckoo rule [AwerbuchS06, AwerbuchS07] • Problems: • Impractical (large constant factors) • Groups must be impractically large or F trivially low

  9. Goal: Provably secure + practical group partitioning scheme • Contributions: • Demonstrate failures of prior work • Analyze and understand failures • Devise algorithm that overcomes them • Assumptions: • Correct nodes randomly distributed and stable • Adversary controls global fraction F of nodes in system, rejoins them maliciously • System fails when one group fails, i.e. f ≥ 1/3

  10. Cuckoo rule (CR) [AS06] (diagram: groups 1 to 4 on the [0,1) ring, F < f < 1/3)

  11. Cuckoo rule (CR) [AS06] • Primary join goes to a random location in [0,1) • Nodes in its k-region leave and rejoin as secondary joins at random locations in [0,1) • For poly(n) rounds, all regions of size O(log n)/n have: • O(log n) nodes • f < 1/3 • Adversary strategy: rejoin from least faulty group

  12. Cuckoo rule (CR) [AS06] • In summary: • On primary join to selected random ID, cuckoo (evict) nodes in immediate k-region • Select new random IDs for cuckooed nodes, join them as secondary joins (i.e., no subsequent cuckoos) • Ignore implementation issues: • Route messages securely • Verify messages from other groups • Bootstrap the system, handle heavy churn
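
The summary of the cuckoo rule above can be sketched as a single join step. This is a simplified illustration, not the construction from [AS06]: it assumes a k-region is the arc of width k/n (n being the current number of nodes) that contains the join point, with arcs aligned to multiples of k/n, and the function name `cuckoo_rule_join` is hypothetical.

```python
import random


def cuckoo_rule_join(ids, k, rng):
    """Apply one primary join; return (new id list, join location, nodes evicted)."""
    n = max(len(ids), 1)
    width = k / n                      # assumed size of a k-region
    x = rng.random()                   # primary join at a random ID in [0,1)
    region = int(x / width)            # index of the k-region containing x

    # Nodes outside the k-region stay where they are.
    stay = [y for y in ids if int(y / width) != region]
    evicted = len(ids) - len(stay)

    # Cuckooed nodes rejoin at fresh random IDs as secondary joins;
    # secondary joins trigger no further evictions.
    moved = [rng.random() for _ in range(evicted)]
    return stay + [x] + moved, x, evicted


if __name__ == "__main__":
    rng = random.Random(1)
    ids = [rng.random() for _ in range(256)]
    ids, x, evicted = cuckoo_rule_join(ids, k=4, rng=rng)
    print(len(ids), evicted)
```

Note that the number of evicted nodes depends on how many nodes happen to sit in the chosen k-region, so it can be zero when the join lands in an empty region.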

  13. CR tolerates very few faults in practice (plot: group size = 64, rounds = 100,000)

  14. What if we allow larger groups? Increased group size in powers of 2

  15. CR: Evolution of a faulty group (plot: expected faulty fraction per group; N = 4096, F 5%, group size = 64, k = 4)

  16. Why does this happen? • Primary joins create holes • Empty k-regions cuckoo less • Closely-spaced primary joins = bad news: faulty group!

  17. CR: Cuckoo size is erratic (plot: expected cuckoo size, showing clumps and holes; N = 4096, F 5%, group size = 64, k = 4)

  18. CR: Primary join spacing is erratic (plot: expected secondary joins; N = 4096, F 5%, group size = 64, k = 4)

  19. Cuckoo rule is “parasitic”

  20. New algorithm (Fixing CR) • Holes and clumpiness: • Cuckoo k nodes chosen randomly from group • Scale k relative to average group size (larger groups cuckoo more, smaller groups cuckoo less) • Inconsistently spaced primary joins: • Group vets each join attempt and denies it if there were insufficient secondary joins since the last primary join

  21. “Commensal” cuckoo rule Commensalism. A symbiotic relationship in which one organism derives benefit while causing little or no harm to the other.

  22. Commensal cuckoo rule (CCR) • Too few secondary joins: primary join denied • Received secondary join: primary join accepted, cuckoo k random nodes (recall CR cuckooed only 1 node) • Holes don’t matter

  23. Commensal cuckoo rule (CCR) • In summary: • On primary join to selected random ID, if fewer than k secondary joins since last primary join, start over with new random ID • Otherwise, cuckoo k nodes weighted by group size, join them as secondary joins (i.e., no subsequent cuckoos)
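
The two CCR mechanisms, join vetting and size-weighted cuckoos, can be sketched together as one join routine. Everything here is an illustrative assumption rather than the authors' implementation: groups are plain dictionaries, picking a uniformly random group stands in for picking a random ID on a ring of equal arcs, the cuckoo count is k scaled by group size over average group size, and each group's secondary-join counter starts at k so the first joins can be accepted. The names `make_groups` and `ccr_join` are hypothetical.

```python
import random


def make_groups(num_groups, group_size, k):
    # Assumption: counters start at k so that bootstrapping joins pass vetting.
    return [{"members": [f"g{g}-n{i}" for i in range(group_size)],
             "secondary": k} for g in range(num_groups)]


def ccr_join(groups, new_node, k, rng, max_attempts=1000):
    """Try to join new_node; return (attempts used, nodes cuckooed) or None."""
    avg = sum(len(g["members"]) for g in groups) / len(groups)
    for attempt in range(1, max_attempts + 1):
        target = groups[rng.randrange(len(groups))]   # group owning the random ID
        if target["secondary"] < k:
            continue        # join vetting: too few secondary joins, start over

        # Weighted cuckoo: larger groups evict more, smaller groups evict less.
        count = min(len(target["members"]),
                    round(k * len(target["members"]) / avg))
        cuckooed = rng.sample(target["members"], count)
        target["members"] = [m for m in target["members"] if m not in cuckooed]
        target["members"].append(new_node)
        target["secondary"] = 0                       # new primary join accepted

        # Cuckooed nodes become secondary joins elsewhere (no further cuckoos).
        for node in cuckooed:
            dest = groups[rng.randrange(len(groups))]
            dest["members"].append(node)
            dest["secondary"] += 1
        return attempt, len(cuckooed)
    return None


if __name__ == "__main__":
    rng = random.Random(7)
    groups = make_groups(num_groups=8, group_size=16, k=4)
    print(ccr_join(groups, "new", k=4, rng=rng))
```

The `max_attempts` cap is a safety net for this sketch only; it keeps the loop finite if no group currently passes vetting.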

  24. Techniques are synergistic • Join vetting forces adversary to join distinct groups ⇒ all groups joined (roughly) • Weighted cuckoos ensure sufficient secondary joins ⇒ O(1) join attempts needed

  25. Cuckoo size is consistent (plots: CR vs. CCR)

  26. CCR: Primary join spacing is consistent

  27. CCR tolerates significantly more faults (plot: f < 1/3)

  28. CCR tolerates significantly more faults • How to use BFT with f < 1/2? • Idea: Separate correctness from availability • Group is correct, but unresponsive • Use other groups to revive group! (plot: f < 1/2)

  29. Join vetting has deeper benefits • Security vulnerability in CR: adversary retries a primary join (w/o causing cuckoos) until it gets a location it likes • CCR avoids problem: group won’t accept primary join if insufficient secondary joins • Don’t care how many previous attempts or where

  30. Summary • CR suffers from random bad events, which CCR avoids by derandomizing • Cuckoos weighted by group size • Primary join attempts vetted by groups • CCR tolerates F 7% for f < 1/3 F 18% for f < 1/2

  31. Extensions (A complete solution) • Route messages securely • O(1)-hop routing • Verify messages from other groups • Distributed key generation, threshold signatures ⇒ constant public/private key per group • Bootstrap the system, handle heavy churn • Choose target group size at onset (e.g. 64); split/merge locally • Handle DoS and data layer attacks • Reactive approach, e.g. reactive replication

  32. Conclusion • Secure group membership partitioning for open P2P systems • Most previous systems assumed (impossible) perfect distribution, ignored join-leave attacks • CCR can handle much higher fractions of faulty nodes than prior algorithms
