Exploring Replication Strategies for P2PSIP Systems

Replication for P2PSIP Bruce Lowekamp 12/5/2007

Replication vs Erasure Codes
• This talk discusses “replication.” For P2PSIP, where the primary resource is an AoR->URI or similar mapping, this is the right approach.
• For storage of larger objects, mechanisms such as erasure codes are more appropriate.
• Most of the same issues of placement and responsibility still apply to erasure codes, with only slight modifications.

The Questions
• What should replication achieve?
• What peer is responsible for replication?
• What layer is responsible?
• How many replicas?
• Where to put the replicas?

Goals of Replication
• Persistence
  • Data retained after crash of responsible peer
  • “Permanence” after storing peer leaves
• Security
  • DoS by responsible peer
  • Routing attacks along path to responsible peer
• Load Balancing
  • Queries for a popular resource distributed between replicas

Popular Replication Schemes
• Neighbors/Successors
  • Persistence with loss of responsible peer
  • DHT-dependent definition
  • Successor vulnerable to routing attacks
  • Typically depends on responsible peer
• Random replicas
  • Fixed offsets or text addition before hashing
  • Paths to different replicas are independent
  • Harder to replace on loss of responsible peer
• Query-path
  • Replicas added along path to responsible peer
  • Good for load balancing
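The first two placement schemes above can be sketched in a few lines. This is a minimal illustration, not a P2PSIP-specified algorithm: the 160-bit SHA-1 identifier ring, the Chord-style successor rule, the `#i` salt format, and every function name below are assumptions made for the example.

```python
import bisect
import hashlib

ID_BITS = 160            # assumption: SHA-1-sized identifier ring
ID_SPACE = 1 << ID_BITS


def resource_id(name: str) -> int:
    """Hash a resource name (e.g. an AoR) onto the identifier ring."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def successors(peer_ids, key, count):
    """Neighbor/successor placement, assuming a Chord-style ring.

    Returns the responsible peer (first peer at or after `key`) followed
    by the peers immediately after it, `count` peers in total.
    """
    ring = sorted(peer_ids)
    start = bisect.bisect_left(ring, key)
    return [ring[(start + i) % len(ring)] for i in range(min(count, len(ring)))]


def salted_replica_ids(name: str, count: int):
    """Random replicas via text addition before hashing (hypothetical '#i' salt)."""
    return [resource_id(f"{name}#{i}") for i in range(1, count + 1)]


def offset_replica_ids(name: str, count: int):
    """Random replicas via fixed offsets, spread evenly around the ring."""
    base = resource_id(name)
    step = ID_SPACE // (count + 1)
    return [(base + i * step) % ID_SPACE for i in range(1, count + 1)]
```

For example, `successors(peers, resource_id("sip:alice@example.com"), 3)` yields the responsible peer plus two successor replicas, all of which sit behind the same route to the responsible peer. The keys from `salted_replica_ids` or `offset_replica_ids` instead land at unrelated points on the ring, which is why the paths to them are independent and also why the responsible peer's neighbors cannot easily rebuild them when it is lost.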

Who Replicates?
• Storing (owner) peer
  • Cares the most about the outcome
  • Should bear work responsibility
  • Can’t refresh after leaving
• Responsible peer
  • “Responsible”
  • Easiest support for “permanent” resources
  • Low overhead for neighbors
  • Higher overhead for random replicas
• Replicas
  • Can pull/refresh what you replicate

What Layer Replicates?
• DHT
  • Appropriate replication scheme related to DHT algorithm
• Application
  • Knows requirements
    • Persistence
    • Reliability
    • Popularity
  • Knows when owner is present
  • Only layer that can generate new signatures (if needed)

Possible Solutions?
• Neither the DHT nor the application layer can solve all problems for different types of data.
• Proposal: We need both
  • DHT-level replication is always appropriate and can’t be done by the application layer.
  • Only the application layer knows enough about requirements to justify/define further replication.
  • A responsible/replica peer needs to make application-layer decisions about its resources if they have different persistence requirements, not just DHT-level decisions.
  • Interaction between these layers may be useful to detect malicious peers.
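One way to picture the "we need both" proposal is a fixed DHT-level baseline plus an application-level policy that requests further replicas per resource. The sketch below is only an illustration under assumed names and numbers: `DHT_BASELINE_REPLICAS`, `ResourcePolicy`, and the extra-replica counts are hypothetical and do not come from any P2PSIP draft.

```python
from dataclasses import dataclass

# Assumption: the DHT always keeps this many successor replicas,
# regardless of what kind of resource is stored.
DHT_BASELINE_REPLICAS = 2


@dataclass
class ResourcePolicy:
    """Application-level requirements that the DHT layer cannot see."""
    kind: str                 # e.g. "registration" (hypothetical label)
    permanent: bool = False   # must survive after the storing peer leaves
    popular: bool = False     # expected to receive many queries


def extra_replicas(policy: ResourcePolicy) -> int:
    """Further replication justified only by application requirements.

    The counts are placeholders chosen for the example.
    """
    extra = 0
    if policy.permanent:
        extra += 2   # persistence while the owner cannot refresh
    if policy.popular:
        extra += 3   # spread query load across more replicas
    return extra


def total_replicas(policy: ResourcePolicy) -> int:
    """DHT baseline always applies; the application adds on top of it."""
    return DHT_BASELINE_REPLICAS + extra_replicas(policy)
```

Here an ordinary registration gets only the DHT baseline, while a resource marked permanent or popular gets additional replicas that only the application layer has the information to justify.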
