Explore the design considerations for a robust peer-to-peer system storing application states such as mail services. The paper discusses architecture, configuration services, admission control, node monitoring, and configuration information dissemination for fault tolerance. It emphasizes correctness, reconfiguration, and state transfer.
The Design of a Robust Peer-to-Peer System Reference: SIGOPS European Workshop 25/09/2002 Rodrigo Rodrigues (MIT) Gisik Kwon Dept. of Computer Science and Engineering Arizona State University
Motivation • What if we wanted to store the state of the application in a P2P system? • E.g., mail service, archiving • Need a robust P2P storage system • Current algorithms do not provide reliable service: • No admission control • Trust configuration information • Vulnerable to malicious routing • No tolerance to Byzantine faults
Architecture [Figure: Configuration Service and P2P Nodes] • Configuration Service (CS) • Selected statically (a subset of the P2P nodes) or dynamically • P2P nodes • A set of servers, not clients • Commonly equipped with • Co-processor, read-only disk, watchdog timer, cryptographic co-processor
Configuration Service • Four main functions: • Admission control • Node monitoring • Deciding on a new configuration • Propagating configuration information • CS nodes run a BFT protocol [CL99]
Admission Control • Prevents a single user from controlling a large fraction of the ID space • Maintains the list of nodes • <ID, IP, public key> • Hard to do in a volunteer-based system • We assume servers • Nodes have public / private keys
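The admission list above can be sketched as follows. This is only an illustration: deriving the ID from a hash of the public key and capping the number of nodes per operator (`max_per_operator`) are assumptions made for the example, not rules stated in the slides.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class NodeEntry:
    """One <ID, IP, public key> record kept by the configuration service."""
    node_id: str
    ip: str
    public_key: bytes


@dataclass
class AdmissionList:
    # Hypothetical policy knob: how many nodes one operator may register.
    max_per_operator: int = 2
    entries: dict = field(default_factory=dict)       # node_id -> NodeEntry
    per_operator: dict = field(default_factory=dict)  # operator -> node count

    def admit(self, operator: str, ip: str, public_key: bytes) -> Optional[NodeEntry]:
        """Admit a node, or return None if the request is rejected."""
        if self.per_operator.get(operator, 0) >= self.max_per_operator:
            return None  # operator would control too much of the ID space
        # Assumption: the ID is the SHA-256 of the public key, so a node
        # cannot pick its own position in the ID space.
        node_id = hashlib.sha256(public_key).hexdigest()
        if node_id in self.entries:
            return None  # this key is already registered
        entry = NodeEntry(node_id, ip, public_key)
        self.entries[node_id] = entry
        self.per_operator[operator] = self.per_operator.get(operator, 0) + 1
        return entry
```

Binding the ID to the key is one way to keep a node from choosing where it sits in the ID space; the cap stands in for whatever out-of-band check the service operator applies.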
Node Monitoring • Detecting faulty nodes • Fail-stop failure: ping protocol • Byzantine failure: hard! • Nodes are proactively recovered frequently -> restart from the correct code on the read-only disk -> copy state from other nodes • Each CS node does its own monitoring of all P2P nodes
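A minimal sketch of the fail-stop side of monitoring, as one CS node might keep it. The `max_missed` threshold and the `PingMonitor` name are hypothetical; the slides only say that a ping protocol is used.

```python
from collections import defaultdict


class PingMonitor:
    """Tracks consecutive missed pings per P2P node (fail-stop detection only)."""

    def __init__(self, max_missed: int = 3):
        # Hypothetical threshold: misses in a row before a node is suspected.
        self.max_missed = max_missed
        self.missed = defaultdict(int)

    def record(self, node_id: str, replied: bool) -> None:
        """Record the outcome of one ping round for one node."""
        if replied:
            self.missed[node_id] = 0
        else:
            self.missed[node_id] += 1

    def suspects(self) -> set:
        """Nodes that have missed at least max_missed pings in a row."""
        return {n for n, m in self.missed.items() if m >= self.max_missed}
```

This does nothing for Byzantine failures, which can answer pings correctly; those are handled by proactive recovery rather than detection.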
Configuration Information • What to propagate? • Incomplete config is a limitation of P2P • Disseminate the entire config • Transmit using diffs • When to propagate? • Server machines: not very often (e.g., every hour) • Configs include start and expiration times • Periodically deciding on the new config • Non-deterministic choices require validation • Send eviction request to the other CS nodes -> validation
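Sending the configuration as diffs, with start and expiration times attached, might look like the sketch below. The field names (`epoch`, `start`, `expires`, `members`) and the diff layout are assumptions for illustration; the slides do not define a wire format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    epoch: int          # hypothetical configuration sequence number
    start: float        # time at which this config becomes valid
    expires: float      # time at which this config stops being valid
    members: frozenset  # IDs of the nodes in this config


def diff(old: Config, new: Config) -> dict:
    """Describe `new` relative to `old` as added and removed node IDs."""
    return {
        "epoch": new.epoch,
        "start": new.start,
        "expires": new.expires,
        "added": sorted(new.members - old.members),
        "removed": sorted(old.members - new.members),
    }


def apply_diff(old: Config, d: dict) -> Config:
    """Rebuild the full new config from the old config plus a diff."""
    members = (old.members | set(d["added"])) - set(d["removed"])
    return Config(d["epoch"], d["start"], d["expires"], frozenset(members))


def is_valid(cfg: Config, now: float) -> bool:
    """A config is usable only between its start and expiration times."""
    return cfg.start <= now < cfg.expires
```

With hourly reconfiguration and a mostly stable set of servers, the diff stays small even though every node still ends up holding the entire configuration.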
Architecture – P2P Nodes [Figure: Application, Storage, and Lookup layers on a Client and two Servers] • Lookup layer • Periodically receives the latest config from the CS • Notifies the storage layer • Storage layer • Uses Byzantine quorums to store data • Data placement algorithm assigns 3f + 1 replicas to each data item
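One possible placement rule is sketched below: hash the item key onto the ID ring and take the next 3f + 1 node IDs. The successor rule, the helper names, and the quorum size of 2f + 1 are assumptions; the slides state only that each item gets 3f + 1 replicas and that Byzantine quorums are used.

```python
import hashlib
from bisect import bisect_left


def key_id(data: bytes) -> int:
    """Map arbitrary bytes to a point in the ID space (SHA-256 as an integer)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def replicas_for(item_key: bytes, node_ids: list, f: int) -> list:
    """Pick the 3f + 1 nodes that follow the item's ID on the ring."""
    n = 3 * f + 1
    ring = sorted(node_ids)
    if len(ring) < n:
        raise ValueError("not enough nodes for 3f + 1 replicas")
    # First node whose ID is >= the item's ID, wrapping around the ring.
    start = bisect_left(ring, key_id(item_key)) % len(ring)
    return [ring[(start + i) % len(ring)] for i in range(n)]


def quorum_size(f: int) -> int:
    """Assumed quorum size: any two quorums of 2f + 1 out of 3f + 1
    replicas overlap in at least f + 1 replicas, so at least one
    replica in the overlap is correct."""
    return 2 * f + 1
```

Because every node holds the full configuration, each client and server can compute the same replica group locally without routing through other nodes.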
Correctness Condition • “At any moment, any group of 3f + 1 replicas of a data item contains no more than f faulty replicas” • Reconfiguration must be frequent enough to preserve this invariant
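The invariant can be written as a small check over replica groups. In a real system the set of faulty nodes is not known, so this is only a way to state the condition precisely (for example in a simulation); the function name is hypothetical.

```python
def invariant_holds(groups: list, faulty: set, f: int) -> bool:
    """True if every replica group contains at most f faulty replicas."""
    return all(len(set(group) & faulty) <= f for group in groups)
```

For example, with f = 1 a group of four replicas tolerates one faulty member; a second fault in the same group before the next reconfiguration breaks the condition.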
State Transfer • During reconfiguration, the set of replicas may change • Replica finds out about the new configuration: • Knows the old configuration • Pulls data items from the old replicas (as seen before)
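A simplified sketch of the pull step: a new replica asks the old replicas for an item and accepts a value only when at least f + 1 of them return the same bytes, so at least one correct replica vouches for it. The f + 1 rule and the absence of timestamps or signatures are simplifying assumptions, not the protocol from the slides.

```python
from collections import Counter
from typing import Optional


def pull_item(responses: dict, f: int) -> Optional[bytes]:
    """responses maps old-replica ID -> value it returned for the item.

    Returns the value reported by at least f + 1 old replicas, or None
    if no value has enough matching reports yet.
    """
    counts = Counter(responses.values())
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    return value if votes >= f + 1 else None
```

The new replica can compute which old replicas to ask because it knows the old configuration and can rerun the placement rule against it.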