Scalability, Accountability and Instant Information Access for Network Centric Warfare
Yair Amir, Claudiu Danilov, Danny Dolev, Jon Kirsch, John Lane, Jonathan Shapiro
Department of Computer Science, Johns Hopkins University
Chi-Bun Chan, Cristina Nita-Rotaru, Josh Olsen, David Zage
Department of Computer Science, Purdue University
http://www.cnds.jhu.edu
Dealing with Insider Threats
Project Goals:
• Scaling survivable replication to wide area networks.
  • Overcome 5 malicious replicas.
  • SRS goal: improve latency by a factor of 3.
  • Self-imposed goal: improve throughput by a factor of 3.
  • Self-imposed goal: improve the availability of the system.
• Dealing with malicious clients.
  • Compromised clients can inject authenticated but incorrect data, which is hard to detect on the fly.
  • Malicious or just an honest error? Accountability is useful in both cases.
• Exploiting application update semantics for replication speedup in malicious environments.
  • Weaker update semantics allow for an immediate response.
State Machine Replication
• Main challenge: ensuring coordination between servers.
  • Requires agreement on the request to be processed and a consistent order of requests.
• Byzantine faults: BFT [CL99] must contact 2f+1 out of 3f+1 servers and uses 3 rounds to allow consistent progress.
• Benign faults: Paxos [Lam98, Lam01] must contact f+1 out of 2f+1 servers and uses 2 rounds to allow consistent progress.
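As a quick illustration of these quorum requirements (our own sketch, not part of the original slides), the following Python snippet computes the group size, quorum size, and number of rounds for a given number of tolerated faults f under the two fault models; the function names are ours.

```python
# Quorum arithmetic for the two fault models discussed above (illustrative sketch).

def byzantine_params(f: int) -> dict:
    """BFT [CL99]: tolerate f Byzantine (malicious) servers."""
    return {
        "servers": 3 * f + 1,   # total replicas required
        "quorum": 2 * f + 1,    # replicas that must be contacted
        "rounds": 3,            # message rounds for consistent progress
    }

def benign_params(f: int) -> dict:
    """Paxos [Lam98, Lam01]: tolerate f benign (crash) faults."""
    return {
        "servers": 2 * f + 1,
        "quorum": f + 1,
        "rounds": 2,
    }

if __name__ == "__main__":
    f = 5  # the project goal: overcome 5 malicious replicas
    print("Byzantine:", byzantine_params(f))  # 16 servers, quorum of 11
    print("Benign:   ", benign_params(f))     # 11 servers, quorum of 6
```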
State of the Art in Byzantine Replication: BFT [CL99]
Baseline technology
The Paxos Protocol: Normal Case, after Leader Election [Lam98]
Key: a simple end-to-end algorithm
Steward: Survivable Technology for Wide Area Replication
[Figure: a site of 3f+1 server replicas (1, 2, 3, …, 3f+1) and its clients]
• Each site acts as a trusted logical unit that can crash or partition.
• Effects of malicious faults are confined to the local site.
• Threshold signatures prove agreement to other sites.
• Between sites: a fault-tolerant protocol.
• There is no free lunch – we pay with more hardware…
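To make the "threshold signatures prove agreement" idea concrete, here is a minimal, hypothetical sketch. Steward itself uses threshold cryptography so that a site emits one compact signature; for brevity this stand-in collects 2f+1 per-server HMAC endorsements instead, and the keys and names below are invented for illustration.

```python
# Toy stand-in for Steward's site-level proof of agreement (illustrative only).
# Real Steward combines 2f+1 threshold-signature shares into a single signature.
import hmac, hashlib

F = 1                      # tolerated faults per site (toy value)
SITE_SIZE = 3 * F + 1      # replicas per site
THRESHOLD = 2 * F + 1      # endorsements needed to speak for the site

# Hypothetical per-server keys known to the verifier.
server_keys = {i: f"server-{i}-secret".encode() for i in range(SITE_SIZE)}

def endorse(server_id: int, site_message: bytes) -> bytes:
    """A server's endorsement of a site-level message."""
    return hmac.new(server_keys[server_id], site_message, hashlib.sha256).digest()

def site_proof_valid(site_message: bytes, endorsements: dict) -> bool:
    """Accept the site message only if 2f+1 distinct servers endorsed it."""
    valid = {sid for sid, tag in endorsements.items()
             if sid in server_keys and
             hmac.compare_digest(tag, endorse(sid, site_message))}
    return len(valid) >= THRESHOLD

if __name__ == "__main__":
    msg = b"site A proposes update #42"
    proof = {sid: endorse(sid, msg) for sid in range(THRESHOLD)}  # 2f+1 endorsers
    print(site_proof_valid(msg, proof))  # True
```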
Challenges (I)
• Each site has a representative that:
  • Coordinates the Byzantine protocol inside the site.
  • Forwards packets into and out of the site.
• One of the sites acts as the leader in the wide area protocol.
  • The representative of the leading site is the one assigning sequence numbers to updates.
• How do we select and change the representatives and the leader site, in agreement?
• How do we transition safely when we need to change them?
Challenges (II)
• Messages coming out of a site during leader election are based on communication among 2f+1 (out of 3f+1) servers inside the site.
  • There can be multiple sets of 2f+1 servers.
  • In some instances, multiple correct but different site messages can be issued by a malicious representative.
• It is sometimes impossible to completely isolate a malicious server's behavior inside its own site. This can happen in two instances:
  • The servers inside a site propose a new leading site.
  • The servers inside a site report their individual status with respect to global progress.
• We developed a detailed proof of correctness of the protocol.
Main Idea
• Sites change their local representatives based on timeouts.
• The leader site's representative has a larger timeout.
  • This allows for communication with at least one correct representative at each other site.
• After changing f+1 leader-site representatives, servers at all sites stop participating in the protocol and elect a different leading site.
(A sketch of this timeout logic follows below.)
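Below is a minimal, hypothetical sketch (our own toy Python, not Steward's actual code) of the view-change logic just described: a local timeout rotates the site representative, the leader site uses a larger timeout, and after f+1 leader-site rotations the servers move on to electing a new leading site.

```python
# Toy model of Steward's timeout-driven leadership changes (illustrative only).

F = 1                         # tolerated faults per site (toy value)
SITE_SIZE = 3 * F + 1
LOCAL_TIMEOUT = 1.0           # seconds; ordinary site-representative timeout
LEADER_SITE_TIMEOUT = 3.0     # larger, so a correct rep can reach other sites

class SiteState:
    def __init__(self, num_sites: int, is_leader_site: bool):
        self.num_sites = num_sites
        self.is_leader_site = is_leader_site
        self.rep_index = 0            # current local representative
        self.leader_rotations = 0     # rotations while we were the leader site
        self.leading_site = 0

    def timeout(self) -> float:
        return LEADER_SITE_TIMEOUT if self.is_leader_site else LOCAL_TIMEOUT

    def on_timeout_expired(self):
        """No progress observed before the timeout: rotate the representative."""
        self.rep_index = (self.rep_index + 1) % SITE_SIZE
        if self.is_leader_site:
            self.leader_rotations += 1
            # After f+1 leader-site representatives failed to make progress,
            # give up on this leading site and elect the next one.
            if self.leader_rotations > F:
                self.elect_new_leading_site()

    def elect_new_leading_site(self):
        self.leading_site = (self.leading_site + 1) % self.num_sites
        self.is_leader_site = False   # in the real protocol this is agreed upon
        self.leader_rotations = 0
```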
Steward: First Byzantine Replication Scalable to Wide Area Networks
• A second-iteration implementation:
  • Based on the complete theoretical design.
  • Follows closely the pseudocode proven to be correct.
• We benchmarked the new implementation against the program metrics.
• The code successfully passed the red-team experiment.
• We believe it is theoretically unbreakable.
Testing Environment
• Platform: dual Intel Xeon 3.2 GHz (64-bit), 1 GByte RAM, Linux Fedora Core 3.
• The library relies on OpenSSL:
  • OpenSSL 0.9.7a, 19 Feb 2003.
• Baseline operations:
  • RSA 1024-bit sign: 1.3 ms; verify: 0.07 ms.
  • 1024-bit modular exponentiation: ~1 ms.
  • Generating a 1024-bit RSA key: ~55 ms.
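The baseline numbers above come from OpenSSL's primitives on the 2005-era hardware described. A rough modern analogue of that measurement, sketched below with Python's `cryptography` package (our choice, not the original benchmark harness), times 1024-bit RSA key generation, signing and verification; absolute numbers will differ on current machines.

```python
# Rough analogue of the OpenSSL baseline measurements above (illustrative only).
import time
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

def timed(label, fn, n=100):
    """Run fn() n times and print the average wall-clock time in ms."""
    start = time.perf_counter()
    for _ in range(n):
        result = fn()
    print(f"{label}: {(time.perf_counter() - start) / n * 1000:.3f} ms")
    return result

message = b"steward benchmark message"

key = timed("RSA 1024-bit key generation",
            lambda: rsa.generate_private_key(public_exponent=65537, key_size=1024),
            n=10)
signature = timed("RSA 1024-bit sign",
                  lambda: key.sign(message, padding.PKCS1v15(), hashes.SHA256()))
timed("RSA 1024-bit verify",
      lambda: key.public_key().verify(signature, message,
                                      padding.PKCS1v15(), hashes.SHA256()))
```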
Evaluation Network 1: Symmetric Wide Area Network
• A synthetic network used for analysis and understanding.
• 5 sites, each connected to all other sites with equal-bandwidth, equal-latency links.
• One fully deployed site of 16 replicas; the other sites are emulated by one computer each.
• Total: 80 replicas in the system, emulated by 20 computers.
• 50 ms wide area links between sites.
• We varied the wide area bandwidth and the number of clients.
Write Update Performance
• Symmetric network, 5 sites.
• Steward:
  • 16 replicas per site, for a total of 80 replicas (four sites are emulated).
  • Actual computers: 20.
• BFT:
  • 16 replicas total: 4 replicas in one site, 3 replicas in each other site.
• Update-only performance (no disk writes).
Read-only Query Performance
• 10 Mbps on the wide area links.
• 10 clients inject mixes of read-only queries and write updates.
• Neither system was limited by bandwidth.
• Performance improves by between a factor of two and more than an order of magnitude.
• Availability: queries can be answered locally, within each site.
Evaluation Network 2: Practical Wide-Area Network
• Based on a real experimental network (CAIRN).
• Modeled on our cluster, emulating its bandwidth and latency constraints, both for Steward and BFT.
[Network diagram: CAIRN topology spanning Boston (MITPC), Delaware (UDELPC), Virginia (TISWPC, ISEPC, ISEPC3, ISIPC4, ISIPC), San Jose, and Los Angeles; wide area links range from 1.42 to 9.81 Mbits/sec with latencies from 1.4 to 38.8 ms, and local links are 100 Mb/s with < 1 ms latency.]
CAIRN Emulation Performance
• The 1.86 Mbps link between the East and West coasts is the bottleneck.
• Steward is limited by bandwidth at 51 updates per second.
• The same 1.86 Mbps link can barely accommodate 2 updates per second for BFT.
• Earlier experimentation with benign-fault two-phase commit protocols achieved up to 76 updates per second.
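A back-of-the-envelope reading of these numbers (our own arithmetic, not from the slides, and assuming the bottleneck link is saturated) shows roughly how much coast-to-coast traffic each ordered update implies for the two systems:

```python
# Implied per-update traffic across the 1.86 Mbps coast-to-coast bottleneck,
# derived only from the throughput figures quoted above (illustrative arithmetic).
BOTTLENECK_BPS = 1.86e6          # bits per second

for system, updates_per_sec in [("Steward", 51), ("BFT", 2)]:
    bits_per_update = BOTTLENECK_BPS / updates_per_sec
    print(f"{system}: ~{bits_per_update / 8 / 1024:.1f} KB of bottleneck "
          f"traffic per ordered update")
# Steward: ~4.5 KB per update; BFT: ~114 KB per update, reflecting the cost of
# running the full Byzantine agreement among all replicas over the wide area.
```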
Wide-Area Scalability (3)
• Selected 5 PlanetLab sites on 5 different continents: US, Brazil, Sweden, Korea and Australia.
• Measured the bandwidth and latency between every pair of sites.
• Emulated the network on our cluster, both for Steward and BFT.
• A 3-fold latency improvement even when bandwidth is not limited.
Performance Metrics
• The system can withstand f (5) faults in each site.
  • It performs better than a flat solution that withstands f (5) faults in total.
• Quantitative improvements (performance):
  • Between twice and over 30 times lower latency, depending on network topology and update/query mix.
  • The program metric was met and exceeded on most types of wide area networks, even when only write updates are considered.
• Qualitative improvements (availability):
  • Read-only queries can be answered locally, even in case of partitions.
  • Write updates can be performed when only a majority of sites are connected (as opposed to 2f+1 out of 3f+1 connected servers).
Red Team Experiment
• Excellent interaction with both the red team and the white team.
• Performance evaluation on the symmetric network:
  • Several points on the performance graphs presented were re-evaluated; the results were almost identical.
  • Thorough discussions regarding the measurement methodology and the presentation of the latency results validated our experiments.
• Five crash faults were induced in the leading site; performance slightly improved!
Red Team Experiment (2)
• Steward under attack:
  • Five sites, 4 replicas each.
  • The red team had full control (sudo) over five replicas, one in each site.
• Compromised replicas were injecting:
  • Loss (up to 20% each)
  • Delay (up to 200 ms)
  • Packet reordering
  • Fragmentation (up to 100 bytes)
  • Replay attacks
• Compromised replicas were running modified servers that contained malicious code.
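As an illustration of what this kind of network-level misbehavior looks like in practice, the sketch below is a hypothetical harness of our own (not the red team's actual tooling): a UDP forwarder that probabilistically drops, delays, reorders, and replays packets; fragmentation is omitted for brevity, and the addresses and rates are made-up examples.

```python
# Toy UDP fault-injection forwarder, loosely mirroring the attack mix above
# (loss, delay, reordering, replay). Illustrative only.
import random, socket, threading, time

LISTEN = ("127.0.0.1", 7000)   # where the "compromised" hop receives packets
FORWARD = ("127.0.0.1", 7001)  # address of the next correct replica

LOSS_RATE = 0.20               # drop up to 20% of packets
MAX_DELAY = 0.200              # delay up to 200 ms
REPLAY_RATE = 0.05             # occasionally resend an old packet

def forward_later(out_sock, data, history):
    time.sleep(random.uniform(0, MAX_DELAY))   # random delay => reordering
    out_sock.sendto(data, FORWARD)
    if history and random.random() < REPLAY_RATE:
        out_sock.sendto(random.choice(history), FORWARD)   # replay attack

def main():
    in_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    in_sock.bind(LISTEN)
    out_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    history = []
    while True:
        data, _ = in_sock.recvfrom(65535)
        history = (history + [data])[-100:]    # keep recent packets for replay
        if random.random() < LOSS_RATE:
            continue                           # packet loss
        threading.Thread(target=forward_later,
                         args=(out_sock, data, history), daemon=True).start()

if __name__ == "__main__":
    main()
```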
Red Team Experiment (3)
• The system was NOT compromised!
  • Safety and liveness guarantees were preserved.
  • The system continued to run correctly under all attacks.
  • All logs from all experiments are available.
• Most of the attacks did not affect performance.
• The system was slowed down when the representative of the leading site was attacked.
  • The speed of update ordering dropped to about 1/5 of normal.
  • The slowdown was not severe enough to trigger the defense mechanisms.
  • Crashing the corrupt representative caused the system to perform a view change and regain its performance.
Red Team Experiment (4)
Lessons learned:
• We re-built the entire system with the red-team attack in mind; we learned a lot even before the experiment.
• The overall performance of the system could be improved by not validating messages that are no longer needed (i.e., after 2f+1 messages have been received).
• Performance under attack could be improved substantially with further research.
Next Steps: Throughput Comparison (CAIRN)
[ADMST02] – not Byzantine!
Next Steps
• Performance during common operation:
  • We believe that wide-area throughput can be improved by at least a factor of 5 by using a more elaborate replication algorithm between wide area sites.
• Performance under attack:
  • So far we have focused on optimizing performance in the common case while guaranteeing safety and liveness at all times. Performance under attack is extremely important, but not trivial to achieve.
• System availability and safety guarantees:
  • A Byzantine-tolerant protocol between wide-area sites would guarantee system availability and safety even when some of the sites are completely compromised.
Scalability, Accountability and Instant Information Access for Network-Centric Warfare
New ideas:
• First scalable wide-area intrusion-tolerant replication architecture.
• Providing accountability for authorized but malicious client updates.
• Exploiting update semantics to provide instant and consistent information access.
Impact:
• Resulting systems with at least 3 times higher throughput, lower latency and high availability for updates over wide area networks.
• A clear path for technology transition into military C3I systems such as the Army Future Combat System.
Schedule (milestones: June 04, Dec 04, June 05, Dec 05):
• Component analysis & design
• Component implementation
• Component evaluation
• System integration & evaluation
• C3I model, baseline and demo
• Final C3I demo and baseline evaluation
http://www.cnds.jhu.edu/funding/srs/