
The virtue of dependent failures in multi-site systems

Flavio Junqueira and Keith Marzullo, University of California, San Diego. Workshop on Hot Topics in System Dependability (HotDep), Yokohama, Japan, June 2005.


Presentation Transcript


  1. The virtue of dependent failures in multi-site systems • Flavio Junqueira and Keith Marzullo • University of California, San Diego • Workshop on Hot Topics in System Dependability (HotDep), Yokohama, Japan, June 2005

  2. Multi-site systems • Collection of sites across a WAN • Multiple processors per site • Storage nodes • Computing nodes • Share resources • E.g. BIRN, Geon, TeraGrid • Failures • Processors unavailable • Services do not mask failures • Improve availability under failures • Replication • Minimize overhead

  3. Introduction • Failures in multi-site systems • Processor failures • Site failures • Processors of the site become unavailable • A new failure model • Availability through replication • Replica placement • Operations on replicas: quorums • Replicated data: quorum update • Replicated functionality: state-machine using Paxos • Quorum constructions • Failure model in practice • Implement the model • Site availability in BIRN • Model for processor failures within a site • Software and hardware faults • Misconfigured software • Shared resources • Storage • Power circuits • Cooling pipes • Air conditioning • Network

  4. A dependent failure model • Threshold model • Limit on the number of processor failures • Simple • Models homogeneous processors that fail independently well • Multi-site: sites are unavailable frequently enough • Processor failures are not IID • All processors of a site become unavailable • The multi-site threshold model • Two components • Threshold on the number of site failures (fs) • One threshold per site on processor failures (t) • Assumptions • Sites are homogeneous • Processors within a site are homogeneous • Processor failure = crash

  5. Quorum systems • Quorum system Q • Quorum system: set of quorums • Quorum: set of processors • Intersection property: every pair of quorums in Q intersect • Algorithms: access a quorum • Example: Majority system • n processors • Every subset of size (n+1)/2 is a quorum • Optimal availability for IID processor failures
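The Majority system and the intersection property above can be illustrated with a minimal Python sketch. The function names and the five processor labels are hypothetical, chosen only for this example; the quorum size is computed as n // 2 + 1, which is the slide's (n+1)/2 rounded up.

```python
from itertools import combinations


def majority_quorums(processors):
    """Return every subset of size n // 2 + 1, i.e. every strict majority."""
    n = len(processors)
    size = n // 2 + 1
    return [frozenset(q) for q in combinations(processors, size)]


def is_quorum_system(quorums):
    """Check the intersection property: every pair of quorums shares a processor."""
    return all(a & b for a, b in combinations(quorums, 2))


# Hypothetical universe of five processors.
quorums = majority_quorums(["p1", "p2", "p3", "p4", "p5"])
print(len(quorums), is_quorum_system(quorums))
```

Any two strict majorities of the same set must overlap, which is why the check succeeds for every n.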

  6. A quorum construction: QSite • QSite • Select at least (2fs + 1) sites: S • Select at least (2t + 1) processors from each site in S • Quorum • Majority of sites in S • Majority of processors in each site • An example (fs = 1, t = 1) (Figure: quorums across Site 1, Site 2, Site 3)
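As a rough sketch of the QSite construction for the fs = 1, t = 1 example, the Python below enumerates quorums as "a majority of sites, and a majority of processors in each chosen site". The site and processor names, and the helper names, are assumptions made for illustration and do not come from the talk.

```python
from itertools import combinations, product


def majorities(items):
    """All strict-majority subsets of a collection."""
    items = list(items)
    size = len(items) // 2 + 1
    return [frozenset(c) for c in combinations(items, size)]


def qsite_quorums(sites):
    """sites maps a site name to its list of processors.

    A quorum picks a majority of sites and, inside each picked site,
    a majority of that site's processors.
    """
    quorums = []
    for site_majority in majorities(sorted(sites)):
        per_site = [majorities(sites[s]) for s in sorted(site_majority)]
        for choice in product(*per_site):
            quorums.append(frozenset().union(*choice))
    return quorums


# fs = 1, t = 1: 2*fs + 1 = 3 sites, 2*t + 1 = 3 processors per site.
sites = {
    "site1": ["a1", "a2", "a3"],
    "site2": ["b1", "b2", "b3"],
    "site3": ["c1", "c2", "c3"],
}
quorums = qsite_quorums(sites)
print(len(quorums), {len(q) for q in quorums})
```

Two quorums always share a site (two majorities of sites overlap), and inside that site their processor majorities overlap, so the intersection property holds.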

  7. QSite vs. Majority • Properties of multi-site threshold model hold • Same replicas for QSite and Majority • Availability • fs unavailable sites • Remaining fs + 1 sites: t unavailable processors each • Majority: no quorum available • Requires: Available: • QSite: one quorum available • QSite has better availability • Majority is not optimal • Quorum sizes • QSite produces smaller quorums • Reduces load • Increases capacity
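The availability comparison can be checked on the worst case allowed by the multi-site threshold model for fs = 1, t = 1: one site entirely down and one processor down in each remaining site. The sketch below is only an illustration; the function names and the scenario dictionary are assumptions, not part of the talk.

```python
def majority_available(alive, total):
    """A Majority quorum exists iff a strict majority of all replicas is alive."""
    return alive >= total // 2 + 1


def qsite_available(alive_per_site):
    """alive_per_site maps site -> (alive processors, total processors).

    A QSite quorum exists iff a majority of sites each have a majority
    of their processors alive.
    """
    sites_ok = sum(
        1 for alive, total in alive_per_site.values() if alive >= total // 2 + 1
    )
    return sites_ok >= len(alive_per_site) // 2 + 1


# fs = 1 site fully down; t = 1 processor down in each remaining site.
scenario = {"site1": (0, 3), "site2": (2, 3), "site3": (2, 3)}
alive = sum(a for a, _ in scenario.values())
total = sum(n for _, n in scenario.values())

print("Majority:", majority_available(alive, total))
print("QSite:", qsite_available(scenario))
```

In this scenario only 4 of the 9 replicas are alive, short of the 5 a flat majority needs, while two sites still hold a local majority.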

  8. Reducing quorum sizes and sites • QSite, fs = 2, t = 1: • 5 sites • 3 processors per site • 6 processors per quorum • Compromise availability (Figure: quorums across Site 1, Site 2, Site 3, Site 4)
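The figures on this slide follow from the QSite rules on slide 6. The short derivation below spells out the arithmetic; the Majority line is an added comparison, assuming a flat majority over the same 15 replicas, and is not a number stated on the slide.

```latex
\begin{align*}
\text{sites} &= 2 f_s + 1 = 2 \cdot 2 + 1 = 5 \\
\text{processors per site} &= 2t + 1 = 2 \cdot 1 + 1 = 3 \\
|\text{QSite quorum}| &= (f_s + 1)(t + 1) = 3 \cdot 2 = 6 \\
|\text{Majority quorum}| &= \left\lfloor \tfrac{5 \cdot 3}{2} \right\rfloor + 1 = 8
\end{align*}
```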

  9. Site availability • Goals • Show that sites are unavailable frequently enough • Threshold on the number of site failures • BIRN - Biomedical Informatics Research Network • Test bed projects centered around brain imaging • Currently: 19 universities, 26 research groups • Availability • Monthly basis • Pings (BIRN-CC) • Storage broker logs • Site availability • Jan/04-Aug/04 • Availability under 100% • On average in 5 out of the 8 months

  10. BIRN site availability • 10 sites experience at least one outage • One site under 97%

  11. Threshold on unavailable sites • Worst-case scenario • Assumption: independent site failures • The n most unavailable sites in each month • Probability that all n sites are unavailable • Each 1% of unavailability is approximately 7 hours
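Under the independence assumption, the probability that all n sites are unavailable at once is the product of their individual unavailabilities. The sketch below shows that calculation and the "1% is about 7 hours" conversion; the 30-day month and the availability figures are hypothetical inputs, not BIRN measurements.

```python
from math import prod

HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month


def prob_all_unavailable(availabilities):
    """Probability that every listed site is down, assuming independent failures."""
    return prod(1.0 - a for a in availabilities)


def downtime_hours(availability):
    """Monthly downtime implied by an availability fraction."""
    return (1.0 - availability) * HOURS_PER_MONTH


# Hypothetical monthly availabilities for the n = 2 least available sites.
worst_sites = [0.97, 0.99]
print(prob_all_unavailable(worst_sites))
print(downtime_hours(0.99))
```

With a 30-day month, 1% of unavailability comes to 7.2 hours, which matches the slide's approximation.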

  12. Modeling failures in a site • Homogeneous set of processors • Independent processor failures • Identical probability of failure • Processors are repaired • Repair probabilities change with number of failures • Markov chain • From the model: threshold on the number of failures (t) • Desired degree of availability • Stationary probabilities
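One way to picture this model is as a birth-death Markov chain whose state is the number of failed processors in a site. The sketch below is an assumption-laden illustration, not the authors' model: it assumes at most one failure or repair per step, and the transition probabilities are invented. It computes the stationary probabilities by detailed balance and then picks the smallest threshold t whose exceedance probability meets a target.

```python
def stationary(fail, repair):
    """Stationary distribution of a birth-death chain.

    fail[k]   : probability of moving from k to k + 1 failed processors
    repair[k] : probability of moving from k + 1 back to k failed processors
    Detailed balance gives pi[k + 1] = pi[k] * fail[k] / repair[k].
    """
    weights = [1.0]
    for f, r in zip(fail, repair):
        weights.append(weights[-1] * f / r)
    total = sum(weights)
    return [w / total for w in weights]


def threshold(pi, max_unavailability):
    """Smallest t such that P(more than t failures) <= max_unavailability."""
    for t in range(len(pi)):
        if sum(pi[t + 1:]) <= max_unavailability:
            return t
    raise ValueError("max_unavailability must be non-negative")


# Hypothetical numbers for three processors: failures are rare, and the
# repair probability grows with the number of failed processors.
pi = stationary(fail=[0.01, 0.01, 0.01], repair=[0.1, 0.2, 0.3])
print(pi)
print(threshold(pi, max_unavailability=0.01))
```

A stricter availability target pushes the threshold t up, which in turn raises the 2t + 1 processors QSite needs per site.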

  13. An example • Three processors per site • Probabilities • Failure probability much smaller than repair probabilities • Repair probabilities increase with failures • t = 1 • Availability  0.001

  14. Discussion & Future work • Multi-site systems: important class of distributed systems • Share resources • Collaboration among distant groups • Improve availability through replication • A useful abstraction: quorum systems • Algorithms built on top of quorum systems • Dependent failures • Site failures • Enables smaller, more highly available quorums • Lessons to learn • Considering dependent failures may improve results • Models are not necessarily complex • Future work • Validate model, evaluate constructions in practice, more constructions, etc.

  15. END

  16. Equations

  17. Introduction • Failures in multi-site systems • Processor failures • Site failures • Processors of the site become unavailable • A new failure model • Availability through replication • Replica placement • Operations on replicas: quorums • Replicated data (quorum update) • Replicated functionality (state-machine using Paxos) • Quorum constructions • Failure model in practice • Implementability of the model • Real system for site availability (BIRN) • Model for processor failures within a site • Software and hardware faults • Software incompatibility, misconfiguration • Shared resources (e.g. storage) • Power failures • Broken pipes • Loss of air conditioning • Network problems

  18. Introduction • Failures in multi-site systems • Processor failures • E.g. HW failures • Site failures • Software incompatibility, misconfiguration • Shared resources (e.g. storage) • Power failures • Broken pipes • Loss of air conditioning • Network problems • Strategies for replica placement • Large number of sites and nodes • Updates • Naïve approach: every non-faulty replica up to date • Quorum update: contact a quorum of processors • Distributed shared register (replicated data) • Multiple copies of a data set (Quorum Update) • E.g. Brain images (BIRN); Geological data (Geon) • Consensus (replicated functionality) • State-machine approach (Paxos algorithm) • E.g.: Parallel computation (TeraGrid)

  19. Why sites fail • Software incompatibility, misconfiguration • Shared resources (e.g. storage) • Power failures • Broken pipes • Loss of air conditioning • Network problems

  20. Quorums in a multi-site system • Data replication • Multiple copies of data sets • Functionality replication • State-machine approach • Paxos (Coteries for Classic Paxos) • Question: How do we choose nodes to replicate? • Flat organization • Organization into sites

  21. Quorum systems • Quorum system Q • Quorum system: set of quorums • Quorum: set of processors • Intersection property: every pair of quorums in Q intersect • Algorithms: access a quorum when executing some operation • Examples • Majority system: • n processors • Every subset of size (n+1)/2 is a quorum • Optimal availability for IID processor failures • Multi-colored: colors as sites (Figure: processors and quorums)

  22. Quorum systems (cont.) • In multi-site systems • Replicated data • Multiple copies of a data set (Quorum update) • E.g. Brain images (BIRN); Geological data (Geon) • Replicated functionality • State-machine approach (Paxos algorithm) • E.g.: Parallel computation (TeraGrid) • Quorums for multi-site systems • Replicating on every node is excessive • Quorum construction • Set of processors to replicate on • Quorums

  23. Examples of quorum systems • Majority system: • n processors • Every subset of size (n+1)/2 is a quorum • Multi-colored: colors as sites • Majority has optimal availability for independent and identically distributed (IID) processor failures (Figure: universe and quorum patterns)

  24. BIRN site availability • 10 sites have at least one outage • One site under 97%

  25. Discussion & Future work • Multi-site systems: important class of distributed systems • Share resources • Collaboration among distant groups • Improve availability through replication • A useful abstraction: quorum systems • Algorithms built on top of quorum systems • Dependent failures • Site failures • Enables smaller, more highly available quorums • Future work • Validate multi-site threshold model • Evaluate proposed constructions in practice • More constructions • More issues with dependent failures
