
.99999

Dan Oberst, Princeton University


Presentation Transcript


  1. .99999 Dan Oberst, Princeton University

  2. Some Definitions
     • Reliability Metrics: Percent Uptime

  3. Reliability Gotchas
     • 2 hour outage in 1 year
       • Requires 23 years of 100% uptime for .99999
     • 99% Availability (88 hours/year)
       • One 3+ day outage
       • One ~7 hour outage every month
       • One ~1½ hour outage every week
     • Reliability isn’t the whole story
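The arithmetic behind these figures can be checked with a short sketch. It assumes a 365-day year (8,760 hours) and treats availability as the fraction of total hours in service; the function names are illustrative and do not come from the presentation.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored (assumption)


def allowed_downtime_hours(availability, period_hours=HOURS_PER_YEAR):
    """Hours of downtime a given availability permits within one period."""
    return (1.0 - availability) * period_hours


def years_to_absorb(outage_hours, availability):
    """Total years of service over which a single outage still meets the target."""
    return outage_hours / allowed_downtime_hours(availability)


if __name__ == "__main__":
    budget_99 = allowed_downtime_hours(0.99)
    print(f"99% allows about {budget_99:.1f} hours of downtime per year")
    print(f"  as one outage:   about {budget_99 / 24:.2f} days")
    print(f"  spread monthly:  about {budget_99 / 12:.2f} hours each")
    print(f"  spread weekly:   about {budget_99 / 52:.2f} hours each")

    five_nines_minutes = allowed_downtime_hours(0.99999) * 60
    print(f".99999 allows about {five_nines_minutes:.2f} minutes per year")
    print(f"A 2 hour outage needs about {years_to_absorb(2, 0.99999):.1f} years to absorb")
```

Under these assumptions, one 2 hour outage uses up roughly 23 years of the five-nines downtime budget, which matches the figure on the slide.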

  4. The Weakest Link
     • No system can be more reliable than any of its components
     • System reliability is the product of the component reliabilities
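A minimal sketch of the product rule, assuming the components sit in series (the service needs all of them) and fail independently. The component names and availability values are hypothetical and are not taken from the presentation.

```python
from math import prod


def series_availability(availabilities):
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual availabilities multiply.
    """
    return prod(availabilities)


if __name__ == "__main__":
    # Hypothetical chain: values chosen only for illustration.
    chain = {"network": 0.9999, "server": 0.999, "database": 0.99999}
    total = series_availability(chain.values())
    print(f"End-to-end availability: {total:.6f}")
    print(f"Weakest component alone: {min(chain.values()):.6f}")
```

Because every factor is at most 1, the product can never exceed the smallest factor, which is the slide's first bullet.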

  5. Beyond Uptime
     • Scheduled Uptime
       • How much can you afford to be down?
       • = How much do you need to plan to be up?
       • 24x7, 24x6.75, 18x7, etc.
     • RTO (Recovery Time Objective)
       • How long before the system is back?
       • How long can you afford to be without it?
     • RPO (Recovery Point Objective)
       • How much lost work?
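These measures can be made concrete with a small sketch. It assumes that a schedule such as "24x6.75" means hours per day times days per week, that RTO is compared against the time from failure to restoration, and that RPO is compared against the time between the last good copy of the data and the failure. The function names and the incident timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def scheduled_hours_per_week(hours_per_day, days_per_week):
    """Hours per week the service is planned to be up (e.g. 24x6.75)."""
    return hours_per_day * days_per_week


def meets_objectives(last_good_copy, failure, restored, rto, rpo):
    """Return (rto_met, rpo_met) for one incident.

    Downtime is restored - failure; lost work is failure - last_good_copy.
    """
    downtime = restored - failure
    lost_work = failure - last_good_copy
    return downtime <= rto, lost_work <= rpo


if __name__ == "__main__":
    for hours, days in [(24, 7), (24, 6.75), (18, 7)]:
        print(f"{hours}x{days}: {scheduled_hours_per_week(hours, days)} scheduled hours/week")

    # Hypothetical incident: last backup 30 minutes before a noon failure,
    # service restored 3 hours later.
    failure = datetime(2024, 1, 1, 12, 0)
    rto_met, rpo_met = meets_objectives(
        last_good_copy=failure - timedelta(minutes=30),
        failure=failure,
        restored=failure + timedelta(hours=3),
        rto=timedelta(hours=4),
        rpo=timedelta(minutes=15),
    )
    print(f"RTO met: {rto_met}, RPO met: {rpo_met}")
```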

  6. Example Service Levels

  7. How’re We Doin’?
     • Gartner CIO Poll
       • How would you rank your most critical applications in unplanned downtime in the past year?

  8. (No text was transcribed for this slide.)

  9. How’re We Doin’? (cont.)
     • How would you rank your most-critical application in planned downtime during the past year?

  10. Getting to .99999
      • Enhanced Availability
        • Redundancy
        • RAID
      • High Availability
        • Clustering
        • Remote mirroring
      • Fault-Tolerant
        • All resources (including application) replicated
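Why replication adds nines can be sketched with the standard parallel-availability formula. This is an idealized model and not something the slides state: it assumes replicas fail independently and that failover is instantaneous and itself perfectly reliable, which real clusters only approximate. The function names are illustrative.

```python
def parallel_availability(availabilities):
    """Availability of redundant replicas where any one being up is enough.

    Assumes independent failures and perfect failover (idealized).
    """
    all_down = 1.0
    for a in availabilities:
        all_down *= 1.0 - a  # probability that this replica is also down
    return 1.0 - all_down


def replicas_needed(replica_availability, target):
    """Smallest number of identical replicas that reaches the target."""
    if not 0.0 < replica_availability < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("availabilities must be strictly between 0 and 1")
    count = 1
    while parallel_availability([replica_availability] * count) < target:
        count += 1
    return count


if __name__ == "__main__":
    pair = parallel_availability([0.99, 0.99])
    print(f"Two 99% replicas: {pair:.6f}")
    print(f"99% replicas needed for .99999: {replicas_needed(0.99, 0.99999)}")
```

In this model each added replica multiplies the probability of total failure by the replica's own unavailability, which is why redundancy is often a cheaper route to extra nines than hardening a single component.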

  11. Five Nines
      • It’s hard, it’s expensive.
      • Match the reliability to the service.
      • Improve the component with the fewest nines.
      • Find the cheapest nines in the chain.
      • Review assumptions.
      • Practice³!!
      • Moore’s Law is your friend.
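Finding "the component with the fewest nines" can be sketched as follows. Counting nines as -log10(1 - availability) is a common convention rather than something the slides define, and the component names and values are hypothetical.

```python
from math import log10


def nines(availability):
    """Number of nines in an availability figure, e.g. 0.999 gives about 3."""
    return -log10(1.0 - availability)


def weakest_component(components):
    """Name of the component with the fewest nines (lowest availability)."""
    return min(components, key=lambda name: components[name])


if __name__ == "__main__":
    # Hypothetical chain: values chosen only for illustration.
    chain = {"power": 0.9999, "network": 0.999, "application": 0.99}
    for name, availability in chain.items():
        print(f"{name}: {nines(availability):.1f} nines")
    print(f"Improve first: {weakest_component(chain)}")
```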

  12. Resources
      • D. Scott, "CIO Update: Poll Shows Application Availability Levels Have Increased," Gartner Article G00120892, 12 May 2004.
      • D. Scott and J. Krischer, "Real-Time Enterprise: Business Continuity and Availability," Gartner Research Note SPA-18-1683, 24 September 2002.
      • Sunny Beach Technology, Inc., "Performance Tuning Active Call Center for Enterprise Applications," White Paper, 7 January 2001, http://www.sunny-beach.net.
