Five 9s for SANs w/o Breaking the Bank


Presentation Transcript


  1. Five 9s for SANs w/o Breaking the Bank • Presented by Marc Staimer, President & CDS (Chief Dragon Slayer), Dragon Slayer Consulting

  2. Agenda • What is Five 9s? • How this relates to SANs • Reality Check • What you should do

  3. What is Five 9s & What does it Really Mean?

  4. Five 9s Generally Defined • 99.999% is another term for “High Availability”

  5. What does “Availability” mean? • Availability is the proportion of time that a system can be used for productive work

  6. Then what does “five 9s” mean? • Scheduled & Unscheduled downtime does not exceed ~5 minutes per year • Perspective: Annual downtime = • Less time than it takes to drink a cup of coffee • 1/6th the time of the average daily commute

  7. What about Four 9s or less? • Four 9s = ~ an hour of downtime/yr • Three 9s = ~ 9 hours of downtime/yr • Two 9s = ~ 4 days (88 Hours) of downtime/yr
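
The downtime figures on slides 6 and 7 follow directly from the availability percentage. A minimal sketch of the arithmetic, assuming a 365-day year and counting all downtime (scheduled and unscheduled) against the budget; the function name is illustrative and not from the presentation:

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the downtime allowed per year at the given availability."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR


if __name__ == "__main__":
    levels = [("Two 9s", 99.0), ("Three 9s", 99.9),
              ("Four 9s", 99.99), ("Five 9s", 99.999)]
    for label, pct in levels:
        minutes = downtime_minutes_per_year(pct)
        print(f"{label:<9} {pct:>7}%  {minutes:9.1f} min/yr  ({minutes / 60:6.2f} h/yr)")
```

Under these assumptions the results are roughly 87.6 hours, 8.76 hours, 52.6 minutes, and 5.3 minutes per year, which matches the rounded figures on the slides.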

  8. Can you live with two, three, or four 9s? …It depends • On the application • The types of outages you can live with • The cost of downtime for those applications • The cost of high availability such as five 9s

  9. Application Availability Dependencies • Mission criticalness • Productivity loss from downtime • Alternatives

  10. Outage dependencies • You may be able to live w/two 9s if: • There are 88 separate outages of 1 hour each through the year • It is a different story if it is 1 outage of nearly 4 days • This could put a business out of business

  11. Cost of downtime • The cost of app downtime can be prohibitive

  12. Direct costs of downtime, per Gartner Group (industry: average loss per hour) • Brokerage Operations: $6,450,000 • Credit Card Authorizations: $2,600,000 • E-commerce: $240,000 • Package Shipping Services: $150,250 • Home Shopping Channels: $113,750 • Catalog Sales Center: $90,000 • Airline Reservation Center: $89,500 • Cellular Service Activation: $41,000 • ATM Service Fees: $14,500
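
To put the hourly figures in annual terms, they can be multiplied by the downtime that each availability level allows. The sketch below does that for three 9s and five 9s. It assumes loss scales linearly with hours of downtime and a 365-day year, and it covers direct cost only; the function name is illustrative and not from the presentation:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored

# Average direct loss per hour, copied from the Gartner Group figures above.
LOSS_PER_HOUR = {
    "Brokerage Operations": 6_450_000,
    "Credit Card Authorizations": 2_600_000,
    "E-commerce": 240_000,
    "Package Shipping Services": 150_250,
    "Home Shopping Channels": 113_750,
    "Catalog Sales Center": 90_000,
    "Airline Reservation Center": 89_500,
    "Cellular Service Activation": 41_000,
    "ATM Service Fees": 14_500,
}


def annual_direct_loss(loss_per_hour: float, availability_percent: float) -> float:
    """Direct loss per year if all downtime allowed at this availability occurs."""
    downtime_hours = (1 - availability_percent / 100) * HOURS_PER_YEAR
    return loss_per_hour * downtime_hours


if __name__ == "__main__":
    print(f"{'Industry':<28} {'at 99.9%':>15} {'at 99.999%':>13}")
    for industry, loss in LOSS_PER_HOUR.items():
        three = annual_direct_loss(loss, 99.9)
        five = annual_direct_loss(loss, 99.999)
        print(f"{industry:<28} ${three:>14,.0f} ${five:>12,.0f}")
```

For brokerage operations, for example, this works out to about $56.5 million per year at three 9s versus about $565,000 at five 9s.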

  13. Collateral damage of downtime is more, per Gartner Group (company: direct cost; collateral damage) • eBay: > $5,000,000; dramatic market cap reduction • ATT: > $10,000,000; ~$40 million in rebates + SLAs • Collateral damage is more serious than temporary loss of business • Collateral damage severity increases as business moves online

  14. Making “availability” five 9s has a cost too • Old rule of thumb, per IMEX Research: • The first 80% takes 20% of the cost • The last 20% takes 80% of the cost

  15. There must be tradeoffs (per IMEX Research)

  16. Finding the crossover point is key [Figure: Annual Business Downtime Cost and System Cost vs. Percent Available (90%, 99%, 99.90%, 99.99%, 99.999%, 100%); labels: Excessive Downtime Costs, System Uptime Requirements, Excessive System Costs]
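
One way to read the crossover idea: add the annual cost of downtime to the annual cost of the system at each availability level, then pick the level where the total is lowest. The sketch below is only an illustration. The system costs in `HYPOTHETICAL_SYSTEM_COST` are made-up placeholders, not figures from the presentation, and downtime cost is assumed to scale linearly with hours of downtime:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored

# HYPOTHETICAL annual system cost per availability level (placeholder numbers).
HYPOTHETICAL_SYSTEM_COST = {
    90.0: 50_000,
    99.0: 100_000,
    99.9: 250_000,
    99.99: 700_000,
    99.999: 2_000_000,
}


def total_annual_cost(availability_percent, downtime_cost_per_hour, system_cost):
    """Annual downtime cost plus annual system cost at one availability level."""
    downtime_hours = (1 - availability_percent / 100) * HOURS_PER_YEAR
    return downtime_hours * downtime_cost_per_hour + system_cost


def cheapest_level(downtime_cost_per_hour, system_costs=HYPOTHETICAL_SYSTEM_COST):
    """Return the availability level with the lowest total annual cost."""
    return min(
        system_costs,
        key=lambda pct: total_annual_cost(pct, downtime_cost_per_hour, system_costs[pct]),
    )


if __name__ == "__main__":
    for cost_per_hour in (14_500, 240_000, 6_450_000):
        print(f"${cost_per_hour:>9,}/hr -> {cheapest_level(cost_per_hour)}% available")
```

With these placeholder system costs, an application losing $14,500 per hour lands at three 9s, while one losing $6,450,000 per hour justifies five 9s. Real inputs would come from the environment knowledge described on the next slide.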

  17. How: Thorough Environment Knowledge • Systems • Hardware • Software • Data • Productivity • Direct cost of downtime and collateral damage

  18. What about disasters & downtime? Not if, when • There will eventually be a major interruption of your business environment

  19. Test, test, test • Whatever your business continuity plans • Make sure you can recover your business in the event of a failure • Test, test, test • One end-user claims to back up to tape every month, except he backs up onto the same tape every time, even when the system asks for a new tape

  20. Reasons cited by European Enterprises for invocation of Business Continuity Plans From 1997-2000 • Hardware Failure 60% • Software 16% • Power Outage 7% • Bomb 3% • Fire 3% • Flooding 3% • Environmental 2% • Telecom Failure 1% • Denied access 1% • Miscellaneous 4%

  21. Reasons cited by USA Enterprises for invocation of Business Continuity Plans From 1997-2000 • Regional Event 40% • Hardware Failure 36% • Software 10% • Power Outage 4% • Bomb 2% • Fire 2% • Flooding 2% • Environmental 1% • Telecom Failure 1% • Denied access 1% • Miscellaneous 1%

  22. How does all this relate to SANs?

  23. SANs have become the critical path of “high availability” or five 9s. • When an application server fails • Only the users using that app are affected • When shared storage goes down • Users of the applications using that storage are affected • When the SAN goes down • All users are affected

  24. Complete availability vs. high availability w/reduced capabilities • Five 9s w/no loss of capabilities • Full Bandwidth all the time w/no pr • Five 9s w/reduced capabilities • Reduced Bandwidth • Higher probability of path congestion • Similar to differences between RAID 0,1 & RAID 5

  25. Five 9s SANs with full capabilities • Director class switches • Full bandwidth between initiators & target storage • Even with a failure in the Director or fabric [Figure label: FC/9000]

  26. Five 9s SANs with reduced capabilities • Core/edge networking • Oversubscribed B/W • Path failures mean • Auto failover • Reduced B/W • Increased possibilities of congestion

  27. Fabric Comparison or Red Herring? • 96 Port Resilient Core/Edge Fabric using 16-port switches • 128 Port Fault Tolerant Director Fabric, or 128 Port Dual 64 Port Directors [Figure labels: Edge, Core, Core Switch, Edge Switch]

  28. FC/9000 Directors vs. Core/Edge Switches
      Directors (five 9s, fully capable): • Cost ~ $2,500/port • Mask failures • Apps never know it fails • Full B/W even with failures • Simple to set up & manage • Fault tolerant • Network: up to 239 switches/directors • Up to 256 ports/director • Can be Core or Edge switch
      Switches (five 9s w/reduced failure mode capabilities): • Cost ~ $1,000/port • Oversubscribed B/W • Congestion statistically unlikely • Failures mean loss of B/W • More difficult to set up/manage • Fault resilient • Network: up to 239 switches • Up to 64 ports/switch • Can be Core or Edge switch

  29. Reality Check • Core/edge & Directors are not mutually exclusive • Models can & should be mixed • Some apps cannot handle fabric disruptions of any kind • Some fabrics can never ever have reduced capacity • Some apps do not have to have full B/W all the time [Figure labels: FC/9000, FC/9000]

  30. Fabric Design “five 9s” Factors • The larger the switch/director nodes • The less likely there will be inter-switch/director traffic • The more oversubscribed your fabric can be w/o increased risk • The more important “HA” becomes in the node itself • FSPF has limited failover capabilities • The loss of a path in the fabric (ISL failure) will cause failover • Failover may not be fast enough to avoid SCSI device timeout • Edge device retransmissions or failover must be designed in

  31. The Key is determining where to implement with what & when • Use the same ROE as before • Thorough knowledge of the data & environment • Hardware, software, systems, etc. • Match the type of SAN to the application

  32. What you should do • Educate yourself about your data & environment • Design your SANs to meet the needs of the business • Provide five 9s with full capability for those apps that need it • Provide five 9s with less than full capability for those apps that don’t need it • Making your entire SAN environment completely five 9s w/no loss of capabilities could be cost prohibitive

  33. SAN Design Methodology [Figure: design lifecycle diagram; labels: Data Collection, Data Analysis, Arch Develop, Prototype and Test, Design, Transition, Implementation, Release to Production, Maint. (Add / Change / Remove / Mgt / Troubleshoot), Upgrade / Architectural change]

  34. Other tools you can use • Interactive online high availability interrogator • Helps determine the cost of your downtime • White papers • http://www.available.com

  35. Marc Staimer marcstaimer@earthlink.net 503-579-3763
