IOS Update for SwiNOG 4th 17th April 2002 Chris Martin Systems Engineer Cisco Switzerland
Agenda • Cust.Sat Survey / Quality Initiatives • High Availability
Five Cisco IOS SW Quality Goals Embraced throughout Cisco • Goal 1 - Reduce regression defects • Goal 2 - Reduce customer-found defects • Goal 3 - Reduce total outstanding defects (backlog) in a timely manner • Goal 4 - Increase software release clarity and feature consistency • Goal 5 - Provide feature and maintenance releases with predictable schedules and quality
Goals of IOS Repackaging • Simplify software selection process • Eliminate massive feature set confusion • Reduce internal cost
The Legacy - circa 1996 • Began simply ... • Feature sets layered by functionality: IP; Desktop (includes IP); Enterprise (includes Desktop and IP) • Other labels in the diagram: PLUS, CRYPTO, SNA, VOICE, ATM, FW • 37 feature sets and 2500 images
IOS Revenue by Feature Sets (based on # of systems shipped) • Less than 10% of feature sets account for 90% of revenue (platforms: C800, C1600, C1700, C2500, C2600, C3600, C5x00, & C7x00) [Bar chart: share of revenue per feature set, axis 0% to 80%; labeled bars: IP, IP+]
IOS Technology Packaging: 3 Programs • “Jenny Craig” - streamline IOS code by deprecating older legacy protocols no longer in use • “IOS Reformation” - realign IOS to today’s market needs & simplify image selection process • “IOS Inquisition” - end-of-life older images where business justified, about 60%
Agenda • Cust.Sat Survey / Quality Initiatives • High Availability / Resilient IP
The High Costs of Downtime • The average downtime costs incurred in the past 12 months: $21.6 Million • Ranges from $500,000 to $298M • Equates to an average of $2,169 per minute • % having experienced downtime costs in the past 12 months: 98% • Source: Sage Research, Aug. 2001
Carrier Class Means High Availability: What Is High Availability? • High Availability means an average end user will experience less than five minutes of downtime per year • High Availability means five 9’s or more
Availability    DPM     Downtime Per Year
99.900%         1000    8 Hours 46 Minutes
99.950%         500     4 Hours 23 Minutes
99.990%         100     53 Minutes
99.999%         10      5 Minutes
99.9999%        1       30 Seconds
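The downtime and DPM figures on this slide follow from simple arithmetic on the availability percentage. A minimal sketch, assuming a 365-day year and treating DPM as the unavailable fraction scaled to one million (the function names are illustrative, not from the presentation):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent):
    """Minutes of downtime per year implied by an availability percentage."""
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


def defects_per_million(availability_percent):
    """Unavailable fraction expressed per million (DPM)."""
    return (100 - availability_percent) / 100 * 1_000_000


for pct in (99.9, 99.95, 99.99, 99.999, 99.9999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% -> {minutes:.2f} min/year, DPM {defects_per_million(pct):.0f}")
```

The slide rounds the results, so "five 9's" comes out as roughly five minutes per year.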
How is Availability Calculated? • Availability (%) is calculated by tabulating end user outage time, typically on a monthly basis. • Some customers prefer to use DPM (Defects Per Million) to represent network availability.
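One way to read "tabulating end user outage time" is as a ratio of lost user-minutes to total user-minutes in the period. The sketch below is an assumption about how such a tabulation might be done, not a formula given in the presentation; the function name, the `(users_affected, minutes)` outage records, and the sample numbers are all hypothetical.

```python
def availability_and_dpm(outages, users, period_minutes):
    """Return (availability %, DPM) for one reporting period.

    outages: iterable of (users_affected, outage_minutes) records (hypothetical format).
    users: total number of end users served.
    period_minutes: length of the reporting period, e.g. 30 * 24 * 60 for a month.
    """
    total_user_minutes = users * period_minutes
    lost_user_minutes = sum(affected * minutes for affected, minutes in outages)
    unavailable_fraction = lost_user_minutes / total_user_minutes
    availability = 100 * (1 - unavailable_fraction)
    dpm = 1_000_000 * unavailable_fraction
    return availability, dpm


# Hypothetical month: 1000 users, one outage hitting 200 users for 30 minutes
# and one hitting all users for 5 minutes.
month = 30 * 24 * 60
availability, dpm = availability_and_dpm([(200, 30), (1000, 5)], users=1000, period_minutes=month)
print(f"availability {availability:.4f}%, DPM {dpm:.1f}")
```

Weighting each outage by the number of affected users reflects the end-user view of availability that the slide describes, rather than a per-device view.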
Scheduled Downtime (hours) [Bar chart, axis 0 to 30 hours] • Software upgrade: 23.9 • Parts replacement: 23.7 • Site relocation: 17.4 • New device installation: 16.7 • Device replacement: 15.1 • Device maintenance: 15 • Other: 8.4
The Edge is the Most Vulnerable • The Core is redundant enough to disguise failures. • The Edge is a Single Point of Failure. • The Edge is what the customer sees.
The Edge is the Most Vulnerable To Customers Failures here may affect thousands of customers
Components of Downtime (in sequence along a relative-time axis) • Detect failure • Switchover to redundant RP or reload RP • Reload image, parse config, identify LC in router • Final initialization, take control of bus • Reload LC image • Restore connectivity (e.g. Frame Relay, PPP, etc.) • Converge route table and inform LC of new forwarding information • Restored [Diagram marks a Phase 1 target and a Phase 2 target on the time axis]
Delivering HA Features in Phases: Reduce MTTR / Maintain Sessions / Planned Outages • Phase 1 • c7500 SLCR • Reduce RP failover time (RPR/RPR+) • Fast S/W Upgrade • Faster FR recovery • Phase 2 • Non-Stop Forwarding (BGP, OSPF, ISIS) • Stateful Switchover (cHDLC, PPP, ATM, FR) • Phase 3 • Additional protocol support (EIGRP, MLPPP, MPLS, IPv6, TBD) • Additional platform support (c6500/C7600) • Phase 4 • In Service Software Upgrades • Status labels on slide: Delivered, EFT • SLCR: Single Line Card Reload • RPR: Route Processor Redundancy
Initial Supported Platforms • Phases 1 & 2 • Cisco 12000 • Cisco 10000 ESR • Cisco 7500 • Phase 3 • Cisco 6500/7600 • Future • C7300, AS5850, MGX8850, C10000ubr
RPR+ Evolution • High System Availability (HSA): Two RPs; if the Active RP fails, the system reboots and the Standby becomes active. • RPR: Two RPs; the Standby becomes active very quickly. However, line cards are reloaded.
RPR+ Evolution • RPR+: Two RPs, Standby becomes active very quickly and without reloading line cards. • RPR+ is a stepping stone for SSO and NSF
Stateful Switchover (SSO) • RPR+ maintains link state, but session state (e.g. Frame Relay, PPP, ATM, MPLS) is lost during RP switchover. The result is “dropped calls” and time spent re-establishing connections.
Stateful Switchover (SSO) • Stateful Switchover passes state information from the Active RP to the Standby RP, so sessions are maintained during an RP switchover.
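The progression from HSA through RPR and RPR+ to SSO can be summarized as a small lookup of what an RP failure costs in each mode. This is only an illustrative model built from the slide descriptions; the class, field names, and the two inferences marked in the comments are assumptions, and nothing here corresponds to an IOS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mode:
    name: str
    fast_takeover: bool       # Standby becomes active very quickly
    line_cards_stay_up: bool  # line cards are not reloaded on switchover
    sessions_kept: bool       # session state survives the switchover


MODES = {
    m.name: m
    for m in (
        # HSA: the system reboots; line cards reloading is inferred from that.
        Mode("HSA", fast_takeover=False, line_cards_stay_up=False, sessions_kept=False),
        # RPR: quick takeover, but line cards reload; lost sessions are inferred.
        Mode("RPR", fast_takeover=True, line_cards_stay_up=False, sessions_kept=False),
        # RPR+: line cards stay up and link state is kept, session state is lost.
        Mode("RPR+", fast_takeover=True, line_cards_stay_up=True, sessions_kept=False),
        # SSO: state is passed to the Standby RP, so sessions are maintained.
        Mode("SSO", fast_takeover=True, line_cards_stay_up=True, sessions_kept=True),
    )
}


def impact(mode_name):
    """List the visible effects of an RP failure in the given mode."""
    mode = MODES[mode_name]
    effects = []
    if not mode.fast_takeover:
        effects.append("system reboots")
    if not mode.line_cards_stay_up:
        effects.append("line cards reload")
    if not mode.sessions_kept:
        effects.append("sessions must be re-established")
    return effects


for name in MODES:
    print(name, "->", impact(name) or ["no session loss"])
```

Each step in the evolution removes one item from the list, which is why RPR+ is described as a stepping stone for SSO.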
SSO Protocol Support • Initial • PPP, cHDLC, ATM, Frame Relay • Now being developed • MLPPP • MPLS VPN and TE • Planned • Multicast • Looking for input
NSF Protocol Support • Initially: OSPF and BGP • Immediately after: IS-IS • Then: EIGRP (for initial C6500 support, but will support appropriate router platforms as well)
Standards • All work has been submitted to the IETF • ISIS - draft-shand-isis-restart-00.txt • BGP - draft-ietf-idr-restart-01.txt
Find it on the Web: Learn More About HA
• High System Availability (HSA @ C7500):
http://www.cisco.com/warp/public/cc/pd/rt/7500/prodlit/haibd_ov.htm
http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/12cgcr/fun_c/fcprt3/fc_hsa.htm
http://www.cisco.com/warp/partner/synchronicd/cc/pd/iosw/iore/iore111/prodlit/hsa1_in.htm
• Whitepaper on High Availability on Cat6k:
http://www.cisco.com/warp/partner/synchronicd/cc/pd/si/casi/ca6000/tech/hafc6_wp.htm
• High Availability @ the Edge (C10000):
http://www.cisco.com/warp/partner/synchronicd/cc/pd/rt/10000/prodlit/c1hae_wp.htm
• Route Processor Redundancy Plus (C12000):
http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/120newft/120limit/120st/120st17/rpr_plus.htm
© 2001, Cisco Systems, Inc. All rights reserved.