High Availability High Performance Systems: an e-business perspective
Digitask Consultants, Inc. digitask@digitask.com (212) 682-6652
What is High Availability?
Availability Classification          Uptime Level (%)   Annual Downtime
Fault Tolerant                       99.9999            < 1 minute
Extremely High Availability          99.999             5 minutes
Fault Resilient High Availability    99.99              53 minutes
High Availability                    99.9               8.8 hours
Commercial Availability              99.5               43.8 hours
Sources: Gartner Group, Transaction Processing Performance Council, Compaq
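The downtime figures follow directly from the uptime percentages. A minimal sketch of the arithmetic, assuming a 365-day year (8,760 hours); the function name is ours and does not come from the cited sources:

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year, 8,760 hours


def annual_downtime_hours(uptime_percent: float) -> float:
    """Hours per year a system is unavailable at the given uptime level."""
    return (1.0 - uptime_percent / 100.0) * HOURS_PER_YEAR


# Print the downtime for each uptime level listed on the slide.
for level in (99.9999, 99.999, 99.99, 99.9, 99.5):
    hours = annual_downtime_hours(level)
    print(f"{level}% uptime: {hours:.2f} hours ({hours * 60:.1f} minutes) per year")
```

The slide's figures are these values rounded, for example 8.76 hours shown as 8.8 hours.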
Opportunity! Why Should I Care?
Average Cost per Hour of Downtime
Industry         Application             Cost of Downtime
Financial        Brokerage operations    $6,500,000
Financial        Credit card sales       $2,600,000
Media            Pay per view            $150,000
Retail           Home shopping (TV)      $113,000
Retail           Catalog sales           $90,000
Transportation   Airline reservations    $89,500
Source: Gartner Group and Contingency Planning Research
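To put these hourly figures in context, they can be combined with the annual downtime implied by an availability level. This is an illustrative calculation, not one from the cited sources: it assumes an 8,760-hour year and that the average hourly cost applies to every hour of outage.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: 8,760-hour year

# Average cost per hour of downtime (US dollars), taken from the table above.
COST_PER_HOUR = {
    "Brokerage operations": 6_500_000,
    "Credit card sales": 2_600_000,
    "Pay per view": 150_000,
    "Home shopping (TV)": 113_000,
    "Catalog sales": 90_000,
    "Airline reservations": 89_500,
}


def annual_downtime_cost(cost_per_hour: float, uptime_percent: float) -> float:
    """Estimated yearly cost of downtime at a given uptime level."""
    downtime_hours = (1.0 - uptime_percent / 100.0) * HOURS_PER_YEAR
    return cost_per_hour * downtime_hours


# Estimate the yearly exposure of each application at 99.9% uptime.
for application, cost in COST_PER_HOUR.items():
    estimate = annual_downtime_cost(cost, 99.9)
    print(f"{application}: ${estimate:,.0f} per year at 99.9% uptime")
```

Under these assumptions, moving from 99.9% to 99.99% uptime cuts each estimate by a factor of ten.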
Top 6 Reasons for Server Failure
• Software defects/failures
• Planned administrative downtime (OS upgrades, DB administration, etc.)
• Operator error
• Hardware outage/maintenance
• Building/site disaster (fires, sprinkler systems)
• Metropolitan disaster (storms, floods)
Source: Gartner Group survey of IS managers
The Solution... Compaq Clusters
• OpenVMS
• Tru64 UNIX TruClusters
• Windows NT
TruClusters - Yesterday (V1.x)
[Diagram: layered stack, each layer building upon the other. Foundation: AlphaServer Systems, Tru64 UNIX. Available Server: availability. Production Server: performance & availability.]
TruCluster 1.x
[Diagram: cluster members joined by the Memory Channel interconnect, each with its own system disk and private disks, plus HSZx0 storage controllers]
TruCluster 5
[Diagram: cluster members joined by the Memory Channel interconnect]
TruCluster Server Version 5.0
• Single system image cluster
• Shared file system
• Dramatically easier management
• Simpler application availability and scalability
TruCluster 5.0 Feature Summary
[Figure: shared directory tree: /, /usr, /var, ...]
• Easier management
• Clusterwide file system
• Cluster alias
• Application availability facility
• Clusterwide storage
• Support for larger & more flexible configurations
  • No requirement for symmetric configurations
  • No need for private storage (all storage can be on shared buses)
Tru64 UNIX System Management
[Diagram: UNIX workstation (X11), Web / Java, and PC clients manage a single system (LAN) or a cluster (LAN / WAN) through Tru64 UNIX Management, using SNMP, WBEM, CLI, and scripts]
Cluster Management
The best cluster management is the management you NEVER have to do!
[Figure: directory trees (/, /usr, /var, ...) for TruCluster V5.0 and for traditional UNIX clusters]
Cluster File System
[Figure: shared directory tree: /, /usr, /var, ...]
• Single cluster-wide namespace with a single shared root
• Same view from all cluster members
• Mechanism to address member-specific files
• Client/Server model initially
• Layers on existing file system
  • AdvFS, NFS, UFS (r/o), CDFS
• Transparent file system failover and recovery
• Integrated with cluster alias for NFS server
System and Storage Management
• CFS is an enabling technology
• Most management operations “just work”
  • Single copy of most configuration files
  • Device names are consistent cluster-wide
  • Storage devices are available everywhere
• Fewer things to manage
  • Operating system and applications installed once per cluster
  • Automatic disk and file system failover
• Single security domain
  • Base and enhanced security
System and Storage Management
• BUT… Still must manage some things separately
  • Kernel tuning, process tuning
  • Network adapter, tty configuration
  • Licensing
Cluster Alias
• Cluster appears as single system to network
• Can support multiple aliases
• Single host name to clients
• Transparent handling of node and adapter failures
• Dynamic load balancing
• Network services
• Efficient forwarding over cluster interconnect
[Diagram: clients reach the cluster through a router. Cluster alias canine (1.1.1.0) fronts four AlphaServer members: labrador (1.1.1.1), golden (1.1.1.2), basset (1.1.1.3), bluetick (1.1.1.4). Further labels: Retriever, Hound.]
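The alias behaviour described above can be modelled in a few lines. This is a toy sketch, not the TruCluster implementation: the `ClusterAlias` class, its `mark_down`, `mark_up`, and `route` methods, and the simple round-robin policy (standing in for dynamic load balancing) are assumptions made for illustration. Only the alias and member names come from the slide.

```python
class ClusterAlias:
    """Toy model: one alias name in front of several cluster members."""

    def __init__(self, alias, members):
        self.alias = alias
        self.members = list(members)
        self.up = set(self.members)  # members currently able to take connections
        self._next = 0               # round-robin cursor (hypothetical policy)

    def mark_down(self, member):
        """Record a node or adapter failure."""
        self.up.discard(member)

    def mark_up(self, member):
        """Record that a known member is available again."""
        if member in self.members:
            self.up.add(member)

    def route(self):
        """Return the member that handles the next client connection."""
        if not self.up:
            raise RuntimeError(f"no members of alias {self.alias} are available")
        # Terminates because at least one member is up.
        while True:
            member = self.members[self._next % len(self.members)]
            self._next += 1
            if member in self.up:
                return member


canine = ClusterAlias("canine", ["labrador", "golden", "basset", "bluetick"])
print([canine.route() for _ in range(4)])
canine.mark_down("golden")  # clients keep using the same alias name
print([canine.route() for _ in range(4)])
```

Clients only ever see the alias, so a failed member is skipped without any change on the client side.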
Application Support
• Applications need only be installed once in the cluster, although they may be licensed per node
• Single instance applications
  • May only run on one member of a cluster at a time
  • Multiple copies would conflict with each other
  • Typical old-style ASE applications
Single Instance Applications
[Diagram: three applications on cluster members joined by the Memory Channel interconnect]
Application Support
• Multiple instance applications
  • May run on multiple or all cluster members
  • Multiple copies don’t conflict
  • Some ASE applications can now run on multiple members
Multi-Instance Applications
[Diagram: four application instances on cluster members joined by the Memory Channel interconnect]
Cluster Application Availability
• Provides application failover or restart within the cluster
• Application and resource dependencies
  • Application profile determines failover policy and dependencies
• Mechanism for application-specific monitoring
  • Monitoring of applications via ‘check’ entry in action script
• Command line and GUI-based management
• ASE application start/stop scripts easily migrate with minimal changes
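The failover behaviour listed above can be sketched as a small model. Everything here is hypothetical: the `AppProfile` fields, the `check` callable standing in for the action script's 'check' entry, and the `place` and `monitor` functions are illustrative names, not the TruCluster profile format or command set.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class AppProfile:
    """Hypothetical application profile: failover order plus dependencies."""
    name: str
    preferred_members: List[str]
    depends_on: List[str] = field(default_factory=list)


def place(profile: AppProfile, members_up: Set[str], running: Set[str]) -> Optional[str]:
    """Pick a member for the application, or None if it cannot run yet."""
    # Every dependency must already be running somewhere in the cluster.
    if any(dep not in running for dep in profile.depends_on):
        return None
    # Take the first preferred member that is currently up.
    for member in profile.preferred_members:
        if member in members_up:
            return member
    return None


def monitor(profile: AppProfile, current: str, check: Callable[[str], bool],
            members_up: Set[str], running: Set[str]) -> Optional[str]:
    """Stay put while the check passes; otherwise fail over to another member."""
    if current in members_up and check(current):
        return current
    return place(profile, members_up - {current}, running)


db = AppProfile("db", preferred_members=["labrador", "golden"])
up = {"labrador", "golden"}
print(place(db, up, running=set()))
# Simulate a failed check on the current member, forcing a failover.
print(monitor(db, "labrador", lambda member: False, up, running=set()))
```

A restart-in-place policy would be an equally valid choice; this sketch only shows failover to the next preferred member.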
Application Support
• Cluster aware applications
  • Use cluster features such as the Distributed Lock Manager
  • Coordinate storage r/w access from multiple nodes
Multi-Instance Applications
[Diagram: four application instances on cluster members joined by the Memory Channel interconnect]
TruCluster Advantages Over Other UNIX Clusters
• Load Balancing
  • Dynamic load balancing of client connections
• Cluster Management
  • Applications are installed once for entire cluster
  • Configuration changes made once for the cluster
  • Users are authorized once for all cluster nodes
• Installation and Configuration
  • Rolling upgrade of o/s
• Single system image
  • Cluster-wide file system
  • Cluster alias
  • Single security domain
  • Cluster-wide naming of storage devices
  • Single event manager/error log
TruCluster Advantages Over Other UNIX Clusters
Hardware specifics
• Interconnect speed (6-12x faster)
• Maximum number of nodes: 8
• Largest node supported: ~125,000 tpm-C
• Smallest node: AlphaServer 800 (<$7,000)
• Support for Switched Fibre Channel
• Support for simultaneous direct access to database tables
• Available API for parallel resource locking
• Available up-time guarantee 99.99%
Bottom Line - Management
[Chart: management cost ($) against number of nodes for single systems, traditional UNIX clusters, and TruCluster Server V5.0]
Tru64 UNIX TruClusters cost less to manage
Bottom Line - Reliability
Uptime Guarantees: 99.99% Plus, a joint effort by COMPAQ & the customer, for eligible Alpha systems
[Diagram: service packages plotted by intimacy of partnership against customer need for high availability. Labels: Installation, Priority, Priority Premier Package, Executive Package, Business Critical, Custom, Availability Review, On-site Spares.]
Tru64 UNIX TruClusters are more reliable
Bottom Line - Size
Tru64 UNIX TruClusters are more scalable
Bottom Line - Industry Opinion TruCluster V5.0 "Nines are necessary, but not sufficient. Simple, straightforward use is also vital... Here Compaq has excelled, going the distance in building multi-system scalability, reliability, and manageability into the heart of UNIX." Jonathan Eunice, Illuminata, Inc., 4/99
Thank You John Zimmerman johnz@digitask.com (212) 682-6652