Clustering & Fibre Channel: The Next Wave in PC Computing
Trends in Clustering • Today, clusters are a niche Unix market, but Microsoft will bring clusters to the masses • Microsoft has announced NT clusters • SCO has announced UnixWare clusters • Sun has announced Solaris/Intel clusters • Novell has announced Wolf Mountain clusters • In 1998, 2M Intel servers will ship, 100K of them in clusters • In 2001, 3M Intel servers will ship, 1M of them in clusters (IDC's forecast) • Clusters will be a huge market, and RAID is essential to clusters • Fibre Channel and storage area networks are ideal for use in clusters
Why Is the IT Market Moving to Clustering? • Growth in On-Line Applications • E-Commerce • On-Line Transaction Processing • Web Servers • Real-Time Manufacturing • Servers Must Be On Line • 24 Hours a Day, 7 Days a Week, 365 Days a Year • New Levels of Fault Tolerance • Performance Must Scale "Without Our Data, Our Business Is Dead"
What Are Clusters? • A group of independent servers that: • Function as a single system • Appear to users as a single system • Are managed as a single system • Clusters are "virtual servers"
Why Are Clusters Important? • Clusters Improve System Availability • Clusters Enable Application Scaling • Clusters Simplify System Management • Clusters Are A Superior Server Solution
Clusters Improve System Availability • When a networked server fails, the service it provided is down. • When a clustered server fails, the service it provided "fails over" and downtime is avoided. [Diagram: separate Mail and Internet servers vs. a cluster providing Mail & Internet]
Clusters Enable Application Scaling • With networked SMP servers, application scaling is limited to a single server • With clusters, applications scale across multiple SMP servers (typically up to 16 servers) [Diagram: Oracle, E-Mail, and Voice mail applications spread across the cluster's servers]
Clusters Simplify System Management • Clusters present a Single System Image; the cluster looks like a single server to management applications • Hence, clusters reduce system management costs [Diagram: three management domains vs. one management domain]
An Analogy to RAID • RAID makes disks fault tolerant • Clusters make servers fault tolerant • RAID increases I/O performance • Clusters increase compute performance • RAID (with GAM) makes disks easier to manage • Clusters make servers easier to manage
Two Flavors of Clusters • High Availability Clusters • Microsoft's Wolfpack 1 • Compaq’s Recovery Server • Load Balancing Clusters (a.k.a. Parallel Application Clusters) • Microsoft’s Wolfpack 2 • Digital’s VAXClusters Note: Load balancing clusters are a superset of high availability clusters.
Available Clustering Software • Failover Clusters • Microsoft Cluster Server (NT Enterprise Server 4.0) • Compaq's Recovery Server • NCR LifeKeeper • Digital NT Clusters • UnixWare 7 Reliant • NSI Double-Take (failover across wide area networks) • Vinca's StandbyServer (Novell) • Load Balancing Clusters • Microsoft Cluster Server (NT 5.0) • Digital's VAXClusters • IBM's HACMP • Oracle's Parallel Server • Novell's Orion
Failover Example • Two-node clusters (node = server) • During normal operations, both servers do useful work • Failover • When a node fails, applications "fail over" to the surviving node, which assumes the workload of both nodes (sketched below) [Diagram: Mail and Web services consolidate onto the surviving node as Mail & Web]
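The failover step above amounts to reassigning service ownership from the failed node to a survivor. The following is a minimal illustrative Python sketch of that bookkeeping; the node and service names are invented, and this is not how any actual cluster product (e.g. Wolfpack) is implemented.

```python
# Minimal failover sketch (illustrative only): each service is owned by one
# node; when that node is marked failed, ownership moves to a survivor.

services = {"mail": "node_a", "web": "node_b"}   # service -> owning node
nodes_up = {"node_a", "node_b"}

def fail_node(failed: str) -> None:
    """Mark a node failed and move its services to a surviving node."""
    nodes_up.discard(failed)
    if not nodes_up:
        raise RuntimeError("no surviving nodes: service is down")
    survivor = next(iter(nodes_up))
    for service, owner in services.items():
        if owner == failed:
            services[service] = survivor   # the "failover"

fail_node("node_a")
print(services)   # {'mail': 'node_b', 'web': 'node_b'}
```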
Load Balancing Example • Multi-node clusters (two or more nodes) • Load balancing clusters typically run a single application, e.g. a database, distributed across all nodes • Cluster capacity is increased by adding nodes (but, like SMP servers, scaling is less than linear) [Diagram: cluster throughput labels of 3,000 TPM and 3,600 TPM]
Load Balancing Example (cont'd) • Clusters rebalance the workload when a node fails • If different apps are running on each server, they fail over to the least busy server, or as directed by preset failover policies (see the sketch below)
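The "least busy or preset policy" rule can be made concrete with a small sketch. Everything here (loads, app names, the policy table) is a made-up illustration of the rule as stated on the slide, not vendor code.

```python
# Illustrative failover-target rule: honor a preset policy if one exists,
# otherwise pick the least busy surviving node.

node_load = {"node_a": 0.70, "node_b": 0.30, "node_c": 0.50}  # fraction busy
failover_policy = {"oracle": "node_c"}   # app -> preferred failover target

def pick_target(app: str, failed_node: str) -> str:
    """Choose where an app should restart after its node fails."""
    survivors = {n: load for n, load in node_load.items() if n != failed_node}
    preferred = failover_policy.get(app)
    if preferred in survivors:
        return preferred                      # policy-directed failover
    return min(survivors, key=survivors.get)  # least busy survivor

print(pick_target("email", "node_a"))    # node_b (least busy)
print(pick_target("oracle", "node_a"))   # node_c (policy wins)
```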
Sharing Storage in Clusters: Two Models • #1. The "Shared Nothing" model • Microsoft's Wolfpack clusters • #2. The "Shared Disk" model • VAXClusters (the two models are contrasted in the sketch below)
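As a rough illustration of the difference, under simplifying assumptions: in a shared-nothing cluster each disk is owned by exactly one node at a time, while in a shared-disk cluster any node may access any disk, so accesses must be serialized by a lock manager. In this sketch a thread lock stands in for a distributed lock manager, and all names are hypothetical.

```python
import threading

# Shared nothing (Wolfpack-style): ownership is exclusive; failover moves it.
disk_owner = {"disk1": "node_a", "disk2": "node_b"}

def shared_nothing_write(node: str, disk: str) -> None:
    if disk_owner[disk] != node:
        raise PermissionError(f"{node} does not own {disk}")
    print(f"{node} writes {disk} directly")

# Shared disk (VAXClusters-style): any node may write, under a cluster lock.
disk_locks = {"disk1": threading.Lock(), "disk2": threading.Lock()}

def shared_disk_write(node: str, disk: str) -> None:
    with disk_locks[disk]:   # stand-in for a distributed lock manager
        print(f"{node} writes {disk} under lock")

shared_nothing_write("node_a", "disk1")   # ok: node_a owns disk1
shared_disk_write("node_b", "disk1")      # ok: any node, serialized by lock
```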
What is Fibre Channel? • Fibre Channel is a high-performance, multiple-protocol data transfer technology whose primary task is to transport data extremely fast with the least possible delay. It is a serial interconnect standard that allows servers, storage devices, and workstation users to share large amounts of data quickly. Fibre Channel gives networks superior data transfer speeds, flexible topology, and flexible upper-level protocols, and it easily handles both networking and peripheral I/O communication over a single channel.
Key Features of Fibre Channel • Data transfer rates up to 100 MB/sec per connection • Three Fibre Channel connection schemes: • Point-to-Point: provides a single connection between two servers, or between a server and its RAID storage • Switched Fabric: each server or RAID storage system connects point-to-point to a Fibre Channel switch; this method allows the construction of massive data storage and server networks • Loop: connects up to 126 servers, RAID systems, or other storage devices in a loop topology
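For quick reference, the three connection schemes can be captured as data. The device counts come from the bullets above; the rest is illustrative packaging only.

```python
from dataclasses import dataclass

@dataclass
class FcTopology:
    name: str
    devices: str
    use_case: str

# Figures taken from the slide bullets; descriptions paraphrased.
TOPOLOGIES = [
    FcTopology("point-to-point", "2", "one server to another, or to its RAID storage"),
    FcTopology("switched fabric", "bounded by switch ports", "massive storage and server networks"),
    FcTopology("arbitrated loop", "up to 126", "servers and RAID systems sharing one loop"),
]

for t in TOPOLOGIES:
    print(f"{t.name:>16}: {t.devices} devices ({t.use_case})")
```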
Key Features of Fibre Channel (cont'd) • Long-distance connections: up to 10 kilometers • High-reliability connections providing high data integrity • Low overhead • Support for multiple media types: • Copper: lower cost, less reliable than fibre optic cable, distances up to 20 meters between nodes • Fibre optic: higher cost, more reliable than copper, distances up to 10,000 meters between nodes • Allows the creation of Storage Area Networks
Benefits of Fibre Channel • Runs multiple protocols over a single network or loop • Ideal solution for clustering of servers • Enables multiple servers to share storage and RAID systems • Scales from a few peripherals attached a short distance apart to large numbers attached many kilometers apart
Benefits of Fibre Channel (cont'd) • Delivers speeds 2.5 to 250 times faster than existing communication and I/O interfaces • Overcomes today's network performance limitations in error detection and recovery • Provides low-cost, reliable performance for long-distance connections • Offers flexible protocols and topologies to best leverage existing technology investments
Cluster Interconnect • How servers are tied together, and how disks are physically connected to the cluster
Cluster Interconnect • Clustered servers always have a client network interconnect, typically Ethernet, to talk to users, and at least one cluster interconnect to talk to other nodes and to disks. [Diagram: client network plus a cluster interconnect linking the nodes' HBAs to a RAID system]
Cluster Interconnect (cont'd) • Or they can have two cluster interconnects: • One for nodes to talk to each other: the heartbeat interconnect, typically Ethernet (sketched below) • And one for nodes to talk to disks: the shared disk interconnect, typically SCSI or Fibre Channel [Diagram: NICs on the cluster (heartbeat) interconnect; HBAs on the shared disk interconnect to the RAID system]
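The heartbeat interconnect carries periodic "I am alive" messages; when they stop, the surviving node begins failover. Below is a hedged sketch of that idea using plain UDP. The address, port, and timing values are invented, and real cluster software (e.g. Wolfpack) uses its own wire protocol; in practice each node runs both roles at once.

```python
import socket
import time

PEER = ("10.0.0.2", 9999)   # peer node on the heartbeat network (assumed)
INTERVAL_S = 1.0            # send a heartbeat every second
TIMEOUT_S = 3.0             # declare the peer dead after 3 silent seconds

def send_heartbeats(sock: socket.socket) -> None:
    """Run on each node: tell the peer we are alive."""
    while True:
        sock.sendto(b"alive", PEER)
        time.sleep(INTERVAL_S)

def monitor_peer(sock: socket.socket) -> None:
    """Run on each node: if heartbeats stop, begin failover."""
    sock.settimeout(TIMEOUT_S)
    while True:
        try:
            sock.recvfrom(16)   # any datagram counts as a heartbeat
        except socket.timeout:
            print("peer missed heartbeats: initiate failover")
            break
```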
NT Cluster with Host-Based RAID Array • Each node has: • An Ethernet NIC for the heartbeat • Private system disks (HBA) • A PCI-based RAID controller, SCSI or Fibre, such as Mylex's eXtremeRAID™ • Nodes share access to data disks but do not share data [Diagram: "heartbeat" interconnect between NICs; HBAs and RAID controllers on the shared disk interconnect]
NT Cluster with SCSI External RAID Array • Each node has: • An Ethernet NIC for the heartbeat • Multi-channel HBAs connecting the boot disk and the external array • A shared external RAID controller on the SCSI shared disk interconnect [Diagram: "heartbeat" interconnect between NICs; both nodes' HBAs on the SCSI shared disk interconnect to one external RAID system]
NT Cluster with Fibre External RAID Array • Fibre Channel host to Ultra2 SCSI LVD RAID system • Fibre Channel host to Fibre Channel RAID system [Diagram: "heartbeat" interconnect between NICs; HBAs on a Fibre Channel shared disk interconnect to the external RAID system]
Internal vs. External RAID in Clustering • Internal RAID • Lower-cost solution • Higher performance in read-intensive applications • Proven TPC-C performance enhances cluster performance • External RAID • Higher performance in write-intensive applications, since write-back cache is turned off in PCI RAID controllers • Higher connectivity: attach more disk drives • Greater footprint flexibility • Until PCI RAID implements Fibre
Fibre Channel Host to Ultra2 LVD RAID Controller in a Clustered Loop • Active/Active configuration • Single FC array interconnect [Diagram: two nodes, each with an FC HBA and an HBA, connected to dual DAC SX controllers]
Fibre Channel Host to Ultra2 LVD RAID Controller in a Clustered Environment • Active/Active duplex configuration • Dual FC array interconnect with FC disk interconnect [Diagram: nodes with FC HBAs connected to DAC SF (or FL) controllers over dual FC array interconnects]
Fibre Host, Active/Active Duplex • Single FC array interconnect [Diagram: nodes with FC HBAs connected to dual DAC FF controllers]
FC-to-FC in a Cluster • Dual FC array interconnect [Diagram: nodes with FC HBAs connected to dual DAC FF controllers over dual FC array interconnects]
Mylex: The RAID Clustering Experts • eXtremeRAID™ 1100 • AcceleRAID™ 250 • DAC SF • DAC FL • DAC FF