Linux clustering Workshop 4

Linux clustering Workshop 4 Dr. Zahid Anwar

Simplified Architecture of Linux Cluster
[Figure: Simplified architecture of a single computer]
[Figure: Simplified architecture of an enterprise cluster]

No Single Point of Failure
An enterprise cluster should always have the following characteristic: “Any computer within the cluster, or any computer the cluster depends upon for normal operation, can be rebooted without rebooting the entire cluster.” This can be achieved, for example, by building high-availability server pairs.

Clustering Terminology
• Process: a program when it runs
• Daemon: a process running on a Linux system
• Service: a daemon and the effects it produces
• Resource: a service combined with its operating environment (config files, data, network)
• Fail-Over: when a resource moves from one computer to another
• High Availability: a proper failover configuration that has no single point of failure

Types of Clusters
Originally, "clusters" and "high-performance computing" were synonymous. Today, the meaning of the word "cluster" has expanded beyond high-performance computing to include
• high-availability (HA) clusters and
• load-balancing (LB) clusters

Types of Clusters
• High-availability clusters
  • also called failover clusters
  • used in mission-critical applications
  • the key to high availability is redundancy
• Load-balancing clusters
  • provide better performance by dividing the work among the nodes
  • this might be accomplished using a simple round-robin algorithm
  • for example, round-robin DNS
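The round-robin idea mentioned above can be sketched in a few lines. This is only an illustration of the rotation itself, not of how any DNS server implements it; the class name `RoundRobinBalancer` and the server addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order, ignoring their load."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(list(servers))

    def next_server(self):
        # Each call returns the next server in the rotation.
        return next(self._rotation)


# Hypothetical addresses, used only for illustration.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
assignments = [balancer.next_server() for _ in range(5)]
print(assignments)
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1', '10.0.0.2']
```

Note that the rotation never looks at how busy a server is, which is the main weakness of plain round-robin.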

Terminology
• Parallel computing
  • Tightly coupled sets of computation
  • E.g. several pieces of data are being processed simultaneously in the same CPU
  • Homogeneous collection of computers
• Distributed computing
  • Computing that spans multiple machines or multiple locations
  • Heterogeneous collection
• Cluster computing
  • A form of distributed computing
  • Generally restricted to computers on the same subnetwork or LAN
• Grid computing
  • Frequently describes computers working together across a WAN or the Internet
  • Much larger in scale
  • Tends to be used more asynchronously
  • Has much greater access, authorization, accounting, and security concerns
• Peer-to-peer
  • Data or file sharing (Napster, Gnutella, or Kazaa)
  • SETI@Home

Building an HA Cluster using Heartbeat
• Heartbeat: provides the ability to fail over a resource from one computer to another
• How it functions
  • Tell Heartbeat which computer owns a particular resource (define a primary and a backup server).
  • The Heartbeat daemon on the backup server listens to the "heartbeats" coming from the primary server.
  • If the backup server does not hear the primary's heartbeat, it initiates a failover and takes ownership of the resource.
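The listening logic on the backup server can be sketched as a small timeout check. This is a minimal model of the idea only, not Heartbeat's actual code: the class `BackupMonitor`, the parameter `deadtime_seconds`, and the simulated timestamps are all assumptions made for the example.

```python
class BackupMonitor:
    """Backup-side view of a two-node failover pair (illustrative only)."""

    def __init__(self, deadtime_seconds):
        # How long the primary may stay silent before it is declared dead.
        self.deadtime_seconds = deadtime_seconds
        self.last_heartbeat = None
        self.owns_resource = False

    def heartbeat_received(self, now):
        # Record the time of the most recent heartbeat from the primary.
        self.last_heartbeat = now

    def check(self, now):
        # Return True once the backup owns the resource.
        if self.owns_resource or self.last_heartbeat is None:
            return self.owns_resource
        if now - self.last_heartbeat > self.deadtime_seconds:
            self.owns_resource = True  # failover: backup takes the resource
        return self.owns_resource


# Simulated clock values in seconds (hypothetical).
monitor = BackupMonitor(deadtime_seconds=10)
monitor.heartbeat_received(now=0)
monitor.heartbeat_received(now=2)
print(monitor.check(now=5))   # False: primary was heard 3 s ago
print(monitor.check(now=13))  # True: silent for 11 s, backup takes over
```

Passing the time in as an argument keeps the sketch deterministic; a real daemon would read a clock and also start the service after taking ownership.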

The Physical Paths of Heartbeat
Heartbeat is normally configured to work over a separate physical connection between two servers. The separate physical connection can be either
• a serial cable or
• another Ethernet network connection (via a crossover cable or mini hub).
Adds extra traffic to your network

Heartbeat Control Messages
• 3 basic kinds
• Heartbeats (status messages)
  • Typically 150 bytes
  • Sent by broadcast, unicast, or multicast
• Cluster transition messages
  • Relatively rare
  • Contain the conversation between daemons needed to move resources
  • ip-request: asks the other server to release ownership of the resource
  • ip-request-resp: the reply, sent once the other server has shut off the service and no longer owns the resource
  • On receiving ip-request-resp, the requesting server starts up the service and offers it to clients
• Retransmission requests
  • rexmit: a request for retransmission of a heartbeat control message, sent when one of the servers notices that it is receiving heartbeat control messages out of sequence
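The out-of-sequence check behind a retransmission request can be sketched as follows. The sketch assumes each control message carries an integer sequence number starting at 1; the class `SequenceTracker` and its fields are hypothetical and do not reflect Heartbeat's real message format.

```python
class SequenceTracker:
    """Detect gaps in incoming sequence numbers (illustrative only)."""

    def __init__(self):
        self.expected = 1          # next sequence number we expect to see
        self.rexmit_requests = []  # sequence numbers we asked to be resent

    def receive(self, seq):
        # Return the list of sequence numbers to request again.
        if seq > self.expected:
            # A gap: everything between expected and seq was missed.
            missing = list(range(self.expected, seq))
            self.rexmit_requests.extend(missing)
            self.expected = seq + 1
            return missing
        if seq == self.expected:
            self.expected += 1
        # seq < expected is a late or retransmitted message: nothing to request.
        return []


tracker = SequenceTracker()
results = [tracker.receive(seq) for seq in (1, 2, 5, 3)]
print(results)                  # [[], [], [3, 4], []]
print(tracker.rexmit_requests)  # [3, 4]
```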

Secondary IP Addresses (Virtual IPs)
A method for adding multiple IP addresses to the same physical network card. When you use Heartbeat to offer services, it does so using secondary IP addresses.

Lab Exercise
• Set up a 2-node cluster
• Configure a highly available web server

Load Balancing using Ultra Monkey (LVS)
• Linux Virtual Server (LVS) enables TCP/UDP connections to be load balanced.
• This mechanism of connection control is referred to as Layer 4 switching: IP address (Layer 3) and port (Layer 4) information is used.
• The host that LVS runs on is referred to as the Linux-Director (a specialized router).
• Packets received for a virtual service by the Linux-Director are routed to a real server chosen by a scheduling algorithm.
  • Subsequent packets for the same connection are sent to the same real server.
• Advantages of a load balancer over round-robin DNS
  • Directs requests to less-loaded nodes
  • Accounts for sessions (e.g. forum software, shopping carts)
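The two behaviours described above, choosing a real server for a new connection and keeping later packets of that connection on the same server, can be sketched with a connection table. This is a toy model, not LVS code: the class `Director`, the server names, and the client addresses are hypothetical, and least-connections is used here as just one possible scheduling policy.

```python
class Director:
    """Toy connection table with least-connections scheduling."""

    def __init__(self, real_servers):
        # Number of active connections per real server.
        self.active = {server: 0 for server in real_servers}
        # Maps (client IP, client port) to the chosen real server.
        self.connections = {}

    def route(self, client_ip, client_port):
        key = (client_ip, client_port)
        if key not in self.connections:
            # New connection: pick the server with the fewest active connections.
            server = min(self.active, key=self.active.get)
            self.connections[key] = server
            self.active[server] += 1
        # Known connection: always return the server chosen earlier.
        return self.connections[key]

    def close(self, client_ip, client_port):
        server = self.connections.pop((client_ip, client_port), None)
        if server is not None:
            self.active[server] -= 1


director = Director(["rs1", "rs2"])
first = director.route("198.51.100.7", 40001)   # new connection
second = director.route("198.51.100.8", 40002)  # new connection
again = director.route("198.51.100.7", 40001)   # same connection as first
director.close("198.51.100.8", 40002)
third = director.route("198.51.100.9", 40003)   # goes to the idle server
print(first, second, again, third)  # rs1 rs2 rs1 rs2
```

Unlike a plain rotation, the choice here depends on current load, which is the advantage over round-robin DNS that the slide points out.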
