Clustering Types of Clustering
Objectives At the end of this module the student will understand the following tasks and concepts. • What clustering is and why you would want it • Clustering options • Differences between various types of clustering; advantages and disadvantages • Factors to consider when choosing a cluster type
What is a cluster? • My definition • Multiple systems performing a single function • Black box
Why Cluster? • Performance • Availability • Recoverability
Features • Speedup • Faster response times • Transactions finish faster • Scaleup • More work done • More capacity, more concurrent transactions • Scalability
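Speedup and scaleup can be put into numbers. The sketch below assumes the usual definitions (speedup compares elapsed time for the same workload, scaleup compares the amount of work finished in the same time); the function names and sample figures are illustrative and do not come from the slides.

```python
def speedup(baseline_seconds: float, clustered_seconds: float) -> float:
    """Same workload: how many times faster the cluster finishes it."""
    return baseline_seconds / clustered_seconds


def scaleup(baseline_transactions: float, clustered_transactions: float) -> float:
    """Same time window: how many times more work the cluster completes."""
    return clustered_transactions / baseline_transactions


# Hypothetical measurements: one node versus a small cluster.
print(speedup(120.0, 40.0))   # a 120 s job now takes 40 s
print(scaleup(1000, 3500))    # 1000 tx/hour grew to 3500 tx/hour
```

A cluster scales well when these ratios stay close to the number of nodes added.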
Single Node Scaling • Scales to multiple CPUs • Doesn’t scale beyond one node • Multiple single points of failure [Diagram: Users, one Server, Database]
Cluster Definitions • Shared Nothing (Federated) • Replicated Site • Shared Disk • Failover • Active/Passive • Active/Active • Shared Everything
Shared Nothing Cluster • Only one CPU is connected to a disk • May have shared memory • MPP Systems are Shared Nothing • Other vendors have “Shared Nothing” clusters
Federated (Shared Nothing) Cluster • Distributed database (separate database on each machine) • Data is spread across nodes; each machine has part of the data • Function is spread across nodes • Two-Phase Commit [Diagram: two Servers, each with its own Database, exchanging the numbered messages 1. “Got it?” 2. “Got it!” 3. “Good!”]
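The two-phase commit named on this slide can be sketched as a coordinator that first collects a vote from every node and only then tells all of them to commit. This is a minimal in-memory illustration, not a real distributed transaction manager; the `Participant` class and its fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """One node of the federated database (hypothetical stand-in)."""
    name: str
    can_commit: bool = True
    state: str = "idle"

    def prepare(self) -> bool:
        # Phase 1: the node votes on whether it is able to commit.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self) -> None:
        # Phase 2 (success path): make the change permanent.
        self.state = "committed"

    def rollback(self) -> None:
        # Phase 2 (failure path): undo the pending change.
        self.state = "aborted"


def two_phase_commit(participants: list) -> bool:
    """Commit on every node, or on none of them."""
    votes = [p.prepare() for p in participants]  # ask every node first
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.rollback()
    return False


nodes = [Participant("server1"), Participant("server2")]
print(two_phase_commit(nodes), [p.state for p in nodes])
```

If any single node votes no, every node rolls back, which is what keeps the separate databases consistent with one another.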
Replicated System • Data replicated at the server (network) level or at the storage (SAN) level • Multiple copies of the same database • Most common implementation is Active/Passive • Failover between nodes [Diagram: Active Node and Passive Node, each with its own Database, linked by server-level or storage-level replication]
Shared Disk Cluster • Shared file system • Multiple systems attached to the same disk • All nodes must have access to data • Only one database instance; only one node has “ownership” of the shared disk • Synchronization between systems • If one node fails, the other takes over
Cluster Interconnect • Most Shared Disk clusters require some form of Cluster Interconnect • Network – e.g. Gigabit Ethernet • Specialized – e.g. InfiniBand, Myrinet • Most clusters implement a “heartbeat” between cluster nodes to monitor node health • Multiple nodes require a switch • Usually separated from the LAN • Some shared disk clusters implement a “heartbeat” mechanism to a quorum disk via the SAN in addition to/instead of network heartbeat • Oracle RAC implements Cache Fusion across the interconnect • Extra network traffic increases the throughput requirements • UDP implementation requires a separate network
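The heartbeat idea can be illustrated with a monitor that remembers when each node was last heard from and reports any node that has been silent for too long. This is a simplified sketch: the `HeartbeatMonitor` class is hypothetical, and timestamps are passed in explicitly rather than read from a clock or a network socket.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per node (illustrative, not a vendor API)."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # node name -> time of last heartbeat

    def record(self, node: str, now: float) -> None:
        """Call whenever a heartbeat arrives from a node."""
        self.last_seen[node] = now

    def failed_nodes(self, now: float) -> list:
        """Nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=3.0)
monitor.record("node_a", now=0.0)
monitor.record("node_b", now=0.0)
monitor.record("node_a", now=4.0)      # node_b has gone quiet
print(monitor.failed_nodes(now=5.0))
```

A real cluster would pair this check with a quorum mechanism so that a broken interconnect is not mistaken for a dead node.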
Failover Cluster • One system is a standby system for another • Only one system doing work at a time • Pseudo-Shared Disk • Limited scalability in active/passive mode
Failover Clustering • Fault tolerant systems; highly available • Basic failover clusters don’t scale beyond two nodes [Diagram: Users, two Servers, Database]
Active/Passive vs. Active/Active
• Both are failover only
• Active/Passive
  • One node is active
  • The other is passive until failover
• Active/Active
  • Still uses active/passive technology
  • 2 separate databases
  • One is active on node A and passive on node B
  • The second database is active on node B and passive on node A
  • Separate applications and user connections to each of the different databases
Active/Passive • Node A is active • Node B is passive until/unless Node A fails • Only one Oracle license is required [Diagram: Node A, Node B]
Active/Passive • If Node A fails … [Diagram: Node A marked X, Node B]
Active/Passive • Node B becomes active • Node A is dead (definitely passive!) until repaired and then “failed back” if necessary. [Diagram: Node A marked X, Node B]
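The three Active/Passive slides describe a small state machine: the active node fails, the passive node takes over, and the repaired node waits until someone fails back to it. A minimal sketch, with the hypothetical `ActivePassivePair` class standing in for real cluster software:

```python
class ActivePassivePair:
    """Two-node failover pair (illustrative only)."""

    def __init__(self, primary: str, standby: str) -> None:
        self.active = primary
        self.passive = standby
        self.healthy = {primary: True, standby: True}

    def fail(self, node: str) -> None:
        """Mark a node as failed; promote the standby if the active node died."""
        self.healthy[node] = False
        if node == self.active and self.healthy[self.passive]:
            self.active, self.passive = self.passive, self.active

    def repair(self, node: str) -> None:
        """A repaired node rejoins as passive; roles do not change by themselves."""
        self.healthy[node] = True

    def fail_back(self) -> None:
        """Manually return the workload to the other node if it is healthy."""
        if self.healthy[self.passive]:
            self.active, self.passive = self.passive, self.active


pair = ActivePassivePair("Node A", "Node B")
pair.fail("Node A")
print(pair.active)    # Node B has taken over
pair.repair("Node A")
pair.fail_back()
print(pair.active)    # back on Node A
```

Note that repair alone does not move the workload; failing back is a separate, deliberate step, matching the “if necessary” on the slide.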
Active/Active • Application Group A and User Group A are active on Node A • Application Group B and User Group B are active on Node B • Each node serves as failover for the other. • 2 separate databases. Both nodes are not accessing the same data at the same time. • Oracle license required on each node [Diagram: Node A runs Application A and User Group A and is the passive failover for B; Node B runs Application B and User Group B and is the passive failover for A]
Switchover vs. Failover • Many cluster systems utilize the concept of Service Groups • Service Groups allow granular control of individual software packages (e.g. individual Oracle instances) • An individual group can be manually moved to another server without affecting other service groups – a “switchover” versus a “failover” • Adds greater management flexibility
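The difference between the two operations shows up clearly when Service Group placement is modeled as a mapping from group to node. In this sketch the group and node names are invented for illustration: a switchover moves one chosen group, while a failover moves every group that was on the failed node.

```python
def switchover(placement: dict, group: str, target: str) -> dict:
    """Planned move of a single Service Group; all others stay put."""
    if group not in placement:
        raise KeyError(f"unknown service group: {group}")
    moved = dict(placement)
    moved[group] = target
    return moved


def failover(placement: dict, failed_node: str, target: str) -> dict:
    """Unplanned move of every Service Group hosted on the failed node."""
    return {
        group: (target if node == failed_node else node)
        for group, node in placement.items()
    }


placement = {"oracle_hr": "node1", "oracle_sales": "node1", "web": "node2"}
print(switchover(placement, "oracle_hr", "node2"))
print(failover(placement, "node1", "node2"))
```

Both functions return a new mapping and leave the original untouched, which makes it easy to compare before and after.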
N-to-1 Failover Configuration • Node D is a dedicated failover node for failures on Node A, B, and C • Extends number of active nodes • The drawback: once the failed node is available again, the Service Groups on Node D (the failover node) must fail back to the original server to restore High Availability [Diagram: Node A runs Applications A, B, C; Node B runs D, E, F; Node C (marked X) runs G, H, I; Node D holds Failover G, H, I; arrows labeled Failover and Failback]
N + 1 Failover Configuration • Node D is a dedicated failover node for failures on Node A, B, and C • Extends number of active nodes • Once Node C is restored, it becomes the failover node, leaving Node D in production. [Diagram: Node A runs Applications A, B, C; Node B runs D, E, F; Node C (marked X) runs G, H, I; Node D holds Failover G, H, I; arrow labeled Failover]
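N-to-1 and N + 1 behave identically at the moment of failure and differ only in what happens when the failed node returns. The sketch below makes that difference explicit; the function names and the reduced placement (a few groups instead of the nine in the diagram) are assumptions for illustration.

```python
def fail_over(placement: dict, failed: str, spare: str) -> dict:
    """Move every Service Group on the failed node to the spare node."""
    return {g: (spare if node == failed else node) for g, node in placement.items()}


def restore_n_to_1(placement: dict, restored: str, spare: str):
    """N-to-1: groups fail back to the restored node; the spare stays the spare."""
    moved_back = {g: (restored if node == spare else node) for g, node in placement.items()}
    return moved_back, spare


def restore_n_plus_1(placement: dict, restored: str, spare: str):
    """N + 1: groups stay where they are; the restored node becomes the spare."""
    return dict(placement), restored


before = {"App A": "Node A", "App D": "Node B", "App G": "Node C", "App H": "Node C"}
after_failure = fail_over(before, failed="Node C", spare="Node D")

print(restore_n_to_1(after_failure, restored="Node C", spare="Node D"))
print(restore_n_plus_1(after_failure, restored="Node C", spare="Node D"))
```

The N + 1 variant avoids a second service interruption because nothing has to move when the node comes back.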
N-to-N Failover Configuration • Node C fails, and its Service Groups are re-distributed across surviving nodes • Optimal solution for > 2 nodes • Implemented on third party failover clusters and Oracle RAC [Diagram: Node A runs Applications A, B, C; Node B runs D, E, F; Node C (marked X) runs G, H, I; Node D runs J, K, L; Failover G, H, and I are spread across the surviving nodes]
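Redistribution across surviving nodes can be sketched with a simple least-loaded rule: each displaced Service Group goes to whichever survivor currently hosts the fewest groups. The placement policy is an assumption made for this example; real cluster software may weigh capacity, priorities, or administrator-defined preferences instead.

```python
def redistribute(placement: dict, nodes: list, failed: str) -> dict:
    """Spread the failed node's Service Groups over the least-loaded survivors."""
    survivors = sorted(n for n in nodes if n != failed)
    if not survivors:
        raise RuntimeError("no surviving nodes to fail over to")

    # Count how many groups each survivor already hosts.
    load = {n: 0 for n in survivors}
    for node in placement.values():
        if node in load:
            load[node] += 1

    result = dict(placement)
    displaced = sorted(g for g, node in placement.items() if node == failed)
    for group in displaced:
        # Pick the survivor with the lowest load; ties break by node name.
        target = min(survivors, key=lambda n: (load[n], n))
        result[group] = target
        load[target] += 1
    return result


nodes = ["A", "B", "C", "D"]
placement = {app: node for node, apps in
             {"A": "ABC", "B": "DEF", "C": "GHI", "D": "JKL"}.items()
             for app in apps}
after = redistribute(placement, nodes, failed="C")
print({app: after[app] for app in "GHI"})
```

With twelve applications on four nodes, the three displaced applications land on three different survivors, so no single node absorbs the whole load.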
Third Party Clusters • Support for extended cluster nodes – up to 32 nodes for vendor clustering • Supports N + 1 and N-to-N failover clustering • Integrated with hardware and/or software replication for long distance “clusters”
Clustering Solutions from Oracle • Oracle Failsafe • Oracle Data Guard • Advanced Replication • Shared Nothing Cluster • Oracle Parallel Server • Real Application Clusters (RAC)
Failsafe • MS Clustering Enabled • Two servers, one disk subsystem • Switches in the event of a hardware failure • Requires recovery
Standby Database • Copy of Database (usually remote) • Kept up to date with Archive Logs • Oracle 8i feature • Oracle 9i-10g version of a standby database is Data Guard
Oracle Data Guard • Mirrored Server • Physical Standby • Archive Logs are applied to the remote database • Switchover occurs in the event of a failure • Logical Standby • LogMiner technology is used to generate SQL • Standby Database can also be used for read-only reporting • Advantages • Safe from user failure • Can be in different location • No recovery required
Advanced Replication • Uses Updatable-Snapshots • Replicates to another system • Systems stay in sync
Oracle Parallel Server • Shared disk cluster product • Loosely Coupled • Scalable performance • No downtime in the event of a system failure • Replaced by RAC in 9i
True Shared Disk Server (RAC) • ONE database • Separate multiple instances (processes & memory) • All nodes can access data simultaneously • Shared Everything Cluster • Transparent Application Failover • Oracle license required on each node • Highest level of cluster functionality [Diagram: Node A, Node B]
Factors to Consider for Clustering
• Which do you need most?
  • High Availability – Failover Clusters, Synchronous Replication, Data Guard
  • Performance scalability – Active/Active failover clusters, N-to-N failover clusters
  • Both – Oracle RAC
• Administration complexity
  • Failover clusters – relatively low
  • Oracle RAC – relatively high
    • Substantially less complex for 10g RAC than 9i RAC
• Local or long distance?
  • Local – Failover, RAC
  • Remote – Federated database, Replication, Standby database/Data Guard
• Oracle license costs
  • Active/Passive failover clusters – active nodes only
  • Active/Active failover clusters, RAC – per node
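The license rule of thumb from this slide can be turned into a small calculation for comparing options. The sketch follows only what the slide states (active nodes for Active/Passive, every node for Active/Active and RAC); it is not a statement of Oracle's actual license terms, and the function name and type labels are invented for illustration.

```python
def licensed_nodes(cluster_type: str, total_nodes: int, active_nodes: int) -> int:
    """Number of nodes needing an Oracle license, per the slide's rule of thumb."""
    kind = cluster_type.lower()
    if kind == "active/passive":
        return active_nodes          # passive standby nodes are not counted
    if kind in ("active/active", "rac"):
        return total_nodes           # every node runs a database
    raise ValueError(f"unknown cluster type: {cluster_type}")


print(licensed_nodes("active/passive", total_nodes=2, active_nodes=1))
print(licensed_nodes("active/active", total_nodes=2, active_nodes=2))
print(licensed_nodes("rac", total_nodes=8, active_nodes=8))
```

Running the comparison for each candidate configuration puts a number next to the functionality and complexity trade-offs listed above.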
Review • What type of commit is required for a Federated (shared nothing) cluster? • What is the difference in how the database is kept up-to-date in Oracle Data Guard vs. Advanced Replication? • What is the difference between N-to-1 failover clusters and N + 1 failover clusters? • How many databases are there in an 8 node Oracle RAC cluster?
Summary
• Types of clusters:
  • Shared Nothing Clusters
    • Federated databases
    • Replication
  • Shared Disk Clusters
    • Failover
    • Oracle RAC
  • Failover Clusters
    • Active/Passive
    • Active/Active
    • N-to-1
    • N + 1
    • N-to-N
  • Shared Everything Clusters
    • Oracle RAC
• Choosing a cluster type involves trade-offs in functionality, costs, and administration complexity