270 likes | 276 Views
Learn how to integrate Distributed Replication Block Device (DRBD) and Heartbeat for a high-availability MySQL database, including failover strategies. This article discusses the motivation behind high availability, the DRBD technology, integration with Heartbeat, failover scenarios, and more.
E N D
High-Availability MySQL DB based on DRBD-Heartbeat Ming Yue September 27, 2007
Outline 1 Motivation 2 DRBD: Distributed Replication Block Device 3 Integrating DRBD-Heartbeat-MySQL 4 Failovers 5 Discussion
Outline 1 Motivation 2 DRBD: Distributed Replication Block Device 3 Integrating DRBD-Heartbeat-MySQL 4 Failovers 5 Discussion
1 Motivation To minimize service interruption time due to unacceptable long-time recovery; To maximize the uptime of our system; Solutions? High-Availability Network Redundancy + Failover DRBD + Heartbeat
Outline 1 Motivation 2 DRBD: Distributed Replication Block Device 3 Integrating DRBD-Heartbeat-MySQL 4 Failovers 5 Discussion
2 DRBD:DistributedReplicationBlockDevice VFS VFS Block Device Block Device Block Device Block Device . . . . . . Mapping Layer Mapping Layer Disk Disk Network
2 DRBD:Distributed ReplicationBlockDevice (continued) A loadable kernel module of Linux providing real-time data replication to other block device on remote machine.
2 DRBD:DistributedReplicationBlockDevice (continued) /dev/drbd0 (not mountable) mounted /dev/drbd0 Block Device Block Device /mydrbd Primary “devdrbd01” Network Secondary“devdrbd02” Disk Real-time Replication Disk - Create file system on /dev/drbd0 of Primary node at first. - Only Primary node’s /dev/drbd0 can be mounted to its local directories. - Primary/Secondary status is specified manually if no Heartbeat.
2 DRBD:DistributedReplicationBlockDevice (continued) /dev/drbd0 (not mountable) (not mountable) /dev/drbd0 Block Device Block Device Secondary“devdrbd01” Network Secondary “devdrbd02” Disk Disk
2 DRBD:DistributedReplicationBlockDevice (continued) /dev/drbd0 (not mountable) mounted /dev/drbd0 Block Device Block Device /mydrbd Secondary “devdrbd01” Network Primary“devdrbd02” Disk Real-time Replication Disk After manual Primary/Secondary status switch
Outline 1 Motivation 2 DRBD: Distributed Replication Block Device 3 Integrating DRBD-Heartbeat-MySQL 4 Failovers 5 Discussion
3.1 What does heartbeat serve here? Heartbeat: A well known high-availability resource manager In DRBD-Heartbeat-MySQL configuration: 1. Provides the Virtual IP address interface 2. Auto-starts MySQL server on the Primary 3. Talks with the peer’s heartbeat process and starts failover if the Primary’s heartbeat doesn’t respond
3.2 Integrating DRBD-Heartbeat-MySQL Client Client Client “devdrbd” (virtual IP address/hostname) Primary “devdrbd01” Secondary “devdrbd02” MySQL server DRBD Replication Eth0 Interface Not mountable /dev/drbd0 mounted /dev/drbd0 /var/lib/mysql Disk Disk Heartbeat Heartbeat Heartbeat Probing
Outline 1 Motivation 2 DRBD: Distributed Replication Block Device 3 Integrating DRBD-Heartbeat-MySQL 4 Failovers 4.1 Failover after Power/Network Interruption 5 Discussion 4.2 DRBD Synchronization 4.3 Failover in MySQLstat Plot
4.1 Failover – Power/Network Interruption on the Primary “devdrbd” (virtual IP address/hostname) Primary“devdrbd01” Secondary “devdrbd02” Power Off Un-plug eth0 Cable MySQL server Eth0 Interface Not mountable No peer to Replicate DRBD Replication mounted /dev/drbd0 /dev/drbd0 of /var/lib/mysql No response from peer heartbeat. Switch! Disk Disk No peer responding Heartbeat Heartbeat Heartbeat Probing “devdrbd01”
4.1 Failover – Power/Network Interruption on the Primary “devdrbd” (virtual IP address/hostname) Primary “devdrbd02” MySQL server Eth0 Interface No DRBD Replication mounted /dev/drbd0 /var/lib/mysql Disk Heartbeat Heartbeat Probing
4.1 Failover – Power/Network Interruption on the Primary “devdrbd” (virtual IP address/hostname) Secondary “devdrbd01” Primary“devdrbd02” DRBD Synchronization Eth0 Interface MySQL server Not mountable mounted /dev/drbd0 DRBD Replication /dev/drbd0 /var/lib/mysql Disk Disk Heartbeat Heartbeat Heartbeat Probing Recovered after some time
4.2 Failover – DRBD Synchronization DRBD Synchronization: A DRBD procedure which updates data of the Inconsistent node by the data of the Up-to-Date node. When doesinconsistencyhappen? If one machine is powered off or off-line, “write” operation is performed only on the on-line machine.
4.2 Failover – DRBD Synchronization (continued) The speed of DRBD Synchronization In our configuration, the bandwidth of both synchronization and normal replication is 10MB/s in average. Time needed for complete DRBD Synchronization It depends on disk size. For our 4TB 64-bit machine, almost 20 hours. Too much time to accept? DRBD Synchronization is smart. It chooses the recently updated data to synchronize first.
4.3 Workload Simulation/Failover in MySQLstat Plot Environment: Simulate the status and workload ofPandaDB • Two 64-bit machines configured by DRBD-heartbeat • Install MySQL community 5.0.45 on both machines • Use a typical PandaDB dump as the queried database • Write workload generator to simulate client queries • It performs balanced query to the PandaDB tables through virtual IP interface and multiple connections. • It has re-connect feature to detect the service interruption and save current query state. It re-creates MySQL sessions when service is available again, and resumes the latest interrupted session.
4.3 Workload Simulation/Failover in MySQLstat Plot 1. General Queries >150 Client Connections Time Gap Time Gap Workload
4.3 Workload Simulation/Failover in MySQLstat Plot 2. Query Type 5% Insert/Delete 48% Update 47% Select 3. Input/Output 8.8M/sec in average
Outline 1 Motivation 2 DRBD: Distributed Replication Block Device 3 Integrating DRBD-Heartbeat-MySQL 4 Failovers 5 Discussion 5.1 The Comparison with Master/Slave Replication 5.2 What we have already done 5.3 What we will do next
5.1 The Comparison with Master/Slave Replication Master/Slave is based on MySQL – User Level; DRBD is based on block device – Kernel Level. Master/Slave is asynchronous; DRBD is synchronous. Master/Slave has higher probability of inconsistency. Master/Slave has load-balancing; DRBD doesn’t. Master/Slave isgeographically more flexible; DRBD has to be located in the same subnet or neighbor subnets.
5.2 What we have already done now Testing InnoDB MySQL DB on the basis of DRBD-Heartbeat Configuration (on both 32-bit and 64-bit machines): • Configuration for integrating DRBD-Heartbeat-MySQL • Multi-connection/multi-type workload simulation according to the status of Panda Server • Failover situations • Power interruption • Network interruption • Automatic/manual DRBD Synchronization • Testing time gap of failover/synchronization by automatically re-connect load generator
5.3 What we will do next • Simulate the workload and failover feature through production Panda Server; • Quantitative efficiency comparison with Master/Slave Replication; • Simulate the work load and failover feature according to the status of Archive MyISAM MySQL DB. It has small number of connections, but is highly insert-intensive.
Thank you very much! Questions?