Providing an SFX failover system using MySQL replication
Anne L. Highsmith
Head of Consortia Systems
Texas A&M University
hismith@tamu.edu
http://library.tamu.edu/directory/hismith
Failing over gracefully, or, What to do when your computer crashes
Whys and hows of a failover
• SFX is a critical application requiring as much public uptime as possible
• Decided to model SFX failover on our Voyager system
• What does it cost to run a failover? (double your server, double your fun, double your invoice?)
The name game
• Public name of service: http://linkresolver.tamu.edu:9003
• Production server: killick.tamu.edu
• Failover server: bonden.tamu.edu
• DB updates reach the failover server via replication
Failing over our way 1
• The licence request contained:
  • Server names and IPs for the production and failover servers
  • Service name and IP for the “public name” of the service (http://linkresolver.tamu.edu:9003)
• The /etc/hosts file on each server contains:
  • Server names and IPs for the production and failover servers
  • Service name and IP for the “public name” of the service
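As an illustration of the /etc/hosts layout described above, here is a minimal sketch that renders the three entries. The host names come from the slides; the IP addresses are hypothetical placeholders from the 192.0.2.0/24 documentation range, since the slides do not give the real ones.

```python
from ipaddress import ip_address

# Hypothetical addresses from the 192.0.2.0/24 documentation range;
# the slides name the hosts but do not give their IPs.
HOSTS = {
    "killick.tamu.edu": "192.0.2.11",       # production server
    "bonden.tamu.edu": "192.0.2.12",        # failover server
    "linkresolver.tamu.edu": "192.0.2.10",  # public service name
}


def render_hosts(entries):
    """Return /etc/hosts lines, one 'IP<TAB>name' pair per entry."""
    lines = []
    for name, ip in entries.items():
        ip_address(ip)  # raises ValueError on a malformed address
        lines.append(f"{ip}\t{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_hosts(HOSTS), end="")
```

Both servers would carry the same three entries, so either one can resolve its peer and the service name without consulting DNS.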
Failing over our way 2 – ifconfig
• Sysadmin uses ifconfig to configure the linkresolver.tamu.edu service name on the production server
• When switching between servers, the sysadmin uses ifconfig to take the name down on production and bring it up on failover. Takes about 5 min.
• Avoids a DNS reload
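The move of the service address can be sketched as a pair of ifconfig commands, one per server. This is only a sketch under stated assumptions: the slides do not name the operating system, the interface, or the address, so the Linux net-tools alias syntax, the label `eth0:1`, the netmask, and the IP below are all hypothetical. The script only builds and prints the commands; it does not run them.

```python
import shlex

# Assumptions, not from the slides: Linux net-tools alias syntax, the
# interface label "eth0:1", and an address from the documentation range.
SERVICE_IP = "192.0.2.10"
NETMASK = "255.255.255.0"
ALIAS = "eth0:1"


def release_cmd(alias=ALIAS):
    """Command run on the server giving up the service address."""
    return ["ifconfig", alias, "down"]


def claim_cmd(alias=ALIAS, ip=SERVICE_IP, netmask=NETMASK):
    """Command run on the server taking over the service address."""
    return ["ifconfig", alias, ip, "netmask", netmask, "up"]


def switch_plan(from_host, to_host):
    """Ordered (host, command) pairs: release first, then claim."""
    return [(from_host, release_cmd()), (to_host, claim_cmd())]


if __name__ == "__main__":
    for host, cmd in switch_plan("killick.tamu.edu", "bonden.tamu.edu"):
        print(f"{host}: {shlex.join(cmd)}")
```

Releasing the address before claiming it keeps two hosts from answering for the same IP at once. Switching back is the same plan with the host arguments reversed.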
Production and failover setup
• Install vanilla SFX 4 on failover server
• Verify that vanilla installation works, then take it down and remove /exlibris/sfx_ver/sfx_4[slot]
• Run a full cold backup on production
• Transfer cold backup files to failover server and unpack
MySQL replication setup & testing
MySQL documentation: http://dev.mysql.com/doc/refman/5.1/en/replication-howto.html
Special setup for replication (1)
• Run binary logging on production but not failover
• Set up a unique server id for both source and target
• Create a userid on the source server that the target server can use to query for updates
• DBA finishes MySQL replication setup [i.e. MAGIC HAPPENS HERE]
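The first three bullets map onto a few MySQL options and statements, which the sketch below generates as text. It assumes the MySQL 5.1 syntax from the documentation linked earlier; the server ids, the `mysql-bin` log basename, the `repl` account name, and the password placeholder are hypothetical, and the remaining steps left to the DBA are not covered.

```python
# Hypothetical values: the slides give the host names only. The account
# name, password placeholder, log basename, and server ids are made up.
SOURCE = {"host": "killick.tamu.edu", "server_id": 1}
TARGET = {"host": "bonden.tamu.edu", "server_id": 2}
REPL_USER = "repl"


def mysqld_options(server_id, binary_log):
    """Return the [mysqld] option lines for one server."""
    lines = ["[mysqld]", f"server-id={server_id}"]
    if binary_log:
        lines.append("log-bin=mysql-bin")  # binary logging on the source only
    return "\n".join(lines)


def replication_user_sql(user, target_host, password="CHANGE_ME"):
    """SQL run on the source so the target can connect and read updates."""
    account = f"'{user}'@'{target_host}'"
    return [
        f"CREATE USER {account} IDENTIFIED BY '{password}';",
        f"GRANT REPLICATION SLAVE ON *.* TO {account};",
    ]


if __name__ == "__main__":
    # The two servers must not share a server id.
    assert SOURCE["server_id"] != TARGET["server_id"], "server ids must differ"
    print(mysqld_options(SOURCE["server_id"], binary_log=True))
    print(mysqld_options(TARGET["server_id"], binary_log=False))
    print("\n".join(replication_user_sql(REPL_USER, TARGET["host"])))
```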
Special setup for replication (2)
• Leave reverse proxy Apache down on failover
• Disable admin updates to failover by setting up [instance]/config/connection_admin.config_ (optional)
Software updates on failover
• KBDB updates are unnecessary, because replication takes care of them
• Software updates must still be applied
• Use a special option on the rev-up process:
  /[sfxglb41_path]/admin/revision/rev-up --type=sw --type=kbsw --backup=no
• Apache restarts after the update are not a problem
Failover testing
• Steps to switch from production to failover
  • Stop replication process on failover (DBA)
  • Start reverse proxy Apache on failover (SFX sysadmin)
  • Move the linkresolver interface from production to failover (Computer center sysadmin)
• Steps to switch back to production
  • Move the linkresolver interface back to production (Computer center sysadmin)
  • Stop reverse proxy Apache on failover (SFX sysadmin)
  • Rebuild database on failover (DBA)
  • Start replication process on failover (DBA)
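Because the steps above involve three different roles and the order matters, one way to keep them straight is an ordered checklist kept as data. The sketch below simply transcribes the slide's steps; the function name and output layout are illustrative, not part of any SFX tooling.

```python
# Each step is (owner, action), transcribed from the slide in order.
SWITCH_TO_FAILOVER = [
    ("DBA", "Stop replication process on failover"),
    ("SFX sysadmin", "Start reverse proxy Apache on failover"),
    ("Computer center sysadmin", "Move linkresolver interface to failover"),
]
SWITCH_BACK = [
    ("Computer center sysadmin", "Move linkresolver interface back to production"),
    ("SFX sysadmin", "Stop reverse proxy Apache on failover"),
    ("DBA", "Rebuild database on failover"),
    ("DBA", "Start replication process on failover"),
]


def checklist(title, steps):
    """Render a numbered checklist with the responsible role per step."""
    lines = [title]
    for n, (owner, action) in enumerate(steps, start=1):
        lines.append(f"  {n}. [{owner}] {action}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(checklist("Switch to failover", SWITCH_TO_FAILOVER))
    print(checklist("Switch back to production", SWITCH_BACK))
```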
Implications of running on failover
• Switchover/switchback creates a synchronization issue between the databases
  • The failover database logged new statistics while it was in use
  • When production came back, it created stat requests with keys that duplicated those already in failover
  • Replication to failover couldn’t be restarted until all of the potential duplicates were deleted
• Decided to rebuild the failover database after each “use” and lose its statistics
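The duplicate-key problem can be reproduced in miniature. The sketch below uses two in-memory SQLite databases as stand-ins for the production and failover MySQL databases, and assumes the statistics keys are assigned sequentially by the database; the slides do not say how SFX generates them, and the table name `stat_request` is hypothetical.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory stats table preloaded with `rows` requests."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE stat_request (id INTEGER PRIMARY KEY, source TEXT)")
    db.executemany(
        "INSERT INTO stat_request (source) VALUES (?)",
        [("before-outage",)] * rows,
    )
    return db


def log_request(db, source):
    """Insert one request and return the key the database assigned."""
    cur = db.execute("INSERT INTO stat_request (source) VALUES (?)", (source,))
    return cur.lastrowid


def replay(db, key, source):
    """Apply a replicated row with its original key; report any conflict."""
    try:
        db.execute(
            "INSERT INTO stat_request (id, source) VALUES (?, ?)", (key, source)
        )
        return "applied"
    except sqlite3.IntegrityError:
        return "duplicate key"


if __name__ == "__main__":
    # Both databases start in sync with the same three rows.
    production, failover = make_db(3), make_db(3)
    used_on_failover = log_request(failover, "failover")       # logged during the outage
    new_on_production = log_request(production, "production")  # logged after switchback
    print(used_on_failover, new_on_production)
    print(replay(failover, new_on_production, "production"))
```

Each database hands out the next key independently, so the first request logged on production after the switchback carries a key the failover has already used, and replaying it there fails. Rebuilding the failover database from production removes the conflicting rows, at the cost of the statistics logged on failover.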