
Providing an SFX failover system using MySQL replication

Providing an SFX failover system using MySQL replication. Anne L. Highsmith, Head of Consortia Systems, Texas A&M University, hismith@tamu.edu, http://library.tamu.edu/directory/hismith. Failing over gracefully, or, What to do when your computer crashes. Whys and hows of a failover.


Presentation Transcript


  1. Providing an SFX failover system using MySQL replication
     Anne L. Highsmith, Head of Consortia Systems, Texas A&M University
     hismith@tamu.edu
     http://library.tamu.edu/directory/hismith

  2. Failing over gracefully, or, What to do when your computer crashes

  3. Whys and hows of a failover
     • SFX is a critical application requiring as much public uptime as possible
     • Decided to model SFX failover on our Voyager system
     • What does it cost to run a failover? (double your server, double your fun, double your invoice?)

  4. The name game
     • Production server (killick.tamu.edu)
     • Failover server (bonden.tamu.edu)
     • DB update via replication, from production to failover
     • http://linkresolver.tamu.edu:9003 (public name of service)

  5. Failing over our way 1
     • The licence request contained:
        • Server names and IPs for the production and failover servers
        • Service name and IP for the “public name” of the service: http://linkresolver.tamu.edu:9003
     • The /etc/hosts file on each server contains:
        • Server names and IPs for the production and failover servers
        • Service name and IP for the “public name” of the service
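As an illustration, the /etc/hosts entries described above might look like the sample below. The host names come from the slides; the addresses are placeholders from the documentation range 192.0.2.0/24, not the real ones, and `check_hosts` is a hypothetical helper that confirms a hosts file names all three hosts.

```shell
#!/bin/sh
# Sketch only: addresses are placeholder documentation IPs, not the real ones.
sample_hosts() {
    cat <<'EOF'
192.0.2.11  killick.tamu.edu       killick       # production server
192.0.2.12  bonden.tamu.edu        bonden        # failover server
192.0.2.10  linkresolver.tamu.edu  linkresolver  # public service name
EOF
}

# check_hosts FILE: hypothetical helper. Prints each required name that is
# missing from FILE and returns non-zero if at least one is missing.
check_hosts() {
    file=$1
    missing=0
    for name in killick.tamu.edu bonden.tamu.edu linkresolver.tamu.edu; do
        # Strip comments, then look for the name as a whole field after the IP.
        if ! awk -v n="$name" '
                { sub(/#.*/, "") }
                { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
                END { exit !found }' "$file"; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}
```

Running `check_hosts /etc/hosts` on both servers before a switch is one way to catch a file that has drifted out of step.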

  6. Failing over our way 2 – ifconfig
     • Sysadmin uses ifconfig to configure the linkresolver.tamu.edu name on the production server
     • When switching between servers, the sysadmin uses ifconfig to take the name down on production and bring it up on failover. This takes about 5 minutes.
     • Avoids a DNS reload
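The switch described above can be sketched as follows, assuming Linux net-tools `ifconfig` with an alias interface. The interface name `eth0:1`, the address and the netmask are placeholders, since the slides do not give them, and other platforms use different ifconfig syntax. The script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: interface alias, address and netmask are assumptions.
SERVICE_IF=${SERVICE_IF:-eth0:1}
SERVICE_IP=${SERVICE_IP:-192.0.2.10}
SERVICE_MASK=${SERVICE_MASK:-255.255.255.0}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run on the server that currently holds the service address.
service_down() {
    run ifconfig "$SERVICE_IF" down
}

# Run on the server that should take over the service address.
service_up() {
    run ifconfig "$SERVICE_IF" "$SERVICE_IP" netmask "$SERVICE_MASK" up
}
```

Taking the alias down on one server before bringing it up on the other keeps the address from being live on both machines at once.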

  7. Initial data load on failover

  8. Production and failover setup
     • Install vanilla SFX 4 on failover server
     • Verify that vanilla installation works, then take it down and remove /exlibris/sfx_ver/sfx_4[slot]
     • Run a full cold backup on production
     • Transfer cold backup files to failover server and unpack

  9. MySQL replication setup & testing
     MySQL documentation: http://dev.mysql.com/doc/refman/5.1/en/replication-howto.html

  10. Special setup for replication (1)
      • Run binary logging on production but not failover
      • Set up a unique server id on each of the source and the target
      • Create a userid on the source server that the target server can use to query for updates
      • DBA finishes MySQL replication setup [i.e. MAGIC HAPPENS HERE]
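A minimal sketch of the settings listed above, following the generic procedure in the MySQL 5.1 manual rather than whatever the DBA actually ran. The server ids, the `repl` account name, the password and the binary log coordinates are all placeholders; only the host names come from the slides. The script prints configuration and SQL for review and does not touch any server.

```shell
#!/bin/sh
# Sketch only: prints config and SQL for review; it does not touch any server.
# Server ids, the 'repl' account, password and log coordinates are placeholders.
SOURCE_HOST=killick.tamu.edu    # production server (from the slides)
REPLICA_HOST=bonden.tamu.edu    # failover server (from the slides)
REPL_USER=repl                  # hypothetical account name

# my.cnf fragment for the source: binary logging on, unique server id.
source_cnf() {
    printf '[mysqld]\nlog-bin=mysql-bin\nserver-id=1\n'
}

# my.cnf fragment for the target: no binary log, different server id.
replica_cnf() {
    printf '[mysqld]\nserver-id=2\n'
}

# SQL for the source: an account the target uses to fetch updates.
source_sql() {
    cat <<EOF
CREATE USER '$REPL_USER'@'$REPLICA_HOST' IDENTIFIED BY 'CHANGE_ME';
GRANT REPLICATION SLAVE ON *.* TO '$REPL_USER'@'$REPLICA_HOST';
EOF
}

# SQL for the target. The log file and position are placeholders; the real
# values come from SHOW MASTER STATUS on the source.
replica_sql() {
    cat <<EOF
CHANGE MASTER TO
  MASTER_HOST='$SOURCE_HOST',
  MASTER_USER='$REPL_USER',
  MASTER_PASSWORD='CHANGE_ME',
  MASTER_LOG_FILE='mysql-bin.000001',
  MASTER_LOG_POS=4;
START SLAVE;
EOF
}
```

The statements use the MySQL 5.1 syntax that the linked manual describes; recent MySQL releases rename them (for example `CHANGE REPLICATION SOURCE TO` and `START REPLICA`).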

  11. Special setup for replication (2)
      • Leave reverse proxy apache down on failover
      • Disable admin updates to failover by setting up [instance]/config/connection_admin.config_ (optional)

  12. Software updates on failover
      • KBDB updates are unnecessary, because replication takes care of them
      • Software updates must still be applied
      • Use a special option on the rev-up process: /[sfxglb41_path]/admin/revision/rev-up --type=sw --type=kbsw --backup=no
      • Apache restarts after the update are not a problem

  13. Failover testing
      • Steps to switch from production to failover
         • Stop replication process on failover (DBA)
         • Start reverse apache on failover (SFX sysadmin)
         • Move the linkresolver interface from production to failover (Computer center sysadmin)
      • Steps to switch back to production
         • Move the linkresolver interface back to production (Computer center sysadmin)
         • Stop reverse apache on failover (SFX sysadmin)
         • Rebuild database on failover (DBA)
         • Start replication process on failover (DBA)
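The ordering of these steps, and who performs each one, can be captured in a small checklist printer. This is a sketch that executes nothing. `STOP SLAVE` and `START SLAVE` are the MySQL 5.1 statements for pausing and resuming replication; text in angle brackets stands for site-specific commands that the slides do not give.

```shell
#!/bin/sh
# Sketch only: prints an ordered checklist; it executes nothing.
# Text in <angle brackets> marks site-specific commands the slides do not give.
step() {  # step ROLE ACTION
    n=$((n + 1))
    printf '%d. [%s] %s\n' "$n" "$1" "$2"
}

to_failover() {
    n=0
    step "DBA, failover"           "mysql -e 'STOP SLAVE;'"
    step "SFX sysadmin, failover"  "<start reverse proxy Apache>"
    step "Sysadmin, production"    "<take linkresolver interface down>"
    step "Sysadmin, failover"      "<bring linkresolver interface up>"
}

to_production() {
    n=0
    step "Sysadmin, failover"      "<take linkresolver interface down>"
    step "Sysadmin, production"    "<bring linkresolver interface up>"
    step "SFX sysadmin, failover"  "<stop reverse proxy Apache>"
    step "DBA, failover"           "<rebuild failover database>"
    step "DBA, failover"           "mysql -e 'START SLAVE;'"
}
```

Calling `to_failover` or `to_production` prints the numbered list, which can be pasted into a change ticket so each role signs off on its own step.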

  14. Implications of running on failover
      • Switchover/switchback creates a synchronization issue between databases
         • Failover database logged new statistics
         • When production came back, it created stat requests with keys that duplicated those already in failover
         • Replication to failover couldn’t be restarted until all of the potential duplicates were deleted
      • Decided to rebuild failover database after “use” and lose statistics
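A stalled replication thread of the kind described above shows up in the output of `SHOW SLAVE STATUS\G` on the failover server. The sketch below assumes the MySQL 5.1 field names `Slave_IO_Running`, `Slave_SQL_Running` and `Last_Errno`, and reports MySQL error 1062 (duplicate entry) separately. The function name is hypothetical, and it only reads text on standard input.

```shell
#!/bin/sh
# Sketch only: reads `SHOW SLAVE STATUS\G` output on stdin and summarises it.
replication_state() {
    awk -F': *' '
        { sub(/^ +/, "", $1) }                  # drop the left padding
        $1 == "Slave_IO_Running"  { io = $2 }
        $1 == "Slave_SQL_Running" { sql = $2 }
        $1 == "Last_Errno"        { code = $2 }
        END {
            if (io == "Yes" && sql == "Yes") print "running"
            else if (code == 1062)           print "stopped: duplicate key (1062)"
            else                             print "stopped"
        }'
}
```

A typical call would be `mysql -e 'SHOW SLAVE STATUS\G' | replication_state`, run on the failover server after replication is restarted.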
