
High Availability 24 hours a day, 7 days a week, 365 days a year…

Vik Nagjee, Product Manager, Core Technologies, InterSystems Corporation


Presentation Transcript


  1. High Availability: 24 hours a day, 7 days a week, 365 days a year…
     Vik Nagjee, Product Manager, Core Technologies, InterSystems Corporation

  2. Topics
     • What is High Availability (HA)?
     • Current HA strategies
     • What’s coming?
     • Questions & Discussion

  3. What is High Availability (HA)?
     • Reliability
     • Fault-tolerance
     • High Uptime
     • Operational Continuity
     • Redundancy
     • Minimal Disruption
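To make "High Uptime" concrete, the short sketch below converts an availability percentage into the downtime it allows over a 365-day year. The function name and the sample targets are illustrative choices, not figures from the presentation.

```python
# Illustrative arithmetic only: the targets below are examples, not values from the slides.
HOURS_PER_YEAR = 24 * 365  # 8760 hours in a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime (in hours) implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


for target in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(target)
    print(f"{target}% availability allows about {hours:.2f} hours of downtime per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is why automated failover matters more as the target rises.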

  4. High Availability vs. Disaster Recovery
     • High Availability = fault detection and correction procedures that maximize the availability of critical services and applications, often in an automated fashion.
     • Disaster Recovery = the process of preparing for recovery or continuation of the technology infrastructure critical to an organization after a natural or human-induced disaster.
     • High Availability ≠ Disaster Recovery!

  5. Current HA Strategies
     • Failover = automatic switch to a redundant system
     • Uses some type of heartbeat software (e.g., HACMP)
     • Current failover options:
        • Failover Clusters
        • Concurrent Clusters
        • ECP Clusters
           • With Failover Cluster for Database
           • With Concurrent Cluster for Database

  6. Failover Clusters
     • One active system (PROD) and one standby system (STDBY), with a heartbeat connection
     • Windows Cluster, IBM HACMP, Sun Cluster, HP Serviceguard, Red Hat Cluster Suite, Veritas Cluster Services…
     • Needs shared disk for the install directory, WIJ, database files, and journal files
     • Users/applications connect to a DNS name that is mapped to PROD
     • In the event of a failure, the 3rd-party cluster software fails Caché over to the STDBY node
     • Caché performs recovery on the STDBY node before allowing connections: open transactions are rolled back, open locks are released, etc.
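The failover sequence described on this slide can be sketched as a small simulation: a heartbeat monitor counts missed beats, and once a threshold is reached the standby takes over, runs recovery against the shared disk, and only then accepts connections. Everything here (`Node`, `SharedDisk`, `FailoverCluster`, the three-beat threshold) is a hypothetical model for illustration; it is not the API of Caché or of any cluster product named above.

```python
from dataclasses import dataclass, field


@dataclass
class SharedDisk:
    """Hypothetical stand-in for state on shared storage that survives a node failure."""
    open_transactions: list = field(default_factory=list)
    held_locks: set = field(default_factory=set)


@dataclass
class Node:
    name: str
    alive: bool = True
    accepting_connections: bool = False


class FailoverCluster:
    def __init__(self, prod: Node, stdby: Node, max_missed: int = 3):
        self.active = prod
        self.standby = stdby
        self.disk = SharedDisk()
        self.max_missed = max_missed  # assumed threshold, not a documented default
        self.missed = 0
        self.active.accepting_connections = True

    def heartbeat_tick(self) -> str:
        """Run once per heartbeat interval; return the name of the active node."""
        if self.active.alive:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self._fail_over()
        return self.active.name

    def _fail_over(self) -> None:
        failed, survivor = self.active, self.standby
        failed.accepting_connections = False
        # Recovery happens before any connection is allowed on the survivor.
        self.disk.open_transactions.clear()  # roll back open transactions
        self.disk.held_locks.clear()         # release open locks
        survivor.accepting_connections = True
        self.active, self.standby = survivor, failed
        self.missed = 0


cluster = FailoverCluster(Node("PROD"), Node("STDBY"))
cluster.disk.open_transactions.append("tx-1")
cluster.disk.held_locks.add("lock-A")
cluster.active.alive = False  # simulate a PROD failure
for _ in range(3):
    cluster.heartbeat_tick()
print("Active node:", cluster.active.name)
```

Requiring several missed heartbeats before switching is a common way to avoid failing over on a single dropped packet; the right threshold depends on the cluster software in use.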

  7. Concurrent Clusters
     • AKA Caché Clusters
     • Can be configured on OpenVMS and Tru64 UNIX
     • Two or more servers, each running an instance of Caché and each with access to all disks, concurrently provide access to all data
     • Users connect to any one of the clustered nodes; Caché provides data and lock synchronization across nodes
     • If one machine fails, users can immediately reconnect to any of the remaining cluster nodes
     • Caché performs cluster-wide recovery during failover – logical and physical data integrity is maintained

  8. ECP Clusters – with DB as Failover Cluster
     • Enterprise Cache Protocol (ECP) provides a distributed, tiered system
     • Typical configuration:
        • N+1 application servers
        • Users load-balanced across app servers
     • If any app server goes down, users can be reconnected to the remaining app servers
     • If the database goes down, users on app servers will experience a pause while DB failover completes (here the DB is configured as a failover cluster)
     • Application servers will reconnect after the database has performed recovery
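The application-tier behaviour above (users load-balanced across N+1 app servers and reconnected elsewhere when one goes down) can be sketched as follows. The `AppTier` class, the least-loaded placement rule, and the server names are assumptions made for illustration; the slides do not say how the load balancer chooses a server, and the database-side pause is not modelled.

```python
class AppTier:
    """Hypothetical model of a load-balanced application-server tier."""

    def __init__(self, server_names):
        self.sessions = {name: set() for name in server_names}
        self.up = set(server_names)

    def _least_loaded(self) -> str:
        if not self.up:
            raise RuntimeError("no application servers available")
        # Sort first so that ties are broken deterministically by name.
        return min(sorted(self.up), key=lambda name: len(self.sessions[name]))

    def connect(self, user: str) -> str:
        """Place a user on the least-loaded surviving server and return its name."""
        server = self._least_loaded()
        self.sessions[server].add(user)
        return server

    def server_down(self, name: str) -> dict:
        """Mark a server as failed and reconnect its users to the survivors."""
        self.up.discard(name)
        displaced = sorted(self.sessions[name])
        self.sessions[name] = set()
        return {user: self.connect(user) for user in displaced}


tier = AppTier(["app1", "app2", "app3"])
for i in range(1, 7):
    tier.connect(f"user{i}")
moved = tier.server_down("app2")
print("Reconnected users:", moved)
```

The "+1" in N+1 is what makes this work: with one server more than the load strictly requires, the survivors can absorb the displaced sessions.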

  9. ECP Clusters – with DB as Concurrent Cluster
     • Similar to the previous example, except the DB server is configured as a concurrent cluster (OpenVMS or Tru64 UNIX)
     • App servers can connect to any one of the nodes
     • If any node fails, the app server(s) connected to that node will reconnect to another surviving node after failover
     • Caché performs cluster-wide recovery during failover – logical and physical data integrity is maintained

  10. High Availability: What’s Coming?
      Database Mirroring:
      • Delivers faster, automated failover
      • Eliminates the requirement for shared disk configurations
      • Reduces dependency on 3rd-party clustering software
      • Uses multiple redundant servers
      • Integrated ECP recovery

  11. Database Mirroring
      • Multiple servers in a Mirror Set: one is the Primary, the others (one or more) are Backups
      • TCP connections between mirror members
      • The Primary PUSHES journal updates to the Backups, which acknowledge them and continuously de-journal
      • The Primary role can flip from one server to another within moments (automated failover)
      • All clients (except ECP) connect to a Mirror Virtual IP; the mirror handles redirection to the current Primary
      • The ECP protocol is “mirror aware”: app servers connect directly to the current Primary and fail over to the new Primary as appropriate. ECP performs recovery on reconnection.
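The push-and-acknowledge flow on this slide can be sketched as a toy model: the primary assigns each update a sequence number, pushes it to every backup, and each backup acknowledges and applies (de-journals) it, so a backup can be promoted without shared disk. The classes, the in-memory "journal", and the rule of promoting the backup with the longest journal are all assumptions for illustration; they do not describe the actual mirroring implementation.

```python
class MirrorMember:
    """Hypothetical mirror member holding a journal and the database it produces."""

    def __init__(self, name: str):
        self.name = name
        self.journal = []   # journal records received, in order
        self.database = {}  # state produced by applying (de-journaling) them

    def receive(self, seq: int, key: str, value) -> int:
        """Store a pushed journal record, apply it, and return the ack (its seq)."""
        self.journal.append((seq, key, value))
        self.database[key] = value
        return seq


class MirrorSet:
    def __init__(self, primary: MirrorMember, backups):
        self.primary = primary
        self.backups = list(backups)
        self.next_seq = 1

    def write(self, key: str, value) -> bool:
        """Journal an update on the primary and push it to every backup."""
        seq = self.next_seq
        self.next_seq += 1
        self.primary.receive(seq, key, value)
        acks = [backup.receive(seq, key, value) for backup in self.backups]
        return all(ack == seq for ack in acks)

    def fail_over(self) -> str:
        """Promote the most caught-up backup (an assumed rule) to primary."""
        if not self.backups:
            raise RuntimeError("no backup available to promote")
        new_primary = max(self.backups, key=lambda b: len(b.journal))
        self.backups.remove(new_primary)
        self.primary = new_primary
        return new_primary.name


mirror = MirrorSet(MirrorMember("A"), [MirrorMember("B"), MirrorMember("C")])
mirror.write("patient:1", "admitted")
mirror.write("patient:1", "discharged")
print("New primary:", mirror.fail_over())
print("Data on new primary:", mirror.primary.database)
```

In this model a client that always asks `mirror.primary` plays the role of the Mirror Virtual IP: it reaches whichever member currently holds the primary role.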

  12. Wrap-up
      • Questions & Discussion
