ASE120: Choosing the Right High Availability Solution

Chris N. Brown, Principal Systems Consultant, chris.brown@sybase.com • AIM: DenverSybaseSC

Agenda • Intro: Cutting through all the hype • Why clustering is not enough • Ways to achieve HA • Physical Copy • Logical Copy • Client-Side High Availability • OpenClient 12.x • OpenSwitch • DNS Update • DBA administration in a 24x7 environment • Summary • Q&A

HA – Do you need it? • HA (High Availability) has evolved almost into an industry buzzword • Everyone talks about it • Everyone says that they can do it • Everyone wants to sell you a “solution” • ..... but in the end, what are they really offering you? • HA isn't something new, it's been around in one form or another for years • Now seen as something critical because of heavy reliance on computing systems for business critical processes • The question still remains though .... Do you really need it?

And That's Why You're Here! • We are going to answer that question in the next 90 minutes • Examine what's out there • Analyze what is being offered (both Sybase and non-Sybase) • Discuss how they work • ... when they are appropriate • ... when they aren't appropriate • Talk about how to make them all work together • And address the <gasp!> users out there as well • Let's cut through the Bull and make it simple • So stand by for some business level discussion... but it sets the stage for the rest of the session....

So do you need it? • Before you embark down the HA path.... • There are some questions that you should ask... • And know the answers to! • Many times, people THINK they need HA when they really don't. • As a result, thousands (and even millions) are spent that don't need to be • What about the man hour cost? • What about the increased administration? • What about the “overhead”?

Think about it.... • What you are buying is an “insurance policy”. • A protection from incurring a loss. • This is very similar to an auto insurance policy. • You can have high deductibles or low ones • You can have liability only or full coverage • You can choose the level of protection • You might be covered if you hit a Yugo, but will you be covered if you hit a Mercedes or Jaguar or Bentley? • How much of an out-of-pocket loss are you willing to take? • The same principle is true with an HA architecture. • Decide how much loss (downtime and cost) is acceptable. • Architect around THAT • What works for one system will not necessarily be the same for another.

How critical is the system in question? • The first thing to ask is... how critical is the system? • How much does it cost if it goes down? • How long could your company operate without it? • How much would your company lose if it went down? • ... and for how long? • Sometimes the costs are intangible • SLAs (Service Level Agreements) are usually put into place and should take this question into account. • Blanket SLAs are sometimes used, but are not always prudent • Some systems (billing, customer service) are more important than others (email, instant messaging, LAN-based fileserver). • Where are the resources focused?

How much can you spend? • Highly Available Systems can cost from $ to $$$$$$$ • That incremental .09% or .009% can hike the cost exponentially. • Big derailer of HA implementations. • It's extremely important to understand the business requirements as discussed earlier. • I.e., does the system REALLY have to be 24x7 or can it be 24x5 or 9x5? • This will dramatically change the architecture chosen and, of course, the cost of implementation. • Raw system cost should not (completely) drive the architecture chosen; the goal is to reduce the risk to an acceptable level.
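To put numbers on that incremental .09% or .009%, the following is a minimal sketch that converts an availability percentage into allowed downtime per year. It assumes a 365-day year and a system that is expected to be up 24x7; the function name is illustrative only.

```python
# Convert an availability percentage into allowed downtime per year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored

def downtime_minutes_per_year(availability_pct: float) -> float:
    """Minutes per year a 24x7 system may be down at the given availability."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.0, 99.9, 99.99, 99.999):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% available -> about {minutes:.1f} minutes of downtime per year")
```

Going from 99.9% to 99.99% removes roughly 473 minutes of allowed downtime per year, which is the gap the more expensive architectures are paying to close.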

High Availability Levels (highest cost first) • Continuous Operations: Online Maintenance • Continuous Availability ($$$$$$): Automatic Failover, DBMS HA • High Availability ($$$$): Warm Standby, Database Replication • Standby Systems ($$): Cold Standby, Backup/Restore • Redundant Systems ($): Hardware Redundancy, RAID/Mirroring/HW Cluster

Clustering: The traditional solution • For many years, the traditional HA solution was hardware level clustering. • Generally what most IT professionals think of when you say “HA”. • Mainly addresses hardware failures. • Has evolved to watch for process failures and restart the failed processes. • When a failure is detected, tries to restart services on a redundant host “as fast as possible” • HA usually discussed in terms of “host uptime”, measured as a percent (e.g., 99%, 80%, etc.).

Why isn't clustering enough? • In today's computing environments, hardware level redundancy isn't always enough. • It only provides the foundation. • Sometimes the amount of time required to restart services is unacceptable • Can take minutes when seconds are required ($$$$$). • What if the problem is with a shared resource? • What if the problem is site-specific?

Agenda • Intro: Cutting through all the hype • Why clustering is not enough • Ways to achieve HA • Physical Copy • Logical Copy • Client-Side High Availability • OpenClient 12.x • OpenSwitch • DNS Update • 3rd party • DBA administration in a 24x7 environment • Summary • Q&A

How can I achieve HA (50,000 ft view)? • There are 2 main ways to achieve HA from a database level: • Physical Database re-creation • Logical Database re-creation. • They range from the simple to the down and dirty to the complex. • Which one you use depends on your requirements • ... and your budget • ... and your level of risk • In the 'ideal world', a combination of these strategies provides the best solution.

Physical Database Recreation: Dump and Load • This is the easiest way to get a backup copy and VERY basic HA • Dump the primary DB, and then load it to an ASE running on another host. • Keep in sync via incremental transaction log dumps • Inexpensive to implement • Issues: • Size of dump and getting it to backup site (ftp, rcp, scp, etc.). • What if a dump is corrupt? • What if tranlog loads get out of sync? • Usually manual; can be automated but requires some 'babysitting' and has quite a few moving parts. • Good for non-critical systems or those whose data does not change much (think Log Size).
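One of the moving parts is making sure transaction log dumps are loaded in order with none missing. Below is a minimal sketch of that check. The file naming scheme (`<db>_tran_<sequence>.dmp`) and the database name are assumptions made up for this example; it only builds the `load transaction` command text and does not talk to a server.

```python
import re

def tran_dump_name(db: str, seq: int) -> str:
    """Hypothetical naming scheme: <db>_tran_<6-digit sequence>.dmp"""
    return f"{db}_tran_{seq:06d}.dmp"

def load_plan(db: str, files: list[str]) -> list[str]:
    """Return load commands in sequence order; raise if a dump is missing."""
    pattern = re.compile(rf"^{re.escape(db)}_tran_(\d{{6}})\.dmp$")
    seqs = sorted(int(m.group(1)) for f in files if (m := pattern.match(f)))
    for prev, cur in zip(seqs, seqs[1:]):
        if cur != prev + 1:
            # Loading past a gap would put the standby out of sync.
            raise ValueError(f"gap in transaction dumps between {prev} and {cur}")
    return [f'load transaction {db} from "{tran_dump_name(db, s)}"' for s in seqs]

# Files may arrive at the backup site in any order.
arrived = ["salesdb_tran_000002.dmp", "salesdb_tran_000001.dmp"]
for command in load_plan("salesdb", arrived):
    print(command)
```

A check like this is the kind of 'babysitting' that has to be automated before dump and load can run unattended.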

Physical Database Recreation: Quiesce Database • If you have a SAN, another easy way to achieve basic HA is via the 'quiesce database' functionality. • New feature of ASE 12.x • Works best with ASE 12.5 and higher • Similar in principle to dump and load, but much faster. • Quiesce DB suspends writes to databases, so that underlying devices can quickly be copied • Reads are still allowed • Initially targeted for: • Quick refreshes of production to development • Quick creation of DSS environment • Quick troubleshooting of production with a 'snapshot'. • However, customers wanted to use it for more of an HA solution (and they got that with ASE 12.5).

How does Quiesce Database Work? • Primary: • 2:00 AM: quiesce database hold; <copy database using external command>; quiesce database release • 7:00 AM: dump tran with standby_access • 9:00 AM: dump tran with standby_access • 10:00 AM: dump tran with standby_access • Repeat each hour until activity tapers off; then lengthen intervals accordingly • Secondary: • 2:10 AM: dataserver -q .. • 7:05 AM: load tran; online database for standby_access • 9:07 AM: load tran; online database for standby_access • 10:10 AM: load tran; online database for standby_access
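The timeline can be sketched as a small generator that pairs each step on the primary with its counterpart on the secondary. This is only an illustration that produces command text: the tag name, database name, and dump paths are made up, the device copy itself is left as a comment because it depends on the SAN or OS tooling, and the exact command options should be checked against the ASE reference for your version.

```python
def quiesce_cycle(db: str, tag: str, tran_dumps: list[str]) -> dict[str, list[str]]:
    """Build the command sequence for each side of a quiesce-based standby."""
    primary = [
        f"quiesce database {tag} hold {db}",
        "-- copy the database devices with an external (SAN/OS) command",
        f"quiesce database {tag} release",
    ]
    secondary = ["-- start the secondary server with dataserver -q"]
    for path in tran_dumps:
        # Each log dump on the primary is followed by a load on the secondary.
        primary.append(f'dump transaction {db} to "{path}" with standby_access')
        secondary.append(f'load transaction {db} from "{path}"')
        secondary.append(f"online database {db} for standby_access")
    return {"primary": primary, "secondary": secondary}

plan = quiesce_cycle("salesdb", "nightly", ["/dumps/t0700.dmp", "/dumps/t0900.dmp"])
for side, commands in plan.items():
    print(side.upper())
    for command in commands:
        print("  " + command)
```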

Things To Think About With Quiesce Database • Quiesce Database is a solution to a specific problem. • Can be very very fast • True physical copy so WYSIWYG. But that may not be what you want • HA-ish solution with tranlog loads works best with ASE 12.5.x • You can do maintenance on replicate copy (dbcc, etc) • However... • It's a physical copy • Dependent on tranlog loads • Can't really use replicate since users must be kicked off for tranlog load to occur • “errors” in primary will be in replicate.

Physical Database Recreation: Block Replication • This is something that is offered by SAN vendors. • Very attractive: • Copies data from one area in SAN to another • Copies data from one SAN to another. • Often pitched as an HA or DR solution, WHICH IT CAN BE. • Operates in 2 modes: synchronous and asynchronous • Because it is a block-level copy, what exists on the primary will exist on the replicate.

Network Block Copy – How It Works (sync) • Methodology: • The Host OS will write its I/O to the primary SAN cache • The Primary cache copies its I/O to the secondary SAN cache • The secondary SAN sends an ACK to the primary SAN that it received the I/O • Both the primary and the secondary then write their I/O to disk • In this case, every disk I/O is copied. • This is similar to RAID-1 or a variation of a 2-phase commit. • The standby server (ASE in this case) can restart at the same spot that the primary ended. • Used over shorter distances.     

Network Block Copy – How It Works (async) • Methodology: • The primary OS writes its I/O to the primary SAN cache • The primary SAN sends an ACK to the primary OS that it received the I/O. • The primary SAN copies that I/O to the secondary SAN cache • Here is where it gets tricky • Not every I/O is copied • The block could have changed many times • Changed blocks are 'scored' and the latest change is what is sent over. • Both SANs write to disk • Think of the replicate as a point-in-time snapshot.
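The difference between the two modes can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not how any particular SAN works: blocks are dictionary keys, the network is a counter, and the asynchronous mode ships changes in fixed-size batches.

```python
def replicate_sync(writes):
    """Synchronous mode: every write is shipped to the secondary."""
    replica, ios_sent = {}, 0
    for block, data in writes:
        replica[block] = data
        ios_sent += 1
    return replica, ios_sent

def replicate_async(writes, batch_size):
    """Asynchronous mode: only the latest change to each block in a batch is shipped."""
    replica, ios_sent = {}, 0
    for start in range(0, len(writes), batch_size):
        pending = {}
        for block, data in writes[start:start + batch_size]:
            pending[block] = data  # a later change overwrites an earlier one
        replica.update(pending)
        ios_sent += len(pending)
    return replica, ios_sent

# Block 7 changes three times; block 9 changes once.
writes = [(7, "v1"), (7, "v2"), (9, "a"), (7, "v3")]
print(replicate_sync(writes))      # 4 I/Os shipped
print(replicate_async(writes, 4))  # 2 I/Os shipped, same final contents
```

Both replicas end with the same contents, but the asynchronous replica never saw the intermediate versions of block 7, which is why it is best thought of as a point-in-time snapshot.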

This sounds like A Great Thing! (TM) • Since the SAN is copying data at the bit level, it makes sense as a DR / HA mechanism • No data loss • Server can be restarted where the other one crashed • Copy from primary to secondary is usually very fast • There are some issues to be aware of though. • Sometimes, what you see on disk isn't what you want at the replicate (corruption) • Be aware of how ASE writes data to disk, and how the OS writes data • We write 2k (4k, 8k, 16k) pages, they write 512k Blocks • We log first then write the data for consistency, so what happens if data pages are written in the SAN before the log pages are, and you go down? (eeeeek) • Overall, this is a good strategy that many people use, but it cannot offer you total protection.
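The log-first concern can be made concrete with a small check. This is a deliberately simplified model and not ASE's actual recovery logic: each log record gets a made-up sequence number, and each data page remembers the number of the last change applied to it.

```python
def pages_ahead_of_log(log_records: set[int], page_versions: dict[int, int]) -> list[int]:
    """Return page numbers whose latest change has no log record on the replica."""
    return sorted(page for page, rec in page_versions.items() if rec not in log_records)

# On the primary the order was: log record 101, data page 5, log record 102, data page 8.
# Suppose the replica received both data pages but only log record 101.
replica_log = {101}
replica_pages = {5: 101, 8: 102}
print(pages_ahead_of_log(replica_log, replica_pages))  # prints [8]
```

Page 8 holds a change the replica's log knows nothing about, so a server restarted on that copy has no record with which to confirm or undo it.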

ASE HA Option: Riding The Clustering Wave • ASE 12.0 introduced a new feature we call the HA option. • It brings clustering technology to the database server. • No logical IP needed for the ASE to 'listen on'. • Failover designed to be very very fast. • You can utilize both nodes in a 2 way cluster (previously one was usually idle) • This results in better leverage of your hardware investment and can make multiple systems highly available with less cost. • Reduces start-up overhead significantly.

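Physical Database Recreation: ASE HA option • [Diagram: HA system with server S1 on Node 1 and server S2 on Node 2, each with its own disk, plus shared disk storage. The companion relationship is established and users/logins are replicated between S1 and S2.]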

ASE HA option: Node Failure • [Diagram: fail over. Only S2 on Node 2 remains, with access to the disks and the shared disk storage.]

ASE HA Option: Failing Back • [Diagram: fail back begins with Prepare Failback. S2 on Node 2 is shown with the disks and the shared disk storage.]

ASE HA Option: Failing Back (Cont'd) • [Diagram: fail back completes. S1 on Node 1 and S2 on Node 2 are both running; the companion relationship is established again and users/logins are replicated.]

Some notes on the HA option • We rely on the HA “Heartbeat” to notify us when one ASE fails. • Brings up several administration aspects • Both ASEs must be at the same version • Currently we only support 2-node failover • One of the 2 ASEs must be a fresh install • It's possible to access data from one server on another • This is done with proxy tables via CIS • Performance issues to consider • Might be a feasible load-balancing option • We failover fast (since that's unplanned) but failing back is planned and manual (and slower). • Significant improvements in this area since the 12.0 release. • Still physical database recovery.

Logical Database Recreation • So far, we have only discussed ways of re-creating the database server “physically” • Meaning, copying the data (disk, devices, dumps) from point A to point B • All of these work well and in some cases work very fast • They provide near zero or zero data loss • However, they all suffer from the same common drawbacks • What you see is What you get (WYSIWYG) • Corruption is almost always copied over, making backup copy useless. • You cannot change the data as it is being moved over • In most cases, the replicate is down or not useable. • The only way to get around these problems today is to use a logical database recreation scheme. • RepServer • Message Bus • 3rd Party

Quickies on Queues and 3rd parties • We will quickly discuss message queueing and 3rd parties. • Message queueing takes “events” and publishes them out on a bus • The event could be a data event or an application level event • A listener subscribes to certain events • Data can be manipulated based on rules. • There are 3rd party products out there that can also do this • DataMirror • UPSuite • They may not use log-based replication though ... some use triggers to replicate data and that could put significant overhead on a busy system or bring about DBMS management issues.
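The publish/subscribe idea can be sketched in a few lines. This is a toy in-process bus, not the API of any messaging product; the class name, event names, and the transform rule are all invented for illustration.

```python
from collections import defaultdict

class Bus:
    """Toy message bus: listeners subscribe to event types, publishers post events."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler, transform=None):
        # transform is an optional rule applied to the data before delivery
        self._subscribers[event_type].append((handler, transform))

    def publish(self, event_type, payload):
        for handler, transform in self._subscribers[event_type]:
            handler(transform(payload) if transform else payload)

received = []
bus = Bus()
bus.subscribe(
    "order.created",
    received.append,
    transform=lambda e: {**e, "region": e["region"].upper()},
)
bus.publish("order.created", {"id": 1, "region": "west"})
bus.publish("order.deleted", {"id": 1})  # nobody subscribed, so nothing is delivered
print(received)
```

The listener only sees the event type it subscribed to, and the data reaches it already reshaped by the rule, which is the flexibility a physical copy cannot offer.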

Replication Server Architecture • Components: Client Applications, Primary Data Server, Replication Agent, Replication Server, Replicate Data Server, Network • 1) The client application updates data on the primary. • 2) The primary data server manages its local data. • 3) Replication Agent notifies Replication Server of primary server data updates. • 4) Replication Server coordinates data replication of those updates with other Replication Servers.

How Replication Server Works • Primary Data Server (PDS): primary db and its transaction log • Replication Agent: • Monitors Transaction Log • Truncation Point • Marked Tables • Creates LTL • RSSD: • Rep-Defs • Publications • Subscriptions • Routes • Stable Device / Stable Queues: • Inbound Queue • Outbound Queue • Materialization Queue • DSI (Data Server Interface): delivers over the LAN/WAN to the Replicate Data Server (replicate db)
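The flow from transaction log to replicate can be sketched as three small stages. This is a heavily simplified model for illustration, not Replication Server's internals: log records are dictionaries, LTL is skipped entirely, and the function names are invented.

```python
from collections import deque

def rep_agent(log, marked_tables):
    """Scan the log; forward commits and changes to marked tables only."""
    return deque(r for r in log if r["op"] == "commit" or r.get("table") in marked_tables)

def distribute(inbound):
    """Move only committed transactions, in commit order, to the outbound queue."""
    open_txns, outbound = {}, deque()
    while inbound:
        rec = inbound.popleft()
        if rec["op"] == "commit":
            outbound.append(open_txns.pop(rec["txn"], []))
        else:
            open_txns.setdefault(rec["txn"], []).append(rec)
    return outbound

def dsi_apply(outbound, replicate):
    """Apply each committed transaction to the replicate database."""
    for txn in outbound:
        for rec in txn:
            replicate.setdefault(rec["table"], []).append(rec["row"])

log = [
    {"txn": 1, "op": "insert", "table": "orders", "row": {"id": 1}},
    {"txn": 2, "op": "insert", "table": "scratch", "row": {"id": 9}},  # not marked
    {"txn": 1, "op": "commit"},
    {"txn": 2, "op": "commit"},
    {"txn": 3, "op": "insert", "table": "orders", "row": {"id": 2}},   # never commits
]
replicate_db = {}
dsi_apply(distribute(rep_agent(log, {"orders"})), replicate_db)
print(replicate_db)
```

Only the committed change to the marked table reaches the replicate, which is the key difference from a block-level copy that ships whatever is on disk.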

Warm Standby --- a little different • Logical ASE “XYZ” is made up of: • Physical ASE “A” (IP Address 192.233.56.21) • Physical ASE “B” (IP Address 192.233.56.20)

Some notes on Replication Server • Warm Standby is a variant of “traditional replication” • You can replicate DDL changes if you replicate at a database level • It can be tuned to near zero latency • Better to have the RepServer on its own host or on the replicate host. • Beware of failure points and how they might affect your application. • The primary and the secondary must be controlled by the same RepServer • Currently limited to 1 primary, 1 warm standby (will change in RepServer 12.6)

What about the client? • Often, HA solutions only include the back-end. • Architectures consider only how quickly we can recover the downed system, but what about the end user? • Some questions to ponder: • How is uptime and availability measured? • If the system was down for 5 minutes but the user couldn't connect for 30, how long was the outage? • What if the system were down, but the user didn't really know or notice? • It's possible today! • Would it be considered an outage if you could do this?

Method #1: OpenClient 12.x • OpenClient 12.x integrates with the HA option of ASE. • Provides client-side failover from the failed ASE server to the surviving ASE server. • ONLY useful if you are using the HA option. • ONLY can be used if you can recompile your applications against OpenClient 12.x • [Diagram: the client is connected to the Primary Server; when the Primary Server fails (x), the connection moves to the Companion Server.]

OpenClient 12.x and HA: How Does It Work? • To support this feature, 2 things need to be done • The first thing is to change the interfaces file. • Typical entry would contain master/query syntax and connectivity info • A new entry is added in the interfaces file at the end • It indicates what server is the failover (companion) server for a primary node • For example: • ASTRO • master tcp ether stewie 5000 • query tcp ether stewie 5000 • hafailover ELROY (indicates where we fail over to) • ELROY • master tcp ether felix 5000 • query tcp ether felix 5000 • If using LDAP, add an entry to the LDAP server containing the same information
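To check which companion each server would fail over to, the interfaces entries above can be read with a short parser. This is a sketch that assumes the simple layout shown in the example (an unindented server name followed by indented `master`, `query`, and `hafailover` lines); it is not a complete parser for every interfaces file format.

```python
def parse_interfaces(text: str) -> dict[str, dict]:
    """Map each server name to its query address and hafailover companion."""
    servers, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if not raw[0].isspace():  # an unindented line starts a server entry
            current = servers.setdefault(line, {"query": None, "hafailover": None})
        elif current is not None:
            parts = line.split()
            if parts[0] == "query":
                current["query"] = (parts[3], int(parts[4]))  # host, port
            elif parts[0] == "hafailover":
                current["hafailover"] = parts[1]
    return servers

SAMPLE = """\
ASTRO
\tmaster tcp ether stewie 5000
\tquery tcp ether stewie 5000
\thafailover ELROY
ELROY
\tmaster tcp ether felix 5000
\tquery tcp ether felix 5000
"""

for name, entry in parse_interfaces(SAMPLE).items():
    print(name, entry)
```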

OpenClient 12.x and HA: How Does It Work (Cont'd)? • The second thing that needs to be done is to re-compile against OpenClient 12.x • There are new properties that need to be addressed to utilize the HA functionality • CS_HAFAILOVER • CS_RET_HAFAILOVER • These are set using ct_config and ct_con_props at the context or connection level • This is only with ctlib (dblib DOES NOT support this functionality) • Client will receive an error 1205 • Client fails over to the server listed as the hafailover server in the interfaces file • Client re-submits in-flight (non-committed) transaction
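The last three bullets describe a pattern the application has to implement: notice the failover, then resubmit the uncommitted work. The sketch below shows that pattern in Python for readability only; real applications would do this in C against Client-Library. `FailoverError`, `connect`, and `work` are hypothetical stand-ins, not Open Client names.

```python
class FailoverError(Exception):
    """Hypothetical stand-in for the failover notification the client receives."""

def run_with_failover(connect, work, max_attempts=2):
    """Run work(conn); if the server fails over mid-transaction, reconnect and resubmit."""
    last_error = None
    for _ in range(max_attempts):
        conn = connect()
        try:
            return work(conn)
        except FailoverError as exc:
            last_error = exc  # the uncommitted transaction is lost; resubmit it
    raise last_error

# Simulated servers: the first connection reaches the primary, the second the companion.
attempts = []

def connect():
    attempts.append(1)
    return "primary" if len(attempts) == 1 else "companion"

def work(conn):
    if conn == "primary":
        raise FailoverError("primary went away mid-transaction")
    return f"committed on {conn}"

print(run_with_failover(connect, work))
```

The important design point is that the whole transaction is wrapped in a resubmittable unit, since anything uncommitted at the time of failure has to be sent again.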

Method #2: Sybase OpenSwitch • Much more flexible than OpenClient 12.x • Does not require recompile of applications • Not tied to HA option of ASE • Can be used against existing and 3rd party applications • Allows for increased flexibility and user management. • User logs directly into OpenSwitch, not into ASE. • OpenSwitch manages the user connection and migrates them when it detects a 'failure'. • Integrates with Business Logic.

Transparent Connection Management • [Diagram: client applications (EAServer, ISQL, PowerBuilder, any Open Client application and platform) connect to OpenSwitch, which routes them to ASE Server A or ASE Server B; an administrator (ISQL) can send an RPC switch request.] • For each incoming connection, OpenSwitch decides where it should go and opens up a new connection • Manual switch capability

HA Coordination • [Diagram: the Application connects through OpenSwitch to ASE Server A; OpenSwitch asks the CM “What do I do?” and the CM responds with an action.] • Coordination Module (CM) provides an API to coordinate with third party HA solutions • This is the “brains” of OpenSwitch • OpenSwitch defers switching decision to CM if present

Typical OpenSwitch Usage Scenario (CM = Coordination Module) • [Diagram: an Application connects through OpenSwitch to servers in New York and Denver.] • OpenSwitch: Connection lost! What do I do? • CM: Check if it is a real failure or a network hiccup • CM: Check if transactions are pending in Rep Server queue • CM: Check if warm-standby is really up and functional • CM: OK, Failover
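The three checks in this scenario amount to a small decision function. The sketch below is not the Coordination Module API; the function name, the return values, and the three probe callables are assumptions used to show the order of the checks.

```python
from typing import Callable

def decide(primary_alive: Callable[[], bool],
           pending_rep_txns: Callable[[], int],
           standby_ok: Callable[[], bool],
           probes: int = 3) -> str:
    """Return 'stay', 'wait', 'hold', or 'failover' for a lost connection."""
    # 1) Real failure or network hiccup? Probe several times before believing it.
    if any(primary_alive() for _ in range(probes)):
        return "stay"
    # 2) Transactions still queued in Replication Server are not yet on the standby.
    if pending_rep_txns() > 0:
        return "wait"
    # 3) Do not move users to a standby that is not up and functional.
    if not standby_ok():
        return "hold"
    return "failover"

print(decide(lambda: False, lambda: 0, lambda: True))  # prints failover
```

Putting the hiccup check first avoids an unnecessary switch, and checking the queue before the standby avoids moving users onto a copy that is still missing committed work.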

The “Pie In The Sky” • This covers all possible areas: physical, logical, and users (and the shortcomings of each). • [Diagram: combined architecture, with switching handled via CM.]

24x7 DBA Administration • Any HA scenario MUST allow for DBA maintenance activities • DBCC • Dump and Load • Update Statistics • Reorg Rebuilds • Upgrades • If it doesn't, then by definition it's not highly available • Simply because to do any of the above actions, you have to take the server down • ... or you might impact performance so much that the system becomes 'pseudo-down'. • How do you do these then????

Well.... • Update stats: always the “Achilles heel” • Attend Eric Miner's ASE 126 class on speeding up Update Stats • Thursday 3:30pm, 90 minutes, Sun “A” • DBCC • Use a physical database recreation scheme (bit level rep, quiesce, etc) • Then run DBCC on the copy • Since it's a physical recreation, errors in the copy will also be in the primary, so you can take action accordingly to fix it. • Reorg • Starting with ASE 12, you can specify parameters around its use • Done at an extent level, doesn't lock the entire table down. • Dump and Load • Religious issue, some sites don't do it since they have multiple copies, or they do this on the replicate.

Well... (cont'd) • Upgrades • Best done with a logical database recovery scheme (like RepServer, etc) • This will allow you to keep both the old and new version in sync, and you can easily fail back to the old version without any data loss

HA Questions To Ask.... • How does HA solution cover … • … host machine failures • … operating system faults • … database failures/corruptions • … datacenter loss • … online maintenance • ... DBMS, OS, Database Schema, User Admin • How well does it handle … • … latency between synchronization • … outage during failover • … client connections • ... connection context (database in use, etc) • ... in-flight transactions
