350 likes | 1.17k Views
Oracle High Availability using Veritas Cluster Server (VCS) Chris Lawson Performance Solutions chris_lawson@yahoo.com. VCS INTRODUCTION. Veritas Cluster Server is a popular variety of HA --High Availability architecture. Other examples are HP MC Service Guard, IBM HACMP.
E N D
Oracle High Availability using Veritas Cluster Server (VCS)Chris LawsonPerformance Solutionschris_lawson@yahoo.com
VCS INTRODUCTION • Veritas Cluster Server is a popular variety of HA --High Availability architecture. • Other examples are HP MC Service Guard, IBM HACMP. • Goal: Minimize downtime due to server failure. • Software used: Veritas Database Edition for Oracle 2.1 (includes VCS 1.1.2); Oracle 8.1.6.2, Solaris 2.6
ADVANTAGES OF VCS • Fail-over occurs with no loss of data. There is no need to apply redo logs. • Fail-over criteria is configurable. • Fail-over occurs without DBA intervention, and typically requires just a few minutes.
NODE Y NODE X Local disk Local disk Service Group A Service Group B Virtual IP Virtual IP shared disk shared disk Application Application DATABASE ‘A’ DATABASE ‘B’ LISTENER_A LISTENER_B
ADVANTAGES OF VCS • The clients are automatically routed to the correct node. • Fail-over systems such as used in VCS are proven technology and are widely used. • Future databases can be added to cluster with only moderate effort. • Far simpler to implement than Replication or Oracle Parallel Server. • No modification to application required.
DISADVANTAGES OF VCS • Thorough testing required to ensure that all related applications properly fail-over together. • Higher level of Sys Admin and DBA expertise and time required for setup. • Care must be taken to ensure that nodes are consistently and correctly configured.
DISADVANTAGES OF VCS(continued) • Upon fail-over, to return the database and applications to the original node will require another, brief, interruption in service. • Since the two nodes are physically co-located, a disaster could still cause service interruption. • Note: Products such as Veritas Global Cluster Manager address this issue • Servicing of database is slightly more complex
MONITORING • VCS uses Oracle Enterprise Agents to monitor the health of the Oracle system. • These agents perform monitoring at both the database and Listener levels.
MONITORING RDBMS • Primary check: Confirms that Oracle background processes are present • Secondary check: Perform simple database transaction • Failure of either test will cause fail-over.
LISTENER MONITORING • Primary check: Confirm process is running • Secondary check: Status command • Thus, listener.log should show a ‘status’ command every one minute. • Optionally, failure of either listener test will not cause failover, but rather multiplerestart attempts first.
DATABASE SETUP • Place ALL oracle data files on the shared volume group. • Do not intermix files from different databases. • In oratab file on each node; specify ‘N’ as VCS will control startup. • Optionally, two sets of archive logs may be specified—one set on local, one on shared.
TIPS: RECOVERY TIME • Instance recovery time depends on: • time to roll-forward un-checkpointed transactions PLUS • time to rollback uncommitted transactions: • Roll Forward: Time is proportional to frequency of checkpoints.
TIPS: RECOVERY TIME • Can control checkpoint frequency several ways: via size of redo logs, and parameters such as: • FAST_START_IO_TARGET • LOG_CHECKPOINT_INTERVAL. • Roll Back: ‘Rollback on Demand’ feature (default) drastically reduces this time.
LISTENER SETUP • A listener should be defined for each service group that has an Oracle database. • This allows VCS to try to restart the listener if a failure is detected. • This ‘RestartLimit’ value is not set using hagui; rather, this feature is entered in the file OracleTypes.cf.
LISTENER SETUP • Suggest setting parameter RestartLimit to three so that VCS will try 3 times to restart listener before initiating a failover. • The count is reset when VCS sets this ‘resource’ offline. • A separate listener (and port) for each database is required because VCS will shutdown the listener for a service group upon failover.
CLIENT SETUP • The client tnsnames.ora file specifies the Service Group (virtual) IP address. • Upon failover, this IP address will automatically point to the correct node. • No change to client tnsnames.ora file is required.
VCS BASIC SETUP • The Sys Admin will setup a VCS configuration file, main.cf. • This file lists the various Resource Types that constitute a cluster Service Group. • Some typical Resource Types are: DiskGroup, IP, and Volume. • It is most convenient to use the hagui utility to define the resources for each Service Group.
ORACLE AGENT SETUP • VCS uses two ‘agents’ to monitor both the database and listener. • The agents are defined in the main.cf file using two new Resource Types: Oracle and Sqlnet. • The GUI tool may again be used to configure the agents.
SECONDARY-MONITOR SETUP • Besides checking for the background processes, VCS will attempt to perform a simple update transaction. • This secondary check is automatically enabled when the following Oracle attributes are defined: MonScript [script executed], User, Pword, Table.
SECONDARY-MONITOR SETUP • Thus, for each database to be monitored, create an oracle user for this transaction. Grant minimal privilege. • create a table with column: TIMESTAMP • Insert one row into the table • Confirm that this user can perform simple update of the table.
SECONDARY DB CHECK • For example: • Create user dbcheck identified by dbcheck; • Grant connect, resource to dbcheck; • Connect dbcheck/dbcheck • Create table DBTEST ( TSTAMP DATE); • Insert into DBTEST values (SYSDATE );
TIP: INITIAL SETUP • Good investment: If complications arise, hire Veritas consultant for a few days to perform the basic setup This can easily save weeks of frustration
TIP: INITIAL SETUP • The Sql*net parameter RestartLimit must be manually entered into the VCS configuration file. This will allow VCS to attempt listener restart before failing over. • Note also that the database secondary monitoring feature is automatically enabled when the User/Pword/Table fields are defined.
TIP: DATABASE CHECK • Whenever the database is rebuilt, remember to create the user that performs the secondary database check. • Good idea: Have a script handy to create this user and the associated ‘check’ table. • Otherwise, database check will fail, and failover will occur immediately.
TRAP: PROCESSES PARAM • Setting of parameter is critical! • Else “secondary” database check will fail. This will cause an unexpected failover. • VCS log will indicate that the database “became offline unexpectedly, on its own”
TRAP: SHARED MEMORY • On Solaris 2.6, shared memory and semaphores may not be released • Upon failover involving Shutdown abort, this will leave node unavailable to that instance. • Upon “fail-back,” the instance will be unable to start.
SHARED MEMORY • Detect this error and clear shared memory after a failover has occurred. • Perhaps set cron job to notify DBA. • Detect then correct this problem using the unix commands ipcs and ipcsrm. • A shared memory segment can be matched to a particular instance by the memory size. • Another method: unix command sysresv shows shared memory and semaphores for an instance (see next slide)
SHARED MEMORY export ORACLE_SID=acx sysresv IPC Resources for ORACLE_SID "acx" : Shared Memory: ID KEY 2 0x68972d94 Semaphores: ID KEY 196609 0x0b0f6b85 2 0x0b0f6b86 3 0x0b0f6b87
USEFUL REFERENCES • Veritas VCS Reference Guide • Veritas VCS Oracle Agent Guide • Listserver: • veritas-ha-request@Eng.Auburn.EDU http://support.veritas.com/
Contact Information Chris Lawson chris_lawson@yahoo.com For other papers, see: http://dbspecialists.com/ (see DBA Resources)