OpenVMS Solutions Center Lab Project - Spring 2004: Oracle 9i RAC DT/HA in a distributed OpenVMS Environment, Phase I – Failover

RAC DT/HA – Goals – Phase I
• First: Demonstrate that Oracle 9i RAC continues to run during a simulated network failure, using LAN Failover and failSAFE IP configurations.
• Second: Measure the latency effect of failover when RAC instances are connected over a long distance (100 km).

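As a rough yardstick for the second goal, the delay that distance alone contributes can be estimated before anything is measured. The sketch below assumes a signal speed in optical fiber of about 200,000 km/s (roughly two thirds of the speed of light in vacuum). The function name and the constant are illustrative assumptions, and switch and protocol-stack delays are ignored.

```python
# Rough estimate of propagation delay over a fiber link.
# Assumption: signal speed in fiber is about 200,000 km/s (~2/3 of c).
FIBER_SPEED_KM_PER_S = 200_000.0


def propagation_delay_ms(distance_km: float, round_trip: bool = False) -> float:
    """Return the propagation delay in milliseconds for a fiber run."""
    one_way_ms = distance_km / FIBER_SPEED_KM_PER_S * 1000.0
    return 2 * one_way_ms if round_trip else one_way_ms


if __name__ == "__main__":
    print(f"one way:    {propagation_delay_ms(100):.2f} ms")
    print(f"round trip: {propagation_delay_ms(100, round_trip=True):.2f} ms")
```

Under this assumption a 100 km link adds about 0.5 ms each way, so every synchronous exchange between the two instances pays roughly 1 ms of round-trip delay on top of local processing.
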
RAC DT/HA – What is Failover?
• Oracle RAC failover: the ability to resume work on an alternate instance upon instance failure
• Oracle TAF (Transparent Application Failover): runtime failover that enables client applications to reconnect to the database automatically if the connection fails
• LAN Failover: hardware failover from a failed network interface card (NIC) to another NIC configured as part of the LAN failover set
• failSAFE IP: IP address failover to alternate interfaces

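The idea shared by these definitions, retrying the same work on an alternate target when the first one fails, can be illustrated with a minimal sketch. This is not how Oracle implements TAF (that logic lives inside the Oracle client libraries); the instance names `RAC1` and `RAC2`, the `demo_work` stand-in, and the function name are all hypothetical.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def run_with_failover(instances: Sequence[str], work: Callable[[str], T]) -> T:
    """Try `work` on each instance in order; move on when a connection fails."""
    last_error = None
    for name in instances:
        try:
            return work(name)
        except ConnectionError as exc:
            last_error = exc  # remember the failure and try the next instance
    raise ConnectionError(f"all instances failed: {list(instances)}") from last_error


def demo_work(instance: str) -> str:
    # Hypothetical stand-in for a query; instance "RAC1" is simulated as down.
    if instance == "RAC1":
        raise ConnectionError("RAC1 unreachable")
    return f"query answered by {instance}"


if __name__ == "__main__":
    print(run_with_failover(["RAC1", "RAC2"], demo_work))
```

The caller sees a result as long as one instance in the list answers, which is the behavior the benchmark client relies on when an instance is failed mid-run.
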
RAC DT/HA – Hardware Config
• Two 4-CPU GS160 systems, with a shared cluster system disk and a shared Oracle install disk on an Enterprise Storage Array connected via Fibre SAN A Switch
• DE602-AA (EIA) NICs, using twisted pair on a 100 Mbit LAN, Extreme Summit4 Switch
• 5 DEGPA-SA and 1 DEGXA-SA (EWA-D) NICs, 1 Gbit fiber on a 1 Gbit LAN, Digital Networks DNSwitch 800
• 100 km cable, Gbit SCS, Extreme Summit 7i Switch

RAC DT/HA – Server Config
• OpenVMS 7.3-2, TCPIP 5.4
• Oracle Server 9.2.0.4, with the Oracle patch for bug 3026720: excessive CPU and BUFIO for the LMD0 and SMON processes on systems with more than 2 CPUs
• Running 2 RAC instances in a 2-node cluster
• Requires the INIT<SID>.ORA parameter CLUSTER_INTERCONNECTS to specify an alternate network interface for RAC communication

RAC DT/HA – Client Config
• 9.2 SQLNet client, on a PC running Windows 2000
• Benchmark/load-generating software: Swingbench 2.1f, an ‘unofficial’ Java-based client load-generating tool from Oracle, which generates a load and charts the transactions and response times
• Configured to connect 100 clients, load balanced between the 2 instances, and run 50,000 ‘typical’ Order Entry transactions

RAC DT/HA – Test Plan
• Restore from disk backup before each test run to ensure the same starting point
• Ensure the RAC instances are communicating over the specified network interface
• Run 3 iterations of the same benchmark load while collecting data
• Run benchmark load, no failures
• Run benchmark load, fail instance
• Run benchmark load, fail network connection between instances

RAC DT/HA – Data collection
• T4 running on both nodes, 10-second sampling interval
• Saved Swingbench results after each run
• Executed and saved the output of VMS commands during network failures to see the status of network devices and Oracle processes:
    $ MC LANCP SHOW DEVICE/CHARACTERISTICS LLA0
    $ TCPIP SHOW INTERFACES/FULL
    $ PIPE SHO SYS|SEA TT: ORA_CPU

Tabular Timeline Tracking Tool – T4
• Created by OpenVMS Sustaining Engineers to help diagnose OS functionality. Uses OpenVMS Monitor data, stored in comma-separated value (.csv) files, which can then be used by a variety of applications (spreadsheets, TLViz, etc.)
• Available for download from the web; also shipped with OpenVMS 7.3-2 in the SYS$ETC directory
• http://h71000.www7.hp.com/openvms/products/t4/index.html
• Users can queue data collection and configure the collection frequency
• Helpful in establishing a baseline performance footprint, which can then be used in before-and-after comparisons of system changes
• T4 ‘hooks’ for Oracle and Rdb Server are being created

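A before-and-after comparison of the kind described above can be sketched in a few lines. The example assumes a simplified CSV whose first row holds the column names; real T4 files carry their own header lines and metric names, so the `cpu_busy` column, the function names, and the sample values here are hypothetical and are not data from this project.

```python
import csv
import io
from statistics import mean


def column_mean(csv_text: str, column: str) -> float:
    """Average one numeric column of a CSV whose first row holds column names."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return mean(float(row[column]) for row in rows)


def percent_change(baseline: float, after: float) -> float:
    """Relative change from baseline to after, in percent."""
    return (after - baseline) / baseline * 100.0


# Made-up samples for illustration only; not data from this project.
BASELINE = "time,cpu_busy\n10:00:00,40\n10:00:10,44\n10:00:20,42\n"
AFTER = "time,cpu_busy\n11:00:00,50\n11:00:10,52\n11:00:20,54\n"

if __name__ == "__main__":
    before = column_mean(BASELINE, "cpu_busy")
    after = column_mean(AFTER, "cpu_busy")
    print(f"baseline {before:.1f}, after {after:.1f}, "
          f"change {percent_change(before, after):+.1f}%")
```

Comparing averages over the same load is the simplest form of baseline check; plotting the two timelines side by side shows when a difference occurs, not just how large it is.
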
RAC DT/HA – LAN Failover DCL
$ MCR LANCP SHOW DEVICE/CHAR LLA0

Before NIC ‘fails’:
    Device Characteristics LLA0:
     Value   Characteristic
    ------   --------------
       256   Max receive buffers
       Yes   Full duplex enable
       . .
       . .
      1000   Line speed (mbps)
    "EWB0"   Failover device
    "EWA0"   Failover device (active)
       . .
       . .
         0   Failover priority

After NIC ‘fails’:
    Device Characteristics LLA0:
     Value   Characteristic
    ------   --------------
       256   Max receive buffers
       Yes   Full duplex enable
       . .
       . .
      1000   Line speed (mbps)
    "EWB0"   Failover device (active)
    "EWA0"   Failover device
       . .
       . .
         0   Failover priority

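When this output is saved during a test run, the active member of the failover set can be picked out automatically. The sketch below assumes the saved text keeps the `"device" Failover device (active)` wording shown above; the function names and the regular expression are illustrative assumptions, and the sample strings are excerpts of the two listings.

```python
import re
from typing import List, Optional, Tuple

# Matches lines such as:  "EWA0"  Failover device (active)
_FAILOVER_LINE = re.compile(
    r'^\s*"(?P<dev>[^"]+)"\s+Failover device(?P<active>\s+\(active\))?\s*$'
)


def failover_members(lancp_output: str) -> List[Tuple[str, bool]]:
    """Return (device, is_active) for every failover-set member listed."""
    members = []
    for line in lancp_output.splitlines():
        match = _FAILOVER_LINE.match(line)
        if match:
            members.append((match.group("dev"), match.group("active") is not None))
    return members


def active_device(lancp_output: str) -> Optional[str]:
    """Return the device marked (active), or None if no member is marked."""
    for dev, is_active in failover_members(lancp_output):
        if is_active:
            return dev
    return None


# Excerpts of the two listings above.
BEFORE = '''
  1000   Line speed (mbps)
"EWB0"   Failover device
"EWA0"   Failover device (active)
     0   Failover priority
'''
AFTER = '''
  1000   Line speed (mbps)
"EWB0"   Failover device (active)
"EWA0"   Failover device
     0   Failover priority
'''

if __name__ == "__main__":
    print("before:", active_device(BEFORE))
    print("after: ", active_device(AFTER))
```

Running the check on each saved capture gives a simple timeline of which NIC carried the traffic at each point in the test.
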
RAC DT/HA – T4 data, LAN Failover EWA/B
[Chart not reproduced; annotations: "EWA0 cable pulled", "EWB0 cable pulled"]

RAC DT/HA – failSAFE IP DCL
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
    default      161.114.69.1   UGS     0     7999  IE0
    10.4.4/24    10.4.4.2       U     274   408185  WE3
    10.4.4/24    10.4.4.3       U     274   445714  WE4
    10.4.4.2     10.4.4.2       UHL     0        0  WE3
    10.4.4.3     10.4.4.3       UHL     0       14  WE4
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    failSAFE IP Addresses:
     inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
    *inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    failSAFE IP Addresses:
     inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
    *inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500

RAC DT/HA – failSAFE IP DCL Failed 1
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
    default      161.114.69.1   UGS     0     7999  IE0
    10.4.4/24    10.4.4.2       U     274   408185  WE3
    10.4.4/24    10.4.4.3       U     274   445714  WE4
    10.4.4.2     10.4.4.2       UHL     0        0  WE3
    10.4.4.3     10.4.4.3       UHL     0       14  WE4
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    *failSAFE IP - interface is in a failed state
    failSAFE IP Addresses:
     inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
    *inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE4)
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    *inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
     inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500

RAC DT/HA – failSAFE IP DCL Failed 2
$ TCPIP SHOW INTERFACE/FULL
Route Tree for Protocol Family 2:
    default      161.114.69.1   UGS     0     7999  IE0
    10.4.4/24    10.4.4.2       U     274   408185  WE3
    10.4.4/24    10.4.4.3       U     274   445714  WE4
    10.4.4.2     10.4.4.2       UHL     0        0  WE3
    10.4.4.3     10.4.4.3       UHL     0       14  WE4
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    *inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
     inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    *failSAFE IP - interface is in a failed state.
    failSAFE IP Addresses:
     inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE3)
    *inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE3)

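The failed-state listings mark the affected interface with the line `failSAFE IP - interface is in a failed state`. A saved capture can be scanned for that marker to report which interfaces were down at the time. The sketch below assumes each interface section starts with a `NAME: flags=` line, as in the listings above; the function name and the parsing rules are illustrative assumptions, and the sample is an excerpt of the "Failed 1" listing.

```python
import re
from typing import List

# An interface section starts with a line such as "WE3: flags=c43<...>".
_IFACE = re.compile(r"^\s*(?P<name>\w+):\s+flags=")
_FAILED = "failSAFE IP - interface is in a failed state"


def failed_interfaces(show_output: str) -> List[str]:
    """Return the interfaces whose section carries the failed-state marker."""
    failed = []
    current = None
    for line in show_output.splitlines():
        match = _IFACE.match(line)
        if match:
            current = match.group("name")
        elif _FAILED in line and current is not None and current not in failed:
            failed.append(current)
    return failed


# Excerpt of the "Failed 1" listing above.
FAILED_1 = """
WE3: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    *failSAFE IP - interface is in a failed state
    failSAFE IP Addresses:
     inet 10.4.4.3 netmask ffffff00 broadcast 161.114.69.63 (on QBB3 WE4)
    *inet 10.4.4.2 netmask ffffff00 broadcast 10.4.4.255 (on QBB3 WE4)
WE4: flags=c43<UP,BROADCAST,RUNNING,MULTICAST,SIMPLEX>
    *inet 10.4.4.3 netmask ffffff00 broadcast 10.4.4.255 ipmtu 1500
     inet 10.4.4.2 netmask ffffff00 broadcast 161.114.69.63 ipmtu 1500
"""

if __name__ == "__main__":
    print(failed_interfaces(FAILED_1))
```

An empty result means no interface in the capture was flagged, which is the expected state before a cable is pulled and after it is reconnected.
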
RAC DT/HA – T4 data, failSAFE IP
[Chart not reproduced; annotations: "EWD0 cable pulled", "EWE0 cable pulled"]

RAC DT/HA – Load Generation Data 50k Transactions, no RAC or Network Failure
RAC DT/HA – Load Generation Data 50k Transactions, Network failover
RAC DT/HA – Load Generation Data 50k Transactions, 1 RAC instance failed
RAC DT/HA – Conclusions
• RAC ran without apparent problems when the network was configured to use LAN Failover or failSAFE IP (on the same node).
• Using the Oracle init.ora parameter CLUSTER_INTERCONNECTS appears to have a definite distributing effect on network traffic.

RAC DT/HA – Phase II and III
• Phase II: Configure an Oracle 9i RAC 2-node cluster using RAID-1 shadow sets for database and log files, and test the recently released Host Based Mini-Merge (HBMM) functionality in a variety of configurations.
• Refer to: http://h71000.www7.hp.com/news/hbmm.htm
• Phase III: Distribute the nodes in the cluster over a distance of 100 km or more and test failover and HBMM functionality.

RAC DT/HA – References
• OpenVMS Technical Journal:
• Matt Muggeridge, July 2003, V2 article: Configuring TCP/IP for High Availability. http://h71000.www7.hp.com/openvms/journal/v2/articles/tcpip.pdf
• Steve Lieman, January 2004, V3 article: TimeLine-Driven Collaboration with T4 & Friends: A Time-saving Approach to OpenVMS Performance. http://h71000.www7.hp.com/openvms/journal/v3/t4.pdf

RAC DT/HA – References (cont’d)
• TCPIP docs: http://h71000.www7.hp.com/doc/tcpip54.html
• OpenVMS docs: http://h71000.www7.hp.com/doc/os732_index.html
• HP TCP/IP Services for OpenVMS Management: Chapter 5, Configuring and Managing failSAFE IP. http://h71000.www7.hp.com/doc/732final/documentation/pdf/aa-lu50m-te.pdf

RAC DT/HA – References (cont’d)
• HP OpenVMS System Management Utilities Reference Manual: Chapter 13, LAN Control Program (LANCP) Utility. http://h71000.www7.hp.com/doc/732FINAL/DOCUMENTATION/PDF/aa-pv5ph-tk.PDF
• HP OpenVMS System Manager’s Manual, Volume 2 - Tuning, Monitoring, and Complex Systems: Chapter 10, Managing the Local Area Network (LAN) Software. http://h71000.www7.hp.com/doc/732FINAL/aa-pv5nh-tk/aa-pv5nh-tk.pdf

RAC DT/HA – References (cont’d)
• Oracle references:
• Swingbench: an ‘unofficial’ load-generating benchmarking tool, developed in Java, which generates a load and charts the transactions and response times. http://www.dominicgiles.com/swingbench.php
• OTN (otn.oracle.com): Real 24/7: Use Oracle9i RAC and TAF to guarantee availability. http://otn.oracle.com/oramag/oracle/02-may/o32clusters.html

RAC DT/HA – References (cont’d)
• Oracle Metalink articles (metalink.oracle.com):
• Note 183340.1 - Frequently Asked Questions About the CLUSTER_INTERCONNECTS Parameter in 9i
• Note 220970.1 - "Which network is Oracle using for RAC traffic?"
• Note 162725.1 - OPS/RAC VMS: Using alternate TCP Interconnects on 8i OPS and 9i RAC on OpenVMS
• Note 226880.1 - Configuration of Load Balancing and Transparent Application Failover

OpenVMS Solutions Lab
• Available to customers to test new hardware, software, and applications
• Alpha and Integrity systems available for use
• To get the most benefit from the Lab, customers are expected to arrive with an exact list of hardware and software requirements, a test plan, and goals