RAC and 11i - 101. SURENDER SARA NCOAUG Email : SURENDER.SARA@ORABYTE.COM SURENDER.SARA@SERACONSULTING.US. Two Node Architecture, Unprotected. Two Node Architecture, Protected Apps tier and unprotected DB tier. Two Node Architecture, Protected Apps tier and DB tier. Failover Cluster.
Two Node Architecture, Protected Apps tier and unprotected DB tier
Failover Cluster
• Detecting failure by monitoring the heartbeat and checking status of resources
• Reorganizing cluster membership in the cluster manager
• Transferring disk ownership from primary node to secondary node
• Mounting the FS on secondary node
• Starting DB instance
• Recovering the database and rolling back uncommitted data
• Reestablishing the client connections to the failover node
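The first step in the list above, detecting failure from missed heartbeats, can be sketched as a simple timeout check. This is an illustration only: the function name, the timestamps, and the 30-second timeout are assumptions, and none of the cluster products named in this deck is claimed to work exactly this way.

```shell
# Hypothetical helper: decide whether a peer node should be treated as failed.
# Arguments: time of last heartbeat, current time, timeout (all in seconds).
node_state() {
    last_seen=$1
    now=$2
    timeout=$3
    if [ $((now - last_seen)) -gt "$timeout" ]; then
        echo "failed"   # the remaining failover steps would start from here
    else
        echo "alive"
    fi
}

node_state 100 120 30   # 20 seconds of silence, within the timeout
node_state 100 140 30   # 40 seconds of silence, beyond the timeout
```

A real cluster manager also checks resource status and agrees on membership with the surviving nodes before moving disks, which this sketch leaves out.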
FAILOVER CLUSTER OFFERINGS
• Veritas Cluster Server
• HP Service Guard
• Microsoft Cluster Service with Oracle Failsafe
• Red Hat Linux Advanced Server 2.1
• Sun Cluster Oracle Agent
• Compaq, now HP, Segregated Cluster
• HACMP
Real Application Cluster
• Many instances of Oracle running on many nodes
• Multiple instances share a single physical database
• All instances have common data, control, and initialization files
• Each instance has its own log files and rollback segments or undo tablespaces, all kept on the shared storage
• All instances can simultaneously execute transactions against the single database
• Caches are synchronized using Oracle’s Global Cache Management technology (Cache Fusion)
RAC Building Blocks
• Instance and database files
• Shared storage with OCFS, CFS or raw devices
• Redundant HBA cards per host
• Redundant NIC cards per host, one for cluster interconnect and one for LAN connectivity
• Local RAID-protected drives for ORACLE_HOMEs (OCFS does not support ORACLE_HOME install)
CLUSTER INTERCONNECT FUNCTION
• Monitoring health, status and message synchronization
• Transporting Distributed Lock Manager messages
• Accessing remote file system
• Moving application-specific traffic
• Providing cluster alias routing
INTERCONNECT REQUIREMENTS
• Low latency for short messages
• High speed and sustained data rates for large messages
• Low host CPU utilization
• Flow control, error control and heartbeat continuity monitoring
• Switched network that scales well
INTERCONNECT PRODUCTS
• Memory Channel
• SMP Bus
• Myrinet
• Sun SCI
• Gigabit Ethernet
• InfiniBand Interconnect
INTERCONNECT PROTOCOL
• TCP/IP
• UDP
• VIA
• RDG
• HMP
IO CHANNEL HBA PRODUCTS
• Adaptec
• DPT
• LSI Logic
• Interphase
• QLogic
• Emulex
• JNI
FABRIC SWITCHES
• McDATA
• EMC
• QLogic
• Brocade
CLUSTER NODES: NUMA, SMP
• Shared system bus and IO
• Expensive, with scalability problems
• Adding more CPUs can require upgrading architecture components
• Dell and HP-Compaq blade servers
• BladeFrame system from Egenera
• Egenera: 24 two-way and four-way SMP processing resources
• Egenera: redundant central controllers, redundant high-speed interconnects, PAN manager
• Egenera: PAN manager handles external storage mapping and virtualization
• Egenera: PAN manager handles IO and network traffic to and from individual servers
Oracle’s High Availability (HA) Solution Stack
Unplanned Downtime
• System Failure: Real Application Clusters (continuous availability for all applications)
• Data Failure & Disaster: Data Guard (zero data loss)
• Human Error: Flashback Query (enable users to correct their mistakes)
Planned Downtime
• System Maintenance: Dynamic Reconfiguration (capacity on demand without interruption)
• Data Maintenance: Online Redefinition (adapt to change online)
Shared Storage Options
• NFS-mounted storage (NetApp)
• SCSI shared storage with OCFS, OFS, raw devices
• Fibre Channel storage with fabric architecture
11i Steps - 1
• Install Red Hat AS 2.1 on all nodes
• Install 11i as a single-node install on the Apps tier
• Attach shared storage and install drivers for the HBA
11i Steps -2 (install OS patches)
• rpm -Uv tar-1.13.25-9.i386.rpm
• This provides an updated version of tar
• Allows a user to tar files from a running database on OCFS
• Example:
• tar --o_direct -cvf /tmp/backup.tar *
11i Steps -2 (install OS patches)
• rpm -Uv fileutils-4.1-4.2.i386.rpm
• This provides an updated version of cp and dd
• Allows a user to copy files from a running database on OCFS
• Examples:
• cp --o_direct /ocfs/quorum.dbf /tmp/backup/quorum.dbf
• dd o_direct=yes if=/ocfs/quorum.dbf of=/tmp/backup/quorum.dbf
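When many files need copying, the cp example above can be repeated per file. The sketch below only prints the commands rather than running them, because --o_direct exists only in the patched fileutils named on this slide. The function name and the assumption that datafiles end in .dbf are illustrative.

```shell
# Print (not run) one "cp --o_direct" command per .dbf file in a directory.
# Review the output, then run it on a host with the patched fileutils.
print_backup_cmds() {
    src_dir=$1
    dest_dir=$2
    for f in "$src_dir"/*.dbf; do
        [ -e "$f" ] || continue          # skip when no .dbf files match
        echo "cp --o_direct $f $dest_dir/$(basename "$f")"
    done
}

print_backup_cmds /ocfs /tmp/backup
```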
11i Steps -3 Install Oracle-provided RPMs
• ocfs-support-1.0.9-11.i686.rpm
• ocfs-tools-1.0.9-11.i686.rpm
• j2sdk-1_3_1_09-linux-i586.rpm.bin
• unzip-5.50-30.i386.rpm
• zip-2.3-10.i386.rpm
• wu-ftpd-2.6.1-21.i386.rpm
• hangcheck-timer-2.4.9-e.10-0.4.0-2.i686.rpm
• hangcheck-timer-2.4.9-e.10-enterprise-0.4.0-2.i686.rpm
11i steps -3 (interconnect)
• ifconfig eth0:0 192.168.2.100
• route add -host 192.168.2.100 dev eth0:0
• Do this on each node
• Create watchdog file (Oracle installer checks for this to install the cluster option)
• # touch /dev/watchdog
• Set up hangcheck-timer module
• # vi /etc/modules.conf
• options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
• # modprobe hangcheck-timer
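The two hangcheck-timer options combine into a single threshold: as commonly described, the module resets a node once a hang lasts longer than about hangcheck_tick + hangcheck_margin seconds. The arithmetic for this slide's values is sketched below. The variable reset_after is a name chosen for illustration.

```shell
# Values copied from the modules.conf line on this slide.
hangcheck_tick=30
hangcheck_margin=180

# Approximate hang duration, in seconds, after which the module resets the node.
reset_after=$((hangcheck_tick + hangcheck_margin))
echo "node reset after roughly ${reset_after}s of hang"
```

The result, 210 seconds, is the same number this deck uses for MissCount in cmcfg.ora, so the two settings appear to have been chosen together.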
11i steps -5 ocfs.conf
• # ocfstool (from X Windows)
• # ocfs config
• # Ensure this file exists in /etc
• #
• node_name = linux3.home.com
• node_number =
• ip_address = 192.168.1.100
• ip_port = 7000
• comm_voting = 1
• guid = 9D3B77AF2FF26E92E25D00E04CA44B58
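Before copying a file like this to each node, it can help to see which keys have no value. The sketch below is not part of the OCFS tools; the function name is hypothetical, and it only reports empty keys without judging whether an empty value (such as node_number on this slide) is intentional.

```shell
# List the keys from this slide that have no value in an ocfs.conf-style file.
empty_ocfs_keys() {
    conf=$1
    for key in node_name node_number ip_address ip_port comm_voting guid; do
        # A key counts as set when "key = value" has a non-blank value.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*[^[:space:]]" "$conf"; then
            echo "$key"
        fi
    done
}

# Try it on a scratch copy of the slide's content.
conf=$(mktemp)
cat > "$conf" <<'EOF'
node_name = linux3.home.com
node_number =
ip_address = 192.168.1.100
ip_port = 7000
comm_voting = 1
guid = 9D3B77AF2FF26E92E25D00E04CA44B58
EOF
empty_ocfs_keys "$conf"
rm -f "$conf"
```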
11i Steps -6 install OCFS
• mkfs.ocfs -F -b 128 -L /s01 -m /s01 -u 500 -g 500 -p 0755 /dev/sda1
• srvconfig_loc=/s01/oragsd-config (touch this file)
11i steps -7 OCM
Configure and Start Cluster Manager
• $ cd $HOME/product/9.2/oracm/admin
• $ ls
• If cmcfg.ora exists:
• $ cp cmcfg.ora cmcfg.ora.original
• If cmcfg.ora does not exist:
• $ cp cmcfg.ora.tmp cmcfg.ora
• $ echo HostName=dc1node3inter >> cmcfg.ora
• $ vi cmcfg.ora
• [comment out WatchdogSafetyMargin and WatchdogTimerMargin]
• PrivateNodeNames=linux22 linux33
• PublicNodeNames=linux2 linux3
• MissCount=210
• KernelModuleName=hangcheck-timer
• CmDiskFile=/u02/oracm-qourum
• $ vi ocmargs.ora
• [comment out first line, which contains the word “watchdogd”]
• $ cd ../bin
• $ cp ocmstart.sh ocmstart.sh.original
• $ vi ocmstart.sh
• [remove words “watchdog and” from line containing “Sample startup script...”]
• [remove every line containing “watchdogd”, uppercase or lowercase. If it is inside an if/then/fi, remove the whole if/then/fi.]
• $ su - root
• export ORACLE_HOME=/d02/oracle/proddb/9.2.0
• /d02/oracle/proddb/9.2.0/oracm/bin/ocmstart.sh
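The "comment out" edit to cmcfg.ora can also be done without an editor. The sketch below works on a scratch file so it is safe to try; the two Watchdog values in the sample are made up for illustration, and only MissCount=210 comes from this slide.

```shell
# Build a scratch file that stands in for cmcfg.ora.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
WatchdogSafetyMargin=5000
WatchdogTimerMargin=60000
MissCount=210
EOF

# Prefix the two watchdog parameters with '#'; other lines pass through unchanged.
sed -e 's/^\(WatchdogSafetyMargin=.*\)$/#\1/' \
    -e 's/^\(WatchdogTimerMargin=.*\)$/#\1/' "$cfg" > "$cfg.new"
cat "$cfg.new"
```

Writing to a new file and comparing it with the original before replacing it mirrors the slide's habit of keeping a .original copy.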
11i steps -4 (cp/dd - DB files to shared storage)
• cp --o_direct /d03/oracle/proddata/* /s01/oracle/proddata/
• Recreate the controlfile
11i steps 8 - init.ora / spfile
• Create an undo tablespace for each instance
• Enable or disable the redo thread for instance 2 from instance 1, and vice versa
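The statements behind these two bullets might look like the following. This is a sketch under assumptions: the tablespace name, file paths, sizes, and log group numbers are placeholders, and the group numbers must not collide with groups that already exist. The function only prints the SQL so it can be reviewed before being run in SQL*Plus as a privileged user.

```shell
# Print SQL for adding instance N: an undo tablespace, a redo thread, and its enablement.
print_instance_sql() {
    n=$1   # instance / thread number
    cat <<EOF
CREATE UNDO TABLESPACE undotbs${n}
  DATAFILE '/s01/oracle/proddata/undotbs${n}_01.dbf' SIZE 500M;
ALTER DATABASE ADD LOGFILE THREAD ${n}
  GROUP $((n * 10 + 1)) ('/s01/oracle/proddata/redo${n}_01.log') SIZE 50M,
  GROUP $((n * 10 + 2)) ('/s01/oracle/proddata/redo${n}_02.log') SIZE 50M;
ALTER DATABASE ENABLE PUBLIC THREAD ${n};
EOF
}

print_instance_sql 2
```

The reverse operation is ALTER DATABASE DISABLE THREAD followed by the thread number, issued from an instance that is using a different thread.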
11i steps 9 - instance 1
• # RAC-specific Parameters
• #
• #########
• cluster_database = true
• cluster_database_instances = 2
• thread = 1
• instance_number = 1
• instance_name = PRODi1
• service_names = PROD
• local_listener = PRODi1
• remote_listener = PRODi2
11i steps 10 - instance 2
• cluster_database = true
• cluster_database_instances = 2
• thread = 2
• instance_number = 2
• instance_name = PRODi2
• service_names = PROD
• local_listener = PRODi2
• remote_listener = PRODi1
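The two parameter sets differ only in the instance number and in which instance is the peer. A small generator makes that pattern explicit; it is a sketch for the two-node case shown here, and the function name is hypothetical.

```shell
# Print the RAC parameters for instance 1 or 2 of the two-node PROD database.
rac_params() {
    n=$1
    peer=$((3 - n))          # 1 -> 2, 2 -> 1 (valid for two instances only)
    cat <<EOF
cluster_database = true
cluster_database_instances = 2
thread = ${n}
instance_number = ${n}
instance_name = PRODi${n}
service_names = PROD
local_listener = PRODi${n}
remote_listener = PRODi${peer}
EOF
}

rac_params 1
```

The names PRODi1 and PRODi2 used for local_listener and remote_listener must resolve through tnsnames.ora on the database nodes.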
11i Apps tier - 806/iAS tnsnames.ora
PROD = (DESCRIPTION=
  (ADDRESS_LIST =
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux1)(PORT=1521))
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux2)(PORT=1521))
  )
  (CONNECT_DATA=(SERVICE_NAME=PROD)(SERVER=DEDICATED))
)
PRODi2 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=linux2)(PORT=1521))
  (CONNECT_DATA=(INSTANCE_NAME=PRODi2)(SERVICE_NAME=PROD))
)
PRODi1 = (DESCRIPTION=
  (ADDRESS=(PROTOCOL=tcp)(HOST=linux1)(PORT=1521))
  (CONNECT_DATA=(INSTANCE_NAME=PRODi1)(SERVICE_NAME=PROD))
)
Modify DBC file for Failover
• APPS_JDBC_DRIVER_TYPE=THIN
• FND_MAX_JDBC_CONNECTIONS=100 # Set up at Apps tier
• APPS_JDBC_URL=jdbc:oracle:thin:@(DESCRIPTION=
    (ADDRESS_LIST=(LOAD_BALANCE=ON)
      (ADDRESS=(PROTOCOL=TCP)(HOST=linux1)(PORT=1521))
      (ADDRESS=(PROTOCOL=TCP)(HOST=linux2)(PORT=1521)))
    (CONNECT_DATA=(SERVICE_NAME=prod)))
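The JDBC URL grows by one ADDRESS entry per database node, so it can be assembled from a host list. The helper below is an illustrative sketch, not an Oracle utility; it assumes every listener is on port 1521 as on this slide, and it emits the URL on a single line.

```shell
# Build an APPS_JDBC_URL value: first argument is the service name, the rest are hosts.
build_jdbc_url() {
    service=$1
    shift
    addrs=""
    for h in "$@"; do
        addrs="${addrs}(ADDRESS=(PROTOCOL=TCP)(HOST=${h})(PORT=1521))"
    done
    echo "jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS_LIST=(LOAD_BALANCE=ON)${addrs})(CONNECT_DATA=(SERVICE_NAME=${service})))"
}

build_jdbc_url prod linux1 linux2
```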
What can and cannot fail over
• SQL*Plus will fail over using TAF
• JDBC connections will fail over
• Forms runtime connections will not; users will have to reconnect
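For SQL*Plus to fail over with TAF, the connect descriptor needs a FAILOVER_MODE clause. The deck does not show one, so the entry below is an assumption built from the hosts and service used elsewhere in these slides: the alias PROD_TAF and the RETRIES and DELAY values are placeholders. The function prints the entry so it can be reviewed before being added to tnsnames.ora.

```shell
# Print a sample TAF-enabled tnsnames.ora entry (alias and retry values are placeholders).
print_taf_alias() {
    cat <<'EOF'
PROD_TAF = (DESCRIPTION=
  (ADDRESS_LIST=
    (FAILOVER=ON)
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux1)(PORT=1521))
    (ADDRESS=(PROTOCOL=tcp)(HOST=linux2)(PORT=1521))
  )
  (CONNECT_DATA=
    (SERVICE_NAME=PROD)
    (FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=20)(DELAY=5))
  )
)
EOF
}

print_taf_alias
```

TYPE=SELECT lets an open query resume on the surviving instance, while uncommitted transactions are still rolled back and must be reissued.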
Questions And Answers
• Surender.sara@veritiesllc.com
• Contact us for a 5-day live 11i and RAC building workshop