LCG
Oracle RAC – adding and removing cluster nodes
WLCG Service Reliability Workshop, CERN, November 30th, 2007
Jacek Wojcieszuk, CERN IT

Part I: Node addition
RAC – adding and removing nodes – WLCG Service Reliability Workshop, Nov 2007

Methods
• Silent cloning procedures
• Enterprise Manager Grid Control cloning and adding instance
• addNode.sh and DBCA
• Manual

DEMO
The demo and the command examples in this presentation use the following names:
• demorac: cluster name
• itrac18 (demorac1, +ASM1), itrac19 (demorac2, +ASM2): existing cluster nodes/instances
• itrac20 (demorac3, +ASM3): node/instances being added to the cluster

Steps
• Set up hardware for the new node
• Install and configure the OS
• Configure account equivalency
• Add Oracle Clusterware to the new node
• Configure ONS for the new node
• Add the ASM and DB home to the new node
• Configure a listener on the new node
• Add ASM and database instances

Hardware setup
• The new cluster node should be:
    • of a similar type to the other cluster nodes (the same architecture, e.g. x86_64)
    • of a similar size to the other cluster nodes
    • physically connected to the shared storage used by the cluster
    • physically connected to the cluster private network(s)

OS installation and configuration
• Install the OS on the new node
    • The same flavor and version as installed on the other nodes (e.g. RHEL 4.0)
    • The same kernel version is recommended
• Make sure that all required packages are installed
    • Ideally the list of software packages should be identical on all cluster nodes
• Configure kernel parameters and shell limits
• Create Oracle-related OS groups and users
• Configure network interfaces
    • On one of the old cluster nodes edit the /etc/hosts file, adding entries related to the new node; distribute this file to all cluster nodes
• Configure the hangcheck timer module
    # as root on the new node:
    echo "/sbin/modprobe hangcheck-timer" >> /etc/rc.d/rc.local
    echo "session required /lib/security/pam_limits.so" >> /etc/pam.d/login
    /sbin/modprobe hangcheck-timer

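The echo commands on this slide append a line every time they run, so repeating the setup leaves duplicate entries in rc.local and pam.d/login. A minimal sketch of an idempotent variant follows. The helper name append_once is hypothetical (it is not part of any Oracle tooling), and the demonstration writes to a scratch file rather than to the real system files.

```shell
#!/bin/bash
# append_once LINE FILE: append LINE to FILE only if FILE does not already
# contain an identical line. (Hypothetical helper for illustration.)
append_once() {
    local line="$1"
    local file="$2"
    # -q: quiet, -x: match the whole line, -F: fixed string, not a regex
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration against a scratch file; on a real node the targets would be
# /etc/rc.d/rc.local and /etc/pam.d/login, and the script would run as root.
demo_rc_local="$(mktemp)"
append_once "/sbin/modprobe hangcheck-timer" "$demo_rc_local"
append_once "/sbin/modprobe hangcheck-timer" "$demo_rc_local"   # no-op the second time
cat "$demo_rc_local"
```

Running the two calls leaves a single modprobe line in the scratch file.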
Account equivalency
• Passwordless access from one Oracle software owner account to another has to be configured
    • Needed only during software installation
    • The easiest way is to re-configure the equivalency for the whole cluster
    • Use the sshUserSetup.sh script available on Metalink
    # as oracle user from one of the cluster nodes
    ./sshUserSetup.sh -hosts "itrac18 itrac18.cern.ch itrac19 itrac19.cern.ch itrac20 itrac20.cern.ch" -advanced
• The cluvfy tool can be used to verify that the new machine is ready for clusterware installation
    # as oracle user from one of the existing cluster nodes
    $ORA_CRS_HOME/bin/cluvfy stage -pre crsinst -n itrac18,itrac19,itrac20 -r 10gR2 -osdba ci -orainv ci

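The -hosts argument above lists every node twice, once by short name and once fully qualified, which is easy to mistype by hand. A small sketch that builds the string from the short names is shown here. The function name build_hosts_arg is hypothetical, and the script only prints the sshUserSetup.sh command instead of running it.

```shell
#!/bin/bash
# build_hosts_arg DOMAIN HOST...: print "host host.DOMAIN" pairs for every
# HOST, separated by single spaces. (Hypothetical helper for illustration.)
build_hosts_arg() {
    local domain="$1"
    shift
    local out=""
    local h
    for h in "$@"; do
        out="${out:+$out }$h $h.$domain"
    done
    printf '%s\n' "$out"
}

hosts="$(build_hosts_arg cern.ch itrac18 itrac19 itrac20)"
# Print the command for review rather than executing it.
echo "./sshUserSetup.sh -hosts \"$hosts\" -advanced"
```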
Adding Oracle Clusterware and configuring ONS
• Run the addNode.sh script from an existing cluster node
    cd $ORA_CRS_HOME/oui/bin; ./addNode.sh
• After Oracle Universal Installer is started, follow the wizard:
    • Provide information related to the node being added
    • Launch the installation
    • Run the scripts as root on the new and old nodes as requested by the pop-up window
• Close OUI and verify that the nodeapps associated with the new node have been properly added to the cluster registry
    • Use the crs_stat command
• Configure the new ONS:
    # From one of the old cluster nodes run:
    cat $ORA_CRS_HOME/opmn/conf/ons.config | grep remoteport
    remoteport=6200
    cd $ORA_CRS_HOME/bin
    ./racgons add_config itrac20:6200

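The ONS step reads the remote port from ons.config and then types it into the racgons command. The sketch below extracts the port programmatically so the two cannot drift apart. The function name ons_remote_port is hypothetical, and the scratch file only stands in for $ORA_CRS_HOME/opmn/conf/ons.config; its contents are illustrative sample data.

```shell
#!/bin/bash
# ons_remote_port FILE: print the numeric value of the first
# "remoteport=" line in FILE. (Hypothetical helper for illustration.)
ons_remote_port() {
    sed -n 's/^remoteport=\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | head -n 1
}

# Scratch file standing in for the real ons.config (sample contents).
demo_conf="$(mktemp)"
printf 'localport=6100\nremoteport=6200\n' > "$demo_conf"

port="$(ons_remote_port "$demo_conf")"
# Print the command for review rather than executing it.
echo "./racgons add_config itrac20:$port"
```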
Adding ASM and DB home (addNode script)
• Depending on the chosen configuration, this can be done in one step or two (two if ASM uses a separate software home)
• Run the addNode script on an existing cluster node
    cd $ORACLE_HOME/oui/bin; ./addNode.sh
• After Oracle Universal Installer is started, follow the wizard
    • Tick the node that you want to add (it should be ticked by default)
    • Install the software
    • Run the root.sh script as requested by the pop-up window

Adding ASM and DB home (cloning)
• Go to an existing cluster node and tar the ORACLE_HOME directory
    # on the first node of the cluster
    cd $ORACLE_HOME/..
    tar cvfp rdbms.tar rdbms
• Copy the created tar file to the new node and unpack it
    # on the new node
    cd $ORACLE_HOME/..
    scp oracle@itrac18:$PWD/rdbms.tar .
    tar xvfp rdbms.tar
• Remove instance-specific files (pfiles and password files) from the $ORACLE_HOME/dbs directory
• Run the cloning procedure
    # on the new node
    cd $ORACLE_HOME/clone/bin
    perl clone.pl ORACLE_HOME="/ORA/dbs01/oracle/product/10.2.0/rdbms" ORACLE_HOME_NAME="OraDb10g_rdbms" '-O"CLUSTER_NODES={itrac18,itrac19,itrac20}"' '-O"LOCAL_NODE=itrac20"'

Listener configuration
• From the new node run netca to configure a listener:
    • Choose 'Cluster configuration'
    • Select the new node only
    • Choose 'Listener configuration'
    • Choose 'Add'
    • Leave the default listener name
    • Leave the default protocol choice intact
    • Specify the proper port
    • Complete the configuration
• Check with the crs_stat command that the new listener has been registered with the clusterware
• Edit the listener.ora and tnsnames.ora files on the new node:
    • You may want to remove extproc-related entries
    • In tnsnames.ora add/edit entries related to the new listener
• Distribute the updated tnsnames.ora file to all nodes of the cluster

Adding ASM and database instances (with DBCA)
• Although this procedure is recommended by Oracle, we have found it quite unreliable and therefore strongly recommend the manual procedure described later
• Run DBCA from an old cluster node:
    • Choose 'Oracle Real Application Clusters database'
    • Choose 'Instance Management'
    • Choose 'Add an instance'
    • Specify the SYS username and password
    • Select the name of the new node
    • Modify custom service definitions
    • Agree to extend ASM to the new node
    • Complete the operation

Adding ASM and database instances (manually)
• Edit the /etc/oratab file on the new node
    • Add entries describing the new ASM and DB instances
• Create dump directories for the ASM and DB instances as defined by the initialization parameters
• Create the ASM pfile in the default location ($ORACLE_HOME/dbs)
• Create a password file for ASM
    orapwd file=$ORACLE_HOME/dbs/orapw+ASM3 password=xxxx
• Connect to an existing ASM instance and define values for the init parameters specific to the new instance
    export ORACLE_SID=+ASM1; sqlplus / as sysdba
    ASM> alter system set instance_number=3 scope=spfile sid='+ASM3';
    ASM> alter system set local_listener='LISTENER_+ASM3' scope=spfile sid='+ASM3';
• Define the ASM instance in the cluster registry and start it up
    srvctl add asm -n itrac20 -i +ASM3 -o $ORACLE_HOME
    srvctl start asm -n itrac20

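The slide asks for /etc/oratab entries without showing them. An oratab line has three colon-separated fields: SID, Oracle home, and a Y/N startup flag. The sketch below prints the two entries for the new node. The function name oratab_entry is hypothetical, the home path is borrowed from the clone.pl example in this presentation, and the N flag is an assumption (instances are started by the clusterware rather than by dbstart).

```shell
#!/bin/bash
# oratab_entry SID HOME: print one /etc/oratab line in the form
# SID:ORACLE_HOME:N. (Hypothetical helper; the N flag is an assumption.)
oratab_entry() {
    printf '%s:%s:N\n' "$1" "$2"
}

# Home path taken from the cloning example; adjust to the local layout.
home="/ORA/dbs01/oracle/product/10.2.0/rdbms"
oratab_entry "+ASM3" "$home"
oratab_entry "demorac3" "$home"
```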
Adding ASM and database instances (manually) (2)
• Connect to an existing database instance
    • Create an undo tablespace and a redo log thread for the new RAC node
    • Set the init parameters specific to the new instance
    export ORACLE_SID=demorac1; sqlplus / as sysdba
    SQL> create undo tablespace undotbs3 datafile size 1G autoextend on next 1G maxsize 30G;
    SQL> alter database add logfile thread 3 group 31 size 512m;
    SQL> alter database add logfile thread 3 group 32 size 512m;
    SQL> alter database enable public thread 3;
    SQL> alter system set instance_number=3 scope=spfile sid='demorac3';
    SQL> alter system set thread=3 scope=spfile sid='demorac3';
    SQL> alter system set undo_tablespace='UNDOTBS3' scope=spfile sid='demorac3';
    SQL> alter system set local_listener='LISTENER_DEMORAC3' scope=spfile sid='demorac3';
• Create the DB pfile in the default location ($ORACLE_HOME/dbs)
• Create a password file for the DB instance
    orapwd file=$ORACLE_HOME/dbs/orapwdemorac3 password=xxxx

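Every statement on this slide is parameterised by the same instance number, so the set can be generated rather than retyped for each new node. The sketch below is one way to do that. The function name gen_instance_sql is hypothetical, and the redo group numbering (instance number followed by 1 and 2, giving 31 and 32) is an assumption inferred from the example. Sizes are copied from the slide. The output is meant to be reviewed before being fed to sqlplus.

```shell
#!/bin/bash
# gen_instance_sql DB N: print the SQL statements that prepare instance
# number N of database DB. (Hypothetical helper; group numbering N1/N2 is
# an assumption based on the example in the slide.)
gen_instance_sql() {
    local db="$1"
    local n="$2"
    local sid="${db}${n}"
    local upper_sid
    upper_sid="$(printf '%s' "$sid" | tr '[:lower:]' '[:upper:]')"
    cat <<EOF
create undo tablespace undotbs${n} datafile size 1G autoextend on next 1G maxsize 30G;
alter database add logfile thread ${n} group ${n}1 size 512m;
alter database add logfile thread ${n} group ${n}2 size 512m;
alter database enable public thread ${n};
alter system set instance_number=${n} scope=spfile sid='${sid}';
alter system set thread=${n} scope=spfile sid='${sid}';
alter system set undo_tablespace='UNDOTBS${n}' scope=spfile sid='${sid}';
alter system set local_listener='LISTENER_${upper_sid}' scope=spfile sid='${sid}';
EOF
}

gen_instance_sql demorac 3
```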
Adding ASM and database instances (manually) (3)
• Define the clusterware targets related to the new database instance
    srvctl add instance -d demorac -i demorac3 -n itrac20
    srvctl modify instance -d demorac -i demorac3 -s +ASM3
• Start the new instance using the srvctl tool
    srvctl start instance -d demorac -i demorac3
• Verify with the crs_stat tool that all the clusterware targets are up

Final steps
• Install the OEM agent
• Redefine services if needed
    srvctl stop service -d demorac -s demorac_lb
    srvctl modify service -d demorac -s demorac_lb -i demorac1,demorac2,demorac3 -n
    srvctl start service -d demorac -s demorac_lb
• Update your public tnsnames.ora or LDAP
• During the next maintenance you may want to increase the value of the cluster_database_instances initialization parameter

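The preferred-instance list passed to srvctl modify service grows by one name per node. Assuming instances follow the naming used in this presentation (database name plus a running number), the list can be derived from the instance count, as in the sketch below. The function name instance_list is hypothetical, and the srvctl command is only printed, with the same options as on the slide.

```shell
#!/bin/bash
# instance_list DB COUNT: print DB1,DB2,...,DBCOUNT as a comma-separated
# list. (Hypothetical helper; assumes the demoracN naming convention.)
instance_list() {
    local db="$1"
    local count="$2"
    local i=1
    local out=""
    while [ "$i" -le "$count" ]; do
        out="${out:+$out,}${db}${i}"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Print the command for review rather than executing it.
echo "srvctl modify service -d demorac -s demorac_lb -i $(instance_list demorac 3) -n"
```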
Part II: Node removal

Steps
• Stop and delete the DB instance on the node to be deleted
• Clean up the ASM instance
• Remove the listener from the node to be deleted
• Remove RAC and ASM software from oraInventory
• Remove ONS configuration from the node
• Remove the node from the clusterware

Deleting DB instance on the node to be deleted from the cluster
• Redefine services so that they do not use the node being deleted (using the srvctl tool)
• From a node that is not being deleted run DBCA:
    • Choose 'Oracle Real Application Clusters database'
    • Choose 'Instance Management'
    • Choose 'Delete an instance'
    • Provide SYS credentials
    • Select the instance to delete
    • Proceed with the deletion
• Verify with the crs_stat command that the instance has been removed from the cluster registry

Cleaning up the ASM instance
• Stop the ASM instance
    # on any cluster node
    srvctl stop asm -n itrac20
• Remove the ASM instance
    # on any cluster node
    srvctl remove asm -n itrac20
• Check with the crs_stat command that the ASM instance has been removed from OCR
• On the node being deleted edit the /etc/oratab file
    • Remove entries related to the removed DB and ASM instances

Removing the listener
• Run netca on the node being deleted:
    • Choose 'Cluster configuration'
    • Select the node being deleted
    • Choose 'Listener configuration'
    • Choose 'Delete'
    • Complete the deletion
• Check with the crs_stat tool that the listener has disappeared from OCR

Removing database software
• Update oraInventory on the node being removed
    # on the node being deleted:
    $ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME CLUSTER_NODES=itrac20 -local
• Uninstall the DB software using OUI
    # on the node being deleted:
    $ORACLE_HOME/oui/bin/runInstaller
• Update oraInventory on the other cluster nodes
    # on the first node of the cluster
    $ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME CLUSTER_NODES=itrac18,itrac19
• If you use a separate software installation for ASM, repeat the steps above for the ASM home

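The last updateNodeList call needs the list of nodes that remain in the cluster, which is the full node list minus the node being deleted. A minimal sketch of that calculation follows. The function name remaining_nodes is hypothetical, and the runInstaller command is only printed, with the same arguments as on the slide.

```shell
#!/bin/bash
# remaining_nodes ALL DROP: print the comma-separated list ALL with the
# node DROP removed. (Hypothetical helper for illustration.)
remaining_nodes() {
    # One node per line, drop exact matches of $2, join with commas again.
    printf '%s\n' "$1" | tr ',' '\n' | grep -vxF -- "$2" | paste -sd, -
}

nodes="$(remaining_nodes "itrac18,itrac19,itrac20" "itrac20")"
# Print the command for review rather than executing it.
echo "\$ORACLE_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=\$ORACLE_HOME CLUSTER_NODES=$nodes"
```

The exact, whole-line match means that removing itrac2 would not accidentally remove itrac20.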
Removing ONS configuration and cluster software
• Remove the ONS configuration
    # From the first node of the cluster run:
    $ORA_CRS_HOME/bin/racgons remove_config itrac20:6200
• Delete the clusterware software from the node
    # On the node to be removed, as root, run:
    $ORA_CRS_HOME/install/rootdelete.sh
• Run the rootdeletenode.sh script as root from the first node of the cluster to complete the clusterware removal
    # On the first node of the cluster, as root:
    $ORA_CRS_HOME/install/rootdeletenode.sh itrac20

Removing cluster software (2)
• Update oraInventory on the node being removed
    # on the node being deleted:
    $ORA_CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORA_CRS_HOME CLUSTER_NODES=itrac20 CRS=true -local
• Uninstall the Clusterware software using OUI
    # on the node being deleted:
    $ORA_CRS_HOME/oui/bin/runInstaller
• Update oraInventory on the other cluster nodes
    # on the first node of the cluster:
    $ORA_CRS_HOME/oui/bin/runInstaller -updateNodeList ORACLE_HOME=$ORA_CRS_HOME CLUSTER_NODES=itrac18,itrac19 CRS=true
• Verify with the crs_stat command that the node has been properly removed from the cluster

Final steps (optional)
• Remove binaries left on the removed node
• Deconfigure OS oracle account equivalency
• Stop and remove the OEM agent from the node
• Deconfigure private network interfaces