RAC Basics
Julian Dyke, Independent Consultant
Web Version - February 2008
juliandyke.com
© 2008 Julian Dyke
Agenda
• Real Application Clusters
• The Theory
• The Reality
RAC Redundancy
• Single Point of Failure
  • If a component fails, the system will be inaccessible
• Redundancy
  • Duplicate components
  • If a component fails, another can be used
  • Active-Active or Active-Passive
• Examples include
  • Power Supplies
  • RAID
  • Bonded Networks
  • IO Multipathing
  • Oracle RAC
[Diagram: RAC 4-node cluster. Node 1 to Node 4 each run one instance (Instance 1 to Instance 4). Labels: Public Network, Private Network (Interconnect), Storage Network, Shared Storage.]
RAC Cache Coherency
• RAC must ensure changes made by any instance
  • Are not overwritten by another instance
  • Maintain ACID properties
• Current Blocks
  • Blocks can be updated by any instance
  • Only current version of a block can be updated
  • Only one current version of a block can exist across all instances
• Consistent Read Blocks
  • Can have theoretically unlimited number of consistent versions of a block
    • in each instance
    • across all instances
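The current-block and consistent-read rules above can be illustrated with a toy model. This is only a sketch of the invariant, not how Oracle implements cache coherency; the `Block` class, its fields and the integer version counter are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    """Toy model of one data block shared by several instances (hypothetical)."""
    value: int = 0
    version: int = 0                       # bumped on every update
    current_holder: Optional[str] = None   # at most one instance holds the current version
    cr_copies: dict = field(default_factory=dict)  # instance -> list of (version, value)

    def update(self, instance: str, new_value: int) -> None:
        # Only the current version may be updated, so it first moves to the
        # updating instance. The previous holder keeps a read-only copy.
        if self.current_holder not in (None, instance):
            self.cr_copies.setdefault(self.current_holder, []).append(
                (self.version, self.value)
            )
        self.current_holder = instance
        self.version += 1
        self.value = new_value

    def consistent_read(self, instance: str) -> tuple:
        # Any instance may keep any number of consistent-read copies.
        copy = (self.version, self.value)
        self.cr_copies.setdefault(instance, []).append(copy)
        return copy
```

After two updates from different instances, only the second instance holds the current version, while the first retains an older consistent-read copy.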
RAC Cluster Manager
• All clusters must have cluster management software
  • Manages node membership and evictions
• Oracle Clusterware
  • Mandatory for RAC in Oracle 10.1 and above
  • Known as Cluster Ready Services (CRS) in 10.1 only
  • Can be combined with vendor clusterware
    • IBM HACMP
    • HP ServiceGuard
    • Sun Cluster
  • Must be running before ASM/RDBMS instances can be started on a node
  • Can be used with non-RAC databases and applications
    • Oracle 10.2 and above
RAC Interconnect
• Used for inter-node communication by:
  • Oracle Clusterware
  • ASM Instances
  • RDBMS Instances
• Ideally high bandwidth / low latency
• Typically 1Gb (Gigabit) Ethernet
  • Uses TCP / UDP protocols
  • NICs often bonded for availability
• Other physical networks supported e.g. InfiniBand
RAC Shared Storage
• Required for:
  • Oracle Clusterware Files
    • Oracle Cluster Registry (OCR)
    • Voting Disk
  • Database Files
    • Control Files
    • Database
    • Online Redo Logs
    • Server Parameter File
• Strongly recommended for
  • Archived redo logs
  • Backup copies
RAC Shared Storage
• Can use:
  • Storage Area Network (SAN) e.g.:
    • EMC Clariion / Symmetrix
    • HP MSA / EVA / XP series
    • Hitachi
    • Fujitsu
  • Network Attached Storage (NAS) e.g.:
    • Network Appliance
    • Pillar Data Systems
    • Sun StorageTek
    • EMC Celerra
  • JBOD (with ASM)
RAC Shared Storage
• Fibre Channel
  • SCSI protocol - block based
  • Normally 2Gb or 4Gb
  • Requires one or more Host Bus Adapters (HBA) per node
  • Requires fabric switches
• iSCSI
  • SCSI protocol - block based
  • Packets sent over dedicated IP network
  • Can use standard network components
  • Processing often offloaded to NIC firmware
• NFS
  • File-based
  • Uses standard network components
RAC Shared Storage
• Cluster-aware File Systems:
  • Automatic Storage Management
  • Cluster File Systems
    • Oracle Cluster File System (OCFS/OCFS2)
    • Red Hat GFS
    • IBM GPFS
    • Sun StorEdge QFS
    • Veritas CFS
  • Network File System
    • On supported Network Attached Storage only
RAC Automatic Storage Management (ASM)
• Introduced in Oracle 10.1
  • Additional functionality in 10.2 and 11.1
• Generic code (all supported platforms)
• Available for both single-instance and RAC databases
• Provides shared storage for RAC
• Can optionally provide mirroring:
  • Normal Redundancy (mirrored)
  • High Redundancy (triple mirroring)
  • Useful with JBOD or extended clusters
• Mandatory for Oracle 10g Standard Edition RAC
• Presents storage as disk groups containing
  • Physical disks
  • Logical files
• Requires additional ASM instance on each node
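As a rough illustration of what the mirroring options cost in disk space, the sketch below divides raw capacity by the number of copies kept. It is a simplification: it assumes identical disks and ignores the free space a disk group needs to re-mirror after a disk failure. The function name is hypothetical, and the "external" entry stands for the case where ASM does no mirroring at all.

```python
# Number of copies of each piece of data kept by ASM for each redundancy level.
COPIES = {
    "external": 1,  # no ASM mirroring (storage array is assumed to protect the data)
    "normal": 2,    # mirrored
    "high": 3,      # triple mirroring
}


def usable_capacity_gb(raw_gb: float, redundancy: str) -> float:
    """Approximate usable space of a disk group (hypothetical helper)."""
    if redundancy not in COPIES:
        raise ValueError(f"unknown redundancy level: {redundancy}")
    return raw_gb / COPIES[redundancy]
```

For example, `usable_capacity_gb(1200, "normal")` gives 600.0 and `usable_capacity_gb(1200, "high")` gives 400.0.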
RAC Licensing
• Standard Edition
  • RAC option free
  • Maximum two nodes
  • Maximum four CPUs
  • Must use Oracle Clusterware
  • Must use Automatic Storage Management (ASM)
  • No extended clusters
• Enterprise Edition
  • RAC option 50% extra (per EE license)
  • No limit on number of nodes
  • No limit on number of CPUs
  • Can use any shared storage (ASM, CFS or NFS)
  • Can use Enterprise Manager Packs (Diagnostics, Tuning, ...)
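The Standard Edition restrictions listed above can be expressed as a simple checklist. The sketch below is illustrative only and is not licensing advice: `ClusterPlan` and `standard_edition_violations` are hypothetical names, and the four-CPU limit is assumed here to be counted across the whole cluster.

```python
from dataclasses import dataclass


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned RAC cluster."""
    nodes: int
    cpus: int                 # assumption: total CPUs across the cluster
    oracle_clusterware: bool  # True if Oracle Clusterware manages the cluster
    storage: str              # "ASM", "CFS" or "NFS"
    extended: bool            # True if nodes are at physically separate sites


def standard_edition_violations(plan: ClusterPlan) -> list:
    """Return the Standard Edition restrictions from the slide that the plan breaks."""
    problems = []
    if plan.nodes > 2:
        problems.append("more than two nodes")
    if plan.cpus > 4:
        problems.append("more than four CPUs")
    if not plan.oracle_clusterware:
        problems.append("Oracle Clusterware not used")
    if plan.storage != "ASM":
        problems.append("storage is not ASM")
    if plan.extended:
        problems.append("extended cluster")
    return problems
```

An empty list means the plan fits within the Standard Edition limits as stated on the slide; any entry suggests Enterprise Edition plus the RAC option would be needed.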
[Diagram: RAC Process Architecture. Node 1 and Node 2 each run Clusterware (OPROCD, OCSSD, CRSD, EVMD), an ASM instance (+ASM1 / +ASM2) and a database instance (PROD1 / PROD2). Each ASM and database instance is shown with PMON, SMON, LGWR, DBWn, ARCH, LMON, LCK0, LMD0, LMSn and DIAG.]
RAC Reasons For Deployment
• Availability
  • Node failure
  • Instance failure
• Scalability
  • Distribute workload across multiple instances
  • Scale out
• Manageability
  • Economies of scale
  • Administration / Monitoring / Backups / Standby
• Reduction in total cost of ownership
  • Database consolidation
  • Commodity hardware
RAC Availability
• Ensure continued availability of database in event of node or instance failure
• Automatic failover
  • No human intervention required
• In the event of node or instance failure:
  • All sessions connected to failed node are terminated
  • Sessions connected to remaining nodes are
    • temporarily suspended while resources are re-mastered
    • resumed after brown-out period
  • New sessions will be connected to remaining nodes only
• Ensuring availability requires spare capacity during normal operations
  • Either additional node
  • Or reduction in service level
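The spare-capacity point can be made concrete with a little arithmetic. The sketch below assumes identical nodes and a workload spread evenly across them, which real systems rarely achieve exactly; both function names are hypothetical.

```python
def max_safe_utilisation(nodes: int, failures: int = 1) -> float:
    """Highest average utilisation per node during normal operations that still
    lets the surviving nodes absorb the whole workload after `failures` nodes fail."""
    if not 0 <= failures < nodes:
        raise ValueError("failures must be between 0 and nodes - 1")
    return (nodes - failures) / nodes


def utilisation_after_failure(nodes: int, utilisation: float, failures: int = 1) -> float:
    """Average utilisation of each surviving node once the failed nodes' work is redistributed."""
    if not 0 <= failures < nodes:
        raise ValueError("failures must be between 0 and nodes - 1")
    return utilisation * nodes / (nodes - failures)
```

Under these assumptions a 2-node cluster can run each node at no more than 50% if it must survive a node failure without a reduction in service level, whereas a 4-node cluster can run at up to 75%. A result above 1.0 from `utilisation_after_failure` indicates that the service level would have to drop.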
[Diagram: RAC Availability. Same 4-node cluster: Node 1 to Node 4 with Instance 1 to Instance 4. Labels: Public Network, Private Network (Interconnect), Storage Network, Shared Storage.]
RAC Scalability
• Workload can be distributed across multiple nodes
• Workload can be balanced across all nodes using connection management
  • Client-side using Oracle Net
  • Server-side using listener processes
• Workload can be directed to specific nodes using services
• Level of scalability dependent on application
[Graphs: Throughput plotted against Resources]
RAC Scalability
• Factors that can degrade scalability
  • Excessive parsing
  • Consistent reads
  • SELECT FOR UPDATE / user-defined locking
  • DDL
  • Object-oriented code
• Features that can improve scalability
  • Services
  • Automatic Segment Space Management
  • Partitioning
  • Sequences
  • Reverse key indexes
RAC Manageability
• Advantages
  • Consolidation
  • Economies of scale
    • Administration
    • Monitoring
    • Backup and recovery
    • Standby database
• Disadvantages
  • Increased planned downtime
  • Complexity
  • Dependencies
  • Skills
RAC Total Cost of Ownership
• Benefits
  • Lower hardware costs - commodity hardware
  • Lower support costs
  • Management economies of scale
• Costs
  • Redundant hardware
    • Servers, Storage, NIC, HBA, Switches, Fabric
  • Oracle licenses
  • Experienced staff
  • Application modifications
RAC Applications
• Most applications should run on RAC without modification
• Performance is not guaranteed
  • Applications that perform well in single-instance have best chance of scaling in RAC
  • Applications performing badly in single-instance will perform worse in RAC
• Some features do not port easily to RAC e.g.:
  • DBMS_ALERT, DBMS_PIPE, External files
• Applications that can be logically partitioned tend to scale best
  • Minimize use of interconnect
  • Maximize use of buffer caches
• Implementation more likely to succeed if you have direct or indirect access to source code
RAC Database Services
• Allow sessions with similar workload characteristics to be logically grouped and managed
• Services can be assigned to
  • set of preferred instances - used if available
  • set of available instances - used if preferred instances not available
  • failover to available instances is automatic
  • failback to preferred instances is manual
• Services can be configured to maximize instance affinity
• Limited statistics reported at service level
  • Can also be reported at service / module / action level
• Trace can be enabled at service level
  • Can also be enabled at service / module / action level
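The preferred / available behaviour described above (automatic failover, manual failback) can be sketched as a small state model. This is a toy illustration, not Oracle's placement logic: the `Service` class and its methods are hypothetical, and choosing the first running available instance on failover is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Toy model of a database service and the instances it runs on (hypothetical)."""
    name: str
    preferred: list
    available: list
    running_on: set = field(default_factory=set)

    def start(self, up: set) -> None:
        # The service starts on every preferred instance that is running.
        self.running_on = {i for i in self.preferred if i in up}

    def instance_failed(self, failed: str, up: set) -> None:
        # Automatic failover: move the service to an available instance.
        if failed not in self.running_on:
            return
        self.running_on.discard(failed)
        for candidate in self.available:
            if candidate in up and candidate not in self.running_on:
                self.running_on.add(candidate)  # assumption: first candidate wins
                break

    def instance_restarted(self, instance: str) -> None:
        # Deliberately does nothing: failback is not automatic.
        pass

    def relocate(self, source: str, target: str) -> None:
        # Manual failback, carried out by an administrator.
        if source not in self.running_on:
            raise ValueError(f"{self.name} is not running on {source}")
        self.running_on.discard(source)
        self.running_on.add(target)
```

With SERVICE1 preferring PROD1 and having PROD2 as its available instance, a failure of PROD1 moves the service to PROD2, and it stays there after PROD1 restarts until `relocate("PROD2", "PROD1")` is called.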
[Diagram: RAC Database Services, before and after. Shows SERVICE1 with Listener1 and Listener2 in front of instances PROD1 and PROD2.]
RAC Extended Clusters
• Currently the Holy Grail of high availability
• RAC nodes located at physically separate sites
  • Implicit disaster recovery
  • Requires Enterprise Edition licences + RAC option
• In the event of a site failure, database is still available
• Storage is duplicated at each site
  • Can use ASM or vendor-supplied storage technology
• Active / Active configuration
  • Users can access database via either site
• Configuration and performance tuning are complex
  • Cache fusion traffic between sites
[Diagram: RAC Extended Clusters. Site1 and Site2 each show a node (Node 1 / Instance 1, Node 2 / Instance 2), a storage network and a database copy. Labels: Public Network, Private Network, Quorum, Site3.]
RAC Disaster Recovery
• Data Guard and RAC are fully compatible
• Can configure any permutation e.g.
• All instances can participate in redo log shipping
• Only one instance can perform managed recovery
  • Standby database might be a potential bottleneck
RAC Alternatives
• Single Instance Databases
  • No RAC overhead
  • Simpler to install / configure / manage
  • Single point of failure
• Oracle Products
  • Oracle Streams
  • Oracle Clusterware
• Proprietary Clustering Solutions
  • HP ServiceGuard
  • IBM HACMP
  • Sun Cluster
RAC The Reality
• Many sites running RAC
  • Mostly Oracle 10.2
  • A few still running Oracle 10.1
  • Still some Oracle 9.2
• Most RAC users develop their own applications or use bespoke applications developed by a third party
• Probably around 20 extended clusters in production across Europe
• Many Oracle 10.2 sites run ASM
  • Very few run OCFS or raw devices
  • Very few use third-party cluster file systems
• Most sites using SAN - fewer using NAS
• In the UK most users currently deploy on Linux x86-64
  • Solaris very popular in other regions
RAC The Reality
• Few Oracle 10g users run vendor clusterware
• Most RAC deployments for availability
  • Decreased unplanned downtime
  • Increased planned downtime
• Increasing number of deployments for scalability
  • Workload balancing
  • Services
• Manageability benefits very doubtful
  • Economies of scale versus additional complexity
• TCO reductions possible in some circumstances
  • Replace large SMP boxes
  • Replace legacy active-passive clusters
RAC The Reality
• Most users run 2-node clusters
  • Some have 3-node or 4-node clusters
  • A handful run five nodes or more
• Most users only have one database per cluster
  • Few grids
• Oracle Clusterware scales well
  • Number of nodes does not impact performance
• Oracle RAC databases might scale well
  • Dependent on application
  • Additional nodes may improve or degrade performance
RAC The Reality
• ASM currently the most popular RAC storage technology
  • Deployed in numerous Oracle 10.2 RAC production systems
• No operating system utilities
  • ASMCMD in Oracle 10.2 and above
• Generally disliked by storage administrators
  • Gives too much control to DBAs
• Acceptable performance
  • ASM instance provides metadata
  • RDBMS instances read and write blocks directly to and from files
Thank you for your interest
• References
  • http://www.juliandyke.com/References/References.html
• Questions
  • info@juliandyke.com