Oracle Streams Performance
Eva Dafonte Pérez
Outline
• Oracle Streams Replication
• Replication Performance
• Streams Optimizations
  • Downstream Capture
  • Split & Merge Procedure
  • TCP and Network
  • Rules
  • Flow Control
• Lessons Learned
• Transportable Tablespaces for Scalable Resynchronization
• Summary
Oracle Streams Replication
[Diagram: Source Database → Capture (from the Redo Logs) → Propagate → Apply → Target Database]
• Technology for sharing information between databases
• Database changes are captured from the redo log and propagated asynchronously as Logical Change Records (LCRs); a minimal setup sketch follows below
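For reference, a minimal sketch of how such a capture/propagate/apply chain is configured with the DBMS_STREAMS_ADM package. This is a hedged example: the schema, table, queue and database link names (strmadmin, atlas.conditions, tier1db) are illustrative, not taken from the talk.

-- on the source database: create a queue, then capture and propagation rules
BEGIN
  DBMS_STREAMS_ADM.SET_UP_QUEUE(
    queue_table => 'strmadmin.capture_qt',
    queue_name  => 'strmadmin.capture_q');

  DBMS_STREAMS_ADM.ADD_TABLE_RULES(
    table_name   => 'atlas.conditions',
    streams_type => 'capture',
    streams_name => 'strm_capture',
    queue_name   => 'strmadmin.capture_q',
    include_dml  => TRUE,
    include_ddl  => TRUE);

  DBMS_STREAMS_ADM.ADD_TABLE_PROPAGATION_RULES(
    table_name             => 'atlas.conditions',
    streams_name           => 'strm_prop',
    source_queue_name      => 'strmadmin.capture_q',
    destination_queue_name => 'strmadmin.apply_q@tier1db',
    include_dml            => TRUE,
    include_ddl            => TRUE);
END;
/
-- on the target database: an apply queue plus ADD_TABLE_RULES with
-- streams_type => 'apply' completes the chain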
Streams Setup for ATLAS
[Topology diagram; clusters shown: a 6-node dual-CPU RAC, a 4-node quad-core RAC, a 3-node quad-core RAC and a 2-node dual-CPU RAC]
Streams Setup for LHCb (Conditions and LFC)
[Topology diagram; clusters shown: a 4-node quad-core RAC, a 3-node quad-core RAC and a 2-node dual-CPU RAC]
Replication performance
• The atomic unit is the change record: the LCR
• LCRs can vary widely in size, so throughput is not a fixed measure
• Capture performance:
  • read LCR information from the redo
    • from the redo log buffer (memory, much faster)
    • from archive log files (disk)
  • convert the changes into LCRs
    • depends on the LCR size and the number of columns
  • enqueue the LCRs
    • independent of the LCR size
Replication performance
• Propagation performance:
  • browse LCRs
  • transmit LCRs over the network
  • remove LCRs from the queue
    • done in a separate process to avoid any impact
• Apply performance:
  • browse LCRs
  • execute LCRs
    • manipulating the database is slower than generating the redo
    • if LCRs are executed serially, the apply cannot keep up with the redo generation rate
  • remove LCRs from the queue
Downstream Capture
[Diagram: the Source Database ships its Redo Logs via a redo transport method to a Downstream Database, where Capture and Propagate run; Apply runs at the Target Database]
• Downstream capture de-couples the Tier 0 production databases from destination or network problems
  • source database availability is the highest priority
  • more resources can be allocated to the Streams processes
• Optimize the redo log retention on the downstream database to allow for a sufficient resynchronization window
  • we use 5 days of retention to avoid tape access
• Optimize the capture parameters (retention time and checkpoint frequency); a hedged sketch follows below
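A hedged sketch of the capture tuning just described; the capture name and the checkpoint frequency value are illustrative (the 5-day checkpoint retention matches the redo retention window quoted above):

BEGIN
  -- keep capture checkpoints for 5 days
  DBMS_CAPTURE_ADM.ALTER_CAPTURE(
    capture_name              => 'strm_capture',
    checkpoint_retention_time => 5);

  -- take LogMiner checkpoints less frequently
  -- (hidden capture parameter, illustrative value)
  DBMS_CAPTURE_ADM.SET_PARAMETER(
    capture_name => 'strm_capture',
    parameter    => '_CHECKPOINT_FREQUENCY',
    value        => '500');
END;
/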
Split & Merge Procedure
• If one site is down:
  • LCRs are not removed from the queue
  • the capture process might be paused by flow control => impact on replication performance
• Objective: isolate the replicas from each other
• Split the capture process:
  • (original) real-time capture for the sites "in good shape"
  • (new) normal capture for the unstable site(s)
    • new capture queue and propagation job
    • the original propagation job is dropped
    • spilled LCRs are dropped from the original capture queue
• Merge the capture processes once the site has caught up (a sketch follows below)
• Resynchronization procedure suggested by Patricia McElroy (Principal Product Manager, Distributed Systems/Replication)
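Later Oracle releases packaged this split/merge pattern as procedures in DBMS_STREAMS_ADM (available from 11g; on 10.2 the same steps were scripted manually, as listed above). A hedged sketch, with all names illustrative:

BEGIN
  -- clone the capture, queue and propagation serving the unstable site
  DBMS_STREAMS_ADM.SPLIT_STREAMS(
    propagation_name        => 'strm_prop_site5',
    cloned_propagation_name => 'strm_prop_site5_clone',
    cloned_queue_name       => 'strmadmin.capture_q_clone',
    cloned_capture_name     => 'strm_capture_clone',
    perform_actions         => TRUE);
END;
/

BEGIN
  -- once the site has caught up, fold the clone back into the original stream
  DBMS_STREAMS_ADM.MERGE_STREAMS(
    cloned_propagation_name => 'strm_prop_site5_clone');
END;
/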
TCP and Network Optimizations
• TCP and network tuning
  • adjust the system max TCP buffers (/etc/sysctl.conf)
  • Oracle Net parameters to reinforce the TCP tuning:
    • DEFAULT_SDU_SIZE=32767
    • RECV_BUF_SIZE and SEND_BUF_SIZE
      • optimal: 3 * bandwidth-delay product
• Reduce the Oracle Streams acknowledgements:
  • alter system set events '26749 trace name context forever, level 2';
• a hedged example of these settings follows below
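Putting the settings together, a hedged example; the buffer sizes are illustrative and should be derived from the actual bandwidth-delay product of the WAN link:

# /etc/sysctl.conf -- raise the system-wide TCP buffer limits
net.core.rmem_max = 4194304
net.core.wmem_max = 4194304
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 65536 4194304

# sqlnet.ora (or per-alias in tnsnames.ora / listener.ora)
DEFAULT_SDU_SIZE=32767
RECV_BUF_SIZE=4194304
SEND_BUF_SIZE=4194304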
Streams Rules
• Used to control which information to share
• Avoid complex rules:
  • LIKE
  • functions
  • NOT
• Rules on the capture side caused more overhead than on the propagation side
• Avoid Oracle Streams complex rules

Complex rule condition =>
'( SUBSTR(:ddl.get_object_name(),1,7) IN (''COMP200'', ''OFLP200'', ''CMCP200'', ''TMCP200'', ''TBDP200'', ''STRM200'')
   OR SUBSTR(:ddl.get_base_table_name(),1,7) IN (''COMP200'', ''OFLP200'', ''CMCP200'', ''TMCP200'', ''TBDP200'', ''STRM200'') )'

Simple rule condition =>
'(((:ddl.get_object_name() >= ''STRM200_A'' and :ddl.get_object_name() <= ''STRM200_Z'')
   OR (:ddl.get_base_table_name() >= ''STRM200_A'' and :ddl.get_base_table_name() <= ''STRM200_Z''))
  OR ((:ddl.get_object_name() >= ''OFLP200_A'' and :ddl.get_object_name() <= ''OFLP200_Z'')
   OR (:ddl.get_base_table_name() >= ''OFLP200_A'' and :ddl.get_base_table_name() <= ''OFLP200_Z''))
[condition truncated on the original slide]
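A hedged sketch of how such a condition string can be attached to a rule. The rule name is illustrative; in practice Streams generates its rules through DBMS_STREAMS_ADM (whose ADD_*_RULES procedures also accept an and_condition argument), so creating a rule by hand is shown only for illustration:

BEGIN
  DBMS_RULE_ADM.CREATE_RULE(
    rule_name          => 'strmadmin.simple_ddl_rule',
    condition          => '(:ddl.get_object_name() >= ''STRM200_A'' and
                            :ddl.get_object_name() <= ''STRM200_Z'')',
    evaluation_context => 'SYS.STREAMS$_EVALUATION_CONTEXT');
END;
/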
Streams Rules
[Chart: throughput with complex vs. simple rules (plot values 80 and 600)]
• Example: ATLAS Streams Replication
  • filter tables by prefix
Streams Rules
[Chart: ATLAS Conditions replication throughput]
• Example: ATLAS Conditions
  • 2 GB per day
  • Offline (CERN) to Tier 1 (Gridka)
Flow Control
• By default, flow control kicks in when the number of outstanding messages is larger than a threshold:
  • buffered publisher: 5000
  • capture publisher: 15000
• Manipulating the default behavior:
  • 10.2.0.3 + Patch 5093060 = 2 new events
    • 10867: controls the threshold for any buffered message publisher
    • 10868: controls the threshold for the capture publisher
  • 10.2.0.4 = 2 new hidden parameters
    • "_buffered_publisher_flow_control_threshold"
    • "_capture_publisher_flow_control_threshold"
  • a hedged example follows below
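A hedged example of raising the thresholds. The values are illustrative; our understanding is that the event level encodes the new threshold on 10.2.0.3 (with the patch), while 10.2.0.4 exposes the equivalent hidden parameters:

-- 10.2.0.3 + Patch 5093060: raise the thresholds via events
alter system set events '10867 trace name context forever, level 20000';
alter system set events '10868 trace name context forever, level 60000';

-- 10.2.0.4: the equivalent hidden parameters
alter system set "_buffered_publisher_flow_control_threshold" = 20000 scope=spfile;
alter system set "_capture_publisher_flow_control_threshold" = 60000 scope=spfile;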
Flow Control
• Example: ATLAS PVSS tests
  • Online to Offline (CERN), 6 GB per day
  • default thresholds: 2500 - 3000 LCRs per second
  • new thresholds: 3500 - 4000 LCRs per second
Lessons Learned
• SQL bulk operations (at the source db)
  • may map to many elementary operations at the destination side
  • need to control the source rates to avoid overloading; a sketch of batched commits follows below
• Long transactions (infrequent commits)
  • the total number of outstanding LCRs becomes too large
  • LCRs stay in memory too long and are spilled over to disk
  • apply performance is impacted:
    • all LCRs in a single transaction must be applied by one apply server
    • parallel apply servers cannot be used efficiently
  • too many unbrowsed messages triggers flow control
    • Streams processes are paused
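The practical countermeasure at the source is to break bulk operations into bounded transactions. A minimal sketch, assuming a hypothetical table and an illustrative batch size:

DECLARE
  batch_size   CONSTANT PLS_INTEGER := 10000;
  rows_deleted PLS_INTEGER;
BEGIN
  LOOP
    -- touch at most batch_size rows per transaction, so each commit
    -- maps to a bounded number of LCRs on the replication stream
    DELETE FROM atlas.conditions_old
     WHERE expiry < SYSDATE - 365
       AND ROWNUM <= batch_size;
    rows_deleted := SQL%ROWCOUNT;
    COMMIT;
    EXIT WHEN rows_deleted = 0;
  END LOOP;
END;
/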
Lessons Learned
• Example: CMS replication, Online to Offline (CERN)
  • a single transaction mapped to 428,400 LCRs
Lessons Learned
• Example: CMS replication
  • with BLOB objects, a single transaction maps to 3600 LCRs
  • performance depends on the number of LCRs per transaction rather than on the LCR size
Scalable Resynchronization
• Problem: a target site falls out of the Streams recovery window
  • a complete transfer of the data (schemas and tables) to the destination database with the Oracle Data Pump utility may take days
  • example: ATLAS Conditions data
• Transportable Tablespaces: move a set of tablespaces from one Oracle database to another
  • exports the metadata of the tablespaces instead of the data in the tablespaces
  • moving data using transportable tablespaces is much faster than a Data Pump export/import
Scalable Resynchronization
• Restrictions:
  • the database block size and character set must be the same at source and target
  • the Oracle release and the compatibility setting must be the same or higher at the target
  • the tablespaces must be self-contained
  • the tablespaces need to be set to read-only while the files are copied
• Cross-platform since Oracle 10g
  • Oracle provides a byte-order conversion solution that uses Oracle Recovery Manager (RMAN); a sketch follows below
• It is NOT possible to use the tablespaces from Tier 0
  • BUT we can use the tablespaces from a Tier 1
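A hedged sketch of the RMAN byte-order conversion; the tablespace name, target platform string and output path are illustrative:

# run in RMAN, connected to the source database
CONVERT TABLESPACE atlas_cond_data
  TO PLATFORM 'Linux x86 64-bit'
  FORMAT '/stage/%U';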
Scalable Resynchronization
[Diagram: the resynchronization steps, numbered, between the source database and the target databases]
• Create database links between the databases
• Create directories pointing to the datafiles
• Stop replication to the site being resynchronized (site 5 in the diagram)
• Ensure the tablespaces are read-only
• Transfer the data files of each tablespace to the remote system
• Import the tablespace metadata in the target
• Make the tablespaces read-write
• Reconfigure Streams
• a sketch of the metadata export/import follows below
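A hedged sketch of the read-only / export / import steps with Data Pump; the tablespace, directory and file names are illustrative:

-- at the source (a healthy Tier 1): set the tablespace read-only
ALTER TABLESPACE atlas_cond_data READ ONLY;

-- export only the tablespace metadata (OS shell)
expdp strmadmin DIRECTORY=stage_dir DUMPFILE=cond_tts.dmp
      TRANSPORT_TABLESPACES=atlas_cond_data

-- copy the dump file and the datafiles to the target, then import the metadata
impdp strmadmin DIRECTORY=stage_dir DUMPFILE=cond_tts.dmp
      TRANSPORT_DATAFILES='/oradata/atlas_cond_data_01.dbf'

-- at the target: make the tablespace read-write again
ALTER TABLESPACE atlas_cond_data READ WRITE;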
Summary
• The LCG Database Deployment Project (LCG 3D) has set up a world-wide distributed database infrastructure for the LHC
  • some 124 RAC nodes (450 CPU cores) at CERN, plus several tens of nodes at 10 partner sites, are in production now
• Monitoring of database and Streams performance has been implemented, building on the Grid Control and strmmon tools
  • key to maintaining and optimizing any larger system
• Important optimizations have been implemented in order to increase Streams performance
  • the collaboration with Oracle as part of the CERN openlab has been essential and beneficial for both partners
Questions?