210 likes | 658 Views
An Overview of GoldenGate Replication. Brian Keating December 31, 2009. Introduction. The GoldenGate replication utility provides a flexible and powerful framework, for automatic propagation of data changes.
E N D
An Overview of GoldenGate Replication Brian Keating December 31, 2009
Introduction The GoldenGate replication utility provides a flexible and powerful framework, for automatic propagation of data changes. This presentation will provide an overview of GoldenGate replication; and it will display some examples of actually using GoldenGate. In order to provide examples of using GoldenGate, a “proof of concept” (POC) environment, which uses GoldenGate, has been set up.
Overview of GoldenGate Replication GoldenGate replication consists of three basic utilities: “extract”, “replicat”, and “data pump”. In addition, all three of those utilities access “trail files”, which are flat files that contain formatted GoldenGate data. An extract group mines a database’s redo logs; and writes information about the changes it finds into trail files. Extract is functionally similar to the Streams “capture” process. A replicat group reads information from trail files; and then writes those changes into database tables. Replicat is functionally similar to the Streams “apply” process. A data pump group copies information between trail files. Data pump is functionally similar to the Streams “propagate” process.
Overview of the POC Environment In the GoldenGate proof of concept environment, two separate “types” of replication have been implemented – “audit-style” replication, and “archive-style” replication. Audit-style replication maintains an “audit trail” of DML operations that have been executed on source tables. This is similar to the replication provided by Oracle Change Data Capture (CDC). Archive-style replication causes inserts and updates, and some DDL statements, to be replicated from source tables into target tables. Deletes, drops, and truncates on source tables are not replicated to target tables, however. As the name implies, this type of replication is appropriate for maintaining “archive” databases.
Diagram of Audit-Style Replication Here is a basic diagram of the various steps involved in archive-style replication: Source Redo Log Source Trail File Source Audit Table Source Extract Group Source Replicat Group 1 2 3 4
Description of Audit-Style Replication The following is a text description of the diagram on the last slide. With audit-style replication, whenever any DML operation is committed on a source table, the following items will occur: The source extract group notices the DML operation, in the source database’s redo log. The source extract group writes information about that DML operation, into the source trail file. The source replicat group reads that information from the source trail file. The source replicat group inserts that information into the source “audit” table. Note: with audit-style replication, DDL operations do not get replicated at all. Also note: in this particular POC environment, audit-style replication is implemented with all objects on the source system – i.e., everything is “self-contained”, on the same source host. It is also possible to implement audit-style replication with objects spread out among multiple systems.
Diagram of Archive-Style Replication Here is a basic diagram of the various steps involved in archive-style replication: Source Redo Log Source Trail File Target Trail File Target Archive Table Source Extract Group Source Data Pump Target Replicat Group 1 2 3 4 5 6
Description of Archive-Style Replication The following is a text description of the diagram on the last slide. With archive-style replication, whenever any DML or DDL operation is committed on a source table, the following items will occur: The source extract group notices the change (the DML or DDL operation) in the source database’s redo log. If the DML operation is NOT a delete, or if the DDL operation is NOT a “drop” or “truncate”, then the source extract group writes that change into the source trail file. The source data pump group reads that change from the source trail file. The source data pump group writes that change into the target trail file. The target replicat group reads that change from the target trail file. The target replicat group executes that DML or DDL operation on the target archive table. Note: as mentioned above, “delete” DML statements are not replicated. In addition, any DDL statements that contain the strings “drop” or “truncate” are not replicated.
Examples of Audit-Style Replication The next four slides contain some examples of audit-style replication. That is, those slides will provide some examples of DML and DDL operations on a test “source” table – and then they will show you what data would end up getting replicated into the corresponding “audit” table, through audit-style replication. In these examples, the name of the “source” table is ggs.ts_test, and the name of the “audit” table is ggs.ts_test_ct. Here are the descriptions of those tables: SQL> desc ggs.ts_test Name Null? Type ----------------------------------------- -------- ---------------------------- ID_VALUE NOT NULL NUMBER STRING_VALUE VARCHAR2(20) DATE_VALUE DATE SQL> desc ggs.ts_test_ct Name Null? Type ----------------------------------------- -------- ---------------------------- OPERATION$ VARCHAR2(20) COMMIT_TIMESTAMP$ TIMESTAMP(6) ID_VALUE NUMBER STRING_VALUE VARCHAR2(20) DATE_VALUE DATE
Audit-Style Example 1: Insert Insert statements: insert into ggs.ts_test values (1, ‘test 1’, sysdate); insert into ggs.ts_test values (2, ‘test 2’, sysdate); commit; Results: SQL> select * from ggs.ts_test; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 test 1 12/29/2009 15:25:49 2 test 2 12/29/2009 15:25:57 SQL> select * from ggs.ts_test_ct; OPERATION$ COMMIT_TIMESTAMP$ ID_VALUE STRING_VALUE DATE_VALUE ---------- ------------------- -------- ------------ ------------------- INSERT 12/29/2009 15:26:02 1 test 1 12/29/2009 15:25:49 INSERT 12/29/2009 15:26:02 2 test 2 12/29/2009 15:25:57
Audit-Style Example 2: Update Update statement: update ggs.ts_test set string_value = ‘update’ where id_value = 1; commit; Results: SQL> select * from ggs.ts_test; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 update 12/29/2009 15:25:49 2 test 2 12/29/2009 15:25:57 SQL> select * from ggs.ts_test_ct; OPERATION$ COMMIT_TIMESTAMP$ ID_VALUE STRING_VALUE DATE_VALUE ---------- ------------------- -------- ------------ ------------------- INSERT 12/29/2009 15:26:02 1 test 1 12/29/2009 15:25:49 INSERT 12/29/2009 15:26:02 2 test 2 12/29/2009 15:25:57 UPDATE OLD 12/29/2009 15:29:07 1 test 1 12/29/2009 15:25:49 UPDATE NEW 12/29/2009 15:29:07 1 update 12/29/2009 15:25:49
Audit-Style Example 3: Delete Delete statement: delete from ggs.ts_test where id_value = 1; commit; Results: SQL> select * from ggs.ts_test; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 2 test 2 12/29/2009 15:25:57 SQL> select * from ggs.ts_test_ct; OPERATION$ COMMIT_TIMESTAMP$ ID_VALUE STRING_VALUE DATE_VALUE ---------- ------------------- -------- ------------ ------------------- INSERT 12/29/2009 15:26:02 1 test 1 12/29/2009 15:25:49 INSERT 12/29/2009 15:26:02 2 test 2 12/29/2009 15:25:57 UPDATE OLD 12/29/2009 15:29:07 1 test 1 12/29/2009 15:25:49 UPDATE NEW 12/29/2009 15:29:07 1 update 12/29/2009 15:25:49 DELETE 12/29/2009 15:31:42 1 update 12/29/2009 15:25:49
Audit-Style Example 4: Truncate Truncate statement: truncate table ggs.ts_test; Results: SQL> select * from ggs.ts_test; no rows selected SQL> select * from ggs.ts_test_ct; OPERATION$ COMMIT_TIMESTAMP$ ID_VALUE STRING_VALUE DATE_VALUE ---------- ------------------- -------- ------------ ------------------- INSERT 12/29/2009 15:26:02 1 test 1 12/29/2009 15:25:49 INSERT 12/29/2009 15:26:02 2 test 2 12/29/2009 15:25:57 UPDATE OLD 12/29/2009 15:29:07 1 test 1 12/29/2009 15:25:49 UPDATE NEW 12/29/2009 15:29:07 1 update 12/29/2009 15:25:49 DELETE 12/29/2009 15:31:42 1 update 12/29/2009 15:25:49
Examples of Archive-Style Replication The next four slides contain some examples of archive-style replication. That is, those slides will provide some examples of DML and DDL operations on a test “source” table – and then they will show you what data would end up getting replicated into the corresponding “archive” table, through archive-style replication. In these examples, the name of the “source” table, and the name of the “archive” table, is ggs.bp_test. (Those two tables reside in separate databases.) So, in my queries I will refer to those tables by the aliases “source” and “archive”. Here are the descriptions of those tables: SQL> desc ggs.bp_test Name Null? Type ----------------------------------------- -------- ---------------------------- ID_VALUE NOT NULL NUMBER STRING_VALUE VARCHAR2(20) DATE_VALUE DATE SQL> desc ggs.bp_test Name Null? Type ----------------------------------------- -------- ---------------------------- ID_VALUE NOT NULL NUMBER STRING_VALUE VARCHAR2(20) DATE_VALUE DATE
Archive-Style Example 1: Insert Insert statements: insert into ggs.bp_test values (1, ‘test 1’, sysdate); insert into ggs.bp_test values (2, ‘test 2’, sysdate); commit; Results: SQL> select * from ggs.bp_test source; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 test 1 12/29/2009 16:20:27 2 test 2 12/29/2009 16:20:31 SQL> select * from ggs.bp_test archive; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 test 1 12/29/2009 16:20:27 2 test 2 12/29/2009 16:20:31
Archive-Style Example 2: Update Update statement: update ggs.bp_test set string_value = ‘update’ where id_value = 1; commit; Results: SQL> select * from ggs.bp_test source; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 update 12/29/2009 16:20:27 2 test 2 12/29/2009 16:20:31 SQL> select * from ggs.bp_test archive; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 update 12/29/2009 16:20:27 2 test 2 12/29/2009 16:20:31
Archive-Style Example 3: Delete Delete statement: delete from ggs.bp_test where id_value = 1; commit; Results: SQL> select * from ggs.bp_test source; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 2 test 2 12/29/2009 16:20:31 SQL> select * from ggs.bp_test archive; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 update 12/29/2009 16:20:27 2 test 2 12/29/2009 16:20:31
Archive-Style Example 4: Truncate Truncate statement: truncate table ggs.bp_test; Results: SQL> select * from ggs.bp_test source; no rows selected SQL> select * from ggs.bp_test archive; ID_VALUE STRING_VALUE DATE_VALUE -------- ------------ ------------------- 1 update 12/29/2009 16:20:27 2 test 2 12/29/2009 16:20:31
Rowid / Unused Column Information GoldenGate is able to replicate the rowid datatype without any problems. In other words, tables that have columns with the rowid datatype can be replicated with GoldenGate. (Note that Oracle Streams is not able to replicate the rowid datatype.) GoldenGate is able to replicate tables that have “unused” columns, as long as certain conditions are met. First, the “DBOPTIONS ALLOWUNUSEDCOLUMN” parameter must be set in the GoldenGate extract group. Also, if any columns get set to unused in a source table, then those columns must also manually get set to unused in the corresponding target table. (In other words, GoldenGate cannot replicate the “alter table set unused” command.)
Performance / Scalability Information GoldenGate has a relatively limited ability to do “parallel” processing; i.e. to use multiple CPUs as part of its replication. Basically, using parallel processing in GoldenGate involves creating multiple replicat groups – and then manually “splitting up” the overall processing between those groups. For example, you could create two replicat groups – and then have one group process table “a” while the other group processes table “b”. Of course, this will only work if there are no referential integrity constraints (or even “logical” constraints) between tables a and b. GoldenGate also has a very useful ability to do “batch-style” processing. Basically, the replicat utility has the ability to apply changes with array processing – rather than using row-at-a-time processing. This can dramatically improve the performance of replicat operations. From my preliminary tests, batch-style processing can reduce the overall replication latency by up to 67% - i.e., the latency can be reduced down to one third of its original, non-batch latency.