630 likes | 780 Views
E194 Multiple Primaries Replication Topology - Design Considerations. Viji Sripathi Vice President Bank of America viji.sripathi@bankofamerica.com. Agenda. Multiple primaries replication topologies Peer-to-peer (P2P) replication model Replicating identity columns in a P2P model
E N D
E194 Multiple Primaries Replication Topology - Design Considerations • Viji Sripathi • Vice President • Bank of America • viji.sripathi@bankofamerica.com
Agenda • Multiple primaries replication topologies • Peer-to-peer (P2P) replication model • Replicating identity columns in a P2P model • Replication data flow for text/image columns • Gotchas in replicating text/image columns • Conclusion • Questions and Answers
Multiple Primaries Replication Topologies - Overview • Distributed primary fragments • Corporate rollup • Re-distributed corporate rollup • Peer-to-peer (P2P) update-anywhere • Comparison Charts
Distributed Primary Fragments Model CHI LON TOK
Distributed Primary Fragments Model Characteristics • Every site has primary and replicated data • Data inserted at a site belongs to that site • A row can be updated/deleted only at its primary site • No possibility of simultaneous update conflicts • Every site subscribes to all or only the primary data from every other site
Distributed Primary Fragments Model Pros • Conflicts cannot occur • All data is present at every site • Replication Latency is minimal
Distributed Primary Fragments Model Cons • Highly restrictive • Application must be sensitive to the “Site Ownership” of data • Changes to site ownership will restrict manipulation of data in flight • Not scalable • Heavy network traffic • High maintenance cost
Corporate Roll-up Model CHI TOK LON
Corporate Roll-up Model Characteristics • Multiple primary sites publish data to one centralized rollup site • Each primary site contains only its own data • Rollup site consolidates data from all primary sites • Data can be modified only at its primary site
Corporate Roll-up Model Pros • Conflicts cannot occur • Highly scalable • Simpler to manage • Number of sets of replication definitions = Number of primary sites • Number of sets of subscriptions = Number of primary sites • Minimal network load
Corporate Roll-up Model Cons • Highly restrictive • Data ownership is static at the primary sites • Each site can only perform “specific” business functions • Only the roll-up site has “all” data • Potential loss of business workflow due to fragmented data at primary sites • Single point of failure at the roll-up site
Re-distributed Corporate Roll-up Model REPLICATE SITES CHI TOK LON
Re-distributed Corporate Roll-up Model Characteristics • Multiple primary sites publish data to one centralized rollup site • Rollup site re-distributes data to other remote sites • Data can only be modified at its primary site • No data changes allowed in rollup site and the replicate sites
Re-distributed Corporate Roll-up Model Pros • Conflicts cannot occur • All replicate sites have all data • Highly scalable • Simpler to manage
Re-distributed Corporate Roll-up Model Cons • Highly restrictive • Data ownership at primary sites is static • Rollup site and all replicate sites can only be used as decision support systems • Primary sites cannot re-subscribe from the rollup site • Potential loss of business workflow
Peer-to-Peer (P2P) Model CHI CHI LON LON TOK TOK
Peer-to-Peer (P2P) Model Characteristics • Every site hosts the same copy of the database • No static ownership of data by any site • All data can be modified at every site • Potential for conflicts (simultaneous updates) exists • Conflicts need to be resolved to avoid data inconsistencies
Peer-to-Peer Model Pros • Highly Flexible and Open • All data is present everywhere • Data can be modified at any site regardless of ownership • Local site autonomy • Disaster in one site does not affect other sites • Replication Latency is minimal
Peer-to-Peer Model Cons • Not scalable • High network load • Huge maintenance cost • Requires N sets of replication definitions and • N * (N-1) sets of subscriptions • 5 sites = 5 * (5 – 1) = 20 sets of subscriptions • 7 sites = 7 * (7 – 1) = 42 sets of subscriptions
Comparison Chart - II Distributed primary fragments Corporate rollup Re-distributed corporate rollup Peer-to-peer (P2P) Data Ownership No Conflicts No Data Ownership Conflicts
Peer-to-Peer (P2P) Model • Example • Design Considerations • Application Design Requirements • Database/Replication Design Requirements • Design Implementation
P2P Model - Example • Application: Sales Order System • New sales orders/invoices are created at every site • Every site performs updates to any order/invoice • Requirement: • Every site needs to be able to query/modify/delete orders and invoices created at another site ==> • Peer-to-Peer Update anywhere replication with conflict resolution logic
P2P Model - Design Considerations • Different order numbers for new orders at each site ==> Allocate ranges to each site • Resolution of simultaneous update conflicts ==> Code for Conflict Resolution based on a Criteria • Compare modification timestamps to retain the earliest/latest modification of a record • Compare modification origin site to retain records based on site priorities
P2P Model - Application Design Requirements • Manage allocation of primary key value ranges to each site • Generate “new” primary key values based on the site ranges • Populate relevant criteria (lastUpdateTime, lastUpdateSite etc.) to resolve update conflicts
P2P Model - Database/Replication Design Requirements • Setup static tables at each site to manage range allocations for that site • Add selection criteria columns to each table • Code custom function strings for inserts and updates to detect and resolve conflicts • Maintain all sites in one timezone if required
P2P Model - Design Implementation NY (Range:1- 100000) SF (Range:100001 - 200000) orders (num, desc, tstamp) orders (num, desc, tstamp) • Insert: 1, “Tst1”, T1 • Insert: 100001, “Tst1”, T2 1, “Tst1”, T1 100001, “Tst1”, T2 1, “Tst1”, T1 100001, “Tst1”, T2 • Update: 1, “Tst3”, T3 • Update: 1, “Tst4”, T4 1, “Tst4”, T4 100001, “Tst1”, T2 1, “Tst4”, T4 100001, “Tst1”, T2
Identity Columns in a P2P Model • Identity Columns • Advantages & Disadvantages • Replicating Identity Columns in a P2P Model • Examples • Design Issues • Design Technique • Design Implementation
Identity Columns • Advantages • Automatically generated by ASE and require no application logic • Faster generation of new primary key values • Performance boost by use in joins • Disadvantages • Identity Gaps or Sudden jumps in values causing problems for applications
Replicating Identity Columns in a P2P Model - Example 1 NY (Seed num to 1) SF (Seed num to 100001) orders (num, desc, tstamp) orders (num, desc, tstamp) • Insert: 1, “Tst1”, T1 • Insert: 100001, “Tst1”, T2 1, “Tst1”, T1 100001, “Tst1”, T2 1, “Tst1”, T1 100001, “Tst1”, T2 Next Insert generates the value 100002 for the new order Next Insert generates the value 100002 for the new order
Replicating Identity Columns in a P2P Model - Example 2 NY (Seed num to 1) SF (Seed num to 100001) orders (num, desc, tstamp) orders (num, desc, tstamp) • Insert: 1, “Tst1”, T1 • Insert: 100001, “Tst2”, T2 1, “Tst1”, T1 100001, “Tst2”, T2 1, “Tst1”, T1 100001, “Tst2”, T2 • Insert: 100002, “Tst3”, T3 • Insert: 100002, “Tst4”, T4 100002, “Tst3”, T3 100002, “Tst4”, T4 100002, “Tst3”, T3 100002, “Tst4”, T4
Replicating Identity Columns in a P2P Model - Design Issues • Additional Implementation Steps • Column must be defined as datatype “identity” in the replication definition • Replication Maintenance User should be aliased to “dbo” • Challenges • No Control in generation of different primary key values at each site ==> No Ranges
Replicating Identity Columns in a P2P Model - Design Technique • Create a key table, say “orders_keytbl” with an identity column “ordnum” • Create the application orders table with ordnum as a numeric datatype, but NOT as an identity column • Generate a new ordnum by inserting into the orders_keytbl • Use @@identity value from above to insert into orders table
Replicating Identity Columns in a P2P Model - Design Technique cont. • Seed the orders_keytbl with different values at each site • Insert into orders_keytbl only from application code
Replicating Identity Columns in a P2P Model - Design Technique cont. • Technical Details • Concurrency not an issue due to session-specific global variable @@identity • Easier method to repair identity gaps • Key tables need to be purged on a regular basis • Key tables can be partitioned for insert performance • No indexes required on key tables
Replicating Identity Columns in a P2P Model - Design Implementation NY SF orders (ordnum,desc,tstamp) orders (ordnum,desc,tstamp) • Insert: 1, “Tst1”, T1 • Insert: 100001, “Tst1”, T2 1, “Tst1”, T1 100001, “Tst1”, T2 1, “Tst1”, T1 100001, “Tst1”, T2 • Insert: 2, “Tst2”, T3 • Insert: 100002, “Tst4”, T4 2, “Tst2”, T3 100002, “Tst4”, T4 2, “Tst2”, T3 100002, “Tst4”, T4
Text/Image Columns in a P2P Model • Text/Image Columns Replication • Overview • Function Strings • Replication Data Flow for Text/Image Columns • insert behavior • update behavior • writetext behavior • Replicating Text/Image Columns in a P2P Model • Issues • Options
Text/Image Columns Replication - Overview • Different from other standard datatypes • Initialization of a text pointer before applying a text/image value • Replication in chunks similar to ASE text logging • Replication of non-logged LOB data applied through writetext and bulk library routines
Text/Image Columns Replication - Overview cont. • Replication of LOBs not as part of normal insert/update function strings • Different handling based on whether the LOB columns are nullable or not • Choice to replicate/not replicate the LOBs based on the replication status
Text/Image Columns Replication - Overview cont. • Text status • tpnull • tpinit • notrep • hastext • zerolen • Replication Status • always_replicate • replicate_if_changed • do_not_replicate
Text/Image Columns Replication Function Strings • rs_get_textptr - Retrieves handle for each text/image column • rs_datarow_for_writetext - Provides values of all non text/image values for modifications using writetext or DB-Lib/CT-Lib functions • rs_textptr_init - Initializes a text pointer for a text/image column • rs_writetext - Modifies the text/image data
Replication Data Flow - Insert of non-nullable text/image Columns insert demo (cola,colb) values (1, “test”) insert text logged.. RA rs_insert rs_writetext... syslogs RS DSI insert demo (cola, colb) values (1,’’) select colb from demo where cola = 1 writetext… rs_insert rs_get_textptr rs_writetext... Placeholder
Replication Data Flow - Insert of nullable text/image Columns insert demo (cola,colb) values (1, “test”) insert text logged.. RA rs_insert rs_writetext... syslogs RS DSI insert demo (cola) values (1) update demo set colb=NULL where cola=1 select colb from demo where cola = 1 writetext… rs_insert rs_textptr_init rs_get_textptr rs_writetext...
Replication Data Flow - Insert of nullable text/image Columns insert demo (cola,colb) values (1,NULL) RA insert (1) rs_insert syslogs RS DSI insert demo (cola) values (1) rs_insert
Replication Data Flow - Update of non-nullable text/image Columns update demo set colb=“tst1”where cola = 1 update text logged.. RA rs_update rs_writetext... syslogs RS DSI update demo set cola=1 where cola=1 select colb from demo where cola = 1 writetext… rs_update rs_get_textptr rs_writetext...
Replication Data Flow - Update of nullable text/image Columns update demo set colb=“tst1” where cola=1 update text logged.. RA rs_update rs_writetext... syslogs RS DSI update demo set cola=1 where cola=1 update demo set colb=NULL where cola=1 select colb from demo where cola = 1 writetext… rs_update rs_textptr_init rs_get_textptr rs_writetext...
Replication Data Flow - Update of nullable text/image Columns update demo set colb=NULL where cola=1 update text logged.. RA rs_update rs_writetext... syslogs RS DSI update demo set cola=1 where cola=1 update demo set colb=NULL where cola=1 select colb from demo where cola = 1 writetext… rs_update rs_get_textptr rs_writetext...
Replication Data Flow for writetext RA rs_datarow_for_writetext rs_writetext... writetext.. writetext syslogs RS DSI rs_datarow_for_writetext rs_get_textptr rs_writetext... select colb from demo where cola = 1 writetext…
Text/Image Replication Data Flow Summary • rs_insert/rs_update/rs_datarow_for_writetext • Executed based on whether an insert/update/writetext is executed at the primary • Carry the text_status and the replication_status • rs_textptr_init • Executed when an insert/update at primary caused a text pointer initialization • rs_get_textptr, rs_writetext • Executed whenever a text/image data value is replicated
Replicating Text/Image Columns in a P2P Model - Issues • Challenges: • Text/Image data not replicated as part of insert/update statements • No programmatic control to invoke text functions from rs_insert/rs_update • Options for conflict resolution: • rs_get_textptr • rs_writetext • Triggers in replicate database