Advanced Transaction Management Chapter 13
Outline
• Mixing heterogeneous TMs
• High-Availability Commit & Transfer of Commit
• Optimizing Commit
• Disaster Protection via Data/Application Replication
©Jim Gray, Andreas Reuter Transaction Processing - Concepts and Techniques WICS August 2 - 6, 1999

Mixing Transaction Managers
• Four standards:
  • LU 6.2 ~ APPC ~ CPIC ~ CICS: the de facto TP standard
  • X/Open + OSI/TP: the de jure TP standard
  • OTS: the CORBA standard
  • TIP: the de facto interoperability standard
• Almost everyone interoperates with LU 6.2.
• LU 6.2 has evolved to have presumed abort, to not reuse aborted trids, and other fixes.
• LU 6.2 is an "open" two-phase commit: the interface is documented, and reconnection/resolve is documented.
• Internally, everyone uses private protocols with many tricks.

Mixing "OLD" Transaction Managers
• Many old TP monitors are not open:
  • They do not expose 2PC (prepare() and commit()),
  • so they insist on being the root commit coordinator.
• All will become X/Open-compliant eventually and thus be open TP monitors.
• If stuck with a "closed" TM, you can still get atomicity if:
  1. Only one closed TM is involved.
  2. The TM is direct, not queued.

Mixing with a Closed Transaction Manager
[Figure: the transaction gateway sends trid + data to the closed transaction manager and waits, repeating while not acknowledged. The closed TM, if the trid is not a duplicate, does the transaction, inserts the trid in the done table, commits, and acknowledges.]
All "open" TMs and RMs prepared, closed TM does "RUMP".

deferred_update(int id, complex_type list_of_updates)    /* rump logic                   */
{   Begin_Work();                                         /* start a new transaction      */
    select count(*) into :n from done where id = :id;     /* test if work was done        */
    if (n == 0) {                                         /* if not done                  */
        do list_of_updates;                               /* then do the list of updates  */
        insert into done values (:id);                    /* flag transaction done        */
    }
    Commit_Work();                                        /* commit update and flag       */
    acknowledge;                                          /* reply success to caller      */
}                                                         /* in both cases                */

Status_Transaction(TRID trid)
{   select count(*) into :ans from done where id = :trid; /* 1 if done, 0 if not          */
    return ans;
}

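The rump logic above can be tried out with a small sketch. It uses Python's sqlite3 module as a stand-in for the closed TM's local database; the `account` table, the `make_db` helper, and the function signatures are assumptions made for illustration, not part of the slides.

```python
import sqlite3

def make_db():
    """Hypothetical schema: a done table plus one table to update."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE done (trid INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100)")
    conn.commit()
    return conn

def deferred_update(conn, trid, updates):
    """Rump logic: apply `updates` at most once per trid, acknowledge always."""
    with conn:  # one local transaction: commit on success, rollback on error
        done = conn.execute(
            "SELECT COUNT(*) FROM done WHERE trid = ?", (trid,)
        ).fetchone()[0]
        if done == 0:                      # work not done yet
            for sql, params in updates:    # do the list of updates
                conn.execute(sql, params)
            conn.execute("INSERT INTO done (trid) VALUES (?)", (trid,))
    return True                            # acknowledge in both cases

def status_transaction(conn, trid):
    """True if the transaction identified by trid has committed here."""
    row = conn.execute("SELECT COUNT(*) FROM done WHERE trid = ?", (trid,))
    return row.fetchone()[0] == 1
```

Because the update and the done-table insert commit together, the gateway can resend the same trid until it sees an acknowledgment without the updates being applied twice.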
Mixing Open Transaction Managers
[Figure: a transaction gateway sits between "our" transaction manager (local protocol) and the "foreign" transaction managers (OSI protocol stack). It keeps a trid map table with columns "our trid" and "his trid".]
• Gateway translates between external and internal TRIDs.
• Gateway translates between external and internal protocols.
• Gateway participates in transaction resolution (it is a TM in both worlds).

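A minimal sketch of the trid map table, assuming string trids and an in-memory dictionary. The class name, method names, and the `local-N` naming scheme are hypothetical; a real gateway would also have to keep this table durable so it can take part in transaction resolution after a restart.

```python
import itertools

class TridMap:
    """Two-way map between foreign ("his") trids and local ("our") trids."""

    def __init__(self):
        self._next = itertools.count(1)
        self._to_local = {}    # his trid -> our trid
        self._to_foreign = {}  # our trid -> his trid

    def import_trid(self, foreign_trid):
        """Return our trid for an incoming foreign trid, allocating one on first sight."""
        if foreign_trid not in self._to_local:
            local = f"local-{next(self._next)}"
            self._to_local[foreign_trid] = local
            self._to_foreign[local] = foreign_trid
        return self._to_local[foreign_trid]

    def export_trid(self, local_trid):
        """Return the foreign trid to use when talking to the foreign TM."""
        return self._to_foreign[local_trid]
```

Keying the table on the foreign trid means a trid that re-enters through the same gateway maps to the same local transaction; it does not help when the trid enters through a different gateway.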
Mixing Open Transaction Managers
• Multiple entry problem:
  • A TRID enters the system twice, along two different paths.
  • This "works" but looks like two separate transactions.
  • The commit dependency is external to the system.
• Fancy option problem:
  • The external or internal TM has an option the other does not.
  • Fake (or turn off) optimizations/options not supported by one side or the other.

Non-Blocking Commit
The problem: what if the coordinator fails?
Solutions:
1. Wait.
2. Appoint a new coordinator.
Appointment can be thought of as a process pair (n-plex).
Works great in a cluster (no communications failures).

Non-Blocking Commit in a WAN: 3-Phase or Heuristic or Operator Command
A wide area net can partition, so process pairs cannot reliably decide to take over.
Solution(s):
1. Three-phase protocol: broadcast the participant list and the decision as part of phase 1.5; let a majority of participants decide if the coordinator fails.
2. Heuristic decisions: default to commit/abort. Announce Heuristic Mismatch at reconnect if the guess was wrong.
3. Human decision: announce Operator Mismatch at reconnect if the guess was wrong.

Transfer of Commit
What if a participant
• is more secure than the coordinator?
• is more reliable than the coordinator?
• is faster than the coordinator?
Transfer commit authority to that participant?

Transfer of Commit
• It is also an optimization: it saves messages if done as part of commit.
• Called the nested commit protocol or the last resource manager optimization.
• 2 messages vs 5 messages (plus one lazy msg).

Transfer of Commit: More Complex Case
More complex if the root has more than one branch:
• Need to set up new sessions among "trusted" nodes.
• Root sends the new root name to all participants at phase 1.

Optimizing Commit
Can optimize:
• Delay: milliseconds/commit
• Message cost: number, size, urgency of messages
• IO cost: number, size, or urgency of IO
• CPU cost: cycles used
• Throughput: maximum commit rate

Commit: the General Case
Prepare():
• 1 RPC or message pair per RM and one per non-root TM
• 1 forced IO per RM (prepare record)
• 1 forced IO per TM (commit record)
Commit(): the same.
Summary of 2PC cost:
• IO: 2(RM+TM)
• RPCs: 2(RM+(TM-1))
• Messages: 4(RM+(TM-1)) (equivalent to RPCs)
• Delay: 2 IO ~ 50ms ~ 10K instructions; 4 msg ~ 20ms ~ 50K instructions; total 50ms*(RM+TM) + 20ms*(RM+TM-1)
These are the error-free counts (i.e. the minimum values).

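The cost summary above can be evaluated for a given configuration with a small sketch. The function and parameter names are made up for illustration; the formulas and the 50 ms and 20 ms constants are the ones on the slide.

```python
def two_phase_commit_cost(rm, tm, io_pair_ms=50, msg_quad_ms=20):
    """Error-free (minimum) cost of general 2PC for `rm` RMs and `tm` TMs.

    io_pair_ms:  delay of the two forced IOs per participant (slide: ~50 ms)
    msg_quad_ms: delay of the four messages per non-root participant (slide: ~20 ms)
    """
    non_root = rm + tm - 1  # every participant except the root TM
    return {
        "forced_ios": 2 * (rm + tm),
        "rpcs": 2 * non_root,
        "messages": 4 * non_root,
        "delay_ms": io_pair_ms * (rm + tm) + msg_quad_ms * non_root,
    }

# One TM with two RMs: 6 forced IOs, 4 RPCs, 8 messages, 190 ms.
print(two_phase_commit_cost(rm=2, tm=1))
```

The delay figure assumes, as the slide's formula does, that the steps run one after another; the optimizations on the following slides attack exactly those terms.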
Commit: Simple Optimizations
• Presumed abort saves a TM IO (implicit in the protocol above).
• Do phase 1 and phase 2 in parallel (saves delay).
• Common log (saves RM log forces):
  IO: 2(TM)
  Messages: 4(RM+TM-1) (equivalent to RPCs)
  Delay: 2*IO*TM + 4*M*(RM+TM-1) ~ 50ms*TM + 40ms*(RM+TM-1)
• Use local RPC (10x faster): ~ 50ms*TM + RM + 40ms*(TM-1)
• Use WADS for low IO latency (3ms vs 25ms): ~ 6ms*TM + RM + 40ms*(TM-1)
Simple case of 1 TM, 2 RMs: ~ 8ms delay for a commit.

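A quick way to check the arithmetic of these optimizations is to code the three delay formulas. One assumption is needed: the bare "RM" term in the local-RPC formulas is read here as about 1 ms per RM, which is the reading that reproduces the slide's 8 ms figure. The function name and flags are hypothetical.

```python
def optimized_delay_ms(rm, tm, local_rpc=False, wads=False):
    """Commit delay after presumed abort, parallel phases, and a common log."""
    io_ms = 6 if wads else 50            # two log forces per TM: 2*3 ms vs 2*25 ms
    rm_ms = 1 * rm if local_rpc else 40 * rm  # assumption: local RPC costs ~1 ms per RM
    remote_ms = 40 * (tm - 1)            # messages to the non-root TMs
    return io_ms * tm + rm_ms + remote_ms

print(optimized_delay_ms(2, 1))                             # 130
print(optimized_delay_ms(2, 1, local_rpc=True))             # 52
print(optimized_delay_ms(2, 1, local_rpc=True, wads=True))  # 8
```

With no flags set the function reduces to 50ms*TM + 40ms*(RM+TM-1), the common-log formula on the slide.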
Group Commit Optimization
• Amortizes IO and messages across several transactions.
• Adds delay.
• If N transactions in a group: IO and message cost per transaction is ~ 1/N.
• Small extra delay if there is one slow step in the original path.
• As the system heats up (commit rate rises) to 25tps, start to install group commit with a 30ms threshold (at 100tps: 3.3 trans/group).

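The amortization can be illustrated with a toy log that forces one IO per group instead of one per transaction. This sketch closes a group after a fixed number of commits rather than after the slide's 30 ms timer, and the explicit `flush()` stands in for the timer expiring; the class and attribute names are hypothetical.

```python
class GroupCommitLog:
    """Toy log: commit records are buffered and forced to disk one group at a time."""

    def __init__(self, group_size):
        self.group_size = group_size
        self.pending = []     # commit records waiting for the next force
        self.durable = []     # commit records already forced
        self.forced_ios = 0   # number of forced log writes so far

    def commit(self, trid):
        """Queue a commit record; force the log once the group is full."""
        self.pending.append(trid)
        if len(self.pending) >= self.group_size:
            self.flush()

    def flush(self):
        """Force all pending records with a single IO (no-op if none are pending)."""
        if self.pending:
            self.forced_ios += 1
            self.durable.extend(self.pending)
            self.pending.clear()

log = GroupCommitLog(group_size=5)
for trid in range(10):
    log.commit(trid)
print(log.forced_ios)  # 2 forced IOs for 10 commits
```

A transaction is not committed until its group is flushed, which is the added delay the slide mentions.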
Simple Commit Optimizations
• Read-only: just get the phase 1 call to release locks.
  • Note: may violate ACID; should release read locks at phase 2 if any locks are acquired during phase 1.
  • Saves messages (phase 2) and IO (no RM IO).
  • A true read-only transaction must prepare at phase 1 and unlock at phase 2.
• Unjoin: RM does no work at commit/abort.
• Lazy: user-requested group commit. Piggybacks on others; no extra IO or messages.

Transaction Commit Trees
[Figure: commit tree shapes (one node, deep, bush, general case) labeled with the optimizations that apply: share log, LRPC, transfer, parallel commit, parallel transfer.]

Transfer of COMMIT: Linear COMMIT
• Parent and other sub-trees prepare, then transfer commit authority to the remaining child.
• Last in chain becomes commit coordinator.
• More delay, fewer messages.
• For N=2: same delay, 3 vs 4 messages. Always use it.

Disaster Recovery at a Remote Site
Replicate data, applications, and network connections at 2 (or more) sites.
• Symmetric design: either site can process transactions.
• Asymmetric design: one site is master of each data item. This allows:
  • caching
  • batching of updates at the backup
So far, the asymmetric design is most popular.
To get symmetry, have each node master 1/2 of the db/net.

Basic idea of asymmetric design:
• Send the log from primary to backup.
• Backup applies the log to its copy.
• Backup is in constant media recovery.
• Backup processes/sessions/data are ready to take over.

Need some way to decide failure.
• Easy in a cluster.
• Hard in a WAN (partition possible).
• Solutions:
  • Extra wires
  • Wires on demand (dialup)
  • Human (operator)
  • Quorum device
Kind of log? A logical log is best:
• loose coupling (allows the backup to be a different TM/RM)
• failure independence (different from a physiological log)

Takeover Logic
/* initialization */
Tell primary I'm here
Setup all RMs and application processes
Open all initial sessions to clients
/* the main backup loop */
While (not primary) {redo log}
/* takeover */
Redo rest of log
Resend most recent message on each session
Abort any incomplete transactions
/* become primary */
Tell application processes to start accepting requests

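The data side of this logic (redo the log, then abort incomplete transactions at takeover) can be sketched over an in-memory key/value store. The record layout `(trid, kind, key, value)`, the function name, and the use of `None` for "key did not exist" are assumptions for illustration; session resend and process startup are left out.

```python
def run_backup(log_records):
    """Redo a primary's log, then abort whatever had not committed at takeover.

    log_records: iterable of (trid, kind, key, value); kind is "update" or "commit".
    Assumes stored values are never None and that, as under locking, two
    uncommitted transactions never update the same key.
    """
    db = {}
    undo = {}  # trid -> [(key, old_value)] for transactions not yet committed

    # Main backup loop: redo every log record as it arrives.
    for trid, kind, key, value in log_records:
        if kind == "update":
            undo.setdefault(trid, []).append((key, db.get(key)))
            db[key] = value
        elif kind == "commit":
            undo.pop(trid, None)  # committed work is never undone

    # Takeover: abort incomplete transactions, newest update first.
    for changes in undo.values():
        for key, old in reversed(changes):
            if old is None:
                db.pop(key, None)
            else:
                db[key] = old
    return db

log = [(1, "update", "x", 10), (2, "update", "y", 5),
       (1, "commit", None, None), (2, "update", "y", 6)]
print(run_backup(log))  # {'x': 10}
```

Transaction 2 never logged a commit record, so its updates to `y` are rolled back before the backup starts accepting requests.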
Session Takeover
• Just like process pairs
• Session sequence numbers eliminate duplicates
• So, get at-least-once delivery: resend msg at takeover

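A minimal sketch of duplicate elimination by session sequence number, assuming one monotonically increasing counter per session. The class and method names are hypothetical.

```python
class Session:
    """Receiver side of a session: delivers each sequence number at most once."""

    def __init__(self):
        self.last_seen = 0   # highest sequence number delivered so far
        self.delivered = []  # payloads handed to the application

    def receive(self, seq, payload):
        """Deliver the message unless its sequence number was already seen."""
        if seq <= self.last_seen:
            return False     # duplicate, e.g. a resend after takeover
        self.last_seen = seq
        self.delivered.append(payload)
        return True

s = Session()
s.receive(1, "debit")
s.receive(2, "credit")
s.receive(2, "credit")   # new primary resends its most recent message
print(s.delivered)       # ['debit', 'credit']
```

This is why the new primary can safely resend the most recent message on every session: if the client already has it, the resend is dropped.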
Catch-up After Failure
• A failed node at restart executes normal restart, then enters the backup logic.
• If both fail, an outside observer must say who is best.
• The backup has to match its log to the new primary.
• Design issue: are nodes bit-for-bit identical? If so, the backup must “trim” its log to match the primary.

How Safe?
• 1-SAFE: no extra delay, risks lost transactions
• 2-SAFE: extra delay (if backup is up), single-fault tolerant, high availability
• VERY-SAFE: extra delay, no lost transactions, low availability

System Pairs vs Replicated Data
System pairs replicate the application:
• DB
• application processes
• sessions
Data replicators only replicate data. Other aspects are left as an exercise for the application designer.

System Pair Benefits
• Tolerates faults: hardware, environment, operations, Heisenbugs
• Can replace software/hardware online
• Can move backup to a new building or...
• Allows design diversity: backup can be completely different
