Transaction Models COMP3017 Advanced Databases Dr Nicholas Gibbins – nmg@ecs.soton.ac.uk 2012-2013
Overview • Data Processing Styles • Flat Transactions • Context and Savepoints • Chained Transactions • Nested Transactions • Distributed Transactions • Multi-level Transactions
You should already know about: • Transactions • Schedules • Serial Schedules • (Conflict-)Serializable Schedules • ACID • Two Phase Locking (2PL) • Timestamp Ordering
Processing Styles • Batch processing • Time sharing • Real time processing • Client-server systems • Transaction-oriented processing
Processing style | Data | Duration | Guarantee of reliability | Guarantee of consistency | Work pattern | No. of work sources | Performance criteria | Availability
Batch | Private | Long | Normal | None | Regular | 10^1 | Throughput | Normal
Time sharing | Private | Long | Normal | None | Regular | 10^2 | Response time | Normal
Real time processing | Private | Very short | Very high | None | Random | 10^3 | Response time | High
Client-server | Shared | Long | Normal | None | Random | 10^2 | Throughput | High
Transaction processing | Shared | Short | Very high | ACID | Random | 10^5 | Throughput and response time | High
Transaction Oriented Processing • Some batch transactions • Many terminals • High availability • System takes care of recovery • Automatic load balancing • Sharing access to data • Variable requests • Repetitive workload • Mostly simple functions
Flat Transactions • The simplest type of transaction • Basic building block for other models • Only one layer of control by the application, hence "flat" • Begin Work • Commit Work or Abort Work • "All or nothing" processing
Outcomes of Flat Transactions
1. Successful completion: BEGIN WORK; Operation 1; Operation 2; ...; Operation k; COMMIT WORK
2. Application requests termination: BEGIN WORK; Operation 1; Operation 2; ... (I can't go on!); ROLLBACK WORK
3. Forced termination: BEGIN WORK; Operation 1; Operation 2; ...; rollback due to external cause
Example
Input: Branch id, Teller id, Account id, Delta
BEGIN WORK
  Apply delta to account balance (Account DB)
  Read new value of balance
  Apply delta to teller balance (Teller DB)
  Apply delta to branch balance (Branch DB)
  Write everything to history file (History File)
  Send output (New balance)
COMMIT WORK
Example
Input: Branch id, Teller id, Account id, Delta
BEGIN WORK
  Perform database updates
  If Withdrawal and account now overdrawn
    ROLLBACK WORK
  Else
    Send output (New balance)
    COMMIT WORK
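The two examples above can be sketched as a single runnable flat transaction. This is only an illustration under stated assumptions: it uses Python's standard `sqlite3` module, and the table layout (`account`, `teller`, `branch`, `history`) and the function name `debit_credit` are hypothetical, not taken from the slides.

```python
import sqlite3

def debit_credit(conn, account_id, teller_id, branch_id, delta):
    """Run the debit/credit example as one flat transaction.

    Returns the new balance on commit, or None if the application
    rolled back because a withdrawal left the account overdrawn.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")  # BEGIN WORK
    try:
        cur.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                    (delta, account_id))
        (new_balance,) = cur.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)).fetchone()
        cur.execute("UPDATE teller SET balance = balance + ? WHERE id = ?",
                    (delta, teller_id))
        cur.execute("UPDATE branch SET balance = balance + ? WHERE id = ?",
                    (delta, branch_id))
        cur.execute("INSERT INTO history VALUES (?, ?, ?, ?)",
                    (account_id, teller_id, branch_id, delta))
        if delta < 0 and new_balance < 0:
            cur.execute("ROLLBACK")  # application-requested termination
            return None
        cur.execute("COMMIT")  # COMMIT WORK: all updates become durable together
        return new_balance
    except Exception:
        cur.execute("ROLLBACK")  # any error undoes every update made so far
        raise

# isolation_level=None lets the code issue BEGIN/COMMIT/ROLLBACK itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE teller  (id INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE branch  (id INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE history (account_id INTEGER, teller_id INTEGER,
                      branch_id INTEGER, delta INTEGER);
INSERT INTO account VALUES (1, 100);
INSERT INTO teller  VALUES (1, 0);
INSERT INTO branch  VALUES (1, 0);
""")

accepted = debit_credit(conn, 1, 1, 1, -30)    # withdrawal within the balance
refused = debit_credit(conn, 1, 1, 1, -500)    # would overdraw, so rolled back
```

The refused withdrawal leaves no trace in any of the four tables, which is the "all or nothing" behaviour of a flat transaction.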
Limitations • Simple control structure cannot model more complex applications • Trip planning • Bulk updates • Does not allow much co-operation • May be too strict • Databases used in advanced applications • CAD/CAM • Office automation • Software development
Limitations Long-lived transactions present particular challenges: • More likely to be interrupted • Rollback not acceptable • May access many data items • Cannot hold locks indefinitely • Others need the data • Will lead to deadlocks
Requirements Controlling computations in a distributed multi-user environment primarily means: • Containing the effects of arbitrary operations as long as there might be a necessity to revoke them • Monitoring the dependencies of operations on each other in order to be able to trace the execution history in case faulty data are found at some point
Transaction Processing Context • For a simple program: output_message = f {input_message} • More typically, however: f {input_message, context} -> {output_message, context} • The context might be, for example, a cursor position
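A minimal sketch of the second form, in which the function takes the context as an argument and returns the updated context alongside its output. The names `RECORDS`, `Context` and `fetch_next` are hypothetical, and the context here is just a cursor position over an in-memory list.

```python
from typing import List, NamedTuple, Tuple

# Hypothetical data that the program pages through.
RECORDS = ["r0", "r1", "r2", "r3", "r4"]

class Context(NamedTuple):
    cursor: int = 0  # position of the next record to return

def fetch_next(input_message: int, context: Context) -> Tuple[List[str], Context]:
    """f(input_message, context) -> (output_message, context)."""
    page = RECORDS[context.cursor:context.cursor + input_message]
    return page, Context(cursor=context.cursor + len(page))

ctx = Context()
first_page, ctx = fetch_next(2, ctx)    # starts at the beginning
second_page, ctx = fetch_next(2, ctx)   # continues where the first call stopped
```

The same input message (2) yields different output on each call, because the result depends on the context carried between invocations.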
Transaction Processing Context It is normally the private context which matters: • Transaction • Program • Terminal • User
Context Examples • Counters: how often the program was called • User-related data: last screen sent; last order number worked on • Information about durable storage: last record read; most recently modified tuple
Transaction Processing Context • Preservation of context is needed for long-lived transactions • One way is for the application to maintain the context as a database record
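One way to picture context kept as a database record is the sketch below, which assumes Python's standard `sqlite3` module. The `context` table, the job name `add_interest` and the function `run_step` are hypothetical. Each step updates a batch of accounts and records how far it got in the same transaction, so work and context commit (or roll back) together.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER);
CREATE TABLE context (job TEXT PRIMARY KEY, last_id INTEGER);
INSERT INTO account VALUES (1, 100), (2, 200), (3, 300), (4, 400);
INSERT INTO context VALUES ('add_interest', 0);
""")

def run_step(conn, job, batch_size):
    """Process one batch and save the context; return how many rows were done."""
    conn.execute("BEGIN")
    # Restore the context: the last account this job finished with.
    (last_id,) = conn.execute(
        "SELECT last_id FROM context WHERE job = ?", (job,)).fetchone()
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM account WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, batch_size))]
    for account_id in ids:
        conn.execute("UPDATE account SET balance = balance + 1 WHERE id = ?",
                     (account_id,))
    if ids:
        # Save the new context in the same transaction as the updates.
        conn.execute("UPDATE context SET last_id = ? WHERE job = ?",
                     (ids[-1], job))
    conn.execute("COMMIT")
    return len(ids)

# Each call could equally be made by a freshly restarted program:
# nothing about the job's progress lives in local variables.
processed = []
while True:
    done = run_step(conn, "add_interest", 3)
    if done == 0:
        break
    processed.append(done)

balances = [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]
```

Because the context record is only advanced when the batch commits, a step that fails is simply repeated from the last saved position.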
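Flat Transactions and Savepoints
[Figure: a flat transaction drawn as a sequence of actions punctuated by savepoints. BEGIN WORK acts as SAVE WORK:1; SAVE WORK:2, SAVE WORK:3 and SAVE WORK:4 follow later groups of actions. ROLLBACK WORK(2) and ROLLBACK WORK(4) return the transaction to the named savepoint, discarding the work 'covered' by that savepoint.]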
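The savepoint behaviour can be tried out with Python's standard `sqlite3` module, as a rough sketch. SQLite spells the operations `SAVEPOINT name` and `ROLLBACK TO name` rather than SAVE WORK and ROLLBACK WORK(n); the table `log` and the savepoint names `sp2` and `sp3` are hypothetical.

```python
import sqlite3

# isolation_level=None lets the code issue BEGIN/COMMIT itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (step TEXT)")

conn.execute("BEGIN")                         # plays the role of savepoint 1
conn.execute("INSERT INTO log VALUES ('a')")
conn.execute("SAVEPOINT sp2")
conn.execute("INSERT INTO log VALUES ('b')")
conn.execute("SAVEPOINT sp3")
conn.execute("INSERT INTO log VALUES ('c')")
conn.execute("ROLLBACK TO sp2")               # undoes 'b' and 'c' only
conn.execute("INSERT INTO log VALUES ('d')")  # the transaction is still open
conn.execute("COMMIT")

steps = [row[0] for row in conn.execute("SELECT step FROM log ORDER BY rowid")]
```

Only the work done after `sp2` is lost; the action before it and the action after the rollback both commit.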
Persistent Savepoints • Following a system crash, restart in-flight transactions from their most recent savepoints • Needs whole context preserved • System variables • Program variables • Programming languages have limitations • A persistent programming language would be needed
Chained Transactions • BEGIN WORK ... CHAIN WORK ... • At CHAIN WORK: commit results; keep [some] locks; keep cursors; continue processing • No-one else can access kept objects • Application cannot roll back to before 'Chain Work'
Savepoints versus Chained Transactions • Both allow substructure to be imposed on a long-running application program: database context is preserved; cursors are kept • Commit vs Savepoint: chained transactions can roll back only to the previous 'savepoint'; savepoints allow rollback to an arbitrary savepoint • Locks: chained transactions free unwanted locks
Savepoints versus Chained Transactions Work lost • Savepoints more flexible than flat transactions, as long as the system does not crash Restart • Chained can restart from most recent commit, as long as no processing context hidden in local programming variables Both organise work into a sequence of actions
Nested Transactions
[Figure: a tree of transactions. The top-level transaction Tk invokes subtransactions Tk1, Tk2 and Tk3; Tk1 invokes Tk11 and Tk12; Tk3 invokes Tk31. Each transaction runs from its own BEGIN WORK to its own COMMIT WORK.]
Three Rules for Nested Transactions • Commit Rule • Rollback Rule • Visibility Rule
Commit Rule • The commit of a subtransaction makes the results accessible only to the parent • The final commit happens only when all ancestors finally commit
Rollback Rule • If any [sub]transaction rolls back, all of its subtransactions roll back
Visibility Rule • Changes made by a subtransaction are visible to its parent • Objects held by a parent can be made accessible to subtransactions • Changes made by a subtransaction are not visible to its siblings
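The three rules can be illustrated with a small in-memory model. This is a toy sketch, not how a real transaction manager is built: it has no locking or concurrency, the class `NestedTxn` and its methods are hypothetical names, and "visible to its parent" is modelled as the state after the subtransaction commits.

```python
class NestedTxn:
    """Toy nested transaction over a shared dict of committed data."""

    def __init__(self, store, parent=None):
        self.store = store        # durable data, changed only by a top-level commit
        self.parent = parent
        self.writes = {}          # changes not yet passed upwards
        self.children = []
        self.state = "active"
        if parent is not None:
            parent.children.append(self)

    def begin_sub(self):
        return NestedTxn(self.store, parent=self)

    def read(self, key):
        # Visibility rule: own writes, then ancestors' writes, then the store.
        # A sibling's uncommitted writes are never on this path.
        txn = self
        while txn is not None:
            if key in txn.writes:
                return txn.writes[key]
            txn = txn.parent
        return self.store.get(key)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Commit rule: a subtransaction hands its results to its parent only;
        # the store changes only when the top-level transaction commits.
        target = self.store if self.parent is None else self.parent.writes
        target.update(self.writes)
        self.state = "committed"

    def rollback(self):
        # Rollback rule: every subtransaction rolls back too, committed or not.
        for child in self.children:
            child.rollback()
        self.writes.clear()
        self.state = "rolled back"

store = {"x": 0}
top = NestedTxn(store)
a, b = top.begin_sub(), top.begin_sub()
a.write("x", 1)
seen_by_sibling = b.read("x")        # sibling does not see a's change
a.commit()
seen_by_parent = top.read("x")       # parent sees it after a commits
store_before_top_commit = store["x"] # still not durable
top.commit()
store_after_top_commit = store["x"]

top2 = NestedTxn(store)
c = top2.begin_sub()
c.write("x", 99)
c.commit()
top2.rollback()                      # undoes c even though c had committed
```

The last three lines show why subtransactions are not durable: a committed subtransaction is still undone when an ancestor rolls back.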
Observations Subtransactions are not fully equivalent to flat transactions. They are • Atomic • Consistency preserving • Isolated • Not durable, because of the commit rule
Observations Nesting and program modularisation complement each other • Well designed module has a clean interface, and no global variables • If it touches the database, the database is a large global variable • If the module is protected as a subtransaction, then database changes are kept clean too Nested transactions permit intra-transaction parallelism
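Emulating Nesting with Savepoints
[Figure: the same transaction tree run as a single flat transaction, with a savepoint taken at the start of each subtransaction: S1 for Tk1, S11 for Tk11, S12 for Tk12, S2 for Tk2, S3 for Tk3 and S31 for Tk31, followed by a single Commit.]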
Observations Using savepoints is more flexible than nested for internal recovery • Can roll back further True nested is needed in order to run subtransactions in parallel (Intra-transaction parallelism) • Emulating with Savepoints needs 'subtransactions' to be run in strict sequence True nested can pass locks selectively • More flexible than Savepoints • “Similar but different”
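A sketch of the emulation, assuming Python's standard `sqlite3` module: each 'subtransaction' is bracketed by a savepoint, and a failure rolls back to that savepoint only. The helper `subtransaction` and the table `t` are hypothetical; the savepoint names follow the figure. Note that the blocks run strictly one after another on a single connection, which is the sequencing restriction mentioned above.

```python
import sqlite3
from contextlib import contextmanager

@contextmanager
def subtransaction(conn, name):
    """Emulate a subtransaction with a savepoint (name must be a trusted identifier)."""
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield
    except Exception:
        conn.execute(f"ROLLBACK TO {name}")  # undo only this subtransaction's work
        conn.execute(f"RELEASE {name}")
        raise
    else:
        conn.execute(f"RELEASE {name}")      # 'commit' into the enclosing level

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v TEXT)")

conn.execute("BEGIN")                        # top-level transaction Tk
with subtransaction(conn, "S1"):             # Tk1
    conn.execute("INSERT INTO t VALUES ('Tk1')")
    try:
        with subtransaction(conn, "S11"):    # Tk11 fails and is rolled back
            conn.execute("INSERT INTO t VALUES ('Tk11')")
            raise RuntimeError("Tk11 fails")
    except RuntimeError:
        pass                                 # Tk1 decides to carry on
    with subtransaction(conn, "S12"):        # Tk12
        conn.execute("INSERT INTO t VALUES ('Tk12')")
conn.execute("COMMIT")

rows = [row[0] for row in conn.execute("SELECT v FROM t ORDER BY rowid")]
```

Nothing becomes durable until the final COMMIT, which matches the commit rule for nested transactions.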
Distributed Transactions A flat transaction that has to visit several nodes to collect data • Differs from nested; only one level • If a subtransaction, which is just a slice of the main transaction, commits or rolls back, the whole transaction does the same
Multi-Level Transactions • A generalized and more liberal version of Nested transactions • Multi-level transactions have the ability to pre-commit the results of a subtransaction • Therefore cannot do unilateral backout of updates • A 'Compensating Transaction' is needed to reverse effects, if necessary • Scheme of layering object implementation ensures ACIDity at the root level
Compensating Transactions • The compensating transaction needs careful thought! • It must be available from the point of subtransaction commit • It must not itself abort • Needs provision to survive system failure • Needs provision to survive internal error (e.g. an SQL call fails) • If it is not needed, the compensating transaction must be disposed of when the application finally completes
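The bookkeeping behind compensation can be sketched as follows. This is an in-memory illustration only: the function `run_with_compensation` and the seat-booking example are hypothetical, and the sketch makes no provision for surviving a system failure or a compensation that itself fails, both of which the points above require of a real implementation.

```python
def run_with_compensation(steps):
    """Run (action, compensation) pairs in order.

    Each action stands for a pre-committed subtransaction. Its compensation
    is registered as soon as the action completes. If a later action fails,
    the registered compensations run in reverse order.
    Returns True if every action succeeded, False otherwise.
    """
    registered = []
    try:
        for action, compensation in steps:
            action()
            registered.append(compensation)  # available from the point of commit
    except Exception:
        for compensation in reversed(registered):
            compensation()                   # reverse the pre-committed effects
        return False
    registered.clear()                       # not needed: dispose of them
    return True

stock = {"seats": 5}

def reserve_seat():
    stock["seats"] -= 1

def release_seat():
    stock["seats"] += 1

def charge_card():
    raise RuntimeError("card declined")

def refund_card():
    pass  # nothing to undo in this toy example

succeeded = run_with_compensation([(reserve_seat, release_seat),
                                   (charge_card, refund_card)])
```

The seat reservation was already visible to others when the charge failed, so it cannot be undone by a rollback; it is reversed by running its compensation instead.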
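Example
[Figure: one transaction shown at several levels of abstraction. At the top level the transaction issues SELECT ..., INSERT ..., UPDATE ..., SELECT ... The operations at the next level down are Insert Tuple and Insert B-tree Entry. The operations beneath those are Update page, Insert address table entry, Locate position and Insert entry.]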
Layering • Abstraction hierarchy • The entire system consists of a strict hierarchy of objects with their associated operations • Layered abstraction • The objects of layer 'n' are completely implemented by using operations of layer 'n-1' • Discipline • There are no shortcuts that allow layer 'n' to access objects on a layer other than 'n-1' • Other models use early commit • Open Nested, Sagas, Engineering transactions ... • Only Multi-Level transactions have all the ACID properties
The Problem How can we update one million bank accounts as one transaction?
Flat Transaction Update all accounts using a single flat transaction: • Operation performed exactly once • Unacceptable price for restart
Chained Transactions Decompose into sequence of 1 million single (chained) transactions: • Performance overhead • Only last transaction rolled back if failure • No longer atomic overall • No restart context preserved • No state of chain overall preserved
Locking • Flat transaction - one ACID unit which locks all records for the duration • Chained transaction - one record locked at a time, so other updates can occur during the run
Recovery • Recover by reading the log? • Not standard application code • Should be a better way