600 likes | 741 Views
Transactions and Databases. Paul Greenfield CSIRO. This Week. More on transactions Left overs http://research.microsoft.com/~gray/wics_99_TP Isolation and locking How do we achieve isolation? Recovery How do we recover after failure?. Why bother with TP?.
E N D
Transactions and Databases Paul Greenfield CSIRO
This Week • More on transactions • Left overs • http://research.microsoft.com/~gray/wics_99_TP • Isolation and locking • How do we achieve isolation? • Recovery • How do we recover after failure?
Why bother with TP? • Use two-tier apps with database transactions? • Business logic in client and stored procedures • Fast! • Scalable? • Maintainable? • Cheaper? • Flexible?? Stored procedures Database Server
Two-tier Applications • The most recent ‘legacy’ • Stored procedures • Different and proprietary languages • Integrated debugging? • Re-use in different applications? • DB connection per client • Even when not active
Three-tier Applications • Business logic written in common or standard languages (VB, C++, Java) • Clean separation of business logic • Easier re-use and maintainability? • Use server resources only for active transactions • Process and connection pooling
TP Implementation • What are the TP programs? • Small ‘one-shot’ executable programs? • Application programs fed from queue? • Libraries called from a process? • Libraries called from threads? • Answer have an effect on performance, integrity and management
One-shot Programs • Old-style solution (CICS, TIP, …) • Schedule application to run when transaction request arrives • Start app, process request, terminate • Single function per application • OS/TP monitor support for • Fast application startup • Application recycling (reduce overheads)
Queued Applications • TP application ‘always’ running • Instances balanced against load • Queue of waiting requests • Application supports multiple functions • Group functions into applications • Clients not bound to server applications • Tune response times • Faster response time for some transactions • Multiple copies of critical applications
TP Processes Client bound to server process • Typical CORBA approach • Queue of requests for each server • Need to run/manage multiple servers • Tune response times? • Can allocate transactions to programs • Fast, critical transactions delayed? • Need for load balancing • Unequal server load possible
TP Process - Orbix Server processes Server objects Waiting requests
TP Threads • Thread pool inside a server process • No binding from client to thread • Objects live in process address space • Threads have access to all objects • Queue of requests shared by all threads • No need for load balancing • No idle/busy processes • No way to push priority of some transactions – may not matter?
TP Threads - MTS Server threads Proxyserver objects Activeserver objects Waiting requests
Failure • Need to isolate faults • Failing application takes down what?? • Entire application process? • Process holding thread pool? • Entire transaction system? • Need to run applications as separate processes or have careful fault traps
What Goes Where? • Routing and directories • Where to send a request message? • Where to create a remote object? • Routing tables • Table of what requests go where • Directories/name servers • Database and server that knows who is providing what service
Directory/Name Servers • Map name onto server locations • Could be part of TP system • CORBA Name Servers • Could be part of system-wide directory • Active Directory for COM+ • ‘Hard-wiring’ also works • Administration costs can be high
Name Servers • Client asks name server where to find a service when creating object • Servers advertise their services to the name server • Load balancing by name server distributing requests over multiple server processes and systems
Request Integrity • What happens to requests on failure • Transactions ensure database integrity • Incoming requests can be saved to disk • Fetch request operation included as part of transaction • Undone and request requeued on failure • Need to avoid failure loops! • Easy recovery from transient errors
Response Integrity • Are responses part of transaction? • Rolled out if transaction fails • Recovered and sent after system recovery if committed • Is this reasonable? Sent to who?? • Just discard? • Need feedback to know delivery succeeded • Just what does the operator see/do? • Wait? Retry? Check success?
RPC Extras • DCE, CORBA, COM, … are language and platform independent • Interfaces specified in IDL • Marshalling translates between languages and platforms • Character sets, byte order, … • Translate to and from ‘canonical’ form • Or use ‘receiver makes it right’ • Send in client format • Receiver translates only if necessary
IDL Example • COM IDL fragment • More detail in a later lecture!! [object, uuid(6B29FC40-CA47-1067-B31D-00DD010662DA)]interface IHop : IUnknown { import “unknwn.idl”; // bring in definition of IUnknown HRESULT Walk([in] long How_far); HRESULT Hop([in] long How_far); HRESULT Bound([in] BSTR Over_what);}
Nested Transactions • Calling a transaction from anywhere • Directly from a client • From within a transaction • Start a sub-transaction, linked into the parent transaction • All transactions committed together • Sub-transaction commit does not really commit and make changes durable. Changes made visible to other sub-transactions.
Nested Transactions • Not widely supported • Alternative programming models • Top-level transactional service code calling on business logic • MTS and EJB ‘requires transaction’ • Run in existing transaction if there is one • Start new transaction otherwise • More in MTS/COM+ and EJB lectures
Nested Transactions Function transfer(src, dest, amt) tx_start withdraw(src, amt) deposit(src, amt) tx_commit Function withdraw(src, amt) tx_start …….. Tx_commit Function deposit(dest, amt) tx_start …….. Tx_commit Nested Transactions Function transfer(src, dest, amt) tx_start withdraw(src, amt) deposit(src, amt) tx_commit Function withdraw(src, amt) …….. Function deposit(dest, amt) …….. Transactional Services
Isolation and Locking • How do resource managers achieve the illusion of ‘isolation’ • Application programmers can (largely) pretend no other programs are running concurrently • Done using ‘locks’ and ‘lock managers’ • Application programmers still need to be aware of possible problems
Serialisable • Concurrent execution of concurrent transactions has the same effect as running them serially. • One after another with no overlap • Highest level SQL Isolation Level • Implemented by locking resources before they are used
Locks • Lock data before using it • Set read lock before reading • Set write lock before writing • Wait if lock cannot be granted • Locks only granted if no conflicts • Read locks conflict with write locks • Write locks conflict with both read and write locks
Locks • Locks affect performance • All computers wait at the same speed • Can result in single-threading • Concurrent transactions waiting for access to the same resource • Strongly influenced by application design • Locks introduce new problems • deadlocks
Two-phase Rule • Correct locking avoids problems • Locks have to be held until commit to achieve isolation • Locks are held for longer • Performance is reduced • Two phases • Locking resources • Unlocking (only at commit) • Avoids cascading aborts
Lock Managers • Code that manages locking • Maintains a lock table • Keeps track of all locks in the database • Waiting requests and granted locks • Lock operations are atomic • Protected by low-level locks (mutex, spin) Locks granted Locks requested x T1(read), T2(read) T3(write) y T2(write) T4(read), T1(read) z T1(read)
Lock Managers • Distributed systems can have interesting locking problems • No lock analysis across databases? • Distributed databases have distributed lock managers • Shared lock state • Communication between LMs
Lock Types • More than just read and write! • Shared (read) locks • Exclusive (write) locks • Update (read then write) • Intent locks (lock also held at finer level) • Key locks (lock ranges within keys)
Lock Granularity • What is locked? • Whole database • Whole table? • Page of data? • Individual record? • All of the above at times • X lock on record • IX locks on page and table • S locks on database
Record Record Record Tables to Records Table Page Page Page
Lock Granularity • Level of locking a DB decision • Fine grain locks give less contention and better performance • Fine grain locks using lots of locks and are more expensive to manage • Choose record lock when.. • Just locking a few records • Otherwise get coarser locks
Lock Escalation • DB can start with record locks and move to page/table locks • Finds that many locks are being held for the page/table • Escalate lock up a level • Free lock resources • Guess at proper locking level and adjust as needed (up only?)
SQL Isolation Levels • Uncommitted read (dirty read) • Read all changes, no locks, no waits • Fastest and sometimes useful • Statistical scans of data • Committed read (SQL default) • Only read committed data • Release read locks after use • Repeating an SQL statement can give different results each time
SQL Isolation Levels • Repeatable read • Same query always returns same data • Can get phantoms – new records • Keep shared locks until Commit • Serializable (TP Isolation) • Same query returns same data • No phantoms! • Lock data that does not exist • Need to keep key locks as well
Locking Hints • DB decides what locks to use • Shared or exclusive lock? • Locks can be converted normally • Programmer can override with ‘hints’ • Programmer knows what will happen next • Avoid deadlocks? Select * from accounts (updlock) where acc_no = 123 Update accounts set balance= … where acc_no=123
T1 Lock A Lock B Deadlocks • Normally applications just wait for locks to be granted • Sometimes dependencies between locks means they would wait forever Granted Lock B Lock A T2 A T1 T2 B T2 T1 Waiting
Deadlocks • Db performs locking graph analysis • Deadlock if loop found! • Solution? • Pick a process/transaction and return a db error • Application recovers or dies… • Transaction abort and retry?
Deadlocks • Deadlock avoidance is an application coding problem – and a hard one • Use ‘canonical locking orders’ • Define a standard locking order • Invoice header before invoice details • Nice idea in theory • Can still get ‘conversion deadlocks’
Conversion Deadlocks • Database uses shared locks rather than exclusive locks for reading • Can convert to exclusive later • Deadlocks when DB cannot do convert Granted Select next from keytable where type=1 Update keytable set next=next+1 where type=1 K1 T1(s) T1(x) T2(s) T2(x) Waiting
Conversion Deadlocks • A use for locking hints • Tell DB to get exclusive lock earlier Granted Select next from keytable (updlock) where type=1 Update keytable set next=next+1 where type=1 K1 T1(x) T2(x) Waiting
Performance • Blocking on waits undesirable • Remove hot spots • ‘next entry’ counters, summary information, end of file counter • Avoid altogether • Cache high contention records • Reduce ‘path length’ • Obtain locks as late as possible
Recovery • Durability and redundancy • Keep critical information on disk • In-memory copies for performance • Ensure disk writes complete before continuing at critical times • Keep multiple copies of disk data • Protecting against … • Memory loss when system fails • Disk file loss with disk failure