3. Transaction Processing Communications. CSE 593 Transaction Processing Philip A. Bernstein. Outline. 1. Introduction 2. Remote Procedure Call (RPC) 3. LU6.2 Peer-to-Peer 4. Comparing Peer-to-Peer to RPC. 3.1 Introduction.
3. Transaction Processing Communications CSE 593 Transaction Processing Philip A. Bernstein
Outline 1. Introduction 2. Remote Procedure Call (RPC) 3. LU6.2 Peer-to-Peer 4. Comparing Peer-to-Peer to RPC
3.1 Introduction • Three paradigms for communications between application programs in transactions • remote procedure call - procedure calls between address spaces • peer-to-peer messages - send-message / receive-message • queues - enqueue, dequeue to a shared queue • These paradigms are not unique to TP, but they all have TP-specific aspects
3.2 Remote Procedure Call • Program calls remote procedure the same way it would call a local procedure • variation - asynchronous call and return, for single-threaded client • most widely-used standard is RPC in the Open Software Foundation’s Distributed Computing Environment • Hides certain underlying complexities • communications and message ordering errors • data representation differences between programs
Transparent Transaction IDs • Ideally, Start returns a transaction ID that’s hidden from the caller • Procedures don’t need to explicitly pass transaction IDs. • This is easier and avoids errors • Moreover, when a transaction first arrives at a site, the local transaction manager needs to be notified. • The application shouldn’t need to deal with this • This is what makes RPC (or other paradigms) transactional.
Binding • Interface definitions • usually written in an interface definition language (IDL) • compiles into Proxy and Stub programs • could be generated directly from program without IDL • Client calls the Proxy (representing the server) • Stub calls the Server (represents the client on the server) • IDL compiler also produces header files • Marshaling • proxy marshals (lays out sequentially) calling parameters in a packet and decodes marshaled return values • stub decodes marshaled calling parameters and marshals return parameters
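The proxy and stub roles can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical procedure add(a, b) and a made-up wire format of 32-bit integers; a real IDL compiler would generate the equivalent of proxy_add and stub from the interface definition.

```python
import struct

# Hypothetical wire format: two signed 32-bit integers in, one out.
CALL_FORMAT = "!ii"
REPLY_FORMAT = "!i"


def add(a, b):
    """The server procedure."""
    return a + b


def stub(call_packet):
    """Server stub: decode parameters, call the server, marshal the result."""
    a, b = struct.unpack(CALL_FORMAT, call_packet)
    return struct.pack(REPLY_FORMAT, add(a, b))


def proxy_add(a, b, transport=stub):
    """Client proxy: marshal parameters, send, decode the marshaled result."""
    call_packet = struct.pack(CALL_FORMAT, a, b)
    reply_packet = transport(call_packet)  # stands in for the network hop
    (result,) = struct.unpack(REPLY_FORMAT, reply_packet)
    return result
```

The client calls proxy_add exactly as it would call add locally; only the proxy and stub know that the parameters travel as a packet.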
Binding (cont’d) • Communications binding • client must find the server at runtime • server location could be stored in a directory service • binding may be done transparently by RPC runtime or some burden may be placed on application • Application programmer’s view • write interface definitions • program multithreaded client to avoid blocking the process on each call • binding may involve importing/exporting interfaces, defining security, connecting sessions, ...
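RPC Walkthrough • Client’s system: the client app calls P; the client proxy packs the arguments; the RPC runtime sends the call packet and the client waits • Server’s system: the RPC runtime receives the call packet; the server stub unpacks the arguments and calls P; the server app does the work and returns; the stub packs the results; the RPC runtime sends the return packet • Client’s system: the RPC runtime receives the return packet; the proxy unpacks the results and returns to the caller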
Stateful Servers • Sometimes a server maintains state on client’s behalf. E.g., • Server scans a file. Each time it hits a relevant record it returns it. Next call picks up the scan where it left off. • Server maintains large user profile information • Web server maintains a shopping basket or itinerary or ... • Approach 1: client passes state to server on each call, and server returns it on each reply. Server retains no state. • This is the default assumption outside TP, but doesn’t work well for TP, because there’s too much state • Note that transaction id context is handled this way.
Stateful Servers (cont’d) • Approach 2: server maintains state, indexed by client id (e.g. transaction id or cookie). Later RPCs from the client must go to the same server and pick up the retained context. • RPC can provide a binding handle to direct subsequent calls to the same server. • If the client fails (e.g. it aborts the transaction), server must be notified to release client’s context • It’s just like a resource manager that releases locks • So encapsulate context as a (volatile) resource • Deallocate based on timeout (e.g., web client disappears) • If state must be maintained across transaction boundaries, then treat it like any resource manager (e.g. DBMS)
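A minimal sketch of Approach 2, assuming a hypothetical ScanServer whose retained context is a scan cursor indexed by client ID. The release and expire_idle methods stand in for the notification on commit or abort and for timeout-based deallocation; the names and the injected clock are illustrative only.

```python
import time


class ScanServer:
    """Keeps a scan cursor per client ID (e.g. a transaction ID or cookie)."""

    def __init__(self, records, timeout_seconds=30.0, clock=time.monotonic):
        self.records = records
        self.timeout = timeout_seconds
        self.clock = clock
        self.contexts = {}  # client_id -> (cursor position, last-used time)

    def next_record(self, client_id):
        """Return the next record for this client, or None at end of scan."""
        position, _ = self.contexts.get(client_id, (0, None))
        if position >= len(self.records):
            return None
        self.contexts[client_id] = (position + 1, self.clock())
        return self.records[position]

    def release(self, client_id):
        """Called when the client's transaction commits or aborts."""
        self.contexts.pop(client_id, None)

    def expire_idle(self):
        """Drop contexts of clients that have been silent past the timeout."""
        now = self.clock()
        for client_id, (_, last_used) in list(self.contexts.items()):
            if now - last_used > self.timeout:
                del self.contexts[client_id]
```

Because the context lives in one server's memory, later calls from the same client must reach that same server, which is what the binding handle provides.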
Stateful Servers in MTS • Client creates a server object • Currently (MTS 2.0) costs a round-trip to the server • Server object can maintain state • Client can call (the same) server object many times • Server object accesses its retained state • SetComplete by server app says that transaction can be committed and state can be deleted • Ditto for SetAbort, except transaction is aborted • EnableCommit by server app says that transaction can be committed by client but don’t delete server state • Ditto for DisableCommit, but transaction can’t commit
Parameter Translation • During marshaling, proxy and stub can translate between client’s and server’s representation • either put the parameters into a standard canonical format, such as ASN.1/BER and NDR, or • ensure the server can interpret the client’s format (receiver-makes-it-right) • Latter is better for a homogeneous system • but requires the stub to deal with multiple client formats • Also is used to handle different machine representations (little endian / big endian)
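The two translation strategies can be contrasted with a small sketch for a single 32-bit integer. The one-byte tag used for receiver-makes-it-right is a made-up convention for illustration, not the layout of ASN.1/BER or NDR.

```python
import struct
import sys


def marshal_canonical(value):
    """Sender always converts to one agreed format (big-endian here)."""
    return struct.pack(">i", value)


def unmarshal_canonical(packet):
    """Receiver always converts from the agreed format."""
    (value,) = struct.unpack(">i", packet)
    return value


def marshal_native(value):
    """Sender writes its own byte order plus a one-byte tag naming it."""
    tag = b"L" if sys.byteorder == "little" else b"B"
    return tag + struct.pack("=i", value)


def unmarshal_receiver_makes_right(packet):
    """Receiver reads the tag and decodes in the sender's byte order."""
    order = "<" if packet[:1] == b"L" else ">"
    (value,) = struct.unpack(order + "i", packet[1:])
    return value
```

With the canonical format both sides may pay a conversion even when they share a representation; with receiver-makes-it-right a conversion happens only when the formats differ, at the price of the receiver understanding every sender format.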
Other Desirable Features • A way of pipelining large parameters on call or return (e.g. for queries). “Pipe” in DCE/RPC. • Pass a handle as parameter, with a type, so client and server agree on what’s being passed • Receiver can “claim” pieces, a chunk-at-a-time • Callbacks - a server calls a procedure in the client • Essentially a reverse RPC • Needs another controlled binding, to the client entry point • Useful for controlled conversational access between server and client • Not in DCE RPC
Load Balancing • For parameter-based routing, client has many server bindings and picks the right one, per request • To balance the load across many identical servers • each client can randomly choose a server, to spread the load among them, or • a dispatcher can monitor the load of servers and direct requests to lightly loaded servers • For long-running activities, migrate active load to a lightly loaded server • Requires moving retained context to the other server • Redirect later messages. Possibly rebind to other server.
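The two ways of spreading load across identical servers might look as follows. This is a sketch under simple assumptions: load is measured as the number of outstanding requests, and the Dispatcher class and its method names are hypothetical.

```python
import random


def pick_random(servers, rng=random):
    """Client-side choice: spread requests by picking any server."""
    return rng.choice(servers)


class Dispatcher:
    """Tracks outstanding requests per server and picks the least loaded."""

    def __init__(self, servers):
        self.load = {name: 0 for name in servers}

    def route(self):
        """Choose the server with the fewest outstanding requests."""
        server = min(self.load, key=self.load.get)
        self.load[server] += 1
        return server

    def finished(self, server):
        """Record that a request on this server has completed."""
        self.load[server] -= 1
```

Random choice needs no shared information, while the dispatcher makes better decisions but becomes a component that every request passes through.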
Security • The binding process has security guarantees • The client must have privileges to bind to the server • The client must know it’s binding to an appropriate server • E.g. during session creation, client and server authenticate each other • Server may do client authentication per-access too • Usually, the server runs in its own security context • In MTS, server can ask for client’s role and adjust the client’s privileges appropriately
Fault Tolerance • What to do if a client doesn’t receive a reply within its timeout period? • Why not just retry? • In TP, many RPC calls are not idempotent • Idempotent = any number of operation executions has the same effect as one execution • Queries (read-only) are idempotent, but not most updates • Send a “ping” for non-idempotent calls • After giving up, ignore late-arriving responses… • Can’t assume that the call didn’t run, so abort the caller’s transaction (it’s up to the application)
Fault Tolerance (cont’d) • Interface definition can say whether server is idempotent • Could even be done per member function • More abstract view • RPC executes idempotent calls at least once • RPC executes non-idempotent calls at most once • If the goal is exactly once, execute the RPC within a transaction and use transaction retry logic to ensure transaction actually runs (cf. queuing discussion, later)
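One common way to get at-most-once behavior for a non-idempotent call is to tag each request with an ID and have the server suppress duplicates. The slides do not say how a given RPC system does this; the sketch below is an assumption for illustration, and its reply table is volatile, so it gives no guarantee across a server crash.

```python
class AtMostOnceServer:
    """Runs each request ID at most once; retries get the saved reply."""

    def __init__(self, procedure):
        self.procedure = procedure
        self.replies = {}  # request_id -> saved reply

    def handle(self, request_id, *args):
        # A retry with a known ID returns the saved reply without re-running.
        if request_id not in self.replies:
            self.replies[request_id] = self.procedure(*args)
        return self.replies[request_id]


class Account:
    def __init__(self, balance):
        self.balance = balance

    def debit(self, amount):
        # Not idempotent: running it twice debits twice.
        self.balance -= amount
        return self.balance
```

A client that times out can then resend the same request ID safely; getting exactly-once semantics still needs the transactional retry logic mentioned above.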
Performance • There are basically 3 costs • marshaling and unmarshaling • RPC runtime and network protocol • physical wire transfer • In a LAN, these are typically about equal • Typical commercial numbers are 10-15K machine instructions • Can do much better in the local case by avoiding a full context switch
3.3 LU6.2 Peer-to-Peer • Peer-to-peer is a programming model based on send-message and receive-message. • In TP, the de facto standard is the LU6.2 protocol with the APPC and CPI-C interfaces • Programs establish a conversation (i.e. a session) via Allocate • They then send and receive messages over the conversation using Send_Data and Receive_Data • They close the conversation with Deallocate • Uses the chained transaction model. Announce “transaction done” using Syncpoint or Backout. • One pipe model - data (send/receive) and control (2-phase commit) messages flow on the same session.
Two-way Alternate • A conversation is half-duplex. • Reflects the call-return style of most TP communications • One participant is in send mode and the other is in receive mode. • The sender must explicitly turn over send control to the receiver, in a Send_Data call. • The receiver can’t start sending until it receives from the sender (using Receive_Data) a “send-mode signal” (a.k.a. “polarity indicator”)
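The half-duplex discipline can be modeled as a small state machine. This is a toy sketch, not the CPI-C API: the Conversation class, its method names, and the turn_over flag are hypothetical, and a receive on an empty queue raises an error where a real implementation would block.

```python
class Conversation:
    """Half-duplex conversation between two named parties."""

    def __init__(self, sender, receiver):
        self.parties = (sender, receiver)
        self.sender = sender      # party currently in send mode (or None)
        self.receiver = receiver  # party currently in receive mode
        self.inbox = []           # messages waiting for the receiver

    def _other(self, party):
        a, b = self.parties
        return b if party == a else a

    def send_data(self, party, data, turn_over=False):
        if party != self.sender:
            raise RuntimeError(f"{party} may not send now")
        self.inbox.append((data, turn_over))
        if turn_over:
            # No one may send until the send-mode signal is received.
            self.sender = None

    def receive_data(self, party):
        if party != self.receiver:
            raise RuntimeError(f"{party} is not in receive mode")
        data, turn_over = self.inbox.pop(0)
        if turn_over:
            # The send-mode signal has arrived: the roles swap.
            self.sender, self.receiver = party, self._other(party)
        return data
```

The receiver gains send mode only when it actually receives the message carrying the signal, which mirrors the rule in the last bullet.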
Conversation Trees • When a program issues an Allocate(program-name), the called instance of program-name becomes a child of the caller • Thus Allocate calls from programs cause a conversation tree to develop • E.g. A calls Allocate(B); B calls Allocate(C), Allocate(D), and Allocate(A) • [Figure: conversation tree with root A; B is A’s child; C, D, and A´ (a new instance of A) are B’s children]
Synchronization Levels • There are 3 levels of synchronization in LU6.2 • Level 2 - programs in the conversation tree execute in a transaction. Each program explicitly says when to commit by issuing “Syncpoint”. • Level 1 - No transactions. Each program can acknowledge receipt of a message by issuing a Confirm signal, which is meant to indicate that the program has processed the message(s). • Level 0 - No transactions. No confirm. Just send and receive messages over a conversation. • Many non-IBM implementations are level 0.
Syncpoint Rules • A program issues Syncpoint to announce it’s done with its part of the transaction • Causes Syncpoint message to propagate to its neighbors in the conversation tree. • A program can issue Syncpoint if either • all of its conversations are in send mode, and it has not received a Syncpoint request over any other conversation, or • all but one of its conversations are in send mode, and it received a Syncpoint over the receive-mode conversation • Syncpoint blocks the caller until the whole transaction is committed or aborted (return code tells which).
Syncpoint Rules (cont’d) • Next statement is part of a new transaction (chained model) • all programs in the conversation are part of the same new transaction (chaining is in the protocol, not just the API) • Eliminates some but not all protocol errors. E.g., • A and C are in send mode to B, and no Syncpoints yet • A and C issue Syncpoint, which collides at B • B is stuck and will never satisfy the rules • LU6.2 is commit-from-anywhere. I.e. any program in the conversation tree can be the first to call Syncpoint. It needn’t be the root of the conversation tree. • [Figure: B holds one conversation with A and one with C]
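The rule for when a program may issue Syncpoint can be written as a predicate over the state of its conversations. The function name and the dictionary encoding are hypothetical; the logic follows the two cases stated in the Syncpoint rules, and the last call in the usage note reproduces the collision at B.

```python
def may_issue_syncpoint(conversations):
    """conversations maps a conversation name to (mode, syncpoint_received).

    mode is "send" or "receive" from this program's point of view;
    syncpoint_received is True if a Syncpoint request has arrived over
    that conversation.
    """
    receive_mode = [c for c, (mode, _) in conversations.items() if mode == "receive"]
    got_syncpoint = [c for c, (_, got) in conversations.items() if got]

    if not receive_mode:
        # Rule 1: all in send mode and no Syncpoint request received.
        return not got_syncpoint
    if len(receive_mode) == 1:
        # Rule 2: the single receive-mode conversation delivered the Syncpoint.
        return got_syncpoint == receive_mode
    return False
```

For the collision example, B is in receive mode on both conversations, so may_issue_syncpoint({"A": ("receive", True), "C": ("receive", True)}) is False and B can never proceed.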
Boolean Proc Pay_Bill(dda_acct#, cc_acct#)
{
  Allocate net_addr1.pay_cc sync_level syncpoint returns conv_A;
  Allocate net_addr2.debit_dda sync_level syncpoint returns conv_B;
  Send_data conv_A, cc_acct#;
  Receive_data conv_A, cc_amount;
  Receive_data conv_A, What_received=Send;
  Send_data conv_B, dda_acct#, cc_amount;
  Syncpoint;
  Receive_data conv_B;
  Deallocate conv_A;
  Deallocate conv_B;
  If (What_received=Syncpoint) return (TRUE);
  else return (FALSE)
}
Void Procedure Pay_cc(acct#)
{
  Receive_allocate Returns conv_C;
  Receive_and_wait Gets acct#, Data_complete;
  Exec SQL Select AMOUNT into :amt From CREDIT_CARD Where (ACCT_NO = :acct#);
  Exec SQL Update CREDIT_CARD Set AMOUNT = 0 Where (ACCT_NO = :acct#);
  Receive_and_wait What_received = Send;
  Send_data amt;
  Receive_data What_received=Take_Syncpoint;
  Syncpoint;
}
Void Procedure Debit_dda(acct#, amt)
{
  Receive_allocate Returns conv_C;
  Receive_and_wait Gets acct#, amt, Data_complete;
  Receive_and_wait What_received = Take_Syncpoint;
  Exec SQL Update ACCOUNTS Set BALANCE = BALANCE - :amt Where (ACCT_NO = :acct# and BALANCE >= :amt);
  If (SQLCODE == 0) Syncpoint else Rollback;
}
Stateful Programs • It’s a connection-oriented communications model • A conversation names some shared state between the communicating programs • direction of communications • direction of the link • transaction id • state of the transaction • Since programs hold conversations across message exchanges, they may rely on each other’s retained state from previous message exchanges.
Stateful Programs (cont’d) • E.g., P1 has a connection to P2. P1 scans a file owned by P2. P2 maintains a cursor (retained state), indicating P1’s position in the file. • Since connections aren’t recoverable across system failures, programs must be able to reconstruct retained state after they recover. • I.e. after each transaction commits or aborts • When a session is lost, programs must be able to release retained state (needed anyway to abort automatically when a program fails)
ISO TP • It’s a protocol, not an API. • It’s the “standard”, but not widely supported (LU6.2 won) • ISO TP is based on LU6.2 model with modifications • globally unique transaction id (LU6.2 uses small integers starting with one) • commit from root only • application-defined data transfer (two-way simultaneous and two-way alternate) • Different commit protocol than LU6.2 (presumed abort and simple heuristics) • Layered on ISO/OSI protocol stack
LU6.2 Gateways • Most TP system products include a gateway that makes an LU6.2 server program look like a server application in the client’s environment • Example: COM Transaction Integrator (COM TI) in SNA Server 4.0 • Generates an Automation proxy running as a package in MTS that talks LU6.2 to its back end. • Proxy translates between COM and IBM data types • To generate the proxy interface, you can manually create a type lib (using a GUI tool) or you can run a COBOL scanner that extracts interface definitions
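The COMTI Architecture • [Figure: a client app calls a COM TI proxy, an MTS component running in MTS on Windows NT Server, which goes through SNA Server to a TP on MVS; also shown are the COM TI Component Builder, the .tlb it produces, and the COM TI Admin Tool]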
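[Figure: example claim-processing application spanning IIS and MTS on NT Server, SQL Server on NT Server, and CICS on MVS with VSAM and DB2; labeled elements are ClaimKey, Validate Claim, Process Claim, Insert Claim, Insert Claim Summary, and Commit]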
3.4 Comparing Peer-to-Peer to RPC • These two communications models differ in several ways • flexibility of message sequences • program termination model • management of distributed state
Message Passing Flexibility • Request-reply protocols (RPC) require programs to properly nest their request and reply messages. • Example - Request-reply matching • [Figure: timeline for programs A, B, and C with two calls followed by two returns; the inner call returns before the outer one]
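Proper nesting of requests and replies amounts to a stack discipline, which the following sketch checks for a recorded message trace. The function name and the tuple encoding of the trace are hypothetical.

```python
def properly_nested(root, trace):
    """Check a trace of ("call", caller, callee) and
    ("return", callee, caller) events for request-reply nesting."""
    active = [root]  # programs with a call in progress, innermost last
    for kind, source, target in trace:
        if kind == "call":
            if source != active[-1]:
                return False  # only the innermost active program may call
            active.append(target)
        elif kind == "return":
            if len(active) < 2 or source != active[-1] or target != active[-2]:
                return False  # a return must answer the most recent open call
            active.pop()
        else:
            return False
    # Every call must have been answered by the end of the trace.
    return active == [root]
```

A trace in which A calls B, B calls C, C returns to B, and B returns to A passes the check; a trace in which B returns to A while its call to C is still open does not.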
Message Passing Flexibility (cont’d) • But peer-to-peer allows arbitrary message flows between the two parties to a conversation • Example - peer-to-peer message passing • [Figure: timeline of Send and Rcv messages among programs A, B, and C] • To communicate with an application that uses peer-to-peer, you must know the message flows (protocol) that the application expects
Termination Model • In RPC, a program normally announces termination by returning to its caller • It must not return until all of its outbound calls have returned • In peer-to-peer, a program announces termination by invoking Syncpoint. • This also tells the program’s transaction to start committing, but each program decides independently when to commit (by issuing Syncpoint) • Termination errors are the price of more message passing flexibility, such as ...
Termination Model (cont’d) • Certain programming errors are possible in peer-to-peer • P1 invokes Syncpoint, but P2 is waiting for a message from P1. P1 and P2 are deadlocked. • P2 gives up waiting for P1’s message, so P2 invokes Syncpoint. P2 must be ready for P1’s message after Syncpoint returns.
Connection Models • To cope with stateful servers, both models need a way to manage shared state. • In peer-to-peer, the state is implicitly attached to conversation (session) context • In RPC, it is either exchanged in parameters or a session is created above the communications layer using a binding handle or cookie. • In both models, need to clean up retained state after a failure and need to reconstruct shared state at appropriate times.