Phoenix Project: Making Applications Persistent
David Lomet, Roger Barga (Microsoft Research, Redmond)
Gerhard Weikum (University of the Saarland)
Mark Tuttle (Compaq Cambridge Research Lab)
Interns: Sanjay Agrawal, Thomas Baby, Sirish Chandrasekaran, Stylianos Paparizos
Ex: An E-Commerce Server
[Diagram: Internet, IP Sprayer, Web Servers, TP App Servers, DBMS. The TP App Server uses transactions to access the DBMS.]
Problem
• More than half of a TP system is outside the transaction
  • Much of it is distributed computing
  • Some is TP-specific (e.g. persistent queues, workflow)
• Current “fact of life” after a crash
  • Databases recover to the last committed transaction
  • But applications “disappear”
    • They crash themselves and cannot recover
    • Or the DBMS crashes, and they cannot continue execution
• Availability requires programmer actions
  • “Stateless” applications
  • Explicit storing of state in queues or database
Traditional TP Approach: Stateless Components
[Diagram: a client machine reaches components (Cmpnt1) in server processes via DCOM over the network, through a request queue and a reply queue.]
• Each method call is a transaction
  • Read state from a transactional queue
  • Invoke the method
  • Write state to the transactional queue
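The read/invoke/write cycle of a stateless component can be illustrated with a minimal Python sketch. The `TransactionalQueue` class, `handle_call`, and the shopping-cart method are hypothetical names invented for this illustration; an in-memory deque stands in for a real persistent queue, and "abort" is simulated by putting the old state back.

```python
from collections import deque


class TransactionalQueue:
    """Toy in-memory stand-in for a persistent transactional queue (hypothetical)."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)

    def dequeue(self):
        return self._items.popleft()


def add_item(state, item):
    """Stateless business method: takes the old state, returns the new state."""
    return {"cart": state["cart"] + [item]}


def handle_call(queue, method, *args):
    """One method call treated as one transaction."""
    state = queue.dequeue()              # read state from the queue
    try:
        new_state = method(state, *args)  # invoke the method
    except Exception:
        queue.enqueue(state)             # "abort": restore the old state
        raise
    queue.enqueue(new_state)             # "commit": write the new state
    return new_state


queue = TransactionalQueue()
queue.enqueue({"cart": []})
handle_call(queue, add_item, "book")
result = handle_call(queue, add_item, "pen")
```

The component itself holds nothing between calls; everything it needs travels through the queue, which is exactly the burden that the rest of the talk tries to lift from the programmer.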
Stateful Applications Now
[Diagram: a client machine connects over the network via DCOM to server machines running component-based app code, which in turn use local services and data on further server machines.]
• Application programmers tend to write stateful applications that retain state across method calls
• The problem with stateful apps is the risk of losing state as a result of a failure
Transparent Persistence
• Programmer simply writes the program
• System ensures persistence. HOW?
• Virtual components
  • Virtual component mapped to physical
  • Mapping can be changed
• Extract and restore state of virtual component
  • Into physical component
  • Under system discretion for scalability
  • As recovery technique after system crash
• Efficiency by
  • Replaying to rebuild state
  • Not by simply frequently saving entire component state
  • Rationale: modest length of execution histories
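As a rough illustration of rebuilding state by replay rather than by saving it, the following Python sketch maps a virtual component onto a replaceable physical object. All names (`VirtualComponent`, `Counter`, `crash`, `recover`) are hypothetical, and a plain list stands in for a stable log; this is not the Phoenix implementation.

```python
class Counter:
    """Physical component: ordinary stateful code, unaware of logging."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n
        return self.total


class VirtualComponent:
    """Stable virtual identity mapped onto a replaceable physical object."""

    def __init__(self, factory):
        self._factory = factory
        self._log = []                  # stands in for a stable log
        self._physical = factory()

    def call(self, method, *args):
        self._log.append((method, args))            # log the call first
        return getattr(self._physical, method)(*args)

    def crash(self):
        self._physical = None           # simulate losing the physical object

    def recover(self):
        self._physical = self._factory()            # fresh physical component
        for method, args in self._log:              # replay the short history
            getattr(self._physical, method)(*args)


vc = VirtualComponent(Counter)
vc.call("add", 5)
vc.call("add", 7)
vc.crash()
vc.recover()
total_after_recovery = vc.call("add", 1)
```

Only the calls are logged, never the whole `Counter` object, which pays off when execution histories are of modest length.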
Important Benefits for
• Application programmer
  • Program does not have to deal with crashes
    • Greatly simplifying the program
  • Natural stateful applications
  • More focus on correctness/functionality
• Operational application
  • No manual restore of system or application state
  • Hence shorter outages
• Users/customers
  • Less lost work, fewer frustrations
• Business
  • Increased availability means
    • Less lost revenue (e.g. eBay lost $1M/hr in revenue)
    • Happier customers
Outline of Talk
• Background
  • Application availability problem
  • Phoenix goal: transparent application persistence
• Persistent Stateful Applications
  • Fundamentals of multi-component protocols
    • Three forms of components
    • Three interaction contracts
• Two sub-projects
  • Phoenix/COM for persistent components
  • Phoenix/ODBC for transactional components
  • External components need no system support
Persistent Stateful Applications
For general multi-tier applications, including web/app servers and database backends
Component Types
• Persistent (PCOM): e.g. web/application servers
  • State survives system failures
  • Phoenix/COM component acts as PCOM
    • Via logging interactions with other system components
• Transactional (TCOM): e.g. SQL DB sessions
  • Transactions can abort; aborted state may “die”
  • But committed states and messages survive system failures
  • Phoenix/ODBC session acts as TCOM
    • Via server reply logging
• External (XCOM): e.g. users, autonomous sites, etc.
  • Not under our control (need do nothing)
  • Prompt logging by PCOM limits exposure
  • Only a failure during an interaction is “unmasked”
Interaction Contracts
• Committed interaction contract (CIC)
  • PCOM ↔ PCOM
  • Guarantees that the interaction persists across failures
• Transactional interaction contract (TIC)
  • PCOM ↔ TCOM
  • Permits the transactional component to abort
  • But the final commit and its result message are persistent
• External interaction contract (XIC)
  • PCOM ↔ XCOM
  • Permits interaction with the external world, which does not play by our rules
  • Immediate forced logging by PCOM to limit the failure window
System Schematic
[Diagram: Client, Web Server, App Server, and “SQL Server” in a chain, linked by XIC, CIC, and TIC respectively.]
• Replay needs are captured at component interactions
  • Actions depend on the component types involved
  • Interaction contracts provide persistence guarantees
• Forms of components and interaction contracts
  • Persistent [PCOM]: pervasive within system
  • External [XCOM]: at system edges (can initiate work)
  • Transactional [TCOM]: at system edges (receives work)
Phoenix/COM: Persistent Components
Transparently, for multi-tier applications
Phoenix/COM Approach
• Persistence as a runtime service
  • Component registered as persistent [i.e. a PCOM]
  • Similar to COM+ transaction property registration
• State persistence is transparent and recovery automatic
  • Phoenix/COM runtime provides interaction contracts
  • Programmers write application logic only
• Via application replay
  • Piece-wise deterministic assumption
    • Between non-deterministic events, program execution is repeatable
  • Non-determinism removed by logging, e.g.
    • User interactions, procedure calls, reading the system clock, etc.
• Efficiency by minimizing log forces & logged data
  • Via efficient interaction protocols & “logical” log operations
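The piece-wise deterministic assumption can be made concrete with a small Python sketch in which the only non-deterministic event is a clock read. `ReplayableClock` and `stamp_order` are hypothetical names chosen for this illustration, and the log is an in-memory list rather than a forced stable log.

```python
import time


class ReplayableClock:
    """Reads the system clock normally; returns logged values during replay."""

    def __init__(self, log=None):
        self.replaying = log is not None
        self.log = list(log) if log is not None else []
        self._pos = 0

    def now(self):
        if self.replaying:
            value = self.log[self._pos]   # reuse the recorded value
            self._pos += 1
            return value
        value = time.time()               # non-deterministic event
        self.log.append(value)            # log it so replay can repeat it
        return value


def stamp_order(clock, order_id):
    """Deterministic given the clock values it observes."""
    return f"{order_id}@{clock.now():.6f}"


# Normal execution: clock reads are logged as they happen.
live = ReplayableClock()
original = [stamp_order(live, 1), stamp_order(live, 2)]

# Replay after a (simulated) crash: the same code runs against the log.
replay = ReplayableClock(log=live.log)
replayed = [stamp_order(replay, 1), stamp_order(replay, 2)]
```

Because everything between two clock reads is deterministic, replaying with the logged values reproduces exactly the state the component had before the failure.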
Recovery Infrastructure
[Diagram: App 1 and App 2 are COM objects, each wrapped by Phoenix/COM with its own log; a CIC links the two.]
• Log interactions and checkpoint state only if required to fulfill CIC requirements
• The CIC keeps apps recoverable
• Apps see the COM interface
Phoenix/COM Elements
• Virtual components
  • Isolate application from physical component failure
  • Permit another component to “continue” execution
• Normal operation: interception and logging
  • Log activation, call, etc. to capture object state
• Recovery: detect failure & recover
  • Install virtual object state in new physical object
  • Error handler reconnects to recovered virtual object
• Implementation on context infrastructure
  • Feature of COM+, .NET CLR, COMApps
Committed Interaction Contract: Phoenix/COM Responsibility
• Sender guarantees:
  • S1: Persistent state at least up to send
  • S2: Persistent message
    • S2A: Repeated transmission retry
    • S2B: Transmission on request
• Receiver guarantees:
  • Duplicate message elimination
  • Persistent state with message receive: S2A released
    • Subsequently, receiver must request message
  • Message content stable: S2B released
    • Sender can now discard message contents
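A simplified Python sketch of the sender and receiver obligations follows. The class and method names, the `"ack"` reply, and the lossy channel are all assumptions made for illustration; dictionaries and sets stand in for stable storage, and the release of S2A and S2B is collapsed into an acknowledgement plus an explicit `release` call.

```python
class Receiver:
    def __init__(self):
        self.seen = set()       # stands in for stable receiver state
        self.applied = []

    def deliver(self, msg_id, payload):
        if msg_id not in self.seen:       # duplicate message elimination
            self.seen.add(msg_id)
            self.applied.append(payload)
        return "ack"                      # receipt is now part of stable state


class Sender:
    def __init__(self):
        self.stable_msgs = {}   # S2: persistent messages
        self.next_id = 0

    def send(self, channel, receiver, payload, max_tries=5):
        msg_id = self.next_id
        self.next_id += 1
        self.stable_msgs[msg_id] = payload            # make message persistent
        for _ in range(max_tries):                    # S2A: repeated retry
            if channel(receiver, msg_id, payload) == "ack":
                return msg_id                         # S2A released
        raise RuntimeError("receiver did not acknowledge")

    def resend(self, msg_id):
        return self.stable_msgs[msg_id]               # S2B: send on request

    def release(self, msg_id):
        del self.stable_msgs[msg_id]                  # S2B released: discard


def lossy_channel():
    """Delivers every message but loses the first acknowledgement."""
    state = {"calls": 0}

    def channel(receiver, msg_id, payload):
        state["calls"] += 1
        reply = receiver.deliver(msg_id, payload)
        return None if state["calls"] == 1 else reply

    return channel


sender, receiver = Sender(), Receiver()
msg_id = sender.send(lossy_channel(), receiver, "debit 10")
requested_copy = sender.resend(msg_id)   # receiver may still ask for it
sender.release(msg_id)                   # receiver reports content is stable
```

The lost acknowledgement forces a retransmission, and duplicate elimination ensures the payload is applied only once.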
Infrastructure: Context
[Diagram: a component inside a context that carries context properties.]
• Boundary definition around components
  • Cross-context boundary events are raised
  • Permitting cross-context call interception
• Every component lives in a context
  • Subset of an apartment
Contexts specify interception
Infrastructure: Policy
• Handler for context boundary events
  • Cancel or repeat the call
  • Examine the call state (IID, method, parameters, etc.)
  • Take action as required
• Example: synchronization policy
  • onEnter: acquire mutex
  • onLeave: release mutex
Policies enable runtime extension… as interception handlers
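The synchronization policy example can be sketched in Python as follows. This is only an analogy to the COM+ mechanism: `Context`, `SynchronizationPolicy`, `TracePolicy`, and the `on_enter`/`on_leave` method names are hypothetical, and a plain wrapper object plays the role of the context boundary.

```python
import threading


class SynchronizationPolicy:
    """onEnter: acquire mutex; onLeave: release mutex."""

    def __init__(self):
        self.mutex = threading.Lock()

    def on_enter(self, method, args):
        self.mutex.acquire()

    def on_leave(self, method, result):
        self.mutex.release()


class TracePolicy:
    """Second policy, showing that handlers can examine the call state."""

    def __init__(self):
        self.events = []

    def on_enter(self, method, args):
        self.events.append(("enter", method))

    def on_leave(self, method, result):
        self.events.append(("leave", method))


class Context:
    """Boundary around a component; raises events to registered policies."""

    def __init__(self, component, policies):
        self._component = component
        self._policies = policies

    def call(self, method, *args):
        for policy in self._policies:
            policy.on_enter(method, args)
        result = None
        try:
            result = getattr(self._component, method)(*args)
            return result
        finally:
            for policy in reversed(self._policies):
                policy.on_leave(method, result)


class Account:
    def __init__(self):
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        return self.balance


sync, trace = SynchronizationPolicy(), TracePolicy()
ctx = Context(Account(), [trace, sync])
ctx.call("deposit", 10)
balance = ctx.call("deposit", 5)
```

The `Account` code contains no locking or tracing; both behaviours are added at the boundary, which is the same hook that logging for persistence can use.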
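Infrastructure: Switcher
• The interception mechanism
[Diagram: components A and B in contexts C1 and C2, connected through a proxy, the SWITCHER, and a stub. The switcher passes the call buffer to client-side policies (Call, Return events) and to server-side policies (Enter, Leave events).]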
Role for Switcher
• Interception mechanism for method execution
  • Detects errors during processing
  • Resolves reference to physical OID
• Calls a registered error handler
  • If R_OK, it retries the method call/response
• API to rebind “virtual” object
  • To new physical OID
Enhanced switcher detects errors, provides automatic recovery, and virtualizes components…
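The switcher's detect, recover, and retry behaviour might look roughly like the Python sketch below. The `Switcher` class, `ComponentFailure` exception, `rebind` method, and the string value used for `R_OK` are all assumptions for illustration and do not reflect the actual COM+ interfaces.

```python
R_OK = "R_OK"   # hypothetical "recovered, please retry" status


class ComponentFailure(Exception):
    """Raised when the physical component behind a reference is gone."""


class Greeter:
    def __init__(self, failed=False):
        self.failed = failed

    def greet(self, name):
        if self.failed:
            raise ComponentFailure("physical component is gone")
        return f"hello, {name}"


class Switcher:
    def __init__(self):
        self._bindings = {}          # virtual OID -> physical object
        self._error_handler = None

    def rebind(self, virtual_oid, physical):
        self._bindings[virtual_oid] = physical

    def register_error_handler(self, handler):
        self._error_handler = handler

    def invoke(self, virtual_oid, method, *args):
        physical = self._bindings[virtual_oid]       # resolve physical OID
        try:
            return getattr(physical, method)(*args)
        except ComponentFailure:                     # error detected
            handler = self._error_handler
            if handler is None or handler(self, virtual_oid) != R_OK:
                raise
            physical = self._bindings[virtual_oid]   # possibly rebound
            return getattr(physical, method)(*args)  # retry the call once


def recover_greeter(switcher, virtual_oid):
    switcher.rebind(virtual_oid, Greeter())   # map to a new physical object
    return R_OK


switcher = Switcher()
switcher.rebind("greeter-1", Greeter(failed=True))
switcher.register_error_handler(recover_greeter)
reply = switcher.invoke("greeter-1", "greet", "ada")
```

The caller only ever holds the virtual identifier `"greeter-1"`, so the failure and the rebinding stay invisible to it.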
Infrastructure Summary: How Phoenix/COM leverages it…
• Context
  • Registers interception boundary
• Policies
  • Log component creation
  • Log method call/response
• Switcher
  • Detect errors from component failure
  • Remap virtual component references
  • Retry method call/response
Phoenix/ODBC
Transactional components for database servers
Phoenix/ODBC Objectives
• Persistent database sessions
  • Mask network and SQL Server failures
  • Leverage database recovery mechanisms
  • Recovery is transparent to client
• Originally: stand-alone session persistence
  • No protection from client crashes
• Realize transactional components
  • Within Phoenix/COM
    • Phoenix/ODBC wrapped DB becomes TCOM
    • Client becomes a PCOM
  • Component registered as transactional (TCOM)
SQL ODBC Database Session
[Diagram: Client, Interconnect, SQL Server.]
• Allocate ODBC handles
• Connect to SQL Server
• Set session attributes
• Send SQL statements to SQL Server
• Fetch records from result set
Phoenix/ODBC Elements
• Phoenix-enabled ODBC connection
  • Virtual session & connection from client to server
• Intercept and “wrap”
  • Log actions that affect ODBC session context
    • Connection, logon, database and userid, set options
  • Make temporary tables/cursors on server persistent
  • Rewrite SQL statement(s) to persist SQL results
  • Receive server responses and error messages
  • Fetch results from persistent result tables
• Recover session
  • Remapping virtual ODBC session to new real session
Phoenix/ODBC Infrastructure
[Diagram: the client calls the Phoenix ODBC Driver Manager, which wraps native ODBC drivers (SQL SRVR 2.5, SQL SRVR 3.0, Oracle 3.0, Informix 3.0) for SQL Server, Oracle, and Informix; the wrapped database side is labeled TCOM.]
• Currently implemented in the ODBC Driver Manager
  • Which wraps native ODBC drivers
• No changes required for ODBC driver(s)
• No changes required for database systems
• No special client programming
Transactional Interaction Contract: Phoenix/ODBC Responsibility
• Before the PCOM issues “Commit”
  • TCOM can abort, forgetting everything
  • PCOM can forget some/all state/messages
• When the PCOM issues “Commit”
  • PCOM sender promises are CIC promises
  • Committing TCOM receiver guarantees:
    • Duplicate message elimination
    • Persistent state with message receive/commit:
      • S2A, “S2B” released when TCOM commits & sends reply
      • PCOM can discard message contents when reply received
    • Persistent reply message
      • A persistent result set committed by the transaction
Default Result Set (stable reply)
[Diagram: Client, ODBC, SQL Server.]
Example query: SELECT cust_name, cust_address FROM customers WHERE acct_balance > max_allowed_balance
• Determine format of result set: SELECT … WHERE FALSE
• Create persistent table on SQL Server: CREATE TABLE phtable
• Rewrite SQL statement to load persistent table: SELECT … INTO phtable
• Open access to persistent table: SELECT * FROM phtable
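The four rewriting steps can be tried out with Python's built-in `sqlite3` module as a stand-in for SQL Server and ODBC. This is an approximation under stated assumptions: SQLite has no `SELECT … INTO`, so the sketch uses `CREATE TABLE … AS SELECT … WHERE 0` and `INSERT INTO … SELECT` instead, and the sample customer rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_name TEXT, cust_address TEXT,
                            acct_balance REAL, max_allowed_balance REAL);
    INSERT INTO customers VALUES ('Ann', '1 Elm St', 900, 500);
    INSERT INTO customers VALUES ('Bob', '2 Oak St', 100, 500);
    INSERT INTO customers VALUES ('Cy',  '3 Fir St', 700, 500);
""")

# The application's original statement, as it would be intercepted.
query = ("SELECT cust_name, cust_address FROM customers "
         "WHERE acct_balance > max_allowed_balance")

# Steps 1 and 2: a false predicate yields the result format without rows,
# and that format is used to create the persistent table.
conn.execute(f"CREATE TABLE phtable AS SELECT * FROM ({query}) WHERE 0")
columns = [d[0] for d in conn.execute("SELECT * FROM phtable").description]

# Step 3: the rewritten statement loads the result into the persistent table.
conn.execute(f"INSERT INTO phtable {query}")
conn.commit()

# Step 4: the client reads its result from the persistent table.
rows = conn.execute("SELECT * FROM phtable ORDER BY cust_name").fetchall()
```

Because the result now lives in an ordinary table, it is covered by the database's own logging and recovery, which is what makes the reply stable.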
Database Session Recovery
[Diagram: SQL Server, ODBC, session context.]
• Intercept errors and detect possible system failure
  • Ping database server (interval and timeout)
• Reconnect to database server
  • Determines if server really failed
  • Verifies persistent structures were recovered
• Reinstall session context from client log
  • Advance through result set(s) to last result seen
• Application still sees aborted transactions
Summary
• Phoenix goal: transparent app persistence
  • No application programming
  • Higher availability and scalability
• PCOM by Phoenix/COM via
  • Logging non-deterministic events
    • By infrastructure via wrapping
  • Component replay if system crashes
• TCOM by Phoenix/ODBC via
  • Inducing DBMS logging for database recovery
    • Via wrapping ODBC and modifying SQL requests
  • Inducing DBMS recovery for session state
Phoenix/COM
Project URL: http://www.research.microsoft.com/research/db/phoenix
Phoenix/COM References
• Barga, Lomet, Weikum: Recovery Guarantees for General Multi-tier Applications. (submitted for publication)
• Barga, Lomet: Phoenix/COM: Enabling persistent component-based applications. (Internal white paper)
Phoenix/ODBC References
• Barga, Lomet: Measuring and Optimizing a System for Persistent Database Sessions. Proceedings of ICDE, Heidelberg, Germany (April 2001)
• Barga, Lomet, Baby, Agrawal: Persistent Client-Server Database Sessions. Proceedings of EDBT Conference, Lake Constance, Germany (Mar. 2000) 462-477.
• Barga, Lomet: Phoenix: Making Applications Robust. (demo paper) Proceedings of ACM SIGMOD Conference, Philadelphia, PA (June 1999) 562-564.
• Lomet, Weikum: Efficient Transparent Application Recovery in Client-Server Information Systems. (Best Paper Award) Proceedings of 1998 ACM SIGMOD Conference, Seattle, WA (June 1998) 460-471.