Transaction Madness: Mostly Atomicity Does Nothing Except Spawning Seminars
Setting the Stage • At first I hoped that such a technically unsound project would collapse, but I soon realized it was doomed to success. Almost anything in software can be implemented, sold, and even used given enough determination. There is nothing a mere scientist can say that will stand against the flood of a hundred million dollars. But there is one quality that cannot be purchased in this way---and that is reliability. The price of reliability is the pursuit of the utmost simplicity. It is a price which the very rich find most hard to pay. (C.A.R. Hoare, "The Emperor's Old Clothes", Turing Award Lecture) • I have yet to see any problem, however complicated, which, when you looked at it in the right way, did not become still more complicated. (Poul Anderson) • Reality is that which, when you stop believing in it, doesn't go away. (Philip K. Dick) • The most likely way for the world to be destroyed, most experts agree, is by accident. That's where we come in; we're computer professionals. We cause accidents. (Nathaniel S. Borenstein)
Atomicity Has Arrived in the Real World • ACID • From Wikipedia, the free encyclopedia • In databases, ACID stands for Atomicity, Consistency, Isolation, and Durability. They are considered to be the key transaction processing features/properties of a database management system, or DBMS. Without them, the integrity of the database cannot be guaranteed. In practice, these properties are often relaxed somewhat to provide better performance. • In the context of databases, a single logical operation on the data is called a transaction. An example of a transaction is a transfer of funds from one account to another, even though it might consist of multiple individual operations (such as debiting one account and crediting another). The ACID properties guarantee that such transactions are processed reliably.
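The funds-transfer example from the Wikipedia text can be sketched in a few lines. This is only an illustration under assumptions that are not in the talk: it uses Python's standard sqlite3 module, and the accounts table, the transfer function, and the account names are hypothetical.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one all-or-nothing unit."""
    try:
        with conn:  # commits on normal exit, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)
            ).fetchone()
            if balance < 0:
                # The debit has already been executed; the rollback undoes it.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # both updates are committed together
failed = transfer(conn, "alice", "bob", 500)  # the debit is rolled back
```

The second call shows the all-or-nothing part: the debit is executed and then undone, so no observer of the committed state ever sees it.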
Or In This One ... http://www.hpts.ws/agenda.html
And, of Course, ... almost nobody uses pure serializable transactions.
Atomicity as a Research Topic • In the database community, transaction processing has fallen out of favor as a research topic. • The prevailing view: we understand the theoretical issues (synchronization, replication, commit processing, performance optimization, queueing, messaging), so it's a SMOP (small matter of programming) – we can leave it to the developers. • And, sure enough, much progress was made and many interesting implementations came about. • But more recently, transactions and concepts closely related to them have attracted new attention; this attention, however, comes from groups that did not typically consider those issues before. • This talk is a snapshot of that situation. Originally, I wanted to provide some kind of unifying perspective, but for reasons that (hopefully) will become clear as I go along, it has instead become a description of a conceptual mess. • Depending on how one judges the reasons for the renewed interest in transaction-based solutions, one can (ab)use this analysis as an argument for the need for more orchestrated work – or as an illustration of how ill-advised all these attempts are.
First ACM SIGPLAN Workshop on Languages, Compilers, and Hardware Support for Transactional Computing PLDI 2006 Ottawa, Canada, June 11, 2006 ::Motivation:: The goal of this workshop is to provide a forum for the presentation of research on all aspects of transactional computing. There has been much recent interest on extending programming languages, systems, and hardware with support for transactions, speculation, and related abstractions that provide alternatives to classical lock-based concurrency mechanisms. The goals of this workshop should be construed broadly to include any novel software or hardware techniques, algorithms, or implementations for transactional concurrency abstractions applicable to multi-core, multithreaded, or high-performance parallel systems. This workshop is intended to cover foundations of concurrent programming as it relates to all forms of transactional computing, as well as tools, techniques, and applications that leverage these principles. Experience reports are also welcome. Also, PPoPP05 had at least 3 papers on transactional memory / atomicity.
Who Is Interested In That Stuff? • Google for „atomicity programming“, and find this at the top of the list: http://www.soe.ucsc.edu/~cormac/atom.html • Programming Languages Reading Group - Fall 2005. Topic: This quarter we will be studying concurrency. More specifically, transactions and atomicity. Presentation Schedule:
Oct. 11: Static Race Detection (Brian)
Oct. 18: no meeting (OOPSLA)
Oct. 25: Static Atomicity Checking (Jeff)
Nov. 1: Dynamic Atomicity Checking (Mike)
Nov. 8*: Atomicity Inference (Ben)
Nov. 15: no meeting
Nov. 22: AtomCaml: First-Class Atomicity via Rollback (Chris)
Nov. 29: Composable Memory Transactions (Alex)
Dec. 6: TBD (TBD)
Dec. 13: TBD (TBD)
Dec. 20: TBD (TBD)
• http://supertech.csail.mit.edu/xaction.html
Let's Get the Terminology Straight - I From a paper I came across: „Transactions require extensive run-time support, and slow down execution significantly. If persistence and tolerance to crash failures is not needed, then a simple monitor-based design can provide the same behavior, with considerably better performance.“ Its summary says: „Atomicity is considered during initial system operation specification in the Operation Model to abstract away the complexity of concurrency. As the development process goes on, the Operation Model is refined – the system operations are broken up into smaller pieces – to slowly introduce concurrency back into the system. Finally, at the design stage, low-level concepts that provide atomicity, such as transactions or monitors, are used in the Interaction Model to ensure consistent concurrent updating of the application state.“
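To make the quoted claim concrete, here is a minimal sketch of what a "simple monitor-based design" might look like. It is an assumption-laden illustration, not code from the cited paper: the AccountsMonitor class and its methods are hypothetical, and Python's threading.Lock stands in for the monitor's mutual exclusion.

```python
import threading

class AccountsMonitor:
    """Monitor: the state is private and every method holds the same lock."""

    def __init__(self, balances):
        self._lock = threading.Lock()
        self._balances = dict(balances)

    def transfer(self, src, dst, amount):
        # The whole check-then-update sequence runs under mutual exclusion,
        # so no other thread can observe a half-finished transfer.
        with self._lock:
            if self._balances[src] < amount:
                return False
            self._balances[src] -= amount
            self._balances[dst] += amount
            return True

    def total(self):
        with self._lock:
            return sum(self._balances.values())

monitor = AccountsMonitor({"a": 1000, "b": 1000})

def worker(src, dst):
    for _ in range(1000):
        monitor.transfer(src, dst, 1)

# Four threads shuffle money back and forth concurrently.
threads = [
    threading.Thread(target=worker, args=pair)
    for pair in [("a", "b"), ("b", "a")] * 2
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(monitor.total())
```

The monitor gives isolation between concurrent callers, which is the sense of "atomicity" the quote has in mind. It gives neither rollback after a partial failure nor durability across a crash, which is exactly the trade-off the quote describes.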
Let's Get the Terminology Straight - II This is quoted from Luca Cardelli's ICSE 2005 Keynote: • Wish: Wouldn't it be nice to hide concurrency from programmers? • SQL does it well • UI packages do it fine (mostly single-threaded!) • RPC does it OK • But we are moving towards more asynchrony, i.e. towards more visible concurrency (e-commerce scripts and languages, web-services, etc.). You can hide all the concurrency some of the time, you can hide some concurrency all the time, but you cannot hide all the concurrency all the time. • Asynchronous message-based concurrency does not fit easily with more traditional shared-memory synchronous concurrency control. • Goal: Make concurrent flows available and checkable at the language level.
Let's Get the Terminology Straight - III This is a quote from a talk by Tim Harris: • Suppose we have efficient support for atomic multi-word updates • Including contention-management & fairness issues etc • Whether using a pure hardware implementation, a hybrid hardware/software one, or a pure software approach • What problems remain in using this to implement pervasive atomic blocks (i.e. spanning in-memory and external data, and expressive enough to replace rather than augment mutexes and condition variables in common uses)? • What expressiveness problems remain with using pervasive atomic blocks as a programming abstraction?
Let's Get the Terminology Straight – One Last Time The paper „Investigating Atomicity and Observability“ by Jon Burton and Cliff Jones starts by saying: „Using the fiction of atomicity as a design abstraction and then refining atomicity as we develop an implementation is widely used in areas of concurrent computing such as database systems and transaction processing.“ It then goes on to make the following clarification: „In this paper, by the term ‚atomic‘ or by the property of ‚atomicity‘, we mean the isolation property described by the I of ACID in the database literature [...]. We do not mean the all-or-nothing property which is implied by the A of ACID.“
So There Are Many Words: • Transaction (our definition) • Atomicity (this flavor) • Consistency (syntactic) • Isolation • Durability • Concurrency • Synchronization • Monitor • Transaction (your definition) • Serializability • Atomicity (that flavor) • Observability • Linearizability • Consistency (semantic) • Repeatability • Nesting • Compensation • Undo/Redo • ............
And I Have Not Mentioned ... • transactional messaging; • transactional queues; • workflow; • web service composition; • and, unfortunately, your favorite topic.
An Algebraic Analogy Up to the 16th century mathematicians treated versions of the cubic equation such as $ax^3 + bx = c$, $ax^3 = bx + c$, and $ax^3 = bx^2 + c$ as separate problems, for which individual solutions had to be found – actually some of the solutions only worked for special values of a, b, or c. Recognizing that these are all special cases of the general cubic equation $ax^3 + bx^2 + cx + d = 0$ and finding a solution for it required a major breakthrough and is generally regarded as the beginning of modern algebra.
Atomicity Is Not Easily Understood Consider the following exchange: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=273201&SiteID=1
BTW: Why Haven't Transactional Programming Languages Taken Off?
Attempts That Have Not Really Worked • About 15 years ago, there was a considerable amount of work on what was called „extended transaction models“. • „Extended“ generally referred to the fact that at least one of the ACID properties was given up in order to achieve more flexibility in some area – diluted acid, so to speak. • However, it was never really questioned whether the implementational „schema“ of a transaction and the guarantees provided to the programming environment necessarily had to be the same. • As an example, consider a very simple transaction that does an update, provided some entry condition holds. If it does, the application wants the update with all the glory of ACID. If the condition does not hold, the user has to be informed about the abort – but the message has to be delivered exactly once. So from the application's perspective we have an abort, but the reliable delivery of the abort news requires the commit of some transaction. • This is where the notion of a guarantee comes into the picture: Applications want (different types of) guarantees regarding the executions of their services. Those guarantees imply different degrees of atomicity. Transactions, on the other hand, are the means for implementing guarantees, but there is certainly no isomorphism between the two sets of concepts.
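The conditional-update example can be sketched as follows. Everything concrete here is an assumption made for illustration: Python's sqlite3 module, the stock and outbox tables, the conditional_update function, and the use of a unique request id to keep the recorded outcome idempotent. The sketch only covers recording the notification exactly once; actually delivering it to the user would be the job of a separate component reading the outbox.

```python
import sqlite3

def conditional_update(conn, request_id, item, delta):
    """Apply the update if the entry condition holds, else report an abort.

    In both cases a database transaction COMMITS: it records exactly one
    outcome row per request_id, so the outcome can be reported reliably.
    """
    with conn:  # commit on normal exit, rollback on exception
        row = conn.execute(
            "SELECT outcome FROM outbox WHERE request_id = ?", (request_id,)
        ).fetchone()
        if row is not None:
            return row[0]  # already decided earlier; do not redo anything

        (qty,) = conn.execute(
            "SELECT qty FROM stock WHERE item = ?", (item,)
        ).fetchone()
        if qty + delta >= 0:  # the entry condition (hypothetical)
            conn.execute(
                "UPDATE stock SET qty = qty + ? WHERE item = ?", (delta, item)
            )
            outcome = "success"
        else:
            outcome = "abort"  # an abort from the application's perspective
        conn.execute(
            "INSERT INTO outbox (request_id, outcome) VALUES (?, ?)",
            (request_id, outcome),
        )
    return outcome

# Hypothetical sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER NOT NULL)")
conn.execute("CREATE TABLE outbox (request_id TEXT PRIMARY KEY, outcome TEXT NOT NULL)")
conn.execute("INSERT INTO stock VALUES ('widget', 5)")
conn.commit()

first = conditional_update(conn, "r1", "widget", -3)    # condition holds
second = conditional_update(conn, "r2", "widget", -10)  # condition fails
retry = conditional_update(conn, "r2", "widget", -10)   # same request again
```

The second call is the interesting one: the application sees an abort, yet the system has committed a transaction to make the abort notice durable. The guarantee (abort) and the mechanism (commit) do not coincide.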
ACID May Not Be the „Solution“ • We have largely focused on the mechanisms and the FAPs: Locking protocols, message protocols, WAL, 2PC, etc. • We have mostly discussed things at a very low level: read/write, action as opposed to effect, etc. • The semantics of the application (what does consistent execution mean from the app's point of view) has largely been ignored. It is harder to capture, but it is more relevant.
Units of Execution • Low-level routine: $10^{-6}$ sec • I/O routine: $10^{-3}$ sec • Simple interactive TA: 1 sec • Simple query: $10^{1}$ sec • Complex query: $10^{2}$ sec • Routine workflow: $10^{3}$ sec • Simulation / data mining app.: $10^{4}$ sec • Medium-sized workflow: $10^{5}$ sec • ... • WF repr. large construction project: $10^{8}$ sec
What Are Interesting Aspects? • Define a framework for specifying guarantees about executions – which can then be supported by transactional mechanisms. • Give objects control over who is allowed to see them when – as opposed to transactions deciding unilaterally what they want to lock. That will help with fault containment, compensation, etc. • Use the log for fault containment, debugging, and related purposes. • Don't take atomicity too far: Allow things to be handled „on the side“, as long as they will be handled eventually. • Make consistency violation a first-class member of the model. • Support more realistic synchronization models for the application, not just for the internal data structures. • Use transactional guarantees for service composition. • Define different shades of isolation, e.g. „don't touch“, „escrow“, „strict promise“, „weak promise“, „optimism“, etc. – together with adequate programming (design) patterns. • Use transactional guarantees to automatically generate the skeletons for compensation activities. • When using transactions in the context of long-lived activities, how can exceptions be accommodated?
Transactional Guarantees - I
Assume a simple transaction that performs function f on some set of data D if condition C holds; otherwise it aborts.
Case | DB space | Invoc. space | Msg space | Lock space
C holds | D -> f(D) | - | Notification of success | -
C does not hold | D -> D | - | Notification of abort (not C) | -
Program calls abort for internal reasons | D -> D | - | Notification of abort (retcode) | -
System signals failure | D -> D | - | Notification of abort (retcode) | -
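The four cases of this slide can be mimicked in a small in-memory sketch. It is an illustration only: run_transaction, InternalAbort, and the notification strings are hypothetical names, and a copy of D plays the role of the recovery mechanism a real system would use to guarantee D -> D on abort.

```python
import copy

class InternalAbort(Exception):
    """Raised by the program itself to abort for internal reasons."""

def run_transaction(data, condition, f):
    """Return (resulting_data, notification) for one execution attempt."""
    if not condition(data):
        return data, "abort (not C)"          # C does not hold: D -> D
    working_copy = copy.deepcopy(data)        # f never touches the original D
    try:
        result = f(working_copy)
    except InternalAbort:
        return data, "abort (retcode): program abort"    # D -> D
    except Exception:
        return data, "abort (retcode): system failure"   # D -> D
    return result, "success"                  # C holds: D -> f(D)

D = [1, 2, 3]

def double(d):
    return [x * 2 for x in d]

def aborting(d):
    d.append(99)              # partial work on the copy, then a program abort
    raise InternalAbort()

def crashing(d):
    d.clear()                 # partial work on the copy, then a simulated failure
    raise RuntimeError("simulated system failure")

case_success = run_transaction(D, lambda d: sum(d) < 10, double)
case_not_c = run_transaction(D, lambda d: False, double)
case_program = run_transaction(D, lambda d: True, aborting)
case_system = run_transaction(D, lambda d: True, crashing)
```

In all three abort cases the data seen by the caller is the unchanged D, while the notification differs, which is the distinction the table draws between DB space and Msg space.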
Transactional Guarantees - II
Assume a simple transaction that performs function f on some set of data D and asynchronously invokes service s if condition C holds; otherwise it aborts.
Case | DB space | Msg space
C holds | D -> f(D) | Notification of success
C does not hold | D -> D | Notification of abort (not C)
Program calls abort for internal reasons | D -> D | Notification of abort (retcode)
System signals failure | D -> D | Notification of abort (retcode)
Further entries in Invoc. space and Lock space: Invoke s • Insert L(D) • Install & activ. rec. for s • Install & activ. rec. for s • Install & activ. rec. for s
If in addition the application needs to protect its updates for compensation after commit, it will store the identifiers in lock space.
It's Not Just Our Problem ... Many students identify [atomicity] with 'the' atomic theory, and consider the truth of the latter established. Textbooks list successes of the atomic theory in classical physics and chemistry. How, then, could outstanding scientists, such as Ostwald and Mach, at the turn of the century, deny the existence of atoms? The best answer is given by these anti-atomists themselves. Their controversy with Boltzmann and Planck illustrates general points in the study of the history and philosophy of science. One argument, the irreversibility paradox, is not resolved satisfactorily to this day.