The Whirlwind Tour. Chapter 1a. Transactions: Where It All Started.
The Whirlwind Tour Chapter 1a
Transactions: Where It All Started
[Cuneiform] documents now number about half a million, three-quarters of them more or less directly related to the history of law - dealing, as they do, with contracts, acknowledgment of debts, receipts, inventories, and accounts, as well as containing records and minutes of judgments rendered in courts, business letters, administrative and diplomatic correspondence, laws, international treaties, and other official transactions. The total evidence enables the historian to reach back as far as the beginnings of writing, to the dawn of history. [...] Moreover, because of the inconvenience of writing in stone or clay, Mesopotamians wrote only when economic or political necessity demanded it. (Encyclopaedia Britannica, 1974 edition)
From Transactions to Transaction Processing Systems - I The Sumerian way of doing business involved two components: • Database. An abstract system state, represented as marks on clay tablets, was maintained. Today, we would call this the database. • Transactions. Scribes recorded state changes with new records (clay tablets) in the database. Today, we would call these state changes transactions.
From Transactions to Transaction Processing Systems - II The real state is represented by an abstraction, called the database, and the transformation of the real state is mirrored by the execution of a program, called a transaction, that transforms the database.
Transactions Are In ... Communications: Each time you make a phone call, there is a call setup transaction that allocates some resources to your conversation; the call teardown is a second transaction, freeing those resources. The call setup increasingly involves complex algorithms to find the callee (800 numbers could be anywhere in the world) and to decide who is to be billed (800 and 900 numbers have complex billing). The system must deal with features like call forwarding, call waiting, and voice mail. After the call teardown, billing may involve many phone companies.
Transactions Are In ... Finance: Each time you purchase gas using a credit card, the point-of-sale terminal connects to the credit card company's computer. In case that fails, it may alternatively try to debit the amount to your account by connecting to your bank. This generalizes to all kinds of point-of-sale terminals such as cash registers, ATMs, etc. When banks balance their accounts with each other (electronic fund transfer), they use transactions for reliability and recoverability.
Transactions Are In ... Travel: Making reservations for a trip requires many related bookings and ticket purchases from airlines, hotels, rental car companies, and so on. From the perspective of the customer, the whole trip package is one purchase. From the perspective of the multiple systems involved, many transactions are executed: at least one per airline reservation, one for each hotel reservation, one for each car rental, one for each ticket to be printed, one for setting up the bill, etc. Along the way, each inquiry that may not have resulted in a reservation is a transaction, too.
Transactions Are In ... Manufacturing: Order entry, job and inventory planning and scheduling, accounting, and so on are classical application areas of transaction processing. Computer integrated manufacturing (CIM) is a key technique for improving industrial productivity and efficiency. Just-in-time inventory control, automated warehouses, and robotic assembly lines each require a reliable data storage system to represent the factory state.
Transactions Are In ... Real-Time Systems: This application area includes all kinds of physical machinery that needs to interact with the real world, either as a sensor, or as an actor. Traditionally, such systems were custom made for each individual plant, starting from the hardware. The usual reason for that was that 20 years ago off-the-shelf systems could not guarantee real-time behavior that is critical in these applications. This has changed, and so has the feasibility of building entire systems from scratch. Standard software is now used to ensure that the application will be portable.
A Transaction Processing System A transaction processing system (TP-system) provides tools to ease or automate application programming, execution, and administration of complex, distributed applications. Transaction processing applications typically support a network of devices that submit queries and updates to the application. Based on these inputs, the application maintains a database representing some real-world state. Application responses and outputs typically drive real-world actuators and transducers that alter or control the state. The applications, database, and network tend to evolve over several decades. Increasingly, the systems are geographically distributed, heterogeneous (they involve equipment and software from many different vendors), continuously available (there is no scheduled downtime), and have stringent response time requirements.
ACID Properties: First Definition • Atomicity: A transaction’s changes to the state are atomic: either all happen or none happen. These changes include database changes, messages, and actions on transducers. • Consistency: A transaction is a correct transformation of the state. The actions taken as a group do not violate any of the integrity constraints associated with the state. This requires that the transaction be a correct program. • Isolation: Even though transactions execute concurrently, it appears to each transaction T that the others executed either before T or after T, but not both. • Durability: Once a transaction completes successfully (commits), its changes to the state survive failures.
Structure of a Transaction Program • The application program declares the start of a new transaction by invoking BEGIN_WORK(). • All subsequent operations will be covered by the transaction. Eventually, the application program will call COMMIT_WORK(), if a new consistent state has been reached. This makes sure the new state becomes durable. • If the application program cannot complete properly (violation of consistency constraints), it will invoke ROLLBACK_WORK(), which appeals to the atomicity of the transaction, thus removing all effects the program might have had so far. • If for some reason the application fails to call either commit or rollback (there could be an endless loop, a crash, a forced process termination), the transaction system will automatically invoke ROLLBACK_WORK() for that transaction.
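The bracketing described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the accounts table, the transfer function, and the "no negative balance" constraint are hypothetical, and the SQL statements BEGIN, COMMIT, and ROLLBACK stand in for BEGIN_WORK(), COMMIT_WORK(), and ROLLBACK_WORK().

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from account `src` to `dst` as one transaction.

    Returns True if the transaction committed, False if it was rolled back.
    """
    cur = conn.cursor()
    cur.execute("BEGIN")  # stands in for BEGIN_WORK()
    try:
        cur.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        cur.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        # Hypothetical consistency constraint: no balance may be negative.
        (lowest,) = cur.execute("SELECT MIN(balance) FROM accounts").fetchone()
        if lowest < 0:
            raise ValueError("consistency constraint violated")
        cur.execute("COMMIT")  # stands in for COMMIT_WORK()
        return True
    except Exception:
        cur.execute("ROLLBACK")  # stands in for ROLLBACK_WORK()
        return False


def demo():
    # isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    first_ok = transfer(conn, "a", "b", 60)   # fits: commits
    second_ok = transfer(conn, "a", "b", 60)  # would overdraw: rolled back
    balances = dict(conn.execute("SELECT id, balance FROM accounts"))
    conn.close()
    return first_ok, second_ok, balances
```

In this sketch the second transfer leaves no trace: its first UPDATE has already run when the constraint check fails, and the rollback removes that partial effect, which is the atomicity property at work.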
Performance Measures of Interactive Transactions
Performance/Transaction    Small/Simple    Medium        Complex
------------------------------------------------------------------
Instr./transaction         100k            1M            100M
Disk I/O / TA              1               10            1000
Local msgs. (B)            10 (5KB)        100 (50KB)    1000 (1MB)
Remote msgs. (B)           2 (300B)        2 (4KB)       100 (1MB)
Cost/TA/second             10k$/tps        100k$/tps     1M$/tps
Peak tps/site              1000            100           1
Client-Server Computing: The CORBA Idea
[Figure labels: Client on WS (Presentation Services etc.); IDL Stub; Object Request Broker; IDL Skeleton; Object Implementation: Jim's Mailbox; Request: Delete]
Client-Server Computing: The WWW Idea
[Figure labels: WWW Browser; HTTP Server; Java applet + Java Database Connectivity (JDBC) driver code; JDBC driver; JDBC network driver (e.g. TCP/IP); JDBC-ODBC bridge; ODBC driver; proprietary protocol; public protocol; Database Server]
Terms We Have Introduced So Far • Resource manager: The system comes with an array of transactional resource managers that provide ACID operations on the objects they implement. Database systems, persistent programming languages, and queue managers are typical examples. • Durable state: Application state represented as durable data stored by the resource managers. • TRPC: Transactional remote procedure calls allow the application to invoke local and remote resource managers as though they were local. They also allow the application designer to decompose the application into client and server processes on different computers. • Transaction program: Inquiries and state transformations are written as programs in conventional or specialized programming languages. The programmer brackets the successful execution of the program with a Begin-Commit pair and brackets a failed execution with a Begin-Rollback pair.
Terms We Have Introduced So Far • Atomicity: At any point before the commit, the application or the system may abort the transaction, invoking rollback. If the transaction is aborted, all of its changes to durable objects will be undone (reversed), and it will be as though the transaction never ran. • Consistency: The work within a Begin-Commit pair must be a correct transformation. • Isolation: While the transaction is executing, the resource managers ensure that all objects the transaction reads are isolated from the updates of concurrent transactions. • Durability: Once the commit has been successfully executed, all the state transformations of that transaction are made durable and public.
Where To Split Client/Server?
[Figure labels: Server; layers Presentation, Flow Control, Application Logic (= business objects), Data Access; one scale running Thin to Fat, the other Fat to Thin]
Client/Server Infrastructure
[Figure labels: Client; Middleware; Server; Objects; Groupware; TP-Mon.; DBMS; OS; Files; GUI; OOUI; System Mgmt.; SQL; ORB; TRPC; Mail; Security; WWW; Transport; etc.]
The OTS Model
[Figure labels: transaction originator; recoverable server; Transaction service; TA-context (transmitted with request); creation; termination; invocation; commit coordination]
Transaction Processing System Feature List • Application development features Application generators; graphical programming interfaces; screen painters; compilers; CASE tools; test data generators; starter system with a complete set of administrative and operations functions, security, and accounting. • Repository features Description of all components of the system, both hardware and software. Description of the dependencies among components (bill-of-material). Description of all changes to all components to keep track of different versions. The repository is a database. Its role in the system must be complete, extensible, active and allow for local autonomy. • TP-Monitor Features Process management; server classes; transactional remote procedure calls; request-based authentication and authorization; support for applications and resource managers in implementing ACID operations on durable objects.
Transaction Processing System Feature List • Data communications features Uniform I/O interfaces; device independence; virtual terminal; screen painter support; support for RPC and TRPC; support for context-oriented communication (peer-to-peer). • Database features Data independence; data definition; data manipulation; data control; data display; database operations. • Operations features Archiving; reorganization; diagnosis; recovery; disaster recovery; change control; security; system extension. • Education and testing features Imbedded education; online documentation; training systems; national language features; test database generators; test drivers.
Summary of Chapter 1 • A transaction processing system is a large web of application generators, system design and operation tools, and the more mundane language, database, network, and operations software. • The repository and the applications that maintain it are the mechanisms needed to manage the TP system. The repository is a transaction processing application. • It represents the system configuration as a database and supplies change control by transactions that manipulate the configuration and the repository. • The transaction concept, like contract law, is intended to resolve the situation when exceptions arise. The first order of business in designing a system is, therefore, to have a clear model of system failure modes. What breaks? How often do things break?
Basic Terminology Chapter 1b
A Word About Words (Chapter 2) Humpty Dumpty: “When I use a word, it means exactly what I choose it to mean; nothing more nor less.” Alice: “The question is, whether you can make words mean so many different things.” Humpty Dumpty: “The question is, which is to be master, that’s all.” Lewis Carroll
Basic Computer Terms To get any confusion that might be caused by the many synonyms in our field out of the way, let us adopt the following conventions for the rest of this class:
domain = data type = ...
field = column = attribute = ...
record = tuple = object = entity = ...
block = page = frame = slot = ...
file = data set = table = ...
process = task = thread = actor = ...
function = request = method = ...
All the other terms and definitions we need will be briefly introduced and explained during the session.
Basic Hardware Architecture I In Bell and Newell’s classic taxonomy, hardware consists of three types of modules: Processors, memory, and communications (switches or wires). Processors execute instructions from a program, read and write memory, and send data via communication lines. Computers are generally classified as supercomputers, mainframes, minicomputers, workstations, and personal computers. However, these distinctions are becoming fuzzy with current shifts in technology.
Basic Hardware Architecture II Today’s workstation has the power of yesterday’s mainframe. Similarly, today’s WAN (wide area network) has the communications bandwidth of yesterday’s LAN (local area network). In addition, electronic memories are growing in size to include much of the data formerly stored on magnetic disk. These technology trends have deep implications for transaction processing.
Basic Hardware Architecture III • Distributed processing: Processing is moving closer to the producers and consumers of the data (workstations, intelligent sensors, robots, and so on). • Client-server: These computers interact with each other via request-reply protocols. One machine, called the client, makes requests to another, called the server. Of course, the server may in turn be a client to other machines. • Clusters: Powerful servers consist of clusters of many processors and memories, cooperating in parallel to perform common tasks.
Memories - The Economic Perspective I • The processor executes instructions from virtual memory, and it reads and alters bytes from the virtual memory. The mapping between virtual memory and real memory includes electronic memory, which is close to the processor, volatile, fast, and expensive, and magnetic memory, which is "far away" from the processor, non-volatile, slow, and cheap. The mapping process is handled by the operating system with some hardware assistance. • Memory performance is measured by its access time: Given an address, the memory presents the data at some later time. The delay is called the memory access time. Access time is a combination of latency (the time to deliver the first byte), and transfer time (the time to move the data). Transfer time, in turn, is determined by the transfer size and the transfer rate. This produces the following overall equation: memory access time = latency + ( transfer size / transfer rate )
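The access-time equation above can be written as a small Python helper. This is a minimal sketch; the function name and the sample device figures (15 ms latency, 1 MB transferred at 10 MB per second) are illustrative assumptions, not measurements of any particular device.

```python
def memory_access_time(latency_s, transfer_size_bytes, transfer_rate_bytes_per_s):
    """memory access time = latency + (transfer size / transfer rate)."""
    if transfer_rate_bytes_per_s <= 0:
        raise ValueError("transfer rate must be positive")
    return latency_s + transfer_size_bytes / transfer_rate_bytes_per_s


# Illustrative figures only: 15 ms latency, 1 MB moved at 10 MB/s.
example_time_s = memory_access_time(0.015, 1_000_000, 10_000_000)
```

With these assumed figures the latency term is 0.015 s and the transfer term is 0.1 s, so the transfer dominates; for small transfers the latency term dominates instead.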
Memories - The Economic Perspective II • Memory price-performance is measured in one of two ways: • Cost/byte. The cost of storing a byte of data in that medium. • Cost/access. The cost of reading a block of data from that medium. This is computed by dividing the device cost by the number of accesses per second that the device can perform. The actual units are cost/access/second, but the time unit is implicit in the metric’s name. • These two cost measures reflect the two different views of a memory’s purpose: • it stores data, and • it receives and retrieves data.
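The two price-performance measures can be sketched as follows. The function names and the sample device (cost 2000 dollars, 1 GB capacity, 50 accesses per second) are made-up assumptions chosen only to show the arithmetic.

```python
def cost_per_byte(device_cost, capacity_bytes):
    """Cost of storing one byte: device cost divided by capacity."""
    return device_cost / capacity_bytes


def cost_per_access(device_cost, accesses_per_second):
    """Cost/access (strictly cost/access/second): device cost divided by
    the number of accesses per second the device can perform."""
    return device_cost / accesses_per_second


# Hypothetical device: 2000 dollars, 1 GB capacity, 50 accesses per second.
storage_view = cost_per_byte(2000, 1_000_000_000)   # dollars per byte
access_view = cost_per_access(2000, 50)             # dollars per access/second
```

The same device can look cheap under one measure and expensive under the other, which is why both views are kept.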
Memories - The Economic Perspective III
[Figure: typical large system capacity]
Magnetic Memory • There are two types of magnetic storage media: disk and tape. Disks rotate, passing the data in the cylinder by the electronic read-write heads every few milliseconds. This gives low access latency. The disk arm can move among cylinders in tens of milliseconds. Tapes have approximately the same storage density and transfer rate, but they must move long distances if random access is desired. Consequently, tapes have large random access latencies—on the order of seconds. Disk Access Time = Seek_Time + Rotational_Latency + (Transfer_Size/ Transfer_Rate)
Magnetic Memory Compare the times required for two access patterns to 1 MB stored in 1000 blocks on disk: • Sequential access: Read or write sectors [x, x + 1, ..., x + 999] in ascending order. This requires one seek (10 ms) and half a rotation (5 ms) before the data in the cylinder begins transferring the megabyte at 10 MBps (the transfer takes 100 ms, ignoring one-cylinder seeks). The total access time is 115 ms. • Random access: Read the 1000 sectors [x, ..., x + 999] in random order. In this case, each read requires a seek (10 ms), half a rotation (5 ms), and then the 1 KB transfer (0.1 ms). Since there are 1000 of these events, the total access time is 15.1 seconds.
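The comparison above can be reproduced with a few lines of Python. This is a sketch using the slide's figures (10 ms seek, 5 ms half rotation, 10 MBps transfer rate, 1000 blocks of 1 KB); the constant and function names are assumptions, and decimal units (1 KB = 1000 bytes) are assumed so that the numbers match.

```python
SEEK_MS = 10.0                       # one seek
HALF_ROTATION_MS = 5.0               # average rotational latency
TRANSFER_RATE_BYTES_PER_MS = 10_000  # 10 MBps expressed per millisecond
BLOCK_BYTES = 1_000                  # 1 KB block (decimal units assumed)
BLOCKS = 1_000                       # 1000 blocks = 1 MB


def disk_access_ms(transfer_size_bytes):
    """Seek_Time + Rotational_Latency + Transfer_Size / Transfer_Rate."""
    return (SEEK_MS + HALF_ROTATION_MS
            + transfer_size_bytes / TRANSFER_RATE_BYTES_PER_MS)


def sequential_ms():
    # One seek and one rotational delay, then the whole megabyte streams.
    return disk_access_ms(BLOCKS * BLOCK_BYTES)


def random_ms():
    # Every block pays its own seek and rotational delay.
    return BLOCKS * disk_access_ms(BLOCK_BYTES)
```

Here sequential_ms() comes to about 115 ms and random_ms() to about 15.1 seconds, so the random pattern is slower by a factor of roughly 130 for the same megabyte.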
Memory Hierarchies • The hierarchy uses small, fast, expensive cache memories to cache some data present in larger, slower, cheaper memories. • If hit ratios are good, the overall memory speed approximates the speed of the cache. • At any level of the memory hierarchy, the hit ratio is defined as:
hit ratio = references satisfied by cache / all references to cache
• Suppose a cache memory with access time C has hit ratio H, and suppose that on a miss the secondary memory access time is S. Further, suppose that C = .01 × S. The effective access time of the cache will be as follows:
Effective memory access time = H × C + (1 - H) × S
= H × (.01 × S) + (1 - H) × S
= (1 - .99 × H) × S
≈ (1 - H) × S
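The effective access time can be checked numerically with a short Python sketch. The function name and the sample values (S = 1.0 time unit, C = 0.01 of S, H = 0.9) are assumptions picked for illustration.

```python
def effective_access_time(hit_ratio, cache_time, secondary_time):
    """H * C + (1 - H) * S for a single cache level."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must lie between 0 and 1")
    return hit_ratio * cache_time + (1.0 - hit_ratio) * secondary_time


# Illustrative values: S = 1.0 time unit, C = 0.01 * S, H = 0.9.
S = 1.0
C = 0.01 * S
H = 0.9
exact = effective_access_time(H, C, S)  # equals (1 - 0.99 * H) * S
approx = (1.0 - H) * S                  # the (1 - H) * S approximation
```

For these values the exact result is about 0.109 and the approximation is 0.1. The miss term (1 - H) × S dominates whenever the cache is much faster than the secondary memory, which is why the hit ratio matters so much.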