COT 5611 Operating Systems Design Principles Spring 2014

This lecture explores the differences between cell storage and journal storage, including their implementation challenges and benefits. It also discusses the state transitions, read and write procedures, atomicity logs, and different types of logs. The lecture concludes with recovery procedures for databases in volatile storage.

Presentation Transcript


  1. COT 5611 Operating Systems Design Principles Spring 2014 Dan C. Marinescu Office: HEC 304 Office hours: M-Wd 3:30 – 5:30 PM

  2. Lecture 24 • Reading assignment: • Chapter 9 from the on-line text

  3. Today • Cell storage versus journal storage • All-or-nothing atomicity • Before-or-after atomicity

  4. Cell storage versus journal storage • Cell storage: • random-access memory and disk → a set of named, shared, and rewritable cells • hard to implement all-or-nothing semantics: storing destroys the old data, and storing reveals the new data to later threads immediately • if the result consists of several output values that must all be exposed simultaneously, it is even harder to construct the all-or-nothing action • Journal storage: • associate with every named variable not a single cell but a list of cells in non-volatile storage that represents the history of the variable • add a layer between the application and the cell storage; this layer appends each prospective new value to the history instead of overwriting the old one

  6. The state transitions of a journal storage system

  7. Read and write procedures: caller_id is the action identifier returned by NEW_ACTION

  8. Example: a thread has created a new record with data_id=A, new_value=75, and client_id=1794. The procedure READ_CURRENT_VALUE returns 24 for A, the most recent committed value, and ignores versions that are aborted or pending.
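
The read and write procedures of slides 7 and 8 can be sketched as follows. This is a minimal in-memory illustration, not the code from the slides: the class name `JournalStorage`, the `Version` record, and the use of Python dictionaries in place of non-volatile storage are all assumptions made for the example.

```python
from dataclasses import dataclass

PENDING, COMMITTED, ABORTED = "PENDING", "COMMITTED", "ABORTED"


@dataclass
class Version:
    value: int
    action_id: int  # the action that created this version


class JournalStorage:
    """Hypothetical journal storage: each variable keeps a version history."""

    def __init__(self):
        self.outcome = {}   # action_id -> PENDING / COMMITTED / ABORTED
        self.history = {}   # data_id -> list of Version, oldest first
        self._next_id = 1

    def new_action(self):
        action_id = self._next_id
        self._next_id += 1
        self.outcome[action_id] = PENDING
        return action_id

    def write_new_value(self, data_id, new_value, caller_id):
        # Only a pending action may append a prospective new value.
        if self.outcome.get(caller_id) != PENDING:
            raise RuntimeError("action is not pending")
        self.history.setdefault(data_id, []).append(Version(new_value, caller_id))

    def read_current_value(self, data_id, caller_id):
        # Scan from the newest version back; skip pending and aborted versions.
        for version in reversed(self.history.get(data_id, [])):
            if self.outcome[version.action_id] == COMMITTED:
                return version.value
        raise LookupError("tried to read an uninitialized variable")

    def commit(self, action_id):
        self.outcome[action_id] = COMMITTED   # a single-cell overwrite

    def abort(self, action_id):
        self.outcome[action_id] = ABORTED     # a single-cell overwrite
```

With a committed value 24 for A and a pending value 75, `read_current_value("A", ...)` returns 24 until the writer of 75 commits.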

  9. An all-or-nothing transfer using journal storage • Note: the transaction is not before-or-after • It checks whether there are enough funds in the debit_account • The order of the steps is unconstrained • Problems: updates to the version history and changes to the outcome record must themselves be all-or-nothing, but each can be done by overwriting a single cell.

  10. Atomicity logs and journal storage • Log → an interleaved version history of all variables; the information about each update forms a record appended at the end of the log. • Access to a log is easy: only a pointer to the last record is needed. • Combine the all-or-nothing atomicity of journal storage with the speed of cell storage in two steps: • Log → carry out the change in the journal storage • Install → change the cell storage by overwriting the previous version of each record • The log is the authoritative record of the outcome of an action; the cell storage can be reconstructed from the log. • The log should reside in non-volatile memory.

  11. Types of logs • 1. Atomicity log. Allows a crash recovery procedure to undo all-or-nothing actions that didn’t complete, or to finish all-or-nothing actions that committed but didn’t record all of their effects. • 2. Archive log. Many uses for archive information: watching for failure patterns, reviewing the actions of the system preceding and during a security breach, recovery from application-layer mistakes, fraud control, and compliance with record-keeping requirements. • 3. Performance log. Most mechanical storage media have much higher performance for sequential access than for random access. Since logs are written sequentially, they are ideally suited to such storage media. When combined with a cache that eliminates most disk reads, a performance log can provide a significant speed-up. • 4. Durability log. If the log is stored on a non-volatile medium (e.g., magnetic tape) that fails in ways and at times that are independent of the failures of the cell storage medium (e.g., magnetic disk), then the copies of data in the log are replicas that can be used as backup in case of damage to the copies of the data in cell storage. Any log that uses a non-volatile medium also helps support durability.

  12. Logging configurations

  13. Logging protocols • Reason for the write-ahead-log protocol → logging appends, while installing overwrites, so the log record must be written before the install • Log record • Id of the all-or-nothing action performing the update • The do or redo action → component action that can perform the install if the system crashes before the install • The undo action → component action that can reverse the effects if the system crashes during the install • Four types of log records • BEGIN → NEW_ACTION writes this record and records the action id • CHANGE → written by the pre-commit phase • OUTCOME → written by the COMMIT or the ABORT procedure • END → the final step of an action

  14. Example of log records

  15. Example: all-or-nothing TRANSFER with logging
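
A TRANSFER that follows the write-ahead-log protocol might look like the sketch below. It is an illustration under stated assumptions, not the code from the slide: log records are plain dictionaries, a Python list stands in for the non-volatile log, a dictionary stands in for cell storage, and the rule "commit only if the debit balance stays non-negative" is a simplification chosen for the example.

```python
def begin(log):
    """Append a BEGIN record and return a fresh action id (hypothetical helper)."""
    action_id = sum(1 for rec in log if rec["type"] == "BEGIN") + 1
    log.append({"type": "BEGIN", "id": action_id})
    return action_id


def logged_write(log, cell, action_id, var, new_value):
    # Write-ahead: append the CHANGE record (with redo and undo values)
    # before overwriting the cell.
    log.append({"type": "CHANGE", "id": action_id, "var": var,
                "redo": new_value, "undo": cell[var]})
    cell[var] = new_value  # install


def commit(log, action_id):
    log.append({"type": "OUTCOME", "id": action_id, "status": "COMMITTED"})
    log.append({"type": "END", "id": action_id})


def abort(log, cell, action_id):
    log.append({"type": "OUTCOME", "id": action_id, "status": "ABORTED"})
    # Undo this action's installs, newest first, using blind writes.
    for rec in reversed(log):
        if rec["type"] == "CHANGE" and rec["id"] == action_id:
            cell[rec["var"]] = rec["undo"]
    log.append({"type": "END", "id": action_id})


def transfer(log, cell, debit_account, credit_account, amount):
    action_id = begin(log)
    logged_write(log, cell, action_id, debit_account, cell[debit_account] - amount)
    logged_write(log, cell, action_id, credit_account, cell[credit_account] + amount)
    if cell[debit_account] >= 0:
        commit(log, action_id)
        return True
    abort(log, cell, action_id)
    return False
```

Each action leaves the four record types of slide 13 in the log: one BEGIN, one CHANGE per update, one OUTCOME, and one END.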

  16. Recovery procedures for databases in volatile storage • We need recovery procedures in case of a system crash • The log is not affected because it resides in non-volatile memory • Abandon the in-core database and all all-or-nothing actions in progress • Two steps: • Scan the log backwards and identify all actions with an OUTCOME record showing that the action COMMITTED; call them winners • Scan the log forwards and perform the REDO action of every CHANGE record that belongs to a winner, reinstalling all committed values in cell storage • The recovery procedure should be idempotent → if the system crashes during recovery we should be able to start again • The ABORT procedure should also be idempotent
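
The two scans can be sketched as below. The record layout (dictionaries with `type`, `id`, `var`, `redo`, and `status` fields) and the function name `recover_in_memory` are assumptions made for this example; the slide does not fix a record format.

```python
def recover_in_memory(log):
    """Rebuild a volatile database from the log; only winners are reinstalled."""
    # Backward scan: collect the ids of actions whose OUTCOME is COMMITTED.
    winners = set()
    for rec in reversed(log):
        if rec["type"] == "OUTCOME" and rec["status"] == "COMMITTED":
            winners.add(rec["id"])

    # Forward scan: redo every CHANGE record of a winner, in log order.
    cell = {}  # the old in-core database is abandoned, so start empty
    for rec in log:
        if rec["type"] == "CHANGE" and rec["id"] in winners:
            cell[rec["var"]] = rec["redo"]
    return cell


sample_log = [
    {"type": "BEGIN", "id": 1},
    {"type": "CHANGE", "id": 1, "var": "A", "redo": 70},
    {"type": "CHANGE", "id": 1, "var": "B", "redo": 80},
    {"type": "OUTCOME", "id": 1, "status": "COMMITTED"},
    {"type": "BEGIN", "id": 2},
    {"type": "CHANGE", "id": 2, "var": "A", "redo": 0},      # still pending at crash
    {"type": "BEGIN", "id": 3},
    {"type": "CHANGE", "id": 3, "var": "B", "redo": 999},
    {"type": "OUTCOME", "id": 3, "status": "ABORTED"},
]
```

Because the procedure only reads the log and performs blind writes into a fresh dictionary, running it again after a crash during recovery yields the same database.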

  18. Recovery for databases on non-volatile storage (disk) • Large databases cannot be kept in main memory. • Access to disk (reads and writes) is slower • Installs survive a system crash, so we face new problems: • A. The recovery procedure must reverse the effects of pending all-or-nothing actions (AONA) that have installed changes. • B. The entire database must be reinstalled, which is time-consuming for large databases. • We assume a write-through cache to avoid complications, since a multi-level memory manager may defer writing. • To address A: • Backward log scan → look for losers (rather than winners); a loser is an action that was in progress at the time of the crash and has no END record. • UNDO the CHANGE records of each loser, rolling back all INSTALLs performed by losers as if their actions had never occurred • Perform a forward log scan and REDO the CHANGE records of all COMMITTED actions • Add an END record for every loser • Blind write → overwrites a datum without reference to its previous value. UNDO and REDO actions must be blind writes so that they are idempotent, which is necessary to make recovery idempotent.
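
The steps for problem A can be sketched as follows. The dictionary-based record layout and the name `recover_on_disk` are assumptions for illustration, and the sketch assumes actions ran one at a time, so that redoing CHANGE records in log order reproduces the final committed values.

```python
def recover_on_disk(log, cell):
    """Undo losers, redo committed actions, then mark losers as ended."""
    ended, losers = set(), set()

    # Backward scan: any action without an END record is a loser.
    for rec in reversed(log):
        if rec["type"] == "END":
            ended.add(rec["id"])
        elif rec["id"] not in ended:
            losers.add(rec["id"])
            if rec["type"] == "CHANGE":
                cell[rec["var"]] = rec["undo"]   # blind write, idempotent

    # Forward scan: redo the CHANGE records of every committed action.
    committed = {rec["id"] for rec in log
                 if rec["type"] == "OUTCOME" and rec["status"] == "COMMITTED"}
    for rec in log:
        if rec["type"] == "CHANGE" and rec["id"] in committed:
            cell[rec["var"]] = rec["redo"]       # blind write, idempotent

    # Record that every loser has now been dealt with.
    for action_id in sorted(losers):
        log.append({"type": "END", "id": action_id})
    return cell
```

A committed action with no END record is undone in the backward scan and redone in the forward scan, so it ends up fully installed; a pending action is only undone.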

  20. Undo logging or rollback recovery • To address B → undo logging or rollback recovery • Additional requirement to avoid having to REDO any INSTALLs → perform all of an action's INSTALLs before logging its OUTCOME record; all INSTALLs of committed actions are then already in non-volatile storage and need not be redone • UNDO only the INSTALLs of losers and skip the forward scan

  21. Summary of atomicity logging • Log to journal storage before installing in cell storage • For non-volatile storage: • Only UNDO the INSTALLs of incomplete actions if every INSTALL is carried out before logging the OUTCOME record • Only REDO the INSTALLs of committed actions that have not ended if every INSTALL is carried out after logging the OUTCOME record • Otherwise do both UNDO and REDO.

  22. Before-or-after atomicity • Simple serialization → assume each transaction has a unique t_id • t_id is taken from a set of consecutive integers; at initialization time the system creates a transaction with t_id=0 • The rule → the transaction with t_id=n must wait, before reading or writing any data, until the transaction with t_id=n-1 has either committed or aborted • Produces correct results • It is too conservative: it does not allow any parallelism
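
The simple serialization rule can be sketched with a condition variable. The class `SimpleSerializer`, its method names, and the helper `run_transactions` are hypothetical names for this example; the sketch treats transaction 0 as already finished at initialization, as the slide describes.

```python
import threading


class SimpleSerializer:
    """Transaction n may proceed only after transaction n-1 has finished."""

    def __init__(self):
        self._cond = threading.Condition()
        self._finished = {0}   # t_id 0 is created, and completed, at initialization
        self._next_id = 1

    def begin_transaction(self):
        with self._cond:
            t_id = self._next_id
            self._next_id += 1
            # Block until the predecessor has committed or aborted.
            self._cond.wait_for(lambda: (t_id - 1) in self._finished)
            return t_id

    def end_transaction(self, t_id):
        # Called by both COMMIT and ABORT in this sketch.
        with self._cond:
            self._finished.add(t_id)
            self._cond.notify_all()


def run_transactions(count):
    """Run `count` trivial transactions in threads; return their execution order."""
    serializer = SimpleSerializer()
    order = []

    def body():
        t_id = serializer.begin_transaction()
        order.append(t_id)            # the transaction's reads and writes go here
        serializer.end_transaction(t_id)

    threads = [threading.Thread(target=body) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return order
```

However the threads are scheduled, the bodies run strictly in t_id order, which is why the scheme is correct but allows no parallelism.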

  24. A transaction needs to know the version history of every variable it affects; it does not need to wait for the previous transaction to complete if the two touch disjoint data, e.g., t4 and t3.

  25. The mark-point discipline • A transaction • identifies the data it intends to modify and marks it; • creates a pending version of each such variable • mark point → the instant when a transaction has finished marking all the variables it intends to change • announces that it has passed its mark point → sets a flag in the OUTCOME record • A transaction must wait until all preceding transactions have reached their mark points before it can begin reading variables marked by preceding transactions

  26. Skip versions of a variable created by transactions with t_id ≥ id_current_transaction. Wait for pending versions of the variable created by transactions with t_id < id_current_transaction.
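
The skip-or-wait rule can be sketched as a pure function. This is a single-threaded illustration under assumptions: the name `read_current_value_mark_point` is hypothetical, the version history is a list of `(t_id, value)` pairs, and instead of blocking the function returns a `("WAIT", t_id)` result where a real system would wait for that transaction to commit or abort.

```python
def read_current_value_mark_point(history, outcome, reader_id):
    """history: list of (t_id, value); outcome: t_id -> state string."""
    # Examine versions from the highest t_id down.
    for t_id, value in sorted(history, key=lambda v: v[0], reverse=True):
        if t_id >= reader_id:
            continue                    # created by this or a later transaction: skip
        state = outcome[t_id]
        if state == "PENDING":
            return ("WAIT", t_id)       # a real implementation would block here
        if state == "COMMITTED":
            return ("VALUE", value)
        # ABORTED versions are skipped; keep searching older versions.
    raise LookupError("tried to read an uninitialized variable")
```

For a reader with id 5, a version written by transaction 6 is ignored, a pending version from transaction 4 forces a wait, and once transaction 4 aborts the reader falls back to the newest older committed version.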

  27. Marking becomes a sequence of calls to NEW_VERSION and WRITE_VALUE

  28. The system should be initialized with a call to NEW_OUTCOME_RECORD to ensure that there is a previous transaction as required by BEGIN_TRANSACTION

  29. More on mark-point discipline • The result is guaranteed to be the same as if the transactions were executed sequentially. • Potential interaction of before-or-after and all-or-nothing atomicity → if pending versions survive a system crash, then at restart all PENDING transactions must be identified and marked as ABORTED • The discipline never creates deadlock → there is no circular wait, since a transaction waits only for preceding ones. • If a transaction waits to announce its mark point until it commits or aborts, this becomes simple serialization. • Two possible errors: • A transaction calls NEW_VERSION after announcing its mark point • WRITE_VALUE attempts to write a value for a variable for which a new version has never been created • Example: the TRANSFER transaction using the mark-point discipline to achieve before-or-after atomicity

  31. Pessimistic versus optimistic concurrency control • Pessimistic → simple serialization and mark point • Presume that interference between concurrent transactions is likely and actively prevent it • May also prevent concurrency that would not have been harmful to correctness • Optimistic → read-capture strategy • Assume that interference between concurrent transactions is unlikely and allow them to proceed; if interference occurs, recover, e.g., abort and restart • Increases concurrency

  32. We could allow transactions to write values in any order and at any time, but then a transaction may have to abort, abandon its serialization position, obtain a later one, and rerun from the beginning.

  33. Read-capture strategy • Extend the validity of a version through intervening transactions up to the reader’s own serialization position

  34. High water mark • High water mark (HWM) → the serial number of the highest-numbered transaction that has ever read a value from the object’s version history. • The HWM serves as a warning to transactions that have earlier serial numbers but are late in creating a new version → a transaction later in the serial ordering has already read a version of this object from earlier in the ordering, so it is too late to create a new version; the late transaction must abort and be reissued later.
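
The high water mark check can be sketched as below. The class `VersionedObject` and the exception `TransactionAborted` are hypothetical names, and the sketch simplifies by storing only committed versions, so the wait on pending versions is left out.

```python
class TransactionAborted(Exception):
    """Raised when a late writer must abort and be reissued with a later id."""


class VersionedObject:
    def __init__(self, initial_value):
        self.versions = [(0, initial_value)]   # (t_id, value), ascending t_id
        self.high_water_mark = 0               # highest t_id that has read this object

    def read(self, caller_id):
        # Return the newest version created before the caller's position,
        # and capture the read by raising the high water mark.
        for t_id, value in reversed(self.versions):
            if t_id < caller_id:
                self.high_water_mark = max(self.high_water_mark, caller_id)
                return value
        raise LookupError("tried to read an uninitialized variable")

    def new_version(self, caller_id, value):
        latest_writer = self.versions[-1][0]
        # Too late if a later transaction has already read, or already written.
        if caller_id < self.high_water_mark or caller_id < latest_writer:
            raise TransactionAborted(caller_id)
        self.versions.append((caller_id, value))
```

If transaction 6 reads the object first, a later `new_version` call by transaction 4 raises `TransactionAborted`, while the same write reissued under id 7 succeeds.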

  35. Example: Transaction 4 is late in creating a new version for object A. By the time it tries to do the insertion, transaction 6 has already read the old value (+10), extending the validity of that value to the beginning of transaction 6. The solution is to abort transaction 4 and reincarnate it as transaction 7.

  37. Read-capture version of READ_CURRENT_VALUE

  38. Read-capture version of NEW_VERSION and WRITE_VALUE

  39. The use of before-or-after atomicity (BOAA) in register renaming • Intel Pentium has only 8 architectural registers but 128 physical registers. • Register renaming is based on a circular reorder buffer. • When an instruction is issued, it is assigned the next sequential slot in the reorder buffer. • The slot maps the architectural register named for the result to the physical register that will actually hold the value.

  41. Hierarchical composition of transactions
