Data Concurrency Control And Data Recovery

Data Concurrency Control And Data Recovery Group: -Lâm Hoài Minh Triết -Lê Trọng An Sinh

Data Concurrency Control

Content: • Why data concurrency control !? • Deadlock and starvation. • Data concurrency control techniques: • (Multiversion) Two-phase locking • (Multiversion) Timestamp based • Validation • Multiple Granularity locking

Why data concurrency control • Databaseshared data • Update datainconsistency • Analoguous to Synchronization in OS • Mutual exclusion • Deadlock & starvation • Validation • Atomic transactions

Deadlock and starvation • The same as the concept in OS • Solutions: • Deadlock prevention (ex:two phase locking techniques) • Deadlock detection and resolution (using lock tree, and lock table to detect cycles wait-for-graph) • Starvation (will not be discussed here)

Content: • Why data concurrency control !? • Deadlock and starvation. • Data concurrency control techniques: • Concepts • (Multiversion) Two-phase locking • (Multiversion) Timestamp based • Validation • Multiple Granularity locking

Data concurrency control • Basic transaction: • Read • Write • Basic operation: • Lock • Unlock • atomic operation • Basic modes: • Read (shared) mode • Write (exclusive) mode on a data item, many transactions can be in read mode, while only one transaction can be in write mode

Data concurrency control • Lock:

Data concurrency control • Unlock:

Data concurrency control • Conflict matrix:

Data concurrency control • Locking table: • may be implemented using linked • list. • lock tree • Lock manager

Data concurrency control • Well-formed transaction: • Lock data item before • reading/writing • Not try to unlock a free data item • Not lock an already locked data • item

Data concurrency control • Lock upgrade: • Read lock-->write lock • no other transactions is reading • the data item • Lock downgrade: • Write lockread lock • no condition needed to be checked

Two-phase locking • Basic two-phase locking • Lock data incrementally (just lock • what is being in need)

Two-phase locking • Basic two-phase locking

Two-phase locking • Strict basic two-phase locking • Lock data incrementally (just lock • what is being in need) • Unlock the data after terminating. • may cause deadlock • Conservative: (deadlock avoding) • Lock all data in need before starting.

Timestamp-based technique • A monitornically increasing variable • the age of the transaction (timestamp) • (the older  the more recent) • Use: • To serialize transactions

Timestamp-based technique

Timestamp-based technique Strict Order: “Strict” only one read-transaction at a time

Timestamp-based technique T is a write-transaction Thomas’s rule:

Content: • Why data concurrency control !? • Deadlock and starvation. • Data concurrency control techniques: • Concepts • (Multiversion) Two-phase locking • (Multiversion)Timestamp based • Validation • Multiple Granularity locking

Multiversion techniques • Modify copies only  update to the • original version when finish • modifying. (Multiverion-two-phase) • (new version created when in need to • modify data) • Using timestamp to decide which • version of the data will be • allocated for a reading transaction. • (Mutiversion-timestamp based) • (the read transaction will read the • version which have timestamp less than • or equal to its timestamp. )

Multiversion techniques • Side effects: • More memory needed. • Garbage collector.

Validation (optimistic) scheme • Serialization is checked before a write • transaction. • Three phase: • Read • Write divided into 2 phase: validation • and write. • (write to local copies only)

Multiple Granularity Locking • What is granularity: • If we consider granularity as an index I, • Then, the more components the data has, • the higher the value of I . • granularity may be understood as the • level of discreteness. • Granularity hierachy

Multiple Granularity Locking • What is granularity:

Multiple Granularity Locking • Granularity hierachy

Multiple Granularity Locking • Three more additional locking modes are • Used: • Intention-shared(IS): child request a • shared lock on parent. • Intention-exclusive(IX):child request an • exclusive lock on parent. • Shared-Intention-exclusive(SIX):parent • is currently locked in shared mode, but • Child requests an exclusive lock • on parent.

Multiple Granularity Locking • Lock-compatibility matrix:

Multiple Granularity Locking • Lock-rule description:

Database recovery

Content: • Why data recovery? • Some techniques. • Database recovey scheme. • ARIES algorithm

Why data recovery • To bring the database into the • State before the failure. • To preseve transaction properties.

Why data recovery Types of Failure: • Transaction failure. • System failure. • Media failure.

Some techniques Transaction log:

Some techniques Roll back & Roll forward: Roll Back (Undo): Restore BFIM on to disk (Remove all AFIMs). Roll Forward (Redo): Restore AFIM on to disk.

Some techniques T1 T2 T3 read_item (A) read_item (B) read_item (C) read_item (D) write_item (B) write_item (B) write_item (D) read_item (D) read_item (A) write_item (A) write_item (A)

Some techniques Data caching: Data items to be modified are first stored into database cache by the Cache Manager (CM) and after modification they are flushed (written) to the disk.

Some techniques Data update • Immediate Update • Deferred Update • Shadow update • In-place update

Some techniques Write-Ahead Logging (WAL): When in-place update (immediate or deferred) is used For Undo: BFIM is written to a log. For Redo: AFIM is written to a log.

Some techniques Checkpointing: • Suspend transactions temporarily. • Force write modified buffer data to disk. • Write a [checkpoint] record to the log, save the log to disk. • Resume normal transaction execution.

Content: • Why database recovery • Some techniques. • Recovery scheme • ARIES algorithm.

Recovery scheme • Deferred update • Immediate update • Shadow paging

Recovery scheme Deferred update (No Undo/ Redo): • After reboot, the log is used to redo all the transactions affected by the failure. • No undo needed.

Recovery scheme Deferred update: With check point: transactions which were recorded in the log after the last checkpoint were redone

Recovery scheme Deferred update : Two table are required: Active table: all active transactions Commit table: transactions to be committed Recovery: redo transactions in commit table only.

Data Concurrency Control And Data Recovery