Signature Based Concurrency Control Thomas Schwarz, S.J. JoAnne Holliday Santa Clara University Santa Clara, CA 95053 tjschwarz,jholliday@scu.edu
Overview • Transactional concurrency control in a distributed system: • Signatures are a better version of version numbers. • Signatures are calculated from the records.
Basic Idea • A signature is a short string of f bits calculated from a record. • We assume here an LH* file scenario. • The file is a dictionary data structure associating keys with a non-key field. • [Figure: record layout labeled key c, non-key field, signature]
Basic Idea • When a transaction reads a record it records the signature of the record. • When the transaction is ready to commit, it checks whether any signatures of records it read have changed. • If this is the case, the transaction restarts. • Otherwise, it commits.
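The read-then-verify cycle above can be sketched in a few lines. This is a minimal single-process illustration, not the authors' implementation: the `Transaction` class, the in-memory `store` dictionary, and the use of SHA-1 as the signature function are all assumptions made for the example.

```python
import hashlib


def signature(record: bytes) -> bytes:
    """Hypothetical signature function: the SHA-1 digest of the record."""
    return hashlib.sha1(record).digest()


class Transaction:
    """Remembers the signature of every record it reads (illustrative only)."""

    def __init__(self, store):
        self.store = store
        self.seen = {}  # key -> signature recorded at read time

    def read(self, key):
        value = self.store[key]
        self.seen[key] = signature(value)
        return value

    def try_commit(self):
        """True means commit; False means a read record changed, so restart."""
        return all(signature(self.store[k]) == s for k, s in self.seen.items())


store = {"a": b"alice:100", "b": b"bob:50"}
t = Transaction(store)
t.read("a")
unchanged = t.try_commit()   # nothing was modified since the read
store["a"] = b"alice:70"     # a concurrent update to a record that was read
changed = t.try_commit()     # the recorded signature no longer matches
```

Here `unchanged` is true and `changed` is false, which corresponds to the commit and restart branches on the slide.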
Basic Idea • Danger of false negatives: • Two different records can have the same signature, so a change can go undetected. • Control the probability of false negatives through the length of the signature. • MD5 (16 B) and SHA-1 (20 B) signatures are accepted in computer forensics.
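To see how the signature length controls the false-negative probability, the following sketch assumes that the signatures of two different records behave like independent, uniformly distributed f-bit strings. Under that assumption (which the slides do not state), a changed record keeps its old signature with probability 2^-f. The function name and the dictionary are hypothetical; the byte lengths are the ones given on the slide.

```python
def false_negative_probability(f_bits: int) -> float:
    """Chance that a changed record keeps its signature (uniformity assumed)."""
    return 2.0 ** -f_bits


# Signature lengths in bytes, as listed on the slide.
SIGNATURE_BYTES = {"MD5": 16, "SHA1": 20}

for name, size in SIGNATURE_BYTES.items():
    bits = 8 * size
    print(f"{name}: f = {bits} bits, p = {false_negative_probability(bits):.3e}")
```

Each additional signature byte lowers the probability by a factor of 256, so the length can be chosen to match the risk an application tolerates.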
Simple Signature Scheme • Each transaction i contains atomic operations: • Ri(x) – Read record x • Wi(x) – Write record x • Vi(x) – Verify the signature of record x • Ai – Abort • Ci – Commit
Simple Signature Scheme • Rules for transaction i • All reads precede all verifies. • All verifies precede all writes. • If another transaction j writes to x between a read and a verify, then transaction i aborts. • If all verifies are successful, then the transaction does all its writes and commits.
Simple Signature Scheme • Dirty Reads: • Wj(x) Ri(x) Aj Ci or Wj(x) Ri(x) Ci Aj • Impossible, because a transaction writes only after all its verifies succeed, so a transaction that writes also commits.
Simple Signature Scheme • Fuzzy Reads: • Ri(x) Wj(x) Cj Ri(x) • Possible only if we were to allow multiple reads of the same item x: • R1(x) W2(x) C2 R1(x) V1(x) C1.
Simple Signature Scheme • If we do all the reads in a single block: • The scheme arguably has the ANSI REPEATABLE READ property. • It even satisfies ANSI ANOMALY SERIALIZABLE. • But it is certainly not serializable: • R1(x) R2(x) R1(y) R2(y) V1(x) V2(x) V1(y) V2(y) W1(x) W2(x) W2(y) W1(y) C1 C2
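The counterexample schedule can be replayed step by step to see why it is not serializable. The sketch below is an illustration under assumptions: the helper names `R`, `V`, `W`, the in-memory `db`, and SHA-1 as the signature are all hypothetical, and commits are left implicit.

```python
import hashlib


def sig(value: str) -> str:
    """Hypothetical signature: SHA-1 hex digest of the value."""
    return hashlib.sha1(value.encode()).hexdigest()


db = {"x": "x0", "y": "y0"}
seen = {1: {}, 2: {}}   # per-transaction signatures recorded at read time
verified = []           # outcome of every verify operation, in schedule order


def R(t, k):
    seen[t][k] = sig(db[k])


def V(t, k):
    verified.append(seen[t][k] == sig(db[k]))


def W(t, k):
    db[k] = f"{k} written by T{t}"


# The schedule from the slide, without the two commits.
schedule = [(R, 1, "x"), (R, 2, "x"), (R, 1, "y"), (R, 2, "y"),
            (V, 1, "x"), (V, 2, "x"), (V, 1, "y"), (V, 2, "y"),
            (W, 1, "x"), (W, 2, "x"), (W, 2, "y"), (W, 1, "y")]
for op, t, k in schedule:
    op(t, k)

all_verified = all(verified)
```

Every verify succeeds, so both transactions commit, yet the final state holds x from T2 and y from T1. A serial execution would leave both items written by the same (last) transaction.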
Extended Signature Scheme • Add: The verify-write phase is atomic. • Then: The scheme is (conflict) serializable. • Proof (Idea): • Consider all reads to be “pre-reads”. • Only the verify operations are reads in the sense of concurrency control. • Then the result follows by definition.
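A small sketch shows the effect of making the verify-write phase atomic. Two transactions read x and y concurrently, but each verify-write phase runs as one indivisible step. Atomicity is modeled here simply by calling the function without interleaving; the function names and the in-memory `db` are assumptions for the example.

```python
import hashlib


def sig(value: str) -> str:
    """Hypothetical signature: SHA-1 hex digest of the value."""
    return hashlib.sha1(value.encode()).hexdigest()


db = {"x": "x0", "y": "y0"}


def read_phase(keys):
    """Pre-read the records and remember their signatures."""
    return {k: sig(db[k]) for k in keys}


def verify_write_phase(t, seen, write_keys):
    """Assumed to run as one atomic step: verify everything, then write."""
    if any(sig(db[k]) != s for k, s in seen.items()):
        return "abort"
    for k in write_keys:
        db[k] = f"{k} written by T{t}"
    return "commit"


seen1 = read_phase(["x", "y"])
seen2 = read_phase(["x", "y"])
outcome1 = verify_write_phase(1, seen1, ["x", "y"])
outcome2 = verify_write_phase(2, seen2, ["x", "y"])
```

T1 commits, and T2 then finds that the signatures of x and y have changed and aborts, so the interleaved writes that broke serializability in the simple scheme cannot occur.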
Implementation • Lock based implementation: • Read-Calculate Phase • No locking at all. However, a transaction that reads an exclusively locked record might want to reread that record because that record might change. • Verify-Write Phase • Read lock on all the signatures of records read. • Write lock on all the signatures of records to be modified. • Verify signatures and decide on commit / abort. • Release all locks.
Implementation • Lock based implementation: • Conservative • Strict • Two-Phase Locking • Locks are short-lived: • One round of messages to acquire locks and signatures. • One round of messages for commit / abort and release messages.
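The lock-based verify-write phase can be sketched with per-record locks. This is a single-process approximation under several assumptions: `threading.Lock` is exclusive, so read and write locks are not distinguished; locks are acquired in sorted key order to avoid deadlock; and the `Store` class and `verify_write` function are hypothetical names. All locks are taken before any verification (conservative) and released together at the end (strict).

```python
import hashlib
import threading
from contextlib import ExitStack


def sig(value: bytes) -> bytes:
    """Hypothetical signature: SHA-1 digest of the record."""
    return hashlib.sha1(value).digest()


class Store:
    def __init__(self, data):
        self.data = dict(data)
        self.locks = {k: threading.Lock() for k in data}


def verify_write(store, seen, writes):
    """Lock, verify the read signatures, write on success, release all locks."""
    keys = sorted(set(seen) | set(writes))
    with ExitStack() as stack:
        for k in keys:                      # acquire every lock up front
            stack.enter_context(store.locks[k])
        if any(sig(store.data[k]) != s for k, s in seen.items()):
            return False                    # abort; ExitStack releases the locks
        store.data.update(writes)
        return True                         # commit; ExitStack releases the locks


store = Store({"x": b"1", "y": b"2"})
seen = {"x": sig(store.data["x"])}                 # read phase, no locks held
first = verify_write(store, seen, {"y": b"3"})     # x unchanged: commit
second = verify_write(store, seen, {"x": b"9"})    # x still unchanged: commit
third = verify_write(store, seen, {"y": b"4"})     # x has changed: abort
```

Because the read-calculate phase holds no locks, each lock is held only for the duration of one `verify_write` call, which mirrors the short-lived locks described above.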
Implementation • No-locking scheme • Transactions appear to the servers to be very short. • The chance for conflict is limited.
Signature Implementation • We do not use the record signature directly, but a region signature. • A region is a contiguous set of keys that all hash to the same bucket. • Typically, a region should have between 0.5 and 5 records on average.
Signature Implementation • Let c_i be the keys in a region. • Then set the region signature to be [formula missing in the source]. • Arithmetic is done in a Galois field (GF). • g hashes keys into the GF. • The record signature of a non-existing record is zero.
Signature Implementation • The verify operations read region signatures. • Addressed by the key-space they cover. • Locking is done on regions. • Store region signatures. • Large regions have little storage overhead, small ones have large storage overhead.
Signature Implementation • Region signatures prevent phantom records.
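The following sketch illustrates why a signature over a whole region detects phantoms. It does not reproduce the authors' region-signature formula, which is not preserved in this text. As a stand-in, it XORs truncated SHA-1 values of each (key, record) pair in the region; XOR is addition in GF(2^f), and an empty region yields zero, matching the convention that a non-existing record has signature zero. Regions are modeled as plain key ranges, and all names are hypothetical.

```python
import hashlib


def record_signature(key: int, value: bytes) -> int:
    """Stand-in 64-bit signature of one record (not the authors' formula)."""
    digest = hashlib.sha1(key.to_bytes(8, "big") + value).digest()
    return int.from_bytes(digest[:8], "big")


def region_signature(bucket: dict, lo: int, hi: int) -> int:
    """XOR of the record signatures for all keys in [lo, hi); 0 if empty."""
    s = 0
    for key, value in bucket.items():
        if lo <= key < hi:
            s ^= record_signature(key, value)
    return s


bucket = {10: b"ten", 12: b"twelve"}
before = region_signature(bucket, 8, 16)
bucket[13] = b"phantom"                      # a record inserted by another transaction
after = region_signature(bucket, 8, 16)
empty = region_signature(bucket, 100, 108)   # region without any records
```

A transaction that verified the region [8, 16) would see `before` differ from `after` and abort, even though it never read key 13 itself.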
Implementation • No-Locking Scheme • Assumes loosely synchronized clocks, i.e., clocks that are accurate to within a small multiple of the average message delay. • A transaction acquires a time-stamp at the lowest numbered SDDS bucket it visits. • The transaction sends verify / write / vote requests to all servers it visited. • Each server votes, in the usual way, on whether the transaction should commit. • If every server returns a yes vote to the transaction manager, then the transaction commits. • The transaction manager sends out the result of the vote.
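The voting round can be sketched as follows. This is a simplified single-process model: time-stamps and clock synchronization are omitted, each server votes yes exactly when the signatures the transaction recorded still match, and the `Server` class and `run_commit` function are hypothetical names.

```python
import hashlib


def sig(value: bytes) -> bytes:
    """Hypothetical signature: SHA-1 digest of the record."""
    return hashlib.sha1(value).digest()


class Server:
    def __init__(self, records):
        self.records = dict(records)

    def vote(self, seen):
        """Yes vote if every signature recorded by the transaction still matches."""
        return all(sig(self.records[k]) == s for k, s in seen.items())

    def apply(self, writes):
        self.records.update(writes)


def run_commit(servers, seen_by_server, writes_by_server):
    """Transaction manager: collect all votes, commit only if all are yes."""
    votes = [srv.vote(seen_by_server.get(name, {})) for name, srv in servers.items()]
    decision = all(votes)
    if decision:
        for name, writes in writes_by_server.items():
            servers[name].apply(writes)
    return decision


servers = {"s0": Server({"a": b"1"}), "s1": Server({"b": b"2"})}
seen = {"s0": {"a": sig(b"1")}, "s1": {"b": sig(b"2")}}
committed = run_commit(servers, seen, {"s1": {"b": b"20"}})
retried = run_commit(servers, seen, {"s0": {"a": b"10"}})   # stale signature for b
```

The first call commits because both servers vote yes. The second call reuses the old signature of b, so server s1 votes no and nothing is written.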
Discussion • The signature scheme is interesting if transactions have large calculation times and updates are rare. • The signature scheme should be extendible to replicated databases. • The region size can be adapted to the scale of the file, so that a region always has about the same number of records. • E.g., whenever the LH* split pointer returns to zero, split the regions in half.
Discussion • Future Work: • Performance evaluation