Contest of XML Lock Protocols
Michael P. Haustein, SAP AG
Theo Härder, Univ. of Kaiserslautern
Konstantin Luttenberger, Fraunhofer Institute IESE
haerder@informatik.uni-kl.de
32nd Int. Conf. on Very Large Data Bases (VLDB 2006), 12-15 September 2006, Seoul, Korea
Outline
• Key ideas of 2 groups of competing XML lock protocols
  • Doc2PL and followers
    • Node2PL, NO2PL, OO2PL
  • multi-granularity locking (MGL* group)
    • RIX, RIX+, IRIX, IRIX+, URIX
• Our own protocols: taDOM group
  • taDOM2: base protocol for DOM2
  • lock conversion
  • optimization to taDOM2+
  • not considered: taDOM3, taDOM3+
• Introduction to XTC
  • identifying nodes
  • meta-synchronization
• Performance evaluation
  • taMIX framework
  • transaction types of the Banking benchmark
  • measurements and comparison
• Conclusions and outlook
DOM Storage Model
XML document:
  <?xml version="1.0"?>
  <bib>
    <book year="2004" id="book1">
      <title>The Title</title>
      <author>
        <last>Lastname</last>
        <first>Firstname</first>
      </author>
      <price>49.99</price>
    </book>
  </bib>
[Figure: DOM tree of the document. Element nodes: bib, book, title, author, last, first, price. Attribute nodes: id (book1), year (2004). Text nodes (T): The Title, Lastname, Firstname, 49.99.]
DOM API (> 20 ops)
• Navigation: getFirst/LastChild, getNextSibling, getPreviousSibling, getAttributes/Value
• Modification: appendChild, insertBefore, removeChild
• Query: getElementById, getChildNodes
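The navigation and modification operations listed above can be tried on the sample document. The following is a minimal sketch using Python's standard `xml.dom.minidom`, not the XTC server's DOM interface; the inserted element names `note` and `subtitle` are made up for illustration, and the document is written without whitespace so that no extra text nodes appear.

```python
from xml.dom.minidom import parseString

XML = ('<bib><book year="2004" id="book1"><title>The Title</title>'
       '<author><last>Lastname</last><first>Firstname</first></author>'
       '<price>49.99</price></book></bib>')

doc = parseString(XML)

# Navigation: first/last child, next/previous sibling, attribute value
book = doc.documentElement.firstChild      # <book>
title = book.firstChild                    # <title>
author = title.nextSibling                 # <author>
price = book.lastChild                     # <price>
year = book.getAttribute("year")
title_text = title.firstChild.data         # text node below <title>

# Modification: appendChild, insertBefore, removeChild
note = doc.createElement("note")           # hypothetical element
book.appendChild(note)
book.insertBefore(doc.createElement("subtitle"), author)
book.removeChild(note)

# Query: child list of <book> after the modifications
children = [node.tagName for node in book.childNodes]
print(children)
```

Every one of these calls touches a node that a concurrent transaction might be reading or changing, which is why each DOM operation has to be mapped to lock requests.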
Doc2PL and its Followers
• Basic assumption: traversal from root to context node
• Sample of operations (C = context node)
  • nthP retrieves the nth child of C
  • nthM retrieves the nth child (backward)
  • insA inserts a new node after C
  • insB inserts a new node before C
  • del deletes a given node
• Separation of traversal and modification of
  • document structure (T/M)
  • content (S/X)
  • direct jumps (IDR/IDX)
  • no intention locks!
• 0. Doc2PL only locks roots
• 1. Node2PL acquires locks for parent nodes
  • lock modes: content + structure
  • entire child axis of P affected
• Structural navigation to locate an object often implies a kind of document scan
• Repeatable read requires T locks on all nodes
[Figure: root, parent P, and context node C, annotated with T and M locks]
Doc2PL and its Followers (2)
• 2. NO2PL acquires locks for all nodes whose (conceptual) pointers are traversed or modified
  • example at C1: insB (C0)
• 3. OO2PL locks (conceptual) pointers for every node
  • A/Z: first/last child
  • R/L: next/previous sibling
  • example: del C2
• Only context node and selected child nodes affected
• Increasing degree of concurrency: Doc2PL -> Node2PL -> NO2PL -> OO2PL
[Figures: root, parent P, and children C0, C1, ..., Cn with T/M locks (NO2PL example) and with pointer locks such as TA, TZ, MR, ML (OO2PL example)]
Compatibility matrix of the pointer locks (+ compatible, - incompatible):
      TL  TR  TA  TZ  ML  MR  MA  MZ
  TL  +   +   +   +   -   +   +   +
  TR  +   +   +   +   +   -   +   +
  TA  +   +   +   +   +   +   -   +
  TZ  +   +   +   +   +   +   +   -
  ML  -   +   +   +   -   +   +   +
  MR  +   -   +   +   +   -   +   +
  MA  +   +   -   +   +   +   -   +
  MZ  +   +   +   -   +   +   +   -
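The pointer-lock compatibility matrix on this slide follows a simple pattern: two locks conflict only if they refer to the same conceptual pointer (L, R, A, or Z) and at least one of them is a modification lock (M). The sketch below encodes that rule and regenerates the rows; the rule is read off the matrix rather than stated by the authors, and the function names are illustrative.

```python
POINTERS = "LRAZ"   # previous sibling, next sibling, first child, last child
MODES = [kind + p for kind in "TM" for p in POINTERS]   # TL TR TA TZ ML MR MA MZ


def compatible(held, requested):
    """True if a requested pointer lock can coexist with a held one."""
    same_pointer = held[1] == requested[1]
    any_modify = held[0] == "M" or requested[0] == "M"
    return not (same_pointer and any_modify)


def matrix_row(mode):
    """One row of the matrix as a string of '+' and '-' in MODES order."""
    return "".join("+" if compatible(mode, other) else "-" for other in MODES)


for mode in MODES:
    print(mode, " ".join(matrix_row(mode)))
```

Because a lock on one pointer never conflicts with a lock on a different pointer of the same node, OO2PL permits more interleavings than the node-level T/M locks of Node2PL and NO2PL.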
Making Full-Fledged Protocols for the *-2PL Group
• Idea of T/M lock modes
  • requires non-interrupted path traversal
  • prohibits indexed document access: how to protect the ancestor path in case of direct jumps?
• What about IDREF(S) links?
  • locks for direct jumps (IDS/IDX) needed!
  • a single lock on the jump target is only sufficient for read ops!
• (Very expensive) solution for the *-2PL group: node deletion requires IDX locks on all descendants having an ID attribute
[Figure: two document trees with id/idref links, T and M locks on the paths from the root, and IDS/IDX locks on the jump targets]
What Else Do Full-Fledged XML Protocols Need?
• Support of direct access via indexes
  • jumps to element nodes not owning an ID attribute
  • cheap mechanism to identify the ancestor node IDs!
• Lock conversion
  • operations of a transaction necessarily share some part of the ancestor path
  • weakest possible locks after conversion
• Appropriate intention locks and subtree locks needed
  • lock depth parameter desirable
• Use ideas of MGL locking: subtree locks + intention locks
  • 4. IRX
  • 5. IRX+: specialized conversion (+) depending on locking situation
[Tables on the slide, not reproduced here: compatibility matrix, two conversion matrices]
Applying Multi-Granularity Locking to XML
• 6. IRIX: conversion
  • example: read C1 – C3, delete C2
• 7. IRIX+: specialized conversion
  • example: read C1 – C3, delete C2
[Figures: for each protocol, the tree root / P / C1, C2, C3 before and after the conversion, annotated with IR, IX, R, and X locks]
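How intention locks protect the ancestor path can be sketched in a few lines. The compatibility table below is the classic multi-granularity one (intention read IR, intention exclusive IX, read R, exclusive X); it is an assumption here, because the slide's own matrices are not part of this transcript, and the helper names `locks_for` and `conflicts` are illustrative.

```python
# Modes that may coexist with a held mode (classic MGL rules, assumed)
COMPAT = {
    "IR": {"IR", "IX", "R"},
    "IX": {"IR", "IX"},
    "R": {"IR", "R"},
    "X": set(),
}


def locks_for(path, mode):
    """Lock requests (root first) to lock path[-1] in mode 'R' or 'X'."""
    intention = "IR" if mode == "R" else "IX"
    return [(node, intention) for node in path[:-1]] + [(path[-1], mode)]


def conflicts(held, requested):
    """Nodes where locks held by another transaction block the request."""
    held_modes = {}
    for node, mode in held:
        held_modes.setdefault(node, []).append(mode)
    return [node for node, mode in requested
            if any(mode not in COMPAT[h] for h in held_modes.get(node, []))]


# One transaction reads C1 and C3, another one deletes C2: no conflict.
reader = locks_for(["root", "P", "C1"], "R") + locks_for(["root", "P", "C3"], "R")
deleter = locks_for(["root", "P", "C2"], "X")
print(conflicts(reader, deleter))

# If the reader also holds R on C2, the delete is blocked at C2.
reader_all = reader + locks_for(["root", "P", "C2"], "R")
print(conflicts(reader_all, deleter))
```

The intention locks on root and P are mutually compatible, so conflicts surface only at the node that is actually read and deleted.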
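MGL Group (Cont.)
• 8. URIX
  • example: read for update C1 – C3, delete C2
[Tables on the slide, not reproduced here: compatibility matrix, conversion matrix]
[Figure: the tree root / P / C1, C2, C3 shown twice, with lock labels IX, RIX, U in the first tree and IX, RIX, X in the second]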
Tailored Node Locks for XML – taDOM2
• 9. taDOM2
• Node locks and compatibility matrix
  • refined URIX protocol with extensions to lock a complete level in a subtree
  • well known: IR/IX and R/X (here SR/SX)
  • edge locks not discussed (3 modes)
[Table on the slide, not reproduced here: compatibility matrix, grouped into read locks and write locks]
Node Locks (1)
• Node read lock (NR)
  • requires IR locks on the ancestor path
• Level read lock (LR)
  • requested for reading the context node and all nodes located at the level below (all direct-child nodes)
• Child exclusive lock (CX)
  • indicates an X lock on a child
  • defined, in addition to IX, to detect conflicts with LR
• Example
  • Transaction T1 is reading <price>
  • Transaction T2 is reading <book> and all direct-child nodes (<title>, <author>, and <price>)
  • Transaction T3 is modifying the book title
[Figure: document tree (bib, book, title, author, price, last, first, text nodes) annotated with the IR, IX, NR, LR, CX, and X locks of T1 to T3]
Node Locks (2)
• Locking subtrees exclusively: intention exclusive lock (IX), child exclusive lock (CX), and exclusive lock (X)
  • requested for updating the context node's content or deleting the context node and its entire subtree
  • requires a CX lock on the parent and IX locks on the ancestors
• Example
  • Transaction T1 is deleting the <first> node and its content
  • Transaction T2 is deleting the <last> node and its content
  • Transaction T3 is reading the <author> node
  • Transaction T4 is reading all direct-child nodes of <book>, but is blocked when reading all child nodes of <author>
[Figure: document tree annotated with the IR, IX, CX, NR, LR, and X locks of T1 to T4]
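The ancestor-path rules of the two "Node Locks" slides can be summarized in a small helper. This is a sketch under assumptions: paths are plain lists of element names instead of node IDs, the function name is made up, and treating LR like NR on the ancestor path (IR locks) is inferred from the figures rather than stated in the text.

```python
def ancestor_path_locks(path, mode):
    """Lock requests (root first) for a context node path[-1].

    mode 'NR' or 'LR': IR on every ancestor.
    mode 'X': CX on the parent, IX on all other ancestors.
    """
    *ancestors, context = path
    if mode in ("NR", "LR"):
        locks = [(node, "IR") for node in ancestors]
    elif mode == "X":
        locks = [(node, "IX") for node in ancestors[:-1]]
        locks += [(node, "CX") for node in ancestors[-1:]]
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return locks + [(context, mode)]


# T1 of "Node Locks (1)": reading <price>
print(ancestor_path_locks(["bib", "book", "price"], "NR"))
# T2 of "Node Locks (2)": deleting <last> and its content
print(ancestor_path_locks(["bib", "book", "author", "last"], "X"))
```

The CX lock on the parent is what lets a level read (LR) on that parent detect the exclusive lock below it; a plain IX would not distinguish a locked child from a locked deeper descendant.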
Tunable Lock Depth
• Goal
  • reduce the number of locks held by using coarser lock granularity
  • may decrease concurrency
  • when nodes deeper than lock depth are accessed: lock modes SR and X are used at the lock depth level
• Example
  • Transaction T1 is reading the author's last name
  • Transaction T2 is updating the author's first name
  • however, using lock depth 2, Transaction T1 would have to acquire an SR lock on author; Transaction T2 would have to acquire an X lock on author and would therefore have to wait on author
[Figure: document tree annotated with the IR, IX, CX, NR, SR, and X locks of T1 and T2]
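The effect of the lock depth parameter can be sketched as a truncation of the locking path. The sketch assumes that the root is at level 0 (so that <author> is at level 2, as in the slide's example) and simplifies read requests to NR; the function name and signature are illustrative.

```python
def locks_with_depth(path, mode, lock_depth):
    """Lock requests (root first) for path[-1]; mode is 'NR' (read) or 'X'.

    Nodes below lock_depth are covered by a subtree lock (SR or X)
    on their ancestor at the lock depth level.
    """
    if len(path) - 1 > lock_depth:            # target lies deeper than lock depth
        path = path[:lock_depth + 1]
        mode = "SR" if mode == "NR" else "X"
    *ancestors, target = path
    if mode in ("NR", "SR"):
        locks = [(node, "IR") for node in ancestors]
    else:
        locks = [(node, "IX") for node in ancestors[:-1]]
        locks += [(node, "CX") for node in ancestors[-1:]]
    return locks + [(target, mode)]


# Lock depth 2: both transactions end up with a subtree lock on <author>
t1 = locks_with_depth(["bib", "book", "author", "last"], "NR", 2)
t2 = locks_with_depth(["bib", "book", "author", "first"], "X", 2)
print(t1[-1], t2[-1])

# A larger lock depth keeps the locks on <last> and <first> themselves
print(locks_with_depth(["bib", "book", "author", "last"], "NR", 5)[-1])
```

With lock depth 2 the SR and X requests meet on <author>, so T2 has to wait; with a deeper lock depth the two transactions lock different leaves and can run concurrently, at the price of more locks.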
Conversion of Node Locks
• Conversion for weakest possible locking paths
  • converting LR to CX requires explicit NR locks on all children (conversion rule CX_NR in the conversion matrix)
  • node labeling scheme cannot deliver IDs of descendant nodes
• Example
  • Transaction T1 is reading <book> and all its direct-child nodes
  • Transaction T2 is reading <book>, the first child node <title> and its value
  • Transaction T1 is deleting <author> and its entire subtree
[Figure: document tree annotated with the IR, IX, CX, NR, LR, and X locks before and after the conversion]
taDOM* Group – Lock Protocol Optimization
• 10. taDOM2+: LRIX, SRIX, LRCX, SRCX
  • new lock modes enable conversion without accessing the document
  • e.g., LRCX (level read child exclusive) combines both modes and avoids application of conversion rule CX_NR
• Optimization steps
  • 11. taDOM3: modification of a single context node
  • 12. taDOM3+: new lock modes to avoid document access
Example: lock conversion in taDOM3 (a suffix after "_" is a subscript on the slide)
        -       IR      NR      LR      SR      IX      NRIX    CX      NRCX    NU      NX      SU      SX
  IR    IR      IR      NR      LR      SR      IX      NRIX    CX      NRCX    NU      NX      SU      SX
  NR    NR      NR      NR      LR      SR      NRIX    NRIX    NRCX    NRCX    NR      NX      SU      SX
  LR    LR      LR      LR      LR      SR      NRIX_NR NRIX_NR NRCX_NR NRCX_NR NU_NR   NX_NR   SU      SX
  SR    SR      SR      SR      SR      SR      NRIX_SR NRIX_SR NRCX_SR NRCX_SR NU_SR   NX_SR   SR      SX
  IX    IX      IX      NRIX    NRIX_NR NRIX_SR IX      NRIX    CX      NRCX    NX      NX      SX      SX
  NRIX  NRIX    NRIX    NRIX    NRIX_NR NRIX_SR NRIX    NRIX    NRCX    NRCX    NX      NX      SX      SX
  CX    CX      CX      NRCX    NRCX_NR NRCX_SR CX      NRCX    CX      NRCX    NX      NX      SX      SX
  NRCX  NRCX    NRCX    NRCX    NRCX_NR NRCX_SR NRCX    NRCX    NRCX    NRCX    NX      NX      SX      SX
  NU    NU      NU      NU      NU_NR   NU_SR   NX      NX      NX      NX      NU      NX      SU      SX
  NX    NX      NX      NX      NX_NR   NX_SR   NX      NX      NX      NX      NX      NX      SX      SX
  SU    SU      SU      SU      SU      SU      SX      SX      SX      SX      SU      SX      SU      SX
  SX    SX      SX      SX      SX      SX      SX      SX      SX      SX      SX      SX      SX      SX
XTC – Architectural Overview
[Figure: layered architecture of the XTC server (layers L5 down to L1) on top of the OS file system.
Service groups named on the slide: Interface Services, XML Processing Services, Node Processing Services, Access Services, Propagation Control, File Services, Transaction Services.
Components named on the slide: Http Agent, Ftp Agent, DOM RMI, SAX RMI, API RMI, XQuery Processor, XML Manager, XSLT Processor, Path Processing, Node Manager, Lock Manager, Record Mgr, Index Mgr, Catalog Mgr, Buffer Manager, Temp File Mgr, I/O Manager, Transaction Manager, Deadlock Detector.
Files: Temporary Files, Transaction Log, Container Files.]
• Determination of ancestor node IDs is of utmost importance for any locking protocol
taDOM Storage Model – View of Lock Mgr
XML document:
  <?xml version="1.0"?>
  <bib>
    <book year="2004" id="book1">
      <title>The Title</title>
      <author>
        <last>Lastname</last>
        <first>Firstname</first>
      </author>
      <price>49.99</price>
    </book>
  </bib>
[Figure: taDOM tree of the document. Element nodes: bib, book, title, author, last, first, price. An attribute root node below book carries the attribute nodes id and year. Text nodes (T) and attribute nodes each have a string node holding the value: The Title, Lastname, Firstname, 49.99, book1, 2004.]
Identifying Nodes – Node Numbering Schemes
[Figure: the sample document tree shown twice, once labeled with sequential numbers (1 to 21) and once with SPLIDs (DeweyIDs) such as 1, 1.3, 1.3.1, 1.3.3, 1.3.5, 1.3.7, 1.3.5.3, 1.3.5.5; a newly inserted <last> node appears with the label 1.3.5.4.3]
• Sequential numbering
  • very slow, although supported by on-demand indexes
  • determination of parent ID and ancestor IDs, however, is very frequent
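With prefix-based labels, the ancestor IDs can be computed from the label alone. The sketch below assumes the usual DeweyID convention that odd divisions mark tree levels and even divisions are overflow values used for later insertions (as in the label 1.3.5.4.3 of the new <last> node); the function names are illustrative and not taken from XTC.

```python
def parent(splid):
    """Parent label of a SPLID given as a dotted string, or None for the root."""
    divisions = splid.split(".")
    divisions.pop()                              # drop the node's own division
    while divisions and int(divisions[-1]) % 2 == 0:
        divisions.pop()                          # even divisions are no levels (assumption)
    return ".".join(divisions) if divisions else None


def ancestors(splid):
    """All ancestor labels, nearest ancestor first."""
    result = []
    current = parent(splid)
    while current is not None:
        result.append(current)
        current = parent(current)
    return result


print(ancestors("1.3.5.3.3.1"))   # string node below the first <last> element
print(ancestors("1.3.5.4.3"))     # newly inserted <last> node
```

No index or document access is involved, which is what makes intention locks on the whole ancestor path cheap to request. With sequential numbers the same question requires a lookup per ancestor.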
Meta-Synchronization
• Meta-synchronization
  • allows identical runtime environment for lock contests
  • lock mgr provides methods: supportsSharedLevelLocking, supportsSharedTreeLocking, supportsExclusiveTreeLocking
• Meta-lock requests from node manager to lock manager
  • request shared node lock
  • request shared level lock
  • request tree lock (shared, update, exclusive)
  • . . .
• Meta-lock requests are mapped to the actual lock algorithm
  • lock manager implements a certain interface
  • exchange of the lock manager interface implementation exchanges the system's complete XML locking mechanism
• Advantages of SPLIDs used in all 12 protocols!
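The idea of an exchangeable lock manager can be sketched as an abstract interface with two implementations. Only `supportsSharedLevelLocking` and the request names come from the slide (rendered in snake_case here); the class names, signatures, the `granted` list, and the `"shared"` mode string are assumptions for illustration and do not reflect the XTC code.

```python
from abc import ABC, abstractmethod


class LockManager(ABC):
    """Meta-lock interface called by the node manager."""

    def __init__(self):
        self.granted = []                    # (tx, node, mode), for inspection

    def supports_shared_level_locking(self):
        return False

    @abstractmethod
    def request_shared_node_lock(self, tx, path):
        ...

    def request_shared_level_lock(self, tx, path):
        raise NotImplementedError


class LevelLockingManager(LockManager):
    """Protocol with level locks: IR on the ancestors, NR or LR on the node."""

    def supports_shared_level_locking(self):
        return True

    def request_shared_node_lock(self, tx, path):
        self._lock(tx, path, "NR")

    def request_shared_level_lock(self, tx, path):
        self._lock(tx, path, "LR")

    def _lock(self, tx, path, mode):
        self.granted += [(tx, node, "IR") for node in path[:-1]]
        self.granted.append((tx, path[-1], mode))


class RootOnlyManager(LockManager):
    """Document-level protocol: every request becomes a lock on the root."""

    def request_shared_node_lock(self, tx, path):
        self.granted.append((tx, path[0], "shared"))


def read_children(lock_mgr, tx, path, children):
    """Node manager side: one level lock if supported, else one lock per child."""
    if lock_mgr.supports_shared_level_locking():
        lock_mgr.request_shared_level_lock(tx, path)
    else:
        for child in children:
            lock_mgr.request_shared_node_lock(tx, path + [child])


for manager in (LevelLockingManager(), RootOnlyManager()):
    read_children(manager, "T1", ["bib", "book"], ["title", "author", "price"])
    print(type(manager).__name__, manager.granted)
```

The node manager code stays the same for every protocol; only the object behind the interface changes, which is what keeps the runtime environment identical across the contest.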
TaMix Benchmark Framework
• So far, no update benchmark for XML docs available
• TaMix infrastructure for distributed OLTP benchmarks
  • a list of TX types is assigned to each client
  • each client runs n TX in parallel and keeps the workload level
• Automated measurement
  • per measurement point 3 runs
  • configurable runtime interval
  • for 12 lock protocols
  • in 6 lock depths
  • ~ 20 hours per measurement
[Figure: a coordinator starts, stops, and configures the XTC server (server node) and several clients (client nodes)]
Performance Measurement
• Data base (DataGuide)
[Figure: DataGuide of the bank document. Element and attribute names on the slide: Bank, Accounts, Account, AccountNo, Balance, Owner, Disposition, Credit, Postings, Posting, Amount, Day, Receiver, Standing_Orders, Standing_Order, Protocols, Protocol, Customers, Customer, Name, Fname, Address, Street, Zip, City, ABA_No, No, id]
• Size
  • ~8 MB
  • 580,000 taDOM nodes
Performance Measurement (2)
• Transaction types for Banking benchmark
  • bank transfer (5 TX/client)
    • jump to a randomly selected account element
    • navigation through the document, update operations for Balance and Posting
  • standing order (5 TX/client)
    • random account, navigation to Standing Orders, read of all orders
    • evaluation of the Child axis, small fraction of update operations
  • customer master
    • renaming of a Customer element (1 TX/client)
    • in parallel, reconstruction of randomly selected Customer fragments (5 TX/client)
  • account statement (5 TX/client)
    • reconstruction of randomly selected Account fragments
    • small amount of update operations (insertion of an entry in Protocols)
  • removal of a customer from the data base (2 TX/client)
    • deletion of fragments
• Transaction mix
  • processes all transaction types in parallel
  • constant system load of 66 transactions
Performance Measurement (3)
• Number of committed transactions in Banking benchmark
[Figure: committed transactions (y-axis, 100 to 400) over lock depth (x-axis, 0 to 5), one curve per lock protocol: taDOM3+, taDOM3, taDOM2+, taDOM2, URIX, IRIX, IRIX+, RIX, RIX+, OO2PL, NO2PL, Node2PL. Curve groups labeled in the plot: "taDOM3+, taDOM2+"; "taDOM3, taDOM2"; "URIX"; "RIX(+), IRIX(+)"; "Node2PL, NO2PL, OO2PL".]
Performance Measurement (4)
• Number of aborted transactions in Banking benchmark
[Figure: aborted transactions (y-axis, 0 to 700) over lock depth (x-axis, 0 to 5), one curve per lock protocol (same 12 protocols). Curve groups labeled in the plot: "RIX(+), IRIX(+)"; "taDOM, URIX"; "Node2PL, NO2PL, OO2PL".]
Detailed Performance Measurement (5)
• Successful bank transfers
  • node-based navigation, update operations
[Figure: successful bank transfers (y-axis, 0 to 400) over lock depth (x-axis, 0 to 5), one curve per lock protocol (same 12 protocols). Curve groups labeled in the plot: "Node2PL, NO2PL, OO2PL"; "taDOM2, taDOM3"; "taDOM, URIX"; "RIX(+), IRIX(+)".]
Detailed Performance Measurement (6)
• Successfully modified standing orders
  • evaluation of child axis, few update operations
[Figure: successfully modified standing orders (y-axis, 0 to 800) over lock depth (x-axis, 0 to 5), one curve per lock protocol (same 12 protocols). Curve groups labeled in the plot: "taDOM3+, taDOM2+"; "Node2PL, NO2PL, OO2PL"; "MGL"; "taDOM3, taDOM2"; "taDOM, URIX"; "RIX(+), IRIX(+)".]
Detailed Performance Measurement (7)
• Customer Master: successfully modified Customer elements + reconstructed Customer fragments
  • renaming of an inner element node
[Figure: successful Customer Master transactions (y-axis, 800 to 1500) over lock depth (x-axis, 0 to 5), one curve per lock protocol (same 12 protocols). Curve groups labeled in the plot: "taDOM"; "URIX"; "RIX(+), IRIX(+)"; "Node2PL, NO2PL, OO2PL".]
Detailed Performance Measurement (8)
• Successfully processed account statements
  • reconstruction of fragments, few update operations
[Figure: successfully processed account statements (y-axis, 0 to 800) over lock depth (x-axis, 0 to 5), one curve per lock protocol (same 12 protocols). Curve groups labeled in the plot: "taDOM, MGL"; "Node2PL, NO2PL, OO2PL".]
Detailed Performance Measurement (9)
• Successfully removed customer records
  • deletion of fragments
[Figure: successfully removed customer records (y-axis, 0 to 140) over lock depth (x-axis, 0 to 5), one curve per lock protocol (same 12 protocols). Curve groups labeled in the plot: "taDOM, MGL"; "NO2PL, OO2PL"; "RIX(+), IRIX(+)"; "taDOM, URIX"; "Node2PL".]
Conclusions and Outlook
• XTC is used as a test vehicle for empirical DB research
  • effectiveness of XML concurrency control
  • fine-granular locking on nodes and edges
  • meta-synchronization allows comparison of different compatibilities
• taDOM* protocols
  • multiplicity of lock modes
  • intention locks are important
  • indexed document access is frequent
  • ancestor path locking without accessing the storage engine desirable
• performance evaluation revealed
  • use of tailored lock modes pays off
  • indexed document access is frequent
  • effect of isolation levels on transaction throughput
  • influence of node numbering schemes (insertions at arbitrary positions)
• Outlook
  • phantom prevention
  • mapping different XML language models via access models to our XML storage model, e.g., to analyze the locking behavior of XQuery processing
Contest of XML Lock Protocols Thank you. Any questions?