
Peter Boncz (CWI) Sjoerd Mullender update actions Jens Teubner XQUF parsing Niels Nes logging

everything you always wanted to know about Updates in MonetDB/XQuery but were afraid to ask. Peter Boncz (CWI) Sjoerd Mullender update actions Jens Teubner XQUF parsing Niels Nes logging Stefan Manegold the rest. XQuery Update Facility (XQUF) semantics & the update tape


Presentation Transcript


  1. Everything you always wanted to know about Updates in MonetDB/XQuery but were afraid to ask • Peter Boncz (CWI) • Sjoerd Mullender: update actions • Jens Teubner: XQUF parsing • Niels Nes: logging • Stefan Manegold: the rest

  2. Overview • XQuery Update Facility (XQUF) • semantics & the update tape • Updatable XML storage in BATs • maintaining order in an array without O(N) cost • Snapshot Isolation • why we want it, how we got it • Concurrency Control • optimistic, with "abort convoys" • Durability • physical logging • Conclusion & Future Challenges

  3. XQuery Update Facility (XQUF) • January 2006, first proposal • Internal primitives: • upd:insertBefore • upd:insertAfter • upd:insertInto • upd:insertIntoAsLast • upd:insertAttributes • upd:delete • upd:replaceValue • upd:rename • Pending update list concept • upd:applyUpdates

  4. Example insert <item id="{id}"> <location>Brazil</location> <quantity>200</quantity> <name>XML in a nutshell</name> <payment>Credit Card, Personal check</payment> <shipping>Will ship internationally</shipping> <incategory category="category1"/> </item> as last into fn:doc("xmark.xml")/site/regions/samerica

  5. Semantics let $root := doc("foo.xml") for $i in (1,2,3) return (do insert <x>{$i}</x> as first into $root, do insert <y>{$i}</y> as first into $root)

  6. Semantics let $root := doc("foo.xml") for $i in (1,2,3) return (do insert <x>{$i}</x> as first into $root, do insert <y>{$i}</y> as first into $root) • We need to • define an execution order, and • enforce it

  7. The Update Tape • update = sequence (int, node, node/str, node/str) • fn:delete() → (DELETE, node, nil, nil) • fn:insert_*() → (INSERT, tgt-node, tgt-level, expr-node) • fn:set-attr() → (ATTR, node, qn, val) • fn:unset-attr() → (ATTR, node, qn, nil) • fn:set-text() → (TEXT, node, val, nil) • fn:set-pi() → (PI, node, ins-val, arg-val) • fn:set-comment() → (COMMENT, node, val, nil) • Element construction, which combines updates, enforces the correct order of the update tape • The Pathfinder compiler automatically inserts a call to fn:update(item*) on the result of all update queries
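
The tape idea can be illustrated with a minimal Python sketch. This is not MonetDB/XQuery code: the list-of-children document model, the `INSERT_FIRST` kind and the `apply_tape` helper are assumptions made for illustration. It only shows that once updates are recorded as 4-tuples and applied strictly in tape order, the outcome of the "Semantics" example is deterministic.

```python
# Hypothetical model: a document is just a list of child names under $root.
INSERT_FIRST = "INSERT_FIRST"   # stands in for the slide's INSERT kind (assumption)

def build_tape():
    """Emit one 4-tuple per update, in query evaluation order."""
    tape = []
    for i in (1, 2, 3):
        tape.append((INSERT_FIRST, "root", None, f"x{i}"))
        tape.append((INSERT_FIRST, "root", None, f"y{i}"))
    return tape

def apply_tape(children, tape):
    """Apply entries strictly in tape order; returns a new child list."""
    result = list(children)
    for kind, _target, _level, payload in tape:
        if kind == INSERT_FIRST:
            result.insert(0, payload)
        else:
            raise ValueError(f"unsupported tape entry: {kind}")
    return result

final = apply_tape([], build_tape())
```

Under these assumptions each later "as first" insert lands in front of the earlier ones, so the last tape entry ends up as the first child.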

  8. XPath Accelerator [SIGMOD02] • Node-based relational encoding of XQuery's data model • Example document: <a> <b> <c> <d/> <e/> </c> </b> <f> <g/> <h> <i/> <j/> </h> </f> </a> • Figure regions: ancestor, following, preceding, descendant

  9. XML Storage Revisited post = pre + size - level
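
As a worked check of the formula, the following Python sketch computes pre, size, level and post for the example document from the XPath Accelerator slide and asserts post = pre + size - level for every node. The `encode` helper is illustrative only; it assumes an element-only document with unique tag names.

```python
import xml.etree.ElementTree as ET

DOC = "<a><b><c><d/><e/></c></b><f><g/><h><i/><j/></h></f></a>"

def encode(xml_text):
    """Return {tag: (pre, size, level, post)} for an element-only document."""
    table = {}
    pre_counter = 0
    post_counter = 0

    def visit(node, level):
        nonlocal pre_counter, post_counter
        pre = pre_counter                 # rank in document (pre-order) order
        pre_counter += 1
        for child in node:
            visit(child, level + 1)
        size = pre_counter - pre - 1      # number of descendants
        post = post_counter               # rank in post-order
        post_counter += 1
        table[node.tag] = (pre, size, level, post)

    visit(ET.fromstring(xml_text), 0)
    return table

table = encode(DOC)
for tag, (pre, size, level, post) in table.items():
    assert post == pre + size - level, tag
```

For instance, node f has pre = 5, size = 4, level = 1, which gives post = 5 + 4 - 1 = 8.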

  10. Updates: Mission Impossible? • Example document: <a> <b> <c> <d/> <e/> </c> </b> <f> <g/> <h> <i/> <j/> </h> </f> </a> • Figure regions: ancestor, following, preceding, descendant • INSERT SUBTREE I • ancestor: SIZE + |I| • following: PRE + |I| • size(following) = O(N) → killer (?)

  11. XML Storage Revisited • post = pre + size - level • Allow holes • Define logical pages

  12. XML Storage Revisited • post = pre + size - level • Allow holes • Define logical pages • rid = pre.swizzle( )

  13. XML Storage Revisited Update-friendly • rid-table is append-only • rid-tuples may be unused • rid = autoincrement column MonetDB: • rid not stored but computed (virtual oid) • allows positional lookup/join Not stored → no need to update it either

  14. XML Storage Revisited Update-friendly • rid-table is append-only • rid-tuples may be unused • rid = autoincrement column Updatable document collection: • pf:add-doc(URI, docname, perc>0) • pf:add-doc(URI, docname, collname, perc>0) • pre := nid.leftfetchjoin(nid_rid).swizzle(map_pid) Read-only document collection: • pf:add-doc(URI, docname, 0) • pf:add-doc(URI, docname, collname, 0) • NID = RID = PRE • pre := nid.leftfetchjoin(nid_rid).swizzle(map_pid) = FREE!!
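
The logical-page idea can be sketched in Python as follows. This is a toy model, not the MonetDB implementation: the page size, the `logical_to_physical` list and the helper names are assumptions, and the real `swizzle` signature is not shown in the slides. The point is that splicing a page into the logical order shifts the pre of all following nodes without moving or rewriting any stored tuple.

```python
PAGE_BITS = 2                 # assumed tiny page size (4 tuples) for the demo
PAGE_SIZE = 1 << PAGE_BITS
PAGE_MASK = PAGE_SIZE - 1

def pre_to_rid(pre, logical_to_physical):
    """Translate a logical position (pre) to a physical one (rid)."""
    page = logical_to_physical[pre >> PAGE_BITS]
    return (page << PAGE_BITS) | (pre & PAGE_MASK)

def rid_to_pre(rid, logical_to_physical):
    """Inverse translation; the linear search is fine for a toy page map."""
    logical = logical_to_physical.index(rid >> PAGE_BITS)
    return (logical << PAGE_BITS) | (rid & PAGE_MASK)

def insert_page(logical_to_physical, logical_pos):
    """Append a fresh physical page and splice it into the logical order."""
    new_physical = len(logical_to_physical)
    return (logical_to_physical[:logical_pos]
            + [new_physical]
            + logical_to_physical[logical_pos:])

page_map = [0, 1, 2]                  # three pages, stored in document order
rid = pre_to_rid(9, page_map)         # node on logical page 2, slot 1
page_map = insert_page(page_map, 1)   # new page between logical pages 0 and 1
```

After the insert the node keeps rid 9 but its pre becomes 13, and the new page lives at the end of the physical table, which matches the append-only property on the slide.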

  15. Snapshot Isolation • Versus 2-phase locking (2PL) == full serializability • Why not 2PL for XML: • lock semantics much more complex than in relational case (order matters!!) • node-level locking in staircase join?? (now 10 cycles/node…)

  16. Snapshot Isolation

  17. Snapshot Isolation • Versus 2-phase locking (2PL) == full serializability • Why not 2PL for XML: • lock semantics much more complex than in relational case (order matters!!) • node-level locking in staircase join?? (now 10 cycles/node…) • Why Snapshot Isolation: • great for read-queries, great for ll_scj (runs unmodified) • quite strong. Better than repeatable read. Oracle/Postgres do it. • Problem with Snapshot Isolation: • in XQuery, it is unknown at compile-time what to snapshot (fn:doc(..))

  18. Snapshot Isolation • Figure: Read Query 1, Read Query 2, Update Query • Isolation by Shadow Paging (copy-on-write mmap) • rid/pre delete/insert + attr-replace • Touch one byte per physical page: *addr = *addr; • MMU traps, OS replaces page by a copy • we would like to replace the master copy once, not all client copies

  19. Snapshot Isolation • Figure: Read Query 1, Read Query 2, Update Query, Isolate-page • Isolation by Shadow Paging (copy-on-write mmap) • rid/pre delete/insert + attr-replace • Touch one byte per physical page: *addr = *addr; • MMU traps, OS replaces page by a copy • we would like to replace the master copy once, not all client copies

  20. Snapshot Isolation • Figure: Read Query 1, Read Query 2, Update Query, Isolate-page • Isolation by Shadow Paging (copy-on-write mmap) • rid/pre delete/insert + attr-replace • Touch one byte per physical page: *addr = *addr; • MMU traps, OS replaces page by a copy

  21. Snapshot Isolation • Figure: Read Query 1, Read Query 2, Update Query, Master-update • Isolation by Shadow Paging (copy-on-write mmap) • rid/pre delete/insert + attr-replace • Touch one byte per physical page: *addr = *addr; • MMU traps, OS replaces page by a copy • we would like to replace the master copy once, not all client copies
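
Copy-on-write mapping can be tried out with Python's standard `mmap` module, whose `ACCESS_COPY` mode gives a private mapping: writes change the in-memory copy but never reach the underlying file. The sketch below is only an analogy for the slide's C-level `*addr = *addr;` trick; the temporary file standing in for a master table and the `isolated_write_demo` helper are assumptions.

```python
import mmap
import os
import tempfile

def isolated_write_demo():
    """Write through a private (copy-on-write) mapping; the file stays intact."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * (2 * mmap.PAGESIZE))   # toy "master" of two pages
        os.close(fd)
        with open(path, "r+b") as f:
            private = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
            private[0:1] = private[0:1]   # touch one byte: forces a private page copy
            private[0:1] = b"B"           # the update lands in the private copy only
            seen_private = bytes(private[0:1])
            private.close()
        with open(path, "rb") as f:
            seen_master = f.read(1)       # the file on disk is unchanged
        return seen_private, seen_master
    finally:
        os.remove(path)

private_byte, master_byte = isolated_write_demo()
```

The update query sees its own change while the master file still holds the original byte, which is the isolation effect the slides rely on.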

  22. Durability • Masters become dirty • no time to flush them during query • log all changes to a WAL • = log all tuples that changed = entire pages • Recovery: • after a crash, we do not know whether dirty pages got saved • solution: overwrite tables with values from the WAL • Checkpointing Thread: • every 5 minutes, if 'many' changes occurred, checkpoint • memory mapped BATs are sync()-ed → only dirty pages get written • checkpoint locks collection, halts query processing

  23. Durability • Masters become dirty • no time to flush them during query • log all changes to a WAL • = log all tuples that changed = entire pages • Recovery: • after a crash, we do not know whether dirty pages got saved • solution: overwrite tables with values from the WAL • Checkpointing Thread: • every 5 minutes, if 'many' changes occurred, checkpoint • memory mapped BATs are sync()-ed → only dirty pages get written • checkpoint locks collection, halts query processing
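
The logging and recovery bullets can be sketched as a toy page store in Python. Everything here (the `PageStore` class, in-memory lists standing in for disk and log, the tiny page size) is an assumption for illustration, not the MonetDB logger. It shows the two rules from the slide: whole page images go to the WAL before the master changes, and recovery simply overwrites pages with the logged images.

```python
PAGE_SIZE = 4   # assumed tiny page size for the demo

class PageStore:
    """Toy table of fixed-size pages with a page-level write-ahead log."""

    def __init__(self, n_pages):
        self.disk = [[0] * PAGE_SIZE for _ in range(n_pages)]   # persistent image
        self.memory = [list(p) for p in self.disk]              # dirty masters
        self.wal = []                                           # persistent log

    def commit(self, changes):
        """changes: {(page_no, slot): value}. Log whole pages, then update masters."""
        touched = {}
        for (page_no, slot), value in changes.items():
            page = touched.setdefault(page_no, list(self.memory[page_no]))
            page[slot] = value
        for page_no, page in touched.items():
            self.wal.append((page_no, list(page)))   # WAL first ...
        for page_no, page in touched.items():
            self.memory[page_no] = page              # ... then the master

    def checkpoint(self):
        """Flush masters to disk; the log is not needed afterwards."""
        self.disk = [list(p) for p in self.memory]
        self.wal.clear()

    def recover(self):
        """After a crash: reload disk, then overwrite pages with WAL images."""
        self.memory = [list(p) for p in self.disk]
        for page_no, page in self.wal:
            self.memory[page_no] = list(page)

store = PageStore(2)
store.commit({(0, 1): 7})
store.checkpoint()
store.commit({(1, 2): 9})
store.memory = None      # simulate a crash: in-memory masters are lost
store.recover()
```

Because recovery replays full page images, it does not matter whether a dirty page had already reached disk before the crash.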

  24. The Update Sequence • Execute Query • build update tape • queries get isolated copies of a document (VM copy-on-write mmap) • Prepare Intensional Updates • execute update tape. • does not modify masters (except append-only tables) • Commit Phase (locked phase – per doc-collection) • precommit • detect conflicts (not the size-ancestors) • write WAL (globally locked) • read master-size-ancestors, use delta, log result • update master tables • isolate first! Only then update masters. • update index structures
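
One possible reading of "read master-size-ancestors, use delta, log result" is sketched below in Python. This is an interpretation, not confirmed by the slides: each transaction records how much it adds to the size of every ancestor, and the locked commit step adds that delta to whatever the master currently holds. The `prepare_insert` and `commit` helpers and the rid numbers are hypothetical.

```python
def prepare_insert(ancestors, inserted_nodes):
    """Record a size delta per ancestor instead of a new absolute size."""
    return {rid: inserted_nodes for rid in ancestors}

def commit(master_size, deltas, log):
    """Locked commit step: read current master sizes, add deltas, log, update."""
    for rid, delta in deltas.items():
        new_size = master_size[rid] + delta
        log.append((rid, new_size))      # log the resulting value
        master_size[rid] = new_size

# rid -> size for nodes a, b and f of the example document (illustrative values)
master_size = {0: 9, 1: 3, 5: 4}
log = []
t1 = prepare_insert(ancestors=[0, 1], inserted_nodes=2)   # insert below b
t2 = prepare_insert(ancestors=[0, 5], inserted_nodes=3)   # insert below f
commit(master_size, t1, log)
commit(master_size, t2, log)
```

Under this reading, both transactions touch the size of the shared root without conflicting, since deltas commute; that would explain why ancestor sizes are excluded from conflict detection.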

  25. Many more Issues Solved • Indexing and Updates • Runtime QN → NID mapping, with hash table • read-only: not a hash, but keep sorted & persistent • keep INS + DEL deltas to commit without changing the hash table • Runtime NID → ATTR hash table • isolation loses you MonetDB dynamic hash table reuse • share an old copy, exploit append-mostly • Concurrency: Updates ↔ Checkpoint, Shredding ↔ Query, Shredding ↔ Updates • Conflicting Updates • detect conflicting queries: • look at RID page numbers and attr-IDs • reacting to conflicts: • abort query + automatic restart • run CONVOY of 5 next update queries serially • ACID properties on the Meta Level • Shredding a new doc into a collection ↔ Query • Shredding a new doc into a collection ↔ Update • Using a collection ↔ Deleting/adding documents • Meta Querying ↔ Deleting/adding documents • Allocating New Pages and NIDS • Offload shredding interference with freelist • Unlocked access to private pages
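
The conflict test named under "Conflicting Updates" can be sketched in Python as a set intersection. The write-set layout (`pages`, `attrs`) and the helper names are assumptions; the slides only say that RID page numbers and attr-IDs are compared and that a conflicting query is aborted and restarted.

```python
def conflicts(tx_a, tx_b):
    """Two write sets conflict if they share a RID page number or an attr ID."""
    return bool(tx_a["pages"] & tx_b["pages"] or tx_a["attrs"] & tx_b["attrs"])

def try_commit(tx, committed_since_snapshot):
    """Return True to commit, False to abort (and later restart) the query."""
    return not any(conflicts(tx, other) for other in committed_since_snapshot)

tx1 = {"pages": {3, 4}, "attrs": set()}    # already committed
tx2 = {"pages": {4}, "attrs": {17}}        # shares page 4 with tx1
tx3 = {"pages": {8}, "attrs": {21}}        # disjoint from tx1

tx2_ok = try_commit(tx2, [tx1])
tx3_ok = try_commit(tx3, [tx1])
```

Checking at page granularity is coarse: two updates to different nodes on the same page count as a conflict. The serial convoy mentioned on the slide is not modeled here.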

  26. Snapshot Isolation • Versus 2-phase locking (2PL) == full serializability • Why not 2PL for XML: • lock semantics much more complex than in relational case (order matters!!) • node-level locking in staircase join?? (now 10 cycles/node…) • Why Snapshot Isolation: • great for read-queries, great for ll_scj (runs unmodified) • quite strong. Better than repeatable read. Oracle/Postgres do it. • Problem with Snapshot Isolation: • in XQuery, it is unknown at compile-time what to snapshot (fn:doc(..)) • 2PL (++): 375 transactions / 5 minutes ≈ 1.25 transactions/sec

  27. Conclusions • It works! Reasonable/good performance! • transaction mgmt as a module extension outside a kernel works • identified VM primitives that databases really need • Future work: • Test on XML update benchmark TPoX (DB2: 700 trans/second) • Packed Memory Arrays: alternative for page remapping? • page remapping is technically O(N) • Engineering: • support for value-indexing (does PF support it already?) • asynchronous WAL writing to boost throughput • port MIL to C primitives; port C primitives to Monet5
