Enumerating XML Data for Dynamic Updating
L. Kit and V. Ng, Hong Kong Polytechnic University

Sang-Ho Nah Lily Daniel Yun Hee Lee

Introduction
• n-INode: a new model-mapping approach
• Multidimensional node ID for indexing and node-to-node relationship calculation
• Flexibly supports dynamic updating of XML data

n-INode
• XML document represented as an nc-ary complete tree, where nc = maximum number of child nodes per node
• Multidimensional node ID
  • k-dimensional ID: (id1, io1, id2, io2, …, idk, iok)
  • idx: node identifier assigned by the numbering scheme
  • iox: insertion order, a sequential number starting from 0
• The presence of iox allows more than nc child nodes to be inserted, with no re-calculation of existing nodes' IDs required
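The ID layout above can be sketched in a few lines of Python. This is only an illustration, not the paper's implementation: the slides do not say which numbering scheme assigns idx, so the sketch assumes level-order numbering of the complete nc-ary tree with the root numbered 1. The names `NodeId`, `child_number`, `parent_number`, and `dimensions` are hypothetical.

```python
from typing import Tuple

# A k-dimensional ID stored as k (id, io) pairs: ((id1, io1), ..., (idk, iok)).
NodeId = Tuple[Tuple[int, int], ...]


def child_number(parent: int, j: int, nc: int) -> int:
    """Number of the j-th child (1-based) of `parent`.

    ASSUMPTION: nodes of the complete nc-ary tree are numbered level by
    level, left to right, starting with the root at 1. The slides do not
    specify the numbering scheme.
    """
    return nc * (parent - 1) + j + 1


def parent_number(child: int, nc: int) -> int:
    """Inverse of child_number under the same assumed numbering."""
    return (child - 2) // nc + 1


def dimensions(node_id: NodeId) -> int:
    """Number of (id, io) pairs in the ID."""
    return len(node_id)


# With nc = 3, the root and its first child as 1-dimensional IDs (io = 0).
root: NodeId = ((1, 0),)
first_child: NodeId = ((child_number(1, 1, nc=3), 0),)
```

Under this assumed numbering, a node's parent can be computed from its number alone, which is the property that makes a complete-tree numbering attractive for relationship calculation.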

n-INode (cont.)
• Insertion rules
  • If the newly inserted node's id1 already exists in the tree, its io1 is incremented from the maximum io1 among existing nodes with the same id1
  • If the new node is inserted at the "rightmost position" and the maximum io1 (over all nodes with the same id1) is less than nc, then io1 = nc+1
  • A new dimension is introduced for all descendants of a node that has io1 > 0; the parent's first dimension is assigned to the child's first dimension
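The first insertion rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `next_insertion_order` and the list-of-pairs representation are hypothetical, and the "rightmost position" rule and the new-dimension rule are deliberately left out because the slides do not give enough detail to implement them faithfully.

```python
from typing import Iterable, Tuple


def next_insertion_order(existing: Iterable[Tuple[int, int]], new_id: int) -> int:
    """Insertion order (io) for a node whose identifier is `new_id`.

    `existing` holds the (id, io) pairs already present in the tree for
    the dimension being assigned. If `new_id` is already used, the new io
    is one more than the largest io recorded for it; otherwise the
    sequence starts at 0, as stated on the slides.
    """
    orders = [io for (node_id, io) in existing if node_id == new_id]
    return max(orders) + 1 if orders else 0


# Identifier 5 is already used twice (io 0 and 1), identifier 7 once.
pairs = [(5, 0), (5, 1), (7, 0)]
io_for_5 = next_insertion_order(pairs, 5)  # 5 exists: max io is 1, so 2
io_for_9 = next_insertion_order(pairs, 9)  # 9 is new: starts at 0
```

Because only the new node receives a fresh io, none of the existing (id, io) pairs has to change, which is the "no re-calculation" property the slides emphasize.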

n-INode (cont.)
• Parent-child relationship:
  • Pair of nodes with the same number of dimensions
  • Pair of nodes with a dimensional difference of one
  • Parent and child MUST share the identical first dimension
• Ancestor-descendant relationship:
  • The above two situations
  • Pair of nodes with a dimensional difference of more than one
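The dimension-count cases listed above amount to a cheap pre-filter on a pair of IDs. The sketch below encodes only that pre-filter; it does not verify the relationship itself, which would need the numbering scheme the slides leave unspecified. The function name `candidate_relationships` and the tuple-of-pairs ID representation are assumptions made for illustration.

```python
from typing import Set, Tuple

NodeId = Tuple[Tuple[int, int], ...]  # ((id1, io1), ..., (idk, iok))


def candidate_relationships(a: NodeId, b: NodeId) -> Set[str]:
    """Relationships that the dimension counts of `a` and `b` permit.

    Per the slides: parent-child is possible only when the dimension
    counts are equal or differ by one; ancestor-descendant is possible
    in those cases and also when the counts differ by more than one.
    """
    diff = abs(len(a) - len(b))
    if diff <= 1:
        return {"parent-child", "ancestor-descendant"}
    return {"ancestor-descendant"}


# Hypothetical IDs with one, two, and three dimensions.
one_dim: NodeId = ((2, 0),)
two_dim: NodeId = ((2, 1), (3, 0))
three_dim: NodeId = ((2, 1), (3, 1), (4, 0))

near = candidate_relationships(one_dim, two_dim)   # difference of one
far = candidate_relationships(one_dim, three_dim)  # difference of two
```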

Implementation & Experiment
• Required storage space is not the smallest of all the models tested
• Other test results show that this is a reasonable trade-off
• Query time is reasonable and consistent, indicating that it does not depend heavily on the type of query

Possible flaws in n-INode
• The node relationship calculation/verification rule excludes the case where both nodes in the pair have 1-dimensional IDs (their first dimensions cannot be the same)
• The path sequence of each node changes when more than nc child nodes are allowed to be inserted; therefore the path sequence should not be used in node identifier calculation
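The first flaw can be made concrete with a two-node example. The sketch assumes IDs are stored as tuples of (id, io) pairs and that the root and its first child carry identifiers 1 and 2; both the representation and the values are hypothetical, chosen only to illustrate the point.

```python
from typing import Tuple

NodeId = Tuple[Tuple[int, int], ...]  # ((id1, io1), ..., (idk, iok))


def shares_first_dimension(a: NodeId, b: NodeId) -> bool:
    """The stated requirement: both IDs have an identical first (id, io) pair."""
    return a[0] == b[0]


# Two distinct nodes that each have a 1-dimensional ID. The values are
# assumed for illustration: the root and one of its original children.
root: NodeId = ((1, 0),)
child: NodeId = ((2, 0),)

# A 1-dimensional ID consists of nothing but its first dimension, so two
# distinct nodes can never pass this check, even a genuine parent and child.
rejected = not shares_first_dimension(root, child)
```

Any rule that requires an identical first dimension therefore needs a separate case for pairs of 1-dimensional IDs.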

Conclusion
• Identifying the insertion order removes the restriction on the number of child nodes that can be inserted
• Re-calculation of existing nodes' IDs is not required
• This allows for more effective and efficient node-locating operations, supporting dynamic updates of XML data
• However, some aspects were overlooked, which makes the proof of correctness presented in the paper somewhat deficient
