Compressed Accessibility Map: Efficient Access Control for XML. Ting Yu: University of Illinois; Divesh Srivastava: AT&T Labs; Laks V.S. Lakshmanan: University of British Columbia; H.V. Jagadish: University of Michigan.
Information Sharing in Business over the Internet • XML as a standard information exchange/sharing format • Direct access to XML documents • Offers advantages in terms of cost, accuracy, and timeliness • Security is crucial • The nature of selective access in this context is complex
Access Control for XML • Fine-grained access control • Business relationships are sophisticated • Constraints at the tag/attribute level instead of only at the document level • Complex access control rules • Efficient evaluation of data's accessibility is desired • Focus of this talk
An Example XML Document with Access Control Info.
<division name="security" access="public">
  <about_div>
    <member> … </member>
    <member> … </member>
  </about_div>
  <res_activity>
    <description access="internal"> The purpose of … </description>
    <project access="public" type="system">
      <name access="internal">Access Control</name>
      <fund access="internal">…</fund>
      <report access="internal" code="R1-99"> … </report>
    </project>
    <project access="public" type="theory"> … </project>
  </res_activity>
  …..
</division>
*based on examples in [Damiani et al. 2000]
Two Potential Approaches • Approach 1: use access control rules directly • Pros: Flexible • Cons: Time-inefficient • Approach 2: fully materialized accessibility map (access control list) • Pros: Time-efficient • Cons: Space-inefficient
Our Approach • Compressed Accessibility Map (CAM) • Take advantage of structural locality of accessibility • Index accessibility information in a compressed way • Both time-efficient and space-efficient
Structural Locality of Accessibility • Data items grouped together have similar accessibility properties • Common in hierarchically-structured data like XML [Bertino et al. 1999][Damiani et al. 2000] • Declarative authorization rules based on hierarchical structures • Accessibility propagation and overriding
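Propagation and overriding can be illustrated with a minimal sketch that fully materializes an accessibility map for one user. It assumes a simple policy that is not spelled out on the slides: an element's `access` attribute overrides whatever it would inherit, and an element without the attribute inherits the level of its nearest ancestor. The shortened document, the `materialize` function, and the `allowed` parameter are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical document loosely modeled on the example slide.
DOC = """
<division name="security" access="public">
  <res_activity>
    <description access="internal"/>
    <project access="public" type="system">
      <name access="internal"/>
    </project>
  </res_activity>
</division>
"""

def materialize(elem, allowed, inherited=None, path="", out=None):
    """Map each element path to True/False for a user who may see `allowed` levels."""
    if out is None:
        out = {}
    # ASSUMPTION: an explicit access attribute overrides the inherited level.
    level = elem.get("access", inherited)
    here = f"{path}/{elem.tag}"
    out[here] = level in allowed
    for child in elem:
        # Propagation: children inherit this element's effective level.
        materialize(child, allowed, level, here, out)
    return out

amap = materialize(ET.fromstring(DOC), allowed={"public"})
```

The resulting map holds one entry per element, which is the space cost that a fully materialized approach pays and that CAM aims to avoid.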
Compressed Accessibility Map (CAM) • Essentially an accessibility index • Maintain a CAM for each user and access type • Identify “crucial” data items and store extra accessibility information on them • Other data items’ accessibility can be inferred efficiently
Identify Crucial Data Items [Figure: tree over nodes A to J. Legend: accessible node, inaccessible node. Labels shown: (d+,s+) A, (d-,s+) B.]
Ancestor Accessibility and Unit Regions • If a node is accessible, so are its ancestors • A unit region is a maximal subgraph of an XML database such that ancestor accessibility holds • Easy to partition an XML database into unit regions
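One straightforward way to find where unit regions start is sketched below. It assumes that a new region begins at the root and at every accessible node whose parent is inaccessible, since such a node would violate ancestor accessibility inside its parent's region. The function name, the dictionary-based tree encoding, and the example tree are hypothetical.

```python
def unit_region_roots(children, accessible, root):
    """Return the nodes that start a unit region (the root plus every
    accessible node whose parent is inaccessible)."""
    roots = [root]
    stack = [root]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            # An accessible child of an inaccessible parent cannot share
            # the parent's region, so it starts a region of its own.
            if accessible[child] and not accessible[node]:
                roots.append(child)
            stack.append(child)
    return roots

# Hypothetical tree: A has children B and G, B has C and D, G has H, H has I and J.
children = {"A": ["B", "G"], "B": ["C", "D"], "G": ["H"], "H": ["I", "J"]}
accessible = {"A": True, "B": True, "C": False, "D": True,
              "G": False, "H": True, "I": True, "J": False}

region_roots = unit_region_roots(children, accessible, "A")
```

In this example H is accessible while its parent G is not, so H starts a second unit region below the one rooted at A.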
Unit Region Partition [Figure: tree over nodes A to J. Legend: accessible node, inaccessible node.]
CAM for Unit Regions • Allowed labels in a unit region CAM • (d+,s+), (d-,s+) and (d-,s-) • Inference rules • Label on a node is most specific, thus overrides other inferences • Ancestor accessibility overrides descendants' inference • Nearest labeled ancestor overrides other labeled ancestors
Valid CAM [Figure: tree over nodes A to M. Legend: accessible node, inaccessible node, accessibility unknown. Labels shown: (d-,s+), (d+,s+).]
CAM Lookup Algorithm • Given a node e, look up the CAM • If e is labeled, e's accessibility is the sign of its self label s • If e has labeled descendants, e is accessible • Otherwise, get e's nearest labeled ancestor f; e's accessibility is determined by the sign of f's label d • Complexity: proportional to the product of the depth of e in the XML tree and the log of the size of the CAM
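The lookup steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the `Node` class, the preorder numbering used to test for labeled descendants by binary search, the example tree, and the fallback to "inaccessible" when no labeled ancestor exists are all hypothetical. Labels are stored as `(d, s)` pairs of booleans.

```python
from bisect import bisect_right

class Node:
    def __init__(self, name, children=()):
        self.name, self.children, self.parent = name, list(children), None
        self.pre = self.last = 0
        for c in self.children:
            c.parent = self

def number(root):
    """Assign preorder numbers; node.last is the largest number in its subtree."""
    counter = 0
    def visit(n):
        nonlocal counter
        n.pre = counter
        counter += 1
        for c in n.children:
            visit(c)
        n.last = counter - 1
    visit(root)

def lookup(e, cam, labeled_pre):
    """cam maps node name to (d, s); labeled_pre is the sorted list of
    preorder numbers of the labeled nodes."""
    if e.name in cam:                      # step 1: own label, use sign of s
        return cam[e.name][1]
    i = bisect_right(labeled_pre, e.pre)   # step 2: any labeled descendant?
    if i < len(labeled_pre) and labeled_pre[i] <= e.last:
        return True
    f = e.parent                           # step 3: nearest labeled ancestor
    while f is not None:
        if f.name in cam:
            return cam[f.name][0]          # use sign of d
        f = f.parent
    return False  # ASSUMPTION: no labeled ancestor means inaccessible

# Hypothetical unit region: A(B(D(G, H), E), C(F)).
G, H, E, F = Node("G"), Node("H"), Node("E"), Node("F")
D = Node("D", [G, H]); B = Node("B", [D, E]); C = Node("C", [F])
A = Node("A", [B, C])
number(A)
nodes = {n.name: n for n in (A, B, C, D, E, F, G, H)}
cam = {"A": (False, True), "D": (True, True)}
labeled_pre = sorted(nodes[name].pre for name in cam)
result = {name: lookup(n, cam, labeled_pre) for name, n in nodes.items()}
```

Two labels are enough here to answer for all eight nodes: B is accessible because it has the labeled descendant D, G and H inherit d+ from D, and C, E and F inherit d- from A. The sketch uses a hash map for labels, so the ancestor walk costs one probe per level; a sorted or tree-based CAM would give the logarithmic factor stated on the slide.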
Optimal Unit Region CAM • CAM with minimum size • Space-efficient • Also reduces lookup time • Build optimal CAM • Assign labels to each data node in a bottom-up way • Remove redundant labels
Redundant Labels: Induced Labels • Labels that are the same as what would be inferred from the node's ancestors' labels [Figure: tree over nodes A to E. Legend: accessible node, inaccessible node. Labels shown: (d-,s+), (d+,s+), (d-,s-), (d-,s-), (d+,s+); one label is marked redundant.]
Redundant Labels: Upward Redundant Labels • Labels that can be inferred from the node's descendants' labels [Figure: tree over nodes A to F. Legend: accessible node, inaccessible node. Labels shown: (d+,s+), (d-,s+), (d-,s+); one label is marked redundant.]
Build Optimal CAM • Assign labels in a bottom-up way • Accessible leaf (d+,s+), inaccessible leaf (d-,s-) • Internal nodes' labels are assigned according to their children's labels • Remove redundant labels • First remove induced labels • Then remove upward redundant labels
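The three phases can be sketched for a single unit region as below. The slides do not state how an internal node's label is derived from its children, so this sketch substitutes a majority-vote heuristic for d, and it is therefore not guaranteed to produce a minimum-size CAM. The two pruning passes follow one reading of the redundancy definitions above. All names and the example tree are hypothetical; labels are `(d, s)` pairs of booleans.

```python
def build_cam(children, accessible, root):
    """Return {node: (d, s)} for one unit region (heuristic sketch)."""
    labels = {}

    def assign(n):
        # Phase 1: bottom-up labeling.
        kids = children.get(n, [])
        for c in kids:
            assign(c)
        s = accessible[n]
        if not kids:
            d = s  # leaves get (d+,s+) or (d-,s-), as on the slide
        else:
            # ASSUMPTION: d+ only if n is accessible and most children are.
            d = s and 2 * sum(accessible[c] for c in kids) > len(kids)
        labels[n] = (d, s)

    def prune_induced(n, inherited_d):
        # Phase 2: drop labels equal to what the nearest kept ancestor implies.
        d, s = labels[n]
        if inherited_d is not None and d == inherited_d and s == inherited_d:
            del labels[n]
        else:
            inherited_d = d
        for c in children.get(n, []):
            prune_induced(c, inherited_d)

    def prune_upward(n, inherited_d):
        # Phase 3: drop (d, s+) labels whose s+ follows from a labeled
        # descendant and whose d matches the nearest labeled ancestor.
        label = labels.get(n)
        next_d = label[0] if label else inherited_d
        below = False
        for c in children.get(n, []):
            below |= prune_upward(c, next_d)
        if label and below and label == (inherited_d, True):
            del labels[n]
            return True
        return below or label is not None

    assign(root)
    prune_induced(root, None)
    prune_upward(root, None)
    return labels

# Hypothetical unit region: A(B(D(G, H), E), C(F)).
children = {"A": ["B", "C"], "B": ["D", "E"], "D": ["G", "H"], "C": ["F"]}
accessible = {"A": True, "B": True, "D": True, "G": True, "H": True,
              "E": False, "C": False, "F": False}
cam = build_cam(children, accessible, "A")
```

For this tree the induced-label pass leaves labels on A, B and D, and the upward pass then removes B's label because B has the labeled descendant D and its d- matches A's.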
Build Optimal CAM [Figure: tree over nodes A to M. Legend: accessible node, inaccessible node. Labels shown include (d-,s-), (d-,s+), (d+,s+) and (d?,s+).]
CAM for Multiple Unit Regions • Only need to mark the nodes (marker nodes) that start a unit region • Build an optimal CAM for each unit region • Combine the CAMs of all unit regions • Lookup algorithm is almost the same, but needs to take marker nodes into consideration • Complexity remains the same
Further Compression in CAM for Multiple Unit Regions [Figure: tree over nodes A to J. Labels and annotations shown: (d+,s+), H.]
Experimental Verification • Metric: compression ratio • Size of CAM / size of fully materialized accessibility map • Synthetic data set • Generated by IBM XML generator • Study accessibility locality's impact on compression ratio of CAM • Real data set • Large file systems with real access control data
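As a small worked instance of the metric, the sketch below measures both sizes in number of entries. That unit is an assumption, since the slides do not say how size is measured, and the function name and example data are hypothetical.

```python
def compression_ratio(cam, accessibility_map):
    """Entries in the CAM divided by entries in the fully materialized map.
    ASSUMPTION: size is counted in entries, one per labeled or listed node."""
    return len(cam) / len(accessibility_map)

# Hypothetical 8-node unit region covered by a 2-label CAM.
cam = {"A": (False, True), "D": (True, True)}
full_map = {"A": True, "B": True, "C": False, "D": True,
            "E": False, "F": False, "G": True, "H": True}

ratio = compression_ratio(cam, full_map)  # 2 / 8 = 0.25
```

A lower ratio means better compression; a ratio of 1.0 would mean the CAM stores as many entries as the fully materialized map.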
Impact of Accessibility Locality Compression ratio when accessible nodes are uniformly distributed in the XML tree
Impact of Accessibility Locality Compression ratio when accessibility locality is high
Conclusion • Compressed accessibility map as an efficient way to evaluate access control data for XML documents • Time-efficient and space-efficient • Future work • Better support for incremental CAM updates • Take advantage of commonalities of users’ access rights and globally optimize CAM