Accelerating Multilevel Secure Database Queries using P-Tree Technology

Imad Rahal and Dr. William Perrizo
Computer Science Department
North Dakota State University

Outline
• Introduction
  1- What are MLS/DBSs?
  2- The Mandatory Access Control (MAC) policy
• Attempts
  • The SeaView (Secure Data View) model and the PRISM model [6]
  • PRISM is based on SeaView but eliminates spurious tuples during recovery
• Deficiencies of SeaView/PRISM (mainly speed)
• Query acceleration using P-trees
  • Replace the recovery data structure of PRISM
  • Advantages: time efficiency

What are MLS/DBSs?
• Database systems (DBSs) that implement secure access control policies to protect their data
• Each user or process is called a subject
• Each data item (column value or tuple) is called an object
• The security hardware and software reside in a TCB (Trusted Computing Base), sometimes referred to as the Reference Monitor or Security Kernel

R(A1, C1, A2, C2, …, An, Cn, TC) is a multilevel relation or view
• The Ai's are fields
• The Ci's are their respective sensitivity levels (which form a lattice)
• By convention, (A1, C1) is the apparent key
• The apparent key alone is not unique, but it becomes a key when combined with all the security fields
• (A1, C1, C2, …, Cn) is the primary key
• TC is the classification level of the tuple
• Notice that
  • TC = highest Ci over all i
  • C1 = lowest Ci over all i

The Mandatory Access Control (MAC) policy
• Each subject has a clearance level
• Each object has a sensitivity level
• Bell-LaPadula restrictions:
  • Simple Security Property for READs (read down: a subject can read at its own level or below)
  • *-Property for WRITEs (write up: a subject can write at its own level or above)
• X (a subject) dominates Y (an object) means that X's classification level equals or exceeds Y's classification level

A simple example is the DoD classification levels (in descending order):
1- Top Secret (TS)
2- Secret (S)
3- Confidential (C)
4- Unclassified (U)
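The read-down and write-up rules can be sketched in a few lines of Python. This is a minimal illustration that assumes the four totally ordered DoD levels listed above; a general security lattice would need a partial-order comparison instead. The names `Level`, `dominates`, `can_read`, `can_write` and `tuple_class` are hypothetical and chosen only for this sketch.

```python
from enum import IntEnum


class Level(IntEnum):
    """DoD levels in ascending order (assumed totally ordered here)."""
    U = 0   # Unclassified
    C = 1   # Confidential
    S = 2   # Secret
    TS = 3  # Top Secret


def dominates(a, b):
    """a dominates b when a's level equals or exceeds b's level."""
    return a >= b


def can_read(subject, obj):
    """Read down: the subject's clearance must dominate the object."""
    return dominates(subject, obj)


def can_write(subject, obj):
    """Write up: the object's level must dominate the subject's clearance."""
    return dominates(obj, subject)


def tuple_class(attribute_levels):
    """TC is the highest Ci among the tuple's attribute classifications."""
    return max(attribute_levels)
```

For instance, a Secret subject may read a Confidential object but may not write to it, while a Confidential subject may write to a Secret object but may not read it.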

Attempts
• SeaView (Secure Data View) model
  • Sponsored by RADC
  • Joint effort by SRI, Gemini and Oracle
  • Objective: build an A1 (very secure) MLS/DBMS
• The PRISM model improves on SeaView by automatically eliminating spurious tuples during recovery, using a bit-vector approach to mask them
• Some other models
  • LDV (Lock Data View)
  • ASD (Advanced Secure DBMS)

SeaView model
• Multilevel relations exist at the logical level only (as views of single-level relations, which are stored and managed by the TCB)
• The decomposition algorithm creates single-level relations from a multilevel relation
• The recovery algorithm creates an output multilevel relation from a set of physically stored single-level relations

Decomposition algorithm
• Let A1 = key and Ai = any non-key attribute
• Let x denote a classification of A1
• Let y denote a classification of Ai
• For every x, create R_{A1,x}(A1), or just R_{A1,x}
  • I.e., for the key, we partition vertically by attribute and horizontally by security level
• For every y with x ≤ y, create R_{Ai,x,y}(A1, Ai), or just R_{Ai,x,y}
  • I.e., for non-keys, we partition vertically by attribute and key, and horizontally by attribute and key classification level

Example: the multilevel relation Missiles (Name* is the apparent key; each attribute value is followed by its classification, C)

Name*  C   Range  C   Speed  C   TC
MT1    U   350    U   750    C   C
NT5    U   450    U   750    U   U
NT5    U   480    C   1000   C   C
FD7    C   450    C   900    C   C

Resulting decomposed single-level relations:

R_{name,u}(Name):            MT1; NT5
R_{name,c}(Name):            FD7
R_{range,u,u}(Name, Range):  (MT1, 350); (NT5, 450)
R_{range,u,c}(Name, Range):  (NT5, 480)
R_{range,c,c}(Name, Range):  (FD7, 450)
R_{speed,u,u}(Name, Speed):  (NT5, 750)
R_{speed,u,c}(Name, Speed):  (MT1, 750); (NT5, 1000)
R_{speed,c,c}(Name, Speed):  (FD7, 900)
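The decomposition of the Missiles example can be reproduced with a short Python sketch. It is an illustration only: the tuple layout, the dictionary-of-sets representation and the names `MISSILES`, `ATTRS` and `decompose` are assumptions made for this example, not the SeaView implementation.

```python
from collections import defaultdict

# Each row: (name, name_class, range, range_class, speed, speed_class)
MISSILES = [
    ("MT1", "U", 350, "U", 750, "C"),
    ("NT5", "U", 450, "U", 750, "U"),
    ("NT5", "U", 480, "C", 1000, "C"),
    ("FD7", "C", 450, "C", 900, "C"),
]
ATTRS = ["range", "speed"]  # non-key attributes, in row order


def decompose(rows):
    """Split a multilevel relation into single-level relations.

    Returns (key_rels, attr_rels):
      key_rels[x]            -> set of keys classified at level x
      attr_rels[(attr, x, y)] -> set of (key, value) pairs with the key at
                                 level x and the attribute value at level y
    """
    key_rels = defaultdict(set)
    attr_rels = defaultdict(set)
    for name, x, *rest in rows:
        key_rels[x].add(name)
        for i, attr in enumerate(ATTRS):
            value, y = rest[2 * i], rest[2 * i + 1]
            attr_rels[(attr, x, y)].add((name, value))
    return dict(key_rels), dict(attr_rels)


key_rels, attr_rels = decompose(MISSILES)
```

Using sets means that the key NT5, which appears in two multilevel tuples, is stored only once in the key relation at level U.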

Deficiencies of the SeaView/PRISM models
• Deficiencies of the SeaView model (in its recovery algorithm):
  • Creation of spurious tuples (due to polyinstantiation)
  • Space cost of temporary tables
  • Time cost of unions
  • Time cost of joins
• PRISM solves the spurious-tuple problem but still suffers from the time costs

Recovery acceleration using P-trees
• Based on the SeaView/PRISM model
• Uses its decomposition algorithm
• New recovery algorithm using P-tree technology (given a query, it creates an output multilevel relation from the single-level relations)
• The main contribution is in addressing the space and time cost problems

Recovery algorithm
• For every relation R_{Ai,x,y} (the single-level relation containing all entries from the multilevel relation whose keys are at classification level x and whose Ai values are at classification level y), excluding base relations (those containing the key only), create a P-tree, P_{Ai,x,y}, denoting the presence or absence of the keys at level x.
• The recovery algorithm is closely analogous to the PRISM solution but addresses time costs (and, to some extent, space costs; the space savings from P-tree compression are the main reason for the time savings).
• Next we introduce P-trees.

bSQ format
• Split each numeric attribute into separate bit files (one for each bit position)
• Reasons for using the bSQ format:
  • Different bits contribute to the value differently
  • It facilitates the representation of a precision hierarchy (from 1-bit precision upwards)
  • It facilitates the creation of an efficient data structure, the P-tree, along with the P-tree algebra and the T-cube

The “tabular” formats (inverted lists)
• BSQ and bSQ are “tabular” formats
  • BSQ consists of a separate table for each feature attribute
  • bSQ consists of a separate table for each bit
• One can view it this way:
  • The data set is initially one relation or table, R(K1, …, Kk, A1, …, An), where K1, …, Kk are structure attributes and the Ai are feature attributes
  • The structure attributes of a 2-D image are the X, Y coordinates of the pixels (rows)
  • The structure attribute of a relation is a 1-D structure consisting of the key
  • In BSQ, we separate each feature into its own file (similar to the Decomposition Storage Model (DSM), Copeland et al., SIGMOD 85, pp. 268-279)
  • In bSQ, we separate each bit of each feature into its own file, with a consistent structural order assumed (similar to the Bit Transpose File (BTF) model, Wong et al., VLDB 85, pp. 448-457)
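Splitting an attribute into bit files can be sketched as follows. This is a minimal Python illustration that assumes unsigned integer values of a fixed width and keeps each bit file as an in-memory list; the function names `to_bsq` and `from_bsq` are hypothetical.

```python
def to_bsq(values, nbits):
    """Split one numeric attribute into nbits bit files.

    File 0 holds the most significant bit of every value, file nbits-1
    the least significant bit. Row order is the same in every file.
    """
    return [[(v >> (nbits - 1 - b)) & 1 for v in values] for b in range(nbits)]


def from_bsq(bit_files):
    """Rebuild the attribute values from their bit files (lossless)."""
    nbits = len(bit_files)
    values = [0] * len(bit_files[0])
    for b, bits in enumerate(bit_files):
        for i, bit in enumerate(bits):
            values[i] |= bit << (nbits - 1 - b)
    return values


files = to_bsq([5, 2, 7, 0], nbits=3)
# files[0] is the high-order bit file, files[2] the low-order one
```

Reading only the first few files gives a coarser view of each value, which is one way to picture the precision hierarchy mentioned above.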

Peano Count Tree (P-tree)
• A basic P-tree represents a bSQ file in a recursive, segmented (quadrant-by-quadrant in images) arrangement
• The basic P-trees provide a compressed, lossless, easily manipulated representation of the original data

An example P-tree for one bSQ file of an image

64-pixel bSQ raster image file (a 64-tuple bSQ file):

1 1 1 1 1 1 0 0
1 1 1 1 1 0 0 0
1 1 1 1 1 1 0 0
1 1 1 1 1 1 1 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
0 1 1 1 1 1 1 1

P-tree (each node holds the count of 1-bits in its quadrant; children are listed in Peano order; pure quadrants are not subdivided):

55
  16
  8
    3 → 1110
    0
    4
    1 → 0010
  15
    4
    4
    3 → 1101
    4
  16

Terms:
• Peano or Z-ordering
• Pure (pure-1/pure-0) quadrant
• Root count
• Level
• Fan-out
• QID (quadrant ID)
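The counts in this example can be checked with a small recursive builder. The sketch below assumes a square raster whose side is a power of two and represents each node as a `(count, children)` pair, with `children` set to `None` for pure quadrants and single pixels; this representation and the name `build_ptree` are choices made for illustration, not the authors' data structure.

```python
RASTER = [
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1],
]


def build_ptree(img, row=0, col=0, size=None):
    """Return (count, children) for the size x size quadrant at (row, col)."""
    if size is None:
        size = len(img)
    if size == 1:
        return (img[row][col], None)
    half = size // 2
    # Peano (Z) order: upper-left, upper-right, lower-left, lower-right
    kids = [build_ptree(img, row + dr, col + dc, half)
            for dr in (0, half) for dc in (0, half)]
    count = sum(k[0] for k in kids)
    if count == 0 or count == size * size:
        return (count, None)  # pure-0 or pure-1 quadrant: stop subdividing
    return (count, kids)


tree = build_ptree(RASTER)
root_count = tree[0]                      # number of 1-bits in the whole file
level1_counts = [k[0] for k in tree[1]]   # counts of the four 4x4 quadrants
```

Because pure quadrants are stored as a single count, large uniform regions cost one node each, which is where the compression comes from.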

An example of a P-tree: quadrant IDs
• Quadrants are numbered 0, 1, 2, 3 in Peano (Z) order at each level; in the example (root count 55), the level-1 quadrants 0, 1, 2, 3 have counts 16, 8, 15, 16
• Example QID: the pixel at (7, 1) = (111, 001) in binary has QID 2.2.3 = 10.10.11, obtained by pairing the row and column bits level by level
• Terms: Peano or Z-ordering; pure (pure-1/pure-0) quadrant; root count; level; fan-out; QID (quadrant ID)
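The quadrant ID of a pixel can be computed by pairing row and column bits from the most significant bit down. The helper below is a hypothetical sketch (the name `qid` and the list-of-digits return type are assumptions for this example); it reproduces the 2.2.3 example for the pixel at (7, 1).

```python
def qid(row, col, nbits):
    """Quadrant ID of pixel (row, col) in a 2**nbits x 2**nbits image.

    At each level the digit is 2*row_bit + col_bit, so digit 0 is the
    upper-left quadrant, 1 upper-right, 2 lower-left, 3 lower-right.
    """
    digits = []
    for b in range(nbits - 1, -1, -1):
        row_bit = (row >> b) & 1
        col_bit = (col >> b) & 1
        digits.append(2 * row_bit + col_bit)
    return digits


path = qid(7, 1, nbits=3)          # [2, 2, 3]
label = ".".join(map(str, path))   # "2.2.3"
```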

P-tree variation: PM-tree
• The Peano Mask tree (PM-tree) uses a mask instead of a count
• 1 denotes pure-1, 0 denotes pure-0 and m denotes mixed
• It provides an efficient way of ANDing
• Predicate tree: 1 iff the predicate is true for the quadrant
  • E.g., Pure1-tree (predicate: the quadrant is all 1's)
• Most compact form (all are lossless)

PM-tree for the 64-bit example file (children in Peano order):

m
  1
  m
    m → 1110
    0
    1
    m → 0010
  m
    1
    1
    m → 1101
    1
  1

P-tree algebra
• AND
• OR
• Complement
• Other (XOR, etc.)

P-tree:

55
  16
  8
    3 → 1110
    0
    4
    1 → 0010
  15
    4
    4
    3 → 1101
    4
  16

Complement:

9
  0
  8
    1 → 0001
    4
    0
    3 → 1101
  1
    0
    0
    1 → 0010
    0
  0
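The complement of a count-based P-tree replaces each count with the number of cells in that quadrant minus the count. The sketch below hard-codes the example tree as nested `(count, children)` pairs, an assumed representation used only for illustration, and checks that the complement has root count 9.

```python
def bits(s):
    """Four single-pixel leaves from a string such as '1110'."""
    return [(int(ch), None) for ch in s]


# The example P-tree: (count, children), children in Peano order,
# children = None for pure quadrants and single pixels.
PTREE = (55, [
    (16, None),
    (8, [(3, bits("1110")), (0, None), (4, None), (1, bits("0010"))]),
    (15, [(4, None), (4, None), (3, bits("1101")), (4, None)]),
    (16, None),
])


def complement(node, cells):
    """Complement a P-tree node that covers `cells` pixels."""
    count, kids = node
    if kids is None:
        return (cells - count, None)
    return (cells - count, [complement(k, cells // 4) for k in kids])


comp = complement(PTREE, 64)
```

Applying `complement` twice returns the original tree, and the structure of the tree (which quadrants are pure and which are mixed) never changes.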

P-tree ANDing operation

PM-tree1:

m
  1
  m
    m → 1110
    0
    1
    m → 0010
  m
    1
    1
    m → 1101
    1
  1

PM-tree2:

m
  1
  0
  m
    1
    1
    1
    m → 0100
  0

Result:

m
  1
  0
  m
    1
    1
    m → 1101
    m → 0100
  0

Depth-first pure-1 path codes:

PM-tree1: 0 100 101 102 12 132 20 21 220 221 223 23 3
PM-tree2: 0 20 21 22 231

PM-tree1      &  PM-tree2  →  Result
0                0            0
20               20           20
21               21           21
220 221 223      22           220 221 223
23               231          231
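The AND of the two example PM-trees can be reproduced with a short recursive function. This sketch works directly on the mask trees rather than on the path codes, and it assumes a node is either the string "1" (pure-1), the string "0" (pure-0), or a list of four children (mixed); that encoding and the name `pm_and` are choices made for this illustration.

```python
def pm_and(a, b):
    """AND two PM-trees of the same shape and depth."""
    if a == "0" or b == "0":
        return "0"              # a pure-0 operand forces pure-0
    if a == "1":
        return b                # a pure-1 operand leaves the other unchanged
    if b == "1":
        return a
    kids = [pm_and(x, y) for x, y in zip(a, b)]
    if all(k == "0" for k in kids):
        return "0"              # collapse a quadrant that became pure-0
    return kids


T1 = ["1",
      [list("1110"), "0", "1", list("0010")],
      ["1", "1", list("1101"), "1"],
      "1"]
T2 = ["1",
      "0",
      ["1", "1", "1", list("0100")],
      "0"]

result = pm_and(T1, T2)
# Subtrees returned unchanged are shared with the inputs, not copied.
```

Whole quadrants are resolved as soon as one operand is pure, so only quadrants that are mixed in both trees need to be descended into.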

Basic, value and tuple P-trees
• Basic P-trees (a Pure1-tree, i.e., a predicate tree, for the target bit of the target attribute)
  • E.g., P11, P12, …, P18, P21, …, P28, …, P71, …, P78 (subscripts: target attribute, target bit position)
• Value P-trees (predicate: the quadrant is purely the target value in the target attribute), built by ANDing basic P-trees
  • E.g., P1,5 = P1,101 = P11 AND P12’ AND P13 (subscripts: target attribute, target value)
• Tuple P-trees (predicate: the quadrant is purely the target tuple), built by ANDing value P-trees
  • E.g., P(1, 2, 7) = P(001, 010, 111) = P1,001 AND P2,010 AND P3,111
• Cube P-trees (predicate: the quadrant is purely in the target cube, a product of intervals), built by ANDing and ORing value P-trees
  • E.g., P([1..3], , [0..2]) = (P1,1 OR P1,2 OR P1,3) AND (P3,0 OR P3,1 OR P3,2)
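The way value and tuple P-trees are composed from basic ones can be illustrated on plain, uncompressed bit vectors. The sketch below is a simplification: it uses Python lists in place of P-trees, so it shows the AND/complement logic only and none of the compression. All function names are hypothetical.

```python
def bit_planes(values, nbits):
    """Basic bit vectors of one attribute; plane 0 is the most significant bit."""
    return [[(v >> (nbits - 1 - b)) & 1 for v in values] for b in range(nbits)]


def value_mask(values, target, nbits):
    """1 where the attribute equals target.

    AND each bit plane, complemented wherever the target bit is 0,
    e.g. target 101 gives plane1 AND plane2' AND plane3.
    """
    mask = [1] * len(values)
    for b, plane in enumerate(bit_planes(values, nbits)):
        want = (target >> (nbits - 1 - b)) & 1
        mask = [m & (p if want else 1 - p) for m, p in zip(mask, plane)]
    return mask


def tuple_mask(columns, targets, nbits):
    """1 where every attribute equals its target: AND of the value masks."""
    mask = [1] * len(columns[0])
    for col, target in zip(columns, targets):
        mask = [m & v for m, v in zip(mask, value_mask(col, target, nbits))]
    return mask


attr1 = [5, 2, 5, 7]
mask_5 = value_mask(attr1, 0b101, nbits=3)   # rows where attribute 1 equals 5
```

With real P-trees each `&` and complement would be the corresponding P-tree operation, applied quadrant by quadrant instead of row by row.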

Using P-trees for MLS data (key = structure attribute)

If we have the following query:
• “Select name, dev-by, length from R where range  35”

Figure: Time improvements to the recovery process using P-trees (PRISM vs. P-tree; x-axis: number of records, in thousands, from 100 to 1700)

Advantages
• Acceleration results from operating on P-trees and restricting I/O to only those fields that are involved in the output of the query
• Space efficiency due to P-tree compression
• Correct output (no spurious tuples in the output table)
