Building Trustworthy Semantic Webs: Multilevel Secure Data Management Implications

Building Trustworthy Semantic Webs
Dr. Bhavani Thuraisingham, The University of Texas at Dallas
Multilevel Secure Data Management and Its Implications for Multilevel Semantic Web Technologies
October 27, 2008

Outline
• What is an MLS/DBMS?
• Summary of Developments
• Challenges
• Data Models
• Implications for semantic web

Overview of MLS/DBMS
• What is an MLS/DBMS?
  • Users are cleared at different security levels
  • Data in the database is assigned different sensitivity levels, giving a multilevel database
  • Users share the multilevel database
  • The MLS/DBMS is the software that ensures that users obtain only information at or below their level
  • In general, a user reads at or below their level and writes at their level
• Need for an MLS/DBMS
  • Operating systems control access to files, a coarser granularity than a database requires
  • A database stores relationships between data
  • Content-based, context-based, and dynamic access control are needed
  • Traditional operating system access control on files is not sufficient
  • DBMSs need multilevel access control
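The read and write rules on this slide can be sketched in a few lines. This is a minimal illustration, assuming a simple totally ordered set of levels; the `Level` names and the `can_read` and `can_write` helpers are hypothetical and do not come from any particular MLS/DBMS product.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical, totally ordered security levels (illustration only)."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


def can_read(user_level: Level, data_level: Level) -> bool:
    # A user reads at or below their own level.
    return data_level <= user_level


def can_write(user_level: Level, data_level: Level) -> bool:
    # A user writes only at their own level.
    return data_level == user_level
```

A Secret user can read Unclassified data but cannot write to it, while an Unclassified user cannot read Secret data at all.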

Summary of Developments
• Early Efforts, 1975 – 1982; example: Hinke-Shafer approach
• Air Force Summer Study, 1982
• Research Prototypes (Integrity Lock, SeaView, LDV, etc.), 1984 – Present
• Trusted Database Interpretation, published 1991
• Commercial Products, 1988 – Present

Taxonomy for MLS/DBMSs
• Integrity Lock Architecture: trusted filter; untrusted back-end and untrusted front-end. The filter computes a checksum based on data content and security level. The checksum is recomputed when the data is retrieved.
• Operating System Providing Access Control / Single Kernel: multilevel data is partitioned into single-level files. The operating system controls access to the files.
• Extended Kernel: kernel extensions for functions such as inference, aggregation, and constraint processing.
• Trusted Subject: the DBMS provides access control to its own data, such as relations, tuples, and attributes.
• Distributed: data is partitioned according to security levels. In the partitioned approach, data is not replicated and there is one DBMS per level. In the replicated approach, lower-level data is replicated at the higher-level databases.
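The integrity lock idea (a checksum computed by the trusted filter over both the data content and its security level, then recomputed on retrieval) might be sketched as follows. This assumes a keyed hash (HMAC-SHA256) stands in for the checksum and a plain dictionary stands in for the untrusted back-end; the key, the function names, and the storage layout are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical key, assumed to be held only by the trusted filter.
FILTER_KEY = b"demo-key-held-only-by-the-trusted-filter"


def seal(value: str, level: str) -> str:
    """Compute the checksum over the data content and the security level."""
    # Simplification: a real design would encode the fields unambiguously.
    message = f"{level}|{value}".encode("utf-8")
    return hmac.new(FILTER_KEY, message, hashlib.sha256).hexdigest()


def store(backend: dict, key: str, value: str, level: str) -> None:
    # The untrusted back-end stores the data, its label, and the checksum.
    backend[key] = (value, level, seal(value, level))


def retrieve(backend: dict, key: str) -> tuple:
    # The filter recomputes the checksum and rejects any mismatch.
    value, level, checksum = backend[key]
    if not hmac.compare_digest(checksum, seal(value, level)):
        raise ValueError("integrity check failed: data or label was altered")
    return value, level
```

If the back-end relabels a Secret value as Unclassified, the recomputed checksum no longer matches and `retrieve` raises an error instead of releasing the data.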

Integrity Lock

Operating System Providing Mandatory Access Control

Extended Kernel

Trusted Subject

Distributed Approach - I

Distributed Approach II

Some Challenges: Inference Problem
• Inference is the process of forming conclusions from premises
• If the conclusions are unauthorized, inference becomes a problem
• Inference problem in a multilevel environment
• The aggregation problem is a special case of the inference problem: a collection of data elements is Secret but the individual elements are Unclassified
• Association problem: attributes A and B taken together are Secret; individually they are Unclassified
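One way to picture the association problem is as a constraint over sets of attributes. The sketch below assumes two levels and a made-up pair of attributes (`ship_name`, `destination`) that are Unclassified individually but Secret together; the constraint table and the `result_level` helper are hypothetical.

```python
LEVELS = {"Unclassified": 0, "Secret": 1}

# Hypothetical attribute labels: each attribute alone is Unclassified.
ATTRIBUTE_LEVEL = {"ship_name": "Unclassified", "destination": "Unclassified"}

# Hypothetical association constraint: the pair taken together is Secret.
ASSOCIATION_CONSTRAINTS = [
    (frozenset({"ship_name", "destination"}), "Secret"),
]


def result_level(attributes: set) -> str:
    """Return the level of a query result that projects these attributes."""
    # Start from the highest level among the individual attributes.
    level = max(
        (ATTRIBUTE_LEVEL[a] for a in attributes),
        key=LEVELS.__getitem__,
        default="Unclassified",
    )
    # Raise the level if the result contains a constrained combination.
    for combo, combo_level in ASSOCIATION_CONSTRAINTS:
        if combo <= attributes and LEVELS[combo_level] > LEVELS[level]:
            level = combo_level
    return level
```

A query that projects only `ship_name` stays Unclassified, while one that projects both attributes is labeled Secret. This does not address the harder case in which a user assembles the association from several separate queries.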

Some Challenges: Polyinstantiation
• Mechanism to avoid certain signaling channels
• Also supports cover stories
• Example: John and James have different salaries at different levels
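The salary example can be sketched with a relation whose actual key is the pair (name, level), so the same name may appear once per level. The salary figures below are made up for illustration, and `visible_salary` is a hypothetical helper.

```python
LEVELS = {"Unclassified": 0, "Secret": 1}

# The apparent key is the name; the actual key is (name, level).
# All salary figures are invented for this sketch.
EMPLOYEES = {
    ("John", "Unclassified"): 20000,
    ("John", "Secret"): 60000,
    ("James", "Unclassified"): 30000,
    ("James", "Secret"): 70000,
}


def visible_salary(name: str, user_level: str):
    """Return the salary from the highest-level tuple the user may read."""
    candidates = [
        (LEVELS[level], salary)
        for (n, level), salary in EMPLOYEES.items()
        if n == name and LEVELS[level] <= LEVELS[user_level]
    ]
    # Tuples sort by level first, so max() picks the highest readable level.
    return max(candidates)[1] if candidates else None
```

An Unclassified user sees the Unclassified value (which can serve as a cover story), while a Secret user sees the Secret value. Because an Unclassified insert never collides with a Secret tuple, the low-level user gets no signal that the higher-level tuple exists.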

Some Challenges: Covert Channel
• Database transactions manipulate data locks and covertly pass information
• Two transactions T1 and T2; T1 operates at the Secret level and T2 operates at the Unclassified level
• Relation R is classified at the Unclassified level
• T1 requests a read lock on R and T2 requests a write lock on R
• T1 and T2 can manipulate when they request locks and signal one bit of information for each attempt; over time, T1 could covertly send sensitive information to T2
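The lock-based channel can be simulated deterministically. This is a toy model, assuming a single relation R, a lock manager that refuses a write lock while a read lock is held, and perfectly synchronized rounds; `LockTable` and `send_bits` are hypothetical names, and real channels are noisier and timing-dependent.

```python
class LockTable:
    """Toy lock manager for one relation R (illustration only)."""

    def __init__(self) -> None:
        self.read_locked = False

    def acquire_read(self) -> None:
        self.read_locked = True

    def release_read(self) -> None:
        self.read_locked = False

    def try_write(self) -> bool:
        # A write lock is refused while a read lock is held.
        return not self.read_locked


def send_bits(secret_bits: list) -> list:
    """T1 (Secret) signals each bit; T2 (Unclassified) observes the lock."""
    locks = LockTable()
    received = []
    for bit in secret_bits:
        # T1: hold the read lock on R to signal 1, leave R free to signal 0.
        if bit:
            locks.acquire_read()
        # T2: a refused write-lock request reads as 1, a granted one as 0.
        received.append(0 if locks.try_write() else 1)
        # End of round: T1 releases its lock (a no-op if none was held).
        locks.release_read()
    return received
```

Calling `send_bits([1, 0, 1, 1, 0])` returns the same bit sequence on the Unclassified side, even though T1 never writes any data that T2 is allowed to read.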

Multilevel Secure Data Model: Classifying Databases

Multilevel Secure Data Model: Classifying Relations

Multilevel Secure Data Model: Classifying Attributes/Columns

Multilevel Secure Data Model: Classifying Tuples/Rows

Multilevel Secure Data Model: Classifying Elements

Multilevel Secure Data Model: Classifying Views

Multilevel Secure Data Model: Classifying Metadata

Status and Directions
• MLS/DBMSs have been designed and developed for various kinds of database systems, including object systems, deductive systems, and distributed systems
• Provides an approach to host secure applications
• Can use the principles to design privacy-preserving database systems
• Challenge is to host emerging secure applications, including semantic web technologies (MLS XML, MLS RDF, etc.)

Multilevel Semantic Web Technologies
• Take RDF as an example
• What do we classify for RDF? Triples?
• Security properties for RDF Schema
• Query modification with SPARQL
• Inference problem based on RDF data and reasoning
• Design of the system – Trusted subject?
• Extra credit assignment: Design of Multilevel RDF Data Store
• Potential opportunity for RA position for Spring semester – to implement the design
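If triples are the unit of classification, one possible reading of query modification is that every query is silently restricted to triples labeled at or below the user's level. The sketch below uses plain Python tuples rather than a real RDF store or SPARQL engine; the triples, their labels, and the `query` function are all hypothetical, and this is only one of several designs the slide leaves open.

```python
LEVELS = {"Unclassified": 0, "Secret": 1}

# Each triple carries its own label; the data and labels are made up.
LABELED_TRIPLES = [
    (("ex:John", "ex:worksFor", "ex:AgencyX"), "Unclassified"),
    (("ex:John", "ex:assignedTo", "ex:ProjectY"), "Secret"),
]


def query(pattern: tuple, user_level: str) -> list:
    """Match a (subject, predicate, object) pattern, with None as a
    wildcard, keeping only triples at or below the user's level."""

    def matches(triple: tuple) -> bool:
        return all(p is None or p == t for p, t in zip(pattern, triple))

    # The level test is the extra condition added to the user's query.
    return [
        triple
        for triple, level in LABELED_TRIPLES
        if matches(triple) and LEVELS[level] <= LEVELS[user_level]
    ]
```

The same pattern `("ex:John", None, None)` returns one triple for an Unclassified user and two for a Secret user. In a SPARQL setting the equivalent step would be rewriting the query before execution, and the inference problem remains: reasoning over the visible triples may still reveal hidden ones.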
