This paper discusses the security implications of semantically enhanced semi-structured data and proposes a new security framework. It explores the use of XML language, semantic tools, and the potential security threats posed by increased connectivity and extensive XML support. The paper also presents solutions to address these new security challenges, including XML views, access control mechanisms, and global disclosure control.
Integrated Security Framework for Semantically Enhanced Semi-Structured Data Andrei G. Stoica and Csilla Farkas, Department of Computer Science & Engineering, University of South Carolina
Overview • Machine understandable data semantics: • domain and context definition • ontologies • metadata • What are the security implications? • New security mechanisms? • New security paradigm?
XML Language • High-level application messaging • Used in storage applications • reduces computation overhead • uniform access • Base for semantically oriented languages (RDF, DAML) • Increased popularity
Semantic Tools • The information process is augmented with a semantic layer. • Infrastructure allows computers to reason about data meaning. • Computers exchange information transparently on behalf of the user. • Implications • Intelligent high-volume processing
Security Setup Increased Connectivity + Extensive XML support + Semantic Infrastructure = New Security Threats • Established Security Models do not address this dimension: • Indirect disclosure • Undesired Inference • Available inference models difficult to transfer from database security • open domains
Related Work • Document Instance Security • Digital Signatures • Encryption • XML Access Control Models • Security labels assignment • Multi-level XML Security • Extensions from Database Security
Problems? • Semantic correlations ignored • Inconsistent reply • Indirect unauthorized disclosure
Example
<medicalFiles> UC
  <countyRec> S
    <patient> S
      <name>John Smith</name> UC
      <phone>111-2222</phone> S
    </patient>
    <physician>Jim Dale</physician> UC
  </countyRec>
  <milBaseRec> TS
    <patient> S
      <name>Harry Green</name> UC
      <phone>333-4444</phone> S
    </patient>
    <physician>Joe White</physician> UC
    <milTag>MT78</milTag> TS
  </milBaseRec>
</medicalFiles>
Figure: tree representation of the document (medicalFiles with children countyRec and milBaseRec) and the view over UC data.
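The labelled document in the example can be used to see what a purely label-based view returns. The sketch below is an illustration, not the authors' method: it assumes the ordering UC < S < TS, moves each label into a hypothetical `level` attribute so the document parses, and uses a hypothetical helper `visible_leaves` that keeps every leaf element at or below the requester's clearance.

```python
import xml.etree.ElementTree as ET

# Assumed ordering of the slide's labels: UC < S < TS.
LEVELS = {"UC": 0, "S": 1, "TS": 2}

# The slide's document; the "level" attribute is a hypothetical encoding
# of the label that the slide prints next to each element.
DOC = """
<medicalFiles level="UC">
  <countyRec level="S">
    <patient level="S">
      <name level="UC">John Smith</name>
      <phone level="S">111-2222</phone>
    </patient>
    <physician level="UC">Jim Dale</physician>
  </countyRec>
  <milBaseRec level="TS">
    <patient level="S">
      <name level="UC">Harry Green</name>
      <phone level="S">333-4444</phone>
    </patient>
    <physician level="UC">Joe White</physician>
    <milTag level="TS">MT78</milTag>
  </milBaseRec>
</medicalFiles>
"""

def visible_leaves(root, clearance):
    """Return (tag, text) for every leaf labelled at or below the clearance."""
    limit = LEVELS[clearance]
    result = []
    for elem in root.iter():  # document order
        is_leaf = len(elem) == 0
        if is_leaf and LEVELS[elem.get("level")] <= limit:
            result.append((elem.tag, elem.text.strip()))
    return result

root = ET.fromstring(DOC.strip())
uc_view = visible_leaves(root, "UC")
print(uc_view)
```

This naive filter looks only at individual labels. It says nothing about how the UC leaves should be arranged once their S and TS ancestors are hidden, which is the kind of gap the deck's "Problems?" slide points at.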
Inference • A set of data plus associations is used to derive the target data • Traditionally a human task • In the limit, any target can be inferred given enough related data and metadata.
Problems? • If the inference target is confidential information, the result is a security violation.
Example • Simulation Exploitation Using Open Source Information: • Objective: the US Government would like to share limited simulation software with friendly countries. • Can this software be used to explore the capabilities of US weaponry? • Can sufficient information be found from public sources to create such a simulation?
Example • Findings: • Most of the information needed for the simulation was available on the Internet. • Needed human aid to combine available information
Proposed Solution What do we do? • XML Views Considering Semantic Dimension. • do not disclose more information (including structure of the document). • cover stories. • Web Inference • make sure the information we publish does not lead to our confidential data.
Proposed Solution • XML Access Control • semantically consistent reply • prevent illegal inference from query reply (cover stories). • Global Disclosure Control • detect and prevent a set of undesired inferences using public Internet data in correlation with public local data
Security Engine (architecture figure) • Component labels: Local Organization; Access Control; SecView; Global Data Privacy Control; Oxsegin; Interface Module; Local XML Database; Local Ontology; Corrective Measures; Local Data; Internet Data • Flow labels: Request; Return; Update; Upload
Secure XML Views • Builds secure and semantically consistent partial views at a single security level • Minimum Semantic Conflict Graph (MSCG) • avoids semantic conflicts • Multi-Plane DTD Graph (MPG) • structural relationships between tags • Andrei Stoica, Csilla Farkas. “Secure XML Views”, In Proc. of IFIP 2002
Example: DTD Graph and MSCG (figure) • Node labels: medicalFiles; countyRec; milBaseRec; emrgRec; patient; physician; name; phone; milTag
Oxsegin (figure) • Component labels: Local Classified Database; Local Public Database; Internet Databases; Inference Engine; Security Violations; Corrective Measures
Corrective Measures • Local Public Data • Remove information • Release misleading information • Internet Public Data • Release misleading information • Target desirable inference results
Inference Engine (figure) • Component labels: Replicated Data Inf.; Correlated Data Inf.; Public+Local Database; Local Classified Database; Inf. Struct; Ontology; Violation Pointers; Prob. Coef.
Replicated Data Inference • Identifies replicated information under different security classifications • Violation Pointer = similar units of data at different security levels • Inference is guided by inference structures built on ontology concept hierarchy • Andrei Stoica, Csilla Farkas. “Ontology guided XML Security Engine”, In Journal of Intelligent Information Systems, to appear.
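The idea of a violation pointer, similar units of data held at different security levels, can be sketched in a few lines. This is only an illustration under stated assumptions: plain string similarity from Python's `difflib` stands in for the ontology-guided inference structures that the slide describes, and the function name `violation_pointers`, the identifiers, the sample values, and the 0.9 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def violation_pointers(classified, public, threshold=0.9):
    """Pair each classified unit with every sufficiently similar public unit.

    Returns (classified_id, public_id, score) tuples.
    """
    pointers = []
    for c_id, c_text in classified.items():
        for p_id, p_text in public.items():
            score = similarity(c_text, p_text)
            if score >= threshold:
                pointers.append((c_id, p_id, round(score, 2)))
    return pointers

# Hypothetical data units; identifiers and values are made up.
classified = {"C1": "PAC-2 radar frequency"}
public = {"P1": "PAC-2 Radar Frequency", "P2": "air show schedule"}

pointers = violation_pointers(classified, public)
print(pointers)
```

An ontology-guided engine would also compare units whose wording differs but whose concepts match, which string similarity cannot do.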
Replicated Data Inference (figure) • Panels: Inf. Tree (Ontology); Classified Data file; Public Data file • Node labels: Patriot Freq.; PAC-2 Freq.; PAC-3 Freq.; Scientific data on radar components; Missiles Tracking Systems • Confidence Level (M7, N5) = ƒ(…)
Correlated Data Inference • Identifies sensitive data in the public domain (relative to a given classified database – usually the local database). • Inference guidance: • Ontology concept hierarchy • Structural similarity of public data • Csilla Farkas, Andrei Stoica. “Correlated Data Inference, Ontology Guided XML Security Engine”, In Proc of IFIP 2003.
Correlated Data Inference • Features of similarity: • Levels of abstraction for each node • Distance of associated nodes from association root • Similarity of the distances • Length of the distance • Similarity of sub-trees originating from correlated nodes
Correlated Data Inference • Association similarity: • Distance of each node from the association root • Difference of the distance of the nodes from the association root • Similarity of the sub-trees originating at nodes • Figure labels: Air show; address; fort
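The first two association-similarity features, the distance of each node from the association root and the difference between those distances, can be computed on a small tree. The sketch assumes that the association root is the lowest common ancestor of the two nodes, which the slides do not state explicitly. The tree shape, the `sponsor` node, and the function names are hypothetical.

```python
def path_to_root(node, parent):
    """List the nodes from `node` up to the tree root, inclusive."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path

def association(a, b, parent):
    """Return (root, dist_a, dist_b) for the lowest common ancestor of a and b.

    Returns None when the two nodes share no ancestor.
    """
    path_b = path_to_root(b, parent)
    for dist_a, node in enumerate(path_to_root(a, parent)):
        if node in path_b:
            return node, dist_a, path_b.index(node)
    return None

# Hypothetical document tree, stored as child -> parent.
PARENT = {"address": "airShow", "sponsor": "airShow", "fort": "sponsor"}

root, dist_address, dist_fort = association("address", "fort", PARENT)
distance_difference = abs(dist_address - dist_fort)
print(root, dist_address, dist_fort, distance_difference)
```

The third feature, sub-tree similarity, needs a tree-comparison measure and is left out here.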
Correlated Data Inference • Concept hierarchy: Object[]. waterSource :: Object; basin :: waterSource; place :: Object; district :: place; address :: place; base :: Object; fort :: base • Figure labels: address; fort; district; basin; base; Water source; ?
Correlated Data Inference • Concept hierarchy: Object[]. waterSource :: Object; basin :: waterSource; place :: Object; district :: place; address :: place; base :: Object; fort :: base • Figure labels: place; address; fort; district; basin; base; Water source • Markings: Confidential; Public; Public
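The concept hierarchy on the slide (`basin :: waterSource`, `fort :: base`, and so on) can be loaded into a small lookup structure to see how public terms generalize to the concepts of a confidential association. This is a minimal sketch and not the Oxsegin implementation. Reading `x :: y` as "x is a sub-concept of y" is an assumption, and the helper names are hypothetical.

```python
# Facts copied from the slide, one "sub :: super" pair per line.
FACTS = """
waterSource :: Object
basin :: waterSource
place :: Object
district :: place
address :: place
base :: Object
fort :: base
"""

def parse_hierarchy(facts):
    """Build a child -> parent map from 'sub :: super' lines."""
    parent = {}
    for line in facts.strip().splitlines():
        child, _, sup = line.partition("::")
        parent[child.strip()] = sup.strip()
    return parent

def ancestors(concept, parent):
    """Return the chain of super-concepts, nearest first."""
    chain = []
    while concept in parent:
        concept = parent[concept]
        chain.append(concept)
    return chain

def is_a(concept, target, parent):
    """True if `concept` equals `target` or specializes it."""
    return concept == target or target in ancestors(concept, parent)

HIERARCHY = parse_hierarchy(FACTS)

# Public terms generalize to the concepts named in the figure.
print(is_a("fort", "base", HIERARCHY))
print(is_a("basin", "waterSource", HIERARCHY))
print(ancestors("address", HIERARCHY))
```

With such a lookup, a public pair like (address, fort) and another like (district, basin) can be lifted to the concept level, where both share `place`. One reading of the figure is that this shared concept is how a base becomes linked to its water source, but the slides do not spell it out.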
Summary • Secure XML Views provide semantically consistent query replies and cover stories. • The Oxsegin architecture and methods detect undesired inferences • Structural similarity • Semantic concept hierarchy • Confidence in derived inferences
Next Class • Stream data