Semantic Data Integrity
DARPA PI Meeting
David Rosenthal, Odyssey Research Associates
July 17-21, 2000
Cornell Business & Technology Park, 33 Thornwood Drive, Suite 500, Ithaca, NY 14850-1250, (607) 257-1975
Team Members
• Odyssey Research Associates (a subsidiary of Architecture Technology Corporation)
  • David Rosenthal, Matt Stillerman, David Guaspari, Francis Fung
• WetStone
  • Chet Hosmer, Milica Barjaktarovic, Mike Duren
• SUNY Binghamton
  • Jiri Fridrich
Technical Objectives
[Figure: forgery detected in an example image]
• Support intrusion tolerance by developing improved data integrity methods to identify and recover attacked data
  • Localize possible alterations
  • Provide partial recovery, where feasible
  • Provide policy-based selection of mechanisms
• Emphasis is on images
Technical Approach
• Develop techniques for identifying and protecting data subsets
• Develop new watermarking/self-embedding techniques
• Explore how to recover data subsets using secondary data (DSI Marks)
• Develop software to test the effectiveness of the approach
I-FIRE Segmentation
[Figure: original image and the segmented image]
I-FIRE Segment Verification
[Figure: forged image with segment-level image verification]
I-FIRE Segment Recovery
• Set parameters for very fast recovery
• Streak suggesting aircraft fire was repaired
• White rectangles indicate where recovery did not succeed
I-FIRE Anomalous Pixel Detection
[Figure: impossible data identification]
Technical Approach: Hierarchical Subsets
• Develop algorithms for automatically subsetting images based on uniformity criteria (a combination of color, intensity, and texture similarity)
• Split the image into quadrants and test each quadrant for uniformity; if a quadrant is uniform, do not subdivide it further, otherwise continue subdividing
• Then merge all "adjacent" segments that share the same uniformity characteristics (or possibly some other desirable characteristic, such as a common edge); a minimal sketch of this split-and-merge process follows
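The split-and-merge idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the I-FIRE implementation: the uniformity test here uses only intensity standard deviation as a stand-in for the combined color/intensity/texture criteria, and all function names are hypothetical.

```python
import numpy as np

def is_uniform(block, tol=10.0):
    # Stand-in uniformity criterion: intensity standard deviation below a tolerance
    return block.std() <= tol

def split(img, x, y, w, h, min_size=8, tol=10.0):
    # Quadtree split: stop subdividing once a quadrant is uniform (or too small)
    if is_uniform(img[y:y + h, x:x + w], tol) or w <= min_size or h <= min_size:
        return [(x, y, w, h)]
    hw, hh = w // 2, h // 2
    return (split(img, x,      y,      hw,     hh,     min_size, tol) +
            split(img, x + hw, y,      w - hw, hh,     min_size, tol) +
            split(img, x,      y + hh, hw,     h - hh, min_size, tol) +
            split(img, x + hw, y + hh, w - hw, h - hh, min_size, tol))

def share_edge(a, b):
    # True if two axis-aligned segments are adjacent (share part of an edge)
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    touch_x = (ax + aw == bx or bx + bw == ax) and (ay < by + bh and by < ay + ah)
    touch_y = (ay + ah == by or by + bh == ay) and (ax < bx + bw and bx < ax + aw)
    return touch_x or touch_y

def merge(img, segs, tol=10.0):
    # Union-find grouping of adjacent segments with similar mean intensity
    parent = list(range(len(segs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    means = [img[y:y + h, x:x + w].mean() for (x, y, w, h) in segs]
    for i in range(len(segs)):
        for j in range(i + 1, len(segs)):
            if share_edge(segs[i], segs[j]) and abs(means[i] - means[j]) <= tol:
                parent[find(i)] = find(j)
    groups = {}
    for i, seg in enumerate(segs):
        groups.setdefault(find(i), []).append(seg)
    return list(groups.values())

# Example: a synthetic 64x64 image with a bright square on a dark background
img = np.zeros((64, 64))
img[16:48, 16:48] = 200.0
regions = merge(img, split(img, 0, 0, 64, 64))
```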
Technical Approach: Hierarchical Subsets (cont.)
• Impose different integrity mechanisms at different layers of the decomposition, to achieve policy goals more efficiently
Intersecting Hash Methods
• Intersecting hashes
  • Each hash covers some set of cells
  • The sets of cells covered by two different hashes are permitted to intersect (illustrated in the sketch below)
  • Hierarchical decomposition and hashing is a special case of this
• Intersecting hash techniques permit a tradeoff between
  • Strength of protection
  • Diagnostic ability / damage isolation
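As a toy illustration of the overlap idea, consider two hashes whose covered cell sets share one cell; the pattern of failed checks then points at the shared cell. The cell names and the truncated SHA-256 used as a "weak hash" are assumptions for the sketch, not the project's actual construction.

```python
import hashlib

def weak_hash(cell_names, data):
    # "Weak" hash for the sketch: SHA-256 over the named cells, truncated to 4 bytes
    h = hashlib.sha256()
    for name in sorted(cell_names):
        h.update(data[name])
    return h.digest()[:4]

data = {"cell1": b"AAAA", "cell2": b"BBBB", "cell3": b"CCCC"}
coverage = {"h1": {"cell1", "cell2"},   # hash 1 covers cells 1 and 2
            "h2": {"cell2", "cell3"}}   # hash 2 covers cells 2 and 3 (overlap at cell 2)
stored = {name: weak_hash(cells, data) for name, cells in coverage.items()}

# Tamper with the overlap cell: both checks now fail, isolating cell 2.
data["cell2"] = b"XXXX"
failed = {name for name, cells in coverage.items() if weak_hash(cells, data) != stored[name]}
suspects = set.intersection(*(coverage[name] for name in failed)) if failed else set()
print(failed, suspects)   # {'h1', 'h2'} {'cell2'}
```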
Forgery Strategies and Strength of Protection
[Figure: Hash 1 covers Cells 1 and 2; Hash 2 covers Cells 2 and 3]
• Assume that Cell 2 is modified
• Compensating with Cell 2 costs |h1| * |h2| (both hashes covering it must be satisfied simultaneously)
• Compensating with Cell 1 and then Cell 3 costs |h1| + |h2| (each hash is fixed independently)
Example: Sequential Forgery Repair with Hierarchical Hashes
[Figure: hierarchical hash tree over the region to be forged]
• Fix hashes in two stages
  • First correction: fix the three hashes of the left branch
  • Second correction: fix the two hashes of the right branch
Strength of Intersecting Hashes
• Strength of protection can be defined in terms of the cost of the attacker's best strategy
• This can be difficult to compute for arbitrary intersecting hashes
• For hierarchical hashes, we believe we have identified an effective algorithm for computing the best strategy
Identifying Damaged Area
• Regard a segment as an n × n grid of cells
• Compare two different kinds of protection strategies, each computing 2n (weak) hash values for sets of cells (both are sketched below)
• "Linear" strategy
  • Hash disjoint blocks of n/2 cells
  • The entire block is suspicious if its hash check fails
• "Quadratic" strategy
  • Hash each row and each column
  • A cell is suspicious if both its row and its column fail
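A minimal sketch of the two strategies, assuming truncated SHA-256 as the weak hash and a single corrupted cell; the function names are illustrative.

```python
import hashlib
import numpy as np

def weak_hash(cells):
    # Placeholder weak hash: truncated SHA-256 over the cell values
    return hashlib.sha256(np.ascontiguousarray(cells).tobytes()).digest()[:4]

def linear_suspects(grid, refs):
    # 2n hashes over disjoint blocks of n/2 cells; an entire block is suspect on mismatch
    n = grid.shape[0]
    flat = grid.reshape(-1)
    suspects = set()
    for b in range(2 * n):
        lo, hi = b * (n // 2), (b + 1) * (n // 2)
        if weak_hash(flat[lo:hi]) != refs[b]:
            suspects.update(range(lo, hi))
    return suspects

def quadratic_suspects(grid, row_refs, col_refs):
    # n row hashes + n column hashes; a cell is suspect only if its row AND column fail
    n = grid.shape[0]
    bad_rows = {i for i in range(n) if weak_hash(grid[i, :]) != row_refs[i]}
    bad_cols = {j for j in range(n) if weak_hash(grid[:, j]) != col_refs[j]}
    return {(i, j) for i in bad_rows for j in bad_cols}

n = 8
clean = np.arange(n * n, dtype=np.uint8).reshape(n, n)
flat = clean.reshape(-1)
lin_refs = [weak_hash(flat[b * (n // 2):(b + 1) * (n // 2)]) for b in range(2 * n)]
row_refs = [weak_hash(clean[i, :]) for i in range(n)]
col_refs = [weak_hash(clean[:, j]) for j in range(n)]

tampered = clean.copy()
tampered[3, 5] ^= 0xFF
print(len(linear_suspects(tampered, lin_refs)))              # n/2 = 4 cells flagged
print(len(quadratic_suspects(tampered, row_refs, col_refs))) # 1 cell flagged: (3, 5)
```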
Quadratic vs. Linear Hashing
• On average, a quadratic hash identifies a smaller suspicious area in the following situations (with n rows and n columns)
  • Sparse random errors
    • Where there are up to (approximately) 0.96n bad cells
    • Note: this asymptotic result converges rapidly
  • Concentrated random errors
    • Errors confined to at most n/2 rows or columns (verified analytically for many n, conjectured for all)
Reconstruction
• Compute a strong hash for the whole image; use weaker hashes on subsets
• Brute-force reconstruction (sketched below)
  • Use the weak hashes to identify the suspicious area
  • Search the suspicious area for candidate reconstructions satisfying all the weak hashes
  • Check candidates against the strong hash
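A brute-force reconstruction for the quadratic scheme might look like the sketch below. The candidate-value set stands in for the homogeneity constraint discussed later, and the hash choices are the same illustrative truncated-SHA-256 ones used above; none of this is the project's actual code.

```python
import hashlib
import itertools
import numpy as np

def weak_hash(cells):
    return hashlib.sha256(np.ascontiguousarray(cells).tobytes()).digest()[:4]

def strong_hash(img):
    return hashlib.sha256(np.ascontiguousarray(img).tobytes()).digest()

def reconstruct(img, suspects, candidates, row_refs, col_refs, strong_ref):
    # Try every assignment of candidate values to the suspect cells; accept only
    # an assignment that satisfies all weak (row/column) hashes and the strong hash.
    n = img.shape[0]
    for values in itertools.product(candidates, repeat=len(suspects)):
        trial = img.copy()
        for (i, j), v in zip(suspects, values):
            trial[i, j] = v
        if (all(weak_hash(trial[i, :]) == row_refs[i] for i in range(n)) and
                all(weak_hash(trial[:, j]) == col_refs[j] for j in range(n)) and
                strong_hash(trial) == strong_ref):
            return trial
    return None

# Usage: damage one cell, mark it suspect, and search 16 candidate values for it.
n = 8
original = np.arange(n * n, dtype=np.uint8).reshape(n, n)
row_refs = [weak_hash(original[i, :]) for i in range(n)]
col_refs = [weak_hash(original[:, j]) for j in range(n)]
strong_ref = strong_hash(original)

damaged = original.copy()
damaged[3, 5] = 0
restored = reconstruct(damaged, [(3, 5)], range(16, 32), row_refs, col_refs, strong_ref)
assert restored is not None and np.array_equal(restored, original)
```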
Feasibility of Reconstruction
• Basic questions for brute-force reconstruction
  • Reliability: probability that we are not deceived by the answer
  • Adequacy: probability that the answer is in the search space
  • Computational cost of reconstruction
• Reliability: lower bound set by the strong hash
• Adequacy and cost can be estimated, assuming
  • Random errors
  • Independence of row and column hashes
Adequacy

  Width of cell   # hash bits   Adequacy exceeds
  100             20            .99980
  100             30            .99999980
  200             20            .99960
  200             30            .99999960
Computational Cost
• The search space is constrained by
  • Initial identification of the suspicious area
  • Homogeneity of the image, which limits the number of candidate values per cell
  • "Crossword puzzle" style of reconstruction (only available with quadratic hashing)
• The search grows rapidly, but is feasible in some limited cases
Estimates of Reconstruction Cost
• Cost estimates assuming 16 candidate values per cell (homogeneity) and 32 hash bits

  # bad rows   # bad columns   # "check hash" ops
  5            10              10^7
  5            100             10^8
  10           10              10^13
  10           20              10^34
Self-Embedding
• Self-embedding: save important information about an image within the image itself
• Developing techniques that are unobtrusive, survive JPEG compression, and resist some classes of attack
[Figure: image before and after embedding]
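The idea can be illustrated with a deliberately crude LSB version: a coarse copy of the image is hidden, circularly shifted, in the image's own least-significant bits, so a damaged region can be roughly rebuilt from bits carried elsewhere. The real DSI techniques are designed to survive JPEG compression, which this sketch does not; all parameters and function names here are assumptions.

```python
import numpy as np

def embed_self(img, shift=128, factor=8):
    # Hide a 3-bit, downsampled copy of the image in its own least-significant bits,
    # circularly shifted so the recovery data for a block lives far away from it.
    thumb = (img >> 5)[::factor, ::factor]                       # coarse 3-bit thumbnail
    payload = np.kron(thumb, np.ones((factor, factor), dtype=img.dtype))
    payload = np.roll(payload, shift, axis=(0, 1))
    return (img & 0xF8) | (payload & 0x07)

def recover_block(marked, y, x, h, w, shift=128, factor=8):
    # Rebuild a damaged block (coarsely) from the self-embedded bits stored elsewhere
    payload = np.roll(marked & 0x07, -shift, axis=(0, 1))
    return (payload[y:y + h, x:x + w] << 5).astype(marked.dtype)

# Usage: embed, damage a block, and recover a rough approximation of it.
img = (np.arange(256 * 256, dtype=np.uint32) % 251).astype(np.uint8).reshape(256, 256)
marked = embed_self(img)
damaged = marked.copy()
damaged[40:48, 40:48] = 0                                        # simulate local damage
approx = recover_block(damaged, 40, 40, 8, 8)                    # coarse version of the block
```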
Technical Approach: Secure Fragile Authentication Watermark
[Figure: original image and a forgery produced by a collage attack]
• Investigated some attacks that affect several proposed fragile watermark schemes
• Developed a secure fragile watermark that is resistant to these attacks
  • Uses a secret key, and the watermark is difficult to forge
  • Resistant to the collage attack
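One standard way to get collage-attack resistance is to bind each block's embedded mark to a secret key, the block's coordinates, and an image identifier, so a block pasted in from another marked image (or from another position) no longer verifies. The HMAC/LSB construction below is a sketch of that principle, not necessarily the scheme developed here; names and the block size are assumptions.

```python
import hashlib
import hmac
import numpy as np

BLOCK = 8   # assumes image dimensions are multiples of BLOCK

def block_tag(key, image_id, by, bx, block):
    # Keyed tag over the block's upper 7 bit planes, its position, and the image ID
    msg = image_id + by.to_bytes(2, "big") + bx.to_bytes(2, "big") + (block & 0xFE).tobytes()
    return hmac.new(key, msg, hashlib.sha256).digest()

def tag_bits(tag, shape):
    return np.unpackbits(np.frombuffer(tag, np.uint8))[:shape[0] * shape[1]].reshape(shape)

def mark_image(img, key, image_id):
    marked = img.copy()
    for by in range(0, img.shape[0], BLOCK):
        for bx in range(0, img.shape[1], BLOCK):
            block = marked[by:by + BLOCK, bx:bx + BLOCK]
            bits = tag_bits(block_tag(key, image_id, by, bx, block), block.shape)
            block[...] = (block & 0xFE) | bits          # tag goes into the LSB plane
    return marked

def verify_block(marked, key, image_id, by, bx):
    # A block copied from a different position or a different marked image fails here,
    # because its embedded tag was bound to the original coordinates and image ID.
    block = marked[by:by + BLOCK, bx:bx + BLOCK]
    bits = tag_bits(block_tag(key, image_id, by, bx, block), block.shape)
    return np.array_equal(block & 0x01, bits)
```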
Technical Approach: Policy
• Provide a simple layer at the level of mechanism
  • Helps bridge user needs to the mechanisms that are available
  • Provides a mapping from typical needs to default assignments for how mechanisms should work
• Would eventually like to have a better connection between mechanisms and more characteristics
  • Importance of the data or sub-data, threats that need to be countered, recovery time constraints, resource limitations, detectability of the integrity measure, current situation
Technical Approach: Policy (cont.)
• A critical data policy at the mechanism layer
[Table: assignment of mechanisms -- digital signature, CRC, robust watermark, fragile watermark, self-embedding -- to segment categories: entire image; important segments; non-important segments adjacent to important segments; important segments with adjacent segments; segments that contain self-embedding information; non-important object segments; top-level segments]
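A policy layer like the matrix above could be represented as a simple mapping from segment categories to the set of mechanisms to apply. The sketch below uses illustrative assignments and stand-in mechanism implementations; it is not the project's actual matrix or policy engine.

```python
import hashlib
import zlib

# Stand-in mechanism implementations (placeholders, not real signing or watermarking)
MECHANISMS = {
    "crc":               lambda data: zlib.crc32(data).to_bytes(4, "big"),
    "digital_signature": lambda data: hashlib.sha256(data).digest(),            # stand-in for a signature
    "robust_watermark":  lambda data: hashlib.sha256(b"robust" + data).digest()[:8],
    "fragile_watermark": lambda data: hashlib.sha256(b"fragile" + data).digest()[:8],
    "self_embedding":    lambda data: data[:16],                                 # stand-in for embedded recovery data
}

# Illustrative policy: which mechanisms to apply to which segment category
POLICY = {
    "top_level_segment":     {"digital_signature"},
    "important_segment":     {"crc", "fragile_watermark", "self_embedding"},
    "non_important_segment": {"crc"},
}

def apply_policy(segments, categories):
    # For each segment, run every mechanism its category calls for
    return {name: {m: MECHANISMS[m](data) for m in POLICY.get(categories[name], set())}
            for name, data in segments.items()}

segments = {"seg0": b"whole image bytes ...", "seg1": b"aircraft region bytes ..."}
categories = {"seg0": "top_level_segment", "seg1": "important_segment"}
marks = apply_policy(segments, categories)
```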
Technical Approach: Demonstration Environment
• We have developed a tool, called I-FIRE, for demonstrating and testing our methodology
• Current features include:
  • Split-and-merge with parameters
  • Damage detection
  • Reconstruction and pixel verification
  • Partial reconstruction with self-embedded data
  • Policy-based integrity mechanism selection
Major Risks and Planned Mitigation
• Previous risk
  • Partial recovery of subsets may not be very practical (too resource-intensive)
• Mitigation
  • Have developed other techniques that may make data usable when recovery is not feasible
  • Detection features can be useful even without recovery
Accomplishments to Date
• Prototype tool
  • Demonstrates hierarchical subset methods
  • Implements current detection and recovery methods
• Developed new watermarking methods
• Analysis of linear vs. quadratic methods
• Some initial results on assurance with incomplete recovery under hierarchical methods
• Some initial analysis of a scenario
Quantitative Metrics
• Metrics that may be used:
  • Size of the DSI mark
  • Time to apply integrity protection
  • Time for partial reconstruction techniques
  • Area of the "known" correct part of the image
  • Cost of recovery for a given class of attacks
  • Strength of protection
Expected Major Achievements
• A method and tool to facilitate the use of altered data by
  • Recognizing unharmed subsets
  • Supporting partial recovery techniques
Task Schedule
• Feb 2000:
  • First version of I-FIRE
• July 2000:
  • Second version of I-FIRE
  • Some analysis results
• December 2000:
  • Final version of software
  • Extended analysis results
Tech Transfer -- Military
• Integrity enhancement for expensive transmissions, e.g., air-to-ground targeting data
  • For Air Combat Command and Air Materiel Command
• Planned for a small part of the JFX2000 experiment (Sept. 2000)
JFX 2000 and I-FIRE
[Diagram: I-FIRE in JFX 2000 -- airborne and ground L-Band IFGR nodes exchanging secure messages with DSI over SIPRNET]
Tech Transfer -- Commercial
• Possible commercial transitions
  • Injection of key technologies into WetStone's SMARTWatch integrity checker
  • Investigating some other possibilities
What do you need from the DARPA PM?
• No pending requirements