Data Hiding (3 of 3) Courtesy of Professor Min Wu, Electrical & Computer Engineering, Univ. of Maryland, College Park
Watermark-Based Authentication Min Wu @ U. Maryland 2002
Document Authentication
[Figure: panels (a) to (g) showing a document before and after alteration]
• Embed a pre-determined pattern or content features beforehand
• Verify the hidden data’s integrity to decide on authenticity
Image/Video Authentication via Watermarking
• Motivation
  • “Picture never lies”? Digital media is easy to edit (e.g., with Photoshop)
  • Important to detect tampering, e.g., for evidence in litigation, insurance, and government archives
  • The original “true” image cannot be used to convince a judge
• Basic idea for detecting tampering
  • Recall the authentication problem in crypto
  • Embed some data in the image so that a certain relationship/property changes upon tampering
  • Rely on
    • fragility of the embedding scheme, and/or
    • embedding content features of the original true image
• Two issues to address
  • how to embed data?
  • what data to embed?
Useful Crypto Tools/Building-Blocks
• Cryptographically strong hash or digest function H( )
  • One-way “compression” function
  • M-bit input to N-bit output, often with fixed N and M >> N
  • Often used to produce a short ID for identifying the input
• Properties to be satisfied:
  1) Given a message m, H(m) can be calculated very quickly
  2) Given a digest y, it is computationally infeasible to find a message m s.t. H(m) = y (i.e., H is one-way)
  3) It is computationally infeasible to find two different messages m1 & m2 s.t. H(m1) = H(m2) (i.e., H is strongly collision-free)
• Keyed hash:
  • H( k, m ) = Hash( concatenated string derived from k & m )
• Commonly used crypto hashes
  • 160-bit SHA (Secure Hash Algorithm) by NIST
  • 128-bit MD4 and MD5 by Rivest
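The fixed-length output and the keyed-hash construction above can be tried with Python's standard `hashlib`. This is a minimal sketch, not the lecture's own code: the helper names `digest_bits` and `keyed_hash` are hypothetical, and plain key-prefix concatenation is used only because the slide describes the keyed hash that way. SHA-1 and MD5 appear because the slide names them; practical collisions are now known for both.

```python
import hashlib

def digest_bits(name: str, message: bytes) -> int:
    """Digest length in bits of the named hash applied to `message`."""
    return hashlib.new(name, message).digest_size * 8

def keyed_hash(key: bytes, message: bytes) -> bytes:
    """Hypothetical keyed hash H(k, m): SHA-1 over the concatenation k || m."""
    return hashlib.sha1(key + message).digest()

# The output length N is fixed no matter how long the M-bit input is.
short_msg, long_msg = b"a", b"a" * 100_000
sha1_bits = (digest_bits("sha1", short_msg), digest_bits("sha1", long_msg))
md5_bits = (digest_bits("md5", short_msg), digest_bits("md5", long_msg))

# The same message hashed under two different keys gives two different digests.
tag_a = keyed_hash(b"key-A", b"frame 17")
tag_b = keyed_hash(b"key-B", b"frame 17")
```

Both hashes return the same digest length for the one-byte and the 100,000-byte input, which is the "compression" behaviour listed above.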
Data Integrity Verification (data authentication)
• Authentication is always “relative”
  • with respect to a reference
• How to establish and use a reference
  [Method-1] Give a “genuine” copy to a trusted 3rd party
  [Method-2] Append “check bits”
  • Want it to be hard to find a different meaningful message with the same “check bits” => use a cryptographically strong hash
  • Want tamper resistance even if the hash function is public
    • Encrypt the concatenated version of message and hash
    • Keyed hash (Message Authentication Code) ~ no extra encryption needed
    • Digital signature algorithm (using public-key crypto)
      • Signed Msg|Hash ~ i.e., encrypt with the private key s.t. others can’t forge
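The keyed-hash (MAC) option can be sketched with Python's standard `hmac` module. The function names `make_tag` and `verify`, the key, and the messages are illustrative assumptions, and HMAC with SHA-256 is a substitution for whatever keyed hash the lecture had in mind.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute the check bits appended to the message (HMAC-SHA-256)."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"shared-secret"            # hypothetical key known to sender and verifier
original = b"pay 10 to Bob"
tag = make_tag(key, original)

genuine_ok = verify(key, original, tag)              # untouched message
tampered_ok = verify(key, b"pay 1000 to Bob", tag)   # altered message
wrong_key_ok = verify(b"guess", original, tag)       # forger without the key
```

Only the holder of the key can produce a valid tag, so no separate encryption of the hash is needed.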
Extension to Grayscale/Color Images
[Figure: look-up table mapping quantized DCT coefficients …, 23Q, 24Q, 25Q, 26Q, … to bits …, 0, 0, 1, 0, …]
• “Semi-fragile” watermarking
  • Want to distinguish content-preserving changes (e.g., moderate compression) from content tampering
  • Achieve controlled robustness, often via quantization
• How to embed
  • One approach: enforce pre-quantized DCT coefficients using a look-up table
• What to embed
  • A visually meaningful pattern and/or a pre-selected one
    • facilitates a quick visual check and locating alterations
  • Content features, to avoid malicious counterfeiting attacks
    • limited precision (e.g., most significant bits)
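A minimal sketch of look-up-table embedding on a single coefficient follows. It assumes a table indexed by the quantization index round(c/Q), a seeded pseudo-random table with runs of equal bits capped at two, and a step size Q = 8; none of these specifics come from the slide, and the names `make_lut`, `embed`, and `extract` are hypothetical.

```python
import random

def make_lut(size: int, seed: int, max_run: int = 2) -> list[int]:
    """Pseudo-random binary table; runs of equal bits are capped at max_run."""
    rng = random.Random(seed)
    lut: list[int] = []
    while len(lut) < size:
        bit = rng.randint(0, 1)
        if len(lut) >= max_run and all(b == bit for b in lut[-max_run:]):
            bit = 1 - bit  # break a run that would become too long
        lut.append(bit)
    return lut

def embed(coeff: float, bit: int, q: float, lut: list[int]) -> float:
    """Move coeff to the nearest multiple of q whose table entry equals bit."""
    offset = len(lut) // 2  # table index 0 corresponds to multiple -offset
    candidates = [k for k in range(-offset, len(lut) - offset)
                  if lut[k + offset] == bit]
    best = min(candidates, key=lambda k: abs(k * q - coeff))
    return best * q

def extract(coeff: float, q: float, lut: list[int]) -> int:
    """Quantize the received coefficient and read the bit from the table."""
    index = round(coeff / q) + len(lut) // 2
    index = min(max(index, 0), len(lut) - 1)  # clamp to the table range
    return lut[index]

LUT = make_lut(256, seed=7)
Q = 8.0
marked = embed(13.7, 1, Q, LUT)
recovered = extract(marked + 0.4 * Q, Q, LUT)  # mild distortion below Q/2
```

Because extraction re-quantizes with the same step, any distortion smaller than Q/2 leaves the bit intact, which is where the controlled robustness comes from; a larger step tolerates more compression at the cost of visibility.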
Watermark-based Authentication
[Figure: authentication result marking regions as unchanged or content changed]
• Embed patterns and content features using a lookup table
• High embedding capacity/security via shuffling
  • locate alteration
  • differentiate content vs. non-content change (compression)
Issues Beyond Embedding Mechanism
Issues and Challenges
[Figure: layered framework spanning lower to upper layers, with labels: imperceptible embedding of one bit; multiple-bit embedding; uneven capacity equalization; error correction; security; coding of embedded data; and the three requirements robustness, capacity, imperceptibility]
• Tradeoff among conflicting requirements
  • Imperceptibility
  • Robustness & security
  • Capacity
• Key elements of data hiding
  • Perceptual model
  • Embedding one bit
  • Multiple bits
  • Uneven embedding capacity
  • Robustness and security
  • What data to embed
Techniques For Multi-bit Embedding
[Figure: host partitioned into regions carrying the 1st bit, 2nd bit, ...]
• Amplitude modulation
  • Use M different watermark amplitudes to represent $\log_2 M$ bits
  • Amplitudes are drawn from $\{-J, -\frac{(M-3)J}{M-1}, \dots, \frac{(M-3)J}{M-1}, J\}$, where J is the JND (just-noticeable difference)
  • Accurate detection requires a clear distinction in the received amplitudes
  • Use a modulo-M operation for enforcement embedding
• Orthogonal and biorthogonal
  • Embed one of M orthogonal patterns, representing $\log_2 M$ bits (orthogonal) or $\log_2(2M)$ bits (biorthogonal)
• TDMA-type (temporal or spatial or both)
  • Embed each bit in a different non-overlapping region or frame
  • Unevenness in embedding capacity due to non-stationarity
• CDMA-type (Coded Modulation)
  • Use plus vs. minus a pattern to embed one bit
  • The detector needs to know the mutually orthogonal patterns
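The TDMA-type and CDMA-type schemes can be compared side by side in a small sketch. It assumes Walsh-Hadamard rows as the mutually orthogonal patterns, additive embedding with a fixed strength, and a non-blind detector that subtracts the known host; these are illustrative choices, and every function name below is hypothetical.

```python
def hadamard(n: int) -> list[list[int]]:
    """Sylvester construction of an n x n +/-1 matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def embed_cdma(host, bits, patterns, strength):
    """Add +pattern for a 1 bit and -pattern for a 0 bit, all overlapping."""
    marked = list(host)
    for bit, pattern in zip(bits, patterns):
        sign = 1 if bit else -1
        for i, p in enumerate(pattern):
            marked[i] += sign * strength * p
    return marked

def detect_cdma(marked, host, patterns):
    """Correlate the host-removed signal with each known pattern."""
    diff = [m - h for m, h in zip(marked, host)]
    return [int(sum(d * p for d, p in zip(diff, pat)) > 0) for pat in patterns]

def embed_tdma(host, bits, strength):
    """Give each bit its own non-overlapping segment of the host."""
    seg = len(host) // len(bits)
    marked = list(host)
    for k, bit in enumerate(bits):
        sign = 1 if bit else -1
        for i in range(k * seg, (k + 1) * seg):
            marked[i] += sign * strength
    return marked

def detect_tdma(marked, host, n_bits):
    """Sum the host-removed signal inside each segment."""
    seg = len(host) // n_bits
    diff = [m - h for m, h in zip(marked, host)]
    return [int(sum(diff[k * seg:(k + 1) * seg]) > 0) for k in range(n_bits)]

HOST = [float((3 * i) % 7) for i in range(16)]   # stand-in host features
BITS = [1, 0, 1, 1]
PATTERNS = hadamard(16)[1:5]                      # four orthogonal patterns

cdma_bits = detect_cdma(embed_cdma(HOST, BITS, PATTERNS, 2), HOST, PATTERNS)
tdma_bits = detect_tdma(embed_tdma(HOST, BITS, 2), HOST, len(BITS))
```

The CDMA detector has to hold all the patterns, while the TDMA detector only needs the segment boundaries; with a real host, segments differ in how much change they can hide, which is the uneven-capacity issue noted above.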
Comparison (brief)
[Figure: orthogonal modulation vs. TDMA/CDMA]
• TDMA vs. CDMA
  • Equivalent in terms of watermark energy allocation
  • TDMA needs to handle uneven embedding capacity
  • CDMA needs to set up and store orthogonal vectors
• Orthogonal vs. TDMA/CDMA
  • Orthogonal modulation has higher energy efficiency
• To explore further, see Section V and the references therein of M. Wu, B. Liu: “Data Hiding in Image and Video: Part-I -- Fundamental Issues and Solutions”, submitted to IEEE Trans. on Image Proc., Jan. 2002
Comparison (1)
• Applicable media types
  • It is not always easy to find many orthogonal directions for CDMA (e.g., in a binary image)
  • Amplitude modulation is applicable to most features
  • TDMA can be applied temporally and spatially
• TDMA vs. CDMA
  • Equivalent in terms of watermark energy allocation
  • TDMA needs to handle uneven embedding capacity
    • Variable embedding rate (need to embed some side info.)
    • Constant embedding rate (shuffling helps increase the embedding rate)
  • CDMA needs to set up and store orthogonal vectors
Comparison (2)
[Figure: orthogonal modulation vs. TDMA/CDMA]
• TDMA/CDMA vs. orthogonal modulation
  • For orthogonal modulation, the minimum separation stays constant as the number of embedded bits B increases while the total watermark energy is unchanged
  • Orthogonal modulation requires bookkeeping of more orthogonal vectors and more computation in classic detection
  • Combine the two to improve the embedding rate with only a small increase in computation and storage
Comparison (3)
• Amplitude modulation vs. other techniques
  • Amplitude modulation can embed multiple bits on a single feature/direction
    • Without the need for many orthogonal vectors
  • Minimum separation for the same average watermark energy $\mathcal{E}$ and number of embedded bits $B$
    • $O(2^{-B}\,\mathcal{E}^{1/2})$ for amplitude modulation
    • $O(B^{-1/2}\,\mathcal{E}^{1/2})$ for TDMA/CDMA
    • $O(\mathcal{E}^{1/2})$ for orthogonal modulation
• Modulation techniques for communications [Proakis]
  • Bandwidth-efficient techniques vs. energy-efficient techniques
  • Non-trivial amplitude modulation is not good when signal energy is limited (very low SNR), esp. for blind detection
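The three orders of growth can be checked with a short derivation. It is a sketch under textbook signal-space assumptions that the slide does not state: total watermark energy $\mathcal{E}$, equally spaced and equally likely amplitude levels, antipodal signalling for each TDMA/CDMA bit, and equal-energy orthogonal signals. The spacing symbol $d$ is introduced here for illustration.

```latex
\begin{align*}
\text{Amplitude modulation } (M = 2^B \text{ levels, spacing } d):\quad
  & \mathcal{E} = \frac{(M^2 - 1)\,d^2}{12}
    \;\Rightarrow\;
    d = \sqrt{\frac{12\,\mathcal{E}}{2^{2B} - 1}}
    \approx 2\sqrt{3}\;2^{-B}\,\mathcal{E}^{1/2} \\
\text{TDMA/CDMA } (\text{energy } \mathcal{E}/B \text{ per bit, } \pm\sqrt{\mathcal{E}/B}):\quad
  & d = 2\sqrt{\mathcal{E}/B} = 2\,B^{-1/2}\,\mathcal{E}^{1/2} \\
\text{Orthogonal modulation } (2^B \text{ signals of energy } \mathcal{E}):\quad
  & d = \sqrt{2\,\mathcal{E}}
\end{align*}
```

For a fixed energy budget, the separation shrinks exponentially in $B$ for amplitude modulation, as $1/\sqrt{B}$ for TDMA/CDMA, and not at all for orthogonal modulation, which is why the latter is the energy-efficient choice at very low SNR.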
What Data to Embed?
• Recall: it is important to determine what data to embed in authentication applications
Tracing Traitors
[Figure: a cable co. sells “Shakespeare in Love” to Alice, Bob, and Carl, each copy carrying its own fingerprint (w1, w2, w3)]
• Robustly embed a digital fingerprint
  • Insert an ID or “fingerprint” to identify each customer
  • Prevent improper redistribution of multimedia content
• Collusion: a cost-effective attack
  • Users with the same content but different fingerprints come together to produce a new copy with diminished or attenuated fingerprints
• Anti-collusion fingerprinting
  • Trace traitors and colluders to actively deter collusion/redistribution
  • Rely on joint fingerprint encoding & embedding
Embedded Fingerprinting for Multimedia
[Figure, embedding: the fingerprint 101101 … of customer Eve is embedded in the original media (embed, compress) and the content is sold]
[Figure, fingerprint tracing: a candidate fingerprint 101101 … is extracted from the suspicious copy and searched in the database, which returns customer Eve]
Collusion Scenarios
[Figure: averaging attack and interleaving attack]
• Result of collusion: fingerprint energy decreases
• Jointly design the encoding and embedding of fingerprints
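The loss of fingerprint energy under an averaging attack can be seen in a small sketch. The four orthogonal ±1 fingerprints, the constant host, and the non-blind correlation detector are all illustrative assumptions, as are the function names.

```python
def average_collusion(copies):
    """Sample-wise average of the colluders' fingerprinted copies."""
    return [sum(samples) / len(copies) for samples in zip(*copies)]

def correlate(a, b):
    """Inner product used as the detection statistic."""
    return sum(x * y for x, y in zip(a, b))

# Four mutually orthogonal +/-1 fingerprints (hypothetical users 1 to 4).
FINGERPRINTS = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1, 1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]
HOST = [10.0] * 8  # stand-in for the original media

# Users 1 to 3 collude by averaging their copies; user 4 is innocent.
copies = [[h + w for h, w in zip(HOST, fp)] for fp in FINGERPRINTS[:3]]
colluded = average_collusion(copies)
residual = [c - h for c, h in zip(colluded, HOST)]

single_stat = correlate(FINGERPRINTS[0], FINGERPRINTS[0])   # no collusion
colluder_stats = [correlate(residual, fp) for fp in FINGERPRINTS[:3]]
innocent_stat = correlate(residual, FINGERPRINTS[3])
```

With three colluders, each colluder's statistic falls to one third of the single-user value while the innocent user's stays near zero, so detection gets harder as more users collude.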
16-bit ACC Example for Detecting ≤ 3 Colluders
• User-1: ( -1, -1, -1, -1, 1, 1, 1, 1, …, 1 )
• User-4: ( -1, 1, 1, 1, 1, 1, …, -1, 1, 1, 1 )
• Collude by averaging
• Extracted fingerprint code: ( -1, 0, 0, 0, 1, …, 0, 0, 0, 1, 1, 1 )
• Uniquely identifies Users 1 & 4
Anti-Collusion Fingerprint Codes
• Simplified assumption
  • Assume fingerprint codes follow a logical-AND operation in colluded images
• K-resilient AND ACC code
  • A binary code C = {c1, c2, …, cn} such that the logical AND of any subset of K or fewer codevectors is non-zero and distinct from the logical AND of any other subset of K or fewer codevectors
  • Example: {(1110), (1101), (1011), (0111)}
• ACC code via combinatorial design
  • Balanced Incomplete Block Design (BIBD)
• Simple example: ACC code via (7,3,1) BIBD for handling up to 2 colluders among 7 users
• To explore further, see Trappe-Wu-Liu paper (2001).
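The K-resilience definition can be checked by brute force on the four-codevector example. This is a minimal sketch with hypothetical helper names; it simply enumerates every subset of K or fewer codevectors.

```python
from itertools import combinations

def logical_and(codewords):
    """Bitwise AND across a group of equal-length binary codevectors."""
    return tuple(int(all(bits)) for bits in zip(*codewords))

def is_and_acc(code, k):
    """True if the AND of every subset of size <= k is non-zero and unique."""
    seen = set()
    for size in range(1, k + 1):
        for subset in combinations(code, size):
            result = logical_and(subset)
            if not any(result) or result in seen:
                return False
            seen.add(result)
    return True

EXAMPLE = [(1, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1)]

resilient_to_3 = is_and_acc(EXAMPLE, 3)
resilient_to_4 = is_and_acc(EXAMPLE, 4)  # AND of all four is 0000
```

Each codevector here has a single 0 in its own position, so the zeros in a colluded code point directly at the colluders; the cost is one bit per user, which motivates the shorter BIBD-based codes.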
Anti-Collusion Fingerprint Codes (cont’d)
• A (v,k,λ)-BIBD code is a (k−1)-resilient ACC
  • Defined as a pair (X,A)
    • X is a set of v points
    • A is a collection of blocks of X, each with k points
    • Every pair of distinct points is in exactly λ blocks
  • # blocks: $n = \lambda (v^2 - v) / (k^2 - k)$
• Example: (7,3,1) BIBD code
  • X = {1,2,3,4,5,6,7}
  • A = {123, 145, 167, 246, 257, 347, 356}
• Code length for n = 1000 users
  • This code: $O(n^{0.5})$, i.e., dozens of bits
  • Prior art by Boneh-Shaw: $O((\log n)^4)$, i.e., thousands of bits
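The (7,3,1) example can be turned into codevectors and checked directly. The sketch assumes the usual construction in which user j's codevector is the bit-complement of column j of the point-block incidence matrix (a 0 at each point of block j); the slide does not spell this out, and the names below are hypothetical.

```python
from itertools import combinations

POINTS = range(1, 8)
BLOCKS = ["123", "145", "167", "246", "257", "347", "356"]

def pair_counts(blocks):
    """How many blocks contain each pair of distinct points."""
    counts = {pair: 0 for pair in combinations(POINTS, 2)}
    for block in blocks:
        for pair in combinations(sorted(int(c) for c in block), 2):
            counts[pair] += 1
    return counts

def codeword(block):
    """Assumed construction: 0 at the block's points, 1 elsewhere."""
    return tuple(0 if str(p) in block else 1 for p in POINTS)

CODE = [codeword(b) for b in BLOCKS]  # one 7-bit codevector per user

def collude(users):
    """Logical AND of the colluders' codevectors (users are 0-based)."""
    return tuple(int(all(CODE[u][i] for u in users)) for i in range(len(CODE[0])))

# Map every colluded code from 1 or 2 users back to the users behind it.
TABLE = {collude(s): s
         for size in (1, 2)
         for s in combinations(range(len(BLOCKS)), size)}

traced = TABLE[collude((0, 3))]  # users holding blocks 123 and 246
```

The table has one entry per subset, so any single user or pair is identified uniquely from the colluded code, using 7 bits for 7 users here; the saving over one bit per user only appears for larger designs.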
Efficient Collusion Detection for Orth. Mod.
[Figure: Lenna fingerprinted with U1; binary search tree over users U1 to U8 with detection statistics]
  $T_N \mid U_{1\sim4} = 20.75$, $T_N \mid U_{5\sim8} = -0.22$
  $T_N \mid U_{1,2} = 29.53$, $T_N \mid U_{3,4} = -0.85$
  $T_N \mid U_1 = 42.72$, $T_N \mid U_2 = -0.59$
• Fewer correlations are needed: considerable reductions in computation!
EffDet(y, S):
  Break S into S0 and S1
  for each Sj: if the detection statistic of y for Sj exceeds the threshold then
    if |Sj| = 1 then output Sj, else EffDet(y, Sj)
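The recursive search can be sketched as follows. The sketch assumes orthogonal ±1 fingerprints taken from a Hadamard matrix, a host-removed test signal y, and a group statistic equal to the correlation of y with the normalized sum of the group's fingerprints; the threshold value and all names are illustrative, not the lecture's.

```python
import math

def hadamard(n: int) -> list[list[int]]:
    """Sylvester construction of an n x n +/-1 matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h

def group_statistic(y, fingerprints, group):
    """Correlate y with the normalized sum of the group's fingerprints."""
    summed = [sum(fingerprints[u][i] for u in group) for i in range(len(y))]
    norm = math.sqrt(sum(s * s for s in summed))
    return sum(a * b for a, b in zip(y, summed)) / norm

def eff_det(y, fingerprints, group, threshold):
    """Split the group in half and recurse only into halves that respond."""
    half = len(group) // 2
    found = []
    for sub in (group[:half], group[half:]):
        if not sub:
            continue  # only happens if the top-level group has one user
        if group_statistic(y, fingerprints, sub) > threshold:
            if len(sub) == 1:
                found.extend(sub)
            else:
                found.extend(eff_det(y, fingerprints, sub, threshold))
    return found

FPS = hadamard(8)                 # fingerprints of users 0..7
USERS = list(range(8))

single = eff_det([float(v) for v in FPS[0]], FPS, USERS, threshold=0.5)
averaged = [(a + b) / 2 for a, b in zip(FPS[0], FPS[5])]
pair = eff_det(averaged, FPS, USERS, threshold=0.5)
```

For a single guilty user among eight, the search evaluates six group statistics instead of eight individual correlations, and the gap widens as the number of users grows; the statistic also grows toward the leaves, in line with the 20.75, 29.53, 42.72 progression in the figure.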