UNIVERSAL COUNTER-FORENSICS METHODS FOR FIRST ORDER STATISTICS
M. Barni, M. Fontani, B. Tondi, G. Di Domenico
Dept. of Information Engineering, University of Siena (IT)
Outline
• MultiMedia Forensics & Counter-Forensics
• Universal counter-forensics
• Proposed approach
• Application to the pixel domain
• Application to the DCT domain
• Results and discussion
MM Forensics & Counter-Forensics
• MM Forensics:
 • Goal: investigate the history of a MM content
 • Rapidly evolving field, but…
 • Countermeasures are evolving too!
• Counter-Forensics:
 • Goal: edit a content without leaving traces (fingerprints)
REWIND project: www.rewindproject.eu
Forensics & Counter-Forensics
• MM Forensics is evolving rapidly…
• Countermeasures are evolving too!
• Counter-Forensics goal: allow altering a content without leaving traces (fingerprints) [K07]
[K07] M. Kirchner and R. Böhme, “Tamper hiding: Defeating image forensics,” in Information Hiding, ser. Lecture Notes in Computer Science, vol. 4567. Springer, 2007, pp. 326–341.
Universal Counter-Forensics
• General idea:
 • If you know what statistic is used by the analyst,
 • just adapt the statistic of your forgery to be very close to the statistic of “good” sequences:
 • any detector based on that statistic will be fooled!
• Game Theory:
 • This scenario can be seen as a game [B12]: Forensic Analyst vs. Attacker
 • Different games are possible:
  • The adversary directly knows the statistic of the “untouched sequences”
  • The adversary only has a training set of “untouched sequences”
[B12] M. Barni. A game theoretic approach to source identification with known statistics. In Proc. of ICASSP 2012, IEEE Int. Conference on Acoustics, Speech, and Signal Processing, 2012.
Outline of the scheme
• Fool a detector = force it to misclassify
• Approach: make the statistics of the processed image close to those of (an) untouched image
• If it’s close enough… the detector must make either a false-positive or a false-negative error
• Assumptions:
 • The analyst’s detector relies only on first-order statistics
 • The adversary has a database (DB) of histograms of untouched images
• So the adversary:
 1. Processes the image
 2. Searches the DB for the nearest untouched histogram
 3. Computes a transformation map from one histogram to the other
 4. Applies the transformation, minimizing perceptual distortion
Practical applications
• We show how the proposed method can be used for two different CF tasks:
 • Hiding the traces left by processing operations in the histogram of pixel values
 • Hiding the traces left by double JPEG compression in the histograms of quantized DCT coefficients
• You will notice that switching between domains does not change the scheme, but just the implementation of each “block”
Application #1: Conceal traces in the image histogram
• We propose a method to conceal the traces left by any processing operation in the image histogram
• Many detectors based on histogram analysis exist:
 • Detection of contrast enhancement (pixel histogram) [S08]
 • Detection of double JPEG compression (histograms of DCT coefficients) [BP12]
• We make no assumptions on the previous processing
[S08] M. C. Stamm and K. J. R. Liu. Blind forensics of contrast enhancement in digital images. In Proc. of ICIP 2008, pages 3112–3115, 2008.
[BP12] T. Bianchi, A. Piva, “Image Forgery Localization via Block-Grained Analysis of JPEG Artifacts,” IEEE Transactions on Information Forensics & Security, vol. 7, no. 3, pp. 1003–1017.
Basic notation
• Y and hY denote the processed image and its histogram
• X and hX denote the untouched image and its histogram
• Z and hZ denote the attacked image and its histogram
• Γ denotes the set of histograms (in the database) respecting possible constraints imposed by the attacker (e.g., retaining a minimum contrast)
• ν* always denotes the normalized version of the histogram h*
Phase 1: histogram retrieval
• Goal: search a database of untouched image histograms to find hX such that:
 • it has the most similar shape w.r.t. hY
 • it belongs to Γ
• We propose to use the Chi-square distance, defined as $d_{\chi^2}(\nu, \nu') = \sum_i \frac{(\nu(i) - \nu'(i))^2}{\nu(i) + \nu'(i)}$
• Therefore, the retrieved histogram is $h_X = \arg\min_{h \in \Gamma} d_{\chi^2}(\nu_Y, \nu_h)$
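A minimal numpy sketch of this retrieval step (function names are illustrative, not from the paper):

```python
import numpy as np

def chi_square(p, q):
    # Chi-square distance between normalized histograms;
    # bins where both histograms are empty contribute nothing.
    d = p + q
    m = d > 0
    return np.sum((p[m] - q[m]) ** 2 / d[m])

def retrieve(h_y, db):
    """Return the database histogram closest in shape to h_y.

    h_y : histogram of the processed image (length-256 array)
    db  : iterable of untouched-image histograms, already filtered
          to satisfy the attacker's constraints (the set Gamma)
    """
    nu_y = h_y / h_y.sum()
    return min(db, key=lambda h: chi_square(nu_y, h / h.sum()))
```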
Phase 2: histogram mapping
• Goal: find the best mapping matrix that turns $h_Y$ into $h_X$
• $n_{ij}$ = number of pixels to be moved from value $i$ to value $j$
• A maximum distortion constraint is given, that avoids changes bigger than $D_{max}$ in the value of a pixel: $n_{ij} = 0$ whenever $|i - j| > D_{max}$
• We choose the Kullback-Leibler divergence to measure the statistical dissimilarity between the histograms, yielding the following optimization problem:
 $\min_{\{n_{ij}\}} D_{KL}(\nu_Z \,\|\, \nu_X)$ s.t. $\sum_j n_{ij} = h_Y(i)$, $n_{ij} \ge 0$, $n_{ij} = 0$ for $|i - j| > D_{max}$, where $h_Z(j) = \sum_i n_{ij}$
• The objective is convex, but the integer variables $n_{ij}$ make this a Mixed Integer Non-Linear Problem
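Solving the exact program needs a MINLP solver; purely to illustrate the constraint structure, here is a toy greedy heuristic (not the paper's solver) that fills the target histogram while respecting the row-sum and D_max constraints:

```python
import numpy as np

def greedy_mapping(h_y, h_x, d_max):
    """Toy stand-in for the MINLP: build n[i, j] = #pixels moved i -> j.

    Greedily pours the mass of h_y into the (rescaled) target h_x,
    preferring the closest destination value and never moving a pixel
    by more than d_max levels.  A real implementation would minimize
    the KL divergence with a MINLP solver instead.
    """
    target = np.round(h_x / h_x.sum() * h_y.sum()).astype(int)
    room = target.copy()                # capacity left in each target bin
    n = np.zeros((256, 256), dtype=int)
    for i in range(256):
        remaining = int(h_y[i])
        candidates = sorted(range(max(0, i - d_max), min(256, i + d_max + 1)),
                            key=lambda j: abs(j - i))
        for j in candidates:
            moved = min(remaining, int(room[j]))
            n[i, j] += moved
            room[j] -= moved
            remaining -= moved
            if remaining == 0:
                break
        n[i, i] += remaining            # unplaceable mass stays at value i
    return n
```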
Phase 3: pixel remapping
• We have the mapping matrix, but which specific pixels should be changed?
• Intuition: editing pixels in textured/high-variance regions causes smaller perceptual impact
• We propose an iterative approach: for each pair (i, j)
 1. Evaluate the SSIM map between Z and Y
 2. Find the pixels having value i, and:
  • scan these pixels by decreasing SSIM, changing the first n(i,j) of them to j
  • mark edited pixels as “unchangeable”, repeat 2. for (i, j+1)
 3. If no more pixels of value i have to be remapped, repeat from 1. with (i+1, j)
• Remarks (a code sketch follows below):
 • The SSIM map is evaluated iteratively, to take into account on-going modifications
 • The obtained image will have, by construction, the desired histogram
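A condensed sketch of this phase, assuming 8-bit grayscale numpy images and using scikit-image for the SSIM map; the loop structure is simplified with respect to the slide but keeps the iterative SSIM update:

```python
import numpy as np
from skimage.metrics import structural_similarity

def pixel_remapping(y, n):
    """Turn image y into z realizing the mapping matrix n[i, j].

    For each (i, j), pixels of value i are scanned by decreasing local
    SSIM, so the locations least affected so far absorb the next edits.
    The SSIM map is recomputed as edits accumulate, as in the iterative
    scheme on the slide above.
    """
    z = y.copy()
    locked = np.zeros(y.shape, dtype=bool)       # "unchangeable" pixels
    for i in range(256):
        for j in range(256):
            k = int(n[i, j])
            if k == 0 or i == j:
                continue
            # local SSIM map between the current z and the original y
            _, ssim_map = structural_similarity(z, y, full=True,
                                                data_range=255)
            cand = np.argwhere((z == i) & ~locked)
            order = np.argsort(-ssim_map[cand[:, 0], cand[:, 1]])
            for r, c in cand[order][:k]:
                z[r, c] = j
                locked[r, c] = True
    return z
```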
Advantage of iterative remapping
• If the SSIM map is not iteratively computed, visible artifacts are likely to appear…
(Figure: attacked image with vs. without the iterative update)
Experimental validation
• We use the proposed technique to hide the traces left by:
 • Gamma-correction
 • Histogram stretching (equalization)
• Both these operators leave strong traces in the image histogram
(Figure: original, gamma-corrected, and equalized histograms)
Case study (block diagram): original image → processed image (gamma-correction) → search of the histogram database → best match (histogram from DB) → histogram remapping → remapped image and its resulting histogram
(Figure: histogram before counter-forensics, best-match DB histogram, and histogram after counter-forensics, obtained with Dmax = 4)
Histogram enhancement detection
• Stamm’s detector [S08]:
 • it detects the peak-and-gap behavior of the histogram
 • this is done by considering the contribution of high frequencies in the Fourier transform of the histogram
(Figure: histograms of original, gamma-corrected, and equalized images)
[S08] M. C. Stamm and K. J. R. Liu. Blind forensics of contrast enhancement in digital images. In Proc. of ICIP 2008, pages 3112–3115, 2008.
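A deliberately simplified score in the spirit of [S08] (the actual detector uses a specific high-pass weighting and decision threshold; the cutoff below is an arbitrary illustrative choice):

```python
import numpy as np

def hf_energy(img, cutoff=32):
    """Crude contrast-enhancement score: energy of the histogram's
    Fourier spectrum above a cutoff frequency.  Peak-and-gap
    histograms (typical after gamma correction or equalization)
    score high; smooth histograms score low."""
    h = np.bincount(img.ravel(), minlength=256).astype(float)
    H = np.abs(np.fft.fft(h / h.sum()))
    # keep only the "high" frequency indices cutoff .. 256-cutoff
    return H[cutoff:256 - cutoff].sum()

# usage: flag the image if hf_energy(image) exceeds a learned threshold
```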
Dataset & experiment setup
• Database of untouched histograms from 25,000 JPEG images (MIRFLICKR dataset). Total weight: ~10 MB
• Apply gamma-correction and histogram equalization to 1300 images from the UCID dataset
• Each processed image is “attacked” with the proposed technique, using {2, 4, 6} as values for the Dmax constraint
• We constrain the database search to histograms whose contrast is not smaller than that of the enhanced image (this is our Γ)
• We evaluate the performance of Stamm’s detector in distinguishing:
 • processed vs. untouched images
 • processed&attacked vs. untouched images
• We evaluate the similarity between attacked and processed images using:
 • PSNR (a “mathematical” metric)
 • Structural Similarity Index (a “perceptual” metric) [W04]
Experimental results
• Results in countering the detection of gamma-correction
(Figure: detection performance and attacked–processed distance)
Experimental results
• Results in countering the detection of histogram equalization
(Figure: detection performance and attacked–processed distance)
Application #2: Conceal traces in DCT histograms
• Method to conceal the traces left by double compression in the histograms of quantized DCT coefficients
• A huge number of detectors exploit double quantization, e.g.:
 • Estimation of the previous compression [P08]
 • Forgery detection [H06]
[P08] T. Pevny and J. Fridrich, “Estimation of primary quantization matrix for steganalysis of double-compressed JPEG images,” Proceedings of SPIE, vol. 6819, pp. 681911-1–681911-13, 2008.
[H06] J. He, Z. Lin, L. Wang, and X. Tang, “Detecting doctored JPEG images via DCT coefficient analysis,” in Lecture Notes in Computer Science. Springer, 2006, pp. 423–435.
Double Quantization
• DQ is a sequence of three steps:
 1. quantization with step b
 2. de-quantization with step b
 3. quantization with step a
(Figure: histogram with the characteristic gaps)
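A small numpy demo of the effect (b = 5, a = 3 are arbitrary illustrative steps; since the first quantization is stronger than the second, some output bins become unreachable, producing the characteristic gaps):

```python
import numpy as np

b, a = 5, 3                        # first and second quantization steps
x = np.arange(-50, 51)             # toy "DCT coefficient" values
dq = np.round(np.round(x / b) * b / a).astype(int)   # the three DQ steps
hist = np.bincount(dq - dq.min())
print(np.nonzero(hist == 0)[0] + dq.min())   # empty bins: the gaps
```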
More on DQ…
• Why is it interesting?
 • It allows forgery detection
 • It tells something about the history of the content (e.g. the fake-quality problem)
• NOTICE:
 • The effect is visible when the first quantization is stronger than the second
 • The behavior is observed in the histogram of quantized DCT coefficients
 • If JPEG compression has been carried out, holes are always present in the histogram of de-quantized coefficients
More on DCT histograms…
• Double JPEG compression leaves a trace in the histogram of each DCT coefficient
• How is this histogram calculated?
• Intuition: split the image into 8×8 blocks, compute the block-wise DCT, then collect, for each of the 64 frequencies, the histogram of that coefficient across all blocks
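A sketch of this computation (scipy-based; image dimensions are assumed to be multiples of 8 for brevity):

```python
import numpy as np
from scipy.fftpack import dct

def dct_histograms(img, q=None):
    """Per-frequency DCT coefficient values over all 8x8 blocks.

    img: grayscale image whose dimensions are multiples of 8
    q:   optional 8x8 quantization matrix; if given, coefficients are
         quantized before being collected
    Returns {(u, v): 1-D array of that coefficient across all blocks}.
    """
    h, w = img.shape
    blocks = img.astype(float).reshape(h // 8, 8, w // 8, 8).swapaxes(1, 2)
    coeffs = dct(dct(blocks, axis=-2, norm='ortho'), axis=-1, norm='ortho')
    if q is not None:
        coeffs = np.round(coeffs / q)
    return {(u, v): coeffs[:, :, u, v].ravel()
            for u in range(8) for v in range(8)}
```

Binning each returned array (e.g. with np.histogram) gives the 64 per-frequency histograms whose gaps the detectors look for.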
Perception in the DCT domain
• Goal: understand the relationship between changes in the DCT domain and effects in the spatial domain
• Just Noticeable Difference (JND) => minimum amount of change in a coefficient leading to a visible artifact
• Watson defined the JND for the DCT case, taking into account Human Visual System (HVS) properties:
 • More sensitive to low frequencies
 • Luminance masking: brighter blocks can be changed more
 • Contrast masking: more contrast allows more editing
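For reference, Watson's masking model as commonly reported in the literature (the exponents are the usual published values and should be checked against Watson's original paper): luminance masking scales the baseline sensitivity $t(u,v)$ by the block's DC level, $t_L(u,v,k) = t(u,v)\,\big(C_{0,0,k}/\bar{C}_{0,0}\big)^{0.649}$, and contrast masking then gives the per-block slack $s(u,v,k) = \max\!\big(t_L(u,v,k),\; |C(u,v,k)|^{0.7}\, t_L(u,v,k)^{0.3}\big)$, where $C_{0,0,k}$ is the DC coefficient of block $k$ and $\bar{C}_{0,0}$ the mean DC coefficient of the image.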
What we want to do
• In this case, the traces are left in the DCT histograms of quantized coefficients…
• We must change these histograms, to make them similar to those of a singly-compressed image!
• We need to revisit the previous application to adapt it to the DCT domain:
 • more histograms (64 instead of 1)
 • more variables (coefficients vary from -1024 to 1016)
 • less intuitive remapping rules…
Histogram retrieval… revisited!
• We need all the DCT histograms of singly-compressed images
• Just take some JPEG images and extract them? NO!
 • DCT histograms depend on the undergone quantization
 • The search would be practically dominated by this fact
• We need to simulate JPEG-compressed images:
 • take the DCT histograms of never-compressed images
 • during the search, quantize each of them with the same step as the query histogram (see the sketch below)
• Distances may be weighted, to give more importance to low-frequency coefficients
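A sketch of the simulation step (hypothetical helper; `centers` holds the coefficient value of each bin of the never-compressed histogram):

```python
import numpy as np

def quantize_histogram(h, centers, q):
    """Simulate the histogram that JPEG compression with step q would
    produce, starting from the histogram of un-quantized coefficients.

    h:       counts over un-quantized coefficient values
    centers: coefficient value of each bin of h
    q:       quantization step of the query histogram
    """
    labels = np.round(centers / q).astype(int)   # bin -> quantized value
    out = np.zeros(labels.max() - labels.min() + 1, dtype=h.dtype)
    np.add.at(out, labels - labels.min(), h)     # merge bins that collide
    return out
```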
Histogram mapping… revisited!
• The problem is the very same, repeated 64 times
• Problem: how do we set the perceptual constraint (Dmax)?
• Idea: make it depend on the JNDs => allow at most the amount of change leading to one JND
 • Here we cannot exploit local information (luminance/contrast)
• Notice:
 • we’re working on quantized coefficients!
 • changes will be expanded after de-quantization!
 • => Watson’s matrix must be divided by the quantization step
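In formula form (our paraphrase of the bullets above): for frequency $(u,v)$, $D_{max}(u,v) \approx t(u,v)/q(u,v)$, where $t$ is Watson's sensitivity matrix and $q$ the quantization table, so that a change of $D_{max}$ quantized levels expands to about one JND after de-quantization.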
Pixel remapping… revisited!
• We have to move some DCT coefficients from one value to another… how do we choose them?
• We exploit the Watson model again
• This time, we can exploit local information too
• Algorithm:
 1. Evaluate the JND for all blocks
 2. For each element n(i,j):
  • find the coefficients having value i, and:
  • scan these coefficients by decreasing JND, changing the first n(i,j) of them to j
  • mark edited coefficients as “unchangeable”, repeat 2. for (i, j+1)
 3. If no more coefficients of value i have to be remapped, repeat from 2. with (i+1, j)
Does it work so smoothly?
• No, it doesn’t
• Artifacts show up, probably due to the high number of changed coefficients at high frequencies
• Possible solutions:
 • consider the joint impact of changes in more than one frequency
 • anything else? [open question!]
• However, most detectors rely on low-frequency coefficients
• We ran some experiments remapping only the first 16 coefficients (in zig-zag ordering)
Experimental setup: detector
• We implement a detector for double compression based on calibration
• Calibration allows estimating the original distribution of a quantized signal
• Basic idea with JPEG:
 • crop a small number of rows/columns
 • compute the 8×8 DCT and the histograms
(Figure: histogram read from file vs. histogram estimated by calibration)
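A sketch of the calibration step (the 4-pixel crop is a common choice in the calibration literature, not necessarily the one used here; it reuses the hypothetical dct_histograms helper sketched earlier):

```python
import numpy as np

def calibrated_histograms(img, q, crop=4):
    """Estimate the pre-quantization DCT histograms of a JPEG image.

    Cropping `crop` rows/columns desynchronizes the 8x8 block grid,
    so the recomputed DCT coefficients behave like un-quantized ones.
    Reuses dct_histograms() from the earlier sketch.
    """
    h, w = img.shape
    shifted = img[crop:crop + (h - crop) // 8 * 8,
                  crop:crop + (w - crop) // 8 * 8]
    return dct_histograms(shifted, q)   # re-quantize with the same table
```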
Experimental setup: method
• 200 TIFF (never-compressed) images
• The experiment consists of evaluating the detector performance before and after the counter-attack
• The detector is evaluated in these tasks:
 • discriminate single- vs. double-compressed images
 • discriminate single-compressed vs. attacked images
• We do not want to cheat:
 • i.e., we do not use the threshold values from the first experiment to do classification in the second
Experimental results
• Mean SSIM: 0.968
• Mean PSNR: 42.9 dB
Conclusions
• Our universal CF methods conceal the traces left by any processing operation in first-order statistics
• Evaluation of their effectiveness should probably rely on statistical measures rather than on detectors
• Future work:
 • explore connections with Optimal Transportation theory
 • explore the use on un-quantized DCT coefficients (conceal the traces of a single compression)
 • develop an integrated method to re-compress an image without leaving traces
 • explore the use of different objective functions for the histogram mapping problem
Thank you. Questions?
Acknowledgments: this work has been supported by the REWIND project.