
Jessica Fridrich, Jan Kodovský, Miroslav Goljan, Vojtěch Holub

Breaking HUGO – the Process Discovery, presented jointly with Steganalysis of Content-Adaptive Steganography in Spatial Domain.


Presentation Transcript


  1. Breaking HUGO – the Process Discovery, presented jointly with Steganalysis of Content-Adaptive Steganography in Spatial Domain. Jessica Fridrich, Jan Kodovský, Miroslav Goljan, Vojtěch Holub

  2. Are there “issues” with adaptive stego?
     • Content-adaptive embedding ⇒ leakage about the placement of embedding changes.
     • Is HUGO’s probabilistically known selection channel a problem?
     • Why should it be a problem?
     • It is all about how well we can model the content.
     • Honestly, fellow BOSS competitors, you all started here, didn’t you?
     Fridrich, Kodovský, Holub, Goljan

  3. Probability of embedding change … can be estimated from the stego image fairly well.
     [Figure panels: cover, estimated, actual changes, true]

  4. Complex texture of 512×512 images
     [Figure panels: 512×512 image, 4MP image]

  5. Look at what HUGO did …
     Seven images from BOSSrank can be detected visually as stego images.
     [Figure: BOSSrank image No. 235 and a close-up of its LSB plane]

  6. Weighted-Stego attack for HUGO?
     Assume that we can estimate …
     Problem: E[c] varies strongly with content and cannot be easily thresholded or calibrated, even though E[c] < E[s] in general (sometimes by as much as 60%, but on average by only 1.74%).

  7. Pixel domain is not useful, right?
     • HUGO approximately preserves ~10^7 statistics computed from neighboring pixels. Intimidating, isn’t it?
     • Forget the pixel domain, go to a different domain. Wavelet perhaps?
     • Brushed the dust off WAM, put it on steroids, whacked HUGO with it.
     What we tried:
     • added moments from the LL band to inform the steganalyzer about content (makes sense for content-adaptive stego)
     • added the same feature vector from a re-embedded image (relying on the “saturation effect” with re-embedding)
     • replaced the Wiener filter in WAM with an adaptive filter based on the estimated probability of change
     BOSSrank score: 59%

  8. Go back to pixel domain!
     • Your best chances for detection are in the embedding domain.
     • Compute the residual $r_{ij} = x_{ij} - \hat{x}_{ij}$, where $\hat{x}_{ij}$ is an estimator of $x_{ij}$ from its local neighborhood.
     • Advantages of computing detection statistics from $r_{ij}$:
       • narrower dynamic range
       • image content suppressed
       • higher SNR between stego signal and noise
     • Undoubtedly, the best estimator of $x_{ij}$ is $x_{ij}$ itself. However, $\hat{x}_{ij}$ should not depend on $x_{ij}$, to avoid a biased estimate (this is why denoising filters do not work well).

  9. Higher-order local models (HOLMES)
     • HUGO approximately preserves the joint distribution of three 1st-order differences among four neighboring pixels.
     • We need to get out of HUGO’s model:
       • Use four or more differences – the cooc dimension grows too fast, and bins in the coocs become empty or underpopulated.
       • Use higher-order differences – they “see” beyond 4 pixels.
     • The SPAM feature set uses 1st-order differences, which corresponds to a locally constant model.
     [Figure: constant model, linear model, quadratic model, …]
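
The link between difference order and local model can be checked on toy data. The sketch below is only an illustration, not the authors' code: `finite_difference` is a hypothetical helper, and the pixel rows are made up. A 1st-order difference vanishes on constant content, a 2nd-order difference on linear content, and a 3rd-order difference on quadratic content. Up to sign and scale, the 2nd-order difference x[j-1] − 2x[j] + x[j+1] is the residual of predicting x[j] as the mean of its two neighbors.

```python
def finite_difference(row, order):
    """Return the order-th forward finite difference of a 1-D pixel row."""
    out = list(row)
    for _ in range(order):
        # Each pass replaces the row by differences of adjacent samples.
        out = [b - a for a, b in zip(out, out[1:])]
    return out

# Made-up pixel rows: a linear ramp and a parabola.
ramp = [10, 12, 14, 16, 18, 20]
parabola = [j * j for j in range(6)]

first_on_ramp = finite_difference(ramp, 1)        # constant model leaves the slope
second_on_ramp = finite_difference(ramp, 2)       # linear model removes the ramp
second_on_parabola = finite_difference(parabola, 2)
third_on_parabola = finite_difference(parabola, 3)  # quadratic model removes the parabola
```

On content that follows the model, the residual collapses to zero, so whatever is left over is mostly noise plus any embedding changes.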

  10. Higher-order local models, cont’d
      • HUGO is likely to embed here even though the content is modelable in the vertical direction.
      • However, pixel differences will mostly fall in the marginal.
      • Linear or quadratic models bring the residual back inside the cooc matrix.
      [Figure: image with many edges and an edge close-up]

  11. Quantize and truncate
      Before computing the coocs, the residual is first quantized and then truncated: $r_{ij} \leftarrow \mathrm{trunc}_T(\mathrm{round}(r_{ij}/q))$.
      Note that we marginalize instead of cutting. The marginals (bins at the boundary) are very important!
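
A minimal sketch of the difference between marginalizing and cutting, assuming round-half-up quantization (the slide does not state the rounding rule) and made-up residual values. The function names are hypothetical.

```python
import math

def quantize_truncate(r, q, T):
    """Quantize residual r with step q, then clamp the result to [-T, T]."""
    v = math.floor(r / q + 0.5)   # round half up; the rounding rule is an assumption
    return max(-T, min(T, v))     # marginalize: out-of-range values land in the boundary bins

def cut(values, q, T):
    """For contrast: discard out-of-range values instead of marginalizing them."""
    quantized = [math.floor(v / q + 0.5) for v in values]
    return [v for v in quantized if -T <= v <= T]

residuals = [-9, -1, 0, 2, 15]    # made-up residual values
marginalized = [quantize_truncate(r, q=1, T=4) for r in residuals]
kept_after_cut = cut(residuals, q=1, T=4)
```

With marginalization all five samples still contribute to the statistics (the two outliers populate the bins at −4 and 4), whereas cutting keeps only three of them.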

  12. First successful features
      Take min/max of 2nd-order residuals in 4 directions. Features are two 3D cooc matrices:
      • MINMAX: T = 4, q = 1, dim = 2×(2T+1)^3 = 1458
      • QUANT: T = 4, q = 2, dim = 1458
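
The following sketch shows one way such features could be computed. Several details are assumptions, since the slide's formula is not preserved: the four directions are taken as horizontal, vertical and the two diagonals, the cooc is collected over horizontally adjacent triples only, rounding is half up, and all function names are hypothetical.

```python
import math
from collections import Counter

def minmax_residuals(img):
    """Per-pixel min and max of four directional 2nd-order residuals (interior pixels only)."""
    h, w = len(img), len(img[0])
    mins, maxs = [], []
    for i in range(1, h - 1):
        row_min, row_max = [], []
        for j in range(1, w - 1):
            c2 = 2 * img[i][j]
            r = (
                img[i][j - 1] + img[i][j + 1] - c2,          # horizontal
                img[i - 1][j] + img[i + 1][j] - c2,          # vertical
                img[i - 1][j - 1] + img[i + 1][j + 1] - c2,  # diagonal
                img[i - 1][j + 1] + img[i + 1][j - 1] - c2,  # minor diagonal
            )
            row_min.append(min(r))
            row_max.append(max(r))
        mins.append(row_min)
        maxs.append(row_max)
    return mins, maxs

def cooc3(res, q, T):
    """3D co-occurrence counts of quantized, truncated residuals over horizontal triples."""
    counts = Counter()
    for row in res:
        z = [max(-T, min(T, math.floor(v / q + 0.5))) for v in row]
        for triple in zip(z, z[1:], z[2:]):
            counts[triple] += 1
    return counts

def feature_dim(T, order=3, n_matrices=2):
    """Number of bins: n_matrices cooc matrices with (2T+1)^order bins each."""
    return n_matrices * (2 * T + 1) ** order

flat = [[7] * 6 for _ in range(5)]   # made-up constant 5x6 image
mins, maxs = minmax_residuals(flat)
cooc_min = cooc3(mins, q=1, T=4)
cooc_max = cooc3(maxs, q=1, T=4)
```

`feature_dim(4)` reproduces the 1458 quoted on the slide; on the constant test image every triple falls into the (0, 0, 0) bin.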

  13. Encouraging results
      Early October
      • Features: MINMAX, dim = 1458; training database: 2×9074 BOSSbase 0.91; classifier: FLD; BOSSrank: 71%
      • Features: MINMAX+QUANT, dim = 2916; training database: 2×9074 BOSSbase 0.91; classifier: G-SVM; BOSSrank: 73%

  14. Unexpected stego-source mismatch
      BOSSbase 0.91 was prepared with HUGO parameter values 4 and 10, BOSSrank with 1. BOSSbase 0.92 was embedded with 1. Retraining our classifier on the correct stego database gave:
      October 14
      • Features: MINMAX+QUANT, dim = 2916; training database: 2×9074 BOSSbase 0.92; classifier: G-SVM; BOSSrank: 75%

  15. Do not say “hop” before you jump
      [Plot: BOSSrank score (axis 74–79) from Oct 14 to Nov 13, labeled “Hugobreakers’ frustration”]
      This is when BOSS became GOSS: “Guess Our Steganographic Source”

  16. The dreaded cover-source mismatch
      • The tell-tale symptom of the mismatch: adding more features improved the score on BOSSbase but worsened the BOSSrank score.
      • The problem: we trained on one source but tested on another (different) source. Our detector lacked robustness.
      • Note that this is an issue of robustness rather than overtraining.
      • Well recognized in detection and estimation.
      • A very difficult problem, as the mismatch can take so many different forms.

  17. Trying to resolve the CSM
      a) Train on a more diverse source (adding 6000 images to BOSSbase lowered the BOSSrank score – making the mismatch worse?).
      b) Use classifiers with a simpler decision boundary, such as L-SVM (the same problem and lower accuracy).
      c) Contaminate the training set with BOSSrank images:
         - add denoised BOSSrank images as covers (using adaptive denoising based on estimated probabilities)
         - add re-embedded BOSSrank images as stego
         (We were unable to obtain consistent results with contamination when experimenting with BOSSbase, so we decided to toss it.)
      d) Find out more about the cover source:
         - estimate resampling artifacts, from which we could obtain info about the original image size (no artifacts detected by Farid’s code)
         - extract fingerprints from the BOSSbase cameras, detect them in images from BOSSrank, and train on images from the right source.

  18. Forensic analysis of BOSSrank
      • Fingerprints were extracted from all 7 BOSSbase cameras and detected in BOSSrank.
      • ~500 images tested positive for the Leica M9; no other camera tested positive.
      [Plot: PCE vs. BOSSrank images; series: Leica, Rebel]

  19. Forensic analysis of BOSSrank, cont’d
      Most images were taken in the Pacific Northwest.

  20. Forensic analysis of BOSSrank, cont’d
      • Fingerprint extracted from 25 JPEG images from Tomas Filler’s camera (Panasonic Lumix DMC-FZ50), taken previously at SPIE conferences and resized to 512×512 using the same script.
      • Positively identified in ~77 BOSSrank images.
      • Could not be used for BOSS, as other competitors did not have this opportunity.
      • We closed our investigation with ~50% of the images attributed to the Leica and the rest declared unknown.
      [Plot: PCE vs. BOSSrank images]

  21. Forensic-aided steganalysis
      • Option #1: Buy a Leica M9 and generate our own database. Oops … the price is $7,000!!
      • Option #2: LensRentals.com, rent it for a week. Took 7,301 images with the Leica M9.
      • Experiment #1: Train two classifiers, one trained only on Leica images to analyze only Leica images, and one trained on all images to analyze the rest. Merge the prediction files.
      • Experiment #2: Add the Leica images to the BOSSbase database and train on all.
      Result: BOSSrank score either the same or slightly worse. Bummer.

  22. Can a cover source be replicated?
      Cover source is a very complex entity shaped by:
      • Camera and its settings
        • short exposure ⇒ lower dark current
        • high ISO ⇒ increased level of noise
        • stopping the lens at 5.6 ⇒ sharper images than when stopped at 2.0
      • Lens
        • short focus ⇒ low depth of field ⇒ easier for analysis
      • Content
        • Binghamton in the fall is a poor replacement for the French Riviera.
        • Average amount of edges, smooth regions.
      We rented the wrong lens (50 mm); Patrick used 35 mm.

  23. Model diversity is the key
      QUANT, go 4D, use 3rd-order differences (quadratic model), merge.

      Difference order   Cooc.   T   q   dim
      2nd                3D      3   2   686
      3rd                3D      3   2   686
      2nd                4D      2   2   1250
      3rd                4D      2   2   1250

      November 13
      Features: dim = 3872; training database: 2×9074 BOSSbase 0.92; classifier: G-SVM; BOSSrank: 76%
      With increased dimensionality, machine learning became a serious bottleneck.
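
As a consistency check on the table, the dimensions match two cooc matrices per feature subset with $(2T+1)^D$ bins each, the same formula quoted for MINMAX. The factor of 2 for these QUANT subsets is inferred from the numbers and is not stated on the slide.

```latex
\begin{align*}
\text{3D},\ T = 3:&\quad 2\,(2\cdot 3 + 1)^3 = 2 \cdot 343 = 686 \\
\text{4D},\ T = 2:&\quad 2\,(2\cdot 2 + 1)^4 = 2 \cdot 625 = 1250 \\
\text{merged}:&\quad 2 \cdot 686 + 2 \cdot 1250 = 3872
\end{align*}
```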

  24. Ensemble classifier (SVM)
      To facilitate further development, we started using ensemble classifiers instead of SVMs.
      1. Set l = 1.
      2. Randomly select k features out of d, k ≪ d.
      3. Train an FLD on this random subspace on all BOSSbase images, set the threshold to obtain minimum PE, and store the eigenvector e_l.
      4. Make decisions on BOSSrank (f_j is the feature vector of the jth image):
         • f_j · e_l > 0 ⇒ Dec(l, j) = 1 (stego)
         • f_j · e_l < 0 ⇒ Dec(l, j) = 0 (cover)
      5. Repeat steps 2–4 L times to obtain L decisions Dec(1..L, 1..1000) for each test image.
      6. For each image, fuse the decisions by voting.
      Advantages:
      • Low complexity: training on a 9288-dim feature set with 2×17,000 images, L = 31 and k = 1600, takes only 8 minutes on a PC.
      • Performance comparable to SVM.
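
The steps above can be sketched in plain Python as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the threshold is placed at the midpoint between the projected class means instead of being tuned for minimum PE, a small ridge term is added to keep the scatter matrix invertible, and all function names and the synthetic data are hypothetical.

```python
import random
from itertools import product

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def train_fld(cover, stego, ridge=1e-6):
    """Fisher linear discriminant: w = Sw^-1 (m1 - m0), midpoint threshold (assumption)."""
    m0, m1 = mean(cover), mean(stego)
    k = len(m0)
    Sw = [[ridge if a == b else 0.0 for b in range(k)] for a in range(k)]
    for rows, m in ((cover, m0), (stego, m1)):
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, m)]
            for a in range(k):
                for b in range(k):
                    Sw[a][b] += d[a] * d[b]
    w = solve(Sw, [b - a for a, b in zip(m0, m1)])
    bias = -sum(wi * (a + b) / 2 for wi, a, b in zip(w, m0, m1))
    return w, bias

def train_ensemble(cover, stego, L=31, k=2, seed=0):
    """Train L base learners, each on k randomly selected features out of d."""
    rng = random.Random(seed)
    d = len(cover[0])
    learners = []
    for _ in range(L):
        idx = rng.sample(range(d), k)
        sub_cover = [[x[i] for i in idx] for x in cover]
        sub_stego = [[x[i] for i in idx] for x in stego]
        learners.append((idx, *train_fld(sub_cover, sub_stego)))
    return learners

def predict(learners, x):
    """Fuse base decisions by majority vote: 1 = stego, 0 = cover."""
    votes = sum(1 for idx, w, b in learners
                if sum(wi * x[i] for wi, i in zip(w, idx)) + b > 0)
    return 1 if 2 * votes > len(learners) else 0

# Synthetic, well-separated 4-dim features: cover near 0, stego near 1.
cover = [[0.1 * s for s in signs] for signs in product((-1, 1), repeat=4)]
stego = [[1 + v for v in row] for row in cover]
learners = train_ensemble(cover, stego, L=5, k=2, seed=0)
decision_stego = predict(learners, [1, 1, 1, 1])
decision_cover = predict(learners, [0, 0, 0, 0])
```

Each base learner only ever inverts a k×k scatter matrix, which is why the cost stays manageable as the full dimension d grows.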

  25. Scaling up feature dim seemed to work
      Mid November
      Feature set: previous 3872 + 1458 (MINMAX) = 5330; training database: 2×9074 BOSSbase v. 0.92; classifier: ensemble, L = 31, k = 1600; BOSSrank: 77%
      However, adding more features computed from various residuals did not improve the BOSSrank score, despite steady improvement on BOSSbase.

  26. A little more empirical magic …
      Train on 2N images, where N is about 20–50% larger than the feature dimension.
      November 29
      Feature set: 5330 + QUANT4 + SQUARE + KB = 9288; training database: 2×9074 + 2×6500 = 2×15,574; classifier: ensemble, L = 31, k = 1600; BOSSrank: 78%
      • QUANT4
      • SQUARE: + “square” cooc
      • KB (Ker–Böhme) kernel, cooc = H + V:
          -1/4   1/2  -1/4
           1/2    0    1/2
          -1/4   1/2  -1/4
      [Dimension labels on the slide: 2500, 2500, 1458]
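
A minimal sketch of a residual computed with the KB kernel as a neighborhood predictor. The sign convention (pixel minus prediction), the restriction to interior pixels, and the function name `kb_residual` are assumptions made for illustration. Because the kernel weights sum to 1 and are symmetric, any planar (linear) image patch is predicted exactly.

```python
# Ker-Böhme kernel as given on the slide; the center weight is 0,
# so the prediction never uses the pixel being predicted.
KB = [[-0.25, 0.5, -0.25],
      [0.5,   0.0,  0.5],
      [-0.25, 0.5, -0.25]]

def kb_residual(img):
    """Return pixel minus KB prediction for every interior pixel of a 2-D list."""
    h, w = len(img), len(img[0])
    out = []
    for i in range(1, h - 1):
        row = []
        for j in range(1, w - 1):
            pred = sum(KB[a][b] * img[i - 1 + a][j - 1 + b]
                       for a in range(3) for b in range(3))
            row.append(img[i][j] - pred)
        out.append(row)
    return out

# Made-up test images: a planar gradient and an isolated one-pixel spike.
plane = [[3 * i + 2 * j + 5 for j in range(5)] for i in range(4)]
spike = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]
plane_residual = kb_residual(plane)
spike_residual = kb_residual(spike)
```

The planar image leaves a zero residual everywhere, while the isolated spike passes through unchanged, which is the behavior one wants from a content-suppressing residual.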

  27. The final behemoth of dim 24,933
      Combination of 32 feature subsets containing:
      • 1st–6th order differences
      • multiple versions with different values of q (quantization)
      • EDGE residuals (effective around edges)
      • calibrated features (from a low-pass filtered image)
      • 5D coocs with T = 1
      December 31
      Feature set: 24,933; training database: 2×34,719; classifier: ensemble, L = 71, k = 2400; BOSSrank: 81%
      Accuracy on Leica: 82.3%; accuracy on Panasonic: 70.0%

  28. Score progress

  29. Detecting HUGO without cover source mismatch, alias Steganalysis of Content-Adaptive Steganography in Spatial Domain

  30. Effect of quantization
      • Quantization allows the features to sense changes in textured areas and around edges.
      • 3D coocs are best quantized with q = c, where c is the central coefficient in the residual computation.
      [Figure panels: c = 1, c = 2, c = 3, c = 6, c = 10, c = 20]

  31. Best quantization value for 3D and 4D coocs
      Feature set MINMAX, 4th-order differences, 3D, T = 4:

      q     2      4      6      8      10     12
      PE    30.5   26.8   26.1   26.8   27.7   28.2

      Feature set MINMAX, 4th-order differences, 4D, T = 2:

      q     2      4      6      8      10     12
      PE    34.2   30.7   28.2   26.8   27.5   28.4

      For 3D coocs, the best q is ≈ c. For 4D coocs, the best q is ≈ 1.5c.

  32. Testing higher-order residuals
      Average accuracy when training on 8074 and testing on 1000 images from BOSSbase, repeated 100 times (all results with the ensemble).

      Feature type   (diff, q, T)      d      Avg. acc.   Best   Worst   L    k
      “SPAM” (3D)*   (2nd, 1, 4)       1458   71.4        74.5   69.0    31   1000
      MINMAX (3D)    (2nd, 1, 4)       1458   72.7        74.9   68.7    31   1000
      QUANT (3D)     (2nd, 2, 4)       1458   73.8        76.8   71.6    31   1000
      QUANT (3D)+    (2nd–6th, c, 4)   7290   80.0        82.2   77.4    81   1600
      QUANT (4D)+    (2nd–6th, c, 2)   6250   79.1        81.0   76.5    81   1600

      * “SPAM” is a direct equivalent of the SPAM vector with 1st-order differences replaced by 2nd-order differences.
      + 2nd–6th is a merger of QUANT features from 2nd–6th differences quantized with q = c, the central coefficient in the residual.

  33. Accuracy on BOSSbase across cameras
      Accuracy per image of BOSSbase over 1000 splits 8074/1000 (trn/tst). Lines = averages for each camera.
      • 6627 cover images always classified as cover
      • 6647 stego images always classified as stego
      • 4836 images always classified correctly as cover AND stego

  34. Pentax K20D is the easiest
      [Plot: ROC and scatter plot with QUANT (dim 1458)]

  35. Canon Rebel is the hardest
      [Plot: scatter plot with QUANT (dim 1458)]

  36. Accuracy correlates with texture
      [Plot: FLD scatter plot with QUANT (dim 1458); axis: average absolute 2nd difference]

  37. Leica images
      Typical Leica image histogram (possibly caused by the resizing script). Decreased dynamic range makes detection of embedding easier.

  38. Scatter plot for LSB matching (QUANT 1458)
      Dependence on content is much weaker!

  39. Comparison to ±1 embedding and CDF
      … ensemble with the 33,963-dim behemoth. HUGO with BOSS payload: accuracy 84.2%

  40. Implications for steganalysis
      • As steganography becomes more sophisticated, steganalysis needs to use more complex models to capture more subtle dependencies among pixels.
      • The key is diversity! The model should be rich – a union of smaller submodels.
      • Feature dimensionality will inevitably increase.
      • Automatic handling of the dimensionality problem is preferable to hand-tweaking – ensemble classifiers scale well w.r.t. feature dimension and training set size and are suitable for this task.
      • Detectability of HUGO embedding in larger images will increase faster than the Square Root Law dictates, because neighboring pixels will be more correlated.
      • Cover source mismatch is an extremely difficult problem that will hamper deployment of steganalysis in practice.
      • Robust machine learning is badly needed.

  41. Implications for steganography
      • Adaptive stego implemented to minimize distortion in model space is the way to go.
      • Critical: the choice of model and distortion function.
      • HUGO’s model is high-dimensional but too narrow.
      • By making the model more diverse (rich), better steganography can likely be built.
      • Despite the progress made during BOSS, HUGO remains the most secure stego algorithm we have ever tested.

  42. BOSS jump-started new directions
      • Optimal choice of residual and its quantization?
        • Perhaps learning both from a given source and for a given stego algorithm?
      • Alternatives to coocs as statistical descriptors of the random field of residuals?
      • Helped us develop ensemble classification as an alternative to SVMs.
      • Drew attention to CSM:
        • training set contamination
        • training only on (processed) test images

  43. Our current results on detection of HUGO and much more in the Rump Session.

  44. Some more interesting stats
      1000 splits of BOSSbase into 8074/1000.
      • BEST: images always classified correctly as cover AND stego
      • FAs: images always classified as stego when cover
      • MDs: images always classified as cover when stego

      Images   Avg. gray   Satur. pixels   Texture
      BEST     74.1        2046            1.73
      FAs      101.3       4415            4.66
      MDs      102.0       5952            3.95

      Texture: scaled average $|x_{ij} - x_{i,j+1}|$

  45. Effect of quantization
      [Figure labels: original distribution, quantized distribution, thin marginal, thick marginal, “cooc covers only this range”]
      Changes to elements from the marginal are undetected.

  46. Another example
      [Figure panels: after scaling to 512×512, 4MP image]
