
The Unseen Challenge Data Sets

Anderson Rocha, Walter Scheirer, Siome Goldenstein, Terrance Boult. Two data sets are provided: PNG (lossless compression) and JPEG (lossy compression). Both formats are prevalent on the Internet. Sources: Google Images, Yahoo Images, and Flickr.


Presentation Transcript


  1. The Unseen Challenge Data Sets • Anderson Rocha, Walter Scheirer, Siome Goldenstein, Terrance Boult

  2. The Data Sets • Two data sets are provided • PNG: lossless compression • JPEG: lossy compression • Prevalence of images on the Internet • Sources: Google images, Yahoo Images, and Flickr

  3. Message Sizes • For each tool, we provide four different embedding sizes: • Tiny: < 5% of the channel capacity • Small: > 5% and < 15% of the channel capacity • Medium: > 15% and < 40% of the channel capacity • Large: > 40% of the channel capacity • For the PNG set, the message size is explicitly stated • For the JPEG set, the message size is NOT stated
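The four size bands on this slide can be written as a small classifier. This is only a sketch: the function name and the bit-based units are assumptions, and the slide does not say which band the exact boundary values (5%, 15%, 40%) fall into, so here they are assigned to the larger band.

```python
def size_category(message_bits: int, capacity_bits: int) -> str:
    """Map an embedded message size to the slide's four bands.

    Assumption: exact boundary values (5%, 15%, 40%) go to the larger band,
    because the slide leaves them unspecified.
    """
    if capacity_bits <= 0:
        raise ValueError("capacity_bits must be positive")
    if not 0 <= message_bits <= capacity_bits:
        raise ValueError("message_bits must lie between 0 and capacity_bits")

    ratio = message_bits / capacity_bits
    if ratio < 0.05:
        return "tiny"
    if ratio < 0.15:
        return "small"
    if ratio < 0.40:
        return "medium"
    return "large"
```

For example, a 200-bit message in a 1,000-bit channel uses 20% of the capacity and falls into the medium band.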

  4. Message Content • Random bit sequences • Snippets of mp3 songs • Plain text • Other images

  5. Categories • Each set consists of clean and stego images • Clean set • Modified: cropping, overlay, object-appending • Non-modified: original • Stego set • 4 categories for JPEG, 3 categories for PNG, one for each tool

  6. Categories • JPEG subcategories • Stego • Animals • Business • Maps • Natural • Tourist • Vacation • Clean • Misc

  7. Clean Manipulated Images • Object Appending • Image Cropping • Overlay

  8. PNG Tools • Camaleão (http://www.ic.unicamp.br/~rocha/sci/stego) • Simple LSB insertion/modification software • Uses cyclic permutations and block ciphering to hide messages in LSBs • SecurEngine (http://www.sharewareplaza.com/SecurEngine-download_4268.html) • Incorporates 5 crypto algorithms: Blowfish, Gost, Vernam, Cast256, and Mars • LSB encoding

  9. PNG Tools • Stash-It (http://www.smalleranimals.com/stash.htm) • Windows-based stego tool • Simple LSB insertion/modification software • No encryption feature
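All three PNG tools are described as LSB insertion/modification tools. The sketch below shows plain sequential LSB insertion on a flat sequence of 8-bit samples. It is a generic illustration, not the implementation of Camaleão, SecurEngine, or Stash-It: the function names are hypothetical, and the permutation and encryption steps some of those tools add are left out.

```python
def embed_lsb(samples: bytes, message: bytes) -> bytes:
    """Write the message bits, most significant bit first, into sample LSBs."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(samples):
        raise ValueError("message exceeds the channel capacity")

    stego = bytearray(samples)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)


def extract_lsb(samples: bytes, n_bytes: int) -> bytes:
    """Read n_bytes back from the LSBs of the first 8 * n_bytes samples."""
    message = bytearray()
    for i in range(n_bytes):
        value = 0
        for sample in samples[8 * i: 8 * i + 8]:
            value = (value << 1) | (sample & 1)
        message.append(value)
    return bytes(message)
```

Each sample changes by at most 1, which is why the embedding is visually imperceptible but still leaves statistical traces in the LSB plane.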

  10. JPEG Tools • F5 (http://www.inf.tu-dresden.de/~aw4) • Resilient to the χ² statistical attack • Instead of replacing LSBs directly, F5 decreases the absolute value of the DCT coefficients • Chooses DCT coefficients randomly • Matrix embedding • JPHide (http://linux01.gwdg.de/~alatham) • Uses Blowfish to generate a stream of pseudo-random control bits to define bit encodings • Large embeddings trivial to detect
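The F5 bullet about decreasing the absolute value of a coefficient, rather than overwriting its LSB, can be sketched as follows. This is a simplified illustration under stated assumptions, not the F5 implementation: it works on a plain list of integers standing in for quantized DCT coefficients, uses the commonly described F5 bit convention (positive odd and negative even coefficients carry a 1), and omits the random coefficient selection and matrix embedding listed on the slide. The function names are hypothetical.

```python
def f5_bit(coef: int) -> int:
    """Bit carried by a non-zero coefficient (assumed F5 convention)."""
    parity = abs(coef) & 1
    return parity if coef > 0 else 1 - parity


def embed_f5_like(coeffs: list[int], bits: list[int]) -> list[int]:
    """Embed bits by decrementing |coef| whenever the carried bit is wrong."""
    stego = list(coeffs)
    pending = iter(bits)
    bit = next(pending, None)
    for i, coef in enumerate(stego):
        if bit is None:
            break
        if coef == 0:
            continue  # zero coefficients carry no data
        if f5_bit(coef) != bit:
            coef = coef - 1 if coef > 0 else coef + 1
            stego[i] = coef
            if coef == 0:
                continue  # shrinkage: the bit is re-embedded in the next coefficient
        bit = next(pending, None)
    if bit is not None:
        raise ValueError("message exceeds the channel capacity")
    return stego


def extract_f5_like(coeffs: list[int], n_bits: int) -> list[int]:
    """Read the first n_bits from the non-zero coefficients."""
    return [f5_bit(c) for c in coeffs if c != 0][:n_bits]
```

With `coeffs = [3, 0, -2, 1, 1, -5, 4, 2]` and `bits = [0, 1, 1, 0]`, the first coefficient drops from 3 to 2, and the second 1 shrinks to 0 so its bit moves on to the -5 that follows.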

  11. JPEG Tools • JSteg (http://zooid.org/~paul/crypto/jsteg) • 40-bit RC4 encryption • Channel capacity determination • LSB encoding in quantized DCT coefficients • Outguess (http://www.outguess.org/detection.php) • Preserves statistics based on frequency counts • Seed-based iterator available to choose embedding locations • Change minimization calculation for each seed • Remains one of the most difficult tools to detect
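The JSteg bullets on channel capacity and LSB encoding in quantized DCT coefficients can be sketched together. This is a minimal illustration, not JSteg's code: it assumes the commonly described rule that coefficients equal to 0 or 1 are skipped, it treats a plain list of integers as the quantized coefficients, it leaves out the RC4 encryption step, and the function names are hypothetical.

```python
def is_usable(coef: int) -> bool:
    """Assumed JSteg rule: coefficients 0 and 1 never carry data."""
    return coef not in (0, 1)


def capacity_bits(coeffs: list[int]) -> int:
    """Channel capacity: one bit per usable coefficient."""
    return sum(1 for c in coeffs if is_usable(c))


def embed_jsteg_like(coeffs: list[int], bits: list[int]) -> list[int]:
    """Overwrite the LSB of each usable coefficient, in order."""
    if len(bits) > capacity_bits(coeffs):
        raise ValueError("message exceeds the channel capacity")
    stego = list(coeffs)
    pending = iter(bits)
    for i, coef in enumerate(stego):
        if not is_usable(coef):
            continue
        bit = next(pending, None)
        if bit is None:
            break
        # Two's-complement LSB replacement; pairs such as (2, 3) and (-2, -1)
        # swap with each other, so a usable coefficient never becomes 0 or 1.
        stego[i] = (coef & ~1) | bit
    return stego


def extract_jsteg_like(coeffs: list[int], n_bits: int) -> list[int]:
    """Read the LSBs of the first n_bits usable coefficients."""
    return [c & 1 for c in coeffs if is_usable(c)][:n_bits]
```

Dividing a message length by `capacity_bits` gives the fraction of the channel used, which is the quantity the Tiny, Small, Medium, and Large bands are defined on.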

  12. PNG Data Set - Breakdown • Training: 4,000 total images in the PNG clean category; 4,731 total images in the PNG stego category

  13. PNG Data Set - Breakdown • Testing: 2,993 total images in the PNG stego category

  14. JPEG Data Set - Breakdown • Training: 29,185 total images in the JPEG stego category

  15. JPEG Data Set - Breakdown • Training: 29,185 total images in the JPEG stego category

  16. JPEG Data Set - Breakdown • Testing: 4,596 total images in the JPEG stego category

  17. Sample Usage: stegdetect • JPEG Training Set • Detected, C: correct algorithm detected • Detected, I: incorrect algorithm detected • Overall false detection rate for the clean image set is 8.6%
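The three quantities named on this slide (Detected C, Detected I, and the false detection rate on clean images) can be computed from per-image results as sketched below. The input format is an assumption made for illustration: each result is a `(truth, predicted)` pair in which `None` means "clean" or "nothing detected", and the tool labels are arbitrary strings. This does not parse stegdetect's own output.

```python
from collections import Counter
from typing import Optional

Result = tuple[Optional[str], Optional[str]]  # (true tool, detected tool)


def score(results: list[Result]) -> dict[str, float]:
    """Summarize detector output as rates over the stego and clean sets."""
    counts = Counter()
    for truth, predicted in results:
        if truth is None:  # clean image
            counts["clean_total"] += 1
            if predicted is not None:
                counts["false_detect"] += 1
        else:  # stego image
            counts["stego_total"] += 1
            if predicted is None:
                counts["missed"] += 1
            elif predicted == truth:
                counts["detected_correct"] += 1  # "Detected, C"
            else:
                counts["detected_incorrect"] += 1  # "Detected, I"

    def rate(key: str, total_key: str) -> float:
        total = counts[total_key]
        return counts[key] / total if total else 0.0

    return {
        "detected_correct": rate("detected_correct", "stego_total"),
        "detected_incorrect": rate("detected_incorrect", "stego_total"),
        "missed": rate("missed", "stego_total"),
        "false_detect": rate("false_detect", "clean_total"),
    }
```

Keeping "Detected, I" separate from misses matters for this data set, because a detector can flag an image as stego while naming the wrong tool.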

  18. Sample Usage: stegdetect • JPEG Testing Set • Overall false detection rate for the clean image set is 8.0%

  19. Sample Usage: stegdetect • Detailed results for JPHide Test Set

  20. Sample Usage: stegdetect • Conclusions • Significant differences between the results of training and testing • Weaker performance overall for testing • Designed difficulty of testing set • Stegdetect performs poorly for large embeddings (non-intuitive), as well as small and tiny embeddings (expected)

  21. The Unseen Challenge Data Sets • Lossy (JPEG) and Lossless (PNG) imagery • 3 tools for PNG set, 4 tools for JPEG set • 4 distinct embedding sizes for PNG, varying sizes for JPEG • Clean imagery across all sets

  22. The Unseen Challenge Data Sets • Valid approaches for use: • Detection • Detection and recovery (size or content) • Detection and destruction • Fusion • No standard data set exists for steg evaluation! This set is a step in that direction!

  23. Download! http://www.liv.ic.unicamp.br/wvu/datasets.php
