Hashes in Forensics
Kieron Craggs
Originally presented as part of IntaForensics 2014 Graduate Training Week
What we’ll cover
• What is a hash?
• The importance of hashes in Forensics
• Hash Sets
• Other hashing techniques
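What is a hash?
• A one-way digital fingerprint (signature) of a file or files that is, for practical purposes, unique
• The result of a calculation (an algorithm) performed on the content
• A given algorithm always returns a result of the same size, no matter the size of the input
• Different algorithms return results of different lengths (e.g. MD5: 128 bits, SHA-1: 160 bits)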
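The fixed-size property can be checked directly. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `digest_lengths` and the choice of algorithms are illustrative assumptions, not part of the original slides.

```python
import hashlib

def digest_lengths(data: bytes) -> dict:
    """Map each algorithm name to the length of its digest in bits."""
    return {
        name: hashlib.new(name, data).digest_size * 8
        for name in ("md5", "sha1", "sha256")
    }

# A one-byte input and a one-megabyte input give digests of the same size.
short_input = digest_lengths(b"a")
long_input = digest_lengths(b"a" * 1_000_000)

# The same content always produces the same hash value.
first = hashlib.sha1(b"evidence").hexdigest()
second = hashlib.sha1(b"evidence").hexdigest()

print(short_input)
print(long_input)
print(first == second)
```

The digest length depends only on the algorithm, while the digest value depends only on the content.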
Importance of hashes in Forensics
• Easy to use, common & secure* (MD5, SHA-1, etc.)
• Accidental collisions (two different inputs producing the same hash) are a very small possibility
• Can be used to verify data integrity (ACPO Principle 1)
• Used to identify ‘good’ & ‘bad’ files (Hash Sets)
• Break down large amounts of data
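Integrity verification can be sketched as follows: record a hash when the data is acquired, then recompute it later and compare. This is an illustration only; the function names `hash_file` and `verify`, the file name `image.dd` and the use of SHA-1 are assumptions, and a real workflow would follow the lab's own tooling and procedures.

```python
import hashlib
import os
import tempfile

def hash_file(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so large images need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

def verify(path, recorded_hex, algorithm="sha1"):
    """Return True if the file still hashes to the recorded value."""
    return hash_file(path, algorithm) == recorded_hex.lower()

# Demonstration on a throwaway file standing in for an evidence image.
with tempfile.TemporaryDirectory() as workdir:
    image = os.path.join(workdir, "image.dd")  # hypothetical file name
    with open(image, "wb") as f:
        f.write(b"evidence" * 1000)

    recorded = hash_file(image)             # hash taken at acquisition
    unchanged_ok = verify(image, recorded)  # nothing has changed yet

    with open(image, "r+b") as f:           # alter a single byte
        f.seek(10)
        f.write(b"X")
    altered_ok = verify(image, recorded)    # content now differs

print(unchanged_ok, altered_ok)
```

A single changed byte is enough to make the recomputed hash differ from the recorded one.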
Hash Sets
• Used to find good and bad files, which cuts down the search
• Good: NSRL
• Bad: Team Cymru (Malware), Law Enforcement & various others, including tools such as C4All
• Efficient and quick way to identify files
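A hash set lookup amounts to a set membership test on the file's hash. The sketch below uses two tiny made-up sets; `KNOWN_GOOD`, `KNOWN_BAD` and `classify` are hypothetical names, and real hash sets such as the NSRL are far larger and come in their own distribution formats, which are not modelled here.

```python
import hashlib

# Hypothetical hash sets built from placeholder content for illustration.
KNOWN_GOOD = {hashlib.md5(b"operating system file").hexdigest()}
KNOWN_BAD = {hashlib.md5(b"malware sample").hexdigest()}

def classify(data: bytes, good=KNOWN_GOOD, bad=KNOWN_BAD) -> str:
    """Label content as 'bad', 'good' or 'unknown' by its MD5 hash."""
    digest = hashlib.md5(data).hexdigest()
    if digest in bad:
        return "bad"
    if digest in good:
        return "good"
    return "unknown"

files = {
    "kernel.sys": b"operating system file",
    "invoice.exe": b"malware sample",
    "holiday.jpg": b"user photo",
}
results = {name: classify(content) for name, content in files.items()}
print(results)
```

Only the files labelled 'unknown' or 'bad' then need an examiner's attention, which is how a hash set cuts down the search.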
Fuzzy Hashing and others
• Traditional hashing, but applied to parts of a file rather than the whole file
• Piecewise Hashing: split the data into fixed-size blocks, hash each block and look for matching blocks
• Context Triggered Piecewise Hashing (Fuzzy Hashing): matches portions of data that are similar to the comparison data, even when they are not in the same place
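The difference between fixed blocks and context-triggered pieces can be illustrated with a deliberately simplified sketch. It is not the algorithm used by any real fuzzy hashing tool: the trigger (a sum over the last `WINDOW` bytes), the constants, the Jaccard-style `similarity` score and all function names are assumptions chosen for brevity.

```python
import hashlib
import random

WINDOW = 7    # bytes of context examined by the trigger (assumed value)
MODULUS = 64  # gives pieces of roughly 64 bytes on average (assumed value)

def triggered_pieces(data: bytes):
    """Split where the sum of the last WINDOW bytes hits a trigger value."""
    pieces, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte leaving the window
        if rolling % MODULUS == MODULUS - 1:
            pieces.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        pieces.append(data[start:])
    return pieces

def fixed_pieces(data: bytes, size=64):
    """Split into fixed-size blocks, as in plain piecewise hashing."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def similarity(a: bytes, b: bytes, splitter) -> float:
    """Shared piece hashes divided by all distinct piece hashes."""
    ha = {hashlib.md5(p).hexdigest() for p in splitter(a)}
    hb = {hashlib.md5(p).hexdigest() for p in splitter(b)}
    return len(ha & hb) / max(len(ha | hb), 1)

rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(4096))
# Insert a few bytes so that everything after offset 1000 shifts along.
modified = original[:1000] + b"INSERTED" + original[1000:]

fixed_score = similarity(original, modified, fixed_pieces)
triggered_score = similarity(original, modified, triggered_pieces)
print(fixed_score, triggered_score)
```

Because the piece boundaries depend on the content rather than on the offset, the pieces after the insertion line up again, whereas every fixed block after the insertion no longer matches.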
Piecewise Hashing Example (dcfldd)
• Create a dd image and calculate hashes
• Each block now has an MD5 and SHA1 hash
• Some time later… Let’s compare a copy against the original
(Screenshot labels: Original File; Here’s the change)
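The same idea can be sketched in Python rather than with dcfldd itself: hash each fixed-size block of the original, do the same for the copy, and report which blocks differ. The function names, the 512-byte block size and the in-memory test data are assumptions for illustration and do not reproduce dcfldd's output.

```python
import hashlib

def block_hashes(data: bytes, block_size=512, algorithm="md5"):
    """Return one hex digest per fixed-size block of the data."""
    return [
        hashlib.new(algorithm, data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

def changed_blocks(original: bytes, copy: bytes, block_size=512):
    """Return the indexes of blocks whose hashes differ or are missing."""
    a = block_hashes(original, block_size)
    b = block_hashes(copy, block_size)
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    # Blocks present in only one of the two inputs also count as changed.
    diffs.extend(range(min(len(a), len(b)), max(len(a), len(b))))
    return diffs

original = bytes(4096)      # stand-in for the original image
copy = bytearray(original)
copy[1500] = 0xFF           # "some time later": one byte has changed

whole_file_match = (
    hashlib.md5(original).hexdigest() == hashlib.md5(bytes(copy)).hexdigest()
)
diff = changed_blocks(original, bytes(copy))
print(whole_file_match, diff)
```

A whole-file hash only says that something changed; the per-block hashes narrow the change down to a specific block (here, the block covering byte offsets 1024 to 1535).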
Issues
• Context Triggered Piecewise Hashing needs enough data to compare: larger or multiple files give more successful results
• Computationally expensive and can be time-consuming
• False positives
Summary
• A hash is an effectively unique signature of a file's contents (or of a portion of data)
• Makes forensic work easier and adds credibility
• Can help sort large amounts of known/unknown data
• Can be applied in different ways to solve problems