ZOZZLE: Fast and Precise In-Browser JavaScript Malware Detection

What is the Problem?
• JavaScript allows authors to run arbitrary code when a user visits a web page
• JS-based malware attacks account for the majority of successful mass-scale exploitation
• Malware is easy to hide: self-generating code produces more code to run
• JS serves important functionality for many sites
• In-browser solutions have not been fully accepted because of the performance hit
• Browsers use offline scanning to check URLs, but there are too many sites, and malware typically comes and goes quickly

Challenges
• Performance
  • Existing detection is not fast enough to be used in a browser
• Accuracy
  • A false positive rate of 5% may be acceptable for static analysis tools, but it is over 100x what is acceptable for in-browser detection
• Obfuscated malware
  • JavaScript code is frequently obfuscated, so purely static detection is generally ineffective
  • Ex.: eval and document.write generate code at runtime that is difficult to pattern-match
• Malware transience
  • Offline-only scanning is not effective because web malware “infects fast and dies young”
  • Nearly 20% of malicious URLs were gone after 1 day
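
The obfuscation point can be illustrated with a minimal Python sketch. The signature string, the toy payload, and the helper name matches_signature are all hypothetical, and Python strings stand in for JavaScript source; this is not how any particular scanner works.

```python
# Hypothetical signature that a text-based scanner might look for.
SIGNATURE = "unescape("


def matches_signature(source: str) -> bool:
    """Naive static check: does the raw source text contain the signature?"""
    return SIGNATURE in source


# Plain version of a toy payload (JavaScript held in a Python string).
plain = 'var s = unescape("%u9090%u9090");'

# Obfuscated version: the same code is assembled at runtime and handed to
# eval, so the signature never appears contiguously in the source text.
obfuscated = 'eval("var s = unes" + "cape(\\"%u9090%u9090\\");");'

print(matches_signature(plain))       # True
print(matches_signature(obfuscated))  # False
```

The text scan only sees the outer eval call; the code that actually runs exists only after the string concatenation is evaluated.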

Solution: Zozzle
• Performance
  • AST-based detection is fast and scalable
  • Fast classification: throughput of over 1 MB of JavaScript code per second
• Accuracy
  • AST-based detection uses hierarchical (context-sensitive) features, which are more precise than text-based ones
  • Low false positive rate: 0.0003% (fewer than 1 in a quarter million)
• De-obfuscation
  • Uses the browser's JavaScript engine to undo obfuscation and obtain the final, expanded version of the JavaScript code
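
To make the difference between text-based and hierarchical features concrete, here is a small Python sketch. The toy AST shape (kind, text, children), the set CONTEXT_KINDS, and both function names are assumptions made for illustration; a real detector would obtain the AST from a JavaScript parser.

```python
# Toy AST node: (kind, text, children). Kinds and texts are illustrative only.
# Node kinds that are assumed to act as "context" for everything nested in them.
CONTEXT_KINDS = {"loop", "function", "try", "if"}


def flat_features(node):
    """Text-only features: every non-empty node text, with no context."""
    kind, text, children = node
    if text:
        yield text
    for child in children:
        yield from flat_features(child)


def hierarchical_features(node, context="script"):
    """Context-sensitive features: (enclosing context, node text) pairs."""
    kind, text, children = node
    if text:
        yield (context, text)
    # A context-like node becomes the context of its descendants.
    next_context = kind if kind in CONTEXT_KINDS else context
    for child in children:
        yield from hierarchical_features(child, next_context)


# The same call appears once inside a loop and once inside a function body.
in_loop = ("script", "", [("loop", "", [("call", "unescape", [])])])
in_func = ("script", "", [("function", "", [("call", "unescape", [])])])

print(set(flat_features(in_loop)) == set(flat_features(in_func)))  # True
print(set(hierarchical_features(in_loop)))  # {('loop', 'unescape')}
print(set(hierarchical_features(in_func)))  # {('function', 'unescape')}
```

With flat features the two scripts look identical; with (context, text) pairs the classifier can tell a call made repeatedly in a loop apart from one made in an ordinary function.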

What Is Zozzle?
• A highly precise, mostly static detector for malware written in JavaScript, suitable for in-browser deployment
• 3 steps:
  • JavaScript context collection and labeling as benign or malicious
  • Feature extraction and training of a naïve Bayesian classifier
  • Applying the classifier to a new JavaScript context to determine whether it is benign or malicious
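
The three steps can be sketched in Python as a presence/absence naïve Bayes classifier with Laplace smoothing. The training samples, feature pairs, and function names below are made up for illustration, and the smoothing choice is an assumption rather than something the slides specify.

```python
import math
from collections import Counter


def train(samples):
    """samples: list of (feature_set, label); label is 'benign' or 'malicious'."""
    label_counts = Counter(label for _, label in samples)
    feature_counts = {label: Counter() for label in label_counts}
    vocabulary = set()
    for features, label in samples:
        feature_counts[label].update(set(features))
        vocabulary |= set(features)
    return label_counts, feature_counts, vocabulary


def classify(model, features):
    """Return the label with the highest log-posterior for this feature set."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label, n in label_counts.items():
        score = math.log(n / total)  # log prior
        for f in vocabulary:
            # Laplace-smoothed estimate of P(feature present | label).
            p = (feature_counts[label][f] + 1) / (n + 2)
            score += math.log(p if f in features else 1 - p)
        scores[label] = score
    return max(scores, key=scores.get)


# Step 1: labeled contexts (hypothetical feature sets).
samples = [
    ({("loop", "unescape"), ("loop", "shellcode")}, "malicious"),
    ({("loop", "unescape"), ("function", "spray")}, "malicious"),
    ({("function", "render")}, "benign"),
    ({("function", "render"), ("if", "width")}, "benign"),
]
# Step 2: train. Step 3: classify new contexts.
model = train(samples)
print(classify(model, {("loop", "unescape")}))    # malicious
print(classify(model, {("function", "render")}))  # benign
```

Working in log space avoids underflow when the vocabulary is large, and the smoothing keeps an unseen feature from forcing a probability of zero.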

Zozzle: How It Works
• The JavaScript runtime engine exposes attempts to obscure malware
• JS code is unfolded up to the point just before it is executed
  • Intercept calls to compile() in the JavaScript engine
  • compile() is invoked when eval is called and whenever new code is included with an <iframe> or <script> tag
  • Observe JS code at each level of its unpacking, just before the engine executes it
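
The interception idea can be mimicked in a few lines of Python. This is only an analogy: Python source stands in for JavaScript, the name compile_hook is hypothetical, and a real hook would live inside the browser's JavaScript engine rather than wrap exec.

```python
unfolded_layers = []


def compile_hook(source: str) -> None:
    """Record the source about to run, then run it.

    The executed code sees this hook under the name `eval`, so every nested
    eval call re-enters the hook and exposes the next unpacking layer.
    """
    unfolded_layers.append(source)
    exec(source, {"eval": compile_hook})


# Two layers of packing around a harmless toy payload.
payload = "result = 'payload ran'"
layer1 = f"eval({payload!r})"
layer0 = f"eval({layer1!r})"

compile_hook(layer0)

for depth, source in enumerate(unfolded_layers):
    print(depth, source)
```

After the run, unfolded_layers holds the outer wrapper, the inner wrapper, and finally the payload itself, which is the version a classifier would want to inspect.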

How It Works cont.
• A static classifier, trained on context-sensitive AST (abstract syntax tree) features from a collection of labeled malware samples, analyzes the JS
• The Nozzle runtime detector dynamically crawls millions of URLs and collects malware samples by observing the behavior of running JS code
• Scanning a wide range of URLs helps counter transience and cloaking

Benign vs. Malicious Samples

Zozzle: Low-overhead Mostly Static JavaScript Malware Detection
