The Hybrid Toolkit

The Hybrid Toolkit is a flexible, extensible collection of software tools for capturing, storing, analyzing, and visualizing data. Hybrid was initially developed under the Sandia LDRD program as a system to bring scalable data analytics to enterprise security analysts working on the problem of detecting phishing and spear phishing attacks in email messages.

Benefits
• Cross-platform development and application
  • Windows, MacOS, Linux
• Deployment extensibility
  • Desktop, server, HPC system
• Development support
  • Multi-core computation
  • Extensible logging module
  • Rapid prototyping
• Proven production experience

Web interfaces support cyber team collaboration

The broad applicability of the capabilities currently in Hybrid has led to adoption of the toolkit for addressing a wider range of problems: data exfiltration, malware analysis, host-based anomaly detection, and large-scale network emulation, to name a few.

Hybrid Toolkit Details
• Python implementation
  • Worker-Manager-Executor computation model
  • Data and I/O abstraction
  • Stateful data
• Data storage
  • CouchDB
  • MongoDB
• Data processing
  • Plaintext (Unicode and ASCII)
  • PDF
  • MS Word
  • HTML
  • SMTP
• Data analysis
  • Descriptive statistics
  • Clustering: partitional (e.g., K-Means) and hierarchical
  • Topic modeling: Latent Dirichlet Allocation
  • Classification: k-nearest neighbor, multi-layer perceptrons, sentiment analysis
• Application integration
  • Bro (http://www.bro.org)
  • Splunk (http://www.splunk.com)

Contacts
• Warren Davis (PI), wldavis@sandia.gov
• Danny Dunlavy, dmdunla@sandia.gov
• Christopher Nebergall, cneberg@sandia.gov

Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND No. 2011-XXXXP