Stopping the Con: Detecting Electronic Social Engineering Attacks Allen Stone 11/28/2005 CMSC691I UMBC
Project Goals • To create an effective detection framework for electronic social engineering attacks. • To create a proof-of-concept for a hybrid model framework that will lend itself to future work and implementation. • To run a short demonstration of the concept in action on a simplified playing field.
What NLDetect Will Not Do It is not the purpose of this project to expound on the following: • Best practices for signatures • Statistical analysis of best threshold value • Reading data off the wire • Incident Response
Outline • Background • Social Engineering • Traditional Defenses • Natural Language Processing • Hybrid Model - NLDetect • Detection Matrix • Ontology • Lexicon • Analyzer • Testing Methods • Expectations • Future Work • References
The Art of Deception • Social Engineering is the act of manipulating a person or group of people for the purpose of gaining sensitive information or unauthorized system access. • Hard to Detect • Pervasive and Effective • Lack of Sophistication is the advantage.
Examples • False Reward • The well-known foreign royalty con • System Pop-ups • Fraudulently claim that the system is infected or compromised, then provide a link or installation button to “fix” the problem • Impersonation / Extortion • The most dangerous attacks • Email from a trusted sender with an attachment or link (impersonation) • Email from a stranger with a threatening tone (extortion)
Traditional Defenses • Email Spam Filters: not 100% effective. • Impersonation emails • Web Pop-up Blockers: only work against pop-up ads • False banner ads • Different ports • Websites ask you to disable it • Exploited browsers • Spyware/Antivirus: after-the-fact • Firewalls: can be circumvented, since most of these attacks require an outbound connection through the firewall.
Other Traditional Defenses • Social Engineering • Preventative, multi-tiered training. • Periodically re-train. • Security in General – Intrusion Detection • Passively listen to traffic • Signatures (strings or rules) • Anomaly Detection
Know Your Enemy • All traditional methods fall short in defending against Social Engineering • Training is uncertain. • Signature-based IDS cannot keep up. • Anomaly Detection depends on a concept of “normalcy”. • Plain, common language is the attack vector
Natural Language Processing • Making computers “understand” statements in human languages • Context and Semantics. • Multi-faceted • Ontology • Lexicon • Parser • Analyzer • Referential Analyzer • Light Dabbling • Not using Referential Analyzer • Weak “processing”
NLDetect Design • Completely innovative approach • Detection based on natural words • Use of NLP in IDS • Automated Detection of Social Engineering • Impact: • Rise in social engineering for spyware, virus, and impersonation emails • No effective electronic defense available
Hybrid Model - NLDetect • Preventative Training • Installation and periodic updating • Intrusion Detection • IDS capture components are the front end of the system. The output of the system is also presented as detection output. • The goal of the system • Natural Language Processing • The basis of the framework. The system is literally processing natural language.
Framework Design • Main Goal: to provide a framework for post-processing of detection data to determine whether electronic social engineering attacks have taken place. • Installing this software effectively “trains” the machine, since the ontology encodes the concepts of manipulation. • The parser is the collection mechanism, which is either the IDS or live data. • Ontology and Lexicon are statically defined by the user. • Analyzer is the NLP backbone. It uses all of the other tools to decide whether a response needs to be made. • The entire system is written in Perl. Simple system calls format the data and count the lexicon words that appear in the test data; the analyzer then decides which archetypal attack it is and whether it is cause for alarm.
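The count-then-classify step described on this slide can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl of the original system, whose code is not shown in the deck. The lexicon contents, the `THRESHOLD` value, and the function names `count_hits` and `analyze` are all assumptions made for the example.

```python
import re

# Hypothetical lexicon: attack archetype -> persuasive words (illustrative only).
LEXICON = {
    "false_reward": {"winner", "prize", "claim", "million", "transfer"},
    "system_message": {"infected", "warning", "scan", "fix", "install"},
    "impersonation": {"attached", "account", "verify", "password", "login"},
    "extortion": {"pay", "expose", "unless", "deadline", "consequences"},
}

THRESHOLD = 3  # assumed cut-off; the deck does not state a value


def count_hits(text, lexicon=LEXICON):
    """Return {archetype: number of lexicon words found in text}."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        kind: sum(1 for w in words if w in terms)
        for kind, terms in lexicon.items()
    }


def analyze(text, threshold=THRESHOLD):
    """Return (archetype, hits) if the best score meets the threshold, else None."""
    hits = count_hits(text)
    kind = max(hits, key=hits.get)  # most likely archetype
    return (kind, hits[kind]) if hits[kind] >= threshold else None


if __name__ == "__main__":
    print(analyze("Warning: your system is infected. Click to scan and fix it now."))
    print(analyze("See you at lunch tomorrow"))
```

Keeping the lexicon as plain data mirrors the slide's point that the ontology and lexicon are user-defined, while the analyzer logic stays fixed.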
The Detection Matrix • The underlying intrusion detection system • Parser for the system • Reads input data • Outputs to Analyzer • Standard form
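One way to picture the parser's "standard form" is a small record that carries the port and the tokenized payload to the analyzer. The sketch below is only an assumption about what such a form might look like: the tab-separated input layout, the `Record` type, and `parse_line` are hypothetical and do not come from the deck.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Record:
    """Assumed 'standard form' handed from the parser to the analyzer."""
    port: int
    payload: str
    words: list = field(default_factory=list)


def parse_line(line):
    """Parse one pre-collected line of the assumed form '<port><TAB><payload>'."""
    port_text, _, payload = line.partition("\t")
    # Lowercase the payload and split it into plain word tokens.
    words = re.findall(r"[a-z']+", payload.lower())
    return Record(port=int(port_text), payload=payload, words=words)


if __name__ == "__main__":
    print(parse_line("25\tDear Friend, claim your PRIZE"))
```

The raw payload is kept alongside the tokens so that non-word lexicon strings can still be matched later.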
Ontology • “Preventative Training” • Concepts of Manipulation • Different Types of Attack • False System Messages, Fraud Advertisements, Impersonation, Extortion
Lexicon • Plain Natural Words, Plus a Little More • Common words in manipulation • Common strings associated with attacks • “<a href=” • Isn’t this just signature list bloat? • Threshold system • Many attacks per concept • Slower growth than standard signature list.
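As a rough picture of how the ontology and lexicon might fit together, each attack concept can own a set of plain words plus a few raw strings such as `<a href=`. The structure and the helper `matched_terms` below are hypothetical, and the entries are illustrative rather than taken from the project.

```python
import re

# Hypothetical ontology: each attack concept owns plain words plus raw strings.
ONTOLOGY = {
    "false_reward": {
        "words": {"prize", "winner", "claim"},
        "strings": ["<a href="],
    },
    "system_message": {
        "words": {"infected", "scan", "fix"},
        "strings": ["<a href="],
    },
}


def matched_terms(text, concept, ontology=ONTOLOGY):
    """List the lexicon entries of one concept that occur in text."""
    entry = ontology[concept]
    lowered = text.lower()
    tokens = set(re.findall(r"[a-z']+", lowered))
    found = sorted(entry["words"] & tokens)                  # whole-word matches
    found += [s for s in entry["strings"] if s in lowered]   # substring matches
    return found


if __name__ == "__main__":
    sample = '<a href="http://example.test">Claim your prize</a>'
    print(matched_terms(sample, "false_reward"))
```

Because one concept covers many differently worded attacks, adding a new attack often needs no new entry, which is the slide's argument for slower growth than a per-attack signature list.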
Analyzer • Core of the System • Compare with Lexicon • If the input meets the threshold, alert according to which type of attack it most likely is. • Points/Percentages For Words • Threshold problem • Ignore the non-hitting traffic: fewer words in the attack. • Consider the non-hitting traffic: attacks bloated with trivial words.
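One reading of the threshold problem is that a raw hit count ignores the non-hitting words and so misses short attacks, while a hit percentage takes them into account and so is diluted by padding. The sketch below scores a message both ways to illustrate that trade-off. The lexicon, both cut-off values, and the function names are assumptions chosen for the example.

```python
import re

LEXICON = {"urgent", "verify", "password", "account", "suspended"}  # hypothetical


def score(text, lexicon=LEXICON):
    """Return (hit_count, hit_fraction) for a message."""
    words = re.findall(r"[a-z']+", text.lower())
    hits = sum(1 for w in words if w in lexicon)
    fraction = hits / len(words) if words else 0.0
    return hits, fraction


def alerts(text, min_hits=3, min_fraction=0.25):
    """Evaluate both threshold styles; both cut-offs are assumed values."""
    hits, fraction = score(text)
    return {"count": hits >= min_hits, "fraction": fraction >= min_fraction}


if __name__ == "__main__":
    short = "Verify password"
    padded = "Urgent: verify your account password. " + "blah " * 40
    print(alerts(short))   # short attack: the count rule misses it
    print(alerts(padded))  # padded attack: the fraction rule misses it
```

Under these assumed cut-offs, neither rule catches both messages on its own.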
Design Details for Testing • Three ports: 80, 25, and 1026 • Four types of possible attack: false reward, system messages, impersonation, and extortion. • All collection is assumed to be done beforehand. The system parses formatted text for detection. • NLDetect is a framework to be tacked onto existing mechanisms.
The Test • NLDetect’s framework will be set against examples of manipulative data, with the lexicon being defined with common persuasive words. • A basic string matcher will be written to match the digital signatures of the attacks. • The systems will be run simultaneously on the data. • A more thorough test will have to be established on live wire data with different signatures later.
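The side-by-side comparison described here might look something like the following sketch, in which an exact-string matcher and a lexicon-threshold detector run over the same samples. The signature, lexicon words, threshold, and sample messages are invented for illustration and are not the project's test data.

```python
import re

SIGNATURES = ["Click here to claim your prize"]    # hypothetical exact-string signature
LEXICON = {"claim", "prize", "winner", "reward"}   # hypothetical persuasive words


def string_match(text, signatures=SIGNATURES):
    """Baseline: alert only if a known signature string appears in the text."""
    lowered = text.lower()
    return any(sig.lower() in lowered for sig in signatures)


def lexicon_match(text, lexicon=LEXICON, threshold=2):
    """Lexicon approach: alert if enough distinct lexicon words appear."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return len(words & lexicon) >= threshold


def run_both(samples):
    """Run both detectors on the same data; one (string, lexicon) pair per sample."""
    return [(string_match(s), lexicon_match(s)) for s in samples]


if __name__ == "__main__":
    samples = [
        "Click here to claim your prize!",
        "You are a winner, collect your reward today",
        "Meeting moved to 3pm",
    ]
    for sample, result in zip(samples, run_both(samples)):
        print(result, sample)
```

In this toy data the reworded second message slips past the string matcher but still trips the lexicon threshold.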
Expectations • NLDetect should detect all system message pop-ups. • String matcher will detect all attacks with the common digital signatures. • NLDetect will miss a special-case web traffic exploit. • String matcher will miss attacks without digital signatures and pick up on innocuous traffic. • NLDetect will pick up on innocuous traffic that is verifiable.
Future Work • Practical, more powerful implementation of NLDetect • Response Software • Detection Metrics • Branch Out • Use NLP for generalized IDS • Further research into automated SE defenses.
References • Atallah, Mikhail J., Craig J. McDonough, Victor Raskin, and Sergei Nirenburg. “Natural Language Processing for Information Assurance and Security: an Overview and Implementations.” “Proceedings of the 2000 Workshop on New Security Paradigms.” Ed. ACM Special Interest Group on Security, Audit, and Control (SIGSAC). New York, NY: ACM Press 2001. • Borders, Kevin and Atul Prakash. “Web Tap: Detecting Covert Web Traffic.” “Proceedings of the 11th ACM Conference on Computer and Communications Security.” Ed. ACM Special Interest Group on Security, Audit, and Control (SIGSAC) and Association for Computing Machinery (ACM). New York, NY: ACM Press 2004.
References • Ertoz, Levent, Eric Eilertson, Aleksandar Lazarevic, Pang-Ning Tan, Vipin Kumar, Jaideep Srivastava, and Paul Dokas. “MINDS – Minnesota Intrusion Detection System.” Data Mining: Next Generation Challenges and Future Directions. Ed. Hillol Kargupta, Anupam Joshi, Krishnamoorthy Sivakumar, and Yelena Yesha. Cambridge, MA: AAAI Press, 2004, Chapter 11. • Kienzle, Darrell M. and Matthew C. Elder, “Recent Worms: a Survey and Trends.” “Proceedings of the 2003 ACM Workshop on Rapid Malcode.” Ed. ACM Special Interest Group on Security, Audit, and Control (SIGSAC) and Association for Computing Machinery (ACM). New York, NY: ACM Press 2003.
References • Kruegel, Christopher and Giovanni Vigna. “Anomaly Detection of Web-Based Attacks.” “Proceedings of the 10th ACM Conference on Computer and Communications Security.” Ed. Association for Computing Machinery (ACM). New York, NY: ACM Press 2003. • Mitnick, Kevin and William L. Simon. The Art of Deception. Indianapolis: Wiley Publishing, 2002. • Orgill, Gregory L., Gordon W. Romney, Michael G. Bailey, and Paul M. Orgill. “Security III: The Urgency for Effective User Privacy-education to Counter Social Engineering Attacks on Secure Computer Systems.” “Proceedings of the 5th Conference on Information Technology Education.” Ed. Association for Computing Machinery (ACM). New York, NY: ACM Press 2004.
References • Rabek, Jesse C., Roger I. Khazan, Scott M. Lewandowski, and Robert K. Cunningham. “Detection of Injected, Dynamically Generated, and Obfuscated Malicious Code.” “Proceedings of the 2003 ACM Workshop on Rapid Malcode.” Ed. ACM Special Interest Group on Security, Audit, and Control (SIGSAC) and Association for Computing Machinery (ACM). New York, NY: ACM Press 2003. • Stolfo, Salvatore J., Wenke Lee, Philip K. Chan, Wei Fan, and Eleazar Eskin. “Data Mining-Based Intrusion Detectors: an Overview of the Columbia IDS Project.” ACM SIGMOD Record 30.4 (2001): 5-14.