CS-558 Heat-seeking Honeypots: Design and Experience John P. John, Fang Yu, Yinglian Xie, Arvind Krishnamurthy, and Martín Abadi. Presented by Smyrnaki Ourania
Goal • Attackers search for vulnerable servers. • Aim to understand the behavior of attackers: • How they find vulnerable servers • How they compromise them • How they misuse them
Introduction • The paper presents heat-seeking honeypots that: • attract attackers • automatically generate and deploy honeypot pages Output: • Analysis of the honeypot logs to identify attack patterns
Design of Heat-Seeking Honeypots • Heat-seeking honeypots consist of 4 components: • 1. Obtaining attacker queries • 2. Creating honeypot pages • 3. Advertising honeypot pages to attackers • 4. Detecting malicious traffic (whitelist approach)
1. Obtaining attacker queries How do attackers find Web servers? 1st approach • Perform brute-force port scanning on the Internet. 2nd approach • Make use of Internet search engines.
1. Obtaining attacker queries • The 2nd approach lets us attract attackers who issue malicious search queries • E.g., a query targeting a PHP vulnerability: phpizabi v0.848b c1 hfp1 • Result: • A list of Web sites running that vulnerable PHP application
1. Obtaining attacker queries How can we obtain these malicious queries? • Use SbotMiner and SearchAudit to automatically identify malicious queries from attackers in the Bing log.
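For intuition, here is a minimal sketch of what filtering a search log for vulnerability-fingerprint ("dork") queries might look like. The log format and the signature patterns are hypothetical stand-ins; the actual SbotMiner/SearchAudit pipelines are far more sophisticated.

```python
import re

# Hypothetical patterns resembling queries that fingerprint vulnerable
# web applications (e.g., app name plus version string).
SUSPICIOUS_PATTERNS = [
    re.compile(r"phpizabi\s+v[\d.]+\w*", re.I),   # vulnerable app + version
    re.compile(r"powered by \w+ [\d.]+", re.I),   # generic version fingerprint
    re.compile(r"inurl:(admin|install|setup)", re.I),
]

def extract_attacker_queries(log_lines):
    """Yield queries from a (hypothetical) search log that match
    vulnerability-fingerprint patterns."""
    for line in log_lines:
        query = line.strip()
        if any(p.search(query) for p in SUSPICIOUS_PATTERNS):
            yield query
```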
2. Creation of honeypot pages Given the query used by the attacker, how do we create an appropriate honeypot? • 1st approach: Install vulnerable Web software • Manually install Web applications that are frequently targeted. • Each application is placed in a different VM. • But why? • If one VM is compromised, it does not affect the working of the other applications.
2. Creation of honeypot pages When is an application considered compromised? When new files have been added or existing application files have been modified Disadvantage The software must be manually identified and set up
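The slides do not spell out the detection mechanism; a minimal sketch of one plausible way to detect such changes is to compare file hashes against a baseline taken right after installation:

```python
import hashlib
from pathlib import Path

def snapshot(app_dir):
    """Map each file under app_dir to the SHA-256 hash of its contents."""
    return {
        p: hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(app_dir).rglob("*") if p.is_file()
    }

def detect_compromise(baseline, current):
    """Report files added or modified since the baseline snapshot."""
    added = set(current) - set(baseline)
    modified = {p for p in baseline if p in current and baseline[p] != current[p]}
    return added, modified

# Usage: snapshot once after a clean install, then re-check periodically.
# baseline = snapshot("/var/www/app")        # hypothetical install path
# added, modified = detect_compromise(baseline, snapshot("/var/www/app"))
```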
2. Creation of honeypot pages 2nd approach: Set up Web pages matching the query • Instead of installing the actual software, we can automatically create pages similar to the ones the software would generate. • Issue the malicious query to the Bing and Google search engines and collect the result URLs.
2. Creation of honeypot pages A crawler fetches the Web pages at these URLs, along with the other elements required to render them (e.g., images, CSS). • Strip all JavaScript content and rewrite all links on each page to point to local versions. • Log all information about each visit to a database.
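A minimal sketch of this fetch-and-sanitize step, using the common requests and BeautifulSoup libraries (the paper does not specify its tooling, and the local URL scheme below is an assumption):

```python
import requests
from bs4 import BeautifulSoup  # pip install requests beautifulsoup4

def make_honeypot_page(url, local_prefix="/honeypot/"):
    """Fetch a search-result page, strip its scripts, and rewrite links
    to point at locally hosted copies. A simplified sketch of the
    paper's page generator, not its actual implementation."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    for script in soup.find_all("script"):   # strip all JavaScript
        script.decompose()

    for tag in soup.find_all(["a", "img", "link"]):
        attr = "href" if tag.has_attr("href") else "src"
        if tag.has_attr(attr):
            # Rewrite the reference to a local version of the resource.
            tag[attr] = local_prefix + tag[attr].split("://")[-1].lstrip("/")

    return str(soup)
```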
3. Advertise Honeypots to Attackers • Submit the URLs of honeypot pages to search engines • Add links on other public Web pages that point to the honeypots, placed so they are not prominently visible to regular users (see the sketch below) But why? • So that legitimate sites do not lose traffic from ordinary users
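For illustration only: a link can be embedded so that crawlers follow it while ordinary visitors rarely see it. The paper does not say how its links were hidden, so the hidden-container styling below is just one plausible example:

```python
def hidden_link_html(honeypot_url):
    """Return an anchor wrapped in a container that ordinary visitors
    will not see, while crawlers can still follow the href. The
    display:none styling is an assumption, not the paper's method."""
    return (f'<div style="display:none">'
            f'<a href="{honeypot_url}">related resources</a></div>')
```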
4. Detecting malicious traffic Log all visits to our local heat-seeking honeypots. Process the log and automatically extract attack traffic. Since our honeypots are publicly accessible, they receive both legitimate and malicious traffic. 2 kinds of legitimate traffic: • Search-engine crawlers • Regular, ordinary users
Identifying crawlers vs. malicious traffic How can we identify crawler traffic? Look for known user-agent strings Disadvantage: this does not always work! Why? The user-agent string is easily spoofed; attackers can use a well-known string to avoid detection.
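The naive check looks like the sketch below; the agent-string list is illustrative, and as the slide notes, this test alone is unreliable because the header is attacker-controlled:

```python
KNOWN_CRAWLER_AGENTS = ("Googlebot", "bingbot", "Slurp")  # illustrative list

def looks_like_crawler(user_agent):
    """Naive check: match the User-Agent header against known crawler
    names. Trivially defeated by an attacker who spoofs the header."""
    ua = user_agent.lower()
    return any(name.lower() in ua for name in KNOWN_CRAWLER_AGENTS)
```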
Identifying crawlers vs. malicious traffic Crawlers visit both static and dynamic links. Dynamic links are generated by the real installed software; static links refer to the automatically generated honeypot pages.
Distinguishing traffic • From the honeypot logs we observed that most attackers do not target the static pages. • Instead, they try to access non-existent files that were never publicly linked.
Distinguishing traffic • Whitelist approach • Each site's webmaster enumerates the list of dynamic and static links. • This set is the whitelist. • Requests for links not in the whitelist are considered malicious.
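A minimal sketch of this whitelist check, assuming a hypothetical whitelist enumerated by the site's operator:

```python
def classify_requests(request_paths, whitelist):
    """Split requested paths into whitelisted (legitimate) and
    non-whitelisted (treated as malicious), per the whitelist approach."""
    legitimate, malicious = [], []
    for path in request_paths:
        (legitimate if path in whitelist else malicious).append(path)
    return legitimate, malicious

# Usage:
# whitelist = {"/index.html", "/forum/view.php", "/images/logo.png"}
# ok, attacks = classify_requests(paths_from_log, whitelist)
```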
Crawler visits: detecting dynamic links More than 200 honeypot pages containing dynamic links (generated by the installed software) have been crawled by 3 search engines.
Fraction of pages visited by ASes Any AS visiting more than a threshold of 75% of the honeypot pages is considered a crawler, while the others are considered legitimate users who happened to reach the honeypot pages.
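A minimal sketch of this AS-level classification, assuming the honeypot log has already been reduced to (AS number, page) pairs:

```python
from collections import defaultdict

def classify_ases(visits, total_pages, threshold=0.75):
    """visits: iterable of (as_number, page) pairs from the honeypot log.
    An AS that visits more than `threshold` of all honeypot pages is
    labeled a crawler; the rest are treated as regular visitors."""
    pages_by_as = defaultdict(set)
    for asn, page in visits:
        pages_by_as[asn].add(page)
    return {
        asn: ("crawler" if len(pages) / total_pages > threshold else "user")
        for asn, pages in pages_by_as.items()
    }
```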
Attacker visits The most popular page, with over 10,000 visits, was for a site running Joomla, a content management system (CMS).
Comparing Honeypots • Web server: no hostname; accessible only by its IP address. Not in the index of any search engine. • Vulnerable software: 4 commonly targeted Web applications installed. Publicly accessible on the Internet. Crawled and indexed by search engines. • Heat-seeking honeypot pages: simple HTML pages (96 pages), wrapped in a PHP script that performs logging. Crawled and indexed by search engines.
Conclusion • Presented heat-seeking honeypots, which generate honeypot pages automatically. • Captured a variety of attacks, including: password guesses, software installation attempts, SQL-injection attacks, remote file inclusion attacks, and cross-site scripting (XSS) attacks. • Heat-seeking honeypots can effectively inform appropriate monitoring and defenses.