
Analyzing Web Protection with Big Data Intelligence

Discover how Akamai leverages petabytes of security data to enhance web protection in the cloud. Learn about measuring WAF accuracy, OWASP ModSecurity CRS, and the Akamai Intelligent Platform.


Presentation Transcript


  1. Big Data Intelligence: Harnessing Petabytes of WAF Statistics to Analyze & Improve Web Protection in the Cloud • Ory Segal, Tsvika Klein • Akamai Technologies

  2. About Us • Ory Segal • Principal Product Architect, Cloud Security • Tsvika Klein • Product Manager, Cloud Security Hosted by OWASP & the NYC Chapter

  3. Topics to Cover • Akamai & OWASP ModSecurity CRS Relationship • Security Big Data @ Akamai • Measuring WAF Accuracy @ Akamai • CRS through the Big Data Prism (Lessons Learned)

  4. About Us But we only have 45 minutes… And too much data to cover…

  5. Akamai & OWASP CRS This is not an Akamai marketing presentation Akamai has been offering its cloud-based WAF since 2009. Kona Site Defender: • OWASP CRS (Akamai Kona Rules) • DDoS Protection • DNS Protection • Bot Detection • Site Shield / Site Cloaking OWASP CRS was ported to Akamai MD, and does not run directly on ModSecurity

  6. Security Big Data @ Akamai

  7. Akamai Intelligent Platform Akamai’s cloud platform enables secure, high-performing user experiences on any device, anywhere. 120,000+ Servers • 2,000+ Locations • 750+ Cities • 82 Countries • 1,100+ Networks • Highlights: • 100 million page views per second and 500 billion hits per day • 734 million IP addresses seen quarterly • 260+ terabytes of compressed daily logs • 30% of all internet traffic

  8. CSI Platform Statistics • 10 terabytes of daily attack data • 2 petabytes of security data stored • 45 days retention • 140K concurrent connections (incoming data) • 600K log lines / sec. indexed by 30 dimensions • 8,000 queries daily scanning terabytes of data

  9. CSI High Level Architecture [Diagram; labeled components: Akamai edge servers, log agent, Yoda, Yoda adapter, HBase, Hadoop, FE applications, BE applications]

  10. Yoda (Distributed Query Engine) • Interactive • Multiple data streams • Intuitive query language • High cardinality aggregation

  11. Security Big Data Challenge #1

  12. Security Big Data Challenge #2

  13. Sample Data App - SARA Interactive Tool to Analyze Security Events

  14. Back to WAF & OWASP CRS…

  15. WAF Accuracy Lingo • Imagine a WAF that protects against 100% of all possible attack vectors …by blocking 100% of all HTTP requests • Accurate WAF testing requires you to measure: • How many real attacks got blocked (TP) • How many valid requests were allowed through (TN) • How much valid traffic was inappropriately blocked (FP) • How many attacks were allowed through (FN) Let's talk about measuring Precision, Recall, Accuracy, MCC…
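The four counts on this slide can be tallied from labeled traffic. The sketch below is a minimal illustration, not part of the presentation: the function name `confusion` and the sample of 5 attacks and 95 valid requests are assumptions chosen to show why a WAF that blocks everything catches every attack yet is useless.

```python
from collections import Counter

def confusion(is_attack, blocked):
    """Count TP, TN, FP and FN from ground-truth labels and WAF decisions."""
    counts = Counter({"TP": 0, "TN": 0, "FP": 0, "FN": 0})
    for attack, block in zip(is_attack, blocked):
        if attack:
            # An attack is either blocked (TP) or let through (FN).
            counts["TP" if block else "FN"] += 1
        else:
            # A valid request is either blocked (FP) or let through (TN).
            counts["FP" if block else "TN"] += 1
    return counts

# Hypothetical sample: 5 attacks followed by 95 valid requests.
labels = [True] * 5 + [False] * 95
block_everything = confusion(labels, [True] * 100)
allow_everything = confusion(labels, [False] * 100)

print("block everything:", dict(block_everything))
print("allow everything:", dict(allow_everything))
```

Blocking everything yields no false negatives but turns every valid request into a false positive, which is why TP and FN alone are not enough to judge a WAF.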

  16. Things You Need to Know • Precision: % of blocked requests that were actual attacks • Recall: % of attacks that were actually blocked • Accuracy: % of decisions that were good decisions • MCC: correlation between WAF decisions and actual nature of requests * MCC: http://en.wikipedia.org/wiki/Matthews_correlation_coefficient
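The four measures above follow the standard definitions of precision, recall, accuracy and the Matthews correlation coefficient. A minimal sketch, assuming the TP/TN/FP/FN counts are already known; the function name `waf_metrics` and the sample counts are illustrative and not taken from the slides:

```python
import math

def waf_metrics(tp, tn, fp, fn):
    """Compute precision, recall, accuracy and MCC from confusion-matrix counts."""
    total = tp + tn + fp + fn
    # Precision: share of blocked requests that were actual attacks.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of attacks that were actually blocked.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: share of all decisions that were correct.
    accuracy = (tp + tn) / total if total else 0.0
    # MCC: correlation between decisions and the true nature of requests.
    # By convention it is reported as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "mcc": mcc}

# Hypothetical run: 50 attacks (40 blocked) and 950 valid requests (50 blocked).
print(waf_metrics(tp=40, tn=900, fp=50, fn=10))
```

In this made-up run accuracy is 94% while precision is below 50%, which illustrates why accuracy on its own can flatter a WAF when attacks are a small share of traffic.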

  17. Let's Look at Some Examples A WAF’s accuracy needs to be measured both in its ability to block attacks and in its ability to allow good traffic through…

  18. Introducing: Akamai WAF Testing Framework

  19. Akamai WAF Testing (AWT) Framework • Ability to send both valid & attack traffic • Easily create or add new test cases: • 3 methods: Text files, Burp Extender, Wireshark .pcaps • Easily import test cases from Akamai’s Big Data platform • Configurable and can work with any WAF • Easily define success / fail criteria • Intuitive XML & HTML reports • Easy debugging of FP/FN w/ Anomaly Scoring (rule comb.)

  20. AWT Built-In Test Cases In order to accurately assess a WAF, we collected test cases from the following sources: • Web interaction recordings of Alexa Top 100 internet sites: Commerce, Health, Consumer Electronics, Reference, Finance, … • Recorded commercial web application scanner traffic • Havij & SQLMap attacks • Ported common False Positive cases from Akamai customers (Big Data) • Attacks from Akamai CSI big data platform • Exploits from the internet (fuzzers, exploit-db, …)* • Ported “valid” test cases from other tools* • Tens of thousands of HTTP requests, divided 95% - 5%

  21. AWT Reports – High Level Statistics

  22. AWT Reports – Protection Statistics

  23. AWT Reports – False Positives Analysis

  24. OWASP CRS – Lessons Learned

  25. CRS Issue #1 – Risk Groups • CRS 2.2.x uses a single anomaly score • Visibility (granularity) issues: what really happened? • Separate anomaly score “accounting” into smaller risk groups (attack types) • Clear understanding of which attack took place • Challenge: • Requires mapping rules to risk groups • Some rules contribute to more than one risk group • Requires putting more thought into anomaly scoring; it’s not just one pile of rules/scores • Example: XSS = 35, SQLi = 10, RFI = 0, LFI = 0, …
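The per-group accounting described on this slide can be sketched as follows. This is an illustration under stated assumptions, not Akamai's implementation: the rule names `rule-a` and `rule-b`, their scores, and the rule-to-group mapping are hypothetical, chosen so the result matches the slide's example of XSS = 35, SQLi = 10, RFI = 0, LFI = 0.

```python
RISK_GROUPS = ("XSS", "SQLi", "RFI", "LFI")

# Hypothetical mapping: a rule may contribute to more than one risk group.
RULE_GROUPS = {
    "rule-a": ("XSS",),
    "rule-b": ("XSS", "SQLi"),
}

def score_by_risk_group(triggered, rule_groups=RULE_GROUPS):
    """Add each triggered rule's score to every risk group it maps to."""
    scores = {group: 0 for group in RISK_GROUPS}
    for rule_id, score in triggered:
        for group in rule_groups.get(rule_id, ()):
            scores[group] += score
    return scores

# Hypothetical request that triggered both rules.
triggered = [("rule-a", 25), ("rule-b", 10)]
print(score_by_risk_group(triggered))
```

Because `rule-b` maps to two groups, its score is counted in both, so the per-group totals are not expected to add up to a single CRS-style anomaly score.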

  26. CRS Issue #2 – Multiple Thresholds • Different risks require different anomaly thresholds • XSS Attack: <script>alert('xss')</script> => Score 30 • CMDi Attack: ; /bin/sh cat /etc/passwd => Score 5 • Valid XML: <book> Hello World </book> => Score 10 • [Figure labels: <xss>, <xml>; thresholds shown: 25 and 5]
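The three requests on this slide show why one global threshold cannot fit every risk. The sketch below compares a single threshold with per-group thresholds. Two things are assumptions, because the transcript does not preserve the figure: that 25 is the XSS threshold and 5 the CMDi threshold, and that the valid XML document accrues its score of 10 in the XSS group. Blocking when the score is greater than or equal to the threshold is also an assumption.

```python
# Assumed mapping: the transcript shows thresholds 25 and 5 but not their groups.
GROUP_THRESHOLDS = {"XSS": 25, "CMDi": 5}

def blocks_single(scores, threshold):
    """Single-score model: block when the summed anomaly score reaches the threshold."""
    return sum(scores.values()) >= threshold

def blocks_per_group(scores, thresholds=GROUP_THRESHOLDS):
    """Risk-group model: block when any group reaches its own threshold."""
    return any(scores.get(group, 0) >= limit for group, limit in thresholds.items())

requests = {
    "xss_attack": {"XSS": 30},   # <script>alert('xss')</script>
    "cmdi_attack": {"CMDi": 5},  # ; /bin/sh cat /etc/passwd
    "valid_xml": {"XSS": 10},    # <book> Hello World </book> (group is assumed)
}

for name, scores in requests.items():
    print(name,
          "single@25:", blocks_single(scores, 25),
          "single@5:", blocks_single(scores, 5),
          "per-group:", blocks_per_group(scores))
```

Under these assumptions a single threshold of 25 misses the command injection, a single threshold of 5 blocks the valid XML, and only the per-group thresholds handle all three requests correctly.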

  27. CRS Issue #2 TH

  28. CRS Issue #3 – HTTP Violations “BLOCK HTTP PROTOCOL VIOLATIONS ?!???THAT’S LIKE 1.21 PETABYTES OF LOGS PER DAY!!!!!”

  29. CRS Issue #3 – HTTP Violations • HTTP RFC Enforcement?! Good Luck! • APIs, REST services, RSS feeds, Good Bots: most don’t adhere to the HTTP RFC • Prior to system tuning: • Missing Accept Header (960015): 14% • Missing User-Agent Header (960009): 3% • Can’t trust HTTP violation rules on their own • “Invalid HTTP” risk group with its own threshold • Blocks only seriously damaged HTTP requests • Build more focused tool fingerprints • See next slide for an explanation of 960015
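The idea of an “Invalid HTTP” risk group with its own threshold can be sketched as below. Rule IDs 960015 and 960009 are the ones named on the slide; the per-rule scores, the threshold of 5, and the extra `missing-host` check are illustrative assumptions added so that a badly broken request has something to accumulate.

```python
# Each check maps a rule to (required header, score). Scores are assumed.
CHECKS = {
    "960015": ("accept", 2),        # Missing Accept header (from the slide)
    "960009": ("user-agent", 2),    # Missing User-Agent header (from the slide)
    "missing-host": ("host", 2),    # Hypothetical extra check, for illustration
}
INVALID_HTTP_THRESHOLD = 5  # Assumed value

def invalid_http(headers):
    """Return (triggered rules, group score, block decision) for a header dict."""
    present = {name.lower() for name in headers}
    triggered = [rule for rule, (header, _) in CHECKS.items()
                 if header not in present]
    score = sum(CHECKS[rule][1] for rule in triggered)
    return triggered, score, score >= INVALID_HTTP_THRESHOLD

# A feed reader that omits Accept triggers one rule but stays under the threshold.
print(invalid_http({"Host": "example.com", "User-Agent": "feed-reader"}))
# A request with no headers at all triggers every check and is blocked.
print(invalid_http({}))
```

With this layout a single missing header is logged but never blocks on its own, which matches the slide's point that HTTP violation rules cannot be trusted individually.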

  30. 960015 – Research into 3 hours of triggers • Which URLs trigger this rule? 85% static media files • Perhaps a unique User-Agent? 95.1K “unique” UAs • Anything in common? Common: Android (50%), AppleWebKit (19%), News (21%), App (20%); “Android” string found in 50% • Can you give me something else?

  31. CRS Issue #4: Cookies YEAR: 2003 SESSID = 12f0a0193b4d93e9s92a39af; Quite easy to spot a SQLi or XSS payload in a cookie

  32. CRS Issue #4: Cookies YEAR: 2013 C1state = 24~1~-1~-1~E~6~6~6~10~10~0~0~|~37A1B34A~2EBA820B~0AEBA380~130959B9~0327C30B~7617CC73~21B797A5~C6392AF5~5FE036DB~|~8A173E13~7F5D33BF~30DFEF65~|~~|~0~1~2~3~4~5|3~4~6~7~8||0~1~2|4~4~6||~|~0~0~0~0~0~0~|~0~0~0~0~0~|~~|~~|~~|~~|; C2state = PC#1382573257902-104085.19_06#1384742638|cat#true#1383533098|session#1383533019933-203317#1383534898; C3data = {"v":1,"rid":"1371546489873_699561","to":5,"c":"http://www.some.site/page.aspx?a=5","pv":2,"lc":{"d0":{"v":2,"s":true}},"cd":0,"sd":0,"f":1371546904751} ; Cinfo= 1403D3394_232#scroll on "//<![CDATA[(function() { var f5_cspm = { pass_params: '1102912_0394939_19210_24253..."

  33. CRS Issue #5: Score Spreading Across Selectors In many FP scenarios, the score spreads across “selectors”: c1 = 1384044727071|ABCD:2::|AC:1::|PSD:0:AKFJ~MOBILE^CLAK_KOL:1385149290276 [950901 - 5] c2 = bn:Samsung|mn:GT-I9300 Galaxy S III|tb:false|mb:true|dos:Android|dosv:4.1|bos:KJSKKL|bosv:9 [981172 - 3] c3 = PC#1383939352901-916004.20_14#1386636727|check#true#1384044787|session#1384044726390-399957#1384046587 [981231 - 3] c4 = "" [981318, 981242 - 2, 5] (Total Score: 18) Consider an FP reduction heuristic that reduces the total score when it is spread across selectors? There are security implications…
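The slide only raises the idea of a spread-aware heuristic and does not define one. The sketch below is one hypothetical form of it, shown purely to make the question concrete: the selector with the highest score counts in full, and scores from all other selectors are discounted by an assumed `spread_weight` of 0.5. The selectors, rule IDs and scores are the ones listed on the slide.

```python
from collections import defaultdict

def per_selector_scores(hits):
    """Sum rule scores per selector (here, per cookie)."""
    totals = defaultdict(int)
    for selector, _rule_id, score in hits:
        totals[selector] += score
    return dict(totals)

def adjusted_score(hits, spread_weight=0.5):
    """Hypothetical heuristic: full weight for the strongest selector,
    discounted weight for scores spread over the remaining selectors."""
    totals = per_selector_scores(hits)
    if not totals:
        return 0.0
    strongest = max(totals.values())
    rest = sum(totals.values()) - strongest
    return strongest + spread_weight * rest

# (selector, rule ID, score) triples from the slide; they sum to 18.
hits = [
    ("c1", "950901", 5),
    ("c2", "981172", 3),
    ("c3", "981231", 3),
    ("c4", "981318", 2),
    ("c4", "981242", 5),
]

print(per_selector_scores(hits))
print(adjusted_score(hits))
```

As the slide warns, any such discount has security implications: an attacker who deliberately splits a payload across several parameters or cookies would also benefit from it.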

  34. CRS Issue #6: Rule Inefficiency During our big data analysis & AWT usage, we noticed a few troubling rule issues: • Many rules have redundancies in expressions • This tends to push the anomaly score up in many scenarios (“reinforcing a FP”) • Forces pushing the threshold much higher than really needed • Some rules combine weak & strong signatures • FP-prone rules generate high scores, and reducing their “weight” hurts the accurate signatures in them • Some rules seemed almost useless, e.g. 981172

  35. Summary • Big Data: • OWASP / ModSecurity should consider collecting anonymized trigger information • CRS would greatly benefit from a much larger sample set • CRS Future: • Akamai has already contributed to the CRS project, and will continue to contribute back to the community • We highly recommend adopting some of the major changes made @ Akamai, mainly the “risk groups” model & multiple thresholds • WAF Testing: • Now that the WAF industry has matured, it is time for WAF deployments to be measured for accuracy using the tools & methods mentioned here: Precision, Recall and MCC

  36. Thank you
