1 / 41

Use of AI algorithms in design of Web Application Security Testing Framework

Use of AI algorithms in design of Web Application Security Testing Framework. HITCON Taipei 2006. Or a “non-monkey” approach to hacking web applications. By fyodor and meder fygrave@o0o.nu meder@o0o.nu. “No. we are not writing another web scanner!!”. Agenda. Why hacking web applications

akasma
Download Presentation

Use of AI algorithms in design of Web Application Security Testing Framework

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Use of AI algorithms in design of Web Application Security Testing Framework HITCON Taipei 2006

  2. Or a “non-monkey” approach to hacking web applications By fyodor and meder fygrave@o0o.numeder@o0o.nu

  3. “No. we are not writing another web scanner!!”

  4. Agenda • Why hacking web applications • What scanners do. Why they are useless (or not) • What else could be done, but isn’t (yet) • Introduction to YAWATT • User-session based approach • Distributed • Intelligent (or not?) • Modular • More than “application security scanner” coverage

  5. This work background • STIF, STIF2 automation – agent-based cooperative automated hacking environment http://o0o.nu/sec/STIF

  6. So, why going for the web • They learnt to configure their firewalls • They learnt to disable services they don’t want • They finally know how to use nmap (and even nessus!!) …. • But they still want web • And they can’t learn to code

  7. So why web applications • Applications get complex • Multilayered frameworks make it even more fun • Amount of web application based services grow • Number of web application programmers increase (home brewed web applications) but …

  8. Web application remains a larger hole into one’s network • Web application programmers skills aren’t usually the best • Firewalls are there – just to let you in • application firewalls can stop limited number of web application attacks, but are useless when it comes to detection of logical vulnerabilities • IDS systems aren’t smart enough to pick up on Application attacks

  9. Scanners.. use of.. • checking for enumeration ... YES • checking for exectution ... YES • checking if we can drop table YES • checking if we can drop database .. YES .. • CANNOT CONNECT TO APPLICATION

  10. Scanners - summary • Nessus et all – don’t see web applications beyond the underlying software configuration • Libwhisker/nikto – signature based. Relatively primitive. Efficient for default bugs • Wikto/e-Or – session aware, coding flaws scanner • Kavado/Appscan/Webinspect/N-Stalker/Watchfire Appscan – intelligent scanners. Session aware. Closed “blackbox” (some allow scripted plugins)

  11. Why scanners ain’t enough • Single-host based • Commercial scanners are black-box (not extendable, non-correctable) • Little or no control on “hacking” process • Not easily extendable on the fly with new ‘automation” modules • Often primitive, signature based logic

  12. What would we like to have • Maximum automation of web hacking process • Minimum of code writing. • Autonomous functionality • Knowledge transfer • Ability to add ‘hacks’ on the fly • Deal with uncertainty in “intelligent way” • Learn from valid user session data

  13. Other good things to have • Be able to test new class of bugs (i.e. session hijacking) • Be able to attack web application from multiple-locations (bypass IP restrictions, improve brute-forcing process) • Be able to automate testing of application logic bugs • Be able to make intelligent guesses

  14. Introducing YAWATTmethod

  15. User sessions • User sessions – collections of user request/response pairs (url, name/value pairs, session information and selective HTTP protocol data) • Classified user session data include semantic classification of URL, parameters, responses and HTTP protocol data (server type, backend system(s) if visible, “unusual” HTTP headers content)

  16. User sessions • User session data can be obtained from: • Proxy servers (burp, paros, ..) • Web server logs • Browser automation scripts (i.e. WATIR framework) • Spiders (burp)

  17. Less code, more automation • Application content is learnt from user sessions (data feeders) • Additional application information could be gathered by agent’s plugins (i.e. directory splitting tests) • User session data is classified by: • Semantic and functional classification of URL • HTTP protocol classificators (server type, cookies ..) • Session classificators • Input data classification – type, semantics • Output classification (application error detection, redirects, “bogus’ responses etc) • Test-case suites and executed in groups • Stateless tests • Stateful tests • Mixed

  18. Classification process as new data arrives into the system

  19. Go Intelligent Main components: • Web application components (URL) classification • Semantic classification for web application input data • LSI based mapping and comparison of web content In response analysers. • Use of external search engines • Limited “binary analysis” of downloaded files (decoding pdf, doc, rtf (other formats later)

  20. Knowledge Transfer to machine • Possibility to create new classification rules on the fly (and let the system re-learn from it) • Possibility to ‘reclassify’ application responses • Possibility to add new ‘testing’ plugins on the fly

  21. How is URL classification used • Vulnerability scenario testing – uses ‘classificators’ subscribtion mechanism. • For example: login page tester will need ‘login’, ‘executable’ and ‘session’

  22. How does input data semantics identification happen

  23. How the classified user session data is used

  24. Additional research directions • Other ideas to work on: • Detection of “hidden” parameters • Identification of “hidden” urls • Identification of “negative” and ‘positive” responses • Detection of application failures, redirects • Evaluation and priority based execution for plugins

  25. A note on distributed architecture • Cooperative Agents Infrastructure • Design cooperative agent system • Multi-platform • Portable

  26. Distributed architecture

  27. Distributed architecture (another look)

  28. What distributed approach gives us: • DDoS – EASY!!! • Distributed brute-forcing. Bypassing IP based restrictions, bandwidth limitations • IDS – more tricks • Bypass packet filtering restrictions an agent behind the firewall!

  29. Communication framework • Modified version of spread • Robust • Reliable message delivery • Portable (windows/unix) • Available in C/C++ and Java flavours. Bindings exist for Python, Ruby!

  30. In progress • Agents communicate with message • Task distribution algorithms – in progress

  31. More on intelligence • Aside from application vulnerabilities, other things of interest are: • Email addresses, user ids that could be seen within web content • Domain names (within web pages, comments, binary files, etc) • Building ‘target-oriented’ dictionary files (used by brute-force cracking modules)

  32. Other good things • Add your plugin code on the fly (attack automation plugins via subscription mechanism, classification plugins etc): • Can’t be simpler:

  33. Look mah, no hands! • No reload is needed, plugins executed next time the new data is processed

  34. beyond normalities of average application scanner • Integration and use of other tools to collect and analyse data (search engine queries, ..) • Integration with other tools (script in python or ruby, or hack “plugin” in java or C) • If you like your favourite application hax0r tool – you still can use it (and feed the data to us!)

  35. Other remainders: • Direct interaction with analyst (not fully implemented yet):

  36. Other remainders: • Data lookup and data mining services for plugins (via mySQL database wrapping DataMiner).

  37. Other ‘nice to have’ things in progress • Propogation module: manual or automated agent installation on vulnerable server (controlled worm spreading capability!)

  38. Demo • Code is spaghetti (sorry about that) • Will demonstrate functional bits

  39. Questions and Answers Sample questions, pick one: ;---------) • Why another web hacking tool? • Can you do X too..?

  40. Thanks • Thanks for your patience • The code, slides and docs will be available in a while: http://o0o.nu/sec

  41. Xcon plug •   XCon2006 the Fifth Information Security Conference will be held in Beijing, China, during August 22-24, 2006. • Speaking: abit late, but you can try: casper@xfocus.org • Attending: should be possible and interesting • No politics! ;-) • Thanks!

More Related