Using SQL Hotspots in a Prioritization Heuristic for Detecting All Types of Web Application Vulnerabilities
Ben Smith and Laurie Williams
Input Validation Vulnerabilities
• There is a plethora of proposed mitigation techniques, but no solution eliminates all vulnerabilities.
• Listed in the CWE/SANS Top 25 for 2009.
• Continue to be listed in the CWE/SANS Top 25 for 2010.
• Also identified by SANS as the most common attacks for compromising web sites.
How do we stop this?
• Development organizations do not have the time or resources to detect vulnerabilities in every source file before release.
• Validation and verification (V&V) must be prioritized so that the files most likely to be vulnerable are examined first.
• SQL hotspots may help with this prioritization process.
• Though typically associated with SQL injection, hotspots may be useful for predicting any type of vulnerability.
Goal
The goal of this research is to improve the prioritization of security fortification efforts by investigating whether SQL hotspots can serve as the basis for a heuristic that predicts all vulnerability types.
Agenda
• What are SQL hotspots?
• Case Studies
• Projects
• Methodology
• Results: Eight Hypotheses about Hotspots
• Conclusion: A heuristic for prioritizing V&V efforts
SQL Hotspot
A SQL hotspot is any point in the application source code where the program interacts with a database management system. In PHP, a hotspot is typically indicated by a call to mysql_query() or another database library function.
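The definition above can be turned into a rough counting sketch. This is not the authors' tooling: only mysql_query() is named on the slide, so the other function names in `HOTSPOT_FUNCTIONS` are assumptions added for illustration, and `count_hotspots` and `count_sloc` are hypothetical helpers. The regex approach is deliberately naive and will also count calls that appear inside comments or strings.

```python
import re

# Assumed, non-exhaustive list of PHP database calls treated as hotspots.
# Only mysql_query() comes from the slides; the others are illustrative.
HOTSPOT_FUNCTIONS = ("mysql_query", "mysqli_query", "pg_query")

# Match a whole function name followed by an opening parenthesis.
HOTSPOT_PATTERN = re.compile(r"\b(?:" + "|".join(HOTSPOT_FUNCTIONS) + r")\s*\(")


def count_hotspots(php_source: str) -> int:
    """Count call sites of the assumed hotspot functions in PHP source text."""
    return len(HOTSPOT_PATTERN.findall(php_source))


def count_sloc(php_source: str) -> int:
    """Rough SLOC: count non-blank lines (comment lines are not excluded)."""
    return sum(1 for line in php_source.splitlines() if line.strip())


# Invented PHP fragment used only to exercise the helpers.
example = """<?php
$result = mysql_query("select * from users");
$row = mysql_fetch_array($result);

$other = mysql_query("select * from posts");
"""

print(count_hotspots(example), count_sloc(example))
```

Dividing the first number by the second gives a hotspots-per-line-of-code figure for a file.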
SQL Hotspots (2)
// Intentionally vulnerable example: user input is interpolated directly into the query.
$username = $_POST['username'];
$password = $_POST['password'];
// The call to mysql_query() is the hotspot.
$result = mysql_query(
    "select * from users where username = '$username' AND password = '$password'");
$firstresult = mysql_fetch_array($result);
$role = $firstresult['role'];
$_COOKIE['userrole'] = $role;
Study Subjects
• WordPress
  • Advanced blog management
  • 74% of bloggers run WordPress
  • Uses MySQL and PHP
  • 138,967 SLOC
• WikkaWiki
  • Wiki management system
  • 532 websites are using WikkaWiki
  • Uses MySQL and PHP
  • 46,025 SLOC
CWE Classifications
[Figure omitted; panels: WordPress, WikkaWiki]
Tracing Vulnerabilities to Files
[Figure omitted; panels: WikkaWiki, WordPress]
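Prediction Model
• Contained two terms: number of hotspots and SLOC.
• Logistic regression.
• Trained on releases 1…N, tested on release N+1 (for example, trained on 1.0 to 1.3, tested on 1.4).
• Evaluated by counting true positives (tp), true negatives (tn), false positives (fp), and false negatives (fn).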
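The model described above could be sketched as follows. This is a minimal illustration, not the authors' implementation: the file data is invented, the feature rescaling, learning rate, step count, and 0.5 decision threshold are all assumptions, and the fit uses plain batch gradient descent in place of a statistics package.

```python
import math

# Invented toy data, NOT from the study: (hotspots, sloc, vulnerable) per file.
train_files = [  # stands in for releases 1..N
    (0, 100, 0), (1, 200, 0), (0, 400, 0),
    (6, 300, 1), (8, 500, 1), (5, 150, 1),
]
test_files = [  # stands in for release N+1
    (0, 300, 0), (7, 300, 1), (0, 100, 1), (6, 300, 0),
]


def features(hotspots, sloc):
    """Intercept plus the two model terms, rescaled to similar ranges."""
    return (1.0, hotspots / 10.0, sloc / 1000.0)


def probability(weights, hotspots, sloc):
    """Estimated probability that a file is vulnerable."""
    z = sum(w * x for w, x in zip(weights, features(hotspots, sloc)))
    return 1.0 / (1.0 + math.exp(-z))


def fit(files, rate=0.5, steps=5000):
    """Fit logistic regression weights by batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for hotspots, sloc, label in files:
            error = probability(weights, hotspots, sloc) - label
            for i, x in enumerate(features(hotspots, sloc)):
                grad[i] += error * x / len(files)
        weights = [w - rate * g for w, g in zip(weights, grad)]
    return weights


def confusion(weights, files, threshold=0.5):
    """Return the counts (tp, tn, fp, fn) on a labeled set of files."""
    tp = tn = fp = fn = 0
    for hotspots, sloc, label in files:
        predicted = probability(weights, hotspots, sloc) >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn


weights = fit(train_files)
print("tp, tn, fp, fn =", confusion(weights, test_files))
```

The sign of `weights[1]`, the coefficient on the hotspot term, indicates whether more hotspots raise or lower the predicted likelihood of a vulnerability in this toy data.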
Descriptive Statistics
We used two open source tools: R to test the statistical hypotheses, and Weka for model evaluation.
Hypotheses about Files
H1: The more hotspots a file contains per line of code, the more likely it is that the file contains any type of web application vulnerability (Logit, p < 0.05).
H2: The more hotspots a file contains, the more times that file was changed due to any kind of vulnerability (SLR, p < 0.0001, Adjusted R² = 0.4208, 0.3802).
Hypotheses about Issue Reports
H3: Input validation vulnerabilities result in a higher average number of repository revisions than any other type of vulnerability (consistent with the SANS report). Mann-Whitney-Wilcoxon test (p < 0.05).
Hypotheses about Prediction
H4: Hotspots can be used to predict files that will contain any type of web application vulnerability in the current release (predictive model that does better than a random guess).
H5: The more hotspots a file contains, the more likely that file will be vulnerable in the next release (coefficients on predictive model).
Hypotheses Comparing Projects
H6: The average number of hotspots per file is more variable in WordPress than WikkaWiki (F-test, p < 0.000001).
H7: WordPress suffered a higher proportion of input validation vulnerabilities than WikkaWiki (Chi-Squared Test, p = 0.0692).
H8: In WordPress, more lines of code that were changed due to security issues were hotspots than in WikkaWiki (Chi-Squared Test, p < 0.000001).
Limitations
• We can never find or know all vulnerabilities.
• Our definition of a hotspot may be insufficient or incorrect.
• Issue reports were subject to human error both in reporting and in analyzing.
• We are limited to these two open source projects.
Conclusion
• Hotspots can be used in a V&V prioritization heuristic as follows: more SQL and non-SQL vulnerabilities will be found in files that contain more hotspots per line of code.
• Input validation vulnerabilities: prominent problem, no single solution.
• Separating the concern of database interaction is associated with a decrease in the proportion of reported input validation vulnerabilities.
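The heuristic in the first bullet amounts to ordering files by hotspots per line of code. A minimal sketch, assuming hotspot counts and SLOC are already available per file; the function name, file names, and counts are all invented for illustration:

```python
def rank_by_hotspot_density(files):
    """Order file names from highest to lowest hotspots per line of code.

    `files` maps a file name to a (hotspot_count, sloc) pair. Files with
    zero SLOC get density 0.0 to avoid division by zero.
    """
    def density(item):
        _, (hotspots, sloc) = item
        return hotspots / sloc if sloc else 0.0

    ranked = sorted(files.items(), key=density, reverse=True)
    return [name for name, _ in ranked]


# Hypothetical file names and counts, invented for illustration.
files = {
    "render.php": (1, 400),   # density 0.0025
    "login.php": (4, 80),     # density 0.05
    "empty.php": (0, 0),      # density 0.0 by convention
    "search.php": (3, 150),   # density 0.02
}

for name in rank_by_hotspot_density(files):
    print(name)
```

Files at the top of the list would be the first candidates for V&V effort under this heuristic.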
Thank you!
• Any questions?
Precision & Recall
• Precision: a measure of the level of exactness exhibited by the model, tp / (tp + fp).
• Recall: the fraction of vulnerable files that the model retrieves, tp / (tp + fn).
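Both measures follow directly from the tp, fp, and fn counts. A small sketch using the standard definitions; the counts in the example are invented and are not results from the study, and returning 0.0 for an empty denominator is a convention chosen here:

```python
def precision(tp, fp):
    """Fraction of files flagged as vulnerable that really are vulnerable."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def recall(tp, fn):
    """Fraction of truly vulnerable files that the model flagged."""
    vulnerable = tp + fn
    return tp / vulnerable if vulnerable else 0.0


# Invented counts for illustration only.
tp, fp, fn = 30, 10, 20
print("precision:", precision(tp, fp))
print("recall:", recall(tp, fn))
```

With these invented counts, precision is 0.75 and recall is 0.6.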
SQL Injection Attacks
Attacker-supplied username: ' OR 1=1 --
$username = $_POST['username'];
$password = $_POST['password'];
// The query as executed once the attacker's username has been substituted:
$result = mysql_query(
    "select * from users where username = '' OR 1=1 --' AND password = '$password'");
$firstresult = mysql_fetch_array($result);
$role = $firstresult['role'];
$_COOKIE['userrole'] = $role;
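The effect of the injected input can be reproduced in a small, self-contained sketch. It uses Python's built-in sqlite3 module in place of PHP and MySQL, which is a substitution made here so the example runs anywhere; the table contents and function names are invented. The second function shows one common mitigation, parameter binding, for contrast.

```python
import sqlite3

# In-memory database with invented users, standing in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("create table users (username text, password text, role text)")
conn.executemany(
    "insert into users values (?, ?, ?)",
    [("alice", "s3cret", "admin"), ("bob", "hunter2", "user")],
)


def login_vulnerable(username, password):
    """Build the query by string interpolation, as on the slide."""
    query = (
        "select * from users where username = '%s' AND password = '%s'"
        % (username, password)
    )
    return conn.execute(query).fetchall()


def login_parameterized(username, password):
    """Pass the inputs as bound parameters so they are treated as data."""
    return conn.execute(
        "select * from users where username = ? AND password = ?",
        (username, password),
    ).fetchall()


attack = "' OR 1=1 --"
print("vulnerable rows:", len(login_vulnerable(attack, "anything")))
print("parameterized rows:", len(login_parameterized(attack, "anything")))
```

In the vulnerable version, everything after `--` is treated as a comment, so the password check disappears and every row matches; the parameterized version matches no rows because no user has that literal username.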