210 likes | 226 Views
Grant Paton-Simpson Paton-Simpson & Associates Ltd www.p-s.co.nz. Parameter Binding & PHP. Protecting your web application from SQL injection attacks.
E N D
Grant Paton-Simpson Paton-Simpson & Associates Ltd www.p-s.co.nz Parameter Binding & PHP Protecting your web application from SQL injection attacks Revision based on presentation feedback – added simple introduction to prepared statements; added better introduction to SQL injection; mentioned risk of cross-site scripting attacks if no input filtering in addition to use of parameter binding; removed MSSQL page added at last minute (was on stored procedures not prepared statements - :-( );
Introduction • PHP – easy to start, easy to get into bad programming habits (esp if not a programmer ;-) ) • Easy to write code to interact with a database • e.g. $staff_id = $_POST['staff_id']; • strSQL = "SELECT salary FROM tblstaff WHERE staff_id = $staff_id"; • And this is how major texts teach it too • The (major) problem - SQL injection
SQL Injection • User input should always be treated as dangerous, esp when interacting with a database • Example - Instead of the user entering "8" in the ID textbox, they enter "8 OR 1=1" • or enter "bogus' OR 1=1" into the password field • or write "bogus'; DROP ..." • Open Sesame! • Very common – one estimate is about 1 in 8 websites or possibly much higherhttp://portal.spidynamics.com/blogs/msutton/archive/2006/09/26/How-Prevalent-Are-SQL-Injection-Vulnerabilities_3F00_.aspx
Parameter binding as response • Solutions – input filtering etc and possibly parameter binding • What is parameter binding? You make a query template and define the input parameterse.g. "SELECT * FROM tblTable WHERE id=?"Then you feed it the parameters (in this case, the id)Will not change the syntax of the statement, only the value of the input. The input stays safely contained in its place • NB may need input filtering anyway to prevent Cross-site scripting attacks – no point faithfully storing a dangerous script in your database and serving it up to your unsuspecting users
Parameter Binding – A Minimum Standard? • Focusing here on parameter binding • To some people, saying "I don't use parameter binding" is like saying "I lock my house with sellotape"! • The following quote from slashdot is typical of many: • http://it.slashdot.org/comments.pl?threshold=5&mode=thread&commentsort=0&op=Change&sid=191584 • "That bears reiterating. If you are passing user input to a database in anything but a bind variable, you are incompetent. Period. End of story. • I've seen it so many times. Why do programmers think that it's a good idea to write their own escape routines when every database has a facility for denoting what is variable data and what is not? Unbelievable."
Unfair criticism? • Not necessary if already using thorough input filtering/ validation and presumably mysqli_real_escape_string** mysqli_real_escape_string is NOT absolutely bulletproof under all circumstances – in particular if you change to a multibyte charset (other than UTF-8, which is safe) in your script - seehttp://ilia.ws/archives/103-mysql_real_escape_string-versus-Prepared-Statements.html • Often simpler, cleaner code when avoiding binding • There are all sorts of downsides to using prepared statements (more on that later) • Not just a straight substitution • changing existing code can be impractical (nothing is probably absolutely impossible ... but it could get very, very ugly, and complicated)
Not just a straight substitution • May be hard to wrap in classes • Although I managed with the help of call_user_func_array() on the bind method of stmt etc, often the interface was no simpler, just non-standard! • Lots less flexibility (part of the logic of explicitly binding to parameters and not changing the actual semantics of the query template) • Can't replace "SELECT * FROM tblfoo $filter" where $filter will sometimes be a WHERE clause and sometimes be an empty string • "[placeholders] are not allowed ... to specify both operands of a binary operator such as the = equal sign. The latter restriction is necessary because it would be impossible to determine the parameter type.http://www.php.net/manual/en/function.mysqli-stmt-prepare.php • Can't change list of fields displayed etc etc etc
List of Problems with Prepared Statements in PHP • 1. Query cache does not work • 2. Extra server round trip required if statement used only once • 3. Not all statements can be prepared. So you can’t use prepared API exclusively you’ll need to fall back to normal API for some statements • 4. Newer and sometimes buggy code. I had a lot of problems with PHP prepared statements. It is getting better but still it is less mature than standard API • 5. You can’t use placeholders in place of all identifiers. For example you can’t use them for table name. In certain version it even does not work for LIMIT boundaries • 6. Inconvenient list handling. Unlike in for example PEAR emulated prepard statements there is no nice way to pass list of values to IN • 7. Harder tracing. Logs were now fixed to include full statement text not only “Execute” but in SHOW INNODB STATUS you would still see statements without actual values - quite inconvenient for analyses. http://www.mysqlperformanceblog.com/2006/08/02/mysql-prepared-statements/
More balanced approach • Prepared statements can potentially have lots of benefits ... • ... but there can be some issues/problems with using prepared statements – some generic and some specific to the language implementation • So ... worth learning how to use when appropriate
Different implementations • Not a language-specific skill/approach • Python example • # Never do this -- insecure! • symbol = 'IBM' • c.execute("... where symbol = '%s'" % symbol) • # Do this instead • t = (symbol,) • c.execute('select * from stocks where symbol=?', t) • Will focus on PHP and MySQL using mysqli (could also use PDO or PEAR:DB or other databases)
More on Python implementation • SQLite in Python http://www.sqlite.org/lang_expr.html • Parameters can take several forms: • ?NNN A question mark followed by a number NNN holds a spot for the NNN-th parameter. NNN must be between 1 and 999. • ? A question mark that is not followed by a number holds a spot for the next unused parameter. • :AAAA A colon followed by an identifier name holds a spot for a named parameter with the name AAAA. Named parameters are also numbered. The number assigned is the next unused number. To avoid confusion, it is best to avoid mixing named and numbered parameters.
Prepared Statements in MySQL/PHP • Best to use the mysqli extension (PHP 5+ and MySQL 4.1+) • George Schlossnagle made some classes to implement a mysql version in Advanced PHP Programming 2004 p.48 (only to demonstrate the Adaptor design pattern) but it is probably safer to use the built-in version • Prepared Statements in MySQL - about both security or performance
mysqli – Should I make the change? • Some people believe the mysqli extension is slower than mysql unless you are running prepared statements (which can be the fastest depending on the query and level of reuse) • The developer of mysqli says mysqli is faster and more secure (plus more features etc) • MySQL has developed a tool to help PHP programmers switch code from mysql to mysqlihttp://forge.mysql.com/wiki/Converting_to_MySQLi
Template and placeholders • Two flavours of prepared statements – bound parameter and bound result (http://devzone.zend.com/node/view/id/686) • Focus here on bound parameter prepared statements • A query template is created and sent to the MySQL server. The MySQL server receives the query template, validates it to ensure that it is well-formed, parses it to ensure that it is meaningful, and stores it in a special buffer. It then returns a special handle that can later be used to reference the prepared statement. • When a query needs to be made, data to fill in the template is sent to the MySQL server, and then a complete query is formed and then executed.
Possible Performance Benefits • If query "template" reused, lots less data to send around (only the changes), so much better for repeated queries • BUT can be slower - two round-trips to the server, which can slow down simple queries that are only executed a single timehttp://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html • Sends data around in binary form rather than as text • MySQL pre-4.1 used text protocol for data transfer - query was sent as text and result returned back as text. ... number 123 would be sent as string “123″. ... serious performance implication - queries had to be parsed fully each time, all return values had to be converted to the strings on server side and back on the client side, which is pretty expensive especially for certain data types. Furthermore BLOBs require escaping as not all characters could be used in textual protocol, which not only consumed time but also required extra memory consumption both on server and client (http://www.mysqlperformanceblog.com/2006/08/02/mysql-prepared-statements/)
Security Benefits • The data for the query does not need to be passed through a function like mysql_real_escape_string() to ensure that no SQL injection attacks occur. Instead, the MySQL client and server work together to ensure that the sent data is handled safely when it is combined with the prepared statement (http://devzone.zend.com/node/view/id/686) • [escaping is] ... unnecessary when dealing with prepared statements. The separation of the data allows MySQL to automatically take into account these characters and they do not need to be escaped using any special function (http://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html)
mysqli Syntax • $stmt = $mysqli->prepare("INSERT INTO CountryLanguage VALUES (?, ?, ?, ?)"); • $stmt->bind_param('sssd', $code, $language, $official, $percent); • $code = 'DEU'; $language = 'Bavarian'; $official = "F"; $percent = 11.2; • $stmt->execute(); • sssd – string, string, string, double/float (i is integer, and b is blob – s is for anything other than the rest really inc dates) • bind types enable data to be encoded for greater efficiency
Extra details using mysqli • input variables must be bound before executing the statement • output variables must be bound after executing the prepared statement (PHP5 Power Programming Gutmans, Bakken & Rethans, Prentice Hall, 2005, p.157). • remember to close prepared statements - Many memory leaks reported in MySQL Server turned out to be prepare statements or cursors which were forgotten to be closed. Watch Com_stmt_prepare and Com_stmt_close to see if you’re closing all prepared statements. In newer versions you can also use prepared_stmt_count variable to track number of open statements diretly. You can also adjust max_prepared_stmt_count variable which limits how many statements can be open at the same time to avoid overloadhttp://www.mysqlperformanceblog.com/2006/08/02/mysql-prepared-statements/
PDO and Prepared Statements • PDO provides a data-access abstraction layer, which means that, regardless of which database you're using, you use the same functions to issue queries and fetch data. PDO does not provide a database abstraction; it doesn't rewrite SQL or emulate missing features. • Prepared statements are so useful that they are the only feature that PDO will emulate for drivers that don't support them. This ensures that you will be able to use the same data access paradigm regardless of the capabilities of the database (http://nz.php.net/pdo) • PHP 5.1 onwards only
PDO Syntax • Either positional or named placeholders • Positional using "?" : • $stmt = $dbh->prepare("INSERT INTO REGISTRY (name, value) VALUES (?, ?)"); • $stmt->bindParam(1, $name); • Named using ":" prefix : • $stmt = $dbh->prepare("INSERT INTO REGISTRY (name, value) VALUES (:name, :value)"); • $stmt->bindParam(':name', $name);
Links • Article by Zak Greant and Georg Richter http://devzone.zend.com/node/view/id/686 • MySQL articlehttp://dev.mysql.com/tech-resources/articles/4.1/prepared-statements.html • PDO on PHP net http://nz.php.net/pdo • MySQL Performance Blog http://www.mysqlperformanceblog.com/2006/08/02/mysql-prepared-statements/