
Input Validation

Input Validation. James Walden Northern Kentucky University. CWE: Input Validation. Topics. The Nature of Trust Validating Input Entry Points Web Application Input. Trust Relationships. Relationship between multiple entities. Assumptions that certain properties are true.


Presentation Transcript


  1. Input Validation James Walden Northern Kentucky University

  2. CWE: Input Validation CSC 666: Secure Software Engineering

  3. Topics
     • The Nature of Trust
     • Validating Input
     • Entry Points
     • Web Application Input

  4. Trust Relationships
     Relationship between multiple entities.
     • Assumptions that certain properties are true.
       • example: input has a certain format
     • Assumptions that other properties are false.
       • example: input is never longer than X bytes
     Trustworthy entities satisfy assumptions.

  5. Who do you trust?
     Client users
     • example: encryption key embedded in client
     Operating system
     • example: dynamically loaded libraries
     Calling program
     • example: environment variables
     Vendor
     • example: Borland Interbase backdoor, present 1994-2001 and only discovered when the program was made open source

  6. Trust is Transitive
     If you call another program, you are trusting the entities that it trusts.
     • Processes you spawn run with your privileges.
     • Did you run the program you think you did?
       • PATH and IFS environment variables
     • What input format does it use?
       • Shell escapes in editors and mailers
     • What output does it send you?

  7. Validate All Input
     Never trust input.
     • Assume dangerous until proven safe.
     Prefer rejecting data to filtering data.
     • It is difficult to filter out all dangerous input.
     Every component should validate data.
     • Trust is transitive.
     • Don’t trust calling component.
     • Don’t trust called component: shell, SQL

  8. Validation Techniques
     Indirect Selection
     • Allow user to supply index into a list of legitimate values.
     • Application never directly uses user input.
     Whitelist
     • List of valid patterns or strings.
     • Input rejected unless it matches list.
     Blacklist
     • List of invalid patterns or strings.
     • Input rejected if it matches list.
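A minimal C sketch of the first two techniques on the slide above, indirect selection and whitelisting. The list of report types and both function names are hypothetical, chosen only for illustration; they do not come from the slides.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of legitimate values (illustrative only). */
static const char *const REPORT_TYPES[] = { "daily", "weekly", "monthly" };
#define NUM_REPORT_TYPES (sizeof REPORT_TYPES / sizeof REPORT_TYPES[0])

/* Indirect selection: the user supplies only an index, so the
 * application never uses the user's bytes directly.
 * Returns the trusted string, or NULL if the index is out of range. */
const char *select_report_type(long index)
{
    if (index < 0 || (size_t)index >= NUM_REPORT_TYPES)
        return NULL;
    return REPORT_TYPES[index];
}

/* Whitelist: accept the input only if it exactly matches a listed value.
 * Returns 1 if accepted, 0 if rejected. */
int is_whitelisted(const char *input)
{
    size_t i;

    if (input == NULL)
        return 0;
    for (i = 0; i < NUM_REPORT_TYPES; i++)
        if (strcmp(input, REPORT_TYPES[i]) == 0)
            return 1;
    return 0;
}
```

Both functions reject by default: anything not explicitly listed is refused, which is the property a blacklist lacks.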

  9. Trust Boundaries
     [Diagram: Raw Input → Syntax Validation → Semantic Validation → App Logic. Labels: Trust Boundaries, Safe Syntax.]

  10. Wrap Dangerous Functions
      Input is context sensitive.
      • Need more context than is available at front end.
      Solution: create a secure API.
      • Apply context-sensitive input validation to all input.
      • Maintain input validation logic in one place.
      • Ensure validation is always applied.
      • Use static analysis to check for use of dangerous functions replaced by the API.
      [Diagram labels: OWASP ESAPI; Existing Enterprise Security Services/Libraries]

  11. Usability Validation ≠ Security Validation
      Usability validation helps legitimate users
      • Catch common errors.
      • Provide easy to understand feedback.
      • Client-side feedback is helpful for speed.
      Security validation mitigates vulnerabilities
      • Catches potential attacks, including unusual, unfriendly types of input.
      • Provide little to no feedback on reasons for blocking input.
      • Cannot trust client. Always server side.

  12. Check Input Length
      Long input can result in buffer overflows.
      • Can also cause DoS due to low memory.
      Truncation vulnerabilities
      • 8-character long username column in DB.
      • User tries to enter ‘admin   x’ (“admin”, three spaces, “x”) as username.
      • DB returns no match since the name is 9 chars.
      • App inserts data into DB, which truncates it to ‘admin   ’.
      • Later SQL queries for ‘admin’ will return both names, since MySQL ignores trailing spaces in string comparisons.
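One way to avoid the truncation scenario above is to reject over-long names before they reach the database, rather than letting the database shorten them. The sketch below assumes the 8-character column from the slide's example; the function name and the extra rule against leading or trailing spaces are illustrative assumptions.

```c
#include <stddef.h>
#include <string.h>

#define USERNAME_MAX 8  /* assumed column width, taken from the slide's example */

/* Returns 1 if the name fits the column and has no leading or trailing
 * space, 0 otherwise. The input is rejected, never truncated. */
int username_length_ok(const char *name)
{
    size_t len;

    if (name == NULL)
        return 0;
    len = strlen(name);
    if (len == 0 || len > USERNAME_MAX)
        return 0;
    if (name[0] == ' ' || name[len - 1] == ' ')
        return 0;
    return 1;
}
```

With this check in front of the insert, the 9-character name from the example is refused instead of being stored as a second "admin".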

  13. Entry Points
      • Command line arguments
      • Environment variables
      • File descriptors
      • Signal handlers
      • Format strings
      • Paths
      • Shell input
      • Web application input
      • Database input
      • Other input types

  14. Command Line Arguments
      Available to program as **argv.
      execve() allows user to specify arguments.
      May be of any length
      • even program name, argv[0]
      • argv[0] may even be NULL
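The points above suggest checking the argument vector before using it. A minimal sketch, assuming a caller-chosen length limit; the function name and the limit are hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if argv[0] is present and every argument is shorter than
 * max_len bytes; returns 0 otherwise. max_len is a limit chosen by the
 * caller, not something the operating system enforces. */
int args_ok(int argc, char **argv, size_t max_len)
{
    int i;

    if (argc < 1 || argv == NULL || argv[0] == NULL)
        return 0;                      /* execve() caller passed no program name */
    for (i = 0; i < argc; i++) {
        if (argv[i] == NULL)
            return 0;
        if (strlen(argv[i]) >= max_len)
            return 0;                  /* too long for the caller's buffers */
    }
    return 1;
}
```

A program would typically call this first thing in main() and exit on failure, before argv[0] is printed in a usage message or copied anywhere.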

  15. Environment Variables
      Default: inherit parent’s environment.
      execve() allows you to specify environment variables for exec’d process.
      • Environment variables can be of any length.
      Telnet environment propagation to server
      • Server receives client shell’s environment.
      • Server runs setuid program login.
      • ssh may use user’s ~/.ssh/environment file.

  16. Dangerous Environment Variables
      LD_PRELOAD
      • Program loads functions from the library specified in LD_PRELOAD before searching for system libraries.
      • Can replace any library function.
      • setuid root programs don’t honor this variable.
      LD_LIBRARY_PATH
      • Specify list of paths to search for shared libs.
      • Store hacked version of library in first directory.
      • Modern libc implementations disallow it for setuid/setgid programs.

  17. Dangerous Environment Variables
      PATH
      • Search path for binaries
      • Attacker puts a directory with a hacked binary first in PATH so that his ls is used instead of the system ls
      • Avoid “.”, as attacker may place hacked binaries in the directory the program sets as its CWD
      IFS
      • Internal field separator for shell
      • Used to separate command line into arguments
      • Attacker sets it to “/”: /bin/ls becomes “bin” and “ls”

  18. Environment Storage Format
      Access Functions
      • setenv(), getenv()
      Internal Storage Format
      • Array of character pointers, NULL terminated
      • String format: “NAME=value”, NUL-terminated
      • Multiple env variables can have the same name.
      • Did you check the same variable that you fetched? First or last variable that matches?
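Because the storage format permits several entries with the same name, a program can scan the raw array for duplicates before trusting getenv(). A minimal sketch; the function name is hypothetical, and the array is passed in explicitly so the same code works on environ or on a test array:

```c
#include <stddef.h>
#include <string.h>

/* Counts entries in a NULL-terminated "NAME=value" array whose name is
 * exactly `name`. A result above 1 means the variable is duplicated and
 * different lookups may disagree about its value. */
size_t count_env_entries(char *const envp[], const char *name)
{
    size_t count = 0, name_len, i;

    if (envp == NULL || name == NULL)
        return 0;
    name_len = strlen(name);
    for (i = 0; envp[i] != NULL; i++) {
        /* match "NAME=" exactly, so PATH does not match PATHX=... */
        if (strncmp(envp[i], name, name_len) == 0 && envp[i][name_len] == '=')
            count++;
    }
    return count;
}
```

A cautious program might refuse to run, or rebuild the environment from scratch, when any security-relevant name is counted more than once.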

  19. Securing Your Environment
      /* BSS, pp. 318-319 */
      #include <stdlib.h>

      extern char **environ;
      static char *def_env[] = {
          "PATH=/bin:/usr/bin",
          "IFS= \t\n",
          0
      };

      static void clean_environment(void) {
          int i = -1;
          while (environ[++i] != 0)
              ;                        /* count the existing entries */
          while (i--)
              environ[i] = 0;          /* clear every entry */
          i = 0;                       /* i is -1 after the loop above; restart at the first default */
          while (def_env[i])
              putenv(def_env[i++]);    /* install the safe defaults */
      }

  20. Securing Your Environment
      Secure Environment in Shell
          /usr/bin/env - PATH=/bin:/usr/bin IFS=" \t\n" cmd
      Secure Environment in Perl
          %ENV = ( PATH => "/bin:/usr/bin", IFS => " \t\n" );

  21. File Descriptors
      • Default: inherited from parent process
      • stdin, stdout, stderr usually fd’s 0, 1, and 2
      • Parent process may have closed or redirected standard file descriptors
      • Parent may have left some fd’s open
        • Cannot assume first file opened will have fd 3
      • Parent process may not have left enough file descriptors for your program
      • Check using code from BSS, p. 315
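The check described above can be sketched as follows. This is not the code from BSS that the slide cites; it is an illustrative version, with a hypothetical function name, that makes sure descriptors 0, 1 and 2 are open so that files the program opens later cannot end up standing in for stdin, stdout or stderr.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Makes sure descriptors 0, 1 and 2 are open. Any that the parent
 * closed are pointed at /dev/null.
 * Returns 0 on success, -1 on failure. */
int ensure_std_fds(void)
{
    int fd;

    for (fd = 0; fd <= 2; fd++) {
        if (fcntl(fd, F_GETFD) != -1)
            continue;                   /* descriptor is already open */
        if (errno != EBADF)
            return -1;
        /* open() returns the lowest unused descriptor, which is fd here
         * because every lower descriptor has just been verified open. */
        if (open("/dev/null", O_RDWR) != fd)
            return -1;
    }
    return 0;
}
```

Calling this at the very start of main(), before any other open(), is the design choice that makes the "lowest unused descriptor" reasoning hold.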

  22. Signal Handlers
      Default: inherited from parent process.
      /* BSS, p. 316 */
      #include <signal.h>

      int main( int argc, char **argv ) {
          int i;
          /* reset every signal to its default action */
          for (i = 0; i < NSIG; i++)
              signal(i, SIG_DFL);
          return 0;
      }

  23. Format Strings
      Formatted output functions use a format language.
      • Percent (%) symbols in string indicate substitutions.
      • %[flags][width][.precision][length]specifier
      Example format specifiers
      • "%010d", 2009: 0000002009
      • "%4.2f", 3.1415926: 3.14
      Example functions
      • printf()
      • scanf()
      • syslog()

  24. printf() family dangers
      User-specified format strings
      • userstring = "foo %x";
      • printf( userstring );
      • Where can it find arguments to replace %x?
      • The stack: each %x reads 4 bytes further up the stack.
      Solution: use printf( "%s", userstring ) or fputs( userstring, stdout )

  25. printf() family dangers
      Buffer overflows
          char buf[256];
          sprintf( buf, "The data is %s\n", userstring );
      Specify “precision” of string substitution
          sprintf( buf, "The data is %.32s\n", userstring );
      Use snprintf (C99 standard function)
          snprintf( buf, sizeof(buf), "The data is %s\n", userstring );
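Both printf() dangers above (user-controlled format strings and unbounded output) can be handled in one small wrapper. A minimal sketch; the function name is hypothetical:

```c
#include <stddef.h>
#include <stdio.h>

/* Copies untrusted text into out using a constant format string, so any
 * '%' in the text is treated as data rather than as a conversion.
 * Returns 0 on success, -1 if the text did not fit or on error. */
int format_untrusted(char *out, size_t out_size, const char *untrusted)
{
    int n = snprintf(out, out_size, "%s", untrusted);

    if (n < 0 || (size_t)n >= out_size)
        return -1;      /* encoding error, or output was truncated */
    return 0;
}
```

Checking the return value matters: snprintf() never overflows the buffer, but it truncates silently unless the caller compares the result against the buffer size.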

  26. %n format command
      Number of characters written so far is stored into the integer indicated by the int * pointer argument.
          char buf[] = "0123456789";
          int n;
          printf("buf=%s%n\n", buf, &n);
          printf("n=%d\n", n);
      Output:
      • buf=0123456789
      • n=14

  27. %n format attack
      Plan of Attack
      • Find address of variable to overwrite
      • Place address of variable on stack (as part of format string) so %n will write to that address
      • Write # of characters equal to value to insert into variable (use precision, e.g., %.64x)
      Use %n to write anywhere in memory
      • Address on stack can point to any location

  28. String Inputs
      Aspects of string input
      • Length
      • Character encoding
      Encodings describe how bits map to chars
      • ASCII: 7-bit encoding for English.
      • ISO-8859-1 (Latin 1): 8-bit encoding
        • Compatible with ASCII
        • New chars for other Latin alphabet languages
        • Windows-1252 variant common
      • Unicode: family of encodings for all languages

  29. Universal Character Set (UCS)
      Can represent all chars for all languages.
      • Represents characters as code points.
      • Planes are groups of 65,536 numerical values that represent code points.
      • 1,112,064 code points from 17 planes are accessible with current encodings.
      Basic Multilingual Plane (BMP)
      • The first 65,536 UCS characters.
      • UCS-2 was an early 16-bit encoding to represent only characters from the BMP.
      Supplementary Ideographic Plane
      • Contains many CJK ideographs.

  30. Map of the BMP

  31. Unicode Encodings
      UTF-8
      • Variable length 8-, 16-, 24-, or 32-bit encoding
      • Can represent any char on the 17 planes.
      • Backwards compatible: first 128 chars are ASCII.
      • Over half of web pages use UTF-8 encoding.
      UTF-16
      • Variable length 16-bit or 32-bit encoding
      • Can represent any char on the 17 planes.
      • Used in Windows API since W2k.
      • Java added UTF-16 support in Java 5.
      • Special syntax for non-BMP in most languages.

  32. UTF-8
      [Table of UTF-8 byte-sequence layouts (6 rows) not included in transcript]
      Problems
      • Note that not all bit sequences are valid.
      • The same character can be represented using the layout in each of the 6 rows of the table (overlong encodings), which can bypass input validation.
      • Different characters may look identical on the screen and can fool users into clicking malicious URLs.
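The first two problems above (invalid sequences and multiple encodings of one character) can be caught by validating UTF-8 strictly before any other check runs. The following is an illustrative sketch with a hypothetical function name; it follows the current four-byte definition of UTF-8 rather than the older six-row layout the slide refers to.

```c
#include <stddef.h>

/* Returns 1 if the len bytes at s are well-formed UTF-8, 0 otherwise.
 * Rejects overlong forms, UTF-16 surrogates and values above U+10FFFF. */
int utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned long cp, min;
        size_t extra, k;
        unsigned char b = s[i++];

        if (b < 0x80)                { continue; }  /* plain ASCII */
        else if ((b & 0xE0) == 0xC0) { extra = 1; cp = b & 0x1F; min = 0x80; }
        else if ((b & 0xF0) == 0xE0) { extra = 2; cp = b & 0x0F; min = 0x800; }
        else if ((b & 0xF8) == 0xF0) { extra = 3; cp = b & 0x07; min = 0x10000; }
        else return 0;               /* stray continuation or invalid lead byte */

        if (len - i < extra)
            return 0;                /* sequence is truncated */
        for (k = 0; k < extra; k++) {
            unsigned char c = s[i++];
            if ((c & 0xC0) != 0x80)
                return 0;            /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < min)                     return 0;  /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;  /* surrogate */
        if (cp > 0x10FFFF)                return 0;  /* beyond the 17 planes */
    }
    return 1;
}
```

The `min` comparison is what defeats overlong forms: the two-byte sequence C0 AF decodes to 0x2F ("/"), which is below the two-byte minimum of 0x80 and is therefore rejected. Look-alike characters are a separate problem that byte-level validation does not address.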

  33. Paths
      If attacker controls paths used by program
      • Can read files accessible by program.
      • Can write files accessible by program.
      Vulnerable if the program’s access differs from the attacker’s
      • Privileged (SETUID) local programs.
      • Remote server applications, including web.
      Directory traversal
      • Use “../../..” to climb out of application’s directory and access files.

  34. Path Traversal Example

  35. The Problem
      How to make correct access control decisions when there are many names for a resource?
      • config
      • ./config
      • /etc/program/config
      • ../program/config
      • /tmp/../etc/program/config

  36. Canonicalization
      • Canonical Name: standard form of a name
        • Generally simplest form.
      • Canonicalize the name, then apply access control
      • Unicode canonicalization (normalization) in Java
        • String s = "\uFE64" + "script" + "\uFE65";
        • s = Normalizer.normalize(s, Form.NFKC);
      • APIs to canonicalize pathnames
        • C: realpath()
        • Java: getCanonicalPath()
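The "canonicalize, then apply access control" rule can be sketched in C with realpath(). The function name and the policy (the path must resolve to a location under a base directory) are illustrative assumptions; the sketch also assumes a POSIX.1-2008 system, where realpath() allocates the result when its second argument is NULL.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>

/* Returns 1 if path resolves to base or to something beneath it,
 * 0 if it resolves elsewhere, -1 if either name cannot be resolved. */
int path_is_under(const char *base, const char *path)
{
    char *rbase = realpath(base, NULL);   /* malloc'd canonical names */
    char *rpath = realpath(path, NULL);
    int result = -1;

    if (rbase != NULL && rpath != NULL) {
        size_t n = strlen(rbase);
        if (strcmp(rbase, "/") == 0)
            result = 1;                   /* everything is under the root */
        else
            /* prefix must end on a component boundary, so that
             * /etc/program does not match /etc/programs */
            result = strncmp(rpath, rbase, n) == 0 &&
                     (rpath[n] == '/' || rpath[n] == '\0');
    }
    free(rbase);
    free(rpath);
    return result;
}
```

Comparing the canonical forms means that names such as /tmp/../etc/program/config are judged by where they actually point. The check is still subject to races if the filesystem changes between the check and the later open().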

  37. Common Naming Issues
      • . represents current directory
      • .. represents parent directory
      • Case sensitivity
      • Windows allows both / and \ in URLs.
      • Windows 8.3 representation of long names
        • Two names for each file for backwards compatibility.
      • Trailing dot in DNS names
        • www.nku.edu. == www.nku.edu
      • URL encoding

  38. Path Traversal Encodings
      URL Encodings
      • %2e%2e%2f translates to ../
      • %2e%2e/ translates to ../
      • ..%2f translates to ../
      • %2e%2e%5c translates to ..\
      Unicode Encodings
      • %c0%af translates to / (overlong UTF-8 encoding)
      • %c1%1c translates to \ (on decoders that accept malformed sequences)
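The URL encodings listed above only hide ".." from checks that run before decoding. A minimal sketch of decoding first and checking afterwards; both function names are hypothetical, and the ".." check is deliberately conservative (it rejects any occurrence of two consecutive dots).

```c
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_value(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes %HH escapes from in into out (capacity out_size, including
 * the terminating NUL). Returns 0 on success, -1 on a malformed escape,
 * an encoded NUL byte, or insufficient space. */
int url_decode(const char *in, char *out, size_t out_size)
{
    size_t o = 0;

    if (out_size == 0)
        return -1;
    while (*in != '\0') {
        unsigned char c = (unsigned char)*in++;
        if (c == '%') {
            int hi = hex_value((unsigned char)in[0]);
            int lo = hi < 0 ? -1 : hex_value((unsigned char)in[1]);
            if (lo < 0)
                return -1;              /* "%", "%z" or "%4z" */
            c = (unsigned char)(hi * 16 + lo);
            if (c == '\0')
                return -1;              /* %00 would cut the string short */
            in += 2;
        }
        if (o + 1 >= out_size)
            return -1;                  /* no room for this byte plus NUL */
        out[o++] = (char)c;
    }
    out[o] = '\0';
    return 0;
}

/* Returns 1 if the decoded text contains "..", 0 otherwise. */
int contains_dotdot(const char *decoded)
{
    return strstr(decoded, "..") != NULL;
}
```

Decoding exactly once and validating the result is the point of the sketch; decoding a second time after validation would reopen the hole for double-encoded input such as %252e.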

  39. Win/Apache Directory Traversal
      Found in Apache 2.0.39 and earlier.
      To view the file winnt\win.ini, use:
          http://127.0.0.1/error/%5c%2e%2e%5c%2e%2e%5c%2e%2e%5c%2e%2e%5cwinnt%5cwin.ini
      which is the URL-encoded form of
          http://127.0.0.1/error/\..\..\..\..\winnt\win.ini

  40. Command Injection
      Find a program that invokes a subshell command with user input
      • UNIX C: system(), popen(), …
      • Windows C: CreateProcess(), ShellExecute()
      • Java: java.lang.Runtime.exec()
      • Perl: system(), ``, open()
      Use shell meta-characters to insert user-defined code into the command.

  41. UNIX Shell Metacharacters
      • `command` will execute command
      • ‘;’ separates commands
      • ‘|’ creates a pipe between two commands
      • ‘&&’ and ‘||’: logical operators which may execute following command
      • ‘!’: logical negation, reverses truth value of test
      • ‘-’ could convert filename into an argument
      • ‘*’ and ‘?’ glob, matching files, which may be interpreted as args: what if “-rf” is a file?
      • ‘#’ comments to end of line

  42. Command Injection in C
      /* Mail to root with user-defined subject (vulnerable: argv[1] reaches the shell unchecked) */
      #include <stdio.h>
      #include <stdlib.h>

      int main( int argc, char **argv ) {
          char buf[1024];
          sprintf( buf, "/bin/mail -s %s root </tmp/message", argv[1] );
          system( buf );
      }

  43. Command Injection in C
      How to exploit?
          ./mailprog \`/path/to/hacked_bin\`
      /path/to/hacked_bin will be run by mailprog
      How to fix?
      • Verify input matches list of safe strings.
      • Run /bin/mail using fork/exec w/o a subshell.
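The second fix above, running the program with fork/exec and no subshell, might look like the following sketch. The function name is hypothetical. Because no shell parses the arguments, backticks, semicolons and other metacharacters in them are passed through as literal text.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at argv[0] with the given arguments, no shell involved.
 * Returns the child's exit status, or -1 on failure or abnormal exit. */
int run_without_shell(char *const argv[])
{
    int status;
    pid_t pid;

    if (argv == NULL || argv[0] == NULL)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);   /* full path: no PATH search, no shell */
        _exit(127);             /* reached only if execv() failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For the mail example, the argument vector would be something like { "/bin/mail", "-s", subject, "root", NULL }, with stdin redirected from the message file by the caller. The subject still needs validation: a value beginning with ‘-’ could be read by the mail program as an option, and execv() passes the inherited environment along unchanged.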

  44. Command Injection in Java
      String btype = request.getParameter("backuptype");
      String cmd = new String("cmd.exe /K \"c:\\util\\rmanDB.bat " + btype + "&&c:\\utl\\cleanup.bat\"");
      Runtime.getRuntime().exec(cmd);

  45. Command Injection in Java
      How to exploit?
      • Edit HTTP parameter via web browser.
      • Set btype to be “&& del c:\\dbms\\*.*”
      How to defend?
      • Verify input matches list of safe strings.
      • Run commands separately w/o cmd.exe.

  46. Web-based Input
      Sources of Input:
      • URLs, including paths + parameters
      • POST form parameters
      • HTTP headers
      • Cookies
      Common Types of Input:
      • HTML
      • Javascript
      • URL-encoded parameters
      • XML/JSON

  47. Different Perspectives
      Client Dangers
      • Dangerous code
        • ActiveX
        • ActionScript
        • Javascript
        • Java
      • Client-side storage
        • Cookies
        • Flash LSOs
        • DOM storage
      Server Dangers
      • No data sent to client is secret:
        • Hidden fields
        • Cookies
      • User controls client.
        • Can bypass validation.
        • Can access URLs in any order.
        • Can alter client-side storage.

  48. URL Parameters
      <proto>://<user>@<host>:<port>/<path>?<qstr>
      • Whitespace marks end of URL
      • “@” separates userinfo from host
      • “?” marks beginning of query string
      • “&” separates query parameters
      • %HH represents the character with hex value HH
        • ex: %20 represents a space

  49. HTML Special Characters
      • “<” begins a tag
      • “>” ends a tag
        • some browsers will auto-insert matching “<”
      • “&” begins a character entity
        • ex: &lt; represents literal “<” character
      • Quotes (‘ and “) are used to enclose attribute values, but don’t have to be used.

  50. Cookies
      Parameters
      • Name
      • Value
      • Expiration Date
      • Domain
      • Path
      • Secure Connections Only
