C Characters & Strings. Character Review Character Handling Library Initialization String Conversion Functions String Handling Library Standard Input/Output Library Functions. Use of safe libraries.
C Characters & Strings • Character Review • Character Handling Library • Initialization • String Conversion Functions • String Handling Library • Standard Input/Output Library Functions
Use of safe libraries • Buffer overflows are common in C and C++ because both languages expose the low-level representation of buffers as containers for data types • Avoiding them requires a high degree of correctness in any code that performs buffer management
Use of safe libraries • It has also long been recommended to avoid standard library functions which are not bounds checked, such as gets, scanf and strcpy • The Morris worm exploited a gets call in fingerd
Character Review • Know your ASCII • '0' 48 dec 0x30 0011 0000 • '9' 57 dec 0x39 0011 1001 • 'A' 65 dec 0x41 0100 0001 • 'Z' 90 dec 0x5A 0101 1010 • 'a' 97 dec 0x61 0110 0001 • 'z' 122 dec 0x7A 0111 1010
Character Handling Library • Use #include <ctype.h> header file • Basic functions include: • Conversion between case • Test for upper or lower case • Test for letters (and digits and alphanumeric) • Test for blank spaces • See example at char.c
Functions in <ctype.h> Library • See list on p. 312 in textbook • Each function takes an int (a character value representable as unsigned char, or EOF) • int isdigit(int c) • Is it a digit? • int islower(int c) • Is it lowercase letter? • int isupper(int c) • Is it uppercase letter? • int tolower(int c) • Convert to lowercase • int toupper(int c) • Convert to uppercase • int isalpha(int c) • Is it a letter? • int isalnum(int c) • Is it a letter or digit? • int isspace(int c) • Is it a space, tab, or newline? • int ispunct(int c) • Is it printable but not space or alphanumeric?
String Initialization • Each of the following variables stores 6 characters + null character ('\0' = 0x0) • char s1[]="string"; • char s2[7]="string"; • char s3[]={'s','t','r','i','n','g','\0'}; • char s4[7]={'s','t','r','i','n','g','\0'}; • #define SIZE 7 • char s5[SIZE]="string";
String Initialization • Stores one 'a' + null character ('\0' = 0x0) • char s6[7]={'a'}; • Stores 'a' + 'b' + null character ('\0' = 0x0) • char s7[7]={'a', 'b'}; • Stores 7 null characters ('\0' = 0x0) • char s8[7]={0}; • char s9[7]={0x0}; • char s10[7]={'\0'};
String Initialization • Creates an array of pointers to two constant character strings (See string2.c)
char *c1[] = {"zero","one"};
printf("%s\n",c1[0]);             /* zero */
printf("%p\n",(void *)c1[0]);     /* 0x10950 */
printf("%p\n",(void *)&c1[0]);    /* 0xffbffa60 */
printf("%c\n",c1[0][0]);          /* z */
printf("%p\n",(void *)&c1[0][0]); /* 0x10950 */
c1[0][0]='a';   /* undefined behavior: writes to a string literal (typically a segmentation fault) */
c1[0]=c1[1];    /* assign another address */
printf("%s\n",c1[0]);             /* one */
String Initialization • Creates a two-dimensional array of characters (2 rows of 5)
char c2[2][5] = {"zero", "one"};
printf("%s\n",c2[0]);             /* zero */
printf("%p\n",(void *)c2[0]);     /* 0xffbffa50 */
printf("%p\n",(void *)&c2[0]);    /* 0xffbffa50 */
printf("%c\n",c2[0][0]);          /* z */
printf("%p\n",(void *)&c2[0][0]); /* 0xffbffa50 */
c2[0][0]='a';   /* ok: the array is writable */
c2[0]=c2[1];    /* compile error (incompatible types): arrays cannot be assigned */
String Initialization • Creates the same kind of two-dimensional array from individual characters
char c3[2][5] = {'z','e','r','o','\0', 'o','n','e','\0','\0'};
printf("%s\n",c3[0]);             /* zero */
printf("%p\n",(void *)c3[0]);     /* 0xffbffa40 */
printf("%p\n",(void *)&c3[0]);    /* 0xffbffa40 */
printf("%c\n",c3[0][0]);          /* z */
printf("%p\n",(void *)&c3[0][0]); /* 0xffbffa40 */
c3[0][0]='a';   /* ok: the array is writable */
c3[0]=c3[1];    /* compile error (incompatible types): arrays cannot be assigned */
String Conversion Functions • Convert from strings to integers and floats • Requires the #include<stdlib.h> header • General utilities library • See p. 317 in textbook
String Conversion Functions • Converts string to an integer or double • int atoi(const char *nPtr); • double atof(const char *nPtr); • Examples int i=atoi("1234"); //i=1234 double f=atof("1.23"); //f=1.23 int x=atoi('5'); /*What’s wrong with this one?*/
String Handling Library • Functions for manipulating strings • Requires the #include<string.h> header • See textbook pp. 326-342 • Basic functions include: • Copying strings • Comparing strings • Searching strings • Determining length of strings • Tokenizing strings
Copy, Append, Compare • char *strcpy(char *s1, const char *s2); • Copies s2 into s1; the value of s1 is returned • Should not be used because of the potential for “buffer overflow” attack • char *strcat(char *s1, const char *s2); • Appends s2 to s1; the value of s1 is returned • s1 must have room for the combined string, so the same overflow risk applies • int strcmp(const char *s1, const char *s2); • s1 > s2 returns positive number • s1 == s2 returns 0 • s1 < s2 returns negative number
Example Code
char str1[]="string";
char str2[20];
strcpy(str2, str1);
strcat(str2, str1);
printf("%s\n",str1);               /* string */
printf("%s\n",str2);               /* stringstring */
printf("%d\n",strcmp(str2,str1));  /* positive, e.g. 115; only the sign is guaranteed */
printf("%d\n",strcmp(str2,str2));  /* 0 */
printf("%d\n",strcmp(str1,str2));  /* negative, e.g. -115 */
//See strcpy.c
Search Functions • char *strchr(const char *s, int c); • Locates & returns a pointer to the 1st occurrence of c in string s • Returns NULL if not found • char *strstr(const char *s1, const char *s2); • Locates & returns a pointer to the 1st occurrence of string s2 in string s1 • Returns NULL if not found
Example Code
char str[]="string", rin[]="rin", abc[]="abc";
char i = 'i', z = 'z';
printf("%p\n",(void *)strchr(str,i));    /* 0xffbefa8b */
printf("%s\n",strchr(str,i));            /* ing */
printf("%p\n",(void *)strchr(str,z));    /* 0x0 */
printf("%p\n",(void *)strstr(str,rin));  /* 0xffbefa8a */
printf("%s\n",strstr(str,rin));          /* ring */
printf("%p\n",(void *)strstr(str,abc));  /* 0x0 */
//See strstr.c
Length of a String • size_t strlen(const char *s); • Returns the length of string s • Does not include the terminating null character • size_t is an unsigned integer type that is large enough to contain the maximum size of an array of any type
Example Code
#include <stdio.h>
#include <string.h>
int main(void){
    char str1[]="apple", *str2="banana";
    char str3[6]={'a','b','\0','c','d','\0'};
    /* strlen returns size_t, so print it with %zu */
    printf("%zu\n",strlen(str1)); /* 5 */
    printf("%zu\n",strlen(str2)); /* 6 */
    printf("%zu\n",strlen(str3)); /* 2: counting stops at the first '\0' */
    return 0;
}
//See strlen.c
String Tokenizer • char *strtok(char *s1, const char *s2); • Used to break a string into tokens • A sequence of characters separated by delimiting characters (usually spaces or punctuation marks) • s1 contains string to be tokenized • s2 contains the characters used to separate the tokens • 1st function call: has s1 as 1st argument • Next function call: NULL as 1st argument • Pointer to current token is returned
String Tokenizer
#include <stdio.h>
#include <string.h>
int main(){
    char str[]="This is a string.";
    /* finds the first token and overwrites the delimiter (space) after it with '\0' */
    char *tPtr=strtok(str," ");
    while(tPtr!=NULL){
        printf("%s\n",tPtr);
        /* NULL continues in the same string; the next delimiter is overwritten with '\0' */
        tPtr=strtok(NULL," ");
    }
    return 0;
}
//See token.c
String Tokenizer • Output from last slide, one token per line: This / is / a / string. • Problem with strtok() is that it changes the original string: each delimiter that ends a token is overwritten with the null character ('\0') • "This is a string." • Becomes: • "This\0is\0a\0string."
Standard I/O Library Functions • See p. 322 in textbook for list of functions in <stdio.h> library • int getchar(void); • Returns one character at a time from standard input, or EOF at end of input • int putchar(int c); • Prints the character stored in “c” • int puts(const char *s); • Prints the string followed by a newline character
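Standard I/O Library Functions • Don’t use this function • char *gets(char *s); • Reads one line from standard input & puts it into array “s”. Includes blank spaces. • Should not be used because of the potential for “buffer overflow” attack: it has no way to limit how many characters it stores • Removed from the standard in C11; fgets, which takes the buffer size, is the usual replacement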