Explore the fundamentals of handling strings in C programming, including string initialization, array of strings, formatting strings, and potential pitfalls to avoid. Learn how to manipulate and process strings efficiently in your programs.
Strings (H&K Chapter 9), Instructor: Gokcen Cilingir, Cpt S 121 (July 19, 2011), Washington State University
Strings
• Although we've used char variables and can work with arrays of chars, we still don't have a convenient way to process a series of characters (a string, in C terminology)
• Strings are an important data structure in most computer applications
  • Word processing
  • Databases
  • File systems
String literals
• We've already been using string constants (or string literals) in our programs:
    printf("CptS %d is almost over!\n", 121);
  Here the first argument, "CptS %d is almost over!\n", is the string literal.
• A string literal is a sequence of characters enclosed within double quotes. Format strings in printf and scanf calls are string literals.
• Just as in printf and scanf calls, escape sequences are allowed in any string literal.
String Basics (1)
• Since C implements the string data structure using char arrays, declaring a string is no different from declaring a char array:
    char string_var[20];
• As with other data types, we can initialize a string when we declare it, in either of two equivalent ways:
    char string_var[20] = "CptS 121";
    char string_var[20] = {'C', 'p', 't', 'S', ' ', '1', '2', '1', '\0'};
• '\0' is the null character, the string terminator (it marks the end of the string)
String Basics (2)
• Notes on the null character
  • When a string is initialized with a string literal on the line where it is declared, the compiler automatically "null terminates" the string (provided the array has room for the terminator).
  • C's string handling functions expect null-terminated strings; any characters to the right of the null character are ignored.
  • Since one character must be reserved for the null termination character, we should always declare an array length that is one character longer than the longest string we expect to store
  • Hence, char name[20] actually declares a string of at most 19 characters (plus the null termination character)
String Basics (3)
• Array of strings:
  • Suppose we want to store a list of student names
  • We can do this by declaring an array of strings, one row for each name:
      #define NUM_STUDENTS 3
      #define MAX_NAME_LENGTH 10
      char student_names[NUM_STUDENTS][MAX_NAME_LENGTH];
  • We can initialize an array of strings "in-line":
      char student_names[NUM_STUDENTS][MAX_NAME_LENGTH] = {"Gening", "Li", "Varun"};
String Basics (4)
• Printing out and reading in strings
    #include <stdio.h>
    #define NUM_STUDENTS 3
    #define MAX_NAME_LENGTH 10
    char student_names[NUM_STUDENTS][MAX_NAME_LENGTH];
    int i;
    for (i = 0; i < NUM_STUDENTS; i++) {
        printf("Please enter student name: ");
        scanf("%s", student_names[i]);
        printf("The name '%s' was just read in.\n", student_names[i]);
    }
• Is the above code robust? Could it lead to a run-time crash?
• If a name of 10 or more characters is entered, this code can easily lead to a run-time crash. Why?
String Basics (5)
• Printing out strings within fields
• Recall that it's possible to put int and double data values into fields, e.g.,
    printf("double value: %.2f", my_float);
    printf("integer value: %5d", my_int);
  A field width (the 5 in %5d) right-justifies the value within the field; the .2 in %.2f sets the precision, not the width.
• Often, we need not only to right-justify strings, but also to left-justify them.
String Basics (6)
• Just as for doubles and ints, we can specify a field width in a printf statement involving a string (%s). By default, the string is right-justified within that field, e.g.,
    printf("string value: %5s\n", my_string); /* string is right-justified within a field of 5 */
• If we want to left-justify the string, we put a minus sign in front of the field width, e.g.,
    printf("string value: %-5s\n", my_string); /* string is left-justified within a field of 5 */
String Basics (7)
• Reading several data types together with strings may give unexpected results if the user does not follow the expected input format.
String Basics (8)
• Recall how scanf takes a numeric input from the user:
  • It skips all leading whitespace characters such as blanks, newlines, and tabs.
  • Starting from the first non-whitespace character, it reads until it reaches a character that is either whitespace or cannot be part of the expected type (e.g., seeing an alphabetic character while expecting an integer stops scanf).
• The same principles apply when taking a string input with %s:
  • After skipping leading whitespace, scanf copies the characters it encounters into successive memory cells of its character array argument until whitespace is encountered.
  • When it has finished scanning, it places a null character at the end of the string.
String Basics (9)
• Let's see what happens when the previous program is given the following input:
    MATH,1270,TR,1800
• The scanf call
    scanf("%s%d%s%d", dept, &course_num, days, &time);
  interprets this all as one string, storing it in dept (bad news!), because %s stops only at whitespace and a comma is not whitespace.
String Basics (10)
• Example problem:
  • Write a program that prompts the user for a word less than 25 characters long and prints a statement like this:
      peace starts with the letter p
  • Have the program process words until it encounters a "word" beginning with the character '9'.
String Basics (11)
• Solution:
    #include <stdio.h>
    #define WORD_LENGTH 25
    int main(void)
    {
        char word[WORD_LENGTH];
        int keep_going;
        do {
            keep_going = 1;
            printf("Enter a word (to quit, start the word with '9'): ");
            scanf("%24s", word); /* read at most WORD_LENGTH - 1 characters */
            if (word[0] == '9')
                keep_going = 0;
            else
                printf("%s starts with the letter %c.\n", word, word[0]);
        } while (keep_going);
        return 0;
    }
String Library Functions (1)
• Unlike values of types such as int and double (and even structs), strings cannot be copied with the assignment operator, and the == operator does not compare their contents.
    char string1[20] = "foo", string2[20];
    string2 = string1; /* Will cause compiler error! */
    if (string1 == string2) /* Compiles, but compares addresses, not contents! */
        printf("The strings are equal!");
• In order to perform assignment and comparison, we need to call upon C library functions…
String Library Functions (2)
• To use string library functions, we need to include the string library:
    #include <string.h>
• String assignment:
    char *strcpy(char *dest, const char *source);
  • Copies source, including its null terminator, into dest (an output parameter); dest must be large enough to hold it
  • Example:
      char string1[20] = "CptS 121", string2[20];
      strcpy(string2, string1); /* string2 now also contains "CptS 121\0" */
String Library Functions (3)
• String assignment: an alternative
    char *strncpy(char *dest, const char *source, size_t n);
  • Copies at most n characters of source to dest. If source has fewer than n characters, '\0' is used to fill the remaining characters
  • Example:
      char string1[20] = "CptS", string2[20];
      strncpy(string2, string1, 6); /* string2 now contains "CptS\0\0" */
String Library Functions (4)
• String assignment: copying only as many characters as will fit in the destination string
• Notice that, if we're not paying attention, it's easy to copy more characters to the destination string than the space available:
    char string1[20] = "CptS", string2[4];
    strncpy(string2, string1, 6); /* strncpy writes "CptS\0\0", but string2 has room for only 4 chars: the null terminators land beyond string2's allocated memory, so the string isn't properly terminated! */
• We can guard against this situation by copying at most one fewer character than the length of the dest array and terminating it ourselves:
    strncpy(dest, source, dest_len - 1);
    dest[dest_len - 1] = '\0';
String Library Functions (5)
• We can use strncpy to extract a series of characters (a substring) from a longer string
• Example: Extract the month from a date string like "24-Sep-1999"
    char date[12] = "24-Sep-1999", month[4];
    strncpy(month, &date[3], 3);
    month[3] = '\0'; /* month now contains "Sep\0" */
String Library Functions (6)
• An alternative for substring extraction, the string tokenizer strtok:
    char *strtok(char *str, const char *delimiters);
  • strtok breaks str into tokens by finding groups of characters separated by any of the characters in delimiters.
  • The first call must provide str and delimiters.
  • Subsequent calls that pass NULL as str find additional tokens in the original string.
  • Each call returns a pointer to the next token, or NULL when no tokens remain.
  • strtok alters str by replacing the first delimiter following each token with '\0'.
String Library Functions (7)
• For example, suppose that we want to extract the department code ("CptS"), course number (121), semester (Spring), and year (2011) elements from "CptS 121, Spring, 2011"
    char course[25] = "CptS 121, Spring, 2011";
    char course_copy[25];
    char *dept_code, *course_num, *semester, *year;
    strcpy(course_copy, course); /* strtok alters the source string */
    dept_code = strtok(course_copy, ", ");
    course_num = strtok(NULL, ", ");
    semester = strtok(NULL, ", ");
    year = strtok(NULL, ", ");
    printf("dept. code: %s\n", dept_code);
    printf("course num: %s\n", course_num);
    printf("semester: %s\n", semester);
    printf("year: %s\n", year);
• Output:
    dept. code: CptS
    course num: 121
    semester: Spring
    year: 2011
String Library Functions (8)
• String concatenation with strcat and strncat:
    char *strcat(char *destination, const char *source);
    char *strncat(char *destination, const char *source, size_t num);
• strcat appends a copy of the source string to the destination string.
  • The terminating null character in destination is overwritten by the first character of source, and a new null character is appended at the end of the new string formed by the concatenation of both in destination.
• strncat appends at most the first num characters of source to destination, plus a terminating null character.
String Library Functions (9)
• For example, given the substrings "CptS", "121", "Spring", and "2011", suppose we want to piece together the bigger string "CptS 121, Spring, 2011":
    char course[25] = ""; /* Empty string: first char is '\0' */
    char dept_code[5] = "CptS";
    char course_num[7] = "121 is";
    char semester[7] = "Spring";
    char year[5] = "2011";
    strcat(course, dept_code);
    strcat(course, " ");
    strncat(course, course_num, 3); /* Take only the first 3 chars */
    strcat(course, ", ");
    strcat(course, semester);
    strcat(course, ", ");
    strcat(course, year);
    printf("%s\n", course);
• Output:
    CptS 121, Spring, 2011
String Library Functions (10)
• String comparison:
    int strcmp(const char *s1, const char *s2);
• Returns
  • a negative value if s1 < s2
  • 0 if s1 equals s2
  • a positive value if s1 > s2
• Example:
    char string1[20] = "CptS 121", string2[20] = "CptS 122";
    int result = strcmp(string1, string2); /* result now contains a negative value */
• Note: To determine whether one string is greater than, less than, or equal to another, strcmp compares the strings one character at a time
• All comparisons are based on character codes (ASCII on most systems)
You Try It (1) Re-write the Selection Sort Algorithm discussed in class so that it operates on an array of strings. Test your code on sample input data sets of strings. The next slide presents the original selection sort code…
You Try It (2)
• Code for Selection Sort
    int find_smallest(int values[], int low, int high);

    void selection_sort(int values[], int num_values)
    {
        int i, index_of_smallest, temp;
        for (i = 0; i < num_values - 1; ++i) {
            /* Find the index of the smallest element in the unsorted list... */
            index_of_smallest = find_smallest(values, i, num_values - 1);
            /* Swap the smallest value in the subarray i .. num_values - 1
               with the value at index i, thereby putting the ith element into place. */
            temp = values[i];
            values[i] = values[index_of_smallest];
            values[index_of_smallest] = temp;
        }
    }

    int find_smallest(int values[], int low, int high)
    {
        int smallest_index, i;
        smallest_index = low;
        for (i = low + 1; i <= high; i++)
            if (values[i] < values[smallest_index])
                smallest_index = i;
        return smallest_index;
    }
References
• J. R. Hanly & E. B. Koffman, Problem Solving and Program Design in C (6th Ed.), Addison-Wesley, 2010.
• K. N. King, C Programming: A Modern Approach, W. W. Norton & Company, 1996.