Characters, Strings and Regular Expressions
Characters
• The char data type is used to represent a single character.
• Characters are stored in computer memory using some form of encoding.
• Java uses Unicode, which includes ASCII, for representing char constants.
ASCII Encoding
For example, the character 'O' has ASCII code 79: in the ASCII table it sits at row value 70 and column value 9 (70 + 9 = 79).
Unicode Encoding
• The Unicode Worldwide Character Standard (Unicode) supports the interchange, processing, and display of the written texts of diverse languages.
• Java uses the Unicode standard for representing char constants.
Example:
char ch1 = 'X';
System.out.println(ch1);
System.out.println((int) ch1);
Output:
X
88
Character Processing
Declaration and initialization (only ch2 is initialized here):
char ch1, ch2 = 'X';
Type conversion between int and char:
System.out.print("ASCII code of character X is " + (int) 'X');
System.out.print("Character with ASCII code 88 is " + (char) 88);
Comparison:
'A' < 'c'
This comparison returns true because the ASCII value of 'A' is 65 while that of 'c' is 99.
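The conversions above can be tried in a small program. This is only a sketch: the class name CharCodeDemo and its two helper methods are made up for illustration and do not come from the slides.

```java
// Hypothetical demo class; all names here are illustrative only.
class CharCodeDemo {
    // Casting a char to int yields its Unicode code (same as ASCII for this range).
    static int codeOf(char c) {
        return (int) c;
    }

    // Casting an int to char yields the character with that code.
    static char charOf(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        System.out.println("Code of 'X' is " + codeOf('X'));           // 88
        System.out.println("Character with code 88 is " + charOf(88)); // X
        System.out.println('A' < 'c');                                 // true, since 65 < 99
    }
}
```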
Strings
• A string is a sequence of characters that is treated as a single value.
• Instances of the String class are used to represent strings in Java.
String declaration
• Create a String object with new:
String <variable name>;
<variable name> = new String("<value of a string>");
• Assign a String literal:
String <variable name>;
<variable name> = "<value of a string>";
Example
String word1;
word1 = new String("Java");
OR
String word1;
word1 = "Java";
Examples We can do this because String objects are immutable.
String constructor
• No-argument constructor
• One-argument constructor
  • A String object
• One-argument constructor
  • A char array
• Three-argument constructor
  • A char array
  • An integer specifying the starting position
  • An integer specifying the number of characters to access
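The four constructors listed above might be exercised as follows. This is a sketch under assumptions: the class name StringConstructorDemo, the method build, and the sample char array are invented for illustration.

```java
// Hypothetical demo class; the sample data is illustrative only.
class StringConstructorDemo {
    // Builds one string with each of the four constructors from the list.
    static String[] build() {
        char[] letters = {'b', 'i', 'r', 't', 'h', 'd', 'a', 'y'};
        String empty = new String();                 // no-argument: the empty string
        String copy = new String("hello");           // one argument: a String object
        String fromArray = new String(letters);      // one argument: a char array
        String fromRange = new String(letters, 5, 3); // start at index 5, take 3 chars
        return new String[] {empty, copy, fromArray, fromRange};
    }

    public static void main(String[] args) {
        for (String s : build()) {
            System.out.println("[" + s + "]");
        }
    }
}
```

The last constructor starts at index 5 ('d') and copies three characters, so it yields "day".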
Review
To compute how many characters the string myString contains, we use:
• myString.size
• myString.size()
• myString.length
• myString.length()
Review
Java uses this to represent characters of diverse languages:
• ASCII
• Unicode
• EBCDIC
• Binary
Accessing Individual Elements
• Individual characters in a String are accessed with the charAt method.
String name = "Sumatra";
Index:     0 1 2 3 4 5 6
Character: S u m a t r a
name.charAt(3)
• name: this variable refers to the whole string.
• charAt(3): the method returns the character at position 3, which is 'a'.
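A typical use of charAt is a loop that visits every position from 0 to length() - 1. The following is a minimal sketch; the class CharAtDemo and the method countVowels are hypothetical names chosen for this example.

```java
// Hypothetical demo class; names are illustrative only.
class CharAtDemo {
    // Counts the vowels in text by inspecting one character at a time.
    static int countVowels(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = Character.toLowerCase(text.charAt(i));
            if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String name = "Sumatra";
        System.out.println(name.charAt(3));    // a
        System.out.println(countVowels(name)); // 3 (u, a, a)
    }
}
```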
Compute Length of a string
• Method: length() returns the number of characters in a string.
• Example:
String strVar;
strVar = new String("Java");
int len = strVar.length();
Substring
• Method: substring extracts a substring from a given string by specifying the beginning and ending positions. The character at the ending position is not included.
• Example:
String strVar, strSubStr;
strVar = new String("Exam after Easter");
strSubStr = strVar.substring(0, 4);
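To make the index arithmetic concrete, here is a short sketch using the same sample string. The class SubstringDemo and the helper lastChars are invented for this example.

```java
// Hypothetical demo class; names are illustrative only.
class SubstringDemo {
    // Returns the last n characters of s using the one-argument substring.
    static String lastChars(String s, int n) {
        return s.substring(s.length() - n);
    }

    public static void main(String[] args) {
        String strVar = "Exam after Easter";
        System.out.println(strVar.substring(0, 4));  // Exam  (indexes 0..3)
        System.out.println(strVar.substring(5, 10)); // after (indexes 5..9)
        System.out.println(lastChars(strVar, 6));    // Easter
    }
}
```

The length of the result is always the ending position minus the beginning position.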
Index position of a substring within another string
• Method: indexOf finds the index position of a substring within another string. It returns -1 if the substring does not occur.
• Example:
String strVar1 = "Google it";
String strVar2 = "Google";
int index;
index = strVar1.indexOf(strVar2);
String concatenation
• Method: concat creates a new string from two strings by concatenating them.
• Example:
String strVar1 = "Google";
String strVar2 = " Search Engine";
String sumStr;
sumStr = strVar1.concat(strVar2);
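The two methods above, indexOf and concat, can be combined with substring. The sketch below assumes a made-up class IndexOfConcatDemo with a hypothetical helper afterFirst; neither appears in the slides.

```java
// Hypothetical demo class; names are illustrative only.
class IndexOfConcatDemo {
    // Returns the text that follows the first occurrence of part,
    // or the empty string when part does not occur in text.
    static String afterFirst(String text, String part) {
        int index = text.indexOf(part);
        if (index < 0) {
            return "";
        }
        return text.substring(index + part.length());
    }

    public static void main(String[] args) {
        String strVar1 = "Google it";
        System.out.println(strVar1.indexOf("Google")); // 0
        System.out.println(strVar1.indexOf("it"));     // 7
        System.out.println(strVar1.indexOf("Bing"));   // -1
        System.out.println("Google".concat(" Search Engine")); // Google Search Engine
        System.out.println("[" + afterFirst(strVar1, "Google") + "]"); // [ it]
    }
}
```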
Review
What method is used to refer to an individual character in a String?
• getBytes
• indexOf
• getChars
• charAt
Review
To compare two strings in Java, we use:
• ==
• equals method
• !=
• <>
String comparison
Methods:
• equals
• equalsIgnoreCase
• compareTo
• compareToIgnoreCase
String comparison: equals and equalsIgnoreCase
Example:
String string1 = "COMPSCI";
String string2 = "compsci";
boolean isEqual;
isEqual = string1.equals(string2);
Common error
Comparing references with == can lead to logic errors, because == compares the references to determine whether they refer to the same object, not whether two objects have the same contents. When two identical (but separate) objects are compared with ==, the result will be false. When comparing objects to determine whether they have the same contents, use method equals.
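The difference between == and equals shows up as soon as two separate String objects hold the same characters. A minimal sketch, with the class name StringEqualityDemo and its helpers chosen only for illustration:

```java
// Hypothetical demo class; names are illustrative only.
class StringEqualityDemo {
    // True when a and b refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    // True when a and b contain the same sequence of characters.
    static boolean sameContents(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        String s1 = new String("Java");
        String s2 = new String("Java"); // identical contents, separate object
        System.out.println(sameObject(s1, s2));   // false
        System.out.println(sameContents(s1, s2)); // true
        System.out.println("COMPSCI".equals("compsci"));           // false
        System.out.println("COMPSCI".equalsIgnoreCase("compsci")); // true
    }
}
```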
String comparison: compareTo
Example:
String string1 = "Adam";
String string2 = "AdamA";
int compareResult;
compareResult = string1.compareTo(string2);
String comparison: string1.compareTo(string2)
• Compares two strings lexicographically.
• Returns 0 if the two strings are equal.
• Returns a negative value if string1 is less than string2.
• Returns a positive value if string1 is greater than string2.
• The comparison is based on the Unicode value of each character in the strings.
String comparison
• The comparison is based on the Unicode value of each character in the strings.
• Let k be the smallest index at which the two strings differ. compareTo returns the difference of the two character values at position k, that is: (character at position k of string1) - (character at position k of string2).
• If there is no such index (one string is a prefix of the other, or the strings are equal), compareTo returns the difference of the lengths: string1.length() - string2.length().
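Both cases of the compareTo rule can be checked with a few calls. This is a sketch; the class CompareToDemo and the helper describe are hypothetical, and the extra sample words are not from the slides.

```java
// Hypothetical demo class; names and extra sample words are illustrative only.
class CompareToDemo {
    // Turns the sign of compareTo into a readable word.
    static String describe(String a, String b) {
        int result = a.compareTo(b);
        if (result < 0) {
            return "less";
        } else if (result > 0) {
            return "greater";
        }
        return "equal";
    }

    public static void main(String[] args) {
        // No differing index: the result is the length difference, 4 - 5.
        System.out.println("Adam".compareTo("AdamA"));    // -1
        // First difference at index 2: 'p' (112) minus 'r' (114).
        System.out.println("apple".compareTo("apricot")); // -2
        System.out.println("Java".compareTo("Java"));     // 0
        System.out.println(describe("Zoe", "Adam"));      // greater
    }
}
```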
regionMatches
• regionMatches(boolean ignoreCase, int toffset, String other, int ooffset, int len)
• A substring of this String object, starting at toffset, is compared to a substring of the argument other, starting at ooffset; both substrings have length len.
• The result is true if these substrings represent character sequences that are the same, ignoring case if and only if ignoreCase is true.
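As an illustration of the five-argument regionMatches, the sketch below compares the end of one string with another string. The class RegionMatchesDemo and the helper endsWithIgnoreCase are invented for this example.

```java
// Hypothetical demo class; names are illustrative only.
class RegionMatchesDemo {
    // True when text ends with suffix, ignoring upper/lower case.
    // If suffix is longer than text, the offset is negative and
    // regionMatches returns false.
    static boolean endsWithIgnoreCase(String text, String suffix) {
        int start = text.length() - suffix.length();
        return text.regionMatches(true, start, suffix, 0, suffix.length());
    }

    public static void main(String[] args) {
        String s = "Exam after Easter";
        // Compare s[11..16] ("Easter") with "EASTER egg"[0..5] ("EASTER").
        System.out.println(s.regionMatches(true, 11, "EASTER egg", 0, 6));  // true
        System.out.println(s.regionMatches(false, 11, "EASTER egg", 0, 6)); // false
        System.out.println(endsWithIgnoreCase(s, "easter"));                // true
    }
}
```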
The String Class is Immutable
• In Java a String object is immutable.
• This means that once a String object is created, it cannot be changed, for example by replacing one character with another or by removing a character.
• The String methods we have used so far do not change the original string. They create a new string from the original. For example, substring creates a new string from a given string.
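A quick way to observe immutability is to call a few String methods and then print the original. In this sketch the class ImmutableDemo and the method shout are hypothetical names.

```java
// Hypothetical demo class; names are illustrative only.
class ImmutableDemo {
    // Returns a new, upper-cased string with "!" appended; s itself is untouched.
    static String shout(String s) {
        return s.toUpperCase().concat("!");
    }

    public static void main(String[] args) {
        String original = "Java";
        String loud = shout(original);
        String part = original.substring(0, 2);
        System.out.println(original); // Java (unchanged)
        System.out.println(loud);     // JAVA!
        System.out.println(part);     // Ja
    }
}
```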
Review
If x.equals(y) is true, then x==y is always true.
• True
• False
The StringBuffer Class
• In many string processing applications, we would like to change the contents of a string. In other words, we want it to be mutable.
• Manipulating the content of a string, such as replacing a character, appending one string to another, or deleting a portion of a string, may be accomplished by using the StringBuffer class.
StringBuffer Example
Changing the string Java to Diva:
StringBuffer word = new StringBuffer("Java");
word.setCharAt(0, 'D');
word.setCharAt(1, 'i');
Before: word refers to a StringBuffer containing Java.
After: word refers to the same StringBuffer, now containing Diva.
Delete a substring from a StringBuffer object
• StringBuffer word = new StringBuffer("CCourse");
  word.delete(0, 1);   // word now contains "Course"
Append a string
• StringBuffer word = new StringBuffer("CS ");
  word.append("Course");   // word now contains "CS Course"
Insert a string
• StringBuffer word = new StringBuffer("MCS Course");
  word.insert(4, "220");   // word now contains "MCS 220Course"; insert "220 " to keep a space before Course
Convert from StringBuffer to String
StringBuffer word = new StringBuffer("Java");
word.setCharAt(0, 'D');
word.setCharAt(1, 'i');
System.out.println(word.toString());   // prints Diva
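The StringBuffer operations from the preceding slides can be chained on a single object. The following sketch assumes a made-up class StringBufferDemo with two hypothetical helpers, buildTitle and replaceVowels.

```java
// Hypothetical demo class; names are illustrative only.
class StringBufferDemo {
    // Applies delete, insert, and append to one buffer, then converts to String.
    static String buildTitle() {
        StringBuffer word = new StringBuffer("CCourse");
        word.delete(0, 1);     // Course
        word.insert(0, "CS "); // CS Course
        word.append(" 220");   // CS Course 220
        return word.toString();
    }

    // Replaces every lowercase vowel in s with the given character, in place.
    static String replaceVowels(String s, char replacement) {
        StringBuffer buf = new StringBuffer(s);
        for (int i = 0; i < buf.length(); i++) {
            if ("aeiou".indexOf(buf.charAt(i)) >= 0) {
                buf.setCharAt(i, replacement);
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildTitle());               // CS Course 220
        System.out.println(replaceVowels("Java", '*')); // J*v*
    }
}
```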
Review
Both the String and StringBuffer classes include the charAt and setCharAt methods.
• True
• False
Review
What will be the value of str after the following statements are executed?
String str;
StringBuffer strBuf;
str = "Decaffeinated";
strBuf = new StringBuffer(str.substring(2, 7));
strBuf.setCharAt(1, 'o');
strBuf.append('e');
str = strBuf.toString();
StringBuffer methods
• Method length
  • Returns the StringBuffer length
• Method capacity
  • Returns the StringBuffer capacity
• Method setLength
  • Increases or decreases the StringBuffer length
• Method ensureCapacity
  • Sets the StringBuffer capacity
  • Guarantees that the StringBuffer has a minimum capacity
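Length counts the characters currently stored, while capacity is the space reserved for them. A minimal sketch of the four methods follows; the class BufferCapacityDemo and the helper truncate are hypothetical.

```java
// Hypothetical demo class; names are illustrative only.
class BufferCapacityDemo {
    // Shortens s to at most n characters by lowering the buffer length.
    static String truncate(String s, int n) {
        StringBuffer buf = new StringBuffer(s);
        if (n < buf.length()) {
            buf.setLength(n); // characters past position n - 1 are discarded
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        StringBuffer buf = new StringBuffer("Java");
        System.out.println(buf.length());   // 4 characters stored
        System.out.println(buf.capacity()); // reserved space, at least the length
        buf.ensureCapacity(50);             // request room for at least 50 characters
        System.out.println(buf.capacity() >= 50); // true
        System.out.println(truncate("Java", 2));  // Ja
    }
}
```

Calling ensureCapacity never changes the contents or the length; it only affects how much can be appended before the buffer has to grow again.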
Class StringTokenizer
• Tokenizer
  • Partitions a String into individual substrings (tokens)
  • Uses a delimiter
    • Typically whitespace characters (space, tab, newline, etc.)
• Java offers java.util.StringTokenizer
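A tokenizer is usually driven by a hasMoreTokens / nextToken loop. The sketch below assumes a hypothetical class TokenizerDemo; the sample sentences are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; names and sample text are illustrative only.
class TokenizerDemo {
    // Splits text on the default delimiters (whitespace characters).
    static List<String> tokens(String text) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    // Splits text on any of the characters in delimiters.
    static List<String> tokens(String text, String delimiters) {
        List<String> result = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(text, delimiters);
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("I like  Java\tprogramming")); // [I, like, Java, programming]
        System.out.println(tokens("a,b;c", ",;"));               // [a, b, c]
    }
}
```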
Pattern Example
• Suppose students are assigned a three-digit code:
  • The first digit represents the major (5 indicates computer science);
  • The second digit represents either in-state (1), out-of-state (2), or international (3);
  • The third digit indicates campus housing:
    • On-campus dorms are numbered 1-7.
    • Students living off-campus are represented by the digit 8.
The 3-digit pattern to represent computer science majors living on-campus is
5[123][1-7]
• 5: the first character is 5
• [123]: the second character is 1, 2, or 3
• [1-7]: the third character is any digit between 1 and 7
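The pattern 5[123][1-7] can be tested with the matches method of String, which succeeds only when the whole string fits the pattern. A minimal sketch, with the class StudentCodeDemo and the method isOnCampusCsMajor invented for this example:

```java
// Hypothetical demo class; names are illustrative only.
class StudentCodeDemo {
    // True when code is a computer science major (5) living on campus (1-7).
    static boolean isOnCampusCsMajor(String code) {
        return code.matches("5[123][1-7]");
    }

    public static void main(String[] args) {
        System.out.println(isOnCampusCsMajor("512"));  // true
        System.out.println(isOnCampusCsMajor("538"));  // false: 8 means off-campus
        System.out.println(isOnCampusCsMajor("412"));  // false: major is not 5
        System.out.println(isOnCampusCsMajor("5123")); // false: four characters
    }
}
```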
Regular Expressions, Class Pattern and Class Matcher
• Regular expression
  • Sequence of characters and symbols
  • Useful for validating input and ensuring data format
  • Facilitates the construction of a compiler
• Regular-expression operations in String
  • Method matches
    • Matches the contents of a String to a regular expression
    • Returns a boolean indicating whether the match succeeded
Regular Expressions, Class Pattern and Class Matcher
• Predefined character classes
  • Escape sequences that each represent a group of characters
  • Digit (\d)
    • Any numeric character
  • Word character (\w)
    • Any letter, digit, or underscore
  • Whitespace character (\s)
    • Space, tab, carriage return, newline, form feed
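The predefined character classes are written \d, \w, and \s in a regular expression; inside a Java string literal each backslash must be doubled. The sketch below uses them with String methods and with the Pattern and Matcher classes. The class CharacterClassDemo, the method digitRuns, and the sample text are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample text are illustrative only.
class CharacterClassDemo {
    // Collects every run of one or more digits found in text.
    static List<String> digitRuns(String text) {
        List<String> runs = new ArrayList<>();
        Pattern pattern = Pattern.compile("\\d+");
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            runs.add(matcher.group());
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println("Java_8".matches("\\w+"));        // true: letters, underscore, digit
        System.out.println("Java 8".matches("\\w+"));        // false: the space is not a word character
        System.out.println("a  b\tc".split("\\s+").length);  // 3
        System.out.println(digitRuns("Room 220, floor 3"));  // [220, 3]
    }
}
```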