Advanced String handling
Things we might want to do
• Finding patterns using regular expressions
• Manipulating Strings
• Splitting
• Processing Tokens
• Basic methods to review
  • substring(), charAt(), indexOf() (first occurrence), lastIndexOf(), toLowerCase(), startsWith(), endsWith(), trim(), length()
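As a quick review of the basic methods listed above, here is a minimal sketch. The class name BasicStringMethods, the helper extension(), and the sample file name are made up for illustration; only the String methods themselves come from the slide.

```java
class BasicStringMethods {
    // Hypothetical helper: returns the lower-case file extension, or "" if none.
    static String extension(String fileName) {
        String trimmed = fileName.trim();          // drop leading/trailing white space
        int dot = trimmed.lastIndexOf('.');        // index of the last '.', or -1
        if (dot < 0 || dot == trimmed.length() - 1) {
            return "";
        }
        return trimmed.substring(dot + 1).toLowerCase();
    }

    public static void main(String[] args) {
        String s = "  Report.Final.TXT ";
        String t = s.trim();                                      // "Report.Final.TXT"
        System.out.println(t.length());                           // 16
        System.out.println(t.charAt(0));                          // R
        System.out.println(t.indexOf('.'));                       // 6  (first occurrence)
        System.out.println(t.lastIndexOf('.'));                   // 12 (last occurrence)
        System.out.println(t.substring(7, 12));                   // Final (end index is exclusive)
        System.out.println(t.startsWith("Report"));               // true
        System.out.println(t.endsWith(".txt"));                   // false (comparison is case-sensitive)
        System.out.println(t.toLowerCase().endsWith(".txt"));     // true
        System.out.println(extension(s));                         // txt
    }
}
```

Because a String is immutable, every one of these calls returns a new value and leaves the original string unchanged.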
Pattern matching
Regular expressions are a syntax for pattern matching used by many programming languages
• Examples of regular expression syntax:
  • [aceF] matches any one of the letters enclosed in [ ]
  • * matches zero or more occurrences of the preceding pattern
  • + matches one or more occurrences of the preceding pattern
  • \s matches a whitespace character
• String methods that use regular expressions include matches(), split(), replaceAll()
• More concrete examples are on the following slides
Note: This page is a brief overview; regular expression syntax has much more in it
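The three String methods named above can be tried with the syntax from this slide. This is a minimal sketch; the class name RegexBasics and its helper methods are illustrative assumptions, not part of the original slides.

```java
import java.util.Arrays;

class RegexBasics {
    // True when the whole string consists of one or more of the letters a, c, e, F.
    static boolean isAceF(String s) {
        return s.matches("[aceF]+");   // matches() must match the entire string
    }

    // Replaces every run of white space with a single space.
    static String collapseWhitespace(String s) {
        return s.trim().replaceAll("\\s+", " ");
    }

    // Splits on runs of white space.
    static String[] words(String s) {
        return s.trim().split("\\s+");
    }

    public static void main(String[] args) {
        System.out.println(isAceF("Face"));                              // true
        System.out.println(isAceF("face"));                              // false: lower-case f is not in [aceF]
        System.out.println("".matches("[aceF]*"));                       // true: * allows zero occurrences
        System.out.println("".matches("[aceF]+"));                       // false: + needs at least one
        System.out.println(collapseWhitespace("  a \t b\n c "));         // a b c
        System.out.println(Arrays.toString(words("one  two\tthree")));   // [one, two, three]
    }
}
```

In Java source, the backslash in \s has to be doubled ("\\s") because the string literal is processed before the regular expression engine sees it.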
Manipulating Strings
Problem: A String is an immutable object
• Bad solution (1453 milliseconds on my computer):
  • Repeatedly create a new string from an old one
      String str = "";
      for (int i=0; i<10000; i++) str += "abcdef";
• Better solution (0 milliseconds on my computer):
  • Use StringBuilder, a mutable string class
      // 10000 is only the initial capacity; the builder grows as needed
      StringBuilder build = new StringBuilder(10000);
      for (int i=0; i<10000; i++) { build.append("abcdef"); }
      String str = build.toString();
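To reproduce the comparison on your own machine, a rough timing sketch follows. The class name ConcatTiming and the two helper methods are assumptions made for illustration, and the times will differ from the figures quoted on the slide depending on hardware and JVM version.

```java
class ConcatTiming {
    // Builds the result by creating a new String on every iteration.
    static String withConcat(int n) {
        String str = "";
        for (int i = 0; i < n; i++) {
            str += "abcdef";
        }
        return str;
    }

    // Builds the same result in a single mutable buffer.
    static String withBuilder(int n) {
        StringBuilder build = new StringBuilder(n * 6);   // room for n copies of 6 characters
        for (int i = 0; i < n; i++) {
            build.append("abcdef");
        }
        return build.toString();
    }

    public static void main(String[] args) {
        int n = 10000;

        long start = System.nanoTime();
        String a = withConcat(n);
        long concatMillis = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        String b = withBuilder(n);
        long builderMillis = (System.nanoTime() - start) / 1_000_000;

        System.out.println("concatenation: " + concatMillis + " ms");
        System.out.println("StringBuilder: " + builderMillis + " ms");
        System.out.println("same result:   " + a.equals(b));
    }
}
```

A single run like this is only a rough indication, since JIT warm-up and garbage collection affect the numbers.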
Splitting tokens
Definition: A token is a group of characters treated as a unit
Extracting information from a string
• Example:
    String date = "January 23, 1923 10:32:15 pm";
    String[] data = date.split("[, :]+");
    for (int i=0; i<data.length; i++) System.out.println(data[i]);
• The argument to split:
  • is a regular expression that determines how the date is converted to an array
  • Characters enclosed between [ and ] are delimiters
    • Space, comma and colon
  • Split when one or more (+) delimiters are found in a row
Output
    January
    23
    1923
    10
    32
    15
    pm
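Once the date is split, the tokens can be converted into numbers. The sketch below assumes the tokens always arrive in the order month, day, year, hour, minute, second, am/pm, as in the example above. The class name DateFields and the method timeFields() are hypothetical.

```java
import java.util.Arrays;

class DateFields {
    // Returns {hour (24-hour clock), minute, second} from a date in the slide's format.
    // Assumes token positions: 0 month, 1 day, 2 year, 3 hour, 4 minute, 5 second, 6 am/pm.
    static int[] timeFields(String date) {
        String[] data = date.trim().split("[, :]+");
        int hour = Integer.parseInt(data[3]);
        int minute = Integer.parseInt(data[4]);
        int second = Integer.parseInt(data[5]);
        boolean pm = data[6].equalsIgnoreCase("pm");
        int hour24 = hour % 12 + (pm ? 12 : 0);   // 12 am becomes 0, 12 pm stays 12
        return new int[] { hour24, minute, second };
    }

    public static void main(String[] args) {
        String date = "January 23, 1923 10:32:15 pm";
        System.out.println(Arrays.toString(timeFields(date)));   // [22, 32, 15]
    }
}
```

The call to trim() matters: if the string starts with a delimiter, split() returns an empty string as its first element, which would shift every position by one.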
Another way to split tokens
Definition: white space includes space, tab, and new line characters
StringTokenizer class
    // Requires: import java.util.StringTokenizer;
    String expression = "((3.5 + 52)/234 + 75.2*83.9 - 9.0";
    // The next line removes white space
    expression = expression.replaceAll("\\s+","");
    // The third argument (true) returns the delimiters as tokens too
    StringTokenizer tokenizer = new StringTokenizer(expression, "()+-/*", true);
    while (tokenizer.hasMoreTokens()) {
        System.out.println(tokenizer.nextToken());
    }
Output
    (
    (
    3.5
    +
    52
    )
    /
    234
    +
    75.2
    *
    83.9
    -
    9.0
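For the "Processing Tokens" item from the first slide, one possible next step is to collect the tokens in a list and tell numbers apart from operators. This is a minimal sketch; the class name TokenProcessing and the helpers tokens(), isOperator() and countNumbers() are illustrative assumptions, not part of the original slides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenProcessing {
    private static final String DELIMITERS = "()+-/*";

    // Splits an arithmetic expression into numbers, operators and parentheses.
    static List<String> tokens(String expression) {
        String compact = expression.replaceAll("\\s+", "");   // remove white space
        StringTokenizer tokenizer = new StringTokenizer(compact, DELIMITERS, true);
        List<String> result = new ArrayList<>();
        while (tokenizer.hasMoreTokens()) {
            result.add(tokenizer.nextToken());
        }
        return result;
    }

    // A token is an operator or parenthesis when it is a single delimiter character.
    static boolean isOperator(String token) {
        return token.length() == 1 && DELIMITERS.indexOf(token.charAt(0)) >= 0;
    }

    // Counts the tokens that are not delimiters (here: the numbers).
    static int countNumbers(List<String> tokens) {
        int count = 0;
        for (String token : tokens) {
            if (!isOperator(token)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> list = tokens("((3.5 + 52)/234 + 75.2*83.9 - 9.0");
        System.out.println(list.size());          // 14
        System.out.println(countNumbers(list));   // 6
        for (String token : list) {
            System.out.println((isOperator(token) ? "operator: " : "number:   ") + token);
        }
    }
}
```

Note that this approach treats every - as an operator, so a negative literal such as -9.0 would come back as two tokens.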