Validation Controls Dr. Awad Khalil Computer Science & Engineering Department AUC. Regular Expressions and Class Regex.
Regular Expressions and Class Regex • Regular expressions are specially formatted strings used to find patterns in text. They can be useful during information validation, to ensure that data is in a particular format. For example, a ZIP code must consist of five digits, and a last name must start with a capital letter. • Compilers use regular expressions to validate the syntax of programs. If the program code does not match the regular expression, the compiler indicates that there is a syntax error. • The .NET Framework provides several classes to help developers recognize and manipulate regular expressions.
Regular Expressions and Class Regex • Class Regex (of the System.Text.RegularExpressions namespace) represents an immutable regular expression. Regex method Match returns an object of class Match that represents a single regular expression match. • Regex also provides method Matches, which finds all matches of a regular expression in an arbitrary string and returns a MatchCollection object containing all the Match objects. A collection is a data structure, similar to an array, that can be used with a foreach statement to iterate through the collection’s elements.
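The difference between Match and Matches can be sketched as follows. This is only an illustration: the class name MatchVersusMatches, the helper CountMatches and the sample text are assumptions and do not come from the original slides.

```csharp
// MatchVersusMatches.cs
// Sketch: Match returns the first match, Matches returns all of them.
using System;
using System.Text.RegularExpressions;

class MatchVersusMatches
{
   // hypothetical helper: number of matches of pattern in input
   public static int CountMatches( string input, string pattern )
   {
      return new Regex( pattern ).Matches( input ).Count;
   }

   // hypothetical helper: text of the first match, or "" if there is none
   public static string FirstMatch( string input, string pattern )
   {
      Match first = new Regex( pattern ).Match( input );
      return first.Success ? first.Value : "";
   }

   public static void Main()
   {
      string text = "Rooms 12, 305 and 7";

      // Match stops at the first run of digits
      Console.WriteLine( "First match: " + FirstMatch( text, @"\d+" ) );

      // Matches returns a MatchCollection that foreach can iterate over
      foreach ( Match m in new Regex( @"\d+" ).Matches( text ) )
         Console.WriteLine( "Match: " + m.Value );
   } // end method Main
} // end class MatchVersusMatches
```

Because a Regex object is immutable, the same instance can be reused for any number of Match or Matches calls.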
Regular Expressions Character Classes • The following table specifies some character classes that can be used with regular expressions. • Please do not confuse a character class with a C# class declaration. A character class is simply an escape sequence that represents a group of characters that might appear in a string. • A word character is any alphanumeric character or underscore. • A whitespace character is a space, a tab, a carriage return, a newline or a form feed. A digit is any numeric character. • Regular expressions are not limited to character classes; they can use other notations to search for complex patterns in strings.
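As a rough illustration of the digit, word and whitespace character classes described above, the following sketch counts how many characters of a sample string fall into each class. The class name CharacterClassDemo, the helper Count and the sample string are assumptions made for this example.

```csharp
// CharacterClassDemo.cs
// Sketch: counting characters that belong to \d, \w and \s.
using System;
using System.Text.RegularExpressions;

class CharacterClassDemo
{
   // hypothetical helper: number of matches of pattern in input
   public static int Count( string input, string pattern )
   {
      return Regex.Matches( input, pattern ).Count;
   }

   public static void Main()
   {
      // sample contains letters, digits, a comma, a space, a tab and an underscore
      string sample = "Room 42,\tfloor_3";

      Console.WriteLine( "Digits (\\d): " + Count( sample, @"\d" ) );
      Console.WriteLine( "Word characters (\\w): " + Count( sample, @"\w" ) );
      Console.WriteLine( "Whitespace (\\s): " + Count( sample, @"\s" ) );
   } // end method Main
} // end class CharacterClassDemo
```

Note that the underscore counts as a word character, while the comma belongs to none of the three classes.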
Regular Expression Example • The following program tries to match birthdays to a regular expression. • Just for demonstration purposes, the expression matches only birthdays that do not occur in April and that belong to people whose names begin with “J”. • // RegexMatches.cs • // Demonstrating Class Regex. • using System; • using System.Text.RegularExpressions; • class RegexMatches • { • public static void Main() • { • // create regular expression • Regex expression = • new Regex( @"J.*\d[0-35-9]-\d\d-\d\d" ); • string string1 = "Jane's Birthday is 05-12-75\n" + • "Dave's Birthday is 11-04-68\n" + • "John's Birthday is 04-28-73\n" + • "Joe's Birthday is 12-17-77"; • // match regular expression to string and • // print out all matches • foreach ( Match myMatch in expression.Matches( string1 ) ) • Console.WriteLine( myMatch ); • } // end method Main • } // end class RegexMatches
Examining RegexMatches.cs • Lines 11-12 create a Regex object and pass a regular expression pattern string to the Regex constructor. Note that we precede the string with @. Recall that backslashes within the double quotation marks following the @ character are regular backslash characters, not the beginning of escape sequences. • To define the regular expression without prefixing @ to the string, you would need to escape every backslash character, as in “J.*\\d[0-35-9]-\\d\\d-\\d\\d” which makes the regular expression more difficult to read.
Examining RegexMatches.cs • The first character in the regular expression, “J”, is a literal character. Any string matching this regular expression is required to start with “J”. • In a regular expression, the dot character “.” matches any single character except a newline character. When the dot character is followed by an asterisk, as in “.*” , the regular expression matches any number of unspecified characters except newlines. • In general, when the operator “*” is applied to a pattern, the pattern will match zero or more occurrences.
Examining RegexMatches.cs • By contrast, applying the operator “+” to a pattern causes the pattern to match one or more occurrences. For example, both “A*” and “A+” will match “A”, but only “A*” will match an empty string. • “\d” matches any numeric digit. To specify sets of characters other than those that belong to a predefined character class, characters can be listed in square brackets, []. For example, the pattern “[aeiou]” matches any vowel. • Ranges of characters are represented by placing a dash (-) between two characters. In the example, “[0-35-9]” matches only digits in the ranges specified by the pattern, i.e., any digit between 0 and 3 or between 5 and 9; therefore, it matches any digit except 4. • You can also specify that a pattern should match anything other than the characters in the brackets. To do so, place ^ as the first character in the brackets. It is important to note that “[^4]” is not the same as “[0-35-9]”; “[^4]” matches any non-digit as well as digits other than 4.
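The distinction between “[^4]” and “[0-35-9]” can be checked with a short sketch. The class name NegatedClassDemo and the helper MatchesSingleChar are assumptions for illustration; the patterns are anchored with ^ and $ here so that each test string must consist of exactly one character from the set.

```csharp
// NegatedClassDemo.cs
// Sketch: [^4] accepts any character except 4, [0-35-9] accepts only digits except 4.
using System;
using System.Text.RegularExpressions;

class NegatedClassDemo
{
   // hypothetical helper: true if input is exactly one character from the set
   public static bool MatchesSingleChar( string input, string characterSet )
   {
      return Regex.IsMatch( input, "^" + characterSet + "$" );
   }

   public static void Main()
   {
      string[] inputs = { "7", "4", "x" };

      foreach ( string input in inputs )
         Console.WriteLine( "\"" + input + "\"  [^4]: " +
            MatchesSingleChar( input, "[^4]" ) + "  [0-35-9]: " +
            MatchesSingleChar( input, "[0-35-9]" ) );
   } // end method Main
} // end class NegatedClassDemo
```

Both sets accept “7” and reject “4”, but only “[^4]” accepts the letter “x”.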
Examining RegexMatches.cs • Although the “-” character indicates a range when it is enclosed in square brackets, instances of the “-” character outside square brackets are treated as literal characters. Thus, the regular expression in line 12 searches for a string that starts with the letter “J”, followed by any number of characters (except newline), followed by a two-digit number (of which the second digit cannot be 4), followed by a dash, another two-digit number, a dash and another two-digit number. • Lines 21-22 use a foreach statement to iterate through the MatchCollection returned by the expression object’s Matches method, which received string1 as an argument. The elements in the MatchCollection are Match objects, so the foreach statement declares variable myMatch to be of type Match. For each Match, line 22 outputs the text that matched the regular expression.
Quantifiers • The asterisk (*) in line 12 is more formally called a quantifier. The following table lists various quantifiers that you can place after a pattern in a regular expression and the purpose of each quantifier. • All quantifiers are greedy – they will match as many occurrences of the pattern as possible until the pattern fails to make a match. • If a quantifier is followed by a question mark (?), the quantifier becomes lazy and will match as few occurrences as possible as long as there is a successful match.
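A minimal sketch of the difference between a greedy and a lazy quantifier follows. The class name GreedyVersusLazy, the helper FirstMatch and the sample markup are assumptions chosen for illustration.

```csharp
// GreedyVersusLazy.cs
// Sketch: the same quantifier with and without a trailing question mark.
using System;
using System.Text.RegularExpressions;

class GreedyVersusLazy
{
   // hypothetical helper: text of the first match of pattern in input
   public static string FirstMatch( string input, string pattern )
   {
      return Regex.Match( input, pattern ).Value;
   }

   public static void Main()
   {
      string text = "<b>bold</b>";

      // greedy: .+ consumes as much as possible, up to the last '>'
      Console.WriteLine( "Greedy: " + FirstMatch( text, "<.+>" ) );

      // lazy: .+? consumes as little as possible, up to the first '>'
      Console.WriteLine( "Lazy:   " + FirstMatch( text, "<.+?>" ) );
   } // end method Main
} // end class GreedyVersusLazy
```

The greedy pattern returns the whole string, whereas the lazy pattern stops at the first closing angle bracket and returns only the opening tag.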
Validating User Input Using Regular Expressions • The following Windows application uses regular expressions to validate name, address and telephone number information input by a user. • // Validate.cs • // Validate user information using regular expressions. • using System; • using System.Text.RegularExpressions; • using System.Windows.Forms; • public partial class ValidateForm : Form • { • // default constructor • public ValidateForm() • { • InitializeComponent(); • } // end constructor
// handles OkButton Click event • private void okButton_Click( object sender, EventArgs e ) • { • // ensures no TextBoxes are empty • if ( lastNameTextBox.Text == "" || firstNameTextBox.Text == "" || • addressTextBox.Text == "" || cityTextBox.Text == "" || • stateTextBox.Text == "" || zipCodeTextBox.Text == "" || • phoneTextBox.Text == "" ) • { • // display popup box • MessageBox.Show( "Please fill in all fields", "Error", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • lastNameTextBox.Focus(); // set focus to lastNameTextBox • return; • } // end if
// if last name format invalid show message • if ( !Regex.Match( lastNameTextBox.Text, • "^[A-Z][a-zA-Z]*$" ).Success ) • { • // last name was incorrect • MessageBox.Show( "Invalid last name", "Message", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • lastNameTextBox.Focus(); • return; • } // end if • // if first name format invalid show message • if ( !Regex.Match( firstNameTextBox.Text, • "^[A-Z][a-zA-Z]*$" ).Success ) • { • // first name was incorrect • MessageBox.Show( "Invalid first name", "Message", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • firstNameTextBox.Focus(); • return; • } // end if
// if address format invalid show message • if ( !Regex.Match( addressTextBox.Text, • @"^[0-9]+\s+([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$" ).Success ) • { • // address was incorrect • MessageBox.Show( "Invalid address", "Message", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • addressTextBox.Focus(); • return; • } // end if • // if city format invalid show message • if ( !Regex.Match( cityTextBox.Text, • @"^([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$" ).Success ) • { • // city was incorrect • MessageBox.Show( "Invalid city", "Message", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • cityTextBox.Focus(); • return; • } // end if
// if state format invalid show message • if ( !Regex.Match( stateTextBox.Text, • @"^([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$" ).Success ) • { • // state was incorrect • MessageBox.Show( "Invalid state", "Message", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • stateTextBox.Focus(); • return; • } // end if • // if zip code format invalid show message • if ( !Regex.Match( zipCodeTextBox.Text, @"^\d{5}$" ).Success ) • { • // zip was incorrect • MessageBox.Show( "Invalid zip code", "Message", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • zipCodeTextBox.Focus(); • return; • } // end if
// if phone number format invalid show message • if ( !Regex.Match( phoneTextBox.Text, • @"^[1-9]\d{2}-[1-9]\d{2}-\d{4}$" ).Success ) • { • // phone number was incorrect • MessageBox.Show( "Invalid phone number", "Message", • MessageBoxButtons.OK, MessageBoxIcon.Error ); • phoneTextBox.Focus(); • return; • } // end if • // information is valid, signal user and exit application • this.Hide(); // hide main window while MessageBox displays • MessageBox.Show( "Thank You!", "Information Correct", • MessageBoxButtons.OK, MessageBoxIcon.Information ); • Application.Exit(); • } // end method okButton_Click • } // end class ValidateForm
Examining Validate.cs • When a user clicks the OK button, the program checks to make sure that none of the fields is empty (lines 19-22). • If one or more fields are empty, the program displays a message to the user (lines 25-26) that all fields must be filled in before the program can validate the input information. • Line 27 calls lastNameTextBox’s Focus method to place the cursor in the lastNameTextBox. The program then exits the event handler (line 28). If there are no empty fields, lines 32-105 validate the user input. • Lines 32-40 validate the last name by calling static method Match of class Regex, passing both the string to validate and the regular expression as arguments. • Method Match returns a Match object. This object contains a Success property that indicates whether method Match’s first argument matched the pattern specified by the regular expression in the second argument.
Examining Validate.cs • If the value of Success is false (i.e., there was no match), lines 36-37 display an error message, line 38 sets the focus back to the lastNameTextBox so that the user can retype the input and line 39 terminates the event handler. • If there is a match, the event handler proceeds to validate the first name. This process continues until the event handler validates the user input in all the TextBoxes or until a validation fails. • If all of the fields contain valid information, the program displays a message dialog stating this, and the program exits when the user dismisses the dialog.
Examining Validate.cs • In the previous example, we searched a string for substrings that matched a regular expression. In this example, we want to ensure that the entire string in each TextBox conforms to a particular regular expression. For example, we want to accept “Khalil” as a last name, but not “9@Khalil#”. In a regular expression that begins with a “^” character and ends with a “$” character, the characters “^” and “$” represent the beginning and end of a string, respectively. Those characters force a regular expression to return a match only if the entire string being processed matches the regular expression. • The regular expression in line 33 uses the square bracket and range notation to match an uppercase first letter, followed by letters of any case: a-z matches any lowercase letter, and A-Z matches any uppercase letter. The * quantifier signifies that the second range of characters may occur zero or more times in the string. Thus, this expression matches any string consisting of one uppercase letter, followed by zero or more additional letters.
Examining Validate.cs • The notation \s matches a single whitespace character (lines 55, 66 and 77). The expression \d{5}, used for the ZIP code field, matches any five digits (line 87). • Note that without the “^” and “$” characters, the regular expression would match any five consecutive digits in the string. By including the “^” and “$” characters, we ensure that only five-digit zip codes are allowed. • The character “|” (lines 55, 66 and 77) matches the expression to its left or the expression to its right. For example, Hi (John|Jane) matches both Hi John and Hi Jane. In line 55, we use the character “|” to indicate that the address can contain a word of one or more characters or a word of one or more characters followed by a space and another word of one or more characters. • Note the use of parentheses to group parts of the regular expression. • Quantifiers may be applied to patterns enclosed in parentheses to create more complex regular expressions.
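The effect of the “^” and “$” anchors and of the “|” operator can be sketched as follows. The class name AnchorDemo, the helper IsMatch and the test strings are assumptions for illustration; the patterns are the ones discussed above.

```csharp
// AnchorDemo.cs
// Sketch: matching a substring versus matching the entire string.
using System;
using System.Text.RegularExpressions;

class AnchorDemo
{
   // hypothetical helper wrapping the static Regex.Match call used in Validate.cs
   public static bool IsMatch( string input, string pattern )
   {
      return Regex.Match( input, pattern ).Success;
   }

   public static void Main()
   {
      // without anchors, five consecutive digits anywhere are enough
      Console.WriteLine( IsMatch( "ZIP 123456", @"\d{5}" ) );

      // with anchors, the whole string must be exactly five digits
      Console.WriteLine( IsMatch( "ZIP 123456", @"^\d{5}$" ) );
      Console.WriteLine( IsMatch( "12345", @"^\d{5}$" ) );

      // alternation inside parentheses: either name is accepted
      Console.WriteLine( IsMatch( "Hi Jane", "^Hi (John|Jane)$" ) );
      Console.WriteLine( IsMatch( "Hi Joan", "^Hi (John|Jane)$" ) );
   } // end method Main
} // end class AnchorDemo
```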
Examining Validate.cs • The Lastname and Firstname fields both accept strings of any length that begin with an uppercase letter. The regular expression for the Address field (line 55) matches a number of at least one digit, followed by one or more whitespace characters and then either one or more letters or else one or more letters followed by a space and another series of one or more letters. Therefore, “10 Elhegaz” and “10 Elhegaz Street” are both valid addresses. • As currently formed, the regular expression in line 55 does not match an address that does not start with a number or that has more than two words. • The regular expressions for the City (line 66) and State (line 77) fields match any word of at least one character or, alternatively, any two words of at least one character if the words are separated by a single space. This means both Cairo and New Cairo would match. Again, these regular expressions would not accept names that have more than two words.
Examining Validate.cs • The regular expression for the Zipcode field (line 87) ensures that the zip code is a five-digit number. • The regular expression for the Phone field (line 98) indicates that the phone number must be of the form xxx-yyy-yyyy, where the xs represent the area code and the ys the number. The first x and the first y cannot be zero, as specified by the range [1-9] in each case.
Regex methods Replace and Split • Sometimes it is useful to replace parts of one string with another or to split a string according to a regular expression. For this purpose, Regex class provides static and instance versions of methods Replace and Split, which are demonstrated in the following example. • // RegexSubstitution.cs • // Using Regex method Replace. • using System; • using System.Text.RegularExpressions; • class RegexSubstitution • { • public static void Main() • { • string testString1 = • "This sentence ends in 5 stars *****"; • string output = ""; • string testString2 = "1, 2, 3, 4, 5, 6, 7, 8"; • Regex testRegex1 = new Regex( @"\d" ); • string[] result;
Console.WriteLine( "Original string: " + • testString1 ); • testString1 = Regex.Replace( testString1, @"\*", "^" ); • Console.WriteLine( "^ substituted for *: " + testString1 ); • testString1 = Regex.Replace( testString1, "stars", • "carets" ); • Console.WriteLine( "\"carets\" substituted for \"stars\": " + • testString1 ); • Console.WriteLine( "Every word replaced by \"word\": " + • Regex.Replace( testString1, @"\w+", "word" ) ); • Console.WriteLine( "\nOriginal string: " + testString2 ); • Console.WriteLine( "Replace first 3 digits by \"digit\": " + • testRegex1.Replace( testString2, "digit", 3 ) ); • Console.Write( "string split at commas [" );
result = Regex.Split( testString2, @",\s" ); • foreach ( string resultString in result ) • output += "\"" + resultString + "\", "; • // Delete ", " at the end of output string • Console.WriteLine( output.Substring( 0, output.Length - 2 ) + "]" ); • } // end method Main • } // end class RegexSubstitution
Examining RegexSubstitution.cs • Method Replace replaces text in a string with new text wherever the original string matches a regular expression. We use two versions of this method in our example. The first version (line 19) is static and takes three parameters: the string to modify, the string containing the regular expression to match and the replacement string. Here, Replace replaces every instance of “*” in testString1 with “^”. • Note that the regular expression (@"\*") precedes character * with a backslash, \. Normally, * is a quantifier indicating that a regular expression should match any number of occurrences of a preceding pattern. • However, in line 19, we want to find all occurrences of the literal character *; to do this, we must escape character * with character \. By escaping a special regular expression character with a \, we tell the regular-expression matching engine to find the actual character * rather than use it as a quantifier.
Examining RegexSubstitution.cs • The second version of method Replace (line 29) is an instance method that uses the regular expression passed to the constructor for testRegex1 (line 14) to perform the replacement operation. Line 14 instantiates testRegex1 with argument @"\d". • The call to instance method Replace in line 29 takes three arguments: a string to modify, a string containing the replacement text and an int specifying the number of replacements to make. In this case, line 29 replaces the first three instances of a digit (“\d”) in testString2 with the text “digit”. • Method Split divides a string into several substrings. The original string is broken at delimiters that match a specified regular expression. • Method Split returns an array containing the substrings. In line 32, we use the static version of method Split to separate a string of comma-separated integers.
Examining RegexSubstitution.cs • The first argument is the string to split; the second argument is the regular expression that represents the delimiter. • The regular expression @",\s" separates the substrings at each comma that is followed by a whitespace character. • By including the whitespace character (\s) in the delimiter, we remove the space after each comma from the resulting substrings. Writing \s* instead would also handle commas followed by no whitespace or by several whitespace characters.
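The instance version of Replace with a replacement count and the \s* variant of the Split delimiter can be sketched as follows. The class name ReplaceAndSplitDemo, the helper methods and the input strings are assumptions for illustration and are not part of RegexSubstitution.cs.

```csharp
// ReplaceAndSplitDemo.cs
// Sketch: limited replacement and splitting on a comma plus optional whitespace.
using System;
using System.Text.RegularExpressions;

class ReplaceAndSplitDemo
{
   // hypothetical helper: replace only the first count digits with "digit"
   public static string ReplaceFirstDigits( string input, int count )
   {
      Regex digit = new Regex( @"\d" );
      return digit.Replace( input, "digit", count );
   }

   // hypothetical helper: split at each comma and any whitespace after it
   public static string[] SplitAtCommas( string input )
   {
      return Regex.Split( input, @",\s*" );
   }

   public static void Main()
   {
      Console.WriteLine( ReplaceFirstDigits( "1, 2, 3, 4", 3 ) );

      // irregular spacing after the commas is absorbed by \s*
      foreach ( string part in SplitAtCommas( "1,2,  3, 4" ) )
         Console.WriteLine( "\"" + part + "\"" );
   } // end method Main
} // end class ReplaceAndSplitDemo
```

With the delimiter @",\s*" every substring comes back without leading spaces, even when the spacing after the commas is irregular.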
Validation Controls (Validators) in ASP.NET • A validation control (or validator) determines whether the data in another Web control is in the proper format. For example, validation could determine whether a user has provided information in a required field or whether a ZIP-code field contains exactly five digits. • Validators provide a mechanism for validating user input on the client. When the XHTML for our page is created, the validator is converted into ECMAScript that performs the validation. • ECMAScript (commonly known as JavaScript) is a scripting language that enhances the functionality and appearance of Web pages. ECMAScript is typically executed on the client. • Some clients do not support scripting or disable scripting. However, for security reasons, validation is always performed on the server, whether or not the script executes on the client.
Validating Input on a Web Form • The following example prompts the user to enter a name, e-mail address and phone number. A Web site could use a form like this to collect contact information from site visitors. • After the user enters any data, but before the data is sent to the Web server, validators ensure that the user entered a value in each field and that the e-mail address and phone number values are in an acceptable format. • In this example, (002) 123-4567, 002-123-4567, 123-4567 and (012) 123-4567 are all considered valid phone numbers. • Once the data is submitted, the Web server responds by displaying an appropriate message and an XHTML table repeating the submitted information. • Note that a real business application would typically store the submitted data in a database or in a file on the server. We simply send the data back to the form to demonstrate that the server received the data.
Validation.aspx • <%-- Validation.aspx --%> • <%-- Form that demonstrates using validators to validate user input. --%> • <%@ Page Language="C#" AutoEventWireup="true" • CodeFile="Validation.aspx.cs" Inherits="Validation" %> • <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" • "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> • <html xmlns="http://www.w3.org/1999/xhtml" > • <head id="Head1" runat="server"> • <title>Demonstrating Validation Controls</title> • </head> • <body> • <form id="form1" runat="server"> • <div> • Please fill out the following form.<br /> • <em>All fields are required and must • contain valid information.</em><br /> • <br />
<table> • <tr> • <td style="width: 100px" valign="top">Name:</td> • <td style="width: 450px" valign="top"> • <asp:TextBox ID="nameTextBox" runat="server"> • </asp:TextBox><br /> • <asp:RequiredFieldValidator ID="nameInputValidator" • runat="server" ControlToValidate="nameTextBox" • ErrorMessage="Please enter your name." • Display="Dynamic"></asp:RequiredFieldValidator> • </td> • </tr> • <tr> • <td style="width: 100px; height: 64px;" valign="top"> • E-mail address:</td> • <td style="width: 450px; height: 64px;" valign="top"> • <asp:TextBox ID="emailTextBox" runat="server"> • </asp:TextBox> • e.g., user@domain.com<br />
<asp:RequiredFieldValidator ID="emailInputValidator" • runat="server" ControlToValidate="emailTextBox" • ErrorMessage="Please enter your e-mail address." • Display="Dynamic"></asp:RequiredFieldValidator> • <asp:RegularExpressionValidator • ID="emailFormatValidator" runat="server" • ControlToValidate="emailTextBox" • ErrorMessage="Please enter an e-mail address in a • valid format." Display="Dynamic" • ValidationExpression= • "\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*"> • </asp:RegularExpressionValidator> • </td> • </tr> • <tr> • <td style="width: 100px" valign="top">Phone number:</td> • <td style="width: 450px" valign="top"> • <asp:TextBox ID="phoneTextBox" runat="server"> • </asp:TextBox> • e.g., (555) 555-1234<br />
<asp:RequiredFieldValidator ID="phoneInputValidator" • runat="server" ControlToValidate="phoneTextBox" • ErrorMessage="Please enter your phone number." • Display="Dynamic"></asp:RequiredFieldValidator> • <asp:RegularExpressionValidator • ID="phoneFormatValidator" runat="server" • ControlToValidate="phoneTextBox" • ErrorMessage="Please enter a phone number in a • valid format." Display="Dynamic" • ValidationExpression= • "((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}"> • </asp:RegularExpressionValidator> • </td> • </tr> • </table> • <br /> • <asp:Button ID="submitButton" runat="server" Text="Submit" /> • <br /><br />
<asp:Label ID="outputLabel" runat="server" • Text="Thank you for your submission." • Visible="False"></asp:Label> • </div> • </form> • </body> • </html>
Examining Validation.aspx • Validation.aspx uses a table to organize the page’s contents. Lines 24-25, 36-37 and 56-57 define TextBoxes for receiving the user’s name, e-mail address and phone number, respectively, and line 75 defines a Submit button. • Lines 77-79 create a Label named outputLabel that displays the response from the server when the user successfully submits the form. • Notice that outputLabel’s Visible property is initially set to False, so that the Label does not appear in the client’s browser when the page loads for the first time.
Using RequiredFieldValidator Control • In this example, we use three RequiredFieldValidator controls (found in the Validation section of the Toolbox) to ensure that the name, e-mail address and phone number TextBoxes are not empty when the form is submitted. • A RequiredFieldValidator makes an input control a required field. If such a field is empty, validation fails. For example, lines 26-29 define RequiredFieldValidator nameInputValidator, which confirms that nameTextBox is not empty. Line 27 associates nameTextBox with nameInputValidator by setting the validator’s ControlToValidate property to nameTextBox. This indicates that nameInputValidator verifies the nameTextBox’s contents. • Property ErrorMessage’s text (line 28) is displayed on the Web Form if the validation fails. If the user does not input any data in nameTextBox and attempts to submit the form, the ErrorMessage text is displayed in red. • Because we set the control’s Display property to Dynamic (line 29), the validator takes up space on the Web Form only when validation fails. Space is allocated dynamically at that point, causing the controls below the validator to shift downwards to accommodate the ErrorMessage.
Using RegularExpressionValidator Control • The example also uses RegularExpressionValidator controls to match the e-mail address and phone number entered by the user against regular expressions. • These controls determine whether the e-mail address and phone number were each entered in a valid format. For example, lines 43-50 create a RegularExpressionValidator named emailFormatValidator. Line 45 sets property ControlToValidate to emailTextBox to indicate that emailFormatValidator verifies the emailTextBox’s contents. • A RegularExpressionValidator’s ValidationExpression property specifies the regular expression that validates the ControlToValidate’s contents. • Clicking the ellipsis next to property ValidationExpression in the Properties window displays the Regular Expression Editor dialog, which contains a list of Standard expressions for phone numbers, ZIP codes and other formatted information.
Using RegularExpressionValidator Control • You can also write your own custom expression. • For the emailFormatValidator, we selected the standard expression Internet e-mail address, which uses the validation expression: \w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)* • This regular expression indicates that an e-mail address is valid if the part of the address before the @ symbol contains one or more word characters (alphanumeric characters or underscore), followed by zero or more strings comprised of a hyphen, plus sign, period or apostrophe and additional word characters. • After the @ symbol, a valid e-mail address must contain one or more groups of word characters potentially separated by hyphens or periods, followed by a required period and another group of one or more word characters potentially separated by hyphens or periods. • For example, akhalil@aucegypt.edu, nathalie.khalil@hp.com, bob-white@my-email.com and bob's-personal.email@white.email.com are all valid e-mail addresses. • If the user enters text in the emailTextBox that does not have the correct format and either clicks in a different text box or attempts to submit the form, the ErrorMessage text is displayed in red.
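The e-mail expression can be tried outside ASP.NET with a small console sketch. The class name EmailPatternDemo and the helper IsValidEmail are assumptions for illustration. The sketch wraps the expression in ^( )$ so that the entire input must match; this is an assumption about how to approximate the validator’s whole-value check in plain C#, not a description of the control’s internals.

```csharp
// EmailPatternDemo.cs
// Sketch: testing the Internet e-mail address expression against sample input.
using System;
using System.Text.RegularExpressions;

class EmailPatternDemo
{
   // validation expression copied from emailFormatValidator
   private const string EmailExpression =
      @"\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

   // hypothetical helper: anchors the expression so the whole input must match
   public static bool IsValidEmail( string input )
   {
      return Regex.IsMatch( input, "^(" + EmailExpression + ")$" );
   }

   public static void Main()
   {
      string[] samples = { "akhalil@aucegypt.edu",
         "bob's-personal.email@white.email.com", "user@domain",
         "not-an-address" };

      foreach ( string sample in samples )
         Console.WriteLine( sample + ": " + IsValidEmail( sample ) );
   } // end method Main
} // end class EmailPatternDemo
```

The first two samples are accepted; “user@domain” is rejected because the expression requires a period after the @ symbol, and “not-an-address” is rejected because it has no @ symbol.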
Using RegularExpressionValidator Control • We also use RegularExpressionValidator phoneFormatValidator (lines 63-70) to ensure that the phoneTextBox contains a valid phone number before the form is submitted. • In the Regular Expression Editor dialog, we select U.S. phone number, which assigns: ((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4} to the ValidationExpression property. This expression indicates that a phone number can contain a three-digit area code either in parentheses and followed by an optional space or without parentheses and followed by a required hyphen. After an optional area code, a phone number must contain three digits, a hyphen and another four digits. For example, (555) 123-4567, 555-123-4567 and 123-4567 are all valid phone numbers. • If all five validators are successful (i.e., each TextBox is filled in, and the e-mail address and phone number provided are valid), clicking the Submit button sends the form’s data to the server. The server then responds by displaying the submitted data in the outputLabel (lines 77-79).
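The phone number expression can be exercised in the same way. The class name PhonePatternDemo and the helper IsValidPhone are assumptions for illustration, and the ^( )$ wrapper is again an assumption used to require that the entire input matches when testing outside ASP.NET.

```csharp
// PhonePatternDemo.cs
// Sketch: testing the U.S. phone number expression against sample input.
using System;
using System.Text.RegularExpressions;

class PhonePatternDemo
{
   // validation expression copied from phoneFormatValidator
   private const string PhoneExpression =
      @"((\(\d{3}\) ?)|(\d{3}-))?\d{3}-\d{4}";

   // hypothetical helper: anchors the expression so the whole input must match
   public static bool IsValidPhone( string input )
   {
      return Regex.IsMatch( input, "^(" + PhoneExpression + ")$" );
   }

   public static void Main()
   {
      string[] samples = { "(555) 123-4567", "(555)123-4567",
         "555-123-4567", "123-4567", "555 123-4567", "5551234567" };

      foreach ( string sample in samples )
         Console.WriteLine( sample + ": " + IsValidPhone( sample ) );
   } // end method Main
} // end class PhonePatternDemo
```

The first four samples are accepted. The last two are rejected: an area code without parentheses must be followed by a hyphen, and the hyphen before the final four digits is required.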