
Data Manipulation & Regular Expressions CSCI 215


salali

Presentation Transcript


  1. Data Manipulation & Regular Expressions, CSCI 215

  2. Data Input
     • PHP scripts use data input...
         • from files
         • from databases
         • from users
     • Before using the data, we often need to...
         • format it
         • validate it
     • To achieve this, we use...
         • PHP functions
         • Regular Expressions (Regex)

  3. PHP Functions
     • There are many PHP functions used to validate data
     • ctype_alnum() returns true if a string is entirely alphanumeric
         ctype_alnum('WJD640')        // true
         ctype_alnum('Hi!')           // false
     • ctype_alpha() returns true if a string is entirely alphabetic
         ctype_alpha('Hello')         // true
         ctype_alpha('Hi5')           // false
     • ctype_digit() returns true if a string consists entirely of digits
         ctype_digit('88996')         // true
         ctype_digit('$23,946.52')    // false
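
The three checks above can be combined into a small classifier. This is only a sketch: describeInput() is a hypothetical helper name, not a PHP built-in, and the sample strings are made up. Note that all three ctype functions return false for an empty string.

```php
<?php
// describeInput() is a hypothetical helper, not part of PHP.
function describeInput(string $value): string
{
    if (ctype_digit($value)) {
        return 'digits only';
    }
    if (ctype_alpha($value)) {
        return 'letters only';
    }
    if (ctype_alnum($value)) {
        return 'letters and digits';
    }
    // Punctuation, spaces, symbols, or an empty string end up here.
    return 'other';
}

foreach (['WJD640', 'Hello', '88996', 'Hi!', ''] as $sample) {
    echo var_export($sample, true), ' => ', describeInput($sample), "\n";
}
```

The order of the tests matters: ctype_alnum() is also true for all-digit and all-letter strings, so the narrower checks come first.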

  4. Useful Functions: Splitting
     • Often we need to split data into multiple pieces based on a particular character
     • Use explode()
         // expand user-supplied date
         $input = '1/12/2007';
         $bits = explode('/', $input);
         // array(0 => '1', 1 => '12', 2 => '2007')
         $month = $bits[0];
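
Splitting alone does not guarantee that the pieces are usable. The sketch below shows one way to check the result of explode() before trusting it. parseUsDate() is a hypothetical helper, and the month/day/year order is an assumption based on the slide taking $bits[0] as the month.

```php
<?php
// parseUsDate() is a hypothetical helper; it assumes month/day/year order.
function parseUsDate(string $input): ?array
{
    $bits = explode('/', trim($input));

    // Expect exactly three pieces, each made of digits only.
    if (count($bits) !== 3 || array_filter($bits, 'ctype_digit') !== $bits) {
        return null;
    }

    // explode() returns strings, so convert them to integers.
    [$month, $day, $year] = array_map('intval', $bits);

    // checkdate() rejects impossible dates such as 2/30/2007.
    return checkdate($month, $day, $year)
        ? ['month' => $month, 'day' => $day, 'year' => $year]
        : null;
}

var_dump(parseUsDate('1/12/2007'));
var_dump(parseUsDate('2/30/2007'));
```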

  5. Useful functions: Trimming
     • Removing excess whitespace from both ends of a string
     • Use trim()
         // a user-supplied name
         $input = ' Rob ';
         $name = trim($input);   // 'Rob'

  6. Useful functions: String replace
     • To replace all occurrences of a string in another string use str_replace()
         // user-supplied date
         $input = '01.12-2007';
         $clean = str_replace(array('.', '-'), '/', $input);
         echo $clean;   // 01/12/2007
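
As a small illustration, trim() and str_replace() can be chained to normalise a date before it is split. normaliseDate() is a hypothetical helper name. The sketch also shows the optional fourth argument of str_replace(), which receives the number of replacements performed.

```php
<?php
// normaliseDate() is a hypothetical helper name.
function normaliseDate(string $input): string
{
    // Trim outer whitespace, then turn every '.' or '-' into '/'.
    return str_replace(['.', '-'], '/', trim($input));
}

echo normaliseDate(' 01.12-2007 '), "\n";

// The optional fourth argument is filled with the replacement count.
str_replace(['.', '-'], '/', '01.12-2007', $count);
echo $count, "\n";
```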

  7. Useful functions: cAsE
     • To make a string all uppercase use strtoupper()
     • To make a string all lowercase use strtolower()
     • To make just the first letter uppercase use ucfirst()
     • To make the first letter of each word in a string uppercase use ucwords()
     • Especially important when comparing strings:
         if (strtolower($_POST['type']) == 'student')
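
A brief sketch of the case functions in use. isStudent() and tidyName() are hypothetical helpers, and the sample values are invented. They assume plain ASCII input, since these functions work on bytes rather than multibyte characters.

```php
<?php
// isStudent() and tidyName() are hypothetical helpers.
function isStudent(string $type): bool
{
    // Normalise before comparing so 'Student', 'STUDENT ' etc. all pass.
    return strtolower(trim($type)) === 'student';
}

function tidyName(string $name): string
{
    // Lowercase everything first so 'rOB sMITH' becomes 'Rob Smith'.
    return ucwords(strtolower(trim($name)));
}

var_dump(isStudent('STUDENT '));
echo tidyName(' rOB sMITH '), "\n";
```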

  8. Useful functions: HTML sanitise
     • To make a string "safe" to output as HTML use htmlentities()
         // user-entered comment
         $input = 'The <a> tag & ..';
         $clean = htmlentities($input);
         // 'The &lt;a&gt; tag &amp; ..'
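
The sketch below uses htmlspecialchars(), a related built-in that converts only the characters that are special in HTML, instead of the htmlentities() call shown on the slide. escapeHtml() is a hypothetical wrapper name, and the UTF-8 encoding is an assumption about the page.

```php
<?php
// escapeHtml() is a hypothetical wrapper name.
function escapeHtml(string $text): string
{
    // ENT_QUOTES also converts single and double quotes, which matters
    // when the value is placed inside an HTML attribute.
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

$comment = 'The <a> tag & "quotes"';
echo '<p>', escapeHtml($comment), '</p>', "\n";
```

Escaping at output time, rather than when the data is stored, keeps the original text intact for other uses.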

  9. Regular Expressions
     • It is usually possible to use a combination of built-in PHP functions to achieve what you want.
     • However, sometimes this gets complicated and we turn to Regular Expressions.
     • Regular expressions are a concise (but complicated!) way of pattern matching.
     • We define a pattern, which is then used to validate or extract data from a string.

  10. Some definitions
      • Actual data string: 'rob@example.com'
      • Definition of the pattern (the "Regular Expression"): '/^[a-z\d\._-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i'
      • PHP functions to do something with the data and the regular expression: preg_match(), preg_replace()

  11. Regex: Delimiters
      • The regex definition is always bracketed by delimiters, usually a '/':
          $regex = '/php/';
          Matches: 'php', 'I love php', 'phpphp'
          Doesn't match: 'PHP', 'I love ph'
      • The whole regular expression has to be matched, but the whole data string doesn't have to be used.

  12. Regex: Case insensitive
      • Extra switches can be added after the last delimiter.
      • The 'i' switch makes comparisons case insensitive:
          $regex = '/php/i';
          Matches: 'php', 'I love pHp', 'PHP'
          Doesn't match: 'I love ph', 'p h p'
      • Will it match 'phpPHP'?

  13. Regex: Character groups
      • A regex is matched character-by-character. You can specify multiple options for a character using square brackets:
          $regex = '/p[huo]p/';
          Matches: 'php', 'pup', 'pop'
          Doesn't match: 'phup', 'ppp', 'pHp'
      • Will it match 'phpPHP'?

  14. Regex: Character groups
      • You can also specify a digit or alphabetical range in square brackets:
          $regex = '/p[a-z1-3]p/';
          Matches: 'php', 'pup', 'ppp', 'pop', 'p3p'
          Doesn't match: 'PHP', 'p5p', 'p p'
      • Will it match 'pa3p'?
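
One way to experiment with the "Will it match?" questions is preg_grep(), which returns the entries of an array that match a pattern. This is a sketch with made-up candidate lists, and the variable names are illustrative.

```php
<?php
// preg_grep() keeps the original array keys, so array_values() is used
// to renumber the results from 0.
$groupMatches = array_values(preg_grep(
    '/p[huo]p/',
    ['php', 'pup', 'phup', 'pHp', 'phpPHP']
));
print_r($groupMatches);

$rangeMatches = array_values(preg_grep(
    '/p[a-z1-3]p/',
    ['p3p', 'p5p', 'p p', 'pa3p']
));
print_r($rangeMatches);
```

'phpPHP' matches because the pattern only has to be found somewhere in the string. 'pa3p' does not match, because a square-bracket group stands for exactly one character.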

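  15. Regex: Predefined Classes
      • \d matches any digit
      • \w matches any word character (a letter, a digit or an underscore)
      • \s matches any whitespace character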

  16. Regex: Predefined classes
          $regex = '/p\dp/';
          Matches: 'p3p', 'p7p'
          Doesn't match: 'p10p', 'P7p'
          $regex = '/p\wp/';
          Matches: 'p3p', 'pHp', 'pop', 'p_p'
          Doesn't match: 'phhp', 'p*p', 'pp'

  17. Regex: the Dot
      • The special dot character matches any character except for a line break:
          $regex = '/p.p/';
          Matches: 'php', 'p&p', 'p(p', 'p3p', 'p$p'
          Doesn't match: 'PHP', 'phhp'

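  18. Regex: Repetition
      • There are a number of special characters that indicate the preceding character or group may be repeated:
          ?       zero or one time
          *       zero or more times
          +       one or more times
          {n,m}   between n and m times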

  19. Regex: Repetition
          $regex = '/ph?p/';
          Matches: 'pp', 'php'
          Doesn't match: 'phhp', 'pbp'
          $regex = '/ph*p/';
          Matches: 'pp', 'php', 'phhhhp'
          Doesn't match: 'pop', 'phhohp'
      • Will it match 'phHp'?

  20. Regex: Bracketed repetition
      • The repetition operators can be used on bracketed expressions to repeat multiple characters:
          $regex = '/(php)+/';
          Matches: 'php', 'phpphp', 'phpphpphp'
          Doesn't match: 'ph', 'popph'
      • Will it match 'phpph'?

  21. Regex: Repetition
          $regex = '/ph+p/';
          Matches: 'php', 'phhhhp'
          Doesn't match: 'pp', 'phyhp'
          $regex = '/ph{1,3}p/';
          Matches: 'php', 'phhhp'
          Doesn't match: 'pp', 'phhhhp'
      • Will it match 'pHHp'?
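
To compare the repetition operators side by side, the sketch below runs four patterns from the slides against one list of candidates. The candidate list and the variable names are illustrative choices.

```php
<?php
$candidates = ['pp', 'php', 'phhp', 'phhhhp', 'pHHp'];

$patterns = [
    '/ph?p/'     => 'zero or one h',
    '/ph*p/'     => 'zero or more h',
    '/ph+p/'     => 'one or more h',
    '/ph{1,3}p/' => 'one to three h',
];

$matches = [];
foreach ($patterns as $regex => $meaning) {
    // preg_grep() returns the candidates that match the pattern.
    $matches[$regex] = array_values(preg_grep($regex, $candidates));
    echo $regex, ' (', $meaning, '): ', implode(', ', $matches[$regex]), "\n";
}
```

None of the four patterns matches 'pHHp', because without the 'i' switch an uppercase 'H' is a different character from 'h'.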

  22. Regex: Anchors
      • So far, we have matched anywhere within a string. We can change this behavior by using anchors:
          ^   matches the start of the string
          $   matches the end of the string

  23. Regex: Anchors
      • With NO anchors:
          $regex = '/php/';
          Matches: 'php', 'php is great', 'I love php'
          Doesn't match: 'pop'

  24. Regex: Anchors
      • With start anchor:
          $regex = '/^php/';
          Matches: 'php', 'php is great'
          Doesn't match: 'I love php', 'pop'
      • Will it match 'PHP rocks!'?

  25. Regex: Anchors
      • With start and end anchors:
          $regex = '/^php$/';
          Matches: 'php'
          Doesn't match: 'php is great', 'I love php', 'pop'
      • Will it match 'php is php'?
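
One detail worth knowing when anchors are used for validation: by default '$' also matches just before a single trailing newline. The sketch below illustrates this, together with the 'D' switch, which restricts '$' to the very end of the string. The variable names are illustrative.

```php
<?php
$anchored = '/^php$/';

$exact       = preg_match($anchored, 'php');         // whole string is 'php'
$twice       = preg_match($anchored, 'php is php');  // extra text between the anchors
$withNewline = preg_match($anchored, "php\n");       // '$' tolerates one trailing newline
$strict      = preg_match('/^php$/D', "php\n");      // 'D' makes '$' match only at the very end

var_dump($exact, $twice, $withNewline, $strict);
```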

  26. Regex: Escape special characters
      • We have seen that characters such as ? . $ * + have a special meaning. If we want to use them as literals, we need to escape them with a backslash.
          $regex = '/p\.p/';
          Matches: 'p.p'
          Doesn't match: 'php', 'p1p'
      • Will it match 'p..p'?
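
When the literal text comes from a variable, escaping by hand is error-prone. PHP provides preg_quote() for this. The sketch below wraps it in containsLiteral(), a hypothetical helper name.

```php
<?php
// containsLiteral() is a hypothetical helper.
function containsLiteral(string $needle, string $haystack): bool
{
    // preg_quote() escapes regex metacharacters; the second argument
    // also escapes the '/' delimiter used below.
    $regex = '/' . preg_quote($needle, '/') . '/';
    return preg_match($regex, $haystack) === 1;
}

var_dump(containsLiteral('p.p', 'p.p'));
var_dump(containsLiteral('p.p', 'php'));
var_dump(containsLiteral('p.p', 'p..p'));
```

For a plain substring test like this one a string function would do; the helper only becomes useful when the escaped text is one part of a larger pattern.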

  27. So.. An example
      • Let's define a regex that matches an email:
          $emailRegex = '/^[a-z\d\._-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i';
          Matches: 'rob@example.com', 'rob@subdomain.example.com', 'a_n_other@example.co.uk'
          Doesn't match: 'rob@exam@ple.com', 'not.an.email.com'

  28. So.. An example
          /^                Starting delimiter, and start-of-string anchor
          [a-z\d\._-]+      User name: one or more letters, digits, dots, underscores or dashes
          @                 The @ separator
          ([a-z\d-]+\.)+    Domain (letters, digits or dashes only). Repetition to include subdomains.
          [a-z]{2,6}        com, uk, info, etc.
          $/i               End anchor, end delimiter, case insensitive
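
The same breakdown can be kept inside the code by adding the 'x' switch, which makes the regex engine ignore whitespace in the pattern and treat '#' as the start of a comment. This is a sketch of the lecture's pattern in that style. The comments must not contain the '/' delimiter.

```php
<?php
$emailRegex = '/
    ^                  # start-of-string anchor
    [a-z\d._-]+        # user name: letters, digits, dots, underscores, dashes
    @                  # the @ separator
    ([a-z\d-]+\.)+     # one or more domain labels, each followed by a dot
    [a-z]{2,6}         # final part such as com, uk, info
    $                  # end-of-string anchor
/ix';

foreach (['rob@example.com', 'a_n_other@example.co.uk', 'rob@exam@ple.com'] as $email) {
    echo $email, ': ', preg_match($emailRegex, $email) === 1 ? 'match' : 'no match', "\n";
}
```

Inside square brackets the dot is already literal, so the backslash before it in the original pattern is optional.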

  29. Resources
      • http://regexpal.com
      • http://regexlib.com/
          • Search
          • RegEx Tester
      • http://www.regular-expressions.info/

  30. Now What?
      • How do we use Regular Expressions?
      • preg_match() tests whether a string matches a regex pattern
      • preg_replace() replaces the parts of a string that match a regex pattern

  31. preg_match
      • We can use the preg_match() function to test whether a string matches or not.
          // match an email
          $emailRegex = '/^[a-z\d\._-]+@([a-z\d-]+\.)+[a-z]{2,6}$/i';
          $input = 'rob@example.com';
          if (preg_match($emailRegex, $input)) {
              echo 'Valid email';
          } else {
              echo 'Invalid email';
          }
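
preg_match() can also extract data: an optional third argument is filled with the text captured by each parenthesised group. The sketch below adapts the lecture's email pattern for this. splitEmail() is a hypothetical helper, and the extra parentheses and the non-capturing group (?: ) are additions made for the illustration.

```php
<?php
// splitEmail() is a hypothetical helper built on the lecture's pattern,
// with parentheses added around the user name and the domain.
function splitEmail(string $input): ?array
{
    // (?: ... ) groups without capturing, so only two groups are numbered.
    $regex = '/^([a-z\d._-]+)@((?:[a-z\d-]+\.)+[a-z]{2,6})$/i';

    if (preg_match($regex, $input, $matches) !== 1) {
        return null;
    }

    // $matches[0] is the whole match; [1] and [2] are the two groups.
    return ['user' => $matches[1], 'domain' => $matches[2]];
}

print_r(splitEmail('rob@subdomain.example.com'));
var_dump(splitEmail('not.an.email.com'));
```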

  32. Using RegEx in Validation Functions
      Write a function validZip that returns true if an input contains exactly 5 digits.
          function validZip($str) {
              $regexp = '/^\d{5}$/';
              // preg_match() returns 1 or 0, so compare to get a boolean
              return preg_match($regexp, $str) === 1;
          }

  33. Using RegEx in Validation Functions
      Test the validZip function on an array of zip codes.
          $data = array('89956', '33221-8837', '123VEF', '878788');
          foreach ($data as $item) {
              if (validZip($item))
                  echo "$item is valid<br />";
              else
                  echo "$item is not valid<br />";
          }

  34. Using RegEx in Validation Functions
      Write a function validText that returns true if an input contains only letters, no numbers or symbols.
          function validText($str) {
              // [A-z] would also accept [ \ ] ^ _ and `, so list both ranges
              $regexp = '/^[A-Za-z]*$/';
              return preg_match($regexp, $str) === 1;
          }

  35. Using RegEx in Validation Functions
      Test the validText function on an array of strings.
          $data = array('Hello2U', 'HELLO', '123', 'abc@def');
          foreach ($data as $item) {
              if (validText($item))
                  echo "$item is valid<br />";
              else
                  echo "$item is not valid<br />";
          }

  36. Using RegEx in Validation Functions
      Write a function validSid that returns true if an input contains a student ID in the form 880-88-3322, with the dashes optional.
          function validSid($str) {
              $regexp = '/^\d{3}-?\d{2}-?\d{4}$/';
              return preg_match($regexp, $str) === 1;
          }

  37. Using RegEx in Validation Functions
      Test the validSid function on an array of SIDs.
          $data = array('880-12-3456', '888776666', '8765432');
          foreach ($data as $item) {
              if (validSid($item))
                  echo "$item is valid<br />";
              else
                  echo "$item is not valid<br />";
          }

  38. More Practice
      • Write and test a function that returns true for a 9-digit zip code, e.g. 98001-9801
      • Write and test a function that returns true for either a 5-digit or 9-digit zip code
      • Write and test a function that validates a state abbreviation
      • Write and test a function that validates a phone number in the format (XXX)XXX-XXXX
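
One possible set of answers is sketched below, to compare against after attempting the exercises. All function names are hypothetical, and the state check is an assumption: it verifies only the two-uppercase-letter format, not whether the abbreviation belongs to a real state. The 'D' switch stops '$' from matching before a trailing newline.

```php
<?php
// Hypothetical solutions; the function names are illustrative.

// 9-digit zip code such as 98001-9801.
function validZip9(string $str): bool
{
    return preg_match('/^\d{5}-\d{4}$/D', $str) === 1;
}

// Either 5 digits, or 5 digits followed by a dash and 4 digits.
function validZip5or9(string $str): bool
{
    return preg_match('/^\d{5}(-\d{4})?$/D', $str) === 1;
}

// Two uppercase letters; format check only.
function validStateFormat(string $str): bool
{
    return preg_match('/^[A-Z]{2}$/D', $str) === 1;
}

// (XXX)XXX-XXXX, with the parentheses escaped because they are special.
function validPhone(string $str): bool
{
    return preg_match('/^\(\d{3}\)\d{3}-\d{4}$/D', $str) === 1;
}

foreach (['98001-9801', '98001', '9800-19801'] as $zip) {
    echo $zip, ': ', validZip5or9($zip) ? 'valid' : 'not valid', "\n";
}
```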

  39. Pattern replacement
      • We can use the function preg_replace() to replace any matching strings.
          // replace two or more whitespace characters with
          // a single space
          $input = 'Some   comment    string';
          $regex = '/\s\s+/';
          $clean = preg_replace($regex, ' ', $input);
          // 'Some comment string'
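
Two further uses of preg_replace(), sketched with made-up input: collapsing any run of whitespace (including tabs), and reusing captured groups in the replacement text as $1, $2 and so on. The target date layout is an arbitrary choice for the illustration.

```php
<?php
// Collapse every run of whitespace to one space, then trim the ends.
$input = "  Some   comment \t string ";
$clean = trim(preg_replace('/\s+/', ' ', $input));
echo $clean, "\n";   // Some comment string

// Captured groups can be reused in the replacement as $1, $2, $3.
// The replacement is single-quoted so PHP does not treat $1 as a variable.
$rearranged = preg_replace('/^(\d{2})\/(\d{2})\/(\d{4})$/', '$3-$1-$2', '01/12/2007');
echo $rearranged, "\n";   // 2007-01-12
```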
