
Regular Expressions for PHP

Regular Expressions for PHP. Adding magic to your programming. Geoffrey Dunn (geoff@warmage.com). What are Regular Expressions. Regular expressions are a syntax to match text. They date back to mathematical notation made in the 1950s.


Presentation Transcript


  1. Regular Expressions for PHP Adding magic to your programming. Geoffrey Dunn (geoff@warmage.com)

  2. What are Regular Expressions • Regular expressions are a syntax for matching text. • They date back to mathematical notation developed in the 1950s. • They became embedded in Unix systems through tools like ed and grep.

  3. What are RE • Perl in particular promoted the use of very complex regular expressions. • They are now available in all popular programming languages. • They allow much more complex matching than strpos()

  4. Why use RE • You can use RE to enforce rules on formats like phone numbers, email addresses or URLs. • You can use them to find key data within logs, configuration files or webpages.

  5. Why use RE • They can quickly make replacements that may be complex like finding all email addresses in a page and making them address [AT] site [dot] com. • You can make your code really hard to understand

  6. Syntax basics • In PHP the entire regular expression is written between two delimiters, conventionally forward slashes (/) • abc - most characters match themselves. This looks for the exact character sequence a, b and then c • . - a period matches any single character (except a newline, unless the s modifier is set) • [abc] - square brackets match any one of the characters inside. Here: a, b or c.

  7. Syntax basics • ? - marks the previous item as optional, so a? means there may or may not be an a • (abc)* - parentheses group patterns and the asterisk marks zero or more repetitions of the previous item, here the whole group. So this would match an empty string or abcabcabcabc • \.+ - the backslash is the general-purpose escape character, so \. is a literal period. The + marks one or more of the previous item. So this would match ......
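The basic syntax elements from the two slides above can be tried out directly with preg_match(). This is only a small sketch: the subject strings are made-up examples, and the ^ and $ anchors (which pin a pattern to the start and end of the subject) are added here so that each check is strict.

```php
<?php
// Each entry pairs a pattern with a sample subject that it should match.
$checks = [
    ['/abc/',       'xxabcxx'], // literal sequence a, b, c anywhere in the subject
    ['/a.c/',       'a-c'],     // . matches any single character
    ['/[abc]/',     'b'],       // any one of a, b or c
    ['/^colou?r$/', 'color'],   // ? makes the u optional
    ['/^(abc)*$/',  'abcabc'],  // * repeats the whole group
    ['/^(abc)*$/',  ''],        // zero repetitions also count as a match
    ['/^\.+$/',     '......'],  // \. is a literal period, + means one or more
];

$results = [];
foreach ($checks as [$pattern, $subject]) {
    // preg_match() returns 1 on a match and 0 when there is none.
    $results[] = preg_match($pattern, $subject);
}

// A subject with other characters fails the strict "periods only" pattern.
$no_match = preg_match('/^\.+$/', 'a.b');
```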

  8. More syntax tricks • [0-4] - match any single digit from 0 to 4 • [^0-4] - match any single character that is not a digit from 0 to 4 • \sword\s - match word where there is a whitespace character before and after • \bword\b - \b marks a word boundary: a zero-width position between a word character and a non-word character, such as whitespace, punctuation, or the start or end of the string

  9. More syntax tricks • \d{3,12} - \d matches any digit ([0-9]) while the braces give the minimum and maximum count of the previous item. In this case 3 to 12 digits • [a-z]{8,} - at least 8 lowercase letters, with no upper limit
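The difference between \s and \b, and the effect of ranges and counts, is easiest to see side by side. The following is a minimal sketch with invented sample strings; the variable names are hypothetical and the ^ and $ anchors are added to make the range and count checks strict.

```php
<?php
// \s needs an actual whitespace character on both sides of the word.
$with_spaces   = preg_match('/\sword\s/', 'word');       // no surrounding whitespace
// \b only needs a boundary, which the start and end of the string provide.
$with_boundary = preg_match('/\bword\b/', 'word');
$punctuation   = preg_match('/\bword\b/', 'a (word).');  // punctuation is a boundary too
$inside_word   = preg_match('/\bword\b/', 'swordfish');  // no boundary between s and w

// Character ranges and their negation.
$in_range  = preg_match('/^[0-4]$/', '3');
$not_range = preg_match('/^[^0-4]$/', '7');

// Counted repetition.
$too_short = preg_match('/^\d{3,12}$/', '12');         // only 2 digits
$long_enough = preg_match('/^[a-z]{8,}$/', 'password'); // exactly 8 letters
```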

  10. Matching Text • Simple check: preg_match('/^[a-z0-9]+@([a-z0-9]+\.)*[a-z0-9]+$/i', $email_address) > 0 • Finding: preg_match('/\bcolou?r:\s+([a-zA-Z]+)\b/', $text, $matches); echo $matches[1]; • Find all: preg_match_all('/<([^>]+)>/', $html, $tags); echo $tags[1][1]; ($tags[0] holds the full matches and $tags[1] the first capture group, so this prints the contents of the second tag)
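To make the shape of the result arrays concrete, here is a short sketch that runs the three patterns from this slide against sample input. The input strings and variable names are assumptions made up for illustration.

```php
<?php
// Simple check: the return value is 1 for a match and 0 for no match.
$email_pattern = '/^[a-z0-9]+@([a-z0-9]+\.)*[a-z0-9]+$/i';
$valid   = preg_match($email_pattern, 'Geoff@warmage.com');
$invalid = preg_match($email_pattern, 'geoff@');

// Finding: $matches[0] is the whole match, $matches[1] the first group.
$text = 'Paint colour:  Red, trim color: blue';
preg_match('/\bcolou?r:\s+([a-zA-Z]+)\b/', $text, $matches);
$first_colour = $matches[1];

// Find all: with the default ordering, $tags[0] lists every full match
// and $tags[1] lists what the first group captured in each of them.
$html  = '<p>Hello <b>world</b></p>';
$count = preg_match_all('/<([^>]+)>/', $html, $tags);
$second_tag = $tags[1][1];
```

preg_match() stops at the first match, which is why only the first colour is captured; preg_match_all() returns the number of matches it found.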

  11. Matching Lines • This is mostly for looking through files, but it works on any array of strings. • $new_lines = preg_grep('/Jan[a-z]*[\s\/\-](20)?07/', $old_lines); • To get the lines that do not match, pass PREG_GREP_INVERT as the third parameter rather than complicating your regular expression into something like /^[^\/]|(\/[^p])|(\/p[^r]) etc...
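A small sketch of both forms of preg_grep(), using the pattern from this slide. The log lines are invented sample data. Note that preg_grep() keeps the original array keys, so the result may have gaps in its indexes.

```php
<?php
// Hypothetical log lines; in practice these might come from file().
$old_lines = [
    '12 Jan 07 backup ok',
    '03-Feb-2007 backup failed',
    'January/2007 summary',
    'Jan 2008 summary',
];

$pattern = '/Jan[a-z]*[\s\/\-](20)?07/';

// Lines that match the pattern (keys 0 and 2 are preserved).
$new_lines = preg_grep($pattern, $old_lines);

// Lines that do NOT match, without rewriting the pattern.
$other_lines = preg_grep($pattern, $old_lines, PREG_GREP_INVERT);
```

If consecutive indexes are needed afterwards, array_values() can be applied to the result.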

  12. Replacing text • $1, $2 and $3 in the replacement refer to the captured groups: preg_replace('/\b([^@\s]+)@([a-zA-Z\d_-]+)\.([a-zA-Z\d_.-]+)\b/', '$1 [AT] $2 [dot] $3', $post);
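The email obfuscation described earlier in the deck (address [AT] site [dot] com) can be sketched with capture groups and backreferences. This is one possible approach, not necessarily the author's: the function name obfuscate_emails() is hypothetical, the pattern is deliberately loose rather than a full email validator, and only the first dot after the @ is rewritten.

```php
<?php
// Hypothetical helper: rewrites anything that looks like user@host.tld.
function obfuscate_emails(string $post): string
{
    // Group 1: local part, group 2: first host label, group 3: the rest.
    $pattern = '/\b([a-z0-9._-]+)@([a-z0-9-]+)\.([a-z0-9.-]+)\b/i';

    // $1, $2 and $3 insert the captured groups into the replacement.
    return preg_replace($pattern, '$1 [AT] $2 [dot] $3', $post);
}

$post = 'Mail address@site.com for details.';
$safe = obfuscate_emails($post);
```

Text that contains no address passes through unchanged, since preg_replace() only touches the parts that match.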

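  13. Splitting text • $date_parts = preg_split('/[-.,\/\\\\\s]+/', $date_string); • Inside a PHP string literal, a literal backslash in the character class must be written as \\\\; the shorter \\\s reaches the regex engine as \\s, which matches a backslash or the letter s instead of whitespace.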
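A short sketch of preg_split() on a few differently punctuated dates. The date strings are invented examples, and the separator class assumes the intent was to split on hyphens, periods, commas, slashes, backslashes and whitespace.

```php
<?php
// Separator class: - . , / \ and any whitespace, one or more in a row.
$pattern = '/[-.,\/\\\\\s]+/';

$iso_like = preg_split($pattern, '2007-01/15');
$written  = preg_split($pattern, '15. Jan, 2007');
$windows  = preg_split($pattern, '15\\01\\2007'); // the string is 15\01\2007

// Leading or trailing separators would produce empty strings;
// PREG_SPLIT_NO_EMPTY drops them.
$trimmed = preg_split($pattern, ' 2007-01-15 ', -1, PREG_SPLIT_NO_EMPTY);
```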

  14. Tips • Comment what your regular expression is doing. • Test your regular expression for speed. Some can cause a noticeable slowdown. • There are plenty of simple uses like /Width: (\d+)/ • Watch out for greedy expressions. E.g. /(<(.+)>)/ will not pull out "b" and "/b" from "<b>test</b>" but instead will pull "b>test</b". An easy way to change this behaviour is to make the quantifier lazy: /(<(.+?)>)/
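Two of these tips can be illustrated together. The sketch below first compares the greedy and lazy patterns on the string from the slide, with a negated character class shown as a common alternative, and then uses the x modifier to put comments inside a pattern. The x-modifier version of the Width pattern is an assumption about how one might comment it, with \s standing in for the literal space because x makes the engine ignore plain whitespace.

```php
<?php
$html = '<b>test</b>';

// Greedy: .+ runs to the last > it can find.
preg_match_all('/<(.+)>/', $html, $greedy);

// Lazy: .+? stops at the first > that lets the match succeed.
preg_match_all('/<(.+?)>/', $html, $lazy);

// Alternative: exclude > from the repeated characters altogether.
preg_match_all('/<([^>]+)>/', $html, $negated);

// The x modifier ignores whitespace in the pattern and allows # comments.
$width_pattern = '/
    Width:\s   # literal label followed by one whitespace character
    (\d+)      # capture the numeric value
/x';
preg_match($width_pattern, 'Width: 640px', $width);
```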

  15. References • http://en.wikipedia.org/wiki/Regular_expressions • http://php.net/manual/en/ref.pcre.php • Thank you
