CS465 Unix grep Utility
The grep utility
• grep stands for "globally search for a regular expression and print" the results.
• It is one of the most used Unix tools. It has even entered the Unix user's vocabulary:
    • Verb: “Grep through the files to see what should be changed.”
    • Adjective: “The projx file is grepped source code.”
    • Noun: “Grepping is the best way to find that information.”
grep commands
• grep is actually a family of commands: fgrep, grep, and egrep
• All three search files for strings that match specified patterns:
    fgrep   pattern must be a fixed string
    grep    pattern can include regular expressions
    egrep   pattern can include extended regular expressions
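The difference between the three commands can be sketched with one small file. This is a minimal illustration, assuming a system where `grep -F` and `grep -E` are available as the option spellings of fgrep and egrep; the file name demo.txt and its contents are made up for the example.

```shell
# Hypothetical sample file; name and contents are for illustration only.
tmp=$(mktemp -d)
printf '%s\n' 'a.c' 'abc' 'ac' 'abbc' > "$tmp/demo.txt"

# fgrep (grep -F): the pattern is a fixed string, so "." is a literal period.
fixed=$(grep -F 'a.c' "$tmp/demo.txt")

# grep: basic regular expression, so "." matches any single character.
basic=$(grep 'a.c' "$tmp/demo.txt")

# egrep (grep -E): extended regular expression, so "+" means "one or more".
extended=$(grep -E 'ab+c' "$tmp/demo.txt")

printf 'fixed:\n%s\n' "$fixed"
printf 'basic:\n%s\n' "$basic"
printf 'extended:\n%s\n' "$extended"
rm -rf "$tmp"
```

The same pattern text selects one line as a fixed string and two lines as a regular expression, which is why the choice of command matters.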
Simplest grep
• The simplest grep command is fgrep:
    $ fgrep fixed-pattern [file-list]
• Searches all files in the file-list
• Displays the name of each file that contains the fixed-pattern, along with each line on which the pattern was found
• If you list only ONE filename in the file-list, fgrep will NOT include the filename in the results
fgrep Example
• Search all files in the current directory for the string "main":
    $ fgrep main *
    memo: The main point is that the
    new.c:main()
    prog1.c:main()
    $
More fgrep examples
Display all lines in file prog.c containing “num”:
    $ fgrep num prog.c
    num = 0;
    while ( num < 5 ) {
    num = num + 1;
    $
Display the entries (lines) for all users containing “small”:
    $ fgrep small /etc/passwd
    small000:x:1164:102:Faculty – Pam Smallwood:/export/home/small000:/bin/ks
    $
grep format
    $ grep [options] pattern [filelist]
• Searches for the specified pattern in each line of the specified files
• Sends lines containing the pattern (or other info) to standard output (i.e. displays them)
• Options:
    -c   display only a count of matching lines
    -h   output matched lines but not filenames
    -i   ignore case when matching
    -l   display names of files only (no matching lines)
    -n   display line numbers
    -s   suppress error messages for nonexistent or unreadable files
    -v   display only non-matching lines
    -w   restrict pattern to matching a whole word
grep examples
    $ cat soccer.txt
    In Soccer,
    There are no time outs.
    There are no helmets,
    no shoulder pads,
    no commercial breaks,
    no warm dugouts,
    no halftime extravaganzas.
    So if that’s what you need,
    play another sport.
    $ grep are soccer.txt
    There are no time outs.
    There are no helmets,
    $ grep -n are soccer.txt
    2:There are no time outs.
    3:There are no helmets,
    $ grep -c are soccer.txt
    2
grep examples
    $ cat soccer.txt
    In Soccer,
    There are no time outs.
    There are no helmets,
    no shoulder pads,
    no commercial breaks,
    no warm dugouts,
    no halftime extravaganzas.
    So if that’s what you need,
    play another sport.
    Are you ready?
    $ grep Are soccer.txt
    Are you ready?
    $ grep -i Are soccer.txt
    There are no time outs.
    There are no helmets,
    Are you ready?
    $ grep -v no soccer.txt
    In Soccer,
    So if that’s what you need,
    Are you ready?
    $ grep -vc no soccer.txt
    3
grep examples
    $ cat soccer.txt
    In Soccer,
    There are no time outs.
    There are no helmets,
    no shoulder pads,
    no commercial breaks,
    no warm dugouts,
    no halftime extravaganzas.
    So if that’s what you need,
    play another sport.
    Are you ready?
    $ grep -vw no soccer.txt
    In Soccer,
    So if that’s what you need,
    play another sport.
    Are you ready?
    $ grep -l soccer *
    $ grep -l Soccer *
    soccer.txt
    $ grep -li soccer *
    soccer.txt
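To try the option examples above, the sample file can be recreated in a scratch directory. The following is a sketch only: it assumes the file has exactly the ten lines shown on the slides, and it prints counts rather than the matching lines so the four options are easy to compare.

```shell
# Recreate the slides' sample file in a temporary directory.
tmp=$(mktemp -d)
cat > "$tmp/soccer.txt" <<'EOF'
In Soccer,
There are no time outs.
There are no helmets,
no shoulder pads,
no commercial breaks,
no warm dugouts,
no halftime extravaganzas.
So if that's what you need,
play another sport.
Are you ready?
EOF

# -c counts matching lines; adding -i ignores case ("Are" now counts too).
are_count=$(grep -c are "$tmp/soccer.txt")
are_nocase=$(grep -ic are "$tmp/soccer.txt")

# -v inverts the match; note that "no" also hides inside "another".
without_no=$(grep -vc no "$tmp/soccer.txt")

# -w requires a whole word, so "another" is no longer treated as "no".
without_word_no=$(grep -vwc no "$tmp/soccer.txt")

echo "$are_count $are_nocase $without_no $without_word_no"
rm -rf "$tmp"
```

The last two counts differ by one line, "play another sport.", which is exactly the difference between the -v and -vw outputs on the slides.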
grep examples
    $ cat team1
    Rob Murray
    Scott Stewart
    Martin Jones
    Scott Smith
    $ cat team2
    Scott Jones
    Richard Shepard
    Doug Stringfellow
    John English
    $ grep Scott team1 team2
    team1:Scott Stewart
    team1:Scott Smith
    team2:Scott Jones
    $ grep -h Scott team1 team2
    Scott Stewart
    Scott Smith
    Scott Jones
    $ grep -l Scott team1 team2
    team1
    team2
grep examples
    $ cat team1
    Rob Murray
    Scott Stewart
    Martin Jones
    Scott Smith
    $ cat team2
    Scott Jones
    Richard Shepard
    Doug Stringfellow
    John English
    $ grep Scott team1 taem2
    team1:Scott Stewart
    team1:Scott Smith
    grep: can't open taem2
    $ grep -s Scott team1 taem2
    team1:Scott Stewart
    team1:Scott Smith
grep examples
    $ cat team1
    Rob Murray
    Scott Stewart
    Martin Jones
    Scott Smith
    $ cat team2
    Scott Jones
    Richard Shepard
    Doug Stringfellow
    John English
    $ grep Scott team*
    team1:Scott Stewart
    team1:Scott Smith
    team2:Scott Jones
    $ grep -c Doug team*
    team1:0
    team2:1
    $ grep -c Do team* | grep ":0"
    team1:0
More grep examples
• Search the files in the sub subdirectory for the string “test” (ignoring case):
    $ grep -i test sub/*
    sub/ltr:Test for today
    sub/mbox:Subject: test
    sub/makefile:test1: test1.c
    sub/makefile:    gcc test1.c -o test1
    $
• Print non-commented lines in file myfile (i.e. lines that do NOT start with the string "#"):
    $ egrep -v "^#" myfile
    name=bill
    echo $name
    $
More grep example
Determine the number of users in the projectX group:
    $ grep projectX /etc/group
    projectX:x:507:Plin_9318,Fyusuf_9287,Rlee_8656,Rdeich_1254,Njuwal_5960,Mmelto_8858,Wbucki_6698,Tespin_0604,Psmallwo_000
    $
    $ grep -c 507 /etc/passwd
    9
    $
• -c shows the count of matching lines
• 507 is the group ID of projectX, taken from the third field of its /etc/group entry
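A plain `grep -c 507` counts every line that contains 507 anywhere, not only lines whose group ID field is 507. The sketch below shows the difference on made-up data; the user names and the file passwd.sample are hypothetical, and the stricter pattern relies on regular expressions, which are covered later in these slides.

```shell
# Hypothetical passwd-style data: name:password:UID:GID:comment:home:shell
tmp=$(mktemp -d)
printf '%s\n' \
  'alice:x:1001:507:Alice:/home/alice:/bin/ksh' \
  'bob:x:1507:102:Bob:/home/bob:/bin/ksh' \
  'carol:x:1002:507:Carol:/home/carol:/bin/ksh' > "$tmp/passwd.sample"

# Loose pattern: also counts bob, whose UID (1507) merely contains 507.
loose=$(grep -c 507 "$tmp/passwd.sample")

# Stricter pattern: skip three colon-terminated fields, then require GID 507.
strict=$(grep -c '^[^:]*:[^:]*:[^:]*:507:' "$tmp/passwd.sample")

echo "loose=$loose strict=$strict"
rm -rf "$tmp"
```

On real systems the loose count is often still correct, but anchoring the match to the fourth field removes the guesswork.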
Searching for Multiple Strings: the -f option
• If you have multiple strings that you want to search for, you can put all the strings into a string file (one per line), and use:
    $ cat stringfile
    pattern1
    pattern2
    $ grep -f stringfile filelist
grep examples
    $ cat soccer.txt
    In Soccer,
    There are no time outs.
    There are no helmets,
    no shoulder pads,
    no commercial breaks,
    no warm dugouts,
    no halftime extravaganzas.
    So if that’s what you need,
    play another sport.
    Are you ready?
    $ cat words
    Soccer
    dugouts
    helmets
    $ grep -f words soccer.txt
    In Soccer,
    There are no helmets,
    no warm dugouts,
    $ grep -v -f words soccer.txt
    There are no time outs.
    no shoulder pads,
    no commercial breaks,
    no halftime extravaganzas.
    So if that’s what you need,
    play another sport.
    Are you ready?
grep Exercises
• Display all lines of file “test” that do not contain the string “and”, ignoring case
    $ grep -iv "and" test
• Display a count of all of the lines of each “.c” file in the current directory that contain the strings “num” or “number”
• Answer:
    $ cat strings
    num
    number
    $ grep -c -f strings *.c
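The second exercise can be checked on throwaway files. This is a sketch under stated assumptions: the two source files a.c and b.c and their contents are invented for the example, and only the `strings` file comes from the exercise itself.

```shell
tmp=$(mktemp -d)
# Hypothetical source files, for illustration only.
printf '%s\n' 'int num = 0;' 'int number;' 'return 0;' > "$tmp/a.c"
printf '%s\n' 'int main(void)' 'return 0;' > "$tmp/b.c"
# The string file from the exercise: one pattern per line.
printf '%s\n' 'num' 'number' > "$tmp/strings"

# -f reads the patterns from the file; a line counts if it matches any of them.
# -c prints one "filename:count" line per file, including files with zero matches.
counts=$(cd "$tmp" && grep -c -f strings *.c)

echo "$counts"
rm -rf "$tmp"
```

Each line is counted once even if it matches more than one pattern, so the line containing "number" adds one to the count, not two.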
Advanced grep
• grep is much more powerful when used with regular expressions to match more complex strings.
• Regular expressions are strings of characters and special symbols that are used to match other strings.
Regular Expressions
• A pattern matching string is called a regular expression (RE)
• grep (and other Unix utilities) can use REs
• Regular Expression Metacharacters:
    .  (period)    match any single character, except newline (similar to the wildcard ?)
    *  (asterisk)  match any number (including zero) of the preceding character
    .*             match any number of any character
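The three metacharacter forms above can be compared on a small word list. A minimal sketch; the file name and the five words are made up for the example.

```shell
tmp=$(mktemp -d)
# Hypothetical word list, one word per line.
printf '%s\n' 'ct' 'cat' 'caat' 'cut' 'coat' > "$tmp/words"

# "." stands for exactly one character between c and t.
one_char=$(grep 'c.t' "$tmp/words")

# "a*" stands for zero or more a's between c and t.
any_as=$(grep 'ca*t' "$tmp/words")

# ".*" stands for any run of characters (possibly empty) between c and t.
anything=$(grep -c 'c.*t' "$tmp/words")

printf '%s\n' "$one_char" '--' "$any_as" '--' "$anything"
rm -rf "$tmp"
```

Note that `ca*t` matches "ct" (zero a's) but not "cut", while `c.t` does the opposite.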
Not Filename Expansion!
• Although the metacharacters resemble those used in filename expansion, this is different!
• Filename expansion is done by the shell.
• Regular expressions are used by commands (programs).
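The difference shows up as soon as a pattern is left unquoted. In this sketch the file name abc.txt is a deliberately chosen, hypothetical name that happens to match the shell wildcard `ab*`.

```shell
tmp=$(mktemp -d)
# Hypothetical file whose name matches the shell wildcard ab*.
printf '%s\n' 'a' 'ab' 'abb' 'abc.txt' > "$tmp/abc.txt"

# Quoted: grep receives the regular expression ab* (an a, then zero or more b's).
quoted=$(cd "$tmp" && grep -c 'ab*' abc.txt)

# Unquoted: the shell first expands ab* to the filename abc.txt,
# so grep is really asked to search for the pattern "abc.txt".
unquoted=$(cd "$tmp" && grep -c ab* abc.txt)

echo "quoted=$quoted unquoted=$unquoted"
rm -rf "$tmp"
```

The two commands look almost identical but search for different patterns, which is why regular expressions should be quoted on the command line.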
More RE Metacharacters
    ^  (caret)    match start of line
    $  (dollar)   match end of line
Character Sets:
    [ ]     match any of the enclosed characters
    [^ ]    match anything BUT the enclosed characters
NOTE: If the caret (^) is anywhere inside a character set except right after the opening bracket, it has no special meaning
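A short sketch of the anchors and character sets described above, using a made-up four-line file:

```shell
tmp=$(mktemp -d)
# Hypothetical sample lines.
printf '%s\n' 'cat' 'scat' 'cats' 'c^t' > "$tmp/words"

starts=$(grep -c '^cat' "$tmp/words")    # lines beginning with cat
ends=$(grep -c 'cat$' "$tmp/words")      # lines ending with cat
whole=$(grep -c '^cat$' "$tmp/words")    # lines that are exactly cat

# Caret right after [ negates the set: c, any character except a, then t.
not_a=$(grep 'c[^a]t' "$tmp/words")

# Caret elsewhere in the set is an ordinary character: c, then a or ^, then t.
caret_literal=$(grep -c 'c[a^]t' "$tmp/words")

echo "$starts $ends $whole $not_a $caret_literal"
rm -rf "$tmp"
```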
Character Set Ranges
• The hyphen (-) character can be used with the square brackets to indicate a range of characters:
    [0-9]  is the same as  [0123456789]
    [a-z]  is the same as  [abcd...wxyz]
• If the hyphen is placed at the beginning or end of the character set, it has no special meaning (and will match a hyphen in the string)
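Ranges and the literal hyphen can be tried as follows. This is a sketch on invented data; it sets the C locale because the exact meaning of a range such as `[a-z]` can depend on the locale in use.

```shell
# Use the C locale so ranges follow plain ASCII order.
LC_ALL=C
export LC_ALL

tmp=$(mktemp -d)
# Hypothetical sample lines.
printf '%s\n' 'x1' 'xa' 'x-' 'x_' > "$tmp/codes"

digit=$(grep 'x[0-9]' "$tmp/codes")             # x followed by a digit
lower=$(grep 'x[a-z]' "$tmp/codes")             # x followed by a lowercase letter
digit_or_hyphen=$(grep 'x[0-9-]' "$tmp/codes")  # trailing hyphen is literal

printf '%s\n' "$digit" '--' "$lower" '--' "$digit_or_hyphen"
rm -rf "$tmp"
```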
Other Characters
• Any character other than a metacharacter matches itself:
    • A single letter a in a regular expression matches a single letter a in a string
    • Matching is case-sensitive; lowercase a doesn't match uppercase A
• Use a backslash to turn OFF metacharacter processing (i.e. match a metacharacter as its literal value)
• Use quotes around regular expressions to prevent SHELL metacharacter interpretation.
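The effect of the backslash can be sketched on a few made-up lines. The patterns are single-quoted so the shell passes the backslash through to grep unchanged.

```shell
tmp=$(mktemp -d)
# Hypothetical sample lines.
printf '%s\n' '3.14' '3514' 'a*b' 'aab' > "$tmp/data"

# Unescaped period: any character may sit between 3 and 14.
any_char=$(grep -c '3.14' "$tmp/data")

# Escaped period: only a literal "." is accepted.
literal_dot=$(grep '3\.14' "$tmp/data")

# Escaped asterisk: matches the three characters a, *, b.
literal_star=$(grep 'a\*b' "$tmp/data")

# Unescaped asterisk: zero or more a's followed by b.
repeat_star=$(grep -c 'a*b' "$tmp/data")

echo "$any_char $literal_dot $literal_star $repeat_star"
rm -rf "$tmp"
```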
Example Pattern Matches
• Some sample regular expressions and what they match:
    "abc"           matches the string abc
    "^abc"          abc at the beginning of a line
    "abc$"          abc at the end of a line
    "^abc$"         abc as the entire line
    "[Aa]bc"        abc or Abc
    "a[aeiuo]c"     a, lowercase vowel, c
    "a[^aeiou]c"    a, not lowercase vowel, c
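Example Pattern Matches
• More regular expressions and what they match:
    "[x-z]"          matches x or y or z
    "[xz-]"          matches x or z or - (a hyphen placed last is literal)
    "[-xz]"          matches - or x or z (same)
    ".c"             matches any character followed by a c
    "\.c"            matches .c
    "[.c]"           matches a single . or c (a period inside brackets is literal)
    "[a-zA-Z0-9]"    matches any letter or digit
    "[^0-9]"         matches any non-digit
    "[^^]"           matches any single character except ^
• NOTE: Inside square brackets, grep treats a backslash as an ordinary character, so it cannot be used there to escape - or ^; rely on their position in the set instead.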
More Pattern Matches
    "[Pp][Aa][Mm]"
        • Matches "Pam" or "pam" or "pAM"
        • Does not match "am" or "pa"
    "[abc]*"
        • Matches "aaaaa" or "acbca"
    "0*10"
        • Matches "010" or "0000010" or "10"
grep examples
    $ cat pattern
    Background is black, and white.
    I love red,
    and I love blue,
    but not yellow.
    $ grep -i "^b" pattern
    Background is black, and white.
    but not yellow.
    $ grep "." pattern
    Background is black, and white.
    I love red,
    and I love blue,
    but not yellow.
    $ grep -i b pattern
    Background is black, and white.
    and I love blue,
    but not yellow.
    $ grep "\." pattern
    Background is black, and white.
    but not yellow.
    $ grep "^b" pattern
    but not yellow.
grep examples
    $ cat pattern
    Background is black, and white.
    I love red,
    and I love blue,
    but not yellow.
    $ grep "," pattern
    Background is black, and white.
    I love red,
    and I love blue,
    $ grep ",$" pattern
    I love red,
    and I love blue,
grep examples
    $ cat pattern2
    rd
    reed
    red
    reef
    ref
    reep
    $ grep "re[df]" pattern2
    red
    ref
    $ grep "re*[dp]" pattern2
    rd
    reed
    red
    reep
    $ grep "re*d" pattern2
    rd
    reed
    red
    $ grep "f$" pattern2
    reef
    ref
More grep examples (using RE)
• Display names of all files in this directory that refer to Unix (or unix):
    $ grep -l "[Uu]nix" *
    mbox
    myfile.txt
    script2
    $
• List soft-linked files only:
    $ ls -l | grep "^l"
    lrwx------ 2 small000 faculty 512 Jun 4 13:04 t1 -> t
    lrwx------ 2 small000 faculty 512 Jun 2 13:43 t2 -> t
    $
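grep in a script
Save a long listing of files, then display the number of "old" files (files whose listing shows the year 2007, i.e. last modified in 2007):
    $ cat oldfiles.ksh
    #! /bin/ksh
    # Save the long listing to a temporary file
    ls -l > listfile
    # Count the lines of the listing that contain 2007
    num=`grep 2007 listfile | wc -l`
    echo Number of old files: $num
    rm listfile
    exit 0
    $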
Extended Regular Expression: Additional Metacharacters (available only with egrep)
    +      match any number (greater than zero) of the preceding character
    ?      match zero or one instances of the preceding character
    |      combine REs with either-or matching
    ( )    group pattern matching characters
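The sketch below contrasts basic and extended matching for these metacharacters. It assumes `grep -E` as the option spelling of egrep, and the file contents are invented (the line "re+d" is there only to show what plain grep does with a plus sign).

```shell
tmp=$(mktemp -d)
# Hypothetical sample lines.
printf '%s\n' 'rd' 'red' 'reed' 're+d' > "$tmp/pattern3"

# Plain grep: "+" is an ordinary character, so only the literal text re+d matches.
basic_plus=$(grep 're+d' "$tmp/pattern3")

# grep -E: "+" means one or more of the preceding character.
extended_plus=$(grep -E 're+d' "$tmp/pattern3")

# "?" means zero or one of the preceding character.
optional=$(grep -E 're?d' "$tmp/pattern3")

# "|" inside parentheses chooses between alternatives.
either=$(grep -E '^(rd|reed)$' "$tmp/pattern3")

printf '%s\n' "$basic_plus" '--' "$extended_plus" '--' "$optional" '--' "$either"
rm -rf "$tmp"
```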
egrep examples
    $ cat pattern2
    rd
    reed
    red
    reef
    ref
    reep
    $ egrep "re+d" pattern2
    reed
    red
    $ egrep "re?[df]" pattern2
    rd
    red
    ref
    $ egrep "re?d" pattern2
    rd
    red
    $ egrep "re+[dp]" pattern2
    reed
    red
    reep
grep/egrep examples
    $ cat pattern
    Background is black, and white.
    I love red,
    and I love blue,
    but not yellow.
    $ egrep "I|a" pattern
    Background is black, and white.
    I love red,
    and I love blue,
    $ egrep "^I|^a" pattern
    I love red,
    and I love blue,
    $ grep "red|blue" pattern
    $ egrep "red|blue" pattern
    I love red,
    and I love blue,
• Plain grep prints nothing for "red|blue" because | is not a metacharacter in basic regular expressions.
Extended RE Examples
    [abc]+d     matches "aaaaad" or "acbcad" but does NOT match "d"
    0+10        matches "010" or "0000010" but does NOT match "10"
    x[abc]?x    matches "xax", "xbx", "xcx" or "xx"
    A[0-9]?B    matches "A8B" or "AB" but does NOT match "a8b" or "A123B"
Grouping
• The parentheses special characters ( and ) can be used to group several subexpressions together and apply a suffix to them as a group:
    ba+d          accepts bad, baad, baaad, etc.
    (ba)+d        accepts bad, babad, bababad, etc.
    (ba)+(cd)+    accepts bacd, babacd, bacdcd, bacdcdcdcd, etc.
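The grouping examples above can be checked as follows. This is a sketch on made-up data; the patterns are wrapped in ^ and $ so that each one must account for the whole line, and `grep -E` is assumed as the option spelling of egrep.

```shell
tmp=$(mktemp -d)
# Hypothetical sample lines.
printf '%s\n' 'bad' 'baad' 'babad' 'bd' 'bacdcd' > "$tmp/groups"

# "+" applies only to the single character a.
repeat_a=$(grep -E '^ba+d$' "$tmp/groups")

# "+" applies to the whole group ba.
repeat_ba=$(grep -E '^(ba)+d$' "$tmp/groups")

# Two groups, each repeated one or more times.
two_groups=$(grep -E '^(ba)+(cd)+$' "$tmp/groups")

printf '%s\n' "$repeat_a" '--' "$repeat_ba" '--' "$two_groups"
rm -rf "$tmp"
```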
Alternatives
• The (|) "either-or" choice can also be "grouped":
    aa|bb          will accept either aa or bb
    fr(ie|ei)nd    will accept friend or freind, and nothing else
• There can be any number of choices:
    m(a|e|ai|oo)n  will accept man, men, main, or moon
• If all the choices are single characters, then you might as well use a character set:
    p(a|e|i)n      will accept pan, pen, or pin, and is equivalent to p[aei]n
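A brief sketch of grouped alternatives, including the equivalence between single-character choices and a character set. The word list is invented, and `grep -E` is assumed as the option spelling of egrep.

```shell
tmp=$(mktemp -d)
# Hypothetical word list.
printf '%s\n' 'friend' 'freind' 'frend' 'pan' 'pen' 'pin' 'pun' > "$tmp/spell"

# Grouped alternatives: either "ie" or "ei" between fr and nd.
spellings=$(grep -E '^fr(ie|ei)nd$' "$tmp/spell")

# Single-character alternatives ...
grouped=$(grep -E 'p(a|e|i)n' "$tmp/spell")

# ... select the same lines as a character set, which plain grep also supports.
bracket=$(grep 'p[aei]n' "$tmp/spell")

printf '%s\n' "$spellings" '--' "$grouped" '--' "$bracket"
rm -rf "$tmp"
```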
More egrep Examples
• You must use egrep in order to have access to the EXTENDED regular expressions:
    $ egrep 'ab+c?d' file
  matches lines with an a followed by one or more b's and an optional c, followed by a d
    $ egrep '(ab)+c?(de)+' file
  matches lines with one or more ab's and an optional c, followed by one or more de's
Handout • See handout for more fgrep, grep, and egrep examples
grep/egrep Exercise
• Display all lines of file “test” that end in the letters x, y, or z
    $ grep '[xyz]$' test
  OR
    $ egrep 'x$|y$|z$' test