Important UNIX Utilities
Regular expressions
• Used by several different UNIX commands, including ed, sed, awk, and grep
• A period (.) matches any single character
• .X. matches any X that has one character (any character) on each side of it
• The caret (^) matches the beginning of the line
• ^Bridgeport matches the characters Bridgeport only if they occur at the beginning of the line
Regular expressions (continued)
• A dollar sign ($) matches the end of the line
• Bridgeport$ matches the characters Bridgeport only if they are the very last characters on the line
• .$ matches any single character at the end of the line
• To match a special character literally, precede it with a backslash (\) to remove its special meaning
• \.$ matches any line that ends with a period
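
The anchors described above can be tried out with grep and sed. The following is a minimal sketch; the sample lines, the sample function, and the variable names are made up for illustration and are not part of the original slides.

```shell
# Made-up sample lines, used only for this sketch.
sample() {
  printf '%s\n' 'Bridgeport is a city.' 'I live in Bridgeport' 'No match here'
}

starts=$(sample | grep '^Bridgeport')         # Bridgeport at the beginning of the line
ends=$(sample | grep 'Bridgeport$')           # Bridgeport at the end of the line
period=$(sample | grep '\.$')                 # lines ending with a literal period
last=$(printf '%s\n' 'abc' | sed 's/.$/!/')   # .$ is the last character on the line

printf '%s\n' "$starts" "$ends" "$period" "$last"
```

The patterns are single-quoted so that the shell passes $ and \ to grep and sed unchanged.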
Regular expressions (continued)
• ^$ matches any line that contains no characters
• [...] matches any single character enclosed in the brackets
• [tT] matches a lowercase or uppercase t
• [A-Z] matches any uppercase letter
• [A-Za-z] matches any uppercase or lowercase letter
• [^A-Z] matches any character except an uppercase letter
• [^A-Za-z] matches any nonalphabetic character
Regular expressions (continued)
• An asterisk (*) matches zero or more occurrences of the preceding character or expression
• X* matches zero, one, two, three, ... capital X's
• XX* matches one or more capital X's
• .* matches zero or more occurrences of any character
• e.*e matches all the characters from the first e on the line to the last one
• [A-Za-z][A-Za-z]* matches any alphabetic character followed by zero or more alphabetic characters
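
To see how the asterisk behaves, it can help to replace whatever a pattern matches with a visible marker. This is a small sketch using sed substitution; the input strings and variable names are invented for the example.

```shell
# Each command replaces the first match on the line with a marker.
zero_or_more=$(printf '%s\n' 'aXXXb' | sed 's/X*/_/')    # X* matches the empty string at the start
one_or_more=$(printf '%s\n' 'aXXXb' | sed 's/XX*/_/')    # XX* needs at least one X
greedy=$(printf '%s\n' 'the extreme case' | sed 's/e.*e/+/')   # first e through last e
word=$(printf '%s\n' '123 hello world' | sed 's/[A-Za-z][A-Za-z]*/WORD/')

printf '%s\n' "$zero_or_more" "$one_or_more" "$greedy" "$word"
# prints:
# _aXXXb
# a_b
# th+
# 123 WORD world
```

The first result shows why XX* is usually what is wanted: X* is satisfied by zero X's, so it matches at the very start of the line without consuming anything.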
Regular expressions (continued)
• [-0-9] matches a single dash or digit character (order is important: the dash is literal only when it comes first or last)
• [0-9-] is the same as [-0-9]
• [^-0-9] matches any character except a digit or a dash
• []a-z] matches a right bracket or a lowercase letter (order is important: the right bracket is literal only when it comes first)
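
The bracket expressions above can be checked the same way. This sketch assumes made-up input strings and variable names; it is meant as an illustration, not a complete test of bracket syntax.

```shell
# [tT] accepts either case of the first letter.
the_count=$(printf '%s\n' 'the' 'The' 'She' | grep -c '^[tT]he')

# ^$ matches only the empty line.
empty_count=$(printf '%s\n' 'a' '' 'b' | grep -c '^$')

# [^-0-9] is anything except a dash or a digit; deleting it keeps the number.
digits=$(printf '%s\n' 'tel: 555-1234 ext' | sed 's/[^-0-9]//g')

# []a-z] is a right bracket or a lowercase letter.
masked=$(printf '%s\n' 'x[1]y' | sed 's/[]a-z]/_/g')

printf '%s\n' "$the_count" "$empty_count" "$digits" "$masked"
# prints:
# 2
# 1
# 555-1234
# _[1__
```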
Regular expressions (continued)
• \{min,max\} matches a precise number of occurrences of the preceding regular expression
• min specifies the minimum number of occurrences to be matched, and max specifies the maximum
• w\{1,10\} matches from 1 to 10 consecutive w's
• [a-zA-Z]\{7\} matches exactly seven alphabetic characters
Regular expressions (continued)
• X\{5,\} matches at least five consecutive X's
• \(...\) saves the characters matched by the enclosed expression
• ^\(.\) matches the first character on the line and stores it in register 1
• There are nine registers, numbered 1 to 9
• To retrieve what is stored in register n, use \n
• Example: ^\(.\)\1 matches the first two characters on a line if they are the same character
Regular expressions (continued)
• ^\(.\).*\1$ matches all lines in which the first character on the line is the same as the last. Note that .* matches all the characters in between
• ^\(...\)\(...\) stores the first three characters on the line in register 1 and the next three characters in register 2
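
Interval expressions and saved registers can be exercised with grep and sed, both of which accept this syntax. The sketch below uses invented sample lines and variable names.

```shell
# Made-up sample lines, used only for this sketch.
sample() {
  printf '%s\n' 'www' 'abcdefg' 'XXXXXX' 'aab' 'abca'
}

five_x=$(sample | grep 'X\{5,\}')            # at least five consecutive X's
seven=$(sample | grep '^[a-zA-Z]\{7\}$')     # lines of exactly seven letters
doubled=$(sample | grep '^\(.\)\1')          # first two characters are the same
same_ends=$(sample | grep '^\(.\).*\1$')     # first character equals the last

# Registers can also be used in a sed replacement: swap two 3-character groups.
swapped=$(printf '%s\n' 'abcdef' | sed 's/^\(...\)\(...\)/\2\1/')

printf '%s\n' "$five_x" "$seven" "$swapped"
# prints:
# XXXXXX
# abcdefg
# defabc
```

Here doubled holds the lines www, XXXXXX, and aab, and same_ends holds www, XXXXXX, and abca.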
cut
• Used to extract various fields of data from a data file or from the output of a command
    $ who
    bgeorge  pts/16  Oct  5 15:01 (216.87.102.204)
    abakshi  pts/13  Oct  6 19:48 (216.87.102.220)
    tphilip  pts/11  Oct  2 14:10 (AC8C6085.ipt.aol.com)
    $ who | cut -c1-8,18-
    bgeorge Oct  5 15:01 (216.87.102.204)
    abakshi Oct  6 19:48 (216.87.102.220)
    tphilip Oct  2 14:10 (AC8C6085.ipt.aol.com)
    $
• Format: cut -cchars file
• chars specifies which characters to extract from each line of file
cut (continued)
• Examples: -c5 (character 5), -c1,3,4 (characters 1, 3, and 4), -c10-15 (characters 10 through 15), -c5- (character 5 through the end of the line)
• The -d and -f options are used with cut when the data is delimited by a particular character
• Format: cut -ddchar -ffields file
• dchar: the character that delimits the fields (default: tab character)
• fields: the fields to be extracted from file
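
A quick way to get a feel for the -c character lists is to run them against a line whose column positions are obvious. This sketch uses the lowercase alphabet as input, which is an assumption made for the example.

```shell
line='abcdefghijklmnopqrstuvwxyz'

c5=$(printf '%s\n' "$line" | cut -c5)           # a single character
c134=$(printf '%s\n' "$line" | cut -c1,3,4)     # a list of characters
c10_15=$(printf '%s\n' "$line" | cut -c10-15)   # a range
c20_end=$(printf '%s\n' "$line" | cut -c20-)    # from character 20 to end of line

printf '%s\n' "$c5" "$c134" "$c10_15" "$c20_end"
# prints:
# e
# acd
# jklmno
# tuvwxyz
```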
cut (continued)
    $ cat /etc/passwd
    root:x:0:1:Super-User:/:/sbin/sh
    daemon:x:1:1::/:
    bin:x:2:2::/usr/bin:
    sys:x:3:3::/:
    adm:x:4:4:Admin:/var/adm:
    lp:x:71:8:Line Printer Admin:/usr/spool/lp:
    uucp:x:5:5:uucp Admin:/usr/lib/uucp:
    nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
    listen:x:37:4:Network Admin:/usr/net/nls:
    nobody:x:60001:60001:Nobody:/:
    noaccess:x:60002:60002:No Access User:/:
    oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh
    webuser:*:102:102:Web User:/export/home/webuser:/bin/csh
    abuzneid:x:103:100:Abdelshakour Abuzneid:/home/abuzneid:/sbin/csh
    $
cut (continued)
    $ cut -d: -f1 /etc/passwd
    root
    daemon
    bin
    sys
    adm
    lp
    uucp
    nuucp
    listen
    nobody
    oracle
    webuser
    abuzneid
    $
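cut (continued)
• With no -d option, cut expects the fields to be separated by tab characters, as they are in phonebook
    $ cat phonebook
    Edward  336-145
    Alice   334-121
    Sony    332-336
    Robert  326-056
    $ cut -f1 phonebook
    Edward
    Alice
    Sony
    Robert
    $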
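
Field extraction with -d and -f can be sketched without touching the real /etc/passwd by feeding cut a couple of lines in the same format. The two passwd-style lines come from the listing above; the passwd_sample function and the variable names are hypothetical.

```shell
# Two colon-delimited lines in /etc/passwd format, used only for this sketch.
passwd_sample() {
  printf '%s\n' 'root:x:0:1:Super-User:/:/sbin/sh' \
                'oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh'
}

names=$(passwd_sample | cut -d: -f1)          # first field only
name_shell=$(passwd_sample | cut -d: -f1,7)   # fields 1 and 7, still joined by ':'

# With no -d option the delimiter is a tab.
number=$(printf 'Edward\t336-145\n' | cut -f2)

printf '%s\n' "$names" "$name_shell" "$number"
# prints:
# root
# oracle
# root:/sbin/sh
# oracle:/bin/csh
# 336-145
```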
paste
• Joins corresponding lines from the specified files into single output lines
• Format: paste files
• The tab character is the default delimiter
paste (continued)
• Example:
    $ cat students
    Sue
    Vara
    Elvis
    Luis
    Eliza
    $ cat sid
    578426
    452869
    354896
    455468
    335123
    $ paste students sid
    Sue     578426
    Vara    452869
    Elvis   354896
    Luis    455468
    Eliza   335123
    $
paste (continued)
• The -s option tells paste to join the lines of a single file into one line, rather than joining corresponding lines from different files
• The -d option changes the delimiter
paste (continued)
• Examples:
    $ paste -d '+' students sid
    Sue+578426
    Vara+452869
    Elvis+354896
    Luis+455468
    Eliza+335123
    $ paste -s students
    Sue     Vara    Elvis   Luis    Eliza
    $ ls | paste -d ' ' -s -
    addr args list mail memo name nsmail phonebook programs roster sid students test tp twice user
    $
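
The paste examples can be reproduced in a scratch directory. This sketch assumes mktemp -d is available (it is widespread but not guaranteed everywhere) and uses shortened copies of the students and sid files.

```shell
# Scratch directory so the sketch does not touch existing files.
dir=$(mktemp -d)
printf '%s\n' Sue Vara Elvis > "$dir/students"
printf '%s\n' 578426 452869 354896 > "$dir/sid"

joined=$(paste -d '+' "$dir/students" "$dir/sid")   # line by line, '+' as delimiter
serial=$(paste -s -d ' ' "$dir/students")           # all lines of one file on one line

rm -rf "$dir"

printf '%s\n' "$joined" "$serial"
# prints:
# Sue+578426
# Vara+452869
# Elvis+354896
# Sue Vara Elvis
```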
sed
• sed (stream editor) is a program used for editing data
• Unlike ed, sed cannot be used interactively
• Format: sed command file
• command: applied to each line of the specified file
• file: if no file is specified, standard input is assumed
• sed writes its output to standard output
• The command s/Unix/UNIX/ is applied to every line in the file; on each line it replaces the first Unix with UNIX
sed (continued)
• sed makes no changes to the original input file
• The command 's/Unix/UNIX/g' is applied to every line in the file. It replaces every Unix with UNIX; g means global
• With the -n option, only selected lines are printed
• Example: sed -n '1,2p' file prints the first two lines
• Example: sed -n '/UNIX/p' file prints any line containing UNIX
sed (continued)
• Example: sed '1,2d' file deletes lines 1 and 2 from the output
• Example: sed -n 'l' text prints all lines from text, showing nonprinting characters as \nn and tab characters as ">" (POSIX versions of sed show octal values as \nnn and tabs as \t)
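
The sed commands above can be compared side by side on a few lines of input. This is a minimal sketch; the sample text, the text function, and the variable names are made up for illustration.

```shell
# Made-up input, used only for this sketch.
text() {
  printf '%s\n' 'Unix and Unix' 'plain line' 'UNIX line'
}

first_only=$(text | sed 's/Unix/UNIX/' | sed -n '1p')    # first Unix on the line
every_one=$(text | sed 's/Unix/UNIX/g' | sed -n '1p')    # every Unix on the line
first_two=$(text | sed -n '1,2p')                        # print lines 1 and 2 only
matching=$(text | sed -n '/UNIX/p')                      # print lines containing UNIX
remaining=$(text | sed '1,2d')                           # everything except lines 1 and 2

printf '%s\n' "$first_only" "$every_one" "$matching" "$remaining"
# prints:
# UNIX and Unix
# UNIX and UNIX
# UNIX line
# UNIX line
```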
tr
• The tr filter is used to translate characters from standard input
• Format: tr from-chars to-chars
• The result is written to standard output
• Example: tr e x < file translates every "e" in file to "x" and writes the result to standard output
• The octal representation of a character can be given to tr in the format \nnn
• Example: tr : '\11' translates every colon to a tab
tr (continued)
• Example: tr '[a-z]' '[A-Z]' < file translates all lowercase letters in file to their uppercase equivalents. The character ranges [a-z] and [A-Z] are enclosed in quotes to keep the shell from treating them as filename patterns and replacing them with the names of matching single-character files
• To "squeeze" out multiple occurrences of characters, the -s option is used
tr (continued)
• Example: tr -s ' ' ' ' < file squeezes multiple spaces into one space
• The -d option is used to delete single characters from a stream of input
• Format: tr -d from-chars
• Example: tr -d ' ' < file deletes all spaces from the input stream
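
The tr forms described above are easy to try on short strings. The inputs and variable names in this sketch are invented for the example.

```shell
upper=$(printf '%s\n' 'hello world' | tr '[a-z]' '[A-Z]')              # lowercase to uppercase
tabbed=$(printf '%s\n' 'a:b:c' | tr : '\11')                           # colons to tabs (octal 11)
squeezed=$(printf '%s\n' 'too    many   spaces' | tr -s ' ' ' ')       # runs of spaces to one space
deleted=$(printf '%s\n' 'a b c' | tr -d ' ')                           # remove every space

printf '%s\n' "$upper" "$squeezed" "$deleted"
# prints:
# HELLO WORLD
# too many spaces
# abc
```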
grep
• Searches one or more files for a particular character pattern
• Format: grep pattern files
• Example: grep path .cshrc prints every line in the .cshrc file that contains the pattern "path"
• Example: grep bin .cshrc .login .profile prints every line from any of the three files .cshrc, .login, and .profile that contains the pattern "bin"
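grep (continued)
• Example: grep * smarts does not work as intended, because the shell replaces * with the names of all files in the current directory before grep runs
• Example: grep '*' smarts works: the quotes keep the shell from expanding *, so grep receives the two arguments * and smarts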
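
The difference between the quoted and unquoted asterisk can be made visible by letting echo show the arguments the shell would hand to grep. This sketch assumes mktemp -d is available; the scratch directory, the extra file named args, and the contents of smarts are all made up for the example.

```shell
# Scratch directory with two files: smarts and args.
dir=$(mktemp -d)
printf '%s\n' 'no star here' 'a * star' > "$dir/smarts"
: > "$dir/args"

# echo shows the command line after the shell has processed it.
unquoted=$(cd "$dir" && echo grep * smarts)     # * is replaced by the file names
quoted=$(cd "$dir" && echo grep '*' smarts)     # * reaches grep unchanged
found=$(cd "$dir" && grep '*' smarts)           # lines that contain an asterisk

rm -rf "$dir"

printf '%s\n' "$unquoted" "$quoted" "$found"
# prints:
# grep args smarts smarts
# grep * smarts
# a * star
```

In the unquoted case grep would search the file smarts for the pattern args, which is not what was asked for.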
sort
• By default, sort takes the lines of the specified input file and sorts them into ascending order
    $ cat students
    Sue
    Vara
    Elvis
    Luis
    Eliza
    $ sort students
    Eliza
    Elvis
    Luis
    Sue
    Vara
    $
sort (continued)
• The -u option tells sort to eliminate duplicate lines from the output
sort (continued)
    $ echo Ash >> students
    $ echo Ash >> students
    $ cat students
    Sue
    Vara
    Elvis
    Luis
    Eliza
    Ash
    Ash
    $ sort students
    Ash
    Ash
    Eliza
    Elvis
    Luis
    Sue
    Vara
    $
sort (continued)
• The -r option reverses the order of the sort
• The -o option directs the output to a file instead of standard output
• sort students > sorted_students works the same as sort students -o sorted_students
• The -o option also allows a file to be sorted and the output saved back to the same file
• Example: sort students -o students is correct; sort students > students is incorrect, because the shell empties students before sort reads it
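
The -u, -r, and -o options can be tried on a small copy of the students file. This sketch assumes mktemp -d is available and sets LC_ALL=C so that the ordering does not depend on the local collation rules; both are choices made for the example.

```shell
# Scratch copy of a students file that contains a duplicate line.
dir=$(mktemp -d)
printf '%s\n' Sue Vara Elvis Ash Ash > "$dir/students"

no_dups=$(LC_ALL=C sort -u "$dir/students")     # duplicates removed
reversed=$(LC_ALL=C sort -r "$dir/students")    # descending order

# -o makes it safe to write the result back to the input file.
LC_ALL=C sort -o "$dir/students" "$dir/students"
in_place=$(cat "$dir/students")

rm -rf "$dir"

printf '%s\n' "$no_dups"
# prints:
# Ash
# Elvis
# Sue
# Vara
```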
sort (continued)
• The -n option tells sort to treat the first field on each line as a number and to sort the data arithmetically
sort (continued)
    $ cat data
    -10 11
    15 2
    -9 -3
    2 13
    20 22
    3 1
    $ sort data
    -10 11
    -9 -3
    15 2
    2 13
    20 22
    3 1
    $
sort (continued)
    $ sort -n data
    -10 11
    -9 -3
    2 13
    3 1
    15 2
    20 22
    $ sort +1n data
    -9 -3
    3 1
    15 2
    -10 11
    2 13
    20 22
    $
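sort (continued)
• To sort by the second field, +1n should be used instead of -n. +1 says to skip the first field
• +5n would mean to skip the first five fields on each line and then sort the data numerically
• The +n form is obsolete; current versions of sort express the same thing with the -k option, for example -k2n in place of +1n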
sort (continued)
• Example: the -t option sets the field delimiter, so this command sorts /etc/passwd numerically by its third field (the user ID)
    $ sort -t: +2n /etc/passwd
    root:x:0:1:Super-User:/:/sbin/sh
    daemon:x:1:1::/:
    bin:x:2:2::/usr/bin:
    sys:x:3:3::/:
    adm:x:4:4:Admin:/var/adm:
    uucp:x:5:5:uucp Admin:/usr/lib/uucp:
    nuucp:x:9:9:uucp Admin:/var/spool/uucppublic:/usr/lib/uucp/uucico
    listen:x:37:4:Network Admin:/usr/net/nls:
    lp:x:71:8:Line Printer Admin:/usr/spool/lp:
    oracle:*:101:67:DBA Account:/export/home/oracle:/bin/csh
    webuser:*:102:102:Web User:/export/home/webuser:/bin/csh
    nobody:x:60001:60001:Nobody:/:
    $
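
Numeric and field-based sorting can be sketched with the -k option, which is how current versions of sort select a key field. The data lines and the passwd-style lines are taken from the examples above; the function and variable names are hypothetical, and LC_ALL=C is set only to make the plain (character) sort predictable.

```shell
# Same two-column data as in the slides.
data() {
  printf '%s\n' '-10 11' '15 2' '-9 -3' '2 13' '20 22' '3 1'
}

as_text=$(data | LC_ALL=C sort)          # character order: 15 sorts before 2
by_first=$(data | sort -n)               # numeric order on the first field
by_second=$(data | sort -k 2,2n)         # numeric order on the second field

# Colon-delimited input: sort on the third field (user ID).
by_uid=$(printf '%s\n' 'lp:x:71:8:Line Printer Admin:/usr/spool/lp:' \
                       'root:x:0:1:Super-User:/:/sbin/sh' \
                       'adm:x:4:4:Admin:/var/adm:' | sort -t: -k 3,3n | cut -d: -f1)

printf '%s\n' "$by_second" "$by_uid"
# prints:
# -9 -3
# 3 1
# 15 2
# -10 11
# 2 13
# 20 22
# root
# adm
# lp
```

Writing the key as 2,2 or 3,3 limits the comparison to that one field instead of letting it run to the end of the line.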
uniq
• Used to find duplicate lines in a file
• Format: uniq in_file out_file
• uniq copies in_file to out_file, removing any duplicate lines in the process
• uniq defines duplicate lines as consecutive lines that match exactly
uniq (continued)
• The -d option lists only the duplicated lines
• Example:
    $ cat students
    Sue
    Vara
    Elvis
    Luis
    Eliza
    Ash
    Ash
    $ uniq students
    Sue
    Vara
    Elvis
    Luis
    Eliza
    Ash
    $
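
Because uniq only compares neighboring lines, duplicates that are not adjacent survive unless the input is sorted first. The following sketch illustrates this with a made-up list in which Sue appears twice but not consecutively; LC_ALL=C is set only to fix the sort order for the example.

```shell
# Made-up list: Ash is duplicated on adjacent lines, Sue on non-adjacent lines.
students() {
  printf '%s\n' Sue Vara Elvis Ash Ash Sue
}

adjacent_only=$(students | uniq)                  # the second Sue is kept
dups=$(students | uniq -d)                        # only adjacent duplicates are reported
all_unique=$(students | LC_ALL=C sort | uniq)     # sorting first brings duplicates together

printf '%s\n' "$adjacent_only" "$dups"
# prints:
# Sue
# Vara
# Elvis
# Ash
# Sue
# Ash
```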