Filters and Utilities
Notes: • This is a simple overview of the filtering capability • Some of these commands are very powerful • Only showing some of the basics of a few of the commands
Reminder: • Grave accent (`) • AKA backtick or backquote • Used for command substitution in bash and other Linux shells and languages • Typical use: • Put a command between a pair of ` • The std out of the command is substituted in its place • Example: • # echo The date is:`date`! • Output: The date is:Sun Mar 17 15:51:28 EDT 2013!
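A minimal sketch of command substitution, assuming a POSIX shell. It uses `basename` rather than `date` so the result is predictable; the variable names are illustrative only. The `$(...)` form is an equivalent spelling of the backticks.

```shell
# Backtick form, as described on the slide: the command's std out
# replaces the backticked text.
name=`basename /usr/local/bin`

# Equivalent $(...) form, which is easier to nest and to read.
same=$(basename /usr/local/bin)

echo "Last path component: $name"
```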
What are Filters? • Use std in and std out • Monitor the input • Modify data as appropriate • Change • Delete • Move • "as appropriate" • Send data to standard out
Filter examples • Simple • pr • cmp • diff • comm • head • tail • cut • paste • sort • uniq • tr • Complex • grep • sed • Filter/script • awk
pr: Paginate Files • Prepare files for printing • Adds: • Headers • Footers • Formatted text • Default adds 5 lines before and after text on page • Options: • Make columns • Set page length • Set page width • Number lines in output
cmp: Byte by Byte Compare • Compares two files • Terminates on first delta • Echoes the location of the first mismatch • Reports the byte offset and line number • Returns: • True (exit status 0) if identical • False (non-zero exit status) otherwise
comm: What Is Common between files • Compares files line by line • Requires sorted files to work properly • Returns 3 types of differently indented lines • Lines unique to first file • Lines unique to second file • Lines common to both • Output is “weird”: it comes in three columns • 1st col is lines unique to 1st file • 2nd col is lines unique to 2nd file • 3rd col is common lines • Examples: • comm.sh in ~/ITIS3110/bashscripts • commbad.sh (with error)
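A small sketch of the three comm columns, using two made-up sorted files created in a temporary directory (the file names and contents are assumptions for illustration). The options -1, -2 and -3 suppress the corresponding column, which is often easier to read than the indented default.

```shell
export LC_ALL=C                      # make the sort order predictable
tmp=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$tmp/a"
printf 'banana\ncherry\ndate\n'  > "$tmp/b"

comm "$tmp/a" "$tmp/b"               # default: three tab-indented columns

only_a=$(comm -23 "$tmp/a" "$tmp/b") # lines unique to the first file
only_b=$(comm -13 "$tmp/a" "$tmp/b") # lines unique to the second file
both=$(comm -12 "$tmp/a" "$tmp/b")   # lines common to both files
rm -r "$tmp"
```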
diff: "How to make files the same" • Details how to change one file to make it the same as the other • For each delta, gives instructions for how to change the first file
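As a rough illustration of cmp and diff together, the sketch below compares two made-up three-line files. The file names are hypothetical; `cmp -s` is the silent form that only sets the exit status.

```shell
tmp=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmp/old"
printf 'one\n2\nthree\n'   > "$tmp/new"

# cmp -s prints nothing; exit status 0 means the files are identical.
if cmp -s "$tmp/old" "$tmp/new"; then same=yes; else same=no; fi

# diff exits with status 1 when the files differ, so guard it with || true
# in case the script runs under set -e.
changes=$(diff "$tmp/old" "$tmp/new" || true)
echo "$changes"                      # "2c2" means: change line 2 into line 2
rm -r "$tmp"
```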
head: Display beginning of file • Show the first n lines of a file • Default is 10 • Can change with -n x • Example use: • Want to re-edit the last file you edited: • nano `ls -t | head -n 1` • ls -t: list by modification time, newest first • head -n 1: keep only the first entry • Feed as a parameter to nano with the backticks
tail: Display end of file • Show the last n lines of a file • Default is 10 • Can change with -n x • Options • -f • Monitor the file as it grows • Must terminate with <ctrl-C> • -c n • Show the last n chars instead of lines
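The head and tail options above can be tried on a generated 20-line file. This is only a sketch: the file name `lines.txt` and its contents are assumptions made for the example.

```shell
tmp=$(mktemp -d)
i=1
while [ "$i" -le 20 ]; do            # build a 20-line sample file
  echo "line $i"
  i=$((i + 1))
done > "$tmp/lines.txt"

first=$(head -n 3 "$tmp/lines.txt")  # first three lines
last=$(tail -n 2 "$tmp/lines.txt")   # last two lines
chars=$(tail -c 3 "$tmp/lines.txt")  # last three bytes: "20" plus the newline
rm -r "$tmp"
```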
cut: Splitting a file vertically • Cuts a range out by: • Columns • Good for fixed length entries • -c range • -c1-4 selects characters 1 through 4 • Fields • Good for delimited entries • Tab is default • -d specifies delimiter • -d/ sets / as the delimiter • -f specifies the fields to use • -f1,4 specifies the first and fourth fields
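A short sketch of both cut modes on a single colon-delimited line. The sample line is made up in the style of an /etc/passwd entry.

```shell
line='root:x:0:0:root:/root:/bin/bash'

# Field mode: colon as the delimiter, keep fields 1 and 7.
fields=$(printf '%s\n' "$line" | cut -d: -f1,7)

# Column mode: keep characters 1 through 4.
chars=$(printf '%s\n' "$line" | cut -c1-4)

echo "$fields"
echo "$chars"
```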
paste: Paste files vertically • Paste two files together line by line • Can be used on a single file to join multiple sequential lines together • -s • Do serial on a single file • -d • Separate joined elements with the listed delimiters
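The two paste modes can be sketched as follows, assuming two small made-up files (`names` and `ages` are hypothetical names).

```shell
tmp=$(mktemp -d)
printf 'alice\nbob\n' > "$tmp/names"
printf '30\n25\n'     > "$tmp/ages"

# Line-by-line: line 1 of each file joined, then line 2, using : as delimiter.
side_by_side=$(paste -d: "$tmp/names" "$tmp/ages")

# Serial: all lines of one file joined into a single line, using , as delimiter.
serial=$(paste -s -d, "$tmp/names")
rm -r "$tmp"
```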
sort: Order files • Put files in order • Default is ascending order on the whole line, starting at column 1 • ASCII order • Options: • -t • Define a delimiter • -k • Used with -t, which field to use • Can have multiple keys • Use commas to separate ranges • Use -k again to denote a new field • Can sort on columns in a field • Use a dot to separate • -n • Treat a field as a number, not as ASCII characters • Remember the number 10 sorts after 9, but the string "10" sorts before "9" • -u • Remove repeated lines
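To make the difference between ASCII and numeric ordering concrete, here is a sketch that sorts three made-up name:number records in three ways. `LC_ALL=C` is set so the ASCII ordering is the same on every system.

```shell
export LC_ALL=C
data='carol:30
alice:9
bob:100'

by_line=$(printf '%s\n' "$data" | sort)              # whole line, ASCII order
by_text=$(printf '%s\n' "$data" | sort -t: -k2,2)    # field 2 as text
by_num=$(printf '%s\n' "$data" | sort -t: -k2,2n)    # field 2 as a number

echo "$by_num"
```

With field 2 compared as text, "100" comes before "30" and "9"; with -n the order follows the numeric values.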
uniq: Locating identical lines • Collapses repeated adjacent lines, so the input is usually sorted first • Options: • -u • Return only the non-repeated lines • -d • Return only the repeated lines • But only one copy of each • -c • Prefix each line with the count of how many times it is repeated
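A brief sketch of the uniq options on made-up input. The data is piped through sort first because uniq only compares neighbouring lines.

```shell
export LC_ALL=C
data='b
a
b
c
a
b'

sorted=$(printf '%s\n' "$data" | sort)

collapsed=$(printf '%s\n' "$sorted" | uniq)    # one copy of every line
repeated=$(printf '%s\n' "$sorted" | uniq -d)  # lines that occur more than once
singles=$(printf '%s\n' "$sorted" | uniq -u)   # lines that occur exactly once

printf '%s\n' "$sorted" | uniq -c              # counts; column spacing varies by system
```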
tr: Translate characters • Changes one set of characters to another, default input is the standard input • Example: • # tr 'ab' 'cd' • in: This is abnormal • out: This is cdnormcl • in: absolute • out: cdsolute • in: ab a b c • out: cd c d c • ^C • Each "in" line is typed on std in; each "out" line is what tr writes to std out • Note: a → c and b → d, not ab → cd • Note: ^D can be used to signal end of file to tr instead of the ^C shown, which kills the tr process
tr: Translate characters • More examples: • Can be used to translate case for a file • tr a-z A-Z <file1 • or • tr '[a-z]' '[A-Z]' <file1 • Takes the input from file1 with the < redirection • Turns all lower case letters to upper case • Output goes to std out • Get rid of characters • tr -d 'a-z' <file1 • Gets rid of all lower case chars from file1 • Quote the set so the shell does not expand it as a filename pattern • Again output is std out • Compressing repeated chars • tr -s ' ' <file1 • Changes repeated spaces to a single space
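The tr forms from these two slides can be sketched on short strings fed through a pipe instead of a file. The sample strings are made up, apart from "This is abnormal", which comes from the slide.

```shell
# Character-by-character mapping: a becomes c, b becomes d.
mapped=$(printf 'This is abnormal\n' | tr 'ab' 'cd')

# Case translation with ranges.
upper=$(printf 'This is abnormal\n' | tr 'a-z' 'A-Z')

# -d deletes every character in the set.
deleted=$(printf 'abc123\n' | tr -d 'a-z')

# -s squeezes runs of the listed character down to one.
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

echo "$mapped"
```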
Regular Expression • A pattern to match strings of text which is: • Concise • Flexible • Used by many programming languages and operating systems
Regular Expressions • BRE • Basic Regular Expression • ERE • Extended Regular Expression • IRE • Interval Regular Expression • TRE • Tagged Regular Expression
Character class • Set of characters enclosed within square brackets [ ] • Can be a list of single characters • [aD1] • a, D, and the character 1 only • Can be a range of characters • [a-zA-Z] • All the upper and lower case chars • Negate a class • [^0-9] • Not the numeric chars 0-9
Regular Expressions • * • Refers to the immediately preceding character • Any number of repeated character(s) • 0 or more • Used with other patterns • A* • Matches 0 or more ‘A’s in a row • Inside square brackets, as in [A*], the * is a literal character • s*print will match sprint, ssprint, sssprint and print! • Note: this is not related to the familiar shell wildcard *
Regular Expressions • . • Any character • Exactly one • S... will match Sort, Sxxx, … • Any four char string starting with S • Note .* means 0 or more of any character • Pattern anchors • ^ • Pattern starts at the beginning of a line • $ • Pattern ends at the end of a line
Extended Regular Expressions • | • Either one of a set • a|b • Matches if an a or a b • Inside square brackets, as in [a|b], the | is a literal character • ( and ) • Group the alternatives so they combine with what is before or after • 'animaltype:(dog|cat)' • Looks for animaltype:dog or animaltype:cat
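The regular expression pieces above can be tried with grep. This is a sketch on a made-up word list; `grep -E` selects extended regular expressions and `-c` prints the number of matching lines instead of the lines themselves.

```shell
words='print
sprint
ssprint
animaltype:dog
animaltype:cat
animaltype:bird'

# s* allows zero or more s characters before "print".
star=$(printf '%s\n' "$words" | grep -c 's*print')

# ^ anchors the pattern to the start of the line.
anchored=$(printf '%s\n' "$words" | grep -c '^s')

# Grouped alternation needs extended regular expressions (-E).
pets=$(printf '%s\n' "$words" | grep -E 'animaltype:(dog|cat)')

echo "$pets"
```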
grep – Search a pattern • Searches for a pattern in a file • grep options pattern filename(s) • std in is used if there is no filename • Can also pipe data to grep • Notes: • Pattern does not need to be quoted if it contains no delimiters or special chars • Can always use quotes to be safe
grep - Options • -i • Ignore case • -v • Don’t display lines matching expression • -l • Display only the names of files that contain a match • Useful when grepping multiple files • -e • Marks the next argument as the pattern • Useful when grepping for a pattern that starts with - • -x • Match entire line • -f file • Takes expressions from a file
grep - examples • # cat bigfile2 • Output: file 1 text / file 1 text / file 3 text / file 1 text / file 1 text / file 1 text • Examples: • # grep 3 bigfile2 • Output: file 3 text • # grep file bigfile2 • Output: file 1 text / file 1 text / file 3 text / file 1 text / file 1 text / file 1 text
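A sketch that recreates bigfile2 in a temporary directory and exercises several of the options listed earlier. The `-c` option (count matching lines) is not on the slides; it is used here only to keep the results compact.

```shell
tmp=$(mktemp -d)
printf 'file 1 text\nfile 1 text\nfile 3 text\nfile 1 text\nfile 1 text\nfile 1 text\n' \
  > "$tmp/bigfile2"

three=$(grep 3 "$tmp/bigfile2")                  # lines containing 3
not_one=$(grep -v 1 "$tmp/bigfile2")             # lines NOT containing 1
nocase=$(grep -i 'FILE 3' "$tmp/bigfile2")       # case-insensitive match
whole=$(grep -cx 'file 3 text' "$tmp/bigfile2")  # count of exact whole-line matches
total=$(grep -c file "$tmp/bigfile2")            # count of lines containing "file"
listed=$(cd "$tmp" && grep -l 3 bigfile2)        # name of each file with a match
rm -r "$tmp"
```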
sed – Stream Editor • Edit a file(s) with a specified action • sed options 'address action' file(s) • Basics: • Takes input from the file(s) • Performs the action on each input line • Sends output to std out; the input file itself is not changed • Uses: • Select part(s) of a file • By line • By content • Edit a file • e.g. create a template, then use sed to customize for a run • Oddities • Usually need -n with p, because sed echoes every input line by default and printed lines would otherwise appear twice
sed – Line addressing • # cat tenline.file • Output: Line 1 / Line 2 / Line 3 / Line 4 / Line 5 / Line 6 / Line 7 / Line 8 / Line 9 / Last Line • Select specific lines • # sed '3q' tenline.file • Output: Line 1 / Line 2 / Line 3 • Prints the first 3 lines, then quits • # sed -n '$p' tenline.file • Output: Last Line • Prints the last line • $: last line • p: print • # sed -n '5,7p' tenline.file • Output: Line 5 / Line 6 / Line 7 • Prints lines 5 through 7
sed – Line addressing • Select specific lines with ; • # sed -n '1p;3p;$p' tenline.file • Output: Line 1 / Line 3 / Last Line • Prints lines 1, 3 and the last line ($) • ! will negate an address • # sed -n '3,$!p' tenline.file • Output: Line 1 / Line 2 • Prints every line except line 3 through the end • Notes: • By default sed will echo the input lines as well as the selected lines • Without -n you get duplicated lines • Use -n to not echo the input lines
sed – Context addressing • Use a pattern to identify lines to work with • Use / to delimit the pattern • Examples • # sed -n '/2/p' tenline.file • Output: Line 2 • Finds all lines with 2 in them and prints them • # sed -n '/^2/p' tenline.file • Finds all lines that start with 2 and prints them (none in tenline.file) • ^: start of the line
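The line and context addressing examples can be checked with a script along these lines. It is a sketch that rebuilds tenline.file in a temporary directory, and it also counts the output of `sed '$p'` without -n to show the duplicated last line.

```shell
tmp=$(mktemp -d)
f="$tmp/tenline.file"
{
  i=1
  while [ "$i" -le 9 ]; do echo "Line $i"; i=$((i + 1)); done
  echo "Last Line"
} > "$f"

first3=$(sed '3q' "$f")            # print lines 1-3, then quit
last=$(sed -n '$p' "$f")           # only the last line
middle=$(sed -n '5,7p' "$f")       # lines 5 through 7
picks=$(sed -n '1p;3p;$p' "$f")    # lines 1, 3 and the last
head2=$(sed -n '3,$!p' "$f")       # everything except line 3 to the end
twos=$(sed -n '/2/p' "$f")         # lines containing a 2

# Without -n every line is echoed and the last line is printed twice: 11 lines.
dup_count=$(sed '$p' "$f" | wc -l | tr -d ' ')
rm -r "$tmp"
```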
sed – Writing selected lines to a file • Can use w to write the selected lines to a file • Example • sed -n '/2/w twos.file' tenline.file • w instead of p puts the selected lines in a file • -n keeps sed from also echoing every input line to std out
sed – Text editing • Can edit the stream • i • Insert • a • Append • c • Change • d • Delete • s • Substitute
sed - editing • Example: inserting • The command is typed over several lines; the leading > is the shell's continuation prompt, not part of the command:
# sed '1i\
> #!/bin/bash\
> # using the bash shell
> ' test.sh > $$
• Notes: • 1i inserts text before line 1 • Need \ as a continuation character within the quotes • Input is the code or text in test.sh • Redirecting the output to $$ (a file named after the shell's process ID, used as a temporary file) • Ends up with the 2 new lines at the beginning in $$ • Can further modify $$
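The insert example can be sketched as a script. This version assumes a one-line test.sh created for the occasion and writes to a file called out.sh rather than $$, so the result is easy to inspect; both names are illustrative.

```shell
tmp=$(mktemp -d)
printf 'echo hello\n' > "$tmp/test.sh"

# 1i\ inserts the following text before line 1; a trailing backslash
# continues the inserted text onto the next line.
sed '1i\
#!/bin/bash\
# using the bash shell
' "$tmp/test.sh" > "$tmp/out.sh"

line1=$(sed -n '1p' "$tmp/out.sh")
line2=$(sed -n '2p' "$tmp/out.sh")
line3=$(sed -n '3p' "$tmp/out.sh")
rm -r "$tmp"
```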
sed - editing • Use s to indicate substitution • Example: substituting • sed 's/a/b/' file • Replaces a with b for the first instance on each line • sed 's/a/b/g' file • g (global) replaces a with b for all instances on each line
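A quick sketch of the difference the g flag makes, using the made-up input word "banana" piped in on std in instead of a file.

```shell
# Without g, only the first a on each line is replaced.
first=$(printf 'banana\n' | sed 's/a/b/')

# With g, every a on each line is replaced.
all=$(printf 'banana\n' | sed 's/a/b/g')

echo "$first $all"
```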