
Filters and Utilities


brygid

Presentation Transcript


  1. Filters and Utilities

  2. Notes: • This is a simple overview of the filtering capability • Some of these commands are very powerful • Only showing some of the basics of a few of the commands

  3. Reminder: • Grave accent • AKA backtick or backquote • Used for command substitution in bash and other Linux utilities and languages • Typical use: • Put a command between a pair of ` • The std out of the command is substituted • Example: • # echo The date is:`date`! • The date is:Sun Mar 17 15:51:28 EDT 2013!
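
The substitution described above can be tried with a short sketch. It assumes a POSIX-style shell; `date +%Y` and the sample path are illustrative choices, not taken from the slides. The `$( )` form is an equivalent spelling of the backtick form that is easier to nest.

```shell
# Both forms capture the std out of a command; date +%Y is just a sample command.
year_backtick=`date +%Y`
year_dollar=$(date +%Y)
echo "The year is:${year_backtick}!"

# The $( ) form nests without escaping; the path is an arbitrary example.
parent=$(basename "$(dirname /usr/local/bin)")
echo "Parent directory name: $parent"
```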

  4. What are Filters? • Use std in and std out • Monitor the input • Modify data as appropriate • Changes • Deletes • Moves • "as appropriate" • Send data to output

  5. Filter examples • Simple • pr • cmp • diff • comm • head • tail • cut • paste • sort • uniq • tr • Complex • grep • sed • Filter/script • awk

  6. pr: Paginate Files • Prepare files for printing • Adds: • Headers • Footers • Formatted text • Default adds 5 lines before and after text on page • Options: • Make columns • Set page length • Set page width • Number lines in output

  7. cmp: Byte by Byte Compare • Compares two files • Terminates on first delta • Echoes the location of first mismatch • Reports the byte offset and line number • Returns: • True (exit status 0) if identical • False (nonzero exit status) otherwise
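
A minimal sketch of using the cmp return value in a script. The file names and contents are made up for illustration; `-s` (silent) suppresses the mismatch message so only the exit status is used.

```shell
workdir=$(mktemp -d)
printf 'alpha\nbeta\n' > "$workdir/a.txt"
printf 'alpha\nbeta\n' > "$workdir/b.txt"
printf 'alpha\nbeXa\n' > "$workdir/c.txt"

# The exit status drives the if: 0 (true) means the files are identical.
if cmp -s "$workdir/a.txt" "$workdir/b.txt"; then same_ab=yes; else same_ab=no; fi
if cmp -s "$workdir/a.txt" "$workdir/c.txt"; then same_ac=yes; else same_ac=no; fi
echo "a vs b identical: $same_ab"
echo "a vs c identical: $same_ac"

# Without -s, cmp reports where the first mismatch is.
# "|| true" keeps a nonzero exit status from aborting a script run with set -e.
first_diff=$(cmp "$workdir/a.txt" "$workdir/c.txt" || true)
echo "$first_diff"

rm -r "$workdir"
```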

  8. comm: What Is Common between files • Compares files line by line • Requires sorted files to work properly • Returns 3 types of differently indented lines • Lines unique to first file • Lines unique to second file • Lines common to both • Output is “weird”, in columns: • 1st col is lines unique to 1st file • 2nd col is lines unique to 2nd file • 3rd col is common lines • Example: comm.sh (with error) in ~/Desktop/bashscripts
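
The three columns are easier to see on a small example. This sketch uses two made-up, already sorted files; the `-1`, `-2` and `-3` options each suppress one column, so combining two of them leaves a single column.

```shell
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$workdir/first.txt"
printf 'banana\ncherry\ndate\n'  > "$workdir/second.txt"

# Full three-column output
comm "$workdir/first.txt" "$workdir/second.txt"

# Suppress columns to keep just one kind of line
only_first=$(comm -23 "$workdir/first.txt" "$workdir/second.txt")
only_second=$(comm -13 "$workdir/first.txt" "$workdir/second.txt")
common=$(comm -12 "$workdir/first.txt" "$workdir/second.txt")
echo "Only in first:  $only_first"
echo "Only in second: $only_second"
echo "In both:"
echo "$common"

rm -r "$workdir"
```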

  9. diff: "How to make files the same" • Details how to change one file to make it the same as the other • For each delta, gives instructions for how to change the file
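
As a sketch of what those instructions look like, the following compares two made-up three-line files. In the default output format, `2c2` reads as "change line 2 of the first file into line 2 of the second file".

```shell
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/old.txt"
printf 'one\n2\nthree\n'   > "$workdir/new.txt"

# diff exits nonzero when the files differ, hence the "|| true".
changes=$(diff "$workdir/old.txt" "$workdir/new.txt" || true)
echo "$changes"

# The first output line is the instruction itself.
instruction=$(printf '%s\n' "$changes" | head -n 1)

rm -r "$workdir"
```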

  10. Resume 1/24

  11. head: Display beginning of file • Show the first n lines of a file • Default is 10 • Can change with -n x • Example use: • Want to re-edit the last file you edited: • nano `ls -t | head -n 1` • ls -t: list by time, newest first • head -n 1: list first entry • Feed as a parameter to nano with the backticks

  12. tail: Display end of file • Show the last n lines of a file • Default is 10 • Can change with -n x • Options • -f • Monitor the file as it grows • Must terminate with <ctrl-C> • -c • Do the last n chars instead of lines
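
A small sketch of head and tail on the same file. The 12-line sample file is made up so that the default of 10 lines is visible.

```shell
workdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 > "$workdir/nums.txt"

first_three=$(head -n 3 "$workdir/nums.txt")   # lines 1 to 3
last_two=$(tail -n 2 "$workdir/nums.txt")      # lines 11 and 12
# With no -n, head shows 10 lines; $(( )) strips any padding from wc.
default_count=$(( $(head "$workdir/nums.txt" | wc -l) ))

echo "$first_three"
echo "$last_two"
echo "head default line count: $default_count"

rm -r "$workdir"
```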

  13. cut: Splitting a file vertically • Cuts a range out by: • Columns • Good for fixed length entries • -c range • -c1-4 • Fields • Good for delimited entries • Tab is default • -d specifies delimiter • -d/ set the / as the delimiter • -f specifies the fields to use • -f1,4 specifies the first and fourth fields
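
A sketch of both cutting modes on one line of text. The colon-delimited record is a made-up sample in the style of /etc/passwd, so the delimiter here is `:` rather than the `/` used on the slide.

```shell
record='root:x:0:0:root:/root:/bin/bash'

# By columns: characters 1 through 4
first_four=$(printf '%s\n' "$record" | cut -c1-4)

# By fields: colon as the delimiter, first and fourth fields
name_and_gid=$(printf '%s\n' "$record" | cut -d: -f1,4)

echo "$first_four"
echo "$name_and_gid"
```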

  14. paste: Paste files vertically • Paste two files together line by line • Can be used on a single file to join multiple sequential lines together • -s • Do serial on a single file • -d • Separate joined element with the list of delimiters
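
A sketch of the two paste modes described above, using two made-up three-line files. The delimiters `:` and `,` are arbitrary choices.

```shell
workdir=$(mktemp -d)
printf 'a\nb\nc\n' > "$workdir/letters.txt"
printf '1\n2\n3\n' > "$workdir/numbers.txt"

# Two files, joined line by line with ":" between the elements
side_by_side=$(paste -d: "$workdir/letters.txt" "$workdir/numbers.txt")

# One file, sequential lines joined into a single line with ","
serial=$(paste -s -d, "$workdir/letters.txt")

echo "$side_by_side"
echo "$serial"

rm -r "$workdir"
```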

  15. sort: Order files • Put files in order • Default is ascending order, comparing whole lines from column 1 • ASCII order • Options: • -t • Define a delimiter • -k • Used with -t, which field to use • Can have multiple keys • Use commas to separate ranges • Use -k again to denote a new field • Can sort on columns in a field • Use a dot to separate • -n • Treat a field as a number, not as ASCII characters • Remember the number 1 is different than the character "1" • -u • Remove repeated lines
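
The difference between character order and numeric order shows up clearly on a small example. This sketch sorts made-up `name:score` records on the second field, first as text and then with `-n`.

```shell
records=$(printf '%s\n' 'carol:30' 'alice:100' 'bob:9')

# Field 2 compared as characters: "100" sorts before "30", which sorts before "9"
ascii_order=$(printf '%s\n' "$records" | sort -t: -k2,2)

# Field 2 compared as numbers: 9, 30, 100
numeric_order=$(printf '%s\n' "$records" | sort -t: -k2,2 -n)

# -u removes repeated lines while sorting
no_repeats=$(printf 'b\na\nb\n' | sort -u)

echo "$ascii_order"
echo "$numeric_order"
echo "$no_repeats"
```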

  16. uniq: Locating identical lines • Returns only unique lines • Compares adjacent lines only, so sort the input first • Options: • -u • Return only the non-repeated lines • -d • Return only the repeated lines • But only one copy of each • -c • Return the count of how many times each line is repeated
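
A sketch of the uniq options on a made-up five-line input. The input is piped through sort first so that repeated lines sit next to each other; the awk step only pulls the count for one line out of the `-c` output.

```shell
sorted=$(printf 'b\na\nb\nb\nc\n' | sort)   # a b b b c

collapsed=$(printf '%s\n' "$sorted" | uniq)       # one copy of every line
non_repeated=$(printf '%s\n' "$sorted" | uniq -u) # lines that occur once
repeated=$(printf '%s\n' "$sorted" | uniq -d)     # one copy of each repeated line
b_count=$(printf '%s\n' "$sorted" | uniq -c | awk '$2 == "b" {print $1}')

echo "$collapsed"
echo "$non_repeated"
echo "$repeated"
echo "b occurs $b_count times"
```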

  17. tr: Translate characters • Changes one set of characters to another, default input is the standard input • Example: • # tr 'ab' 'cd' • std in: This is abnormal • std out: This is cdnormcl • std in: absolute • std out: cdsolute • std in: ab a b c • std out: cd c d c • ^C • Note: a → c and b → d, not ab → cd • Note: ^D can be used to denote end of file to tr instead of the shown ^C, which stops the tr process

  18. tr: Translate characters • More examples: • Can be used to translate case for a file • tr 'a-z' 'A-Z' <file1 • Takes the input from file1 with the < redirection • Turns all lower case letters to upper case • Output goes to std out • Quote the sets so the shell does not treat them as filename patterns • Get rid of characters • tr -d 'a-z' <file1 • Gets rid of all lower case chars from file1 • Again output is std out • Compressing repeated chars • tr -s ' ' <file1 • Changes repeated spaces to a single space
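
The three uses of tr can be tried without a file by piping text in. This is a sketch with made-up sample strings; the sets are written as quoted `'a-z'` ranges.

```shell
# Translate lower case to upper case
upper=$(printf 'Hello World\n' | tr 'a-z' 'A-Z')

# Delete every lower case character
deleted=$(printf 'Hello World\n' | tr -d 'a-z')

# Squeeze runs of spaces down to a single space
squeezed=$(printf 'a   b    c\n' | tr -s ' ')

echo "$upper"
echo "$deleted"
echo "$squeezed"
```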

  19. Filters Using Regular Expressions

  20. Regular Expression Review

  21. Regular Expression • A pattern to match strings of text which is: • Concise • Flexible • Used by many programming languages and operating systems

  22. Regular Expressions • BRE • Basic Regular Expression • ERE • Extended Regular Expression • IRE • Interval Regular Expression • TRE • Tagged Regular Expression

  23. Character class • Set of characters enclosed within square brackets [ ] • Can be a list of single characters • [aD1] • a, D, and the character 1 only • Can be a range of characters • [a-zA-Z] • All the upper and lower case chars • Negate a class • [^0-9] • Not the numeric chars 0-9

  24. Regular Expressions • * • Refers to the immediately preceding character • Any number of repeated character(s) • 0 or more • Used with other patterns • A* • Anything that matches 0 or more ‘A’s in a row • s*print will match sprint, ssprint, sssprint and print! • Note: this is not related to the familiar wildcard *

  25. Regular Expressions • . • Any character • Exactly one • S... will match Sort, Sxxx, ... • Any four-character string starting with S • Note .* means 0 or more of any character • Pattern starting locations • ^ • Pattern starts at the beginning of a line • $ • Pattern ends at the end of a line
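
The basic regular expression pieces from the last two slides can be checked with grep. This is a sketch on a made-up word list; `-x` makes the pattern match the entire line, so `S...` selects only four-character lines.

```shell
words=$(printf '%s\n' print sprint ssprint Sort Sxxx Sorted)

# s* is zero or more of the letter s, then the literal text print
star_matches=$(printf '%s\n' "$words" | grep -x 's*print')

# S followed by exactly three characters of any kind
dot_matches=$(printf '%s\n' "$words" | grep -x 'S...')

# ^ anchors to the start of the line, $ to the end
start_matches=$(printf '%s\n' "$words" | grep '^S')
end_matches=$(printf '%s\n' "$words" | grep 'd$')

echo "$star_matches"
echo "$dot_matches"
echo "$start_matches"
echo "$end_matches"
```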

  26. Extended Regular Expressions • | • Either one of a set • a|b • Matches if an a or a b • ( and ) • Group the chars between the parentheses and combine them with what is before or after • 'animaltype:(dog|cat)' • Look for animaltype:dog or animaltype:cat
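
A sketch of the alternation example using grep. The `-E` option turns on extended regular expressions; the three input lines are made up for illustration.

```shell
animals=$(printf '%s\n' 'animaltype:dog' 'animaltype:cat' 'animaltype:bird')

# (dog|cat) matches either alternative after the fixed prefix
pets=$(printf '%s\n' "$animals" | grep -E 'animaltype:(dog|cat)')

echo "$pets"
```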

  27. Advanced Filters

  28. grep

  29. grep – Search a pattern • Searches for a pattern in a file • grep options pattern filename(s) • std in is used if there is no filename • Can also pipe data to grep • Notes: • Pattern does not need to be quoted if there are no delimiters or special chars in it • Can always use quotes to be safe

  30. grep - Options • -i • Ignore case • -v • Don’t display lines matching expression • -l • Display only the names of files with matches • Useful when grepping multiple files • -e • Marks the next argument as the pattern • Useful when grepping for a pattern that starts with - • -x • Match entire line • -f file • Takes expression from a file

  31. grep - examples • # cat bigfile2 • file 1 text • file 1 text • file 3 text • file 1 text • file 1 text • file 1 text • Examples: • # grep 3 bigfile2 • file 3 text • # grep file bigfile2 • file 1 text • file 1 text • file 3 text • file 1 text • file 1 text • file 1 text
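
The options from the previous slide can be tried against the same bigfile2 contents. This is a sketch: the second file, `other`, and its contents are made up so that `-l` and `-e` have something to show.

```shell
workdir=$(mktemp -d)
printf 'file 1 text\nfile 1 text\nfile 3 text\nfile 1 text\nfile 1 text\nfile 1 text\n' > "$workdir/bigfile2"
printf 'no digits here\n-dash first\n' > "$workdir/other"

ignore_case=$(grep -i 'FILE 3' "$workdir/bigfile2")      # -i: case is ignored
inverted=$(grep -v '1' "$workdir/bigfile2")              # -v: lines without a 1
# -x: the whole line must match, so this finds nothing (grep then exits nonzero)
whole_line=$(grep -x 'file 3' "$workdir/bigfile2" || true)
# -l: only names of files that contain a match
names=$(cd "$workdir" && grep -l 'text' bigfile2 other)
# -e: lets the pattern start with - without being read as an option
dash=$(grep -e '-dash' "$workdir/other")

echo "$ignore_case"
echo "$inverted"
echo "$names"
echo "$dash"

rm -r "$workdir"
```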

  32. sed

  33. sed – Stream Editor • Edit a file(s) with a specified action • sed options 'address action' file(s) • Basics: • Take input from the file(s) • Performs the action on the input (the file(s) themselves are not changed) • Sends output to std out • Uses: • Select part(s) of a file • By line • By content • Edit a file • e.g. create a template, then use sed to customize for a run • Oddities • Usually need -n to get rid of unwanted duplicated lines

  34. sed – Line addressing • # cat tenline.file • Line 1 • Line 2 • Line 3 • Line 4 • Line 5 • Line 6 • Line 7 • Line 8 • Line 9 • Last Line • Select specific lines • # sed '3q' tenline.file • Line 1 • Line 2 • Line 3 • Selects the first 3 lines then quits • # sed -n '$p' tenline.file • Last Line • Prints last line • $ - last line • p - print • # sed -n '5,7p' tenline.file • Line 5 • Line 6 • Line 7 • Prints lines 5 through 7

  35. sed – Line addressing • Select specific lines with ; • # sed -n '1p;3p;$p' tenline.file • Line 1 • Line 3 • Last Line • Prints line 1, 3 and the last line ($) • ! will negate operations • # sed -n '3,$!p' tenline.file • Line 1 • Line 2 • Does not print line 3 through the end • Notes • By default sed will echo the input lines as well as the selected lines • → get duplicated lines • Use -n to not echo the input lines
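
The line-addressing commands can be checked with a sketch that rebuilds tenline.file in a temporary directory. The last command leaves out `-n` on purpose to show the duplicated line the notes warn about.

```shell
workdir=$(mktemp -d)
{ for i in 1 2 3 4 5 6 7 8 9; do echo "Line $i"; done; echo "Last Line"; } > "$workdir/tenline.file"

first_three=$(sed '3q' "$workdir/tenline.file")        # print up to line 3, then quit
last=$(sed -n '$p' "$workdir/tenline.file")            # last line only
range=$(sed -n '5,7p' "$workdir/tenline.file")         # lines 5 through 7
picked=$(sed -n '1p;3p;$p' "$workdir/tenline.file")    # lines 1, 3 and the last
negated=$(sed -n '3,$!p' "$workdir/tenline.file")      # everything except 3 to the end

# Without -n every input line is echoed and the last one is printed twice.
echoed_count=$(( $(sed '$p' "$workdir/tenline.file" | wc -l) ))

echo "$picked"
echo "$negated"
echo "Lines printed without -n: $echoed_count"

rm -r "$workdir"
```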

  36. sed – Context addressing • Use a pattern to identify lines to work with • Use / to delimit the pattern • Examples • # sed -n '/2/p' tenline.file • Line 2 • Finds all lines with 2 in them and prints them • # sed -n '/^2/p' tenline.file • Finds all lines that start with 2 and prints them (none in tenline.file) • ^ - anchors the pattern to the start of the line

  37. sed – Writing selected lines to a file • Can use w to write the selected lines to a file • Example • sed -n '/2/w twos.file' tenline.file • w instead of p puts the output to a file • -n keeps the input lines from being echoed to std out
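
A sketch of context addressing and the w action together, again on a rebuilt tenline.file. The sed script is double quoted here only so that the temporary directory path can be placed in the output file name.

```shell
workdir=$(mktemp -d)
{ for i in 1 2 3 4 5 6 7 8 9; do echo "Line $i"; done; echo "Last Line"; } > "$workdir/tenline.file"

with_two=$(sed -n '/2/p' "$workdir/tenline.file")      # lines containing a 2
starts_two=$(sed -n '/^2/p' "$workdir/tenline.file")   # lines starting with 2: none

# Write the selected lines to twos.file instead of printing them
sed -n "/2/w $workdir/twos.file" "$workdir/tenline.file"
written=$(cat "$workdir/twos.file")

echo "Printed: $with_two"
echo "Written to twos.file: $written"

rm -r "$workdir"
```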

  38. sed – Text editing • Can edit the stream • i • Insert • a • Append • c • Change • d • Delete • s • Substitute

  39. sed - editing • Example: inserting • # sed '1i\ • > #!/bin/bash\ • > # using the bash shell • > ' test.sh > $$ • Notes: • 1i inserts text before line 1 • > at the start of a line is the shell's continuation prompt, not typed input • Need \ as a continuation character within the quotes • Input is the code or text in test.sh • Redirecting the output to $$ (the shell's process ID, used as a temporary file name) • Ends up with the 2 new lines at the beginning in $$ • Can further modify $$
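
The insert example can be run as a script in the following sketch. The one-line test.sh and the output name test.new are assumptions for illustration; the output goes to a file in a temporary directory rather than to `$$`.

```shell
workdir=$(mktemp -d)
printf 'echo hello\n' > "$workdir/test.sh"

# Every inserted line except the last ends with \ to continue the text.
sed '1i\
#!/bin/bash\
# using the bash shell
' "$workdir/test.sh" > "$workdir/test.new"

cat "$workdir/test.new"
first_line=$(sed -n '1p' "$workdir/test.new")
second_line=$(sed -n '2p' "$workdir/test.new")
line_count=$(( $(wc -l < "$workdir/test.new") ))

rm -r "$workdir"
```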

  40. sed - editing • Use s to indicate substitution • Example: substituting • sed 's/a/b/' file • Replaces a with b for the first instance on each line • sed 's/a/b/g' file • g (global) replaces a with b for all instances on each line
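
The effect of the g flag is easiest to see on a word with several matches. This sketch pipes a made-up word into sed instead of reading a file, and substitutes o for a so the change is easy to spot.

```shell
# Without g, only the first a on the line is replaced
first_only=$(printf 'banana\n' | sed 's/a/o/')

# With g, every a on the line is replaced
all_matches=$(printf 'banana\n' | sed 's/a/o/g')

echo "$first_only"
echo "$all_matches"
```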
