Introduction to UNIX / Linux - 6. Dr. Jerry Shiao, Silicon Valley University. Introduction UNIX/Linux Course. Section 6 Advanced File Processing Regular Expressions Compressing/Uncompressing Text and Binary Files View Compressed Files Without Decompressing
Introduction to UNIX / Linux - 6 Dr. Jerry Shiao, Silicon Valley University SILICON VALLEY UNIVERSITY CONFIDENTIAL
Introduction UNIX/Linux Course
• Section 6
  • Advanced File Processing
    • Regular Expressions
    • Compressing/Uncompressing Text and Binary Files
    • View Compressed Files Without Decompressing
    • Execute Binary Files Without Decompressing
    • Searching For Commands and Files for Absolute Path
    • Searching Text Files for Strings, Expressions, and Patterns
    • Database Operations
      • Sorting contents of ASCII File.
      • Cutting Fields (Column Wise) from Table.
      • Pasting Fields (Row Wise) to Table.
    • Encoding and Decoding Binary to ASCII File
    • File Encryption and Decryption
• Advanced File Processing
  • Regular Expressions
    • Used in UNIX commands and tools.
    • A set of characters, including special (non-digit, non-letter) characters, that defines rules to represent one or more items.
      • The character string xy* means x, xy, xyy, xyyy, …
    • Similar to shell metacharacters.
    • To prevent the shell from interpreting regular expression characters, use single quotes (e.g. 'x | y') or a backslash (e.g. \*).
    • Common tools using regular expressions:
      • awk: Pattern scanning and action.
      • ed: Line-oriented editor.
      • grep: Searches file(s) for a pattern.
      • egrep: Extended regular expression search.
      • sed: Stream editor used as a batch (noninteractive) editor.
      • vi: Full-screen editor.
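The xy* rule and the quoting advice above can be tried with grep. This is a minimal sketch; the sample words are made up for illustration, and the `^` and `$` anchors are added here so that only whole-line matches are printed.

```shell
# Hypothetical sample input: one candidate string per line.
words=$(printf '%s\n' x xy xyy ab y)

# Single quotes keep the shell from treating * as a filename wildcard.
# The pattern means: an x followed by zero or more y, and nothing else.
matches=$(printf '%s\n' "$words" | grep '^xy*$')
printf '%s\n' "$matches"

# Extended syntax (grep -E, the same dialect as egrep): the quoted |
# is a regular expression OR, not a shell pipe.
either=$(printf '%s\n' "$words" | grep -E '^(ab|y)$')
printf '%s\n' "$either"
```

Without the quotes, the shell would try to expand `xy*` against file names in the current directory before grep ever saw it.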
• Advanced File Processing
  • awk, ed, egrep, grep, sed, vi and Regular Expressions
  • vi and Regular Expressions
  • vi commands and Regular Expressions
• Advanced File Processing
  • Compressing Files
    • Less disk space.
      • An archived file saves considerable storage space.
    • Less time to copy.
    • A compressed file is unreadable until it is decompressed.
    • Less time to transmit over a network.
      • When a file is transmitted multiple times, the time spent compressing is a fraction of the time spent transmitting the uncompressed file.
  • UNIX commands for compressing and decompressing:
    • compress / uncompress: Traditional UNIX commands.
    • gzexe: Compress executable files.
    • gzip: Compress and uncompress.
    • gunzip: Uncompress files compressed with gzip.
    • gzcat: Display compressed files.
    • gzcmp: Compare compressed files.
    • gzmore: Display a compressed file one page at a time.
    • gzgrep: grep command for compressed files.
    • On Linux the last four are installed as zcat, zcmp, zmore, and zgrep.
• Advanced File Processing
  • compress Command (UNIX)
    • Analyzes the file for repeated patterns and substitutes shorter codes for them.
    • Adaptive Lempel-Ziv coding.
    • The compressed file contains nonprintable characters.
    • The original file is removed; the new file has the extension “.Z”.
    • Retains modification date, ownership, and access privileges.
  • compress [ options ] [ file-list ]
    • file-list: Files to compress. Each is replaced by a compressed file with the suffix .Z.
    • Options:
      • -c : Write the compressed output to standard output (instead of a .Z file).
      • -f : Force compression.
      • -v : Display the compression percentage and the names of the compressed files.
• Advanced File Processing
  • uncompress [ options ] [ file-list ]
    • -c : Write the uncompressed output to standard output; no files are changed. Behavior similar to “gzcat”.
    • -f : Force decompression; overwrite an existing output file without prompting.
    • -v : Verbose; report each file as it is expanded.
    • Output goes into the original file name (without the .Z extension).
• Advanced File Processing
  • gzip [ options ] [ file-list ]
    • GNU tool for compressing files.
    • Uses Lempel-Ziv coding.
    • Compresses each file in “file-list” and replaces it with a file with the “.gz” extension.
    • Retains modification time, ownership, and access privileges.
    • -c : Send output to standard output; the input file is not replaced with a “.gz” file.
    • -d : Uncompress a compressed “.gz” file. Same as gunzip.
    • -f : Force compression even when the “.gz” file already exists.
    • -l : List a gzip-compressed file: display compressed size, uncompressed size, ratio, and name.
    • -r : Recursively compress files in the directory.
    • -v : Display compression percentage and names.
    • -1 to -9 : Compression speed. -1 is fastest (least compression), -9 is slowest (best compression). Default is -6.
• Advanced File Processing
  • gzip [ options ] [ file-list ]
      # gzip replaces the source file with the “.gz” compressed file.
      $ gzip file1
      # The “-l” option takes a compressed file and provides statistics.
      $ gzip -l file1
      compressed  uncompressed  ratio  uncompressed_name
              52           120  76.7%  file1
      $ ls file1*
      file1.gz  file1_sort  file1_sort_bak
      $ mv file1_sort file1
      # The “.gz” file already exists, so gzip asks whether to overwrite it.
      $ gzip file1
      gzip: file1.gz already exists; do you wish to overwrite (y or n)? n
              not overwritten
      $ gzip -f file1
      # The “-v” option displays the compression ratio and the compressed file name.
      $ gzip -v file1_sort_bak
      file1_sort_bak:  77.5% -- replaced with file1_sort_bak.gz
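The `-c` option described above is useful when the original file must be kept. The following is a minimal sketch, assuming GNU gzip and a scratch directory; the file name and its contents are invented for the example.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
sample="$workdir/file1"          # hypothetical file name

# Build a repetitive 200-line text file; repetition compresses well.
i=0
while [ "$i" -lt 200 ]; do
  echo "File1 line. Repeat this line."
  i=$((i + 1))
done > "$sample"

# -c writes the compressed stream to standard output, so file1 is kept.
gzip -c "$sample" > "$sample.gz"

# -l reports compressed size, uncompressed size, ratio, and name.
gzip -l "$sample.gz"

# Decompress to standard output and compare with the original.
if gunzip -c "$sample.gz" | cmp -s - "$sample"; then
  roundtrip=identical
else
  roundtrip=different
fi
echo "round trip: $roundtrip"

# Record both sizes in bytes, then clean up.
orig_size=$(($(wc -c < "$sample")))
gz_size=$(($(wc -c < "$sample.gz")))
echo "original: $orig_size bytes, compressed: $gz_size bytes"
rm -r "$workdir"
```

Because both files exist at the end of the run, the sizes can be compared directly, which is not possible after a plain `gzip file1`.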
• Advanced File Processing
  • gunzip [ options ] [ file-list ]
    • GNU tool for decompressing files.
    • Same as gzip -d <file-list>.
    • Decompresses each file in “file-list” and removes the “.gz” extension.
    • Retains modification time, ownership, and access privileges.
    • -c : Send output to standard output.
    • -f : Force decompression.
    • -l : List a gzip-compressed file: display compressed size, uncompressed size, ratio, and name.
    • -r : Recursively decompress files in the directory.
    • -v : Display decompression percentage and names.
    • -1 to -9 : Accepted for compatibility with gzip, but compression levels have no effect on decompression.
• Advanced File Processing
  • gunzip [ options ] [ file-list ]
      # List two identical files.
      $ ls -l bash*
      -rw-r--r-- 1 sau users 285373 2012-10-17 15:58 bash2.man
      -rw-r--r-- 1 sau users 285373 2012-10-17 15:58 bash.man
      # gzip replaces each source file with the “.gz” compressed file.
      $ gzip -v bash.man bash2.man
      bash.man:   72.9% -- replaced with bash.man.gz
      bash2.man:  72.9% -- replaced with bash2.man.gz
      # List the compressed files.
      $ ls -l bash*
      -rw-r--r-- 1 sau users 77239 2012-10-17 15:58 bash2.man.gz
      -rw-r--r-- 1 sau users 77238 2012-10-17 15:58 bash.man.gz
      # Use “gzip -d” to uncompress the “.gz” file.
      $ gzip -vd bash.man
      bash.man.gz:   72.9% -- replaced with bash.man
      # Use gunzip to uncompress the “.gz” file.
      $ gunzip -v bash2.man
      bash2.man.gz:  72.9% -- replaced with bash2.man
      $ ls -l bash*
      -rw-r--r-- 1 sau users 285373 2012-10-17 15:58 bash2.man
      -rw-r--r-- 1 sau users 285373 2012-10-17 15:58 bash.man
• Advanced File Processing
  • gzexe [ options ] [ file-list ]
    • Compresses executable files.
    • The file name does not change, and the compressed file automatically uncompresses itself when executed.
    • An executable compressed with plain gzip is NOT executable.
    • Creates a backup of each original file, named with a trailing “~” (e.g. “power~”).
    • Performance penalty when executed.
    • Useful on embedded systems with limited resources.
    • -d : Decompress the compressed file; the “~” backup is now the compressed file.
• Advanced File Processing
  • gzexe [ options ] [ file-list ]
      # Compress the executable file. A backup of the original file, “power~”, is created.
      $ gzexe power
      power:   64.3%
      $ ls -lt
      total 32
      -rwxr-xr-x 1 sau users  4408 2012-10-17 16:57 power
      -rwxr-xr-x 1 sau users 10042 2012-10-17 16:56 power~
      # When the compressed file is executed, it is automatically uncompressed first.
      $ ./power
      This program takes x and y values from stdin and displays x^y . . .
      # Decompress the file to get the original executable back. The backup is now the compressed file.
      $ gzexe -d power
      $ ls -lt
      total 32
      -rwxr-xr-x 1 sau users 10042 2012-10-17 16:58 power
      -rwxr-xr-x 1 sau users  4408 2012-10-17 16:57 power~
      # Execute the original executable file.
      $ ./power
      This program takes x and y values from stdin and displays x^y . . .
      # Execute the backup (compressed) executable file.
      $ ./power~
      This program takes x and y values from stdin and displays x^y . . .
• Advanced File Processing
  • Converting a compressed file before displaying it
    • Decompressing to disk first requires system resources: system memory, disk storage, and the disk I/O scheduler.
  • zcat [ options ] [ file-list ]
    • Expands and displays the contents of the files in “file-list”.
    • Does not rename or replace the compressed file.
    • Supports files compressed with compress or gzip.
    • Writes expanded output to standard output.
  • zmore [ file-list ]
    • Allows viewing compressed or plain text files one screen at a time.
    • Supports files compressed with compress or gzip.
    • Writes expanded output to standard output.
• Advanced File Processing
  • zcat and zmore
      $ man bash > man.bash
      $ gzip man.bash
      # zcat uncompresses and sends output to stdout, but stdout needs to be paged with the more command.
      $ zcat man.bash | more
      BASH(1)
      NAME
             bash - GNU Bourne-Again SHell
      SYNOPSIS
             bash [options] [file]
      --More--
      <Ctrl>-C
      # zmore is similar to using zcat with the more command.
      $ zmore man.bash.gz
      BASH(1)
      NAME
             bash - GNU Bourne-Again SHell
      SYNOPSIS
             bash [options] [file]
      . . .
      lines 1-30
      :q
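Because zcat writes to standard output, its result can feed any filter, not only more. The sketch below assumes a Linux system where zcat reads gzip files; the file name and contents are hypothetical.

```shell
workdir=$(mktemp -d)

# Hypothetical three-line text file, compressed in place.
printf '%s\n' 'File2 line1.' 'File2 line2.' 'File2 line3.' > "$workdir/notes"
gzip "$workdir/notes"                      # leaves only notes.gz

# zcat expands to standard output; notes.gz itself is unchanged.
first_line=$(zcat "$workdir/notes.gz" | head -n 1)
echo "$first_line"

# Search the expanded text without creating an uncompressed file.
hit=$(zcat "$workdir/notes.gz" | grep 'line2')
echo "$hit"

# The compressed file is still the only file on disk.
remaining=$(ls "$workdir")
echo "$remaining"
rm -r "$workdir"
```

No uncompressed copy is ever written to disk, which is the saving in disk storage and I/O that the viewing commands are meant to provide.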
• Advanced File Processing
  • Sorting
    • Ordering a set of items in ascending or descending order.
    • Sort key: a field, or a portion of each item, used for the comparison.
      • Determines the position of each item in the sorted list.
      • The key depends on the items to be sorted (e.g. personal records may use last name, student ID, or social security number).
    • Performed in a variety of software systems:
      • Words in a dictionary.
      • People's names in a telephone directory.
      • Arrival/departure times in airport terminals.
      • Student IDs in a class list.
• Advanced File Processing
  • Sort items in text (ASCII) files.
    • Sort items based on a field: fields are separated by blanks.
  • sort [ options ] [ file-list ]
    • Sorts the lines of the ASCII files in “file-list”.
    • Output goes to standard output.
    • The default key starts at column 1.
    • Fields are just words separated by blanks.
    • -b : Ignore leading blanks.
    • -d : Dictionary order: only letters, digits, and blanks are compared.
    • -f : Case insensitive.
    • +n [ -m ] : Field used as sort key. “n” specifies how many fields to skip. The key starts at the first character of field n+1 and ends at the last character of field m (or at the end of the line if -m is omitted).
    • -r : Sort in reverse order.
    • -t <character> : Change the field delimiter to <character>.
• Advanced File Processing
  • sort [ options ] [ file-list ]
      $ cat donors.data
      Bay Ching 500000 China
      Jack Arta 250000 Indonesia
      Cruella Lumper 725000 Malaysia
      # The field delimiter is a space. The key skips the first word (+1) and ends at the second word (-2).
      $ sort +1 -2 donors.data
      Jack Arta 250000 Indonesia
      Bay Ching 500000 China
      Cruella Lumper 725000 Malaysia
      $ cat filec
      File2: line2. : 111
      File3: line4. : 222
      File4: line3. : 333
      # The field delimiter is a colon “:”. The key starts at the second field (+1) and ends at the second field (-2).
      $ sort -t: +1 -2 filec
      File2: line2. : 111
      File4: line3. : 333
      File3: line4. : 222
• Advanced File Processing
  • sort [ options ] [ file-list ]
      $ cat filec
      File2: line4. : 111
      File3: line4. : 100
      File4: line3. : 333
      # Primary key: the field delimiter is a colon “:”. The key skips the first field (+1) and ends at the second field (-2).
      # Secondary key: the field delimiter is a colon “:”. The key starts at the third field (+2) and ends at the third field (-3).
      $ sort -b -t: +1 -2 +2 -3 filec
      File4: line3. : 333
      File3: line4. : 100
      File2: line4. : 111
      $
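Many current sort implementations discourage or reject the `+n -m` key form and expect `-k` instead, where `-k2,2` means "start at field 2, end at field 2". The sketch below re-runs the two examples above with `-k`; treating `+1 -2` as `-k2,2` is the standard translation, and `LC_ALL=C` is added here only to make the ordering independent of the locale.

```shell
workdir=$(mktemp -d)

cat > "$workdir/donors.data" <<'EOF'
Bay Ching 500000 China
Jack Arta 250000 Indonesia
Cruella Lumper 725000 Malaysia
EOF

cat > "$workdir/filec" <<'EOF'
File2: line4. : 111
File3: line4. : 100
File4: line3. : 333
EOF

# Equivalent of "sort +1 -2": key is field 2 only (the last name).
by_last_name=$(LC_ALL=C sort -k2,2 "$workdir/donors.data")
echo "$by_last_name"

# Equivalent of "sort -b -t: +1 -2 +2 -3":
# primary key is field 2, secondary key is field 3, colon-delimited.
two_keys=$(LC_ALL=C sort -b -t: -k2,2 -k3,3 "$workdir/filec")
echo "$two_keys"

rm -r "$workdir"
```

The secondary key only matters for the two lines that tie on `line4.`, which is why File3 (100) is listed before File2 (111).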
• Advanced File Processing
  • Searching For Commands and Files
  • find directory-list expression
    • Searches “directory-list” to locate files meeting one or more criteria in “expression”.
    • Criteria in “expression”:
      • -exec CMD : Execute CMD on each file matching the criteria.
      • -name <pattern> : Search for files whose names match <pattern>.
      • -newer file : Search for files modified after “file”.
      • -perm <octal> : Search for files with permission <octal>.
      • -print : Display the pathnames of the files found.
      • -user <name> : Search for files owned by user <name>.
    • Quote the pattern so the shell does not expand it before find runs:
      $ find ~ -name 'class*' -print
      /home/sau/class3
      /home/sau/kernel/linux-bcm-2.6.30/include/config/leds/class.h
      /home/sau/kernel/linux-bcm-2.6.30/include/config/classic
      . . .
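The `-name` and `-exec` criteria can be combined in one command. This is a small sketch run against a scratch directory; the directory layout and file names are invented, and `basename` is used as the `-exec` command only because its output is easy to check.

```shell
workdir=$(mktemp -d)

# Hypothetical tree with two matching files and one non-matching file.
mkdir -p "$workdir/kernel/include"
touch "$workdir/class3" "$workdir/kernel/include/class.h" "$workdir/notes.txt"

# The quoted 'class*' is passed to find unchanged; find applies it to
# the file name in every directory it visits.
find "$workdir" -name 'class*' -print

# -exec runs a command once per match; {} is replaced by the pathname
# and \; marks the end of the command.
names=$(find "$workdir" -name 'class*' -exec basename {} \; | LC_ALL=C sort)
echo "$names"

rm -r "$workdir"
```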
• Advanced File Processing
  • Searching For Commands and Files
  • whereis [ options ] [ file-list ]
    • Locates binaries (executables), source code, and manual pages for the commands in “file-list”.
    • Searches ONLY standard locations:
      • /usr/share/man/* : Directories of manual files.
      • /sbin, /etc, /usr/{lib, bin, ucb, lpp} : Directories of binary files.
      • /usr/src/* : Directories of source code files.
    • Displays absolute pathnames for the located files.
    • -b : Search for binaries (executables) only.
    • -s : Search for source code only.
      $ whereis find compress tar
      find: /bin/find /usr/bin/find /usr/bin/X11/find /usr/share/man/mann/find.n.gz /usr/share/man/man1p/find.1p.gz /usr/share/man/man1/find.1.gz
      compress: /usr/share/man/man1p/compress.1p.gz
      tar: /bin/tar /usr/include/tar.h /usr/share/man/mann/tar.n.gz /usr/share/man/man5/tar.5.gz /usr/share/man/man1/tar.1.gz
• Advanced File Processing
  • Searching For Commands and Files
  • which [ command-list ]
    • Takes each command name in “command-list” and locates the file that contains it.
    • Searches all the directories in your PATH environment variable, in order, until it locates the command.
    • Stops at the first occurrence: finds where the shell is resolving the command.
    • Cannot locate aliases, functions, and shell builtins.
  • type [ -tp ] name
    • Indicates how each name would be interpreted if used as a command name.
    • -t : Prints a single word, one of: alias, keyword, function, builtin, or file.
    • -p : Returns the name of the disk file that would be executed (when “type -t name” would return “file”), or nothing.
• Advanced File Processing
  • Searching For Commands and Files
      [student1@localhost ~]$ whereis grep
      grep: /bin/grep /usr/share/man/man1p/grep.1p.gz /usr/share/man/man1/grep.1.gz
      [student1@localhost ~]$ which grep
      /bin/grep
      [student1@localhost ~]$ type grep
      grep is /bin/grep
      [student1@localhost ~]$ type -p grep
      /bin/grep
      [student1@localhost ~]$ type -t grep
      file
      [student1@localhost ~]$ whereis nice
      nice: /bin/nice /usr/share/man/man1p/nice.1p.gz /usr/share/man/man2/nice.2.gz /usr/share/man/man3p/nice.3p.gz /usr/share/man/man1/nice.1.gz
      [student1@localhost ~]$ whereis cd
      cd: /usr/share/man/man1p/cd.1p.gz /usr/share/man/man1/cd.1.gz
      [student1@localhost ~]$ which cd
      /usr/bin/which: no cd in (/usr/kerberos/bin:/usr/lib/ccache:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/student1/bin)
      [student1@localhost ~]$ type cd
      cd is a shell builtin
      [student1@localhost ~]$
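The difference between a builtin and an external command can also be checked from a script. The sketch below uses `type` as in the session above, plus `command -v`, which is not covered in these slides; it is the POSIX way to print the path the shell would run and is included here as an assumption about what is available in the reader's shell.

```shell
# cd is built into the shell, so no file on PATH implements it.
cd_kind=$(type cd)
echo "$cd_kind"

# grep is an external program; command -v prints the path the shell
# would execute (the path itself varies between systems).
grep_path=$(command -v grep)
echo "$grep_path"
```

The `type -t` and `type -p` options shown earlier are bash features, so a script meant for other shells may prefer the two forms used here.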
• Advanced File Processing
  • Searching Files
    • Utilities to find lines in text files that contain an expression, string, or pattern.
  • grep [ options ] pattern [ file-list ]
    • Searches for a regular expression pattern.
  • egrep [ options ] regexp [ file-list ]
    • Interprets the pattern as an extended regular expression, supporting additional special characters (e.g. “|”, “?”, “+”, and “{”).
  • fgrep [ options ] pattern [ file-list ]
    • Searches for a literal text string.
  • All three search the files in “file-list” for the given pattern, string, or expression.
    • Matching lines are sent to standard output.
    • -i : Ignore the case of letters during the matching process.
    • -n : Print line numbers along with each matched line.
    • -v : Print nonmatching lines.
• Advanced File Processing
  • Searching Files
      # Precede each line with its line number.
      $ grep -n line1 file1
      1:File2 line1. Change this line.
      2:File2 line1.5 Insert this line.
      # Use the regular expression “|” to logically “OR” the two strings.
      $ egrep "line1|line2" file1
      File2 line1. Change this line.
      File2 line1.5 Insert this line.
      File2 line2.
      # Use the regular expression “^” to select a pattern at the beginning of a line.
      $ grep "^Line" filec
      Line : line5. : 099
      # Use the regular expression “\<” to select a pattern at the beginning of a word.
      $ grep "\<1" filec
      File2: line4. : 111
      File3: line4. : 100
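The `-n`, `-i`, and `-v` options listed earlier can be checked against a small file. This sketch rebuilds a three-line `filec` from the lines shown in the examples above; the exact file contents are an assumption, since the slides show only parts of it.

```shell
workdir=$(mktemp -d)

cat > "$workdir/filec" <<'EOF'
File2: line4. : 111
File3: line4. : 100
Line : line5. : 099
EOF

# -n prefixes each matching line with its line number.
numbered=$(grep -n 'line4' "$workdir/filec")
echo "$numbered"

# -i ignores case, so ^line also matches the line starting with "Line".
nocase=$(grep -i '^line' "$workdir/filec")
echo "$nocase"

# -v inverts the match: print the lines that do NOT contain line4.
inverted=$(grep -v 'line4' "$workdir/filec")
echo "$inverted"

rm -r "$workdir"
```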
• Advanced File Processing
  • Text files can be processed as tables:
    • Each line comprises a record.
    • Each column is a field, separated from the next by a field separator.
    • <Tab> is the default field separator.
  • Cutting and Pasting
    • Process files stored in table format.
  • cut -blist [ -n ] [ file-list ]
  • cut -clist [ file-list ]
  • cut -flist [ -dchar ] [ -s ] [ file-list ]
    • -b list : Treat each byte as a column.
    • -c list : Treat each character as a column.
    • -d char : Use the character “char” as the field separator. Default is <Tab>.
    • -f list : Cut the fields specified in “list”.
    • -n : Do not split characters (used with -b).
    • -s : Do not output lines that do not contain the delimiter character.
    • “blist”, “clist”, and “flist” can be a comma-separated list, or use “-” to specify a range of bytes, characters, or fields.
• Advanced File Processing
  • Cutting and Pasting
    • Fields: <Tab> is the field separator.
    • Remove fields 1 and 2.
• Advanced File Processing
  • Cutting and Pasting
  • paste [ options ] file-list
    • Horizontally concatenates the files in “file-list”, line by line.
    • -d list : Use the characters in “list” as the separators between the pasted columns, in place of the default <Tab>.
• Advanced File Processing
  • Cutting and Pasting
  • paste [ options ] file-list
      $ cut -f1-3 student_records > table1
      $ cut -f4 student_addresses > table2
      $ paste table1 table2
      John    Doe     ECE     312.111.9999
      Pam     Myer    S       666.222.1212
      Jim     Davis   CE      713.999.5555
      Jason   Kim     ECE     434.000.8888
      Amy     Nash    ECE     888.111.4444
      $ rm table1 table2
    • NOTE: Separate ‘cut’, ‘paste’, and ‘rm’ commands add system overhead. Use a pipe “|” to reduce the system resources used:
      $ paste student_records student_addresses | cut -f1-3,7
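The piped form can be tested end to end with two small tab-separated tables. The data below is hypothetical (the slides do not show the contents of `student_addresses`); it assumes three fields in the first file and four in the second, so the phone number ends up as field 7 of the pasted line.

```shell
workdir=$(mktemp -d)

# Hypothetical tab-separated tables: 3 fields and 4 fields per record.
printf 'John\tDoe\tECE\nPam\tMyer\tCS\n' > "$workdir/student_records"
printf '1 Main St\tSpringfield\tIL\t312.111.9999\n2 Oak Ave\tShelbyville\tIL\t666.222.1212\n' \
  > "$workdir/student_addresses"

# paste joins line N of each file with a Tab; cut then keeps
# fields 1-3 (from student_records) and field 7 (the phone number).
merged=$(paste "$workdir/student_records" "$workdir/student_addresses" | cut -f1-3,7)
echo "$merged"

rm -r "$workdir"
```

No intermediate `table1` or `table2` files are created, so there is nothing to remove afterwards.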
• Advanced File Processing
  • Encoding and Decoding
    • Used with mail utilities that do not support attachments.
    • A user who wants to send a non-ASCII (e.g. multimedia) file must send it in the body of the mail.
    • UNIX-to-UNIX encode converts a file to be mailed into a format that contains readable ASCII characters only.
  • uuencode [ source-file ] decode_label
    • Encodes “source-file” from binary to ASCII.
    • By default the encoded (ASCII) output goes to standard output; redirect it to a named file using “>”.
  • uudecode [ option ] [ encoded_file ]
    • Decodes “encoded_file” from ASCII to binary.
    • By default the binary output file is created with the name “decode_label”.
    • -p : Send the binary version of the uuencoded file to standard output.
• Advanced File Processing
  • Encoding and Decoding (diagram labels: -p, decode-label)
• Advanced File Processing
  • Encoding and Decoding
      $ uuencode power2 decode_label > power2_uuencoded
      $ ls -lt
      total 140
      -rw-r--r-- 1 sau users 13869 2012-10-18 04:13 power2_uuencoded
      -rwxr-xr-x 1 sau users 10042 2012-10-17 16:56 power2
      $ head power2_uuencoded
      begin 644 decode_label
      M9&EF9B`M3F%U<B!D7S$O;&5C.5]F:6QE(&1?,B]L96,Y7V9I;&4*+2TM(&1?
      M,2]L96,Y7V9I;&4),C`Q-"TQ,"TP.2`Q.3HS,CHP."XR,3`R-3,X-3,@+3`W
      . . .
      $ uudecode power2_uuencoded
      $ ls
      decode_label  power2  power2_uuencoded
      $ ./decode_label
      This program takes x and y values from stdin and displays x^y . . .
• Advanced File Processing
  • File Encryption and Decryption
    • Encryption is a process by which a file is converted to an encrypted form, completely different from its original version.
      • Prevents hackers from easily understanding text sent over a LAN or WiFi.
      • The original file remains intact and must be explicitly deleted.
    • Decryption is the reverse process, by which a file is converted from an encrypted form back to its original form.
  • mcrypt [ option ]
    • -z : Use gzip to compress files before encryption. If specified at decryption time, it will decompress these files.
    • -d : Decrypt.
    • -k : Enter the keyword on the command line, instead of being prompted.
  • mdecrypt [ option ]
• Advanced File Processing
  • File Encryption and Decryption (diagram labels: passphrase for encryption, passphrase for decryption, original file must be explicitly removed)
• Advanced File Processing
  • File Encryption and Decryption
      [student1]$ mcrypt -z students
      Enter the passphrase (maximum of 512 characters)
      Please use a combination of upper and lower case letters and numbers.
      Enter passphrase:
      Enter passphrase:
      File students was encrypted.
      [student1]$ ls
      students  students.gz.nc
      [student1]$ mdecrypt -z students.gz.nc
      Enter passphrase:
      File students.gz.nc was decrypted.
      [student1]$ ls
      students  students.gz  students.gz.nc
      [student1]$ diff students students.gz
      [student1]$