
Perl Syntax: substitution s// and character replacement tr//

Understand substitution with s/// and character replacement with tr// in Perl, useful for modifying data. Find patterns, replace, match globally, and handle case. See examples with detailed explanations.

reilly

Presentation Transcript


  1. Perl Syntax: substitution s// and character replacement tr//

  2. Substitution
     Pattern matching is useful for finding or indexing items, but to modify the data, substitution is required. Substitution searches a string for a PATTERN and, if found, replaces it with REPLACEMENT.
         $line =~ s/PATTERN/REPLACEMENT/;
     Returns the number of substitutions made: at most 1 without the g flag, and a false value (the empty string) if the pattern was not found.
         $result = $line =~ s/PATTERN/REPLACEMENT/;
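The return value is easiest to see by running a few substitutions. The following is a minimal sketch; the sample sequence and the variable names ($first, $all, $none) are illustrative assumptions, not part of the slides.

```perl
use strict;
use warnings;

my $line = "ACGTACGT";

# Without /g only the first match is replaced, so the count is 1.
my $first = ($line =~ s/A/a/);

# With /g every match is replaced and the count reflects that.
my $all = ($line =~ s/C/c/g);

# No match: s/// returns a false value and leaves the string alone.
my $none = ($line =~ s/X/x/);

print "first=$first all=$all line=$line\n";
print "nothing replaced for X\n" unless $none;
```

Because a failed substitution returns false, the result can be used directly as the condition of an if statement.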

  3. s///
         $_ = "one two";
         s/^([^ ]+) +([^ ]+)/$2 $1/;   #
         $_ = "green scaly dinosaur";
         s/(\w+) (\w+)/$2, $1/;        #
         s/^/huge, /;                  #
         s/,.*een//;                   #
         s/green/red/;                 #
         s/\w+$/($`!)$&/;              #
         s/\s+(!\W+)/$1 /;             #
         s/huge/gigantic/;             #
         $_ = "fred barney";
         if (s/fred/george/) {         # s/// returns true if successful
             print "replaced fred with george\n";
         }

  4. s///
         $_ = "one two";
         s/^([^ ]+) +([^ ]+)/$2 $1/;   # swap first 2 words: "two one"
         $_ = "green scaly dinosaur";
         s/(\w+) (\w+)/$2, $1/;        # "scaly, green dinosaur"
         s/^/huge, /;                  # "huge, scaly, green dinosaur"
         s/,.*een//;                   # "huge dinosaur"
         s/green/red/;                 # fails: no "green" is left, string unchanged
         s/\w+$/($`!)$&/;              # "huge (huge !)dinosaur"
                                       # the next line matches the space, "!" and ")"
         s/\s+(!\W+)/$1 /;             # "huge (huge!) dinosaur": " !)" becomes "!) "
         s/huge/gigantic/;             # "gigantic (huge!) dinosaur"
         $_ = "fred barney";
         if (s/fred/george/) {         # s/// returns true if successful
             print "replaced fred with george\n";
         }
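The slide relies on the special variables $` (text before the match) and $& (the matched text). On older Perls their use could slow down every regex in the program (the perlvar documentation describes this for releases before 5.20), so a hedged alternative is the /p flag with ${^PREMATCH} and ${^MATCH}, available since Perl 5.10. This sketch repeats only the one substitution that used those variables and assumes the string is already "huge dinosaur".

```perl
use strict;
use warnings;

$_ = "huge dinosaur";

# Same idea as the slide's substitution that uses the prematch and match
# variables, written with /p and the ${^PREMATCH} / ${^MATCH} names.
s/\w+$/(${^PREMATCH}!)${^MATCH}/p;

print "$_\n";   # huge (huge !)dinosaur
```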

  5. Character Replacement
     • A similar operation to substitution is character replacement.
     • tr/CHARACTER SEARCH LIST/REPLACEMENT LIST/
     • -- note that this is not pattern matching, but character matching
     • -- returns the number of characters replaced
     • $line =~ tr/a-z/A-Z/;           # uppercase every lowercase letter
     • $count_CG = $line =~ tr/CG/CG/; # count C and G without changing $line
     • $line =~ tr/ACGT/TGCA/;         # complement a DNA sequence in one pass
     • $line =~ s/A/T/g; # be CAREFUL
     • $line =~ s/C/G/g; # these four substitutions leave only A and C in the sequence,
     • $line =~ s/G/C/g; # because the later ones also rewrite the output of the earlier ones
     • $line =~ s/T/A/g;
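The warning about chained substitutions can be checked directly. This is a small sketch on a made-up sequence; $by_tr and $by_sub are illustrative names.

```perl
use strict;
use warnings;

my $seq = "AACCGGTT";

# tr/// maps every character in a single pass.
(my $by_tr = $seq) =~ tr/ACGT/TGCA/;

# Chained s///g passes: later passes rewrite the output of earlier ones.
my $by_sub = $seq;
$by_sub =~ s/A/T/g;   # TTCCGGTT
$by_sub =~ s/C/G/g;   # TTGGGGTT
$by_sub =~ s/G/C/g;   # TTCCCCTT
$by_sub =~ s/T/A/g;   # AACCCCAA

print "tr:   $by_tr\n";    # TTGGCCAA
print "s///: $by_sub\n";   # AACCCCAA
```

Only the tr/// version yields the complement; the chained version ends up containing nothing but A and C.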

  6. Character Replacement Flags
     tr/SEARCH_LIST/REPLACEMENT_LIST/
     /c -- complement the SEARCH_LIST
        -- the effective search list is all characters NOT in SEARCH_LIST
     tr///d -- delete found but unreplaced characters
     tr///s -- squash duplicate replaced characters
            -- sequences (or runs) of characters replaced by the same character are squashed down to a single character
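A short sketch of the three flags in action. The input strings are invented for illustration, and the empty replacement list is used deliberately: without /d it leaves the string unchanged and only counts.

```perl
use strict;
use warnings;

my $raw = "AC-GT NN\n";

# /c with an empty replacement list only counts characters NOT in ACGT.
my $non_acgt = ($raw =~ tr/ACGT//c);

# /c plus /d deletes everything outside the search list.
(my $clean = $raw) =~ tr/ACGT//cd;

# /s squashes each run of a repeated listed character to one character.
(my $squashed = "one   two    three") =~ tr/ //s;

print "non-ACGT characters: $non_acgt\n";   # 5
print "cleaned: $clean\n";                  # ACGT
print "squashed: $squashed\n";              # one two three
```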

  7. Character Replacement
         $count_CG = 0;
         $count_AT = 0;
         while ($line = <IN>) {
             $count_CG += $line =~ tr/CG/CG/;   # += accumulates the count over all lines
             $count_AT += $line =~ tr/AT/AT/;
         }
         $total = $count_CG + $count_AT;
         $percent_CG = 100 * ($count_CG/$total);
         print "The sequence was $percent_CG% CG-rich.\n";
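A self-contained variant of the counting loop, as a sketch only: the @lines array is hypothetical sample data standing in for the IN filehandle, and the guard against an empty input is an addition not present on the slide.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for lines read from IN.
my @lines = ("ACGT\n", "CCGG\n", "ATAT\n");

my ($count_CG, $count_AT) = (0, 0);
for my $line (@lines) {
    $count_CG += ($line =~ tr/CG//);   # empty replacement list: count only
    $count_AT += ($line =~ tr/AT//);
}

my $total = $count_CG + $count_AT;
# Avoid dividing by zero when no nucleotides were read.
my $percent_CG = $total ? 100 * $count_CG / $total : 0;

printf "The sequence was %.1f%% CG-rich.\n", $percent_CG;
```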

  8. Pattern Matching Flags
     For m// and s/// (not tr///):
     g   match globally, find all occurrences
     i   ignore case
     m   treat the string as multiple lines: ^ and $ also match at embedded newlines
     s   treat the string as a single line: . also matches a newline
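The effect of each flag can be seen on a two-line string. This is an illustrative sketch; the sample text and variable names are assumptions, and the `() =` assignment is the usual idiom for counting matches in list context.

```perl
use strict;
use warnings;

my $text = "atg\nATG\n";

# g + i: count every match, ignoring case.
my $count_gi = () = $text =~ /atg/gi;

# Without /m, ^ and $ anchor only at the ends of the whole string
# ($ is also allowed just before a final newline).
my $count_plain = () = $text =~ /^ATG$/gi;

# With /m, ^ and $ also match around each embedded newline.
my $count_m = () = $text =~ /^ATG$/gim;

# . does not match a newline unless /s is given.
my $dot_plain  = ($text =~ /atg.ATG/)  ? 1 : 0;
my $dot_single = ($text =~ /atg.ATG/s) ? 1 : 0;

print "gi=$count_gi plain=$count_plain m=$count_m\n";
print "dot without /s=$dot_plain, with /s=$dot_single\n";
```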

  9. Examples
         $_ = " AtttcgAtggctaaaAtttgctt";
         s/A/a/g;     # " atttcgatggctaaaatttgctt"
         s/^\s+//;    # strip leading white space
         s/\s+$//;    # strip trailing white space
     Binding operator
         $string = " opps";
         $string =~ s/^\s+//;   # "opps"
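The two stripping substitutions can also be combined into one with alternation and the g flag. A minimal sketch on an invented string:

```perl
use strict;
use warnings;

my $string = "   AtttcgAtgg  \n";

# Strip leading OR trailing white space; /g lets both alternatives fire.
my $removed = ($string =~ s/^\s+|\s+$//g);

print "[$string] after $removed substitutions\n";   # [AtttcgAtgg] after 2 substitutions
```

Two separate substitutions, as on the slide, are often considered easier to read; the combined form is shown only as an alternative.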

  10. Upper/Lower Case
      \U -- everything that follows is upper case (until \E or the end of the replacement)
      \L -- everything that follows is lower case (until \E or the end of the replacement)
      \u -- next single character upper case
      \l -- next single character lower case
          $_ = "I saw Barney with Fred";
          s/(fred|barney)/\U$1/gi;   # I saw BARNEY with FRED
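The escapes can be combined. The sketch below uses made-up strings: \u\L capitalizes the first letter and lowercases the rest, and \E ends the effect of \U partway through the replacement.

```perl
use strict;
use warnings;

# \u\L: lowercase the whole word, then uppercase its first character.
my $names = "fRED and BARNEY";
$names =~ s/(\w+)/\u\L$1/g;

# \E switches \U off, so only the first $1 is uppercased.
my $shout = "I saw Fred";
$shout =~ s/(fred)/\U$1\E, not \l$1/i;

print "$names\n";   # Fred And Barney
print "$shout\n";   # I saw FRED, not fred
```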

  11. #!/usr/bin/perl
      # newNaive.pl
      #
      # Here is an example that shows how the "match"
      # returns a "true" -- so that on the "IF" control structure,
      # execution proceeds into the block
      #
      # The $` is a special variable that "remembers" all of the
      # string that was passed over by the pattern matching engine.
      #
      # Using the length() function, the position of the match is determined,
      # and printed
      #
      $_ = "CCCATGATG";
      if (/ATG/) {
          print "Found sequence at position ".length($`)."\n";
      }
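The script above reports only the first ATG. As a sketch of one way to list every occurrence, the loop below uses the g flag in scalar context and reads the start offset from the built-in @- array ($-[0] is the offset where the last successful match began); the @positions name is an illustrative assumption.

```perl
use strict;
use warnings;

my $seq = "CCCATGATG";
my @positions;

# In scalar context, /g resumes after the previous match on each iteration.
while ($seq =~ /ATG/g) {
    push @positions, $-[0];   # zero-based offset of this match
}

print "Found ATG at positions: @positions\n";   # Found ATG at positions: 3 6
```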

  12. #!/usr/bin/perl
      # newNaive2.pl
      #
      # Here is an example that shows how the "match"
      # returns a "true" -- so that on the "IF" control structure,
      # execution proceeds into the block
      #
      # The $` is a special variable that "remembers" all of the
      # string that was passed over by the pattern matching engine.
      #
      # Using the length() function, the position of the match is determined,
      # and printed
      #
      $_ = "CCCATAATTTAGTTTT";
      if (/ATG/) {
          print "Found start codon $& at position ".length($`)."\n";
      } elsif (m/TAG/) {
          print "found stop codon $& at position ".length($`)."\n";
          print "There are ".scalar(length($&)+length($'))." nucleotides after the stop, including the stop codon\n";
      } elsif (m/TAA/) {
          print "found stop codon $& at position ".length($`)."\n";
          print "There are ".scalar(length($&)+length($'))." nucleotides after the stop, including the stop codon\n";
      } elsif (m/TGA/) {
          print "found stop codon $& at position ".length($`)."\n";
          print "There are ".scalar(length($&)+length($'))." nucleotides after the stop, including the stop codon\n";
      } else {
          print "Start/stop codons not found in $_\n";
      }
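Note that the if/elsif chain reports whichever codon is tested first, not the one that occurs first in the sequence: for this input it reports TAG at position 9 even though TAA occurs at position 4. If the leftmost stop codon is wanted, one possible approach is a single pattern with alternation. This is a sketch under that assumption; it ignores reading frame, just as the slide does, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $seq = "CCCATAATTTAGTTTT";
my ($codon, $pos, $remaining);

# The regex engine tries each start position from the left, so the
# alternation finds the leftmost of the three stop codons.
if ($seq =~ /(TAG|TAA|TGA)/) {
    $codon     = $1;
    $pos       = $-[0];                  # offset where the match began
    $remaining = length($seq) - $pos;    # includes the stop codon itself
    print "found stop codon $codon at position $pos\n";
    print "There are $remaining nucleotides after the stop, including the stop codon\n";
} else {
    print "Stop codons not found in $seq\n";
}
```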

  13. #!/usr/bin/perl
      # sub.pl
      #
      # Example where I match "with" and " " and one or more
      # word characters
      # Then I replace all of that "with word" with "against 'word'"
      #
      # The $1 corresponds to the first set of parentheses.
      #
      #
      $_="He's out bowling with Fred tonight";
      s/with (\w+)/against $1/;
      print "$_\n";

  14. #!/usr/bin/perl -w
      # bind3.pl
      #
      # Here's an example that takes a unix path ($path)
      # and copies it to another variable ($filename)
      # Then, we search for one or more of any character {.+}
      # followed by a "/" character -- but we have to use the
      # escape metacharacter "\" so that we don't end the match {\/}.
      # Finally, we are looking for one or more non-white spaces {(\S+)}
      # at the end -- to pull off the last file name "FOUND"
      #
      #
      #
      $path = "/home/tabraun/test/bob/FOUND";
      $filename = $path;
      $filename =~ s/.+\/(\S+)/$1/;
      print "$filename\n";
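Two hedged alternatives to escaping the slash: s/// accepts other delimiters such as braces, so "/" needs no backslash, and the core File::Basename module provides basename() for this task. The sketch reuses the slide's path; $by_regex and $by_module are illustrative names.

```perl
use strict;
use warnings;
use File::Basename qw(basename);

my $path = "/home/tabraun/test/bob/FOUND";

# Brace delimiters: the "/" inside the pattern no longer needs escaping.
(my $by_regex = $path) =~ s{.+/(\S+)}{$1};

# basename() from the core File::Basename module does the same job.
my $by_module = basename($path);

print "$by_regex\n";    # FOUND
print "$by_module\n";   # FOUND
```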

  15. #!/usr/bin/perl
      # randomSeq.pl
      #
      # Don't get too uptight over the srand() line -- it is just setting
      # a "seed" for the rand() function with a value that approximates
      # a random number. If you must know, it takes a process ID ($$),
      # shifts its bits left 15 places, then adds the process ID to the shifted
      # value, then does a bit-wise XOR (^) with the current time().
      #
      print "Enter length of sequence to generate:";
      $length = <STDIN>;
      chomp($length);                     # remove the trailing newline
      srand(time() ^ ($$ + ($$ << 15)) );
      $seq = "";
      while ($length > 0) {               # stay in loop until have generated enough sequence
          $rand = int rand(4);            # Integer number between (0-3) inclusive
          $rand =~ tr/0123/ACTG/;
          $length = $length-1;            # decrease loop counter
          $seq = $seq . $rand;            # keep the nucleotide I just created
      }
      # Since I am out of the loop, I must be done
      print "$seq\n";
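The same idea can be wrapped in a function so it is easy to reuse and check. This is a sketch, not the author's script: random_seq is a hypothetical name, it picks letters by array index instead of tr///, and it relies on Perl seeding rand() automatically when srand() has not been called.

```perl
use strict;
use warnings;

# Hypothetical helper: return a random nucleotide string of the given length.
sub random_seq {
    my ($length) = @_;
    my @nt = qw(A C G T);
    # rand @nt yields a number in [0, 4); int truncates it to an index 0..3.
    return join '', map { $nt[ int rand @nt ] } 1 .. $length;
}

my $seq = random_seq(20);
print "$seq\n";
```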

  16. End
