Programming and Perl for Bioinformatics Part I
A Taste of Perl: print a message
• perltaste.pl: Greet the entire world.
    #!/usr/bin/perl               - command interpretation header
    #greet the entire world       - a comment
    $x = 6e9;                     - variable assignment statement
    print "Hello world!\n";       - function call (output statement)
    print "All $x of you!\n";     - function call (output statement)
Basic Syntax and Data Types
• Whitespace between tokens doesn’t matter to Perl; one can write all statements on one line
• All Perl statements end in a semicolon ';' just like C
• Comments begin with '#' and Perl ignores everything after the # until the end of the line
    Example: #this is a comment
• Perl has three basic data types:
    • scalar
    • array (list)
    • associative array (hash)
Scalars
• Scalar variables begin with '$' followed by an identifier
    Example: $this_is_a_scalar;
• An identifier is composed of upper- or lower-case letters, digits, and underscore '_'; a name you define may not start with a digit. Identifiers are case-sensitive (like all of Perl)
    $progname = "first_perl";
    $numOfStudents = 4;
• '=' is the assignment operator: it sets the content of $progname to the string "first_perl" and the content of $numOfStudents to the integer 4
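Scalar Values
• Numerical values
    • integer: 5, "3", 0, -307 (the string "3" is converted to the number 3 when used numerically)
    • floating point: 6.2e9, -4022.33
    • hexadecimal/octal: 0xd4f, 0477
    • binary: 0b011011
NOTE: Perl keeps numbers internally as native integers or as double-precision floating-point values and converts between the two automatically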
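A minimal sketch of how the literal forms listed above evaluate. The variable names are illustrative and do not come from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Every literal form below becomes an ordinary number once parsed.
my $hex    = 0xd4f;      # hexadecimal literal
my $octal  = 0477;       # octal literal (note the leading 0)
my $binary = 0b011011;   # binary literal
my $float  = 6.2e9;      # scientific notation

print "0xd4f    is $hex\n";
print "0477     is $octal\n";
print "0b011011 is $binary\n";
print "6.2e9    is $float\n";

# A numeric string is converted to a number when used in arithmetic.
my $sum = "3" + 5;
print "\"3\" + 5 is $sum\n";
```

A practical consequence: a leading zero changes the meaning of a literal, so 0477 and 477 are different numbers.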
Do the Math
• Arithmetic operators work pretty much as you would expect:
    4+7    6*4    43-27    256/12    2/(3-5)
• Example
    #!/usr/bin/perl
    print "4+5\n";
    print 4+5 , "\n";
    print "4+5=" , 4+5 , "\n";
    $myNumber = 88;
• Note: use commas to separate multiple items in a print statement
• What will be the output?
    4+5
    9
    4+5=9
Scalar Values
• String values
• Example:
    $day = "Monday";
    print "Happy Monday!\n";
    print "Happy $day!\n";
    print 'Happy Monday!\n';
    print 'Happy $day!\n';
• Double-quoted: interpolates (replaces a variable name or control character with its value)
• Single-quoted: no interpolation done (as-is)
• What will be the output?
    Happy Monday!<newline>
    Happy Monday!<newline>
    Happy Monday!\n
    Happy $day!\n
  (the last two are printed literally, backslash and n included, so they run together on one output line)
String Manipulation
Concatenation
    $dna1 = "ACTGCGTAGC";
    $dna2 = "CTTGCTAT";
• juxtapose in a string assignment or print statement
    $new_dna = "$dna1$dna2";
• Use the concatenation operator '.'
    $new_dna = $dna1 . $dna2;
Substring
    $dna = "ACTGCGTAGC";
    $exon1 = substr($dna,2,5);   # TGCGT
    (2 is the starting offset, counted from 0; 5 is the length of the substring)
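The two ways of concatenating and the substr call can be checked side by side. This is a small sketch; the names $by_interpolation, $by_operator and $tail are illustrative, and the negative offset is an extra use of substr not shown on the slide.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $dna1 = "ACTGCGTAGC";
my $dna2 = "CTTGCTAT";

# Both forms build the same string.
my $by_interpolation = "$dna1$dna2";
my $by_operator      = $dna1 . $dna2;
print "Same result\n" if $by_interpolation eq $by_operator;

# substr(STRING, OFFSET, LENGTH): offsets are counted from 0.
my $exon1 = substr($dna1, 2, 5);
print "exon1: $exon1\n";    # TGCGT

# A negative offset counts from the end; with no length, substr runs to the end.
my $tail = substr($dna1, -3);
print "tail:  $tail\n";     # AGC
```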
Substitution
DNA transcription: T → U
Substitution operator s/// :
    $dna = "GATTACATACACTGTTCA";
    $rna = $dna;
    $rna =~ s/T/U/g;   # "GAUUACAUACACUGUUCA"
• =~ is the binding operator: it tells Perl to apply the pattern match or substitution to the contents of $rna
• the trailing g ("global") replaces every match, not just the first one
Ex: Start with $dna = "gaTtACataCACTgttca"; and do the same as above. What will be the output?
Example
• transcribe.pl:
    $dna = "gaTtACataCACTgttca";
    $rna = $dna;
    $rna =~ s/T/U/g;
    print "DNA: $dna\n";
    print "RNA: $rna\n";
• Does it do what you expect? If not, why not?
• Patterns in substitution are case-sensitive! What can we do?
    • Convert all letters to upper/lower case (preferred when possible)
    • If we want to retain mixed case, use the transliteration/translation operator tr///
        $rna =~ tr/tT/uU/;   #replace all t by u, all T by U
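The two remedies can be compared on the same input. This is a sketch only; the names $rna_upper and $rna_mixed are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $dna = "gaTtACataCACTgttca";

# Remedy 1: normalise the case first, then substitute.
my $rna_upper = uc $dna;
$rna_upper =~ s/T/U/g;

# Remedy 2: keep the mixed case and transliterate both t and T.
my $rna_mixed = $dna;
$rna_mixed =~ tr/tT/uU/;

print "DNA:        $dna\n";
print "RNA (upper): $rna_upper\n";   # GAUUACAUACACUGUUCA
print "RNA (mixed): $rna_mixed\n";   # gaUuACauaCACUguuca
```

Which one to use depends on whether the case carries meaning in the data, for example lower-case letters marking masked regions.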
Case conversion
    $string = "acCGtGcaTGc";
Upper case:
    $dna = uc($string);   # "ACCGTGCATGC"
    or $dna = uc $string;
    or $dna = "\U$string";
Lower case:
    $dna = lc($string);   # "accgtgcatgc"
    or $dna = "\L$string";
Sentence case:
    $dna = ucfirst(lc($string));   # "Accgtgcatgc"
    or $dna = "\u\L$string";
    (ucfirst($string) on its own capitalizes only the first letter and leaves the rest unchanged: "AcCGtGcaTGc")
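A short sketch contrasting ucfirst alone with the sentence-case forms. The variable names are illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $string = "acCGtGcaTGc";

# ucfirst touches only the first character.
my $first_only = ucfirst($string);

# Lower-casing first gives true sentence case.
my $sentence = ucfirst(lc($string));

# The escape sequences \u and \L do the same inside a double-quoted string.
my $escaped = "\u\L$string";

print "ucfirst only:     $first_only\n";   # AcCGtGcaTGc
print "ucfirst(lc(...)): $sentence\n";     # Accgtgcatgc
print "\\u\\L form:        $escaped\n";      # Accgtgcatgc
```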
Reverse Complement
    5’-A C G T C T A G C . . . . G C A T-3’
    3’-T G C A G A T C G . . . . C G T A-5’
• Reverse: reverses a string
    $string = "ACGTCTAGC";
    $string = reverse($string);   # "CGATCTGCA"
• Complementation: use transliteration operator
    $string =~ tr/ACGT/TGCA/;
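The two steps combine naturally into one helper. The subroutine name reverse_complement is hypothetical, and handling lower-case bases in the tr/// list is an assumption beyond what the slide shows.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the reverse complement of a DNA string.
sub reverse_complement {
    my ($seq) = @_;
    # Assigning to a scalar puts reverse in scalar context,
    # so it reverses the characters of the string.
    my $revcomp = reverse $seq;
    # Swap each base for its complement, preserving case.
    $revcomp =~ tr/ACGTacgt/TGCAtgca/;
    return $revcomp;
}

my $result = reverse_complement("ACGTCTAGC");
print "$result\n";   # GCTAGACGT
```

One pitfall worth knowing: print reverse($string) calls reverse in list context, which reverses a one-element list and prints the string unchanged. Writing print scalar reverse($string) forces the string reversal.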
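More on String Manipulation (optional)
String length:
    length($dna)
Index:
    #index STR,SUBSTR,POSITION
    #returns the position of the first occurrence of SUBSTR in STR at or after POSITION, or -1 if there is none
    index($strand, $primer, 2)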
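A small sketch of length and index together. The sequence and primer values are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $strand = "ACTGCGTAGCTGCA";   # illustrative sequence
my $primer = "TGC";              # illustrative primer

my $len = length($strand);
print "Length: $len\n";

# First occurrence, searching from the start (positions count from 0).
my $first = index($strand, $primer);
print "First match at $first\n";

# Continue the search just past the first hit.
my $second = index($strand, $primer, $first + 1);
print "Second match at $second\n";

# index returns -1 when the substring is not present.
my $missing = index($strand, "GGGG");
print "GGGG found at $missing\n";
```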
Flow Control
Conditional Statements
• parts of code executed depending on truth value of a logical statement
• "truth" (logical) values in Perl:
    false = {0, 0.0, 0e0, "", "0", undef}; a false comparison returns ""
    true = anything else (including the strings "0.0" and "0e0"); a true comparison returns 1
    ($a, $b) = (75, 83);
    if ( $a < $b ) {
        $a = $b;
        print "Now a = b!\n";
    }
    if ( $a > $b ) { print "Yes, a > b!\n" }   # Compact
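The truth rules are easy to test directly. In this sketch the helper name report is hypothetical; it simply evaluates its argument in a boolean context.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Print and return "true" or "false" for a value, with a label for display.
sub report {
    my ($label, $value) = @_;
    my $verdict = $value ? "true" : "false";
    print "$label is $verdict\n";
    return $verdict;
}

report('the number 0',     0);
report('the number 0.0',   0.0);
report('the empty string', "");
report('the string "0"',   "0");
report('undef',            undef);

# These strings are not "0" or "", so they count as true.
report('the string "0.0"', "0.0");
report('the string "0e0"', "0e0");
report('a single space',   " ");
```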
if/else/elsif
• allows for multiple branching/outcomes
    $a = rand();
    if ( $a < 0.25 ) {
        print "A";
    } elsif ( $a < 0.50 ) {
        print "C";
    } elsif ( $a < 0.75 ) {
        print "G";
    } else {
        print "T";
    }
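Wrapping the branch in a subroutine makes it reusable, for example to build a random sequence. This is a sketch; the name base_for and the sequence length of 20 are arbitrary choices, not from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map a number in [0, 1) to one of the four bases, each with probability 1/4.
sub base_for {
    my ($r) = @_;
    if ( $r < 0.25 ) {
        return "A";
    } elsif ( $r < 0.50 ) {
        return "C";
    } elsif ( $r < 0.75 ) {
        return "G";
    } else {
        return "T";
    }
}

# Build a random 20-base sequence, one base per call to rand().
my $sequence = "";
for my $i ( 1 .. 20 ) {
    $sequence .= base_for( rand() );
}
print "$sequence\n";
```

Because each elsif is reached only when the earlier tests failed, the conditions need only the upper bound of each interval.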
Conditional Loops
    while ( statement ) { commands … }
• repeats commands until statement is no longer true
    do { commands } while ( statement );
• same as while, except commands are executed at least once
• NOTE the ';' after the while statement!!
Short-circuiting commands: next and last
• next;   #jumps to end, do next iteration
• last;   #jumps out of the loop completely
• next and last work in while, for and foreach loops, but not inside a do { } while block
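A sketch of next and last inside a while loop. The input string and the conventions that N marks an unknown base and * marks the end of usable data are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $seq   = "ACNGTN*ACGT";   # illustrative input
my $pos   = 0;
my $clean = "";

while ( $pos < length($seq) ) {
    my $base = substr( $seq, $pos, 1 );
    $pos++;    # advance before any next, or the loop would never move on

    if ( $base eq "*" ) {
        last;  # stop reading at the terminator
    }
    if ( $base eq "N" ) {
        next;  # skip unknown bases
    }
    $clean .= $base;
}

print "Kept: $clean\n";   # ACGT
```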
while
Example:
    while ($alive) {
        if ($needs_nutrients) {
            print "Cell needs nutrients\n";
        }
    }
Any problem?
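One possible answer: nothing inside the loop ever changes $alive, so if it starts out true the loop never ends. The sketch below shows one way to make the loop terminate; the $nutrients and $cycles counters are hypothetical additions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $alive     = 1;
my $nutrients = 3;   # hypothetical starting supply
my $cycles    = 0;

while ($alive) {
    if ( $nutrients > 0 ) {
        $nutrients--;
        $cycles++;
        print "Cell consumed a nutrient; $nutrients left\n";
    } else {
        print "Cell needs nutrients\n";
        $alive = 0;   # the loop condition now becomes false
    }
}

print "Survived $cycles cycles\n";
```

The general rule is that the body of a while loop must eventually change something the condition depends on.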
for and foreach loops
• Execute a code loop a specified number of times, or for a specified list of values
• for and foreach are identical: use whichever you want
Incremental loop ("C style"):
    for ( $i=0 ; $i < 50 ; $i++ ) {
        $x = $i*$i;
        print "$i squared is $x.\n";
    }
Loop over list ("foreach" loop):
    foreach $name ( "Billy", "Bob", "Edwina" ) {
        print "$name is my friend.\n";
    }
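Both loop styles applied to sequence data, as a sketch. Counting G and C bases and the list of codons are illustrative tasks chosen for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $dna = "ACTGCGTAGC";

# C-style loop: walk the string by position and count G and C bases.
my $gc = 0;
for ( my $i = 0 ; $i < length($dna) ; $i++ ) {
    my $base = substr( $dna, $i, 1 );
    if ( $base eq "G" or $base eq "C" ) {
        $gc++;
    }
}
print "GC count: $gc of ", length($dna), "\n";

# foreach loop: visit each value in a list.
my $seen = "";
foreach my $codon ( "ATG", "GCT", "TAA" ) {
    print "Codon: $codon\n";
    $seen .= $codon;
}
```

Declaring the loop variable with my keeps it local to the loop, which avoids clashes with variables of the same name elsewhere in the script.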
Basic Data Types • Perl has three basic data types: • scalar • array (list) • associative array (hash)