Advanced Perl For Bioinformatics

Part 1 (2/23/06, 1-4pm)
• Module structure
• Module path
• Module export
• Object oriented programming

Part 2 (2/24/06, 1-4pm)
• Bioperl modules
• Sequence access
• Sequence manipulation
• Parsing BLAST records
Module and main program

Hello1.pm:
    package Hello1;
    sub greet {
        return "Hello, World!";
    }
    1;

test1.pl:
    #!/usr/bin/perl
    use Hello1;
    print Hello1::greet();
Why use a module?
• It can be reused by different programs.
• It keeps your code well organized.
Module structure

    package Hello1;                  # Declare a package; the file must be saved as Hello1.pm
    sub greet {                      # Contents of the package: functions and variables
        return "Hello, World!\n";
    }
    1;                               # Return a true value at the end of the file
Path to module
• Perl looks for modules in the directories listed in @INC. To print them: perl -e 'print "$_\n" for @INC'
• If your module is placed under one of the directories in @INC, refer to it by its path relative to that directory. E.g. if @INC contains /usr/my/lib, and
  • your Mod.pm is /usr/my/lib/Mod.pm, then you say: use Mod;
  • your Mod.pm is /usr/my/lib/Mymod/Seq/Mod.pm, then you say: use Mymod::Seq::Mod;
• If your module is not placed under any directory in @INC, e.g. /some/dir/Mod.pm, then:
    use lib "/some/dir";    # adds the path to the beginning of @INC
    use Mod;
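The lookup rule above can be tried in a single script. This is a minimal sketch, not part of the original slides: it writes a throwaway Mymod/Seq/Mod.pm into a temporary directory (the module body and its greeting are made up for the demo), adds that directory to @INC at run time, which is roughly what use lib does at compile time, and then loads the module by name.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Path qw(make_path);
use File::Spec;

# Hypothetical layout, created only for this demo: <tmpdir>/Mymod/Seq/Mod.pm
my $libdir = tempdir(CLEANUP => 1);
my $moddir = File::Spec->catdir($libdir, 'Mymod', 'Seq');
make_path($moddir);

my $modfile = File::Spec->catfile($moddir, 'Mod.pm');
open(my $fh, '>', $modfile) or die "Cannot write $modfile: $!";
print {$fh} <<'END_MODULE';
package Mymod::Seq::Mod;
sub greet { return "Hello from Mod\n"; }
1;
END_MODULE
close($fh) or die "Cannot close $modfile: $!";

# Run-time counterpart of: use lib "/some/dir";
unshift @INC, $libdir;

# Perl turns Mymod::Seq::Mod into Mymod/Seq/Mod.pm and tries each @INC entry in order.
require Mymod::Seq::Mod;
print Mymod::Seq::Mod::greet();

# %INC records the file that was actually loaded.
print "Loaded from: $INC{'Mymod/Seq/Mod.pm'}\n";
```

In a real program the module would already exist on disk, and the two lines "use lib" and "use Mymod::Seq::Mod;" would replace everything above the call to greet().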
Variable scope in module
• my $var --- lexical variable; accessible only inside the module file
• our $var --- package variable; accessible from outside as $Package::var
• $var (undeclared) --- same as "our $var"; allowed only without "use strict"
• use strict; --- This forces all variables to be declared with 'my' or 'our'.

Hello2.pm:
    package Hello2;
    use strict;
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!\n";
    sub greet {
        return $str;
    }
    1;

test2.pl:
    #!/usr/bin/perl
    use Hello2;
    print "var1= $Hello2::var1\n";
    print "var2= $Hello2::var2\n";    # prints an empty value: $var2 is a 'my' variable
    print Hello2::greet();
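The difference between 'my' and 'our' can also be checked in one file. The following is a sketch under one assumption: the package is written inline inside a block instead of in its own Hello2.pm, so the block plays the role of the module file for scoping purposes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Hello2;
    our $var1 = 1;                  # package variable, stored as $Hello2::var1
    my  $var2 = 3;                  # lexical, visible only inside this block
    my  $str  = "Hello World!\n";   # lexical, but greet() below can still see it
    sub greet { return $str; }
}

{
    no warnings 'once';             # $Hello2::var2 is mentioned once on purpose
    print "var1 = $Hello2::var1\n";
    print "var2 is ",
          (defined $Hello2::var2 ? "defined" : "undefined"),
          " from outside\n";
}
print Hello2::greet();
```

The lexical $str stays reachable through greet(), which is the usual way to expose private module data: through a function rather than through the variable itself.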
Export
Export functions and variables so that they can be accessed without a package qualifier.

Hello3.pm:
    package Hello3;
    use strict;
    require Exporter;
    our @ISA = ("Exporter");
    our @EXPORT_OK = qw(greet);
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!\n";
    sub greet {
        return $str;
    }
    1;

test3.pl:
    #!/usr/bin/perl
    use Hello3 qw(greet);
    print "var1= $Hello3::var1\n";
    print "var2= $Hello3::var2\n";    # prints an empty value: $var2 is a 'my' variable
    print greet();
Hello3.pm

    package Hello3;
    use strict;
    use Exporter;                    # Exporter.pm provides the export functionality
    our @ISA = ("Exporter");         # Inherit import() from Exporter rather than writing our own
    our @EXPORT_OK = qw(greet);      # Export this subroutine when another program requests it
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!\n";
    sub greet {
        return $str;
    }
    1;
test3.pl

    #!/usr/bin/perl
    use Hello3 qw(greet);            # Request "greet"
    print "var1= $Hello3::var1\n";
    print "var2= $Hello3::var2\n";
    print greet();
Hello4.pm

    package Hello4;
    use strict;
    use Exporter;
    our @ISA = ("Exporter");
    our @EXPORT_OK = qw(greet);
    our @EXPORT = qw(greet2);        # Export this automatically
    our $var1 = 1;
    my $var2 = 3;
    my $str = "Hello World!";
    sub greet {
        return $str;
    }
    sub greet2 {
        return "Hi.\n";
    }
    1;
test4.pl

    #!/usr/bin/perl
    use Hello4 qw(greet);            # Request "greet"
    use Hello4;                      # This automatically imports whatever is in @EXPORT
    print "var1= $Hello4::var1\n";
    print "var2= $Hello4::var2\n";
    print greet();
    print greet2();
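test4.pl needs both "use" lines because an explicit import list replaces the default list: "use Hello4 qw(greet);" on its own does not bring in greet2. The sketch below illustrates this in one file. It assumes Hello4 is defined inline in a BEGIN block and marked as already loaded in %INC (a single-file trick, not how the slides lay out the module), and DefaultUser and RequestUser are hypothetical package names standing in for two different programs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

BEGIN {
    package Hello4;
    require Exporter;
    our @ISA       = ('Exporter');
    our @EXPORT    = qw(greet2);     # exported by default
    our @EXPORT_OK = qw(greet);      # exported only on request
    sub greet  { return "Hello World!\n"; }
    sub greet2 { return "Hi.\n"; }
    # Single-file trick: mark Hello4 as loaded so "use Hello4" skips the @INC search.
    $INC{'Hello4.pm'} = __FILE__;
}

{ package DefaultUser; use Hello4; }             # gets everything in @EXPORT
{ package RequestUser; use Hello4 qw(greet); }   # gets only what it asked for

# Report which names ended up in each importing package.
for my $pkg (qw(DefaultUser RequestUser)) {
    for my $name (qw(greet greet2)) {
        printf "%-12s %-7s %s\n", $pkg, $name,
            $pkg->can($name) ? "imported" : "not imported";
    }
}
```

Many module authors prefer @EXPORT_OK over @EXPORT for this reason: the importing program states exactly which names it pulls into its namespace.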
Exercise 1
• Create a module with functions that calculate the area and the boundary (perimeter) of a rectangle. Supply the width and length in your main program and pass them to the module's functions. Practice using both @EXPORT and @EXPORT_OK.
Object Oriented Programming
• A package (or module) is a class.
• A reference to a hash, once blessed into the class, becomes an object of that class.
• The object's member variables are stored in the hash.
• The subroutines of the package are the object's member functions (methods).
Hello5.pm:
    package Hello5;
    use strict;
    sub new {
        my $class = shift;
        my $ref = {};
        bless($ref, $class);
        return $ref;
    }
    sub greet {
        my ($ref, $str) = @_;
        return $str;
    }
    sub greet2 {
        return "Hi\n";
    }
    1;

test5.pl:
    #!/usr/local/bin/perl
    use Hello5;
    $h = new Hello5;
    print $h->greet("Good morning\n");
    print $h->greet2;
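It may help to see what the arrow syntax actually passes to a method. The sketch below assumes the class is written inline in a block rather than in Hello5.pm, and it uses Hello5->new, which does the same job as the "new Hello5" form in test5.pl. For a class without inheritance, $h->greet(...) amounts to calling Hello5::greet with the object as the first argument.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Hello5;
    sub new {
        my $class = shift;             # "Hello5" when called as Hello5->new
        my $self  = {};                # anonymous hash that will hold member variables
        return bless($self, $class);   # tie the hash reference to the class
    }
    sub greet {
        my ($self, $str) = @_;         # the object arrives as the first argument
        return $str;
    }
}

my $h = Hello5->new;
print ref($h), "\n";                          # name of the class the hash was blessed into
print $h->greet("Good morning\n");            # method-call syntax
print Hello5::greet($h, "Good morning\n");    # same call with the object passed explicitly
```

The explicit form is shown only to expose the mechanism; the arrow form is the one to use, since it also works when the method is inherited from a parent class.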
Rectangle.pm:
    package Rectangle;
    sub new {
        my ($class, $width, $length) = @_;
        my $hashref = { W => $width, L => $length };
        bless($hashref, $class);
        return $hashref;
    }
    sub getArea {
        my $self = shift;
        return $self->{W} * $self->{L};
    }
    sub getBoundary {
        my $self = shift;
        return 2 * ($self->{W} + $self->{L});
    }
    1;

recttest.pl:
    #!/usr/bin/perl
    use Rectangle;
    my $w = 3;
    my $l = 4;
    my $rect = new Rectangle($w, $l);
    my $area = $rect->getArea();
    print "Area = $area\n";
    my $boundary = $rect->getBoundary();
    print "Boundary=$boundary\n";
Exercise 2 • Create a class called “Cube”. It should have methods to calculate volume based on the cube’s width, length and height.
More Practice with Classes
• Sequence.pm: clean, wrap, reverse complement, shuffle, GC content, translate
• Main program: seq.pl
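Sequence.pm itself is not reproduced in these slides. As a rough idea of what such a class could look like, here is a hypothetical sketch covering three of the listed operations (clean, reverse complement, GC content). The method names, the hash key, and the decision to clean the input inside the constructor are all assumptions made for illustration, and the class is written inline so the example runs as one file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

{
    package Sequence;                  # hypothetical stand-in for Sequence.pm

    sub new {
        my ($class, $raw) = @_;
        (my $clean = $raw) =~ s/[^A-Za-z]//g;    # clean: drop whitespace, digits, etc.
        return bless({ seq => uc $clean }, $class);
    }

    sub seq { my $self = shift; return $self->{seq}; }

    sub revcom {
        my $self = shift;
        my $rc = reverse $self->{seq};           # scalar context: reverses the string
        $rc =~ tr/ACGT/TGCA/;                    # complement each base
        return $rc;
    }

    sub gc_content {
        my $self = shift;
        my $len = length $self->{seq};
        return 0 unless $len;
        my $gc = ($self->{seq} =~ tr/GC//);      # tr with empty replacement only counts
        return $gc / $len;
    }
}

my $s = Sequence->new("atg cgt\naac");
print $s->seq, "\n";                     # ATGCGTAAC
print $s->revcom, "\n";                  # GTTACGCAT
printf "GC = %.2f\n", $s->gc_content;    # GC = 0.44
```

The remaining operations (wrap, shuffle, translate) would follow the same pattern: each method takes $self, reads $self->{seq}, and returns a new string.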
Bioperl
• A collection of Perl modules for bioinformatics.
• Facilitates sequence retrieval, sequence manipulation, and parsing the output of programs such as BLAST and ClustalW.
• See http://bioperl.org for download and documentation.
• Each .pm file contains documentation on how to use the module.
• Usually installed under: /usr/local/lib/perl5/site_perl/5.8.0/Bio
Some Bioperl modules
• Bio::Perl, Bio::DB -- access sequence databases. E.g. seqret.pl
• Bio::Seq -- a sequence and its annotation. E.g. seqio.pl
• Bio::SeqIO -- read sequences from a file and write them to a file. E.g. seqio.pl
• Bio::Tools::SeqStats -- molecular weight, etc. E.g. seqmw.pl
• Bio::SearchIO -- parse BLAST results.
Accessing Remote Databases

    use Bio::Perl;
    $seqobj = get_sequence('swiss', "ROA1_HUMAN");
    write_sequence("roa1.fasta", 'fasta', $seqobj);

Databases can be: swiss, genbank, genpept, refseq, etc.
Bio::Seq
• Contains a sequence and its annotation.
• Methods: display_id, desc, seq, revcom, translate, etc. The revcom and translate methods each return a new Bio::Seq object.

One way to create a Bio::Seq object:
    $seq = Bio::Seq->new(-seq              => 'actgtggcgtcaact',
                         -desc             => 'Sample Bio::Seq object',
                         -display_id       => 'something',
                         -accession_number => 'accnum',
                         -alphabet         => 'dna' );

Another way: read the sequence from a file via a Bio::SeqIO object.
Parsing BLAST results

    my $in = new Bio::SearchIO(-format => 'blast',
                               -file   => 'report.bls');
    while( my $result = $in->next_result ) {
        while( my $hit = $result->next_hit ) {
            while( my $hsp = $hit->next_hsp ) {
                if( $hsp->length('total') > 100 ) {
                    if ( $hsp->percent_identity >= 75 ) {
                        print "Hit= ", $hit->name,
                              ", Length=", $hsp->length('total'),
                              ", Percent_id=", $hsp->percent_identity, "\n";
                    }
                }
            }
        }
    }

• Example: blastparse.pl
• Module: Bio::SearchIO