Bioperl modules
Object Oriented Programming in Perl (1)
• Defining a class
• A class is simply a package with subroutines that function as methods.
    #!/usr/local/bin/perl
    package Cat;
    sub new { … }
    sub meow { … }
Object Oriented Programming in Perl (2)
• Perl Object
• To instantiate an object from a class, call the class's “new” method.
    $new_object = ClassName->new();
• Using Method
• To use the methods of an object, use the “->” operator.
    $cat->meow();
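The two slides above can be combined into one small runnable sketch. The body of each method is elided in the slides, so the hash-based object and the "name" attribute below are assumptions made only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A class is a package; its subroutines act as methods.
package Cat;

# Constructor: bless a hash reference into the class.
# The 'name' attribute is hypothetical, not taken from the slides.
sub new {
    my ($class, %args) = @_;
    my $self = { name => $args{name} // 'Cat' };
    return bless $self, $class;
}

# Method: the object arrives as the first argument.
sub meow {
    my ($self) = @_;
    return "$self->{name} says meow";
}

package main;

my $cat   = Cat->new(name => 'Tom');   # instantiate an object
my $sound = $cat->meow();              # call a method with ->
print "$sound\n";
```

Calling the constructor as Cat->new avoids the older "new Cat" form, which Perl still accepts but which can be parsed ambiguously.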
Object Oriented Programming in Perl (3)
• Inheritance
• Declare a class array called @ISA.
• This array stores the name(s) of the parent class(es) of the new class.
    package NorthAmericanCat;
    @NorthAmericanCat::ISA = ("Cat");
    sub new { … }
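A minimal sketch of @ISA-based inheritance follows. Unlike the slide, the child class here does not define its own "new", so that the inherited constructor is visible; the "habitat" method is a hypothetical addition.

```perl
use strict;
use warnings;

package Cat;
sub new  { my ($class) = @_; return bless {}, $class; }
sub meow { return "meow"; }

package NorthAmericanCat;
our @ISA = ('Cat');                        # same array as @NorthAmericanCat::ISA
sub habitat { return "North America"; }    # hypothetical extra method

package main;

my $cat     = NorthAmericanCat->new();     # new() is found in Cat through @ISA
my $sound   = $cat->meow();                # meow() is inherited as well
my $habitat = $cat->habitat();             # defined only in the child class
my $is_cat  = $cat->isa('Cat') ? 1 : 0;

print ref($cat), ": $sound, $habitat, isa Cat = $is_cat\n";
```

In newer code the same relationship is usually written as "use parent -norequire, 'Cat';", which sets @ISA for you.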
Perl Modules A Perl module is a reusable package defined in a library file whose name is the same as the name of the package.
Names of Perl modules • Each Perl module has a unique name. • To minimize namespace collisions, Perl provides a hierarchical namespace for modules. • Components of a module name are separated by double colons (::). • For example: • Math::Complex • Math::Approx • String::BitCount • String::Approx
Module files • Each module is contained in a single file. • Module files are stored in a subdirectory hierarchy that parallels the module name hierarchy. • All module files have an extension of .pm.
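The mapping from module name to file path can be sketched in a few lines. The helper name module_to_path is hypothetical; Math::Complex is used because it ships with Perl.

```perl
use strict;
use warnings;

# Turn a module name into the relative path Perl looks for:
# "::" becomes "/" and ".pm" is appended.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;
    return "$path.pm";
}

my $path = module_to_path('Math::Complex');
print "$path\n";

# After a module is loaded, %INC records where its file was found.
require Math::Complex;
print "Loaded from: $INC{$path}\n";
```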
Module libraries
• The Perl interpreter has a list of directories in which it searches for modules.
• This list is the global array @INC.
    > perl -V
    @INC:
        /usr/local/lib/perl5/5.00503/sun4-solaris
        /usr/local/lib/perl5/5.00503
        /usr/local/lib/perl5/site_perl/5.005/sun4-solaris
        /usr/local/lib/perl5/site_perl/5.005
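The search path can also be inspected, and extended, from inside a script. In this sketch the directory ./lib is a hypothetical project-local folder; it does not need to exist for the example to run.

```perl
use strict;
use warnings;
use lib './lib';    # hypothetical directory, added to the front of @INC

# Copy the search path and print one directory per line.
my @search_dirs = @INC;
print "$_\n" for @search_dirs;
```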
Using Modules
• A module is loaded with the use statement.
    use Foo;
    bar("a");    # call bar, a function imported from Foo
    blat("b");   # call blat, a function imported from Foo
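Foo is a placeholder, so here is the same pattern with a real core module. List::Util and its sum, max, and min functions ship with Perl; the numbers are made up.

```perl
use strict;
use warnings;
use List::Util qw(sum max);     # load the module and import two functions

my @lengths = (120, 450, 87);   # made-up sequence lengths

my $total   = sum(@lengths);    # imported, so callable by its short name
my $longest = max(@lengths);

# A function that was not imported is still reachable by its full name.
my $shortest = List::Util::min(@lengths);

print "total=$total longest=$longest shortest=$shortest\n";
```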
Bioperl toolkit • Core package (bioperl-live) • The basic package, required by all the other packages • Run package (bioperl-run) • Provides wrappers for executing some 60 common bioinformatics applications • DB package (bioperl-db) • Subproject that stores sequence and annotation data in a BioSQL relational database • Network package (bioperl-network) • Parses and analyzes protein-protein interaction data • Dev package (bioperl-dev) • New and exploratory Bioperl development
Bioperl Object-Oriented • Bioperl takes advantage of object-oriented design to create a consistent, well-documented object model for interacting with biological data in the life sciences. • Bioperl namespace: the Bioperl package installs everything in the Bio:: namespace. (Where are the packages stored?)
Bioperl Objects • Sequence handling objects • Sequence objects • Alignment objects • Location objects • Other Objects: 3D structure objects, tree objects and phylogenetic trees, map objects, bibliographic objects and graphics objects
Sequence handling • Typical sequence handling tasks: • Access the sequence • Format the sequence • Sequence alignment and comparison • Search for similar sequences • Pairwise comparisons • Multiple alignment
Sequence Annotation • Bio::SeqFeature: A sequence object can have multiple sequence feature (SeqFeature) objects (e.g. Gene, Exon, or Promoter objects) associated with it. • Bio::Annotation: A Seq object can also have an Annotation object (used to store database links, literature references, and comments) associated with it.
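As a rough sketch of how features and annotations attach to a sequence, the example below builds a Seq object by hand. It assumes Bioperl is installed; the sequence, the ID, the exon coordinates, and the comment text are all made up.

```perl
use strict;
use warnings;
use Bio::Seq;
use Bio::SeqFeature::Generic;
use Bio::Annotation::Comment;

# Made-up sequence used only for illustration.
my $seq = Bio::Seq->new(
    -display_id => 'example1',
    -seq        => 'ATGGCCAAGTAA',
    -alphabet   => 'dna',
);

# Attach a sequence feature (here tagged as an exon) to the sequence.
my $exon = Bio::SeqFeature::Generic->new(
    -start       => 1,
    -end         => 9,
    -strand      => 1,
    -primary_tag => 'exon',
);
$seq->add_SeqFeature($exon);

# Attach a free-text comment through the annotation collection.
my $comment = Bio::Annotation::Comment->new(
    -text => 'Illustrative record, not real data',
);
$seq->annotation->add_Annotation('comment', $comment);

for my $feat ($seq->get_SeqFeatures) {
    printf "%s %d..%d\n", $feat->primary_tag, $feat->start, $feat->end;
}
for my $note ($seq->annotation->get_Annotations('comment')) {
    print $note->text, "\n";
}
```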
Sequence Input/Output The Bio::SeqIO system was designed to make getting and storing sequences to and from the myriad of formats as easy as possible.
Accessing sequence data • Bioperl supports accessing remote databases as well as local databases. • Bioperl currently supports sequence data retrieval from the GenBank, GenPept, RefSeq, SwissProt, and EMBL databases.
Format the sequences • A SeqIO object can read a stream of sequences in one format (Fasta, EMBL, GenBank, Swissprot, PIR, GCG, SCF, phd/phred, Ace, or raw plain sequence) and then write them to another file in another format.
Manipulating sequence data
    $seqobj->display_id()   # the human-readable ID of the sequence
    $seqobj->subseq(5,10)   # part of the sequence as a string
    $seqobj->desc()         # a description of the sequence
    $seqobj->trunc(5,10)    # truncation from 5 to 10 as a new object
    $seqobj->revcom         # reverse complement of the sequence
    $seqobj->translate      # translation of the sequence
    …
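To try these calls without an input file, a Seq object can be built directly. This is a sketch that assumes Bioperl is installed; the 12-base sequence, its ID, and its description are made up.

```perl
use strict;
use warnings;
use Bio::Seq;

# Made-up DNA sequence used only for illustration.
my $seqobj = Bio::Seq->new(
    -display_id => 'example1',
    -desc       => 'Illustrative sequence, not real data',
    -seq        => 'ATGGCCAAGTAA',
    -alphabet   => 'dna',
);

my $id      = $seqobj->display_id();
my $desc    = $seqobj->desc();
my $part    = $seqobj->subseq(5, 10);   # plain string, positions are 1-based
my $trunc   = $seqobj->trunc(5, 10);    # new sequence object
my $revcom  = $seqobj->revcom;          # new sequence object
my $protein = $seqobj->translate;       # new sequence object

print "id:       $id\n";
print "desc:     $desc\n";
print "subseq:   $part\n";
print "trunc:    ", $trunc->seq, "\n";
print "revcom:   ", $revcom->seq, "\n";
print "protein:  ", $protein->seq, "\n";
```

Note the difference between subseq, which returns a string, and trunc, revcom, and translate, which return objects whose residues are read with the seq method.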
Search result parsing The Bio::SearchIO system was designed for parsing the output of sequence database searches (BLAST, sim4, waba, FASTA, HMMER, exonerate, etc.).
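A SearchIO parser typically walks three nested levels: results (one per query), hits, and HSPs. The sketch below assumes Bioperl is installed; the helper name summarize_blast_report and the file name blast_report.txt are hypothetical, and nothing is parsed unless such a file exists.

```perl
use strict;
use warnings;
use Bio::SearchIO;

# Collect one row per HSP: query name, hit name, e-value, percent identity.
sub summarize_blast_report {
    my ($report_file) = @_;
    my $in = Bio::SearchIO->new(-format => 'blast', -file => $report_file);
    my @rows;
    while (my $result = $in->next_result) {          # one result per query
        while (my $hit = $result->next_hit) {        # one hit per database sequence
            while (my $hsp = $hit->next_hsp) {       # one HSP per aligned region
                push @rows, [
                    $result->query_name,
                    $hit->name,
                    $hsp->evalue,
                    $hsp->percent_identity,
                ];
            }
        }
    }
    return @rows;
}

my $report_file = 'blast_report.txt';   # hypothetical file name
if (-e $report_file) {
    for my $row (summarize_blast_report($report_file)) {
        printf "%s\t%s\t%s\t%.1f\n", @$row;
    }
}
```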
Manipulating alignment The Bio::AlignIO system was designed for manipulating alignment objects in different formats, including aln, phylip, and fasta.
Example: Format the sequences • Use “seq_formating.pl” to convert “sequences.gb” to another format.
Copy the files to the current directory. Check whether the files are executable. Now, let’s look at the GenBank file.
The home directory on a Windows system. If you have Notepad++ installed, click “Edit with Notepad++”. If not, open “sequence.gb” with Notepad.
If no arguments are supplied, the program prints usage instructions.
Parts of the command line: program name, format of the input sequences, format of the output sequences, input file, output file, then <enter>.
Program succeeded! Now it’s time to look at the generated file.
Type: cd<space>c:\BioDownload to enter the BioDownload folder.
Type: • dir • To display the files in the current folder (NOT ls) • You should have the following files in the folder • (you may have other files, but that’s fine): • seq_formating.pl • sequences.gb.txt
Type: perl<space>seq_formating.pl<space>sequences.gb.txt<space>genbank<space>sequences.fasta<space>fasta
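The slides do not show the contents of seq_formating.pl. A script called this way could look roughly like the sketch below, which assumes Bioperl is installed and takes its four arguments in the order used above: input file, input format, output file, output format. The function name convert_sequences is hypothetical.

```perl
use strict;
use warnings;
use Bio::SeqIO;

# Read every sequence from one file and write it to another in a new format.
# Returns the number of sequences converted.
sub convert_sequences {
    my ($in_file, $in_format, $out_file, $out_format) = @_;

    my $in  = Bio::SeqIO->new(-file => $in_file,     -format => $in_format);
    my $out = Bio::SeqIO->new(-file => ">$out_file", -format => $out_format);

    my $count = 0;
    while (my $seq = $in->next_seq) {
        $out->write_seq($seq);
        $count++;
    }
    return $count;
}

if (@ARGV == 4) {
    my $n = convert_sequences(@ARGV);
    print "Converted $n sequence(s)\n";
}
else {
    # No or wrong arguments: print usage information instead of converting.
    print STDERR "Usage: $0 <input file> <input format> <output file> <output format>\n";
}
```

The leading ">" in the output file name tells SeqIO to open the file for writing, in the same way as Perl's two-argument open.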
What’s next: Parsing the BLAST output