BioPerl

BioPerl - documentation • Bioperl tutorial http://www.bioperl.org/Core/Latest/bptutorial.html • Mastering Perl for Bioinformatics: Introduction to Bioperl http://www.oreilly.com/catalog/mperlbio/chapter/ch09.pdf • Bioperl documentation http://www.bioperl.org/Core/Latest/modules.html • Modules listing. • http://doc.bioperl.org/releases/bioperl-1.0.1/ • Unix help: man and perldoc. • http://www.pasteur.fr/recherche/unites/sis/formation/bioperl/

Bio::Seq class structure

Building Sequence Bio::Seq de novo: use Bio::Seq; my $seq1 = Bio::Seq->new ( -seq => 'ATGAGTAGTAGTAAAGGTA', -id => 'my seq', -desc => 'this is a new Seq'); from a file: use Bio::SeqIO; my $seqin = Bio::SeqIO->new ( -file => 'seq.fasta', -format => 'fasta'); my $seq3 = $seqin->next_seq(); my $seqin2 = Bio::SeqIO->newFh ( -file => 'seq.fasta', -format => 'fasta'); my $seq4 = <$seqin2>; my $seqin3 = Bio::SeqIO->newFh ( -file => 'golden sp:TAUD_ECOLI |', -format => 'swiss'); my $seq5 = <$seqin3>;

Building Sequence Bio::Seq by fetching an entry from a remote database: use Bio::DB::SwissProt; my $banque = new Bio::DB::SwissProt; my $seq6 = $banque->get_Seq_by_id('KPY1_ECOLI'); by fetching an entry from a bioperl indexed local database: use Bio::Index::Swissprot; my $inx = Bio::Index::Swissprot->new( -filename => 'small_swiss.inx'); my $seq7 = $inx->fetch('MALK_ECOLI'); $seqobj->seq;

Relation of a SwissProt entry and the corresponding Bio::Seq object components

analysis tools Bio::Seq # sequence as string $seqobj->subseq(10,40); # sub-sequence as string $seqobj->trunc(10,100); # sub-sequence as fresh Bio::PrimarySeq $seqobj->translate;

statistical tools Bio::Tools::SeqStats $seq_stats = Bio::Tools::SeqStats-> new(-seq=>$seqobj); $seq_stats->count_monomers(); $seq_stats->count_codons(); $weight = $seq_stats->get_mol_wt($seqobj);

An universal converter with SeqIO use Bio::SeqIO; my $in = Bio::SeqIO->new(-file => "inputfilename" , '-format' => 'Fasta'); my $out = Bio::SeqIO->new(-file => ">outputfilename" , '-format' => 'EMBL'); my $seq = $in->next_seq() ; $out->write_seq($seq); • Similarly, format conversions of alignment formats using AlignIO, e.g. clustalw, fasta, phylip, Pfam, msf, bl2seq, meme.

Align Classes diagram!!!

Run Clustalw and print the result on standard output in Phylip format. use Bio::Tools::Run::Alignment::Clustalw; use Bio::AlignIO; my $inputfilename = $ARGV[0]; my $outform = $ARGV[1]; my @params = ('ktuple' => 2, 'matrix' => 'BLOSUM'); my $factory = Bio::Tools::Run::Alignment::Clustalw->new(@params); my $aln = $factory->align($inputfilename); my $out = Bio::AlignIO->new ( -fh => \*STDOUT, -format => $outform ); $out->write_aln($aln);

Blast search StandAloneBlast run use Bio::SeqIO; use Bio::Tools::Run::StandAloneBlast; my $Seq_in = Bio::SeqIO->new (-file => $query_file.faa, -format => 'fasta'); my $query = $Seq_in->next_seq(); my $factory = Bio::Tools::Run::StandAloneBlast->new('program' => 'blastp', 'database' => 'swissprot' ); $factory->e(1.0); # Setting parameters $factory->outfile('blast.out'); # Saving output to file my $blast_report = $factory->blastall($query); # looking at the report: many results with multiple hits with multiple hsps… my $result = $blast_report->next_result; while( my $hit = $result->next_hit()) { print "\thit name: ", $hit->name(), " significance: ", $hit->significance(), "\n"; while( my $hsp = $hit->next_hsp()) { print “hsp length: ", $hsp->length(), "\n"; }}

Blast search Parse Blast results on standard input my $blast_report = new Bio::SearchIO ('-format' => 'blast', '-fh' => \*STDIN ); my $result = $blast_report->next_result; Extracting alignmernts my $pattern = $ARGV[1]; my $out = Bio::AlignIO->newFh(-format => 'clustalw' ); while( my $hit = $result->next_hit()) { if ($hit->name() =~ /$pattern/i ) { while( my $hsp = $hit->next_hsp()) { my $aln = $hsp->get_aln(); $out->write_aln($aln); }}}

Blast Classes diagram

Extract translations from Genbank entry

Extract translations from Genbank entry my $dbin = Bio::SeqIO->newFh ( -fh => STDIN, -format => $format ); my $out = Bio::SeqIO->newFh ( -fh => STDOUT, -format => 'fasta' ); while($entry = <$dbin>) { my $entryid = $entry->id(); foreach my $feat ($entry->top_SeqFeatures()) { if ($feat->primary_tag() eq 'CDS') { my $id = $feat->has_tag('gene') ? join(' ', $feat->each_tag_value('gene')) : 'no_gene'; my $acc = $feat->has_tag('protein_id') ? join(' ', $feat->each_tag_value('protein_id')) : 'no_pid'; my $featseq = Bio::PrimarySeq->new(); my $translation = $feat->has_tag('translation') ? join(' ', $feat->each_tag_value('translation')) : if ($translation eq ``) {print "can't get the correct seq of $id|$acc **\n"; next; } $featseq->seq($translation); $featseq->id("$id|$acc"); print $out $featseq; }}}

BioPerl - documentation

BioPerl - documentation

Presentation Transcript

Bioperl modules

BioPerl

Documentation

Beginning BioPerl for Biologists

Bioinformatica BioPerl

BioPerl – An Overview

BioPerl

Bibliographic Query Service in bioperl

BioPerl

BioPerl

Introducing Bioperl

Documentation

Installing Bioperl

Introduction to bioperl

BioPerl

Introducción a Bioperl

Bioperl modules

BioPerl