Log in to Amazon BioLinux
• For Mac users
    ssh ubuntu@public_dns_address
• For Windows users, use PuTTY
    Hostname: public_dns_address
    Username: ubuntu
• Once logged in
    mkdir bioperl
    cd bioperl
    wget http://biobase.ist.unomaha.edu/~ithapa/myfile.gbk
BioPerl Ishwor Thapa (02/17/2012)
How Perl saved the human genome project http://www.bioperl.org/wiki/How_Perl_saved_human_genome • DATE: Early February, 1996 • LOCATION: Cambridge, England, in the conference room of the largest DNA sequencing center in Europe. • OCCASION: A high level meeting between the computer scientists of this center and the largest DNA sequencing center in the United States. • THE PROBLEM: Although the two centers use almost identical laboratory techniques, almost identical databases, and almost identical data analysis tools, they still can't interchange data or meaningfully compare results. • THE SOLUTION: Perl.
BioPerl • Started among students from a biocomputing course at uni-bielefeld.de • http://bioperl.org/pipermail/bioperl-l/1996-September/002618.html • Now used at all large genome centers worldwide
Installing BioPerl • BioLinux comes with BioPerl • For other machines (Linux, Mac, Windows), see http://www.bioperl.org/wiki/Main_Page
Programming in Perl
    print "Hello World!\n";

    # Perl variables are declared with "my", not with a type such as "int"
    for (my $i = 0; $i < 10; $i++) {
        print "$i\n";
    }
BioPerl • Two main classes in BioPerl: Bio::SeqIO (reads and writes sequence files) and Bio::Seq (a single sequence record)
Using Bio::SeqIO • 3 main methods: new, next_seq, write_seq
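GenBank to FASTA converter
    use strict;
    use warnings;
    use Bio::SeqIO;

    # Read GenBank records from myfile.gbk
    my $in  = Bio::SeqIO->new(-file => "myfile.gbk",    -format => 'Genbank');
    # The leading ">" opens myfile.fasta for writing
    my $out = Bio::SeqIO->new(-file => ">myfile.fasta", -format => 'Fasta');

    # Copy each record from the input to the output
    while ( my $seq = $in->next_seq() ) {
        $out->write_seq($seq);
    }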
Bio::Seq • Main methods: new, seq, subseq, display_id, desc, revcom
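Using Bio::Seq
    use Bio::SeqIO;

    my $in = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');

    while ( my $seq = $in->next_seq() ) {
        print $seq->display_id, "\n";   # record identifier
        print $seq->desc, "\n";         # description line
        # Uncomment to try the other methods:
        #print $seq->seq, "\n";               # full sequence string
        #print $seq->subseq(10,20), "\n";     # bases 10 through 20 (1-based, inclusive)
        #print $seq->revcom->seq, "\n";       # reverse complement
    }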
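The Bio::Seq method list includes new, which the file-based example does not exercise. A minimal sketch of building a sequence object in memory and calling the same accessors on it might look like the following; the ID, description, and bases are made-up values for illustration and do not come from myfile.gbk.

```perl
use strict;
use warnings;
use Bio::Seq;

# Build a sequence object directly; all three values below are hypothetical.
my $seq = Bio::Seq->new(
    -display_id => 'example_1',
    -desc       => 'hypothetical demo sequence',
    -alphabet   => 'dna',
    -seq        => 'ATGGCCAAGT',
);

print $seq->display_id, "\n";       # example_1
print $seq->desc, "\n";             # hypothetical demo sequence
print $seq->seq, "\n";              # ATGGCCAAGT
print $seq->subseq(1, 3), "\n";     # ATG (positions are 1-based and inclusive)
print $seq->revcom->seq, "\n";      # ACTTGGCCAT (revcom returns a new Bio::Seq object)
```

Because revcom returns another sequence object rather than a string, it can be chained with seq, subseq, or passed straight to write_seq.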
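    use Bio::SeqIO;

    # Print the protein ID and translation of every CDS feature in FASTA format.
    # Assumes the same myfile.gbk input as the earlier examples.
    my $seq_io = Bio::SeqIO->new(-file => "myfile.gbk", -format => 'Genbank');

    while ( my $seq = $seq_io->next_seq() ) {
        my @features = $seq->get_SeqFeatures();
        foreach my $feat (@features) {
            if ( $feat->primary_tag eq "CDS" ) {
                # get_tag_values dies on a missing tag, so skip CDS features lacking either one
                next unless $feat->has_tag('protein_id') && $feat->has_tag('translation');
                my @pid         = $feat->get_tag_values('protein_id');
                my @translation = $feat->get_tag_values('translation');
                for (my $index = 0; $index < scalar @pid; $index++) {
                    print ">$pid[$index]\n";
                    print "$translation[$index]\n";
                }
            }
        }
    }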