PAVE v3
* Perl, not distributed (unitrans = unique transcript)
EST libraries and transcript sets
• loadLibrary.pl and runPAVE.pl
• Assembly of transcript sets, e.g.:
  • Assemble 454 transcripts and Illumina (NCGR) transcripts
  • maintaining the expression levels with each contig
• Mix ESTs and transcript sets, e.g.:
  • Assemble Sanger ESTs and one or more transcript sets
  • maintaining the expression level per contig
• No assembly: only load transcripts and expression levels
  • e.g. for rhi_EhRi Illumina data only (~1h setup, 1h execution)
• Easily add RNA-seq libraries (~0.5h setup, 0.5h execution)
viewPAVE v3 has library information and queries just like webPAVE
Include/exclude
Currently adding normalized expression levels
Previous web-based queries (driven by biologist requests)
On Unitrans page
NRO – non-redundant organism
annotatePAVE: Taxonomic UniProts (annoDB)
From viewPAVE overview for horsetail:
Maximum 25 hits per unitrans per annoDB (12 x 25 possible hits per unitrans)
1st Best – not the best of all, but the 1st Best hit found (order of annoDBs is important)
Blast E-value cutoff 1e-20
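The selection rules on this slide can be sketched roughly as follows. This is an illustrative Python sketch, not PAVE's Perl code: the `Hit` record, the `select_hits` function, ranking by E-value within an annoDB, and reading "1st Best" as the top hit of the first annoDB (in load order) that has any hit are all assumptions made for the example.

```python
from dataclasses import dataclass

EVALUE_CUTOFF = 1e-20   # Blast E-value cutoff from the slide
MAX_HITS_PER_DB = 25    # maximum hits kept per unitrans per annoDB


@dataclass
class Hit:
    unitrans: str
    anno_db: str
    subject: str
    evalue: float


def select_hits(hits, db_order):
    """Return (kept, first_best) for the hits of one unitrans.

    kept maps annoDB -> up to MAX_HITS_PER_DB hits that pass the cutoff.
    first_best is the top hit of the first annoDB in db_order that has a
    passing hit, so it is not necessarily the best hit over all annoDBs.
    """
    kept = {}
    for db in db_order:
        passing = [h for h in hits
                   if h.anno_db == db and h.evalue <= EVALUE_CUTOFF]
        passing.sort(key=lambda h: h.evalue)  # assumption: rank by E-value
        if passing:
            kept[db] = passing[:MAX_HITS_PER_DB]

    first_best = None
    for db in db_order:
        if db in kept:
            first_best = kept[db][0]
            break
    return kept, first_best
```

With `db_order = ["sprot_plants", "trembl_plants"]`, a unitrans whose strongest hit is in `trembl_plants` still gets its 1st Best hit from `sprot_plants` whenever a `sprot_plants` hit passes the cutoff, which is why the order of annoDBs matters.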
Times on 12 dedicated CPUs for 71k unitrans

annoDB                            Time
sprot_plants                      31m
trembl_plants                     9h:37m
sprot_fungi                       29m
trembl_fungi                      9h:50m
sprot_viruses                     19m
trembl_viruses                    8h:30m
sprot_invertebrates               24m
trembl_invertebrates              16h:11m
sprot_bacteria                    2h:21m
trembl_bacteria                   2d:18h:39m
sprot_pfvib (- sprot species)     1h:29m
trembl_pfvib (- trembl species)   8h:12m
Total                             ~5 days

There is an option to blast only un-annotated unitrans, but it was not used for any of these runs.
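As a quick arithmetic check, the per-annoDB times can be summed to see that they match the "~5 days" figure. This is a small Python sketch; it assumes that each time on the slide precedes its annoDB name (so the sprot sets are the fast ones), and the helper name `to_minutes` is made up for the example.

```python
import re

# Per-annoDB times as read from the slide (assumed pairing: time, then name).
TIMES = {
    "sprot_plants": "31m",
    "trembl_plants": "9h:37m",
    "sprot_fungi": "29m",
    "trembl_fungi": "9h:50m",
    "sprot_viruses": "19m",
    "trembl_viruses": "8h:30m",
    "sprot_invertebrates": "24m",
    "trembl_invertebrates": "16h:11m",
    "sprot_bacteria": "2h:21m",
    "trembl_bacteria": "2d:18h:39m",
    "sprot_pfvib": "1h:29m",
    "trembl_pfvib": "8h:12m",
}

UNIT_MINUTES = {"d": 24 * 60, "h": 60, "m": 1}


def to_minutes(text):
    """Convert a string such as '2d:18h:39m' to a number of minutes."""
    return sum(int(n) * UNIT_MINUTES[u]
               for n, u in re.findall(r"(\d+)([dhm])", text))


total = sum(to_minutes(t) for t in TIMES.values())
print(f"{total} minutes = {total / (24 * 60):.2f} days")
```

Under that pairing the twelve runs add up to 7472 minutes, about 5.2 days, with `trembl_bacteria` alone accounting for more than half.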
annotatePAVE computes unitrans hit filter sets
For each unitrans, each DB hit can be in zero or more of the filtered sets.
viewPAVE contig table
General
1st Best hit
All hits
Plant specific
Selected contig from contig table – initial view
Multiple overlap
Overlap differs by 30 bases on left or right, or different orientations
Gene reps vs total hits
Ignore all descriptions with ‘putative’, etc.
Remove last digit/letter.
Show one hit for each resulting description, where the hit has the best E-value.
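The gene-rep rule can be illustrated with a short Python sketch. It is not PAVE's implementation and rests on several assumptions: only 'putative' is filtered (the slide's "etc." is not spelled out), "last digit/letter" is read as a trailing standalone single character such as the "2" in "Actin 2", and the E-value criterion is read as "most significant", i.e. numerically lowest. The names `normalize` and `gene_reps` are hypothetical.

```python
import re

# Only 'putative' is named on the slide; further words would go here.
SKIP_WORDS = ("putative",)


def normalize(description):
    """Drop one trailing standalone digit or letter: 'Actin 2' -> 'Actin'."""
    return re.sub(r"\s+[A-Za-z0-9]$", "", description.strip())


def gene_reps(hits):
    """Pick one representative hit per normalized description.

    hits is an iterable of (description, evalue) pairs. The result maps
    each normalized description to the (description, evalue) pair with
    the lowest E-value among the hits sharing that description.
    """
    reps = {}
    for description, evalue in hits:
        if any(word in description.lower() for word in SKIP_WORDS):
            continue  # ignored descriptions never become gene reps
        key = normalize(description)
        if key not in reps or evalue < reps[key][1]:
            reps[key] = (description, evalue)
    return reps
```

For example, hits described as "Actin 1" and "Actin 2" collapse to the single description "Actin", so the number of gene reps can be much smaller than the total number of hits.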
Similar Contig Pairs based on (1) selfblast, (2) translated selfblast, (3) shared DB hits
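Criterion (3), shared DB hits, can be sketched in Python as below; criteria (1) and (2) depend on Blast output and are not covered. The input layout (a dict from contig name to a set of hit identifiers) and the function name `pairs_by_shared_hits` are assumptions for illustration, not PAVE's data model.

```python
from collections import defaultdict
from itertools import combinations


def pairs_by_shared_hits(contig_hits):
    """Find contig pairs that share at least one DB hit.

    contig_hits maps a contig name to the set of DB hit identifiers
    found for it. The result maps each (contig_a, contig_b) pair, with
    names in sorted order, to the set of hit identifiers they share.
    """
    # Invert the mapping: for every hit, which contigs have it?
    contigs_by_hit = defaultdict(set)
    for contig, hits in contig_hits.items():
        for hit in hits:
            contigs_by_hit[hit].add(contig)

    # Every pair of contigs listed under the same hit shares that hit.
    shared = defaultdict(set)
    for hit, contigs in contigs_by_hit.items():
        for a, b in combinations(sorted(contigs), 2):
            shared[(a, b)].add(hit)
    return dict(shared)
```

Inverting the mapping first avoids comparing every contig against every other contig, which matters with tens of thousands of unitrans.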