PAVE Overview. www.plantrhizomes.org/progress.html. 454 MSU. Illumina NCGR. Assemblies. 454+Ilm Red Rice (OlR). Need to compute Illumina low coverage (~singletons) from expression level. Total 454: 17,624. Total Illumina: 21,083. 454+Ilm Ginger (ZoR). PAVE assembler.
454+Ilm Red Rice (OlR) • Need to compute Illumina low coverage (~singletons) from expression level • Total 454: 17,624 • Total Illumina: 21,083
PAVE assembler • Assemble Sanger with mate-pairs – retaining mate-pairs in contigs • Assemble 454 – can handle ~500,000 easily by burying redundancy • Assemble consensus sequences from 454 and Illumina • SNPs – most approaches require two confirming bases, but with 454 there are far too many false positives due to homopolymers, so a ‘p-value’ is computed from the number of confirming ESTs, the depth of ESTs at the base, and the estimated base-call error • Script to add expression level
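The slide names the three inputs to the SNP ‘p-value’ (confirming ESTs, depth at the base, estimated base-call error) but not the formula. As a minimal sketch, assuming a one-sided binomial tail is an acceptable stand-in, the function below gives the probability that at least `confirming` of `depth` reads show the variant by sequencing error alone. The function name, parameter names, and the binomial model itself are assumptions for illustration, not PAVE's actual computation.

```python
from math import comb


def snp_p_value(confirming: int, depth: int, error_rate: float) -> float:
    """Binomial tail P(X >= confirming) with X ~ Binomial(depth, error_rate).

    Assumed model (not taken from PAVE): each read at the base independently
    shows the variant by error with probability `error_rate`.
    """
    if not 0 <= confirming <= depth:
        raise ValueError("confirming must be between 0 and depth")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be a probability")
    # Sum the probability of seeing k erroneous reads for every k >= confirming.
    return sum(
        comb(depth, k) * error_rate**k * (1.0 - error_rate) ** (depth - k)
        for k in range(confirming, depth + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: two confirming ESTs at depth 20, with a low error
    # rate and with a higher one standing in for a homopolymer region.
    print(snp_p_value(2, 20, 0.01))
    print(snp_p_value(2, 20, 0.05))
```

Under this model, a fixed rule of two confirming bases becomes less convincing as depth or base-call error grows, which is the situation the slide describes for 454 homopolymer runs.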
Immediate Future • Streamline and combine web and Java annotation • cmpPAVE • Show alignments of: • a protein to all its UniTrans • a UniTrans to all its proteins • Self-blast all UniTrans, create clusters of cliques, with display and filters similar to UniProt • Incorporate GO, EC, etc. • viewPAVE • Show alignment of all proteins for a UniTrans • Show coverage of reads as a histogram • Cluster similar UniTrans (instead of pairs) • Incorporate GO and EC (i.e. same functionality as Web) • Web PAVE • Remove count of ESTs; instead indicate protein coverage • How much further we extend…
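The "clusters of cliques" item can be illustrated with a short sketch. Assuming the self-blast has already been reduced to a list of (query, subject) UniTrans pairs that pass some cutoff, the code below builds an undirected hit graph and enumerates its maximal cliques with the basic Bron–Kerbosch recursion. The pair list, the UniTrans names, and the choice of Bron–Kerbosch are assumptions for illustration; the slides do not say how cmpPAVE would form its cliques.

```python
from collections import defaultdict


def build_graph(hits):
    """Build an undirected graph from (query, subject) hit pairs, dropping self-hits."""
    graph = defaultdict(set)
    for a, b in hits:
        if a != b:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion (no pivoting)."""
    cliques = []

    def expand(current, candidates, excluded):
        # A clique is maximal when nothing can extend it and nothing was excluded.
        if not candidates and not excluded:
            cliques.append(sorted(current))
            return
        for v in list(candidates):
            expand(current | {v}, candidates & graph[v], excluded & graph[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(graph), set())
    return sorted(cliques)


if __name__ == "__main__":
    # Hypothetical UniTrans names and hits, for illustration only.
    hits = [("U1", "U2"), ("U2", "U3"), ("U1", "U3"), ("U3", "U4"), ("U1", "U1")]
    for clique in maximal_cliques(build_graph(hits)):
        print(clique)
```

In a clique every member hits every other member, which is stricter than single-linkage clustering; a UniTrans can belong to more than one clique, and one with only a self-hit does not appear at all in this sketch.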
NRO: would need to reduce hits to D6NDn_9POAL because one could have a best of 42 and another 43…
Clustering • Best clustering for paralogs in viewPAVE and orthologs in cmpPAVE: • Same protein region • ProDom and/or Motif • EC • GO/GOSlim