
VectorBase annotation metrics

Daniel Lawson, VectorBase-EBI, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK. Topics: annotation metrics; numbers (gene numbers & xrefs); data types (availability & integration); annotation SOPs; genome specific; gene specific.



Presentation Transcript


  1. VectorBase annotation metrics
     Daniel Lawson, VectorBase-EBI, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
     VectorBase BRC4 2006

  2. Topics
     • Annotation metrics
     • Numbers (Gene numbers & xrefs)
     • Data types (Availability & Integration)
     • Annotation SOPs
     • Genome specific
     • Gene specific
     • Gene build profile & prediction confidence

  3. (No text recovered from this slide.)

  4. Considerations
     • Importance of calculating all metrics using similar methodology from the same data set
     • Metrics calculated from Ensembl using BioMart & raw SQL queries
     • GO terms: many ways of calculating (InterPro2GO, projection from Drosophila orthologs)
     • No VectorBase capability to automatically assign EC numbers
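The first two considerations can be illustrated with a minimal sketch, assuming a toy two-table database rather than the real Ensembl schema. The table names (`gene`, `xref`), column names, gene identifiers, and the `gene_metrics` function are all hypothetical; the point is only that every count comes from the same snapshot through the same kind of query.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory toy database (hypothetical schema, made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE gene (gene_id TEXT PRIMARY KEY, biotype TEXT);
        CREATE TABLE xref (gene_id TEXT, source TEXT, accession TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO gene VALUES (?, ?)",
        [
            ("G1", "protein_coding"),
            ("G2", "protein_coding"),
            ("G3", "ncRNA"),
            ("G4", "protein_coding"),
        ],
    )
    conn.executemany(
        "INSERT INTO xref VALUES (?, ?, ?)",
        [
            ("G1", "GO", "GO:0005524"),
            ("G1", "GO", "GO:0016301"),
            ("G2", "InterPro", "IPR000719"),
            ("G4", "GO", "GO:0005524"),
        ],
    )
    return conn


def gene_metrics(conn: sqlite3.Connection) -> dict:
    """Compute gene-number and xref metrics from a single database snapshot."""
    total = conn.execute("SELECT COUNT(*) FROM gene").fetchone()[0]
    # Each row is a (biotype, count) pair, so the cursor converts to a dict.
    by_biotype = dict(
        conn.execute("SELECT biotype, COUNT(*) FROM gene GROUP BY biotype")
    )
    # DISTINCT so a gene with several GO terms is counted once.
    with_go = conn.execute(
        "SELECT COUNT(DISTINCT gene_id) FROM xref WHERE source = 'GO'"
    ).fetchone()[0]
    with_any_xref = conn.execute(
        "SELECT COUNT(DISTINCT gene_id) FROM xref"
    ).fetchone()[0]
    return {
        "total_genes": total,
        "by_biotype": by_biotype,
        "genes_with_go": with_go,
        "genes_with_any_xref": with_any_xref,
    }


if __name__ == "__main__":
    print(gene_metrics(build_demo_db()))
```

Counting distinct genes rather than xref rows matters here: a gene carrying several GO terms would otherwise inflate the "genes with GO" figure, which is one way two methodologies can disagree on the same data.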

  5. (No text recovered from this slide.)

  6. VectorBase gene prediction pipeline (SOP)
     Canonical Gene set
     • Blessed predictions
     • Manual annotations / Community submissions: VB:SOP010 / VB:SOP007
     • Similarity predictions / Species-specific predictions: VB:SOP002 & SOP003 / VB:SOP001
     • Protein family HMMs / ncRNA predictions: VB:SOP009 / VB:SOP008
     • Transcript based predictions / Ab initio gene predictions: VB:SOP004 / VB:SOP005

  7. Assignment of SOPs to VectorBase genes: AgamP3.3

  8. Display of Metrics & SOPs
     • Metrics
       • VectorBase wiki
       • Species-page containing the three tables available from the VectorBase species homepage
       • Expansion of documents relating to genomic resources (citations, links to primary data where possible)
       • Single collated table for BRC as separate download
     • SOPs
       • VectorBase wiki
       • ‘Documents’ section of main site

  9. (No text recovered from this slide.)

  10. Manual annotation progress

  11. Merging gene sets
     • Inputs: Gene set #1, Gene set #2
     • Reduce to single predictions per locus
     • Compare exon/intron structures
       • Identical structures
       • Compatible structures
       • Different structures
       • Merge/Split structures
       • Complex
       • No Map
     • Add isoform predictions based on EST/Peptide data
     • Output: Canonical gene set
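The comparison step in slide 11 can be sketched roughly as follows. The slide names the outcome categories but does not define them, so the rules here are assumptions made for illustration: "identical" means the same exon coordinates, "compatible" means one prediction's introns are a subset of the other's, "merge/split" means one prediction overlaps more than one prediction in the other set, and "no map" means no overlap at all. The "complex" category is left out, and all function names are hypothetical. Predictions are assumed to be already reduced to one per locus, on the same sequence and strand.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end), inclusive coordinates
Prediction = List[Exon]  # one gene prediction = its list of exons


def introns(exons: Prediction) -> List[Tuple[int, int]]:
    """Return the introns implied by the gaps between consecutive exons."""
    ordered = sorted(exons)
    return [(a[1] + 1, b[0] - 1) for a, b in zip(ordered, ordered[1:])]


def overlaps(a: Prediction, b: Prediction) -> bool:
    """True if the genomic spans of two predictions overlap."""
    a_start, a_end = min(s for s, _ in a), max(e for _, e in a)
    b_start, b_end = min(s for s, _ in b), max(e for _, e in b)
    return a_start <= b_end and b_start <= a_end


def compare_structures(a: Prediction, b: Prediction) -> str:
    """Compare two overlapping predictions by exon/intron structure."""
    if sorted(a) == sorted(b):
        return "identical"
    introns_a, introns_b = set(introns(a)), set(introns(b))
    # Assumed rule: compatible when one intron set contains the other.
    if introns_a and introns_b and (
        introns_a <= introns_b or introns_b <= introns_a
    ):
        return "compatible"
    return "different"


def classify(gene: Prediction, other_set: List[Prediction]) -> str:
    """Classify one prediction against every prediction in the other set."""
    hits = [g for g in other_set if overlaps(gene, g)]
    if not hits:
        return "no map"
    if len(hits) > 1:
        return "merge/split"
    return compare_structures(gene, hits[0])


if __name__ == "__main__":
    set_2 = [[(150, 200), (300, 400), (500, 600)]]
    print(classify([(100, 200), (300, 400)], set_2))
```

In a fuller version, the class returned for each locus would decide which prediction enters the canonical gene set before isoforms supported by EST or peptide data are added; that selection logic is not described on the slide and is not attempted here.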
