Having it all
• Complete: every occurrence is found
• Precise: every occurrence is accurate
• Comprehensive: all types of features
• Richly described: biological functional and cross-species data
• What is needed to make this really happen?
• Can’t be swept under the rug.

Managing annotation changes
• More information:
  • Assembly changes, transcriptome data, comparative data, cis-regulatory element data, mass spec data, repeat studies
• Will lead to:
  • Additions, merges, splits, splerges, deletions, new feature types
• Compounded by:
  • Conflict resolution and innate complexity
  • Alternate transcripts, dicistronic genes, overlaps and intersections
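
One way to picture the change types listed above is to compare two annotation releases and group gene models that overlap. The following is a minimal sketch, not a method from the slides: it assumes all models lie on the same sequence and strand, uses plain coordinate overlap as the only evidence of correspondence, and treats a "splerge" as a many-to-many case (a combined split and merge). The function names and gene IDs are hypothetical.

```python
def overlaps(a, b):
    """True if two closed intervals (start, end) share at least one base."""
    return a[0] <= b[1] and b[0] <= a[1]

def kind(old, new, old_ids, new_ids):
    """Name the change for one group of mutually overlapping models."""
    if not new_ids:
        return "deletion"
    if not old_ids:
        return "addition"
    if len(old_ids) == 1 and len(new_ids) == 1:
        same = old[old_ids[0]] == new[new_ids[0]]
        return "unchanged" if same else "modified"
    if len(new_ids) == 1:
        return "merge"      # several old models became one
    if len(old_ids) == 1:
        return "split"      # one old model became several
    return "splerge"        # assumed meaning: many-to-many

def classify_changes(old, new):
    """old, new: dict of gene ID -> (start, end). Returns (kind, old_ids, new_ids) tuples."""
    links = {("old", o): set() for o in old}
    links.update({("new", n): set() for n in new})
    for o, o_span in old.items():
        for n, n_span in new.items():
            if overlaps(o_span, n_span):
                links[("old", o)].add(("new", n))
                links[("new", n)].add(("old", o))
    events, seen = [], set()
    for node in links:
        if node in seen:
            continue
        # Collect the connected component of overlapping models.
        component, stack = set(), [node]
        while stack:
            cur = stack.pop()
            if cur in component:
                continue
            component.add(cur)
            stack.extend(links[cur] - component)
        seen |= component
        old_ids = sorted(i for side, i in component if side == "old")
        new_ids = sorted(i for side, i in component if side == "new")
        events.append((kind(old, new, old_ids, new_ids), old_ids, new_ids))
    return events

# Hypothetical releases of the same region.
old_release = {"g1": (100, 500), "g2": (600, 900), "g3": (1000, 1500), "g4": (2000, 2500)}
new_release = {"n1": (100, 900), "n2": (1000, 1200), "n3": (1300, 1500), "n5": (3000, 3500)}
events = classify_changes(old_release, new_release)
```

A real tracking system would also compare strand, exon structure, and evidence, and would keep stable identifiers across versions; coordinate overlap alone is only a first approximation.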

The Essentials
• Full-length cDNA sequences and other ‘hard’ biological evidence
• High-quality assemblies
• Manual editors for curation
• Combiners
• Annotation standards and verification
• Tracking and versioning
• Open source software components and standards are critical to long-term success

Integration
• Agree on Standards
  • GFF3 file format
  • Sequence Ontology
  • Gene Ontology
  • Phenotype…
• Agree on Process
  • Exchange (DAS2)
  • Convergence
  • Versioning
  • Feedback

GFF3
• Need a common exchange format
• To share and distribute data
• Easy to transfer between databases
• Then can be visualized and seen by everyone!
• And can be edited and commented on, e.g. in Apollo
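
To make the exchange format concrete, here is a minimal sketch of reading GFF3 feature lines in Python. GFF3 is tab-delimited with nine columns (seqid, source, type, start, end, score, strand, phase, attributes), and the attributes column holds semicolon-separated `tag=value` pairs. The sample records, the IDs, and the `parse_gff3_line` name are illustrative assumptions, not data from the slides.

```python
from urllib.parse import unquote

GFF3_COLUMNS = ("seqid", "source", "type", "start", "end",
                "score", "strand", "phase", "attributes")

def parse_gff3_line(line):
    """Parse one GFF3 feature line into a dict.

    Returns None for blank lines, comments, and '##' directives.
    """
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    fields = line.split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    record = dict(zip(GFF3_COLUMNS, fields))
    # Coordinates are 1-based and inclusive in GFF3.
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attrs = {}
    if record["attributes"] != ".":
        for pair in record["attributes"].split(";"):
            if not pair:
                continue
            key, _, value = pair.partition("=")
            # Multiple values are comma-separated; reserved characters are URL-escaped.
            attrs[unquote(key)] = [unquote(v) for v in value.split(",")]
    record["attributes"] = attrs
    return record

# Hypothetical three-feature file: a gene, its mRNA, and one exon.
sample = "\n".join([
    "##gff-version 3",
    "chr1\texample\tgene\t1000\t9000\t.\t+\t.\tID=gene0001;Name=hypothetical",
    "chr1\texample\tmRNA\t1000\t9000\t.\t+\t.\tID=mRNA0001;Parent=gene0001",
    "chr1\texample\texon\t1000\t1200\t.\t+\t.\tID=exon0001;Parent=mRNA0001",
])
features = [f for f in map(parse_gff3_line, sample.splitlines()) if f is not None]
```

The `Parent` attribute is what ties exons to transcripts and transcripts to genes, so the same flat file can be reloaded into a different database or editor without losing the part-of structure.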

SO enables rigorous description and querying of the data
• How often are exons unique to a transcript?
• How often does an exon appear in all of the transcripts?
• For exons that occur in all the transcripts, how often are they coding exons?
• For exons that occur in only one of the transcripts, how often are they noncoding?
• Do unique exons contain the stop codon more often than exons in all the transcripts?
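
Once features are typed consistently, the first four questions above reduce to counting which transcripts each exon belongs to. The sketch below works on a hypothetical toy gene; the data layout, the `exon_usage` and `fraction` names, and the rule that an exon counts as coding if it is coding in any transcript are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy gene: transcript ID -> list of (start, end, is_coding) exons.
transcripts = {
    "tx1": [(100, 200, False), (300, 400, True), (500, 600, True)],
    "tx2": [(100, 200, False), (300, 400, True), (700, 800, False)],
    "tx3": [(300, 400, True), (500, 600, True)],
}

def exon_usage(transcripts):
    """Return (usage, coding): exon -> set of transcript IDs, exon -> coding flag."""
    usage = defaultdict(set)
    coding = {}
    for tx_id, exons in transcripts.items():
        for start, end, is_coding in exons:
            exon = (start, end)
            usage[exon].add(tx_id)
            # Assumption: coding in any transcript means the exon is coding.
            coding[exon] = coding.get(exon, False) or is_coding
    return usage, coding

def fraction(part, whole):
    """Share of `whole` that `part` represents; 0.0 when `whole` is empty."""
    return len(part) / len(whole) if whole else 0.0

usage, coding = exon_usage(transcripts)
n_tx = len(transcripts)

unique = [e for e, txs in usage.items() if len(txs) == 1]
in_all = [e for e, txs in usage.items() if len(txs) == n_tx]

frac_unique = fraction(unique, usage)            # exons unique to one transcript
frac_in_all = fraction(in_all, usage)            # exons present in every transcript
coding_among_in_all = fraction([e for e in in_all if coding[e]], in_all)
noncoding_among_unique = fraction([e for e in unique if not coding[e]], unique)
```

The stop-codon question needs CDS coordinates in addition to exon membership, so it is left out of this sketch.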

Annotation integration
• DAS2
• IGB (Gregg Helt @ Affy)
• Apollo (Suzanna Lewis @ Berkeley)
• Allow exchange of annotations between researchers at different sites.

Design and Plan for Integration
• It is essential to tightly define standards at the outset
• It is essential to have ongoing assessment and evaluation of accumulated pooled data