DAS for ENCODE data coordination
Felix Kokocinski, WTSI
Project Overview
Partners:
• University of California Santa Cruz, USA
• Washington University St. Louis, USA
• Broad Inst. of MIT and Harvard, USA
• Yale University, USA
• HAVANA & EnsEMBL, Sanger Institute, UK
• University of Lausanne, CH
• Centre for Genomic Regulation, ES
• Spanish Nat. Cancer Res. Centre, ES
Goal: Annotate all evidence-based gene features at high accuracy across the human genome
• protein-coding loci with isoforms
• nc loci with transcript evidence
• pseudogenes
Manual Genome Annotation
• ~20 annotators working according to HAVANA guidelines
• computational pipeline for alignments
• Otterlace software
• input from partner groups, import of data sources via DAS
• verification with RT-PCR, RACE & sequencing
Data Exchange using DAS
[Diagram: Distributed Annotation Sources, the GenTrack tracking system, and the Otterlace ann. software exchange data using DAS. Other labels: Perl API, Update Scripts, Source Adaptors, WWW interface, exper. ver. issues, high prior. issues.]
GenTrack Annotation Tracking
• extension of the open-source RoR ticketing system Redmine (www.redmine.org)
• data import via DAS
• modules for analyzing and flagging data
• www.sanger.ac.uk/gentrack
GenTrack: Workflow
• Entry points:
  • List of all genes & transcripts in region
  • High-priority loci
  • Loci with specific tags
• Identify problem, compare in Otterlace
• Resolve by
  • Changing annotation or
  • Disbelieving other source
• Note decision
DAS Specifics
Format: Specialized 1.53E
• <type-id>: from sequence ontology (exon: SO:0000147)
• <method>: (havana_manual_annotation)
• <type-category>: Evidence code describing the type of method (inferred from RT-PCR experiment (ECO:0000109))
• <note>: key=value pairs
  • parent, lastmod [req] (LASTMOD=2006-04-07T15:15:58+0100)
  • transcripttype, etc. [opt]
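The key=value convention for <note> entries can be illustrated with a minimal Python sketch. This is not the project's own code: the function names, the case-insensitive handling of keys, and the example parent value are assumptions made for illustration. Only the required keys (parent, lastmod) and the LASTMOD timestamp layout come from the slide.

```python
from datetime import datetime

# Keys marked [req] on the slide; matching them case-insensitively is an assumption.
REQUIRED_KEYS = {"parent", "lastmod"}


def parse_notes(notes):
    """Turn a list of 'KEY=value' note strings into a dict with lower-cased keys."""
    pairs = {}
    for note in notes:
        key, sep, value = note.partition("=")
        if not sep:
            continue  # skip free-text notes that carry no '='
        pairs[key.strip().lower()] = value.strip()
    return pairs


def validate_notes(pairs):
    """Check that the required keys are present and return lastmod as a datetime."""
    missing = REQUIRED_KEYS - pairs.keys()
    if missing:
        raise ValueError(f"missing required note keys: {sorted(missing)}")
    # Timestamp layout follows the slide's example: 2006-04-07T15:15:58+0100
    return datetime.strptime(pairs["lastmod"], "%Y-%m-%dT%H:%M:%S%z")


# Hypothetical notes for one feature; the parent identifier is made up.
example_notes = [
    "PARENT=hypothetical_transcript_1",
    "LASTMOD=2006-04-07T15:15:58+0100",
    "TRANSCRIPTTYPE=protein_coding",
]
example_pairs = parse_notes(example_notes)
example_lastmod = validate_notes(example_pairs)
```

Parsing lastmod into a timezone-aware datetime would let an import step compare it against a stored timestamp and skip features that have not changed, assuming that is how the field is meant to be used.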
Thanks
Jennifer Harrow, Steve Searle, Adam Frankish, Bronwen Aken, Toby Hunt, James Gilbert, Tim Hubbard, Anacode, Andy Jenkinson, Steve Trevanion, Jonathan Warren, Redmine.org, Paul Bevan, ENCODE partners, Jody Clements