Gene Curation - Future. C. elegans????
Paul Davis, The Sanger Institute
Advisory Board Meeting, C.S.H. 2005
Overview
• Things we will continue to do.
• Gene model curation: future directions.
• Curation of new sequence features.
• Possible features to curate.
• EMBL/GenBank sequence features.
• Other nematodes.
We Will Continue To Do
• C. elegans sequence changes.
• Transcript data.
• 3rd party submissions.
• C. elegans gene model curation.
• mRNA/EST data from NDBs.
• User input.
• Literature.
• Maintain current curation lists.
Future Directions
• Investigate use of additional sequence features for gene prediction/annotation.
• Features
• Protein alignments.
• WABA alignments (pairwise alignments).
• elegans::briggsae
• Other nematode species.
• SAGE tags.
• Clone boundaries and chromosome ends.
• Weak splice sites.
• Short introns.
Future
• Approaches
• Manual scan.
• Drive scripted generation of new curation lists.
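As an illustration of what a scripted curation list might look like, here is a minimal Python sketch that flags gene models containing a short intron, one of the features named under Future Directions. Everything in it is an assumption made for the example: the slides do not describe the actual scripts, the 40 bp cutoff is hypothetical, and the gene names and coordinates are invented.

```python
from typing import Dict, List, Tuple

# An exon as (start, end), 1-based inclusive, all exons on the same strand.
Exon = Tuple[int, int]

# Hypothetical cutoff: the slides do not define what counts as "short".
MIN_INTRON_BP = 40


def intron_lengths(exons: List[Exon]) -> List[int]:
    """Return the length of each intron between consecutive exons."""
    ordered = sorted(exons)
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]


def short_intron_curation_list(
    models: Dict[str, List[Exon]], min_bp: int = MIN_INTRON_BP
) -> List[Tuple[str, int]]:
    """List (gene, shortest intron length) for models with an intron below min_bp."""
    flagged = []
    for name, exons in models.items():
        lengths = intron_lengths(exons)
        if lengths and min(lengths) < min_bp:
            flagged.append((name, min(lengths)))
    # Shortest introns first, so the most suspicious models head the list.
    return sorted(flagged, key=lambda item: item[1])


# Invented example gene models, for illustration only.
models = {
    "geneA": [(100, 200), (230, 400)],  # one intron of 29 bp
    "geneB": [(100, 200), (300, 400)],  # one intron of 99 bp
    "geneC": [(50, 500)],               # single exon, no introns
}
curation_list = short_intron_curation_list(models)
```

The same pattern, a per-feature test applied to every gene model and sorted by severity, would carry over to the other listed features such as weak splice sites, given a suitable scoring function.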
Manual Scan
• Regions
• 2 x 1 Mb pilot sequences.
• Each 1 Mb pilot will be made of 5 sequence blocks of ~200 kb.
• Gene-rich internal regions.
• Chromosome arms.
• Split between centre.
• 2 centres to collaborate and QC.
Scan Aims
• Data resulting from the scan will
• Identify new curation targets.
• Potentially alter current curation strategy.
• Timings
• 1 Mb.
• Whole genome.
Expanding Sequence Data
Caltech Literature Curators; Sanger/St Louis Sequence Curators.
• Curation of new sequence features.
• A start was made in Dec 2004.
• Manual annotation of 155 binding_site features taken from scientific literature.
• Possible features to curate
• Promoters defined experimentally in literature.
• EMBL/GenBank sequence features.
Binding Site Features: ceh-22
C. briggsae Gene Curation
• Agreed to start tackling known gene problems.
• User community.
• Data from the ~2500 briggsae ESTs.
• User community expectations.
• Is it good enough to have 1 well-curated gene set for elegans and partial annotation on briggsae, but only computationally derived gene sets for the other nematodes?
Discussion Topics
• Either refine elegans further or curate new genomes?
• Curation of other nematode species
• C. briggsae has ~2500 ESTs.
• C. remanei has ~15000 ESTs.
• Other elegans gene predictions
• Inclusion of additional predicted gene sets, e.g. GeneMark.hmm (Mark Borodovsky).
• Or is it good enough having the curated elegans set and 2 additional gene sets?
• Genefinder (Phil Green)
• Twinscan (Michael Brent)