EGASP 2005 Evaluation Protocol
Paul Flicek, EBI
Basics
• The evaluations are probably wrong
• GTF is not standard
• There are hidden assumptions
  • Filters, overlaps, clusters
• Terminology varies
  • Genes, exons, etc.
EGASP 2005 Evaluations
Evaluation Measures
• Exons and introns
  • Sensitivity (Sn)
  • Specificity (Sp)
  • Exon length
  • Exons per transcript
• Transcript
  • Sn / Sp
  • Overlap
• Gene
  • Sn / Sp
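The exon-level Sn and Sp measures can be sketched as follows. This is a minimal illustration, not the EGASP evaluation code. It assumes the usual gene-prediction convention in which Sp = TP / (TP + FP), and it assumes an exon counts as correct only when both boundaries match exactly. The function name `sn_sp` and the toy coordinates are hypothetical.

```python
def sn_sp(annotated, predicted):
    """Return (Sn, Sp) for two collections of hashable features.

    Assumed definitions (not stated on the slide):
      Sn = TP / (TP + FN) = correct features / annotated features
      Sp = TP / (TP + FP) = correct features / predicted features
    """
    annotated, predicted = set(annotated), set(predicted)
    tp = len(annotated & predicted)  # features with an exact match
    sn = tp / len(annotated) if annotated else 0.0
    sp = tp / len(predicted) if predicted else 0.0
    return sn, sp


# Hypothetical toy exons: (sequence, start, end, strand)
annotated_exons = {
    ("chr1", 100, 200, "+"),
    ("chr1", 300, 400, "+"),
    ("chr1", 500, 600, "+"),
}
predicted_exons = {
    ("chr1", 100, 200, "+"),  # exact match
    ("chr1", 300, 410, "+"),  # wrong end coordinate, so not counted
}

exon_sn, exon_sp = sn_sp(annotated_exons, predicted_exons)
print(f"Exon Sn = {exon_sn:.2f}, Exon Sp = {exon_sp:.2f}")
```

The same function applies to introns if each intron is represented by its own coordinate tuple.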
Definitions
• Positive Transcript
  • Correct translation start
  • Correct translation stop
  • Every splice site correct
• Positive Gene
  • At least one positive transcript
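The two definitions above can be sketched as follows. This is an illustration under stated assumptions, not the EGASP implementation: each transcript is reduced to a list of coding exons given as (start, end) pairs, and all transcripts are assumed to lie on the same sequence and strand. The names `landmarks`, `is_positive_transcript`, and `is_positive_gene` are hypothetical.

```python
def landmarks(cds_exons):
    """Reduce a transcript to (translation start, translation stop, splice sites)."""
    exons = sorted(cds_exons)
    start = exons[0][0]   # first coding base
    stop = exons[-1][1]   # last coding base
    # One (donor, acceptor) pair per intron between consecutive coding exons.
    splice = tuple(
        (left_end, right_start)
        for (_, left_end), (right_start, _) in zip(exons, exons[1:])
    )
    return start, stop, splice


def is_positive_transcript(predicted, annotated_transcripts):
    """True if start, stop, and every splice site match some annotated transcript."""
    target = landmarks(predicted)
    return any(landmarks(a) == target for a in annotated_transcripts)


def is_positive_gene(predicted_transcripts, annotated_transcripts):
    """True if at least one predicted transcript is positive."""
    return any(
        is_positive_transcript(p, annotated_transcripts)
        for p in predicted_transcripts
    )


# Hypothetical toy data
annotated = [
    [(100, 200), (300, 400)],
    [(100, 200), (300, 400), (500, 600)],
]
good = [(100, 200), (300, 400)]
bad = [(100, 200), (310, 400)]  # one acceptor site is off by 10 bases

print(is_positive_transcript(good, annotated))
print(is_positive_transcript(bad, annotated))
print(is_positive_gene([bad, good], annotated))
```

A single wrong splice site is enough to make a transcript negative, which is why transcript-level scores tend to be much lower than exon-level scores.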
Examples (annotation and prediction diagrams not preserved)
• Example 1: Trans Sn = 0.5, Trans Sp = 1.0, Gene Sn = 1.0, Gene Sp = 1.0
• Example 2: Trans Sn = 0.5, Trans Sp = 1.0, Gene Sn = 1.0, Gene Sp = 1.0
• Example 3: Trans Sn = 0.0, Trans Sp = 0.0, Gene Sn = 0.0, Gene Sp = 0.0
• Example 4: Trans Sn = 1.0, Trans Sp = 1.0, Gene Sn = 1.0, Gene Sp = 1.0
• Example 5: Trans Sn = 0.5, Trans Sp = 0.5, Gene Sn = 1.0, Gene Sp = 1.0
• Example 6: Trans Sn = 1.0, Trans Sp = 0.67, Gene Sn = 1.0, Gene Sp = 1.0
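A worked sketch of how such numbers can arise. The original diagrams are lost, so the toy gene below is an assumption chosen to give the same values as the last example (Trans Sn = 1.0, Trans Sp = 0.67, Gene Sn = 1.0, Gene Sp = 1.0); it is not a reconstruction of the slide. Transcripts are compared as exact coding-exon chains, which stands in for "correct start, correct stop, every splice site correct", and Sp is again assumed to be TP / (TP + FP). The function name `transcript_gene_scores` and all identifiers in the data are hypothetical.

```python
def ratio(numerator, denominator):
    """Safe division that returns 0.0 for an empty denominator."""
    return numerator / denominator if denominator else 0.0


def transcript_gene_scores(annotation, prediction):
    """Return (trans_sn, trans_sp, gene_sn, gene_sp).

    Both arguments map a gene id to a list of transcripts, where each
    transcript is a tuple of (start, end) coding exons.
    """
    ann_tx = {tx for txs in annotation.values() for tx in txs}
    pred_tx = {tx for txs in prediction.values() for tx in txs}
    correct = ann_tx & pred_tx  # positive transcripts

    # A gene is positive if at least one of its transcripts is positive.
    ann_hit = sum(any(tx in correct for tx in txs) for txs in annotation.values())
    pred_hit = sum(any(tx in correct for tx in txs) for txs in prediction.values())

    return (
        ratio(len(correct), len(ann_tx)),
        ratio(len(correct), len(pred_tx)),
        ratio(ann_hit, len(annotation)),
        ratio(pred_hit, len(prediction)),
    )


# One annotated gene with two transcripts (hypothetical coordinates).
annotation = {
    "geneA": [
        ((100, 200), (300, 400)),
        ((100, 200), (300, 400), (500, 600)),
    ]
}
# One predicted gene with three transcripts; the third skips an exon.
prediction = {
    "pred1": [
        ((100, 200), (300, 400)),
        ((100, 200), (300, 400), (500, 600)),
        ((100, 200), (500, 600)),
    ]
}

trans_sn, trans_sp, gene_sn, gene_sp = transcript_gene_scores(annotation, prediction)
print(f"Trans Sn = {trans_sn:.2f}, Trans Sp = {trans_sp:.2f}")
print(f"Gene Sn = {gene_sn:.2f}, Gene Sp = {gene_sp:.2f}")
```

Both annotated transcripts are recovered, so Trans Sn is 2/2. The extra predicted transcript lowers Trans Sp to 2/3 but leaves the gene-level scores at 1.0, because the gene still has at least one positive transcript.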
The winners are… (there are clear trends)
• The most successful programs use expressed sequences
• Programs using evolutionary conservation are more successful than those that do not
• Exon and nucleotide measures are similar
• We are improving
Spear Catching Time
EGASP 2005 Evaluations, Block 1
Paul Flicek, EBI
Expressed Sequence Methods
Nucleotide
Exon
Intron
Gene
Number of Genes 1027 1389
Unique Exons
Summary
EGASP 2005 Evaluations, Block 2
Paul Flicek, EBI
Evolutionary Conservation (Dual/Multiple Genome) Methods
Summary
EGASP 2005 Evaluations, Block 3a
Paul Flicek, EBI
Ab initio (single genome) and Exon only Methods
Summary
EGASP 2005 Evaluations, Block 3b
Paul Flicek, EBI
Open (Any) Methods
Summary