5' UTR Modeling, Part II
Sam Gross, Randy Brown
Old 5' UTR Model
[State diagram; labels: Einit, Inter, Utr5, Prom, Esngl, ATG, Coding Exon]
New 5' UTR Model
[State diagram; labels: Enc, Inc, Ep, Einit, Ea, Inter, Prom, Epa, Esngl, ATG, Coding Exon]
5' UTR “Coding” Model
• Scores the bases in a candidate 5' UTR feature with a Markov chain
• Different parameters for isochore I and for isochores II, III, and IV, just like CDS
• No frames in UTR sequence
• Null model: first 1000 bp of the first coding intron

Same parameters for all UTR features?
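
The Markov-chain scoring from the “Coding” model slide can be sketched as a log-odds score of a candidate UTR sequence under a UTR model versus a null model. This is only an illustration under stated assumptions: the chain order (2), the pseudocount, the toy training sequences, and the function names `train_markov` and `log_odds` are hypothetical and do not come from the slides. In practice, one parameter set would be trained per isochore class, and the null model would be trained on the first 1000 bp of first coding introns.

```python
import math
from collections import defaultdict

def train_markov(seqs, order=2, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) from training sequences.

    The order and pseudocount are assumptions; the slides do not state them.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1

    def prob(context, base):
        c = counts.get(context, {})
        total = sum(c.values()) + 4 * pseudocount  # 4 bases: A, C, G, T
        return (c.get(base, 0.0) + pseudocount) / total

    return prob

def log_odds(seq, utr_prob, null_prob, order=2):
    """Sum of per-base log ratios: UTR model versus null model (no frames)."""
    score = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        score += math.log(utr_prob(ctx, base) / null_prob(ctx, base))
    return score

# Toy training data (hypothetical), standing in for real UTR and intron sets.
utr_prob = train_markov(["GCGCGCGCGC", "CGCGCGCGGC"])
null_prob = train_markov(["ATATATATAT", "TATATATTAA"])

gc_score = log_odds("GCGCGCGC", utr_prob, null_prob)
at_score = log_odds("ATATATAT", utr_prob, null_prob)
```

A positive score means the sequence looks more like the UTR training data than like the null data; a negative score means the opposite.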
• This suggests Epa and Ep features should use one model, and Ea and Enc features should use another
• Also suggests two different initial exon models should be used: one for spliced UTR, one for unspliced UTR
• Helps explain why initial exons look so different from internal and terminal exons

However, it's really not so simple…
• Only about 75% of genes are associated with a CpG island
• Even if a feature is within a CpG island, it could fall in a locally normal-looking region
• Current plan: Ep, Epa, and Einit states will use a dual coding model, with a 75% chance to score using the CpG-related model and a 25% chance to score using the non-CpG-related model
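
One possible reading of the dual coding model is a two-component mixture: the feature's likelihood is 0.75 times its likelihood under the CpG-related model plus 0.25 times its likelihood under the non-CpG-related model. The slides do not say how the 75%/25% split is implemented, so the mixture form, the log-space computation, and the name `dual_model_log_score` are assumptions made for this sketch.

```python
import math

def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))

def dual_model_log_score(log_p_cpg, log_p_non_cpg, w_cpg=0.75):
    """Log-likelihood of a feature under a two-component mixture.

    log_p_cpg:     log-likelihood under the CpG-related model
    log_p_non_cpg: log-likelihood under the non-CpG-related model
    w_cpg:         weight of the CpG-related model (0.75 per the current plan)
    """
    return log_sum_exp(
        math.log(w_cpg) + log_p_cpg,
        math.log(1.0 - w_cpg) + log_p_non_cpg,
    )

# Hypothetical per-feature likelihoods, for illustration only.
score = dual_model_log_score(math.log(0.4), math.log(0.1))
```

Working in log space keeps the computation stable for long features, where the raw likelihoods would underflow.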
Future Directions
• Finish dual coding model for Ep, Epa, and Einit
• New 5' UTR conservation models
• More work on modeling CpG islands