Using Exons to Define Isoforms in PRO
Timothy Danford
Novartis Institutes for Biomedical Research
PRO / AlzForum Kickoff Meeting
Oct. 4, 2011
Genes vs. Proteins
• Gene
• Transcript
• Exon
• Locus
• Allele
• Variant
• SNP
• Indel
• Rearrangement
• Motif

• Protein
• Isoform
• Variant
• Domain
• Site
• Complex
• Motif
• Fragment
Can we join the worlds of PRO and of Genes at a finer-grained level than that of “full sequence”?
Isoforms in PRO Today
[Term]
id: PR:000010173
name: microtubule-associated protein tau
def: "A protein that is a translation product of the MAPT gene or a 1:1 ortholog thereof." [PRO:DNx]
comment: Category=gene. Flag=automatic.
synonym: "MAPT" EXACT PRO-short-label []
synonym: "neurofibrillary tangle protein" EXACT []
synonym: "paired helical filament-tau" EXACT []
synonym: "PHF-tau" EXACT []
synonym: "MAPTL" RELATED []
synonym: "Mtapt" RELATED []
synonym: "MTBT1" RELATED []
synonym: "TAU" RELATED []
is_a: PR:000000001 ! protein
PRO v23 (10/2/2011)
Isoforms in PRO Today
[Term]
id: PR:000026993
name: microtubule-associated protein tau isoform Fetal-tau
def: "A microtubule-associated protein tau that is a translation product of some mRNA giving rise to a protein with the amino acid sequence represented by UniProtKB:P10636-2 or a 1:1 ortholog thereof." [PRO:DAN]
comment: Category=sequence.
synonym: "Fetal-tau" EXACT []
is_a: PR:000010173 ! microtubule-associated protein tau
PRO v23 (10/2/2011)
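The isoform stanza above links to sequence data only through the UniProtKB accession embedded in its def text. As a minimal sketch of how that link could be pulled out programmatically, the Python below parses one stanza with plain string handling. The helper names (parse_stanza, uniprot_isoform, parent_ids) are hypothetical and not part of any PRO tooling, and the parser handles only the simple tag: value lines shown on the slide, not the full OBO format.

```python
import re

# The Fetal-tau stanza as shown on the slide (PRO v23).
STANZA = """\
[Term]
id: PR:000026993
name: microtubule-associated protein tau isoform Fetal-tau
def: "A microtubule-associated protein tau that is a translation product of some mRNA giving rise to a protein with the amino acid sequence represented by UniProtKB:P10636-2 or a 1:1 ortholog thereof." [PRO:DAN]
comment: Category=sequence.
synonym: "Fetal-tau" EXACT []
is_a: PR:000010173 ! microtubule-associated protein tau
"""


def parse_stanza(text):
    """Collect 'tag: value' lines of one stanza into a dict of lists."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("["):
            continue  # skip blank lines and the [Term] header
        tag, _, value = line.partition(": ")
        fields.setdefault(tag, []).append(value)
    return fields


def uniprot_isoform(fields):
    """Return the UniProtKB isoform accession named in the def, if any."""
    match = re.search(r"UniProtKB:([A-Z0-9]+-\d+)", fields["def"][0])
    return match.group(1) if match else None


def parent_ids(fields):
    """Return the is_a parent identifiers, dropping the '! label' comment."""
    return [value.split(" ! ")[0] for value in fields.get("is_a", [])]


term = parse_stanza(STANZA)
print(term["id"][0], uniprot_isoform(term), parent_ids(term))
```

The point of the sketch is that the isoform term carries no exon-level information: everything below "full sequence" has to be recovered from the external accession.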
Tau Isoforms Share Functionally Relevant Exons
“Conserved Protein Domains in Tau Suggest Functional Differences between Protein Isoforms”
[Figure: Adult Tau and Fetal Tau]
Slide: Gwen Wong (AlzForum), Image: http://www.med.upenn.edu/cndr/TauSynuclein.shtml
What Questions Could We Ask of PRO + Genomic Data?
• Which isoform corresponds to which transcript(s)?
• Which isoforms share a common feature?
    • common exons?
    • common domains? (pfam, interpro, etc.)
• Which “normal” protein isoforms overlap with SNPs or other genetic variants?
• How do protein sites line up to sites on the gene?
• How do mouse and human proteins correspond?
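Once isoforms are tied to exons, the variant question in the list above reduces to an interval lookup. The following is a rough sketch only: every name, coordinate, and exon assignment in it is made up for illustration and does not describe real MAPT data.

```python
# Hypothetical exon coordinates (1-based, inclusive) on one chromosome.
EXON_INTERVALS = {
    "exon 2": (1000, 1087),
    "exon 3": (2000, 2087),
    "exon 10": (5000, 5093),
}

# Hypothetical isoform-to-exon assignments.
ISOFORM_EXONS = {
    "iso_a": ["exon 2", "exon 3", "exon 10"],
    "iso_b": ["exon 10"],
    "iso_c": [],
}

# Hypothetical single-position variants.
SNPS = {"snp_1": 1050, "snp_2": 5093, "snp_3": 3000}


def isoforms_overlapping(position):
    """Return the isoforms with at least one exon covering the position."""
    hits = []
    for isoform, exons in ISOFORM_EXONS.items():
        if any(
            EXON_INTERVALS[exon][0] <= position <= EXON_INTERVALS[exon][1]
            for exon in exons
        ):
            hits.append(isoform)
    return sorted(hits)


for snp, position in SNPS.items():
    print(snp, isoforms_overlapping(position))
```

A variant that falls in an alternatively spliced exon affects only the isoforms that include that exon, which is the kind of answer a full-sequence representation cannot give directly.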
“Which isoform corresponds with which transcript(s)?”
Transcript Variant: This variant (4) lacks six internal coding exons, as compared to variant 6. The reading frame is not affected, and the resulting isoform (4) has identical N- and C-termini but lacks five segments, as compared to isoform 6.
Defined classes of isoforms, based on has_part and lacks_part relations to particular exons
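One way to read the has_part / lacks_part idea is as set membership over exon numbers. The sketch below is an illustration under stated assumptions, not PRO's actual representation: the DefinedClass type, the isoform labels, and their exon compositions are hypothetical, and the only figure taken from the deck is the 13 exons of the full-length transcript.

```python
from dataclasses import dataclass

# Exons 1..13 of the full-length transcript (exon count taken from the deck).
FULL_LENGTH = frozenset(range(1, 14))

# Hypothetical isoforms, described by the exons they retain.
ISOFORM_EXONS = {
    "iso_full": FULL_LENGTH,
    "iso_minus_10": FULL_LENGTH - {10},
    "iso_minus_2_3_10": FULL_LENGTH - {2, 3, 10},
}


@dataclass(frozen=True)
class DefinedClass:
    """A class of isoforms defined by required and forbidden exons."""
    name: str
    has_part: frozenset = frozenset()
    lacks_part: frozenset = frozenset()

    def admits(self, exons):
        # Every required exon is present and no forbidden exon is present.
        return self.has_part <= exons and self.lacks_part.isdisjoint(exons)


def members(defined_class, isoform_exons=ISOFORM_EXONS):
    """Return the isoforms that satisfy the class definition."""
    return sorted(
        name for name, exons in isoform_exons.items()
        if defined_class.admits(exons)
    )


has_exon_2 = DefinedClass("isoform with exon 2", has_part=frozenset({2}))
lacks_exon_10 = DefinedClass("isoform without exon 10", lacks_part=frozenset({10}))
both = DefinedClass(
    "isoform with exon 2 and without exon 10",
    has_part=frozenset({2}),
    lacks_part=frozenset({10}),
)

for cls in (has_exon_2, lacks_exon_10, both):
    print(cls.name, "->", members(cls))
```

In an ontology the membership test would be done by a reasoner over the asserted relations; the set operations here only mimic that classification step.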
How is “MAPT Exon 2” defined?
• Take the “exon” definition from SO:0000147
    • “A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.”
• Exon number defined relative to the full-length or “canonical” transcript
    • “An exon that corresponds (aligns) to the second of 13 exons in the full-length MAPT transcript...”
• Define the part of the protein derived from this portion of the transcript…
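The last step above, going from an exon to the part of the protein it encodes, can be sketched as coordinate arithmetic on the coding sequence. The function below is a hypothetical helper, and the exon lengths in the example are invented rather than taken from MAPT. It assumes the lengths count coding nucleotides only, in transcript order, and it assigns a codon that spans an exon junction to both neighbouring exons, which is one possible convention among several.

```python
def exon_residue_ranges(coding_lengths):
    """Map each exon to the 1-based, inclusive residue range it encodes.

    coding_lengths: ordered (exon_label, coding_nucleotide_count) pairs.
    A codon split across a junction is reported for both exons.
    """
    ranges = {}
    offset = 0  # 0-based position in the coding sequence
    for label, length in coding_lengths:
        if length <= 0:
            raise ValueError(f"{label}: coding length must be positive")
        start, end = offset, offset + length  # half-open nucleotide span
        first_residue = start // 3 + 1
        last_residue = (end - 1) // 3 + 1
        ranges[label] = (first_residue, last_residue)
        offset = end
    return ranges


# Invented lengths: 24 coding nucleotides, so 8 residues in total.
example = [("exon 1", 10), ("exon 2", 8), ("exon 3", 6)]
for label, (first, last) in exon_residue_ranges(example).items():
    print(f"{label}: residues {first}-{last}")
```

In the example, residue 4 is reported for both exon 1 and exon 2 because its codon straddles their junction; a stricter convention could assign such residues to a single exon instead.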