Systematic identification of abundant A-to-I editing sites in the human transcriptome. Levanon et al.
General RNA editing • End modifications (5’ cap and 3’ poly-A) • Splicing/alternative splicing • Cutting • Pre-tRNA & pre-rRNA cut to yield 2 or more functional molecules • Common in all life
General RNA editing • Chemical modification • Chemical groups added to nucleotides • Nucleotides themselves modified • Common in tRNA & rRNA • Also common in all life • Happens (rarely?) in mRNA
mRNA editing • Modifications in the coding region can alter the sequence of the expressed protein • Ex: apolipoprotein B • C → U edit changes codon CAA → UAA, creating an in-frame stop • Resulting proteins of different lengths (4563 aa and 2153 aa) are expressed in different tissues (liver and intestine)
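The codon change on this slide can be illustrated with a minimal sketch. The codon table below holds only the two codons mentioned above, and the function name `edit_c_to_u` is a hypothetical helper chosen for illustration, not something from the paper.

```python
# Minimal codon table: only the two codons discussed on the slide.
CODON_TABLE = {"CAA": "Gln", "UAA": "Stop"}


def edit_c_to_u(codon: str, position: int) -> str:
    """Return the codon with the C at `position` (0-based) changed to U."""
    if codon[position] != "C":
        raise ValueError("C-to-U editing requires a C at the edited position")
    return codon[:position] + "U" + codon[position + 1:]


original = "CAA"
edited = edit_c_to_u(original, 0)

# Show the amino acid encoded before and after the edit.
print(original, CODON_TABLE[original], "->", edited, CODON_TABLE[edited])
```

A single-base edit at the first codon position is enough to turn a glutamine codon into a stop codon, which is why the edited transcript yields the shorter protein.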
mRNA editing • Ex: GluR-B (glutamate receptor) gene undergoes 2 edits • Glutamine → Arginine • Arginine → Glycine • Modifications are at the protein’s active site (an ion channel) • Alters protein function
ADAR mediated mRNA editing • Paper focuses on ADAR mediated editing • ADAR: adenosine deaminases that act on RNA • ADAR is the enzyme that catalyzes editing • Converts an A to I (inosine) by removing the amine group • I is chemically similar to G and is read as G, so edits appear as A-to-G mismatches against the genome
ADAR mediated mRNA editing • Substrate of ADAR is dsRNA • Requires 2 nearby inverted repeats to form stem structure
ADAR mediated mRNA editing • ADAR is necessary for normal development • Knockout mice without ADAR1 die before birth • Mice without ADAR2 survive to birth but die soon after • ADAR-deficient invertebrates show behavioral defects
Identification of ADAR edit sites • Naive method • Align transcribed sequence to genome • Call A-G mismatches as edit sites • Problems • High error rate in sequencing transcripts (~3%) • SNPs get classified as edit sites
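The naive method can be sketched as follows, assuming the transcript has already been aligned to the genome without gaps so that the two strings have equal length. The function name and the toy sequences are hypothetical.

```python
def call_naive_edit_sites(genome: str, transcript: str) -> list:
    """Return 0-based positions where the genome has A and the transcript has G.

    Assumes two pre-aligned, equal-length, gap-free strings.
    """
    if len(genome) != len(transcript):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (g, t) in enumerate(zip(genome, transcript))
        if g == "A" and t == "G"
    ]


# Toy sequences (made up for illustration): one A-to-G mismatch at index 4.
genome = "ACGTAACGTA"
transcript = "ACGTGACGTA"
naive_sites = call_naive_edit_sites(genome, transcript)
print(naive_sites)
```

Nothing in this procedure distinguishes a true edit from a sequencing error or a SNP, which is the problem the slide points out.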
Identification of ADAR edit sites • Solution presented in paper • Use knowledge of the ADAR mechanism to filter out errors and SNPs • Start with the human EST/cDNA sequence database (GenBank) • Align spliced sequence to genome to get genomic locus • Align expressed sequence (exons) to genomic locus
Identification of ADAR edit sites • Keep reverse complement alignments • >32 bp in length • >85% identity
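The two thresholds above can be expressed as a simple filter. This is a sketch only: the `Alignment` record, its field names, and the example alignments are assumptions made for illustration; the paper's actual alignment tooling is not described on the slide.

```python
from dataclasses import dataclass

MIN_LENGTH_BP = 32    # slide: keep alignments > 32 bp
MIN_IDENTITY = 0.85   # slide: keep alignments > 85% identity


@dataclass
class Alignment:
    length_bp: int            # aligned length in base pairs
    identity: float           # fraction of matching positions, 0..1
    reverse_complement: bool  # True if the exon aligns to the opposite strand


def is_candidate_dsrna(aln: Alignment) -> bool:
    """Keep only reverse-complement alignments that pass both thresholds."""
    return (
        aln.reverse_complement
        and aln.length_bp > MIN_LENGTH_BP
        and aln.identity > MIN_IDENTITY
    )


# Hypothetical alignments for illustration.
alignments = [
    Alignment(120, 0.91, True),   # passes all three conditions
    Alignment(30, 0.95, True),    # too short
    Alignment(200, 0.80, True),   # identity too low
    Alignment(150, 0.92, False),  # same strand, cannot form a stem
]
kept = [a for a in alignments if is_candidate_dsrna(a)]
print(len(kept))
```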
Identification of ADAR edit sites • Results in 429,000 putative dsRNA regions in 14,512 genes • Filter out candidates derived from low quality transcribed sequences • Filter out all known SNPs • Mismatches in remaining dsRNA regions called as ADAR edit sites
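The filtering steps on this slide could look roughly like the sketch below. The tuple layout `(chromosome, position, transcript_id)`, the function name, and all example values are hypothetical; the slide does not say how low quality was defined or which SNP resource was used.

```python
def filter_mismatches(mismatches, known_snps, low_quality_transcripts):
    """Drop mismatches from low-quality transcripts or at known SNP positions.

    mismatches: iterable of (chromosome, position, transcript_id) tuples
    known_snps: set of (chromosome, position) tuples
    low_quality_transcripts: set of transcript ids
    """
    return [
        (chrom, pos, tid)
        for chrom, pos, tid in mismatches
        if tid not in low_quality_transcripts and (chrom, pos) not in known_snps
    ]


# Hypothetical mismatches found inside putative dsRNA regions.
mismatches = [
    ("chr1", 1000, "EST_A"),  # survives both filters
    ("chr1", 2000, "EST_B"),  # coincides with a known SNP
    ("chr2", 500, "EST_C"),   # comes from a low-quality transcript
]
known_snps = {("chr1", 2000)}
low_quality_transcripts = {"EST_C"}

putative_sites = filter_mismatches(mismatches, known_snps, low_quality_transcripts)
print(putative_sites)
```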
Identification of ADAR edit sites • Get 12,723 putative edit sites in 1,673 genes • Over 80% of mismatches are A-G
Sensitivity/Specificity • Parameters set to minimize false positive rate • Several well known editing examples not picked up • May be lots more sites than found in this paper
Experimental validation • Pick 30 novel predicted edit sites and test • Sequence transcripts from 5 tissues separately and pooled • Detected editing at 26/30 sites
Characterization of editing sites • 92% of edits in Alu repeats • 12% of all editing events in positions 27 & 28 of Alu repeats • 1.3% of edits in L1 repeats • Most of the time (83%) only one expressed sequence shows editing • Editing is not deterministic
mRNA editing in the brain • Previous work suggested most pre-mRNA editing in brain is in non-coding regions • In this study • 12% in 5’ UTR • 54% in 3’ UTR • 33% in introns
dsRNA stability • Edits can stabilize or destabilize dsRNA • Destabilize (78%) • A-U → I-U • Stabilize (19%) • A-C → I-C • Neutral (3%) • A-A → I-A • A-G → I-G
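The classification on this slide depends only on the base opposite the edited A in the stem. A minimal sketch, assuming the opposing base is already known for each edit; the function names and the toy input are hypothetical, and the toy fractions are not the paper's numbers.

```python
from collections import Counter

# Effect of an A-to-I edit, keyed by the base opposite the edited A (from the slide).
EFFECT_BY_OPPOSING_BASE = {
    "U": "destabilize",  # A-U pair becomes I-U
    "C": "stabilize",    # A-C mismatch becomes I-C
    "A": "neutral",      # A-A becomes I-A
    "G": "neutral",      # A-G becomes I-G
}


def classify_edit(opposing_base: str) -> str:
    """Return the stability effect for an edit facing `opposing_base`."""
    return EFFECT_BY_OPPOSING_BASE[opposing_base.upper()]


def effect_fractions(opposing_bases) -> dict:
    """Return the fraction of edits in each stability class."""
    counts = Counter(classify_edit(b) for b in opposing_bases)
    total = sum(counts.values())
    return {k: counts[k] / total for k in ("destabilize", "stabilize", "neutral")}


# Toy input: opposing bases for five hypothetical edits.
fractions = effect_fractions(["U", "U", "U", "C", "A"])
print(fractions)
```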
dsRNA stability • Mechanism seems to prefer stabilization over destabilization • 22% (19 + 3) of events targeted a mismatched base pairing • Frequency of mismatched base pairs at nearby sites was only 10%
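The comparison behind this slide is a simple ratio of the two percentages. The sketch below only reproduces that arithmetic from the figures quoted above; it is not a statistical test, and the variable names are chosen for illustration.

```python
# Percentages quoted on the slides.
stabilizing_pct = 19          # edits at A-C mismatches
neutral_pct = 3               # edits at A-A or A-G mismatches
background_mismatch_pct = 10  # mismatched base pairs at nearby sites

# Edits that targeted an already mismatched position.
targeted_mismatch_pct = stabilizing_pct + neutral_pct

# How much more often edits fall on mismatches than the background rate.
enrichment = targeted_mismatch_pct / background_mismatch_pct
print(f"{targeted_mismatch_pct}% vs {background_mismatch_pct}%: {enrichment:.1f}x")
```

Edited positions sit opposite a mismatch roughly twice as often as neighboring positions do, which is the basis for the slide's claim of a preference.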
dsRNA stability • Why does ADAR want to make stable stem-loops in pre-mRNA? • May have something to do with regulation of RNAi • Might not want to limit search to expressed sequences but look at whole transcripts