Mutations and Epimutations: A story of two cultivars and their children. Matteo Pellegrini
Nipponbare and 93-11 • Nipponbare: Oryza sativa japonica; primarily Japan, China, Indonesia; agronomic differences: days to heading • 93-11: Oryza sativa indica; India, Bangladesh, Nepal, China; submerged growth; agronomic differences: seed fertility, long grain, taller (83 cm)
Why Study Crosses? • Crosses of Indica and Japonica are often sterile • Show hybrid vigor in agronomic traits
Overview • Identify SNPs between ecotypes • SNP generation • Identify epimutations between ecotypes • Identify methylation inheritance • Identify allele-specific expression • Identify RNA editing • 2 rice ecotypes: Nipponbare (NPB) and 93-11 • Generated BS-seq data for NPB, 93-11, and 2 reciprocal crosses (Figure: parents (P) NPB and 9311 and their F1 crosses.)
Detecting Cytosine Methylation • How can A, unmethylated C, methylated C, G, and T be told apart? • Apply sodium bisulfite and amplify • Unmethylated C→T; methylated C (and A/G/T) unchanged • Example (three methylated Cs, marked m on the slide): …ACCCGTACCCGATTAG… becomes …ATCTGTATCCGATTAG… • Align the converted sequence to a known reference and compare
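The conversion rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `bisulfite_convert` and the 0-based set of methylated positions are assumptions made for the example, and the positions {2, 8, 9} were inferred by comparing the two sequences shown on the slide.

```python
def bisulfite_convert(seq, methylated):
    """Simulate bisulfite treatment of one strand.

    seq        -- DNA sequence (upper case)
    methylated -- set of 0-based positions of methylated Cs (hypothetical input)
    Unmethylated C becomes T; methylated C and A/G/T are left unchanged.
    """
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


# Slide example: the Cs at positions 2, 8 and 9 are protected by methylation.
converted = bisulfite_convert("ACCCGTACCCGATTAG", {2, 8, 9})
print(converted)
```

Comparing `converted` with the original sequence recovers the methylation state of every C, which is what the alignment step relies on.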
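Mapping Approach: BS Seeker • BS reads are C/T converted, so normal aligners are not applicable • Three-letter alignment: convert C to T in reads and reference genome, map with Bowtie, restore to 4 letters, compare alignments • Example: BS reads AATCGTA and AATTGTA both become AATTGTA; reference genome CTAATCGCAGG becomes TTAATTGTAGG; after mapping, the restored reads are compared with the reference to label each C as methylated (m) or unmethylated (u) • Chen et al. (2010) BMC Bioinformatics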
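A toy version of the three-letter idea, using the reads and reference from the slide, might look like the following. It is a sketch under strong assumptions: exact string search stands in for Bowtie, only the forward strand is handled, and the helper names (`c_to_t`, `map_read`, `call_methylation`) are invented for the example rather than taken from BS Seeker.

```python
def c_to_t(seq):
    """Collapse to a three-letter alphabet by converting every C to T."""
    return seq.upper().replace("C", "T")


def map_read(read, ref):
    """Locate the converted read in the converted reference.

    Exact matching via str.find is a stand-in for Bowtie here.
    Returns the 0-based start position, or -1 if there is no match.
    """
    return c_to_t(ref).find(c_to_t(read))


def call_methylation(read, ref, pos):
    """Compare the original (4-letter) read with the original reference.

    At each reference C: read C means methylated ("m"),
    read T means unmethylated ("u").
    """
    calls = []
    for i, (r, g) in enumerate(zip(read.upper(), ref[pos:pos + len(read)].upper())):
        if g == "C":
            if r == "C":
                calls.append((pos + i, "m"))
            elif r == "T":
                calls.append((pos + i, "u"))
    return calls


ref = "CTAATCGCAGG"
for read in ("AATCGTA", "AATTGTA"):
    pos = map_read(read, ref)
    print(read, pos, call_methylation(read, ref, pos))
```

Both reads land at the same position because they are identical after conversion; the methylation calls only differ once the original four-letter reads are restored.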
Methylation levels at single-base resolution • Reads: tagtgcgtggtg, cattttagtgcgtgg, ttttagcgcgtggtg • Ref. genome: 5'--attgagacatcctagcgcgtggtgacaataata--3' • Calculate methylation level at each covered cytosine • Methylation level = #C/(#C+#T) • Examples: 1/(1+2) = 33.3%; 3/(3+0) = 100%
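The #C/(#C+#T) calculation can be reproduced on the slide's own reads. In this sketch the read start offsets (12, 7, 9) are not given in the slides; they were inferred by lining the reads up against the reference, and the function name `pileup_levels` is hypothetical. Only forward-strand reads are considered.

```python
from collections import defaultdict


def pileup_levels(ref, aligned_reads):
    """Methylation level = #C / (#C + #T) at each covered reference cytosine.

    ref           -- reference sequence
    aligned_reads -- list of (0-based start, read sequence) pairs
    Returns {position: level} for cytosines covered by at least one C or T.
    """
    counts = defaultdict(lambda: [0, 0])  # position -> [#C, #T]
    ref = ref.upper()
    for start, read in aligned_reads:
        for offset, base in enumerate(read.upper()):
            pos = start + offset
            if ref[pos] != "C":
                continue
            if base == "C":
                counts[pos][0] += 1
            elif base == "T":
                counts[pos][1] += 1
    return {pos: c / (c + t) for pos, (c, t) in sorted(counts.items())}


ref = "attgagacatcctagcgcgtggtgacaataata"
reads = [(12, "tagtgcgtggtg"), (7, "cattttagtgcgtgg"), (9, "ttttagcgcgtggtg")]
levels = pileup_levels(ref, reads)
print(levels[15], levels[17])  # the two cytosines worked out on the slide
```

Positions 15 and 17 correspond to the two cytosines whose levels the slide reports as 1/(1+2) and 3/(3+0).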
Workflow • Alignments: BS-Seeker mapping of NPB and 9311 samples to the NPB reference genome; this maps the 9311 genome to NPB coordinates • Parent genomes: each read generates a small implied sequence fragment; use these to generate a parent genome • F1 read matching: map reads to the NPB reference genome to get location; compare each read to the NPB and 9311 parent genomes and determine the better match
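The F1 read-matching step could be sketched as a bisulfite-aware mismatch count against each parent genome. The scoring rule below (a read T opposite a parent C is not penalized), the forward-strand-only handling, and the names `bs_mismatches` and `assign_parent` are all assumptions for illustration; the slides do not say how the "better match" was scored. The short parent sequences are made-up toy data.

```python
def bs_mismatches(read, parent):
    """Count mismatches, tolerating bisulfite conversion (parent C read as T)."""
    mismatches = 0
    for r, p in zip(read, parent):
        if r == p:
            continue
        if p == "C" and r == "T":
            continue  # possibly an unmethylated C, not evidence of a SNP
        mismatches += 1
    return mismatches


def assign_parent(read, pos, npb_genome, p9311_genome):
    """Assign a mapped F1 read to the parent genome it matches better."""
    end = pos + len(read)
    npb = bs_mismatches(read, npb_genome[pos:end])
    p93 = bs_mismatches(read, p9311_genome[pos:end])
    if npb < p93:
        return "NPB"
    if p93 < npb:
        return "9311"
    return "ambiguous"


# Toy parent genomes in NPB coordinates: SNPs at positions 2 (G/A) and 7 (C/T).
npb = "AAGCTTACGA"
p9311 = "AAACTTATGA"
print(assign_parent("TTACGA", 4, npb, p9311))
print(assign_parent("AAATTT", 0, npb, p9311))
print(assign_parent("TTATGA", 4, npb, p9311))
```

The last read shows a limitation of this scheme: on the forward strand a T is compatible with both a C and a T parent allele, so a read covering only a C/T SNP can stay ambiguous.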
Detecting Allele-Specific Methylation (Figure: BS-seq reads separated at a parent1/parent2 SNP; methylation level at CG sites shown for parent1 and for parent2.)
Identifying SNPs • A site is called a SNP if it has: > 3 reads/strand; > 90% agreement within each ecotype; strands that agree with each other (compensating for bisulfite-converted Cs); and ecotypes that (obviously) disagree with each other • Will miss indels, duplications, inversions, and other chromosomal rearrangements • Will miss long runs of SNPs (> 3 within ~55 bp; BS-Seeker limit)
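One way the SNP criteria above might be coded is shown below. The thresholds come from the slide, but the way strands are reconciled is an assumption: forward-strand reads cannot tell C from T, and reverse-strand reads (reported in forward coordinates) cannot tell G from A, so each strand is collapsed to three letters and the other strand resolves the ambiguity. All function names are hypothetical.

```python
from collections import Counter


def collapsed_consensus(counts, collapse_from, collapse_to,
                        min_reads=4, min_frac=0.9):
    """Consensus base on one strand after merging a bisulfite-ambiguous pair.

    Requires more than 3 reads and more than 90% agreement (slide thresholds).
    """
    merged = Counter()
    for base, n in counts.items():
        merged[collapse_to if base == collapse_from else base] += n
    total = sum(merged.values())
    if total < min_reads:
        return None
    base, n = merged.most_common(1)[0]
    return base if n / total > min_frac else None


def call_base(fwd_counts, rev_counts):
    """Genotype one site from per-strand base counts (forward coordinates)."""
    f = collapsed_consensus(fwd_counts, "C", "T")  # A, G, or T (= C or T)
    r = collapsed_consensus(rev_counts, "G", "A")  # C, T, or A (= G or A)
    if f is None or r is None:
        return None
    if f == "T" and r in ("C", "T"):
        return r  # reverse strand resolves C vs T
    if r == "A" and f in ("A", "G"):
        return f  # forward strand resolves G vs A
    return None   # strands disagree


def is_snp(site_a, site_b):
    """True when both ecotypes have confident calls that differ."""
    a, b = call_base(*site_a), call_base(*site_b)
    return a is not None and b is not None and a != b


npb_site = (Counter({"C": 3, "T": 2}), Counter({"C": 5}))   # partly methylated C
p9311_site = (Counter({"T": 6}), Counter({"T": 5}))         # genuine T
print(call_base(*npb_site), call_base(*p9311_site), is_snp(npb_site, p9311_site))
```

Under this scheme a partly methylated C does not fail the agreement threshold on the forward strand, because C and T are pooled there before the 90% test.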
SNPs - NPB vs 93-11 • 1,209,456 mutations / 306,106,830 sites with mutual base calls • ~ 1/253 bases • Mostly (73%) C->T (or G->A if the C->T is on the opposite strand), or T->C and A->G if in the other ecotype (93-11)
SNPs - NPB vs F1 (9N-NPB) • 12 mutations • Are these real or false? • Similar numbers amongst all F1 comparisons
Identifying epimutations • Use the binomial distribution to build min, max, and mean percent methylation at each C • Confidence intervals at 5% give the min and max • As the number of reads increases, the interval size decreases (Figure: min/max vs. number of reads.)
Identifying epimutations (cont) • Called different if: • mean(sample1) < min(sample2) & mean(sample2) > max(sample1)
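The interval construction and the "called different" rule could be sketched as follows. Several choices here are assumptions, not details from the slides: the interval is an exact (Clopper-Pearson) two-sided interval at alpha = 0.05 found by bisection, "mean" is taken to be #C/(#C+#T), and the rule is also applied with the two samples swapped. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_interval(c, n, alpha=0.05):
    """(min, max) methylation fraction for c methylated reads out of n."""
    def bisect(too_small):
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: smallest p with P(X >= c) = alpha/2.
    lower = 0.0 if c == 0 else bisect(
        lambda p: 1 - binom_cdf(c - 1, n, p) < alpha / 2)
    # Upper bound: largest p with P(X <= c) = alpha/2.
    upper = 1.0 if c == n else bisect(
        lambda p: binom_cdf(c, n, p) > alpha / 2)
    return lower, upper


def differs(c1, t1, c2, t2):
    """Slide rule: mean(s1) < min(s2) and mean(s2) > max(s1), or the reverse."""
    n1, n2 = c1 + t1, c2 + t2
    mean1, mean2 = c1 / n1, c2 / n2
    min1, max1 = exact_interval(c1, n1)
    min2, max2 = exact_interval(c2, n2)
    s1_lower = mean1 < min2 and mean2 > max1
    s2_lower = mean2 < min1 and mean1 > max2  # assumed symmetric case
    return s1_lower or s2_lower


print(exact_interval(5, 10), exact_interval(50, 100))
print(differs(0, 20, 20, 0), differs(5, 5, 6, 4))
```

The two printed intervals illustrate the point made on the previous slide: at the same methylation fraction, more reads give a narrower interval, so low-coverage sites are rarely called different.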
Epimutation rate • 1 in 300 CG sites spontaneously epimutates across one generation
Epimutation clusters (Figure tracks: 9311 cross, 9311 cross, 9311 parent, NPB cross, NPB cross, NPB parent.)
Epimutation clusters II (Figure tracks: 9311 cross, 9311 cross, 9311 parent, NPB cross, NPB cross, NPB parent.)
Epimutations are enriched in regions where parents differ • Half of the epimutations between parents and crosses occur at sites where the parents differ
Epimutations (continued) • Epimutations within genes • 498 genes were significantly enriched for epimutations • GO term analysis across ecotypes indicates ATP-synthesis-related activity (ATP synthesis coupled proton transport, hydrogen transport, ion transmembrane transport, etc.)
Expression • Many genes (~7800/25640) are differentially expressed between ecotypes • GO terms: chloroplast-related terms, response to cadmium ion
Expression (cont.) • Across generations, only 78 genes are differentially expressed • Of these, only 2 were differentially expressed in the parents
Allele-Specific Expression • 681 examples of allele-specific expression • Partially explain hybrid vigor? (Figure labels: NPB parent, 9311 parent, NPB cross, 9311 cross.)
Allele-Specific Genes Accumulate Mutations • They are also enriched for differentially methylated sites (Figure: SNP density, all genes vs. allele-specific genes.)
Allele-Specific Expression (cont.) • Allele-specific genes are also enriched for differentially methylated sites
RNA Editing • Cytidine deamination: C to U • Adenosine deamination: A to I (read as G)
How Widespread? • Recent studies indicate that RNA editing may be more widespread than originally thought • Others have disputed this claim (Schrider et al., PLoS ONE) • In plants, RNA editing is thought to take place in the mitochondria and plastids • Is there editing in nuclear genes? • Science. 2011 Jul 1;333(6038):53-8.
RNA Editing in Rice Initially we found lots of examples….
On Closer Inspection… Alignments are often off by one or more bases at splice sites
But More Filtering Should Be Done… (Figure: position of edit site along read.)
Conclusions • Epimutation rates are one in 300 cytosines across one generation • Clusters of epimutations are present • Epimutations are enriched at sites where parental epigenomes differ • Allele-specific expression is widespread and associated with increased SNP densities and higher differential methylation • Find some evidence for RNA editing but…
Acknowledgements • Krishna Chodavarapu (Pellegrini Lab) • Suhua Feng (Steve Jacobsen Lab) • Blake Myers, Guo-liang Wang, Yulin Jia