The bonobo genome compared with the chimpanzee and human genomes
Kay Prüfer et al. Nature (June 2012)
Presenter: Chia-Ying Chen
Materials
• a female bonobo (Ulindi, Leipzig Zoo)
  -- 454 sequencing
  -- paired-end reads of insert sizes of 3, 9 and 20 kb
  (a total depth of 26X)
• 19 individuals:
  • 3 bonobos
  • 2 western chimpanzees
  • 7 eastern chimpanzees
  • 7 central chimpanzees
  -- Illumina 76 or 101 paired end (about 1X coverage)
Assessment of the Bonobo Genome
panTro2 (Clint) (6X Sanger sequenced)
Retrotransposon Evolution in the Bonobo Genome
using GMAP to align all available bonobo (198 million reads) and chimpanzee (46 million reads) sequence traces to hg18
991 27 30 99.7%
Using Alu retrotransposons to estimate split times
[Figure: tree of B, C, H and O with split times of 15M, 6.5M and 2.2M years]
Ontology analysis of transposons
• to create 2 million simulated insertions
• to count numbers of observed vs. simulated transposon integrants inside or within +/- 50 kb of RefSeq genes
• to query the PANTHER database for biological processes and molecular functions
Result: enrichment or depletion of L1 integrants
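The observed-versus-simulated comparison above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes a single chromosome, gene coordinates given as (start, end) pairs, and the hypothetical helper names `merge_intervals`, `count_near_genes` and `enrichment_ratio`. Only the +/- 50 kb flank comes from the slide.

```python
from bisect import bisect_right

FLANK = 50_000  # +/- 50 kb around each gene, as stated on the slide


def merge_intervals(genes, flank=FLANK):
    """Pad each (start, end) gene by `flank` and merge overlapping intervals."""
    padded = sorted((max(0, s - flank), e + flank) for s, e in genes)
    merged = []
    for s, e in padded:
        if merged and s <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], e)
        else:
            merged.append([s, e])
    return merged


def count_near_genes(positions, merged):
    """Count insertion positions that fall inside any merged interval."""
    starts = [s for s, _ in merged]
    hits = 0
    for p in positions:
        i = bisect_right(starts, p) - 1  # last interval starting at or before p
        if i >= 0 and p <= merged[i][1]:
            hits += 1
    return hits


def enrichment_ratio(observed, simulated, genes):
    """Ratio of observed to simulated fractions of insertions near genes.

    A value above 1 suggests enrichment, below 1 depletion (hypothetical
    summary; the slide does not say which statistic was used).
    """
    merged = merge_intervals(genes)
    obs_frac = count_near_genes(observed, merged) / len(observed)
    sim_frac = count_near_genes(simulated, merged) / len(simulated)
    if sim_frac == 0:
        raise ValueError("no simulated insertion falls near a gene")
    return obs_frac / sim_frac
```

In practice the simulated set would hold the 2 million random insertions, and the same counting would be repeated per PANTHER category.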
Divergence, Site Pattern Analysis and Signals of Admixture
to gain insight into the relationship between and within bonobo and chimpanzee populations
Data:
• the Illumina reads of 16 chimpanzees and 3 bonobos
• the 454 reads of Ulindi
• the Sanger sequencing reads of Clint (panTro2)
Divergence times
• Bonobo - Chimpanzee: 2.2 million years
• Clint - Central chimpanzee: 1.3 million years
• Clint - Eastern chimpanzee: 1.3 million years
• Clint - Western chimpanzee: 0.5 million years
• Ulindi - Bonobo: 0.5 million years
Nc: A equals B, and C different
Nb: A equals C, and B different
Divergence between A and B :
Site Pattern Analysis and Signals of Admixture
in blocks of 5 megabases
[Figure: tree of C1, C2, B and H]
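As a rough illustration of block-wise site pattern counting for the four samples C1, C2, B and H, the sketch below counts sites where B shares a non-H allele with C1 only, or with C2 only. It assumes four pre-aligned sequences of equal length given as strings, with H used as the outgroup state. The function names and the normalized difference in `d_statistic` are assumptions for illustration; the slides do not state the exact statistic used.

```python
def site_pattern_counts(c1, c2, b, h, block_size=5_000_000):
    """Return one (n_b_with_c1, n_b_with_c2) pair per block of the alignment."""
    if not (len(c1) == len(c2) == len(b) == len(h)):
        raise ValueError("sequences must be aligned to the same length")
    blocks = []
    for start in range(0, len(h), block_size):
        end = start + block_size
        n_c1 = n_c2 = 0
        for x1, x2, xb, xh in zip(c1[start:end], c2[start:end],
                                  b[start:end], h[start:end]):
            if any(x not in "ACGT" for x in (x1, x2, xb, xh)):
                continue  # skip gaps and ambiguous bases
            if xb == x1 and x1 != xh and x2 == xh:
                n_c1 += 1  # B matches C1; C2 matches H
            elif xb == x2 and x2 != xh and x1 == xh:
                n_c2 += 1  # B matches C2; C1 matches H
        blocks.append((n_c1, n_c2))
    return blocks


def d_statistic(blocks):
    """Normalized excess of B-with-C1 over B-with-C2 sites (assumed summary)."""
    total_c1 = sum(n1 for n1, _ in blocks)
    total_c2 = sum(n2 for _, n2 in blocks)
    if total_c1 + total_c2 == 0:
        raise ValueError("no informative sites")
    return (total_c1 - total_c2) / (total_c1 + total_c2)
```

Keeping the counts per block rather than genome-wide makes it possible to estimate the variance of the summary by resampling blocks.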
Speciation Times, Ancestral Population Size and Incomplete Lineage Sorting
• Two conditions may lead to gene trees with a topology different from the species tree:
  • The population size of the ancestral species is sufficiently large.
  • The time span between speciation events is sufficiently small.
Regions with such discordant gene trees are said to show incomplete lineage sorting (ILS).
Based on the 4-way alignment (HCBO):
• set phred score = 30
• masked the RepeatMasker track
• removed over-collapsed regions due to duplications
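The two conditions above can be made concrete with the standard neutral coalescent result for three species: the probability that a gene tree disagrees with the species tree is (2/3) * exp(-t / (2N)), where t is the number of generations between the two speciation events and N is the diploid effective size of the ancestral population. This formula is textbook theory rather than something taken from the slides, and the function name and the numbers in the usage note are hypothetical.

```python
import math


def discordance_probability(generations_between_splits, ancestral_ne):
    """P(gene tree topology differs from species tree) for three species.

    Standard coalescent approximation: (2/3) * exp(-t / (2N)).
    """
    if ancestral_ne <= 0 or generations_between_splits < 0:
        raise ValueError("need N > 0 and t >= 0")
    t = generations_between_splits
    return (2.0 / 3.0) * math.exp(-t / (2.0 * ancestral_ne))
```

For example, calling `discordance_probability(100_000, 50_000)` versus `discordance_probability(100_000, 10_000)` shows that a larger ancestral population gives more discordance, and shrinking the first argument has the same effect.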
CoalHMM analysis is run on each 1-megabase chunk of the alignment
Correlation between ILS and gene ontology classes
• to count the bases in each of the four ILS states for the entire length of genes including introns
• to carry out GO enrichment test using FUNC
• to identify GO categories that are either enriched or depleted for CH and BH bases using Wilcoxon rank test
Genes depleted in ILS: intracellular, transcription, translation
Genes enriched in ILS: protein signal to the membrane, cell adhesion
But, no preferential GO terms when the analyses separately identify GO categories for CH and BH bases.
Incomplete Lineage Sorting Regions and Balancing Selection
Some regions may be enriched in incomplete lineage sorting (ILS) due to long-standing balancing selection.
We considered ILS assignment in 50 kb windows to identify candidate regions
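One way to summarize ILS assignments in 50 kb windows is to compute the fraction of bases in each window that carry an ILS state. The sketch below assumes the assignments are available as half-open (start, end, state) segments on one chromosome, and treats "CH" and "BH" as the ILS states because those are the labels used in this presentation. The "BC" label in the usage note and the function name are hypothetical, and no candidate threshold is implied.

```python
from collections import defaultdict

WINDOW = 50_000            # 50 kb windows, as stated on the slide
ILS_STATES = {"CH", "BH"}  # assumed labels for the ILS topologies


def ils_fraction_by_window(segments, window=WINDOW):
    """Map window index -> fraction of assigned bases that are in an ILS state.

    `segments` is an iterable of half-open (start, end, state) tuples.
    Segments that cross a window boundary are split between windows.
    """
    ils = defaultdict(int)
    total = defaultdict(int)
    for start, end, state in segments:
        pos = start
        while pos < end:
            w = pos // window
            stop = min(end, (w + 1) * window)  # clip to the current window
            total[w] += stop - pos
            if state in ILS_STATES:
                ils[w] += stop - pos
            pos = stop
    return {w: ils[w] / total[w] for w in sorted(total)}
```

With segments such as `[(0, 40_000, "BC"), (40_000, 60_000, "CH")]`, the CH segment contributes 10 kb to each of the first two windows; windows with unusually high fractions would be the ones to inspect further.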
If balancing selection remains active until the present, it may also affect the patterns of polymorphism in present-day populations.
• Balancing selection candidates:
  • to exhibit high diversity in chimpanzee
  • to be enriched for shared SNPs