
Gene tree analyses of Aboriginal Australians

Gene tree analyses of Aboriginal Australians. Rosalind Harding University of Oxford. Aim. To investigate gene genealogies of two data sets Human mitochondrial coding genomes from Aboriginal Australians Hepatitis B virus from the Pacific region (collaboration with Rory Bowden) Why?



Presentation Transcript


  1. Gene tree analyses of Aboriginal Australians Rosalind Harding University of Oxford

  2. Aim • To investigate gene genealogies of two data sets • Human mitochondrial coding genomes from Aboriginal Australians • Hepatitis B virus from the Pacific region (collaboration with Rory Bowden) • Why? • To evaluate time depth of polymorphism • To use coalescent models rather than molecular clocks in phylogenies • To examine the implications of demographic assumptions

  3. Aim for the ongoing study of HBV • For HBV, mutations at fast sites have to be removed to resolve networks. • But, mutations at fast sites contribute to high estimates of mutation rates. • If we remove the fast sites, how do we recalibrate the mutation rates? • Can we match patterns of HBV diversity in the Pacific region to human dispersals that have been dated by archaeology and genetics, to suggest appropriate time scales?

  4. Theoretical background

  5. Coalescence times in a gene genealogy. Notice that T(2) is longer than T(3). Here N is assumed constant. This time scaling shows what we expect for a standard coalescence model. (Rosenberg and Nordborg, 2002)
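The claim that T(2) is longer than T(3) can be checked with a short sketch. It assumes the textbook constant-size coalescent with time measured in coalescent units, so that the waiting time while k lineages remain is exponential with rate k(k-1)/2. The function names and the time scaling are illustrative assumptions, not taken from the slides or from Genetree.

```python
import random
from math import comb


def expected_coalescence_times(n):
    """E[T(k)] for k = n..2 in coalescent time units (assumed scaling)."""
    return {k: 1.0 / comb(k, 2) for k in range(n, 1, -1)}


def simulate_coalescence_times(n, rng):
    """Draw one set of waiting times T(n), ..., T(2) under constant N."""
    return {k: rng.expovariate(comb(k, 2)) for k in range(n, 1, -1)}


expected = expected_coalescence_times(3)

# Average many simulated genealogies of three sequences.
rng = random.Random(1)
reps = 20000
totals = {2: 0.0, 3: 0.0}
for _ in range(reps):
    times = simulate_coalescence_times(3, rng)
    for k in totals:
        totals[k] += times[k]
mean_t2 = totals[2] / reps
mean_t3 = totals[3] / reps

print(expected, mean_t2, mean_t3)
```

On average the final interval T(2) is three times as long as T(3), which is why the deepest branches dominate the height of a constant-size genealogy.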

  6. Introducing Mutation. [Figure, Rosenberg and Nordborg, 2002: genealogy of sequences 1, 2 and 3; labels: "for the whole tree", "MRCA of 1, 2 & 3", "G → T".] The MRCA of 1, 2 & 3 is usually a more recent (younger) common ancestor than the common ancestor in whom a shared mutation, G → T, first arose.

  7. Constant N vs Expansion. [Figure: two simulated gene genealogies, one simulated assuming constant Ne and one simulated under population expansion. Labels: alleles A, B, C with counts 3, 7, 5 ("Frequencies of 3 alleles"); counts 1 1 1 1 1 1 1 2 1 3 1 1 ("Frequencies of 11 alleles").]

  8. Computational analyses • Software based on Genetree written by Prof Bob Griffiths • Input data: infinite-sites compatible gene tree • Unpublished upgrades that use importance sampling, following algorithms developed by Paul Fearnhead.
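As a rough illustration of what "infinite-sites compatible" means for the input data, the sketch below applies the textbook four-gamete test to binary-coded haplotypes: if any pair of sites shows all four combinations 00, 01, 10 and 11, no tree can explain the data with one mutation per site. This is a generic check under the assumption of binary site coding, not the routine used by Genetree, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def four_gamete_compatible(haplotypes):
    """Return True if no pair of sites shows all four gametes.

    haplotypes: equal-length strings of '0' and '1', one per sequence.
    """
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False  # sites i and j conflict under infinite sites
    return True


# Toy data (hypothetical): the first set fits a single tree, the second does not.
compatible = ["000", "100", "110", "001"]
incompatible = ["00", "01", "10", "11"]

print(four_gamete_compatible(compatible))
print(four_gamete_compatible(incompatible))
```

Sites that fail a check of this kind are the ones that have to be resolved, for example as recurrent or back mutations, before a gene tree can be built.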

  9. Polymorphism data for gene genealogies

  10. Resolving the gene trees • MtDNA coding genomes: • Minor problem: recurrent or back mutation events • Solution: re-instate inferred mutation events following standard mtDNA phylogeny reconstructions • HBV data: • Major problem: a subset of fast sites • Solution: determine fast sites using Parat software from Meyer & Von Haeseler, 2003, Mol. Biol. Evol. 20(2):182-189, and proceed as above.

  11. Background to mtDNA study • Van Holst Pellekaan et al. (2006) Mitochondrial genomics identifies major haplogroups in Aboriginal Australians. Am J Phys Anthrop 131: 282-294. • Estimated a time-scaled genealogy for 8 mtDNA coding regions from individual samples sequenced by Van Holst Pellekaan. • The number of genomes in public databases and available for study has increased; now n = 34.

  12. Present mtDNA genealogy. [Figure: time-scaled gene tree with an axis in KYA; node ages shown: 74,200; 53,200; 43,000; 36,350; 36,300; 26,700; 17,400; macrohaplogroups M and N; lineages MAuB, NAuA, NAuC, NAuD, NAuE; tip labels r1, d3, d38, d32, r17, r6, r7, r25; branches annotated with their defining mutations.] Both of the major non-African haplogroups are represented. The time scale estimated from the gene tree suggests that these lineages evolved from original founders 40,000 – 50,000 years ago.

  13. Network of 34 mtDNA genomes M: AuB; N: haplogroups O (AuD), S (AuA), P: P3, P4 (AuC), P5, P6, P7, P8 (AuE)

  14. Time scale for Australian mtDNA • Estimated mutation rate: • μ: 0.0053 per coding region per generation • Data suggest population expansion • Find model parameters with relatively high likelihood • ML(θ) = 350 = 2Nμ • Population expansion rate since TMRCA: e^5 • TMRCA: time to most recent common ancestor • Population size • N present: 33,000 • from N ancestral: 220 at TMRCA • TMRCA: 66,000 yrs Note: P3 is the only haplogroup with branches represented in both Australia and PNG.
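The population sizes on this slide follow from the other quoted parameters, as the short calculation below shows. It assumes that the scaled mutation parameter is 2N times the per-generation mutation rate with N the present size, and that the "e5" expansion means the present size is e^5 times the ancestral size at the TMRCA. Both readings are interpretations of the slide, and the variable names are illustrative.

```python
import math

theta = 350.0          # maximum-likelihood scaled mutation parameter (slide value)
mu = 0.0053            # mutations per coding region per generation (slide value)
growth_exponent = 5.0  # assumed reading: N_present = N_ancestral * e^5

# Present size from theta = 2 * N * mu.
n_present = theta / (2.0 * mu)

# Ancestral size at the TMRCA, undoing the e^5 expansion.
n_ancestral = n_present / math.exp(growth_exponent)

print(round(n_present), round(n_ancestral))
```

Both values agree with the rounded figures on the slide (33,000 and 220), which supports reading the three quantities as one consistent parameter set.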

  15. mtDNA genealogy New analysis confirms: Aboriginal Australian diversity has been evolving in isolation for ~40,000 years.

  16. Background to HBV study • Analysis by Rory Bowden • Focus on HBV variability in Australia and the Pacific, and judge the time scale of the genealogy by comparison with hypotheses for HBV dispersal • Within Genotype C are two very distinct sequences from Aboriginal Australians.

  17. HBV Genotypes • Worldwide, 7 HBV genotypes each with distinct geographic distribution. • Sampling in East Asia and Pacific region finds mainly genotypes C and D

  18. Australia and Pacific region First occupation of Australia: 50,000 yrs BP; PNG and Solomons: 30,000 yrs BP; Austronesian expansion: Vanuatu: 5,000 yrs BP; Fiji and Tonga: 3,000 yrs BP.

  19. HBV C Genotype Network. [Figure labels: China/Japan; Various, mainly Melanesian; Australia; Vanuatu; Tonga, Fiji.]

  20. HBV C: Starting again … Network after removal of 10 fastest sites. S antigen sequences, relative rate cut-off of 15.
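Removing fast sites with a relative-rate cut-off can be sketched as a simple column filter over an alignment. The rates and sequences below are made-up toy values, the function name is hypothetical, and the sketch assumes that a site is dropped when its relative rate is strictly above the cut-off; it does not reproduce the output format of the rate-estimation software named earlier.

```python
def remove_fast_sites(sequences, relative_rates, cutoff):
    """Drop alignment columns whose relative rate exceeds the cutoff.

    sequences: equal-length strings, one per sampled genome.
    relative_rates: one estimated relative rate per alignment column.
    Returns the filtered sequences and the indices of the columns kept.
    """
    keep = [i for i, rate in enumerate(relative_rates) if rate <= cutoff]
    filtered = ["".join(seq[i] for i in keep) for seq in sequences]
    return filtered, keep


# Toy alignment and rates (hypothetical values, not HBV data).
toy_sequences = ["ACGTA", "ACGTC", "TCGTA"]
toy_rates = [0.5, 1.0, 20.0, 2.0, 16.5]

filtered, kept = remove_fast_sites(toy_sequences, toy_rates, cutoff=15)
print(filtered, kept)
```

Keeping the retained column indices makes it possible to map the resolved network back to positions in the original alignment.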

  21. HBV C: more resolution Network of S antigen sequences

  22. HBV Genotype C in the Pacific

  23. Time scales 5000 or 50,000 years? 3000 or 30,000 years? 2000 or 20,000 years?

  24. Conclusions • Gene trees can be constructed for mtDNA and HBV data to represent polymorphism data. • Coalescent analyses are feasible. • Contemporary mtDNA diversity in Aboriginal Australians represents founding lineages. • Contemporary HBV diversity in Australia and the Pacific could be explained by two alternative time scales (more work to do!) • Over 50,000 years • Over 5,000 years

  25. Abstract: Gene tree analyses of Aboriginal Australians. Gene trees from mitochondrial DNA sequences have been widely used for phylogeographic analyses of modern human dispersal but are not so often used in combination with coalescent models for demographic inferences. Given the lack of recombination in mtDNA, such data should be ideal for gene tree based coalescent analyses. However, the same mutability that makes them so informative for studies of geographic variation also generates difficulties for analyses assuming an infinite-sites mutation process. The main aim of this talk is to present some gene tree based coalescent analyses applied to hypervariable sequence data from mtDNA and also other genomes, and to discuss solutions to the problems that ensue. The primary data set comprises 34 mtDNA coding genomes from Aboriginal Australians and extends work presented by van Holst Pellekaan et al. (Am J Phys Anthrop 131:282-294, 2006). Mitochondrial DNA is not the only haploid genome that has value for anthropological genetics. Vertically transmitted bacteria can also be informative, as has been previously shown using data from Helicobacter pylori. In collaboration with Rory Bowden, we hope to show that Hepatitis B virus strains may also provide insights into anthropological questions about Aboriginal Australian prehistory. Hopefully, I will have results on some gene tree analyses as well as methodological issues to discuss.

  26. The Pacific

  27. Pacific and Indian D Genotype

  28. A more complicated network …

  29. HBV S antigen gene sequences
