Explore the implications of hybridization capture and high-throughput sequencing on ancient DNA research, including the advantages, disadvantages, and cost considerations. Gain insights from studies on mammoth, Neanderthal, and other ancient genomes.
Hybridization capture, high-throughput sequencing and its implications for ancient DNA research
Michael Hofreiter
Is science becoming infantilized? "Our young people are undisciplined and sleazy. They do not listen to their parents anymore. The end of the world is near." (Ur, Chaldea, 2,000 BC)
Is ancient DNA research infantile? It’s a zebra Higuchi et al. 1984, Nature
However... Watson and Crick 1953 was also a short Nature paper
Simple stories are not always bad: "There are lies for children and lies for adults" (Terry Pratchett)
Some more reflections Not all investigations deserve equal respect. Observations alone do not always make sense. What do we really learn from genomic data? How to win a Nobel prize? I don’t know.
The latest fancy piece of kit
• ~200 Gb total sequence
• ~1 billion individual reads
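As a quick sanity check on these two figures, the sketch below divides total output by read count to get the implied mean read length. Both inputs are the approximate values from the slide; the variable names are only illustrative.

```python
# Approximate figures from the slide (both are "~" values).
total_bases = 200e9   # ~200 Gb of total sequence
total_reads = 1e9     # ~1 billion individual reads

# Implied average number of bases per read.
mean_read_length = total_bases / total_reads
print(f"Implied mean read length: ~{mean_read_length:.0f} bases")
```

This says nothing about the real read-length distribution; it only shows what the two headline numbers imply together.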
Increase in sequencing throughput
So what have we learned from ancient genomes?
• Mammoth genome draft: Hm...
• Saqqaq genome: migrated from Arctic north-east Asia 5,500 B.P.
• Neanderthal genome draft: diverged from modern humans ~0.4 mya; maybe gene flow into the modern human gene pool; genetic regions were selected on the human lineage
And what did they cost?
• Mammoth genome draft: ~$800,000
• Saqqaq genome: $500,000
• Neanderthal genome draft: $6.4 million
The disadvantages
• Shotgun sequencing
• Made for a maximum of 8 samples
• Costs: $20,000 per run
Another problem: percentage of endogenous DNA
• Neandertal: 4.0%
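To see why a low endogenous fraction matters for shotgun sequencing, here is a rough cost sketch. It uses the slide figures ($20,000 per run, at most 8 samples, 4.0% endogenous DNA) and assumes, purely for illustration, that the run cost is split evenly across samples and that reads are drawn in proportion to the DNA in the extract.

```python
run_cost_usd = 20_000        # cost per run (slide figure)
max_samples_per_run = 8      # maximum samples per run (slide figure)
endogenous_fraction = 0.04   # Neandertal example: 4.0% endogenous DNA

# ASSUMPTION: the run cost is shared evenly between the samples.
cost_per_sample = run_cost_usd / max_samples_per_run

# ASSUMPTION: reads are sampled in proportion to DNA content, so only
# this share of the money buys endogenous sequence.
useful_spend_per_run = run_cost_usd * endogenous_fraction

# Every endogenous base is effectively paid for this many times over.
cost_multiplier = 1 / endogenous_fraction

print(f"Cost per sample: ${cost_per_sample:,.0f}")
print(f"Spend on endogenous reads per run: ${useful_spend_per_run:,.0f}")
print(f"Cost multiplier from contamination: {cost_multiplier:.0f}x")
```

Under these assumptions most of the sequencing budget goes to non-target DNA, which is the motivation for enriching the target before sequencing.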
NJ tree (figure): 123 sequences, 250 bp control region, with outgroups. Clade labels: “spelaeus”, “eremus”, “ladinicus”, “rossicus”, “ingressus”, “kudarensis”.
Condensed NJ tree (figure): 50% bootstrap cutoff, 123 sequences, 250 bp D-loop, with outgroups. Clade labels: “spelaeus”, “eremus”, “ladinicus”, “rossicus”, “ingressus”, “kudarensis”. Highlighted value: 51!
Combined NJ, ML and Bayesian tree (figure) based on 9,632 bp of 2 published and 31 additional cave bear specimens. Clades: Ursus spelaeus, Ursus ingressus, Ursus kudarensis; outgroup: Ursus arctos (EU497665). Tips are labelled with specimen number, cave and country (e.g. SP1325 Zoolithen cave Ger); nodes carry three support values (e.g. 100-100-1.0).
Results of DMPS
• Between 13.0 and 16.5 kb of replicated sequence for each of the 31 individuals
• ~1.0 Mb of targeted aDNA sequence data
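The per-individual and total figures can be reconciled with a little arithmetic. The sketch below assumes that "replicated" means each position was sequenced from two independent amplifications; the slide does not state the replication level, so `replicates = 2` is an assumption.

```python
individuals = 31
kb_min, kb_max = 13.0, 16.5   # replicated sequence per individual (slide figures)
replicates = 2                # ASSUMPTION: two independent amplifications per position

# Total raw target sequence across all individuals, in kb.
total_kb_low = individuals * kb_min * replicates
total_kb_high = individuals * kb_max * replicates

print(f"Total: {total_kb_low / 1000:.2f} to {total_kb_high / 1000:.2f} Mb")
```

With twofold replication the range brackets the ~1.0 Mb quoted on the slide; other ways of counting would give a different total.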
Requirements for PCR
• Primer F: min 20 bp
• PCR target: min 30 bp
• Primer R: min 20 bp
• Min molecule length: 70 bp
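The 70 bp minimum is simply the sum of the three parts that must sit on one intact molecule. A minimal sketch, with hypothetical function names, that makes the constraint explicit:

```python
def min_template_length(primer_f=20, target=30, primer_r=20):
    """Shortest molecule (bp) that can hold both primers and the target."""
    return primer_f + target + primer_r


def amplifiable(fragment_length, **lengths):
    """True if a fragment is long enough to serve as a PCR template."""
    return fragment_length >= min_template_length(**lengths)


print(min_template_length())   # default values from the slide
print(amplifiable(69), amplifiable(70))
```

Any fragment shorter than this threshold is invisible to PCR, regardless of how many such fragments the extract contains.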
Fragment length in ancient DNA (figure: frequency against fragment length in bp, axis marks at 30, 50 and 70). Halving the fragment size gives 2 to 100x the number of molecules.
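To get a feel for what "2 to 100x per halving" means at the lengths marked on the axis, the sketch below extrapolates the rule to arbitrary lengths. Treating the fold increase as constant per halving (a power law) is an assumption made here for illustration, not a model taken from the talk, and the function name is hypothetical.

```python
import math


def relative_abundance(length_bp, reference_bp=70, fold_per_halving=2.0):
    """Molecules at length_bp relative to reference_bp.

    ASSUMPTION: the same fold increase applies to every halving of length.
    """
    halvings = math.log2(reference_bp / length_bp)
    return fold_per_halving ** halvings


for fold in (2, 100):
    for length in (70, 50, 35, 30):
        rel = relative_abundance(length, fold_per_halving=fold)
        print(f"fold={fold:>3}  {length} bp: {rel:.1f}x the molecules at 70 bp")
```

Even at the low end of the range, a method that can use 30 to 35 bp fragments has access to several times more template than PCR with its 70 bp minimum.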
DNA hybridization capture (figure labels: probes, glass slide)
• ~5 Mb targeted per array
• 7 arrays, whole exome
• ~98% of exons retrieved
• 300,000 primer pairs for aDNA
• 6,000 LR-PCRs for modern DNA
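One reading of the last two bullets is that they give the number of PCRs needed to cover the same target without capture. Under that assumption, the implied amplicon sizes can be checked against the array figures:

```python
mb_per_array = 5            # ~5 Mb targeted per array (slide figure)
arrays = 7                  # 7 arrays for the whole exome (slide figure)
adna_primer_pairs = 300_000 # slide figure
lr_pcrs = 6_000             # slide figure

total_target_bp = arrays * mb_per_array * 1_000_000

# ASSUMPTION: both PCR counts refer to covering this same total target.
bp_per_adna_amplicon = total_target_bp / adna_primer_pairs
bp_per_lr_pcr = total_target_bp / lr_pcrs

print(f"Total target: {total_target_bp / 1e6:.0f} Mb")
print(f"Implied aDNA amplicon: ~{bp_per_adna_amplicon:.0f} bp")
print(f"Implied long-range amplicon: ~{bp_per_lr_pcr:.0f} bp")
```

The implied sizes, roughly a hundred base pairs for ancient DNA and several kilobases for long-range PCR, are plausible for the two template types, which supports this reading.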
Ancient DNA capture Science 2010
The costs
• Capture array, up to 1 million features: £350 each
• SureSelect, 10 rxns, 200 kb to 6.6 Mb: £6,638
• SureSelect, 100 rxns, 200 kb: £30,777
• SureSelect, 1,000 rxns, 200 kb: £107,719
=> Home-made solutions
=> Multiplexing
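Kit prices are easier to compare per reaction. The sketch below divides each quoted SureSelect price by its number of reactions; the prices are the slide figures, and the dictionary layout is only illustrative.

```python
# (number of reactions, kit price in GBP), as quoted on the slide.
sureselect_kits = {
    "10 rxns": (10, 6_638),
    "100 rxns": (100, 30_777),
    "1,000 rxns": (1_000, 107_719),
}
array_price_gbp = 350  # one capture array, up to 1 million features

# Price of a single reaction for each kit size.
per_reaction = {
    name: price / reactions
    for name, (reactions, price) in sureselect_kits.items()
}

for name, cost in per_reaction.items():
    print(f"SureSelect {name}: £{cost:,.2f} per reaction")
print(f"Capture array: £{array_price_gbp} each")
```

Only the largest kit brings the per-reaction price below that of a single array, which is why home-made baits and multiplexing several samples per capture are attractive.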
So... How does it work?
Jumping artefacts (figure: Clade 1, Clade 2, Clade 3)
Possible capture methodologies
Methodology | Results | Problems
SureSelect | no experience yet | high costs
Array capture | mammoth mtDNA | jumping artefacts
PEC | mammoth nuDNA | limited sensitivity, high costs Dynalbeads
In solution: 454, biotin adaptors | Castor mtDNA | length limited
In solution: 454, biotin UTP | Castor mtDNA | length limited
In solution: Illumina, biotin UTP | Castor mtDNA | length limited, jumping artefacts
Capture advantages
• High sequence yield per sample aliquot
• Time and work efficient
• Higher sensitivity than PCR
Capture disadvantages
• High costs
• Sometimes low on-target ratio
• Problems with multiplexing
• Generally jumping artefacts
Summary for capture
• In the long term there is little alternative if large amounts of data are required
• Also, some methods have better sensitivity than PCR
• Multiplexing problems, especially for low-complexity data, need resolving
• Currently not suitable for routine applications
• Methodological development required
Some final thoughts How should blank controls be done? And how many? What does contamination mean when you have 20 million sequence reads? How shall we replicate the data? Is independent replication possible? And is it necessary?
Molecular Ecology
Thanks
• Many people
• Adrian Briggs, Harvard Medical School
• Kevin Campbell, University of Manitoba
• Research Group Molecular Ecology
• Sequencing group in Leipzig
• MPG, DFG and Volkswagen Foundation for money
• University of York
• For your attention