Potentials and limits of haplotype trees in exploring population structure and pathogenicity of mutations
Hans-Jürgen Bandelt (Hamburg)
17th Annual Meeting of the Deutsche Gesellschaft für Humangenetik (German Society of Human Genetics), Heidelberg, 8–11 March 2006
Human mtDNA: HVS-I alias HVR1 (figure from MITOMAP)
The perception of evolution as seen through the lenses of laboratories constitutes an overlay of two different processes: Perceived evolution = Natural evolution (of the genome) + Artificial evolution (in the lab)
mtDNA and evolution α: Natural evolution. Migrational processes (prehistory)
ML tree of basal African mtDNA haplogroups, coding-region variation displayed. Torroni et al (TIG, June 2006)
[Figure: tree drawn against a time axis in years (200,000; 150,000; 100,000; 50,000; 0), with the rCRS and sequences 1 to 25 at the tips and Ethiopian samples marked. Haplogroup labels: L0, L0ak, L0af, L0a, L0a2, L0a2a, L1, L1c, L1c1’2, L1c2, L1c2a, L2, L2d, L3, L3a, L3bd = L3bcd, L3b, L3c, L3d, L3d1, L3e, L3e5, L3ex = L3eix, L3f, L3f1, L3h, L3i, L3x, L4, L5, L6, L7, M, M1, N, R. Nested nodes: L1’5 = L1’2’3’4’5’6’7, L2’5 = L2’3’4’5’6’7, L2’6 = L2’3’4’6’7, L4’6 = L3’4’6’7, L3’7 = L3’4’7. The per-branch mutation lists are not reproduced here.]
One of the first views of the East Asian mtDNA phylogeny (Ozawa, Herz 1994)
[Figure annotations: all mutations that distinguish haplogroups M and R (part of N); incorrect rooting; labels CRS, R, M]
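Star-burst of autochthonous mtDNA lineages in Eurasia (haplogroup N and its subhaplogroup R). Palanichamy et al (Amer J Hum Genet, 2004)
[Figure: star-like tree around the nodes N and R, with the regions West Eurasia, South Asia, East Asia and Oceania marked. Branch labels: pre-HV, JT, U, W, R2, N1, R5, X, R1, R6, N5, R7, R8, R30, A, R31, N9, R9, P, R11, S, B, O. Positions 9140, 6755, 15607 and 8404 are marked on branches.]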
... and a massive burst in haplogroup M, as e.g. seen in India: Sun et al (Mol Biol Evol, March 2006)
An Out-of-Africa model based on mtDNA analysis Kivisild et al (Springer-Verlag, April 2006)
Sketch of the phylogeny of basal European mtDNA haplogroups. Torroni et al (TIG, June 2006)
[Figure: tree with nodes N, N2, N1, X, R, N1b, N1a’I, W, U, JT, R0 = pre-HV, I, N1a, R0a = (pre-HV)1, HV, H, HV0 = pre-V, H1, H3, HV0a, V]
Spatial frequency distributions of haplogroups H1, H3, V, and U5b reveal signature of post-LGM expansions Torroni et al (TIG, June 2006)
mtDNA and evolution β: Artificial evolution. Laboratory-specific processes (error and fraud)
Major sources of error in mtDNA sequence data
Artificial recombination: through contamination or sample mix-up (or targeting nuclear inserts of mtDNA)
Phantom mutations: sequencing errors at electrophoresis
Documentation errors: incurred by casual reading or writing
Impurifying selection is the driving force in artificial evolution inasmuch as incorrect data are more flexible to interpret and can support sexy stories — seemingly told by DNA — which are then disseminated by high-impact factor journals (e.g. Science and Nature). Worst case: mtDNA in cancer research (Salas et al, PLoS Medicine 2005)
Case of mtDNA sample mix-up, mis-interpreted as somatic mutations; data generated with MitoChip by Maitra et al (Genome Res, 2004) Data re-analysis by Bandelt et al (J Med Genet, 2005)
A case of cross-over in the 672 human complete mtDNA sequences from Tanaka et al (2004)
[Figure: the mtDNA genome (positions 1 to 16569, fragments A to F) for samples NDsq0167 and NDsq0168, with segments labelled F1a1b and M7a (M7a2), shown next to a tree that leads from the rCRS through L3, M, M7, M7a and through N, R, R9, F, F1, F1a’c, F1a, F1a1, F1a1b. Samples NDsq0015 and NDsq0178 also appear. The per-branch mutation lists are not reproduced here.]
Prime example of a phantom mutation (Brandstätter et al, Electrophoresis 2005)
Electropherogram from Nasidze and Stoneking (2001), generated in 1997/1998 and presented for the first time in Stoneking and Nasidze (Ann Hum Genet, 2006)
[Figure label: rCRS]
Phantom mutations can be found in excess in the HVS-I Caucasus data of Nasidze and Stoneking (2001). In view of additional problems, this may be regarded as the worst data set ever published in the realm of molecular anthropology; see Bandelt and Kivisild (Ann Hum Genet 2006) for data re-analysis
Sequences with phantom transitions at 16280-16281 in those Caucasus data

Code     Mutation (16000+)               Haplogroup
AR31     067 279G 280 281 355            HV1
AR483    069 126 145 280 281 367C        J
AZ2      280 281                         ?
AZ342    280 281 298                     pre-V
AZ6      154 168A 280 281 356 384        ?
CH444    111 214G 249 280 281 327 388    U1b
CH451    280 281 292                     ?
DAR23    129 223 278 280 281             ?
DAR36    258 280 281 384                 ?
KAB408   224 280 281 311                 K

This mutation pair has never been observed in >40,000 HVS-I sequences!
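The screening idea behind this slide can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the cited re-analysis: it takes the ten haplotypes from the table above (with positions 280 and 281 written as separate tokens) and flags every sequence that carries both transitions. The names `CAUCASUS` and `carries_pair` are made up for this example.

```python
# Haplotypes as listed on the slide: positions are offsets from 16000;
# a bare number is a transition, a letter suffix marks another change.
CAUCASUS = {
    "AR31": "067 279G 280 281 355",
    "AR483": "069 126 145 280 281 367C",
    "AZ2": "280 281",
    "AZ342": "280 281 298",
    "AZ6": "154 168A 280 281 356 384",
    "CH444": "111 214G 249 280 281 327 388",
    "CH451": "280 281 292",
    "DAR23": "129 223 278 280 281",
    "DAR36": "258 280 281 384",
    "KAB408": "224 280 281 311",
}


def carries_pair(haplotype, pair=("280", "281")):
    """Return True if the haplotype contains every transition in `pair`."""
    tokens = set(haplotype.split())
    return all(position in tokens for position in pair)


# Sample codes whose haplotype carries the suspicious 16280-16281 pair.
flagged = sorted(code for code, hap in CAUCASUS.items() if carries_pair(hap))
print(len(flagged), "of", len(CAUCASUS), "sequences carry the pair")
```

A pair that recurs on unrelated haplogroup backgrounds (HV1, J, pre-V, U1b, K) within one data set, yet is absent from the wider record, is the pattern that suggests an artefact rather than shared ancestry.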
Electropherogram presented by Stoneking and Nasidze (Ann Hum Genet, 2006)
[Figure label: rCRS]
Phantom mutations in the HVS-I data of Plaza et al (Ann Hum Genet, 2003) (267 samples)

Sample      Mutation (16000+)                              Haplogroup
Algeria     279N 285N                                      ?
Andalusia   129 182C 183C 189 223 249 311 359 371          M1
Andalusia   129 281                                        ?
Andalusia   281                                            ?
Catalonia   093 192 270 281 290A 304 311                   U5b
Catalonia   224 281 311                                    K
Morocco     093 224 242 311 371                            K
Morocco     124 223 284C 285T 300 319 374T                 L2d
Morocco     126 187 189 223 264 270 278 293 311 371 374    L1b
Morocco     126 284C 292 294                               T2
Morocco     183C 189 223 278 382G                          X
Morocco     189 192 270 369T                               U5b
Saharawi    093 172 185 223 327 382G                       L3e1
Saharawi    172 281 311                                    U6?
Saharawi    189 382G                                       ?
Comparison with 1624 complete sequences stored in the mtDB database
Variation in 16279-16285: only 20 transitional variants at 16284
Variation in 16369-16389: only 1+1+6 transitional variants at 16371, 16380, and 16381
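The comparison can be illustrated with a short Python sketch. It tallies every variant that the Plaza et al haplotypes listed above show inside the two windows and sets it against the mtDB counts quoted on this slide. Two things are assumptions of this example rather than statements from the talk: the counts 1, 1 and 6 are assigned to 16371, 16380 and 16381 in the order given, and the names `WINDOWS`, `MTDB_TRANSITIONS` and `in_window` are invented here.

```python
from collections import Counter

# Windows (offsets from 16000) inspected on the slide.
WINDOWS = [(279, 285), (369, 389)]

# Transitions seen in 1624 mtDB sequences within those windows.
# Assumption: "1+1+6" maps to 371, 380, 381 in that order.
MTDB_TRANSITIONS = {"284": 20, "371": 1, "380": 1, "381": 6}

# The 15 haplotypes listed for Plaza et al (2003), in table order.
PLAZA = [
    "279N 285N",
    "129 182C 183C 189 223 249 311 359 371",
    "129 281",
    "281",
    "093 192 270 281 290A 304 311",
    "224 281 311",
    "093 224 242 311 371",
    "124 223 284C 285T 300 319 374T",
    "126 187 189 223 264 270 278 293 311 371 374",
    "126 284C 292 294",
    "183C 189 223 278 382G",
    "189 192 270 369T",
    "093 172 185 223 327 382G",
    "172 281 311",
    "189 382G",
]


def in_window(token):
    """True if the variant's position lies in one of the windows."""
    position = int(token[:3])  # every token starts with three digits
    return any(low <= position <= high for low, high in WINDOWS)


# How often each in-window variant occurs among the listed haplotypes.
observed = Counter(t for hap in PLAZA for t in hap.split() if in_window(t))

# Variants with no counterpart among the mtDB transitions.
unseen = sorted(v for v in observed if v not in MTDB_TRANSITIONS)

for variant in sorted(observed):
    print(variant, observed[variant], MTDB_TRANSITIONS.get(variant, 0))
```

The binary `unseen` list is deliberately crude: a variant such as 371 does occur in mtDB, but only once among 1624 sequences, so its repeated appearance in a set of 267 samples is also worth a second look.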
Re-evaluation of the mtDNA data from the lab of Min-Xin Guan. Yao et al (Hum Genet, 2006)
[Figure: tree rooted relative to the rCRS with haplogroups R, M and N. Legend: missing mutations; misscored mutations in red]
Strategies of authors to deal with errors
1st: Publishing a corrigendum [rare event]
2nd: No correction, but avoiding similar errors in future work [common practice]
3rd: No action, and committing the same errors as before [e.g. as Min-Xin Guan and colleagues do]
4th: Fraudulent action, performing fake analyses and giving false statements [as done by Mark Stoneking and Ivane Nasidze in the Ann Hum Genet]
... only L strand, no H strand information shown! Stoneking and Nasidze (2006)
Human Mitochondrial DNA and the Evolution of Homo sapiens
Series: Nucleic Acids and Molecular Biology, Vol. 18
Volume package: Human Mitochondrial DNA
Bandelt, Hans-Jürgen; Richards, Martin; Macaulay, Vincent (Eds.)
2006, approx. 250 p., 31 illus., 2 in colour, hardcover
ISBN: 3-540-31788-0
Springer-Verlag. Due: April 2006