Ancient Substructure in Early mtDNA Lineages of Southern Africa
2013; Elsevier BV; Volume: 92; Issue: 2; Language: English
10.1016/j.ajhg.2012.12.010
ISSN: 1537-6605
Authors: Chiara Barbieri, Mário Vicente, Jorge Rocha, Sununguko Wata Mpoloka, Mark Stoneking, Brigitte Pakendorf
Topic(s): Genetic diversity and population structure
Abstract: Among the deepest-rooting clades in the human mitochondrial DNA (mtDNA) phylogeny are the haplogroups defined as L0d and L0k, which are found primarily in southern Africa. These lineages are typically present at high frequency in the so-called Khoisan populations of hunter-gatherers and herders who speak non-Bantu languages, and the early divergence of these lineages led to the hypothesis of ancient genetic substructure in Africa. Here we update the phylogeny of the basal haplogroups L0d and L0k with 500 full mtDNA genome sequences from 45 southern African Khoisan and Bantu-speaking populations. We find previously unreported subhaplogroups and greatly extend the amount of variation and time-depth of most of the known subhaplogroups. Our major finding is the definition of two ancient sublineages of L0k (L0k1b and L0k2) that are present almost exclusively in Bantu-speaking populations from Zambia; the presence of such relic haplogroups in Bantu speakers is most probably due to contact with ancestral pre-Bantu populations that harbored different lineages than those found in extant Khoisan. We suggest that although these populations went extinct after the immigration of the Bantu-speaking populations, some traces of their haplogroup composition survived through incorporation into the gene pool of the immigrants. Our findings thus provide evidence for deep genetic substructure in southern Africa prior to the Bantu expansion that is not represented in extant Khoisan populations.
Sub-Saharan Africa harbors the deepest-rooting lineages of human mitochondrial DNA (mtDNA), in agreement with an African origin of modern humans supported by both fossil and genetic evidence. [1] Campbell M.C. Tishkoff S.A. The evolution of human genetic and phenotypic variation in Africa. Curr. Biol. 2010; 20: R166-R173; [2] Blum M.G.B. Jakobsson M. Deep divergences of human gene trees and models of human origins. Mol. Biol. Evol. 2011; 28: 889-898; [3] Phillipson D.W. African Archaeology. Cambridge University Press, Cambridge 2005; [4] Tattersall I. Out of Africa: modern human origins special feature: human origins: out of Africa. Proc. Natl. Acad. Sci. USA.
2009; 106: 16018-16021. Several studies concurred in placing the root of the mtDNA phylogeny in the southern half of the continent ([5] Gonder M.K. Mortensen H.M. Reed F.A. de Sousa A. Tishkoff S.A. Whole-mtDNA genome sequence analysis of ancient African lineages. Mol. Biol. Evol. 2007; 24: 757-768; [6] Behar D.M. Villems R. Soodyall H. Blue-Smith J. Pereira L. Metspalu E. Scozzari R. Makkan H. Tzur S. Comas D. et al., Genographic Consortium. The dawn of human matrilineal diversity. Am. J. Hum. Genet. 2008; 82: 1130-1140; [7] Tishkoff S.A. Gonder M.K. Henn B.M. Mortensen H. Knight A. Gignoux C. Fernandopulle N. Lema G. Nyambo T.B. Ramakrishnan U. et al. History of click-speaking populations of Africa inferred from mtDNA and Y chromosome genetic variation. Mol. Biol. Evol. 2007; 24: 2180-2195), and two deep-rooting clades of this phylogeny, haplogroups L0d and L0k, have been unanimously associated with so-called Khoisan populations. [6]; [7]; [8] Chen Y.S. Olckers A. Schurr T.G. Kogelnik A.M. Huoponen K. Wallace D.C. mtDNA variation in the South African Kung and Khwe-and their genetic relationships to other African populations. Am. J. Hum. Genet.
2000; 66: 1362-1383; [9] Knight A. Underhill P.A. Mortensen H.M. Zhivotovsky L.A. Lin A.A. Henn B.M. Louis D. Ruhlen M. Mountain J.L. African Y chromosome and mtDNA divergence provides insight into the history of click languages. Curr. Biol. 2003; 13: 464-473.
The generic term "Khoisan" covers hunter-gatherer and pastoralist populations of southern Africa who speak non-Bantu indigenous languages and share some linguistic features (one of the most characteristic being the heavy use of click consonants in their languages); however, these similarities might be the effect of contact ([10] Güldemann T. The Kalahari basin as an object of areal typology: A first approach. In: Schladt M. Language, Identity and Conceptualization among the Khoisan. Rüdiger Köppe, Köln 1997: 137-169). Haplogroups L0d and L0k are present nearly exclusively in Khoisan populations and neighboring Bantu-speaking populations that have been in documented close contact with them; [11] Coelho M. Sequeira F. Luiselli D. Beleza S. Rocha J. On the edge of Bantu expansions: mtDNA, Y chromosome and lactase persistence genetic variation in southwestern Angola. BMC Evol. Biol. 2009; 9: 80; [12] Schlebusch C.M. Naidoo T. Soodyall H. SNaPshot minisequencing to resolve mitochondrial macro-haplogroups found in Africa. Electrophoresis. 2009; 30: 3657-3664; [13] Quintana-Murci L. Harmant C. Quach H. Balanovsky O. Zaporozhchenko V. Bormans C. van Helden P.D. Hoal E.G. Behar D.M. Strong maternal Khoisan contribution to the South African coloured population: a case of gender-biased admixture. Am. J. Hum. Genet. 2010; 86: 611-620; [14] Barbieri C. Butthof A. Bostoen K. Pakendorf B.
Genetic perspectives on the origin of clicks in Bantu languages from southwestern Zambia. Eur. J. Hum. Genet. 2012 (published online August 29, 2012); https://doi.org/10.1038/ejhg.2012.192. The only known exceptions are sporadic occurrences of haplogroup L0d in East Africa (e.g., in the Sandawe from Tanzania) [7] and in an individual from Kuwait [6], as well as an individual from Yemen [6] who belongs to haplogroup L0k.
Specialists recognize three independent language families among Khoisan, namely Tuu, Kx'a, and Khoe-Kwadi, [15] Heine B. Honken H. The Kx'a family: a new Khoisan genealogy. J. Asian Afr. Stud. 2010; 79: 5-36; [16] Güldemann T. Studies in Tuu (Southern Khoisan). University of Leipzig Papers on Africa, Languages and Literatures 23. Institut für Afrikanistik, Universität Leipzig, Leipzig 2005; [17] Güldemann, T., and Elderkin, E.D. (2010). On external genealogical relationships of the Khoe family. In Khoisan Languages and Linguistics: Proceedings of the 1st International Symposium January 4–8, 2003, M. Brenzinger and C. König, eds.
(Riezlern/Kleinwalsertal. Quellen zur Khoisan-Forschung. Köln: Rüdiger Köppe), pp. 15–52, which are spoken by a large number of different ethnolinguistic groups comprising both foragers and pastoralists. The forager populations of the central Kalahari, who speak languages belonging to the Tuu and Kx'a families, are assumed to be the descendants of autochthonous Late Stone Age populations, whereas the Khoe-Kwadi languages may have been brought to the area by pastoralist populations around 2,000 years ago ([18] Deacon H.J. Deacon J. Human Beginnings in South Africa: Uncovering the Secrets of the Stone Age. Altamira Press, Walnut Creek, CA 1999; [19] Güldemann T. Stoneking M. A historical appraisal of clicks: a linguistic and genetic population perspective. Annu. Rev. Anthropol. 2008; 37: 93-109; [20] Güldemann T. A linguist's view: Khoe-Kwadi speakers as the earliest food-producers of southern Africa. South Afr. Humanit. 2008; 20: 93-132). The populations speaking Bantu languages, in contrast, are known for their expansion over almost half the African continent and are associated with the concomitant spread of the Bantu language family, an agricultural lifestyle, and iron technology ([3]; [21] Ehret C. Bantu expansions: Re-envisioning a central problem of early African history. Int. J. Afr. Hist. Stud. 2001; 34: 5-41; [22] Pakendorf B. Bostoen K. de Filippo C. Molecular perspectives on the Bantu expansion: a synthesis. Language Dynamics and Change. 2011; 1: 50-88). Archeological data suggest that they may have reached southern Africa not earlier than 2,000–1,200 years ago, [3]; [23] Kinahan J.
From the beginning: the archaeological evidence. In: A History of Namibia: From the Beginning to 1990, M. Wallace. Hurst and Company, London 2011: 15-43; [24] Reid A. Sadr K. Hanson-James N. Herding traditions. In: Lane P. Reid A. Segobye A. Ditswa Mmung: The Archaeology of Botswana. Pula Press and The Botswana Society, Gaborone 1998: 81-100, where they met populations who were probably ancestral to current Khoisan populations.
The most recent comprehensive study that focused on the deepest-rooting lineages of the mtDNA phylogeny was undertaken by Behar et al. [6], who analyzed a total of 624 full mtDNA sequences belonging to haplogroup L*(xM,N). Although this was the first substantial collection of complete mtDNA genome sequences from Africa, some limitations arose from the inclusion of a large number of sequences from diverse published sources that were not always of high quality; furthermore, for some sequences the source population or the country of origin was not clearly specified. Nevertheless, the sequences considered in that study still represent the vast majority of the haplogroup L*(xM,N) data set included in the most recent version of Phylotree (Build 15, September 2012; [25] van Oven M. Kayser M. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum. Mutat. 2009; 30: E386-E394), a comprehensive database of mtDNA genome sequences that is periodically updated when more data become available. Phylotree thus represents the most accessible resource for studying mtDNA variation and is a widely used reference for mtDNA nomenclature. [26] Kloss-Brandstätter A. Pacher D. Schönherr S. Weissensteiner H.
Binna R. Specht G. Kronenberg F. HaploGrep: a fast and reliable algorithm for automatic classification of mitochondrial DNA haplogroups. Hum. Mutat. 2011; 32: 25-32. Behar et al. [6] focused particularly on the root of the phylogeny, i.e., the age and variability of the Khoisan-specific haplogroups L0d and L0k, with the aim of investigating the most likely model of origin and isolation of Khoisan populations. With their data they were able to suggest a time frame for the dispersal of the main lineages and the split of Khoisan and other modern humans, which they dated not later than 90 thousand years ago (kya); furthermore, they suggested that the early human settlement of Africa was matrilineally structured. These hypotheses are relevant for the interpretation of early human demography and evolution; however, their results were substantially limited by the fact that only one ethnolinguistically undefined "Khoisan" sample of 38 individuals was included, thereby missing the potentially immense variability of the different ethnolinguistic populations subsumed under the generalized label Khoisan. In addition, only 30 sequences from haplogroup L0d and 7 from L0k were included, representing only a small and probably incomplete fraction of the overall variation in these haplogroups.
We here report analyses of 500 mtDNA genome sequences belonging to haplogroups L0d and L0k, of which 15 have already been published in Barbieri et al., [14] Barbieri C. Butthof A. Bostoen K. Pakendorf B. Genetic perspectives on the origin of clicks in Bantu languages from southwestern Zambia. Eur. J. Hum. Genet.
2012 (published online August 29, 2012); https://doi.org/10.1038/ejhg.2012.192, leading to a more than 10-fold increase in the available complete mtDNA genome sequences from southern Africa (Phylotree ver. 15 [25]). With this rich data set, we aim to elucidate the phylogenetic relationships, the patterns of diversity, and the distribution of these relatively understudied haplogroups that represent some of the deepest-rooting lineages in the maternal phylogeny of modern humans.
The broader data set from which the subset of L0d and L0k sequences was chosen consists of mtDNA genome sequences generated from saliva samples collected in Botswana, Namibia, Zambia, and Angola after prior approval by the relevant institutional review boards and with the consent of the donors after the aims of the study had been explained to them with the help of local translators, where necessary. Details of the samples have been described elsewhere. [11]; [27] de Filippo C. Barbieri C. Whitten M. Mpoloka S.W. Gunnarsdóttir E.D. Bostoen K. Nyambe T. Beyer K. Schreiber H. de Knijff P. et al. Y-chromosomal variation in sub-Saharan Africa: insights into the history of Niger-Congo groups. Mol. Biol. Evol. 2011; 28: 1255-1269; [28] Pickrell J.K. Patterson N. Barbieri C. Berthold F. Gerlach L. Güldemann T. Kure B. Mpoloka S.W. Nakagawa H. Naumann C. et al. The genetic prehistory of southern Africa. Nat. Commun.
2012; 3: 1143. The sequence data set analyzed here comprises 45 ethnolinguistic groups, who speak Khoisan languages belonging to all three accepted language families as well as different Bantu languages; individuals were assigned to populations on the basis of the ethnic affiliation of their maternal grandmother (Table S1 available online). Libraries enriched for mtDNA ([29] Meyer M. Kircher M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb. Protoc. 2010; 2010 (pdb.prot5448); [30] Maricic T. Whitten M. Pääbo S. Multiplexed DNA sequence capture of mitochondrial genomes using PCR products. PLoS ONE. 2010; 5: e14004) were sequenced on the Illumina GAIIx platform, resulting in an average 400-fold coverage. Sequences were manually checked with BioEdit and read alignments were screened with ma ([31] Briggs A.W. Good J.M. Green R.E. Krause J. Maricic T. Stenzel U. Lalueza-Fox C. Rudan P. Brajkovic D. Kucan Z. et al. Targeted retrieval and analysis of five Neandertal mtDNA genomes. Science. 2009; 325: 318-321) to exclude alignment errors and confirm indels. The two poly-C regions (np 303–315, 16,183–16,194) were excluded from the analysis. To minimize the impact of missing data, we applied imputation and resolved unknown positions by comparison to at least two otherwise identical haplotypes in the data set. Before imputation, 74 sequences included positions with missing data; after imputation, only 26 sequences still had missing positions. In the final alignment, 32 positions were left with an unknown nucleotide call (26 of which corresponded to polymorphic sites) and were excluded from the analyses (see Table S2 for a list of the excluded positions). Basic haplogroups were defined with the web tool Haplogrep. [26] Kloss-Brandstätter A. Pacher D. Schönherr S.
Weissensteiner H. Binna R. Specht G. Kronenberg F. HaploGrep: a fast and reliable algorithm for automatic classification of mitochondrial DNA haplogroups. Hum. Mutat. 2011; 32: 25-32. Mutations that did not fit the overall phylogeny were checked manually in the read alignments to exclude the possibility of erroneous base calls. Although we took into account published data on the frequency of haplogroups L0d and L0k, only the 500 sequences that were generated with the same technology and from individuals for whom we know the place of sampling and ethnicity were included in the phylogenetic analyses. We did not include previously published sequences, because they do not add substantial information to our analysis and often pose problems because of missing positions [6] or missing ethnolinguistic information. The only exceptions are the L0k2 sequence from Yemen [6] and the six L0d3 sequences from South Africa, Kuwait [6], and Tanzania, [5] Gonder M.K. Mortensen H.M. Reed F.A. de Sousa A. Tishkoff S.A. Whole-mtDNA genome sequence analysis of ancient African lineages. Mol. Biol. Evol.
2007; 24: 757-768, which we included to clarify the structure in L0k and L0d3 discussed below.
First, we compared the frequency and distribution of haplogroups L0d and L0k in our data set and in the available literature (where in most cases haplogroups were assigned based on partial mtDNA sequence variation and/or RFLP typing; cf. Table S1 and Figure S1 for details) and plotted the frequencies of each haplogroup (Figures 1A and 1B) with the software Surfer ver. 10.4.799 (Golden Software). The maps show a concentration of both L0d and L0k in the southern part of the continent, with L0d present in high frequency in populations from South Africa, Namibia, and Botswana, and sporadically ([…] 10% in several populations of foragers in Botswana and Namibia who speak languages belonging to all three Khoisan linguistic families (see Table S1), as well as in the Bantu-speaking Fwe from southwestern Zambia.
We next reconstructed a phylogeny of the L0d and L0k mtDNA genome sequences from the most probable tree out of 10 million MCMC chains with BEAST (v1.7.2; [34] Drummond A.J. Suchard M.A. Xie D. Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol. Biol. Evol. 2012; 29: 1969-1973) and identified the mutations defining different branches by viewing the aligned sequences in BioEdit in comparison to the Reconstructed Sapiens Reference Sequence (RSRS; [35] Behar D.M. van Oven M. Rosset S. Metspalu M. Loogväli E.L. Silva N.M. Kivisild T. Torroni A. Villems R. A "Copernican" reassessment of the human mitochondrial DNA tree from its root. Am. J. Hum. Genet. 2012; 90: 675-684). The node branches were dated with the mutation rate of 1.26 × 10⁻⁸ for the coding region only, [36] Mishmar D. Ruiz-Pesini E. Golik P. Macaulay V. Clark A.G. Hosseini S. Brandon M. Easley K. Chen E. Brown M.D.
et al. Natural selection shaped regional mtDNA variation in humans. Proc. Natl. Acad. Sci. USA. 2003; 100: 171-176, which makes our estimates comparable to those from Behar et al. [6]. The complete tree of sequences showing mutations that characterize the major branches is available in the Supplemental Data (Figure S2); further discussion of some of these mutations is found in Table S3. Figure 2 summarizes the tree topology and the time to the most recent common ancestor (TMRCA) of lineages, with confidence intervals indicated for the major nodes. The tree coalesces 145 kya (95% C.I.: 118–179 kya), corresponding to the time of split between L0d and L0k.
From the topology of the tree, different sublineages can be distinguished for both the L0d and L0k haplogroups. For L0d, three main branches (L0d3, L0d1, and L0d2) separate around 95 kya (95% C.I.: 79–121 kya), whereas L0k splits into L0k1 and L0k2 approximately 40 kya (95% C.I.: 28–53 kya). The first branch of L0d is the uncommon L0d3, which is found in a population with South African Khoisan ancestry (Karretjie People) at 13% and in a Coloured population at 10% ([32] Schlebusch C.M. de Jongh M. Soodyall H. Different contributions of ancient mitochondrial and Y-chromosomal lineages in 'Karretjie people' of the Great Karoo in South Africa. J. Hum. Genet. 2011; 56: 623-630), as well as being attested in one undefined Khoi and one individual from Kuwait, [6] Behar D.M. Villems R. Soodyall H. Blue-Smith J. Pereira L. Metspalu E. Scozzari R. Makkan H. Tzur S. Comas D. et al., Genographic Consortium. The dawn of human matrilineal diversity. Am. J. Hum. Genet.
2008; 82: 1130-1140, and three Sandawe and one Burunge from Tanzania (our identification, based on sequences from Gonder et al. [5]). In our data set, it is found in only five individuals (two Nama and one Hai||om, who speak Khoe languages, and two Kgalagadi, who speak a Bantu language). As can be seen from the tree (Figure 2), L0d3 splits into two branches (L0d3a and L0d3b) 45 kya (95% C.I.: 30–61 kya), with eight mutations defining L0d3b (Figure S2 and Table S3). Interestingly, this split reflects geographic substructure: L0d3a is restricted to East Africa and the Middle East, being found in the individuals from Kuwait and Tanzania, and L0d3b is restricted to southern Africa, being found in the five individuals of our data set plus the Khoi sequence published by Behar et al. [6].
L0d1 is the most common subhaplogroup: it is present in all Khoisan populations, all Bantu-speaking populations of our data set from Botswana and Namibia, and a few individuals from Bantu-speaking populations of Zambia and Angola. It coalesces approximately 55 kya (95% C.I.: 44–68 kya) and comprises two branches, of which the first includes haplogroups L0d1a and L0d1c. L0d1a is a monophyletic clade; however, two sites, namely T199C and C16266A, previously assumed to define this clade, pose problems for reconstructing the history of mutations (see Table S3 for details).
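As a rough guide to the scale of the coalescence dates quoted in this section, the coding-region rate of 1.26 × 10⁻⁸ used for dating can be converted into an expected number of substitutions per lineage. The sketch below is a back-of-envelope illustration only, not the BEAST analysis itself: the per-site-per-year unit and the 15,447-site coding-region length (np 577–16,023) are assumptions not stated in the text, and the function names are hypothetical.

```python
"""Back-of-envelope conversion between node age and expected substitutions.

Illustrative only: BEAST infers node ages from the full posterior, whereas
this sketch assumes a strict clock along a single lineage.
"""

RATE_PER_SITE_PER_YEAR = 1.26e-8  # rate quoted in the text; the unit is an assumption
CODING_REGION_SITES = 15_447      # assumption: np 577-16,023 inclusive


def years_per_substitution(rate=RATE_PER_SITE_PER_YEAR, sites=CODING_REGION_SITES):
    """Mean waiting time (years) between substitutions along one lineage."""
    return 1.0 / (rate * sites)


def expected_substitutions(age_years, rate=RATE_PER_SITE_PER_YEAR, sites=CODING_REGION_SITES):
    """Expected substitutions accumulated along one lineage since a node of the given age."""
    return age_years * rate * sites


if __name__ == "__main__":
    # Point estimates (in years) taken from the text.
    nodes = {
        "L0d/L0k split": 145_000,
        "L0d3/L0d1/L0d2 split": 95_000,
        "L0k1/L0k2 split": 40_000,
    }
    print(f"{years_per_substitution():.0f} years per substitution")
    for name, age in nodes.items():
        print(f"{name}: {expected_substitutions(age):.1f} substitutions")
```

Under these assumptions, one coding-region substitution accumulates roughly every 5,100 years along a lineage, so a root age of 145 ky corresponds to about 28 expected substitutions.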
In L0d1c, substantial variation emerges from our expanded data set that pushes the coalescence date back to 32 kya (95% C.I.: 24–41 kya), 10 ky older than previously estimated [6]. A low posterior probability is associated with the first nodes; these are represented by paraphyletic clades that are characterized by a large number of private mutations. In addition to the paraphyletic clades, L0d1c contains two monophyletic clades. The first is the previously attested L0d1c1, which is defined by only two of the mutations previously associated with it (Figure S2 and Table S3). The second monophyletic clade in L0d1c, which we here define as L0d1c2a, is represented by six haplotypes and supported by four mutations (Figure S2). The second basal branch of L0d1 is subhaplogroup L0d1b, which coalesces approximately 45 kya (95% C.I.: 35–56 kya) and is thus 10 ky older than previously estimated [6]. As shown by our data, this is characterized by only one mutation, T3618C, splitting immediately into several subhaplogroups. Because the haplogroup previously labeled L0d1b1 is only the second of thr
Reference(s)