Deciphering the Ancient and Complex Evolutionary History of Human Arylamine N-Acetyltransferase Genes
2006; Elsevier BV; Volume: 78; Issue: 3; Language: English
DOI: 10.1086/500614
ISSN 1537-6605
Authors: Étienne Patin, Luis B. Barreiro, Pardis C. Sabeti, Frédéric Austerlitz, Francesca Luca, Antti Sajantila, Doron M. Behar, Ornella Semino, Anavaj Sakuntabhai, Nicole Guiso, Brigitte Gicquel, Ken McElreavey, Rosalind M. Harding, Évelyne Heyer, Lluís Quintana‐Murci
Topic(s): Forensic and Genetic Research
Abstract: The human N-acetyltransferase genes NAT1 and NAT2 encode two phase-II enzymes that metabolize various drugs and carcinogens. Functional variability at these genes has been associated with adverse drug reactions and cancer susceptibility. Mutations in NAT2 leading to the so-called slow-acetylation phenotype reach high frequencies worldwide, which questions the significance of altered acetylation in human adaptation. To investigate the role of population history and natural selection in shaping NATs variation, we characterized genetic diversity through the resequencing and genotyping of NAT1, NAT2, and the pseudogene NATP in a collection of 13 different populations with distinct ethnic backgrounds and demographic pasts. This combined study design allowed us to define a detailed map of linkage disequilibrium of the NATs region as well as to perform a number of sequence-based neutrality tests and the long-range haplotype (LRH) test. Our data revealed distinctive patterns of variability for the two genes: the reduced diversity observed at NAT1 is consistent with the action of purifying selection, whereas NAT2 functional variation contributes to high levels of diversity. In addition, the LRH test identified a particular NAT2 haplotype (NAT2*5B) under recent positive selection in western/central Eurasians. This haplotype harbors the mutation 341T→C and encodes the “slowest-acetylator” NAT2 enzyme, suggesting a general selective advantage for the slow-acetylator phenotype. Interestingly, the NAT2*5B haplotype, which seems to have conferred a selective advantage during the past ∼6,500 years, exhibits today the strongest association with susceptibility to bladder cancer and adverse drug reactions. On the whole, the patterns observed for NAT2 illustrate well how geographically and temporally fluctuating xenobiotic environments may have influenced not only our genome variability but also our present-day susceptibility to disease.
The two human N-acetyltransferase genes, NAT1 (MIM 108345) and NAT2 (MIM 243400), represent one of the first and clearest examples of the importance of genetic variation among individuals and across populations in drug response (Weber 1987). The two homologous genes are situated within a 200-kb region in 8p22, together with the NATP pseudogene (fig. 1). Both genes encode phase-II enzymes named “arylamine N-acetyltransferases” (NATs), which catalyze the transfer of an acetyl group to different arylhydrazines and arylamine drugs (Blum et al. 1990). Both genes carry functional polymorphisms whose effects on enzymatic activity have been well studied (Hein et al. 2000). Whereas the variants associated with reduced activity attain only low frequencies in NAT1, they constitute common polymorphisms in NAT2 (Upton et al. 2001). Two main classes of NAT2 phenotypes are therefore observed: the “fast-acetylation” phenotype, which refers to the wild-type acetylation activity, and the “slow-acetylation” phenotype, which results in reduced protein activity.
In addition, NAT1 and NAT2 metabolize numerous common carcinogens, and variation in these genes can result in varying susceptibility to cancer (for a review, see Hein 2002). For example, the slow-acetylator NAT2 phenotype has been associated with side effects of the commonly used antitubercular drug isoniazid (Huang et al. 2002) and with a higher risk of bladder cancer (Cartwright et al. 1982; Garcia-Closas et al. 2005). Nevertheless, most NAT2 mutations leading to the slow phenotype are found at high frequencies worldwide, calling into question the role of altered acetylation in human adaptation.
Moreover, the function of NATs in mediating the interactions between humans and their xenobiotic environment, which varies depending on diet and lifestyle, makes them excellent targets for the action of natural selection. Indeed, several studies have identified the signature of different selective pressures in genes involved in the metabolism of exogenous substances, including the members of the CYP3A family (Thompson et al. 2004), CYP1A2 (Wooding et al. 2002), LCT (Bersaglieri et al. 2004), TAS2R16 (Soranzo et al. 2005), PTC (Wooding et al. 2004), HFE (Toomajian and Kreitman 2002; Toomajian et al. 2003), MDR1 (Tang et al. 2004), and MRP1 (Wang et al. 2005).

The main objective of the present study was to investigate the evolutionary history of the NATs region by unraveling the relative influences of natural selection and human demography in determining its present-day variability. With this goal in mind, we first resequenced NAT1, NAT2, and the pseudogene NATP in a multiethnic panel of 80 individuals (referred to as the “resequencing panel”). To further investigate the global linkage disequilibrium (LD) patterns in the NATs region, we selected 21 SNPs (5 NAT1 and 7 NAT2 SNPs retrieved from the initial sequence-based data set, as well as 9 intergenic SNPs) to cover the entire 200-kb region (fig. 1). These markers were all genotyped in an extended collection of 563 individuals (referred to as the “genotyping panel”) originating from 13 different ethnologically well-defined human populations. Coalescent methods, sequence-based neutrality tests, and the long-range haplotype (LRH) test (Sabeti et al. 2002) were performed to provide insight into the role of these genes in human adaptation to geographically and historically fluctuating xenobiotic environments.

The resequencing panel consisted of 80 individuals (160 chromosomes) from eight populations representing major geographic regions. Sub-Saharan African chromosomes were represented by Bakola Pygmies from Cameroon (20) and by Bantu speakers from Gabon (20); western Eurasian samples were represented by Ashkenazi Jews (20), Sardinians (12), French (20), and Saami from Finland (20); and eastern Eurasian samples were represented by Indians from Gujarat (20) and by Thai (28). One chimpanzee (Pan troglodytes) was also sequenced to define the ancestral state of each mutation.

The genotyping panel consisted of 563 individuals (1,126 chromosomes) from 13 populations. Sub-Saharan African chromosomes were represented by Bakola Pygmies (80) and Baka Pygmies (60) from Cameroon, Ateke Bantu speakers from Gabon (100), and Somali (48); North African and western and central Eurasian samples were represented by Moroccans (88), Ashkenazi Jews (80), Sardinians (98), Swedes (100), Saami from Finland (96), and Turkmen from Uzbekistan (100); and eastern Eurasian samples were represented by Gujarati from India (100), Chinese from the Hunan and Zhejiang regions (88), and Thai (88). All individuals were healthy donors from whom informed consent was obtained.

Six different regions were PCR amplified, for a total of ∼8.5 kb per chromosome (fig. 1): the entire coding exon of the NAT1 gene (870 bp) and 1,735 bp of noncoding flanking parts (1,122 bp at the 5′ end and 613 bp at the 3′ end); the entire coding exon of the NAT2 gene (870 bp) and 1,950 bp of noncoding flanking parts, including 1,603 bp surrounding its first noncoding exon; the pseudogene NATP (2,145 bp); and two intergenic noncoding regions at 10 kb and 100 kb (1,068 bp) from the NAT1 5′ end. Details about PCR and sequencing conditions are available on request. As a measure of quality control for the data, individuals presenting singletons or ambiguous polymorphisms were reamplified and resequenced. Sequences were analyzed using the GENALYS software (Takahashi et al. 2003).

The newly discovered sequence-based variation was used to determine the minimal number of SNPs able to distinguish the haplotypic diversity (haplotype-tagging SNPs [htSNPs]) of the NAT1 and NAT2 loci in a given population. Five SNPs and one (TAA)n microsatellite were typed in NAT1 by either genotyping or sequencing, and seven SNPs were genotyped in the NAT2 coding region. In addition, we genotyped nine intergenic SNPs selected because they were polymorphic in all human populations (fig. 1). These SNPs were chosen either from dbSNP (when dbSNPs met the previous criterion) or from the intergenic regions sequenced here. Genotyping was performed by either fluorescence polarization (VICTOR-2™ technology) or TaqMan (ABI Prism-7000 Sequence Detection System) assays.

Allele frequencies were determined by gene counting, and deviations from Hardy-Weinberg equilibrium were tested with Arlequin v.2.001 (Schneider et al. 2000). Haplotype reconstruction was performed using the Bayesian method implemented in PHASE v.2.1.1 (Stephens and Donnelly 2003), and htSNPs were defined using BEST v.1.0 (Sebastiani et al. 2003), after the exclusion of singletons because they could not be positioned with certainty on a given haplotypic context.

With the use of phased data, the neutral parameter θML and the time since the most recent common ancestor (TMRCA) were estimated by maximum likelihood with GENETREE (Griffiths and Tavaré 1994), under a standard coalescent model. Since this model assumes no recombination, for this particular analysis we had to exclude a few SNPs or rare recombinant haplotypes (in NAT1, the first four 5′ SNPs; in NATP, three singleton haplotypes; in NAT2, two singleton haplotypes). Time, scaled in 2Ne units, was converted into years by use of a 25-year generation time and an Ne value obtained as θML divided by 4μ. The mutation rate per gene per generation (μ) was deduced from Dxy, the average number of nucleotide substitutions per site between human and chimpanzee (Nei 1987, equation 10.20), calculated by DnaSP v.4.0 (Rozas et al. 2003), with consideration that the two species diverged 200,000 generations ago. Simulations were performed to estimate the probability of a TMRCA greater than a given value, under a Wright-Fisher model. Fifty thousand simulations were performed using a version of the MS program modified to obtain TMRCA values (R. Hudson, personal communication).

Using DnaSP, we calculated the nucleotide diversity (π) and Watterson's estimator of θ (θW) (Watterson 1975), and we performed a number of statistical tests: Tajima's D (TD) (Tajima 1989), Fu and Li's F* (Fu and Li 1993), Fay and Wu's H (Fay and Wu 2000), KA/KS (Kimura 1968), the Hudson-Kreitman-Aguadé (HKA) test (Hudson et al. 1987), and the McDonald-Kreitman (MK) test (McDonald and Kreitman 1991). A neutrality test based on the expected heterozygosity was also performed with the Bottleneck program (Cornuet and Luikart 1996) on the NAT1 3′ UTR microsatellite, by the use of coalescent simulations (10,000 runs) and with the assumption of different mutational models (stepwise mutation model and two-phased mutation model with 0%–40% of multistep changes).

Pairwise LD between the 21 genotyped SNPs was estimated after the exclusion, in each population, of SNPs with a minor-allele frequency (MAF) <0.10. Using DnaSP, we calculated the statistics D′ (Lewontin 1964) and r² (Hill and Robertson 1968) and tested their statistical significance, using Fisher's exact test followed by Bonferroni corrections.

To perform the LRH test, we selected two core regions (in NAT1, SNPs 445, 1088, 1095, and 1191; in NAT2, SNPs 341, 481, 590, 803, and 857) identified as haplotype blocks, following the criteria of Gabriel et al. (2002), and we assessed, for each core haplotype, its relative extended haplotype homozygosity (REHH) at a distance of 200 kb.
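The pairwise LD statistics named above can be illustrated with a minimal sketch. It assumes phased two-site haplotypes supplied as two-character strings and uses the standard textbook definitions of D, |D′|, and r²; the function name `pairwise_ld` and the input layout are hypothetical, and this is not the DnaSP implementation used in the study.

```python
from collections import Counter


def pairwise_ld(haplotypes):
    """Return (|D'|, r^2) for two biallelic sites.

    `haplotypes` holds one 2-character string per phased chromosome,
    e.g. "TG" = allele T at the first site, allele G at the second.
    (Illustrative input format, assumed for this sketch.)
    """
    n = len(haplotypes)
    first_alleles = sorted({h[0] for h in haplotypes})
    second_alleles = sorted({h[1] for h in haplotypes})
    if len(first_alleles) != 2 or len(second_alleles) != 2:
        raise ValueError("both sites must be biallelic and polymorphic")

    # Pick one reference allele per site; |D'| and r^2 do not depend on the choice.
    a, b = first_alleles[0], second_alleles[0]
    counts = Counter(haplotypes)
    p_a = sum(c for h, c in counts.items() if h[0] == a) / n
    p_b = sum(c for h, c in counts.items() if h[1] == b) / n
    p_ab = counts[a + b] / n

    # D: deviation of the observed haplotype frequency from independence.
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum possible magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between alleles at the two sites.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


if __name__ == "__main__":
    sample = ["TG"] * 50 + ["TA"] * 25 + ["CA"] * 25
    print(pairwise_ld(sample))
```

In the toy sample only three of the four possible haplotypes occur, so |D′| reaches its maximum while r² stays well below 1, which is why the two statistics are usually reported together.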
To test the significance of potentially selected core haplotypes, we first compared our sub-Saharan African and non-African data sets with coalescent simulations of 1-Mb regions, assuming a neutral model of evolution with recombination (Hudson 2002). Model parameters (including demography and recombination rate) were consistent with current estimates for African and non-African populations (Schaffner et al. 2005). Similarly, our sub-Saharan African and non-African data sets were compared with the empirical distribution of “core haplotype frequencies versus REHH” obtained from the screening of the entire chromosome 8 in Yoruban and European-descent populations, respectively (HapMap database).

To infer the population growth rate, r, and the age, g, of NAT2 nonsynonymous mutations, we used a joint maximum-likelihood estimation of these parameters, as described by Austerlitz et al. (2003). We compared these results with coalescent-based estimations of the two parameters: the growth rate estimation of Slatkin and Bertorelle (2001) and the age estimation of Reeve and Rannala (2002), obtained with the DMLE+ v.2.2 software. One million iterations were performed for each estimation. The recombination parameter required for these analyses was estimated by comparing deCODE and Marshfield genetic and physical distances in the NATs region (UCSC Genome Bioinformatics).

The coefficient of selection, s, of the NAT2 mutation 341T→C was estimated using the deterministic equation 3.29 of Wright (1969), which relates the frequency of an allele in generation t+1 to its frequency in generation t. We set the degree of dominance, h, to either 0.0 (recessivity) or 0.5 (codominance). We assumed the frequency, p0, of the C allele before selection to vary between 0.05 and 0.15 (corresponding to the allele frequency in Pygmies and eastern Eurasians). Making these assumptions, we calculated the s values that would yield a frequency of 0.50 (the present-day frequency of the 341C allele in western Eurasians) from its initial p0 frequency in g generations.

The initial sequencing screening of the resequencing panel yielded a total of 111 mutations, including 68 transitions, 34 transversions, 8 insertions/deletions, and 1 triallelic microsatellite (table 1) (GenBank accession numbers DQ305496–DQ305975). In NAT1, we observed 2 nonsynonymous and 4 synonymous SNPs in its coding region and 26 SNPs and the triallelic (TAA)n microsatellite in its flanking regions. In the NAT2 coding region, we found two synonymous and eight nonsynonymous mutations, three of which were newly identified (L24I, T193M, and Y208H). These three variants were singletons and were restricted to sub-Saharan samples. In addition, 14 SNPs and 3 indels in NAT2 flanking regions were observed. In NATP, we identified 32 SNPs and 5 indels.
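The selection-coefficient estimate described in the methods above (finding the s that carries the 341C allele from p0 to 0.50 in g generations) can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard deterministic single-locus recursion with genotype fitnesses 1+s, 1+hs, and 1, which may differ in form from Wright's equation 3.29; the function names are hypothetical; and g = 260 generations is an illustrative value obtained from the ∼6,500 years mentioned in the abstract and a 25-year generation time.

```python
def next_frequency(p, s, h):
    """One generation of deterministic selection on a favored allele.

    Assumed fitnesses (a standard parameterization, not taken from the paper):
    favored homozygote 1+s, heterozygote 1+h*s, other homozygote 1.
    """
    q = 1.0 - p
    mean_fitness = p * p * (1 + s) + 2 * p * q * (1 + h * s) + q * q
    return p * (p * (1 + s) + q * (1 + h * s)) / mean_fitness


def frequency_after(p0, s, h, generations):
    """Iterate the recursion for a given number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, s, h)
    return p


def solve_selection_coefficient(p0, p_target, h, generations,
                                lo=0.0, hi=1.0, tol=1e-10):
    """Find s by bisection so that the allele goes from p0 to p_target.

    Relies on the final frequency increasing with s for s >= 0.
    """
    if frequency_after(p0, hi, h, generations) < p_target:
        raise ValueError("target frequency not reachable with s <= hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if frequency_after(p0, mid, h, generations) < p_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    g = 260  # illustrative: ~6,500 years / 25 years per generation
    for h in (0.0, 0.5):
        for p0 in (0.05, 0.15):
            s = solve_selection_coefficient(p0, 0.50, h, g)
            print(f"h={h:.1f}  p0={p0:.2f}  s={s:.4f}")
```

Under this parameterization a recessive advantage (h = 0.0) requires a larger s than a codominant one (h = 0.5) to produce the same frequency change, because selection barely acts on a recessive allele while it is rare.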
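For all the SNPs, only 1.54% of the tests departed significantly from Hardy-Weinberg equilibrium. However, these few tests would become nonsignificant after a correction for multiple testing.

Table 1. Polymorphisms Identified through the Resequencing Survey of the NATs Region
Population columns give the derived-allele frequency in each population (%).

Polymorphism | dbSNP Reference Number | Allele(s) (Ancestral/Derived) | SNP Type | Bakola (2N=20) | Bantu (2N=20) | Ashkenazi (2N=20) | Sardinian (2N=12) | French (2N=20) | Saami (2N=20) | Gujarati (2N=20) | Thai (2N=28)
NAT1 −1112 | rs8190842 | G/A | Intron | 5.0 | 10.0 | .0 | .0 | 5.0 | .0 | 5.0 | 7.1
NAT1 −1048 | rs8190843 | G/C | Intron | .0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NAT1 −943 | rs8190844 | C/T | Intron | .0 | .0 | .0 | .0 | 5.0 | .0 | .0 | .0
NAT1 −929 | rs8190845 | G/A | Intron | 45.0 | 50.0 | 5.0 | 33.3 | 15.0 | 20.0 | 5.0 | 25.0
NAT1 −868 | rs8190846 | G/A | Intron | .0 | .0 | .0 | .0 | 5.0 | .0 | 5.0 | 7.1
NAT1 −844 | rs8190847 | G/A | Intron | .0 | .0 | .0 | .0 | 5.0 | 0.0 | 5.0 | 7.1
NAT1 −826 | rs8190848 | C/T | Intron | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 −720 | NA | T/A | Intron | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 −706 | rs8190851 | G/A | Intron | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 −688 | rs8190852 | G/C | Intron | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 −685 | rs8190853 | C/T | Intron | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 −621 | NA | A/G | Intron | .0 | .0 | .0 | .0 | .0 | .0 | .0 | 3.6
NAT1 −565 | rs8190854 | A/G | Intron | 10.0 | .0 | .0 | .0 | .0 | .0 | .0 | .0
NAT1 −433 | rs8190856 | T/C | Intron | .0 | .0 | .0 | .0 | 5.0 | .0 | 5.0 | 7.1
NAT1 −344 | rs4986988 | T/C | Intron | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 −278 | rs17126356 | T/A | Intron | 5.0 | 10.0 | .0 | 8.3 | .0 | .0 | .0 | .0
NAT1 −40 | rs4986989 | T/A | Intron | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 −36 | rs8190857 | A/T | Intron | .0 | 10.0 | .0 | .0 | .0 | .0 | .0 | .0
NAT1 21 | rs4986992 | T/G | Silent mutation | 5.0 | .0 | .0 | .0 | .0 | .0 | .0 | .0
NAT1 342 | NA | T/C | Silent mutation | .0 | .0 | .0 | .0 | .0 | .0 | .0 | 3.6
NAT1 445 | rs4987076 | A/G | Missense mutation V149I | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 459 | rs4986990 | G/A | Silent mutation | .0 | .0 | .0 | .0 | 5.0 | .0 | 5.0 | 7.1
NAT1 640 | rs4986783 | G/T | Missense mutation S214A | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 777 | rs4986991 | T/C | Silent mutation | 5.0 | .0 | .0 | .0 | .0 | .0 | .0 | .0
NAT1 (TAA)9 | NA | ins(TAA) | 3′ UTR | .0 | 0.0 | .0 | .0 | .0 | .0 | .0 | 17.9
NAT1 (TAA)5 | NA | del(TAATAATAA) | 3′ UTR | .0 | .0 | .0 | .0 | 5.0 | .0 | 5.0 | 7.1
NAT1 1088 | rs1057126 | T/A | 3′ UTR | 35.0 | 65.0 | 20.0 | 33.3 | 20.0 | 25.0 | 15.0 | 46.4
NAT1 1095 | rs15561 | A/C | 3′ UTR | 60.0 | 35.0 | 80.0 | 58.3 | 70.0 | 65.0 | 75.0 | 46.4
NAT1 1191 | rs4986993 | T/G | 3′ UTR | 60.0 | 35.0 | 80.0 | 58.3 | 70.0 | 65.0 | 75.0 | 46.4
NAT1 1236 | rs4987077 | A/G | Intergenic/unknown | 35.0 | .0 | .0 | .0 | .0 | .0 | .0 | .0
NAT1 1277 | NA | A/G | Intergenic/unknown | .0 | .0 | .0 | .0 | 5.0 | .0 | 5.0 | 7.1
NAT1 1345 | rs8190862 | G/C | Intergenic/unknown | 100.0 | 100.0 | 100.0 | 100.0 | 95.0 | 100.0 | 95.0 | 92.9
NAT1 1377 | rs8190863 | C/T | Intergenic/unknown | .0 | .0 | .0 | .0 | 5.0 | .0 | 5.0 | 7.1
NAT1 1454 | rs8190864 | C/T | Intergenic/unknown | .0 | .0 | .0 | 8.3 | .0 | .0 | .0 | .0
10KB 455 | NA | G/A | Intergenic/unknown | 5.0 | .0 | .0 | .0 | .0 | NA | .0 | .0
10KB 546 | NA | G/T | Intergenic/unknown | .0 | 5.0 | .0 | .0 | .0 | NA | .0 | .0
10KB 633 | rs4921583 | C/A | Intergenic/unknown | 15.0 | 45.0 | 30.0 | 33.3 | 50.0 | NA | 25.0 | 30.0
10KB 648 | rs4921585 | C/T | Intergenic/unknown | 15.0 | 45.0 | 30.0 | 41.7 | 50.0 | NA | 25.0 | 30.0
10KB 666 | NA | C/A | Intergenic/unknown | .0 | .0 | .0 | .0 | 5.0 | NA | .0 | .0
10KB 884 | NA | G/A | Intergenic/unknown | .0 | .0 | .0 | .0 | .0 | NA | .0 | 5.0
10KB 904 | rs1389110 | T/C | Intergenic/unknown | 38.9 | 60.0 | 30.0 | 50.0 | 50.0 | NA | 25.0 | 45.0
100KB 314 | NA | G/T | Intergenic/unknown | .0 | 11.1 | .0 | .0 | .0 | NA | .0 | .0
100KB 409 | NA | T/C | Intergenic/unknown | 5.0 | 16.7 | .0 | .0 | .0 | NA | .0 | .0
100KB 411 | NA | T/G | Intergenic/unknown | .0 | .0 | .0 | .0 | 5.6 | NA | 5.0 | .0
100KB 455 | NA | G/C | Intergenic/unknown | 10.0 | .0 | .0 | .0 | .0 | NA | .0 | .0
100KB 507 | rs12541267 | A/G | Intergenic/unknown | 45.0 | 20.0 | 35.0 | 33.3 | 38.9 | NA | 40.0 | 45.0
100KB 530 | rs13259523 | A/G | Intergenic/unknown | .0 | .0 | .0 | .0 | 5.6 | NA | 5.0 | .0
100KB 595 | NA | G/T | Intergenic/unknown | 5.0 | .0 | .0 | .0 | .0 | NA | .0 | .0
NATP −10 | NA | del(GAAA…TAGT) | Intergenic/unknown | .0 | .0 | .0 | .0 | .0 | 5.0 | .0 | .0
NATP 49 | NA | A/G | Intergenic/unknown | 10.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 52 | NA | C/T | Intergenic/unknown | 5.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 362 | NA | C/T | Intergenic/unknown | 100.0 | 90.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0
NATP 414 | NA | del(T) | Intergenic/unknown | .0 | 10.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 417 | rs12334336 | T/C | Intergenic/unknown | 5.0 | 10.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 520 | NA | A/G | Intergenic/unknown | .0 | 10.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 631 | NA | G/C | Intergenic/unknown | 100.0 | 90.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0
NATP 685 | NA | T/C | Intergenic/unknown | 5.0 | .0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 698 | rs10088180 | A/G | Intergenic/unknown | 75.0 | 70.0 | 75.0 | 66.7 | 85.0 | 45.0 | 60.0 | 60.7
NATP 733 | NA | T/G | Intergenic/unknown | .0 | .0 | .0 | .0 | .0 | .0 | .0 | 3.6
NATP 745 | NA | del(G) | Intergenic/unknown | 25.0 | 35.0 | 35.0 | 16.7 | 50.0 | 10.0 | 25.0 | 42.9
NATP 754 | NA | G/A | Intergenic/unknown | 90.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0
NATP 828 | NA | T/C | Intergenic/unknown | 15.0 | 15.0 | .0 | .0 | .0 | 5.0 | 0.0 | .0
NATP 859 | NA | ins(T) | Intergenic/unknown | 85.0 | 85.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0
NATP 876 | NA | T/C | Intergenic/unknown | .0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 911 | NA | G/A | Intergenic/unknown | .0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1007 | NA | G/A | Intergenic/unknown | .0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1153 | NA | C/T | Intergenic/unknown | .0 | .0 | 10.0 | .0 | .0 | .0 | .0 | .0
NATP 1198 | NA | T/C | Intergenic/unknown | 10.0 | 15.0 | 20.0 | 16.7 | 10.0 | 15.0 | 10.0 | 28.6
NATP 1251 | NA | G/C | Intergenic/unknown | 5.0 | 10.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1257 | NA | ins(A) | Intergenic/unknown | .0 | .0 | .0 | .0 | .0 | .0 | .0 | 3.6
NATP 1433 | NA | T/C | Intergenic/unknown | 10.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1581 | NA | G/C | Intergenic/unknown | 10.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1602 | rs2172426 | T/C | Intergenic/unknown | 60.0 | 35.0 | 45.0 | 25.0 | 55.0 | 50.0 | 55.0 | 53.6
NATP 1616 | NA | C/A | Intergenic/unknown | .0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1686 | NA | T/C | Intergenic/unknown | 15.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1770 | rs12548816 | T/C | Intergenic/unknown | .0 | .0 | 10.0 | 8.3 | 5.0 | 30.0 | 10.0 | 10.7
NATP 1794 | NA | G/A | Intergenic/unknown | .0 | 10.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1827 | NA | T/G | Intergenic/unknown | 15.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1829 | NA | T/G | Intergenic/unknown | 15.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1851 | NA | G/A | Intergenic/unknown | 15.0 | 5.0 | .0 | .0 | .0 | .0 | .0 | .0
NATP 1881 | rs13254216 | T/C | Intergenic/unknown | .0 | .0 | 10.0 | 8.3 | 5.0 | .0 | 5.0 | .0
NATP 1903 | NA | A/C | Intergenic/unknown | 15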
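References
Weber WW (1987) The acetylator genes and drug response. Oxford University Press, New York
Blum M, Grant DM, McBride W, Heim M, Meyer UA (1990) Human arylamine N-acetyltransferase genes: isolation, chromosomal localization, and functional expression. DNA Cell Biol 9:193-203
Hein DW, Doll MA, Fretland AJ, Leff MA, Webb SJ, Xiao GH, Devanaboyina US, Nangju NA, Feng Y (2000) Molecular genetics and epidemiology of the NAT1 and NAT2 acetylation polymorphisms. Cancer Epidemiol Biomarkers Prev 9:29-42
Upton A, Johnson N, Sandy J, Sim E (2001) Arylamine N-acetyltransferases: of mice, men and microorganisms. Trends Pharmacol Sci 22:140-146
Hein DW (2002) Molecular genetics and function of NAT1 and NAT2: role in aromatic amine metabolism and carcinogenesis. Mutat Res 506-507:65-77
Huang YS, Chern HD, Su WJ, Wu JC, Lai SL, Yang SY, Chang FY, Lee SD (2002) Polymorphism of the N-acetyltransferase 2 gene as a susceptibility risk factor for antituberculosis drug-induced hepatitis. Hepatology 35:883-889
Cartwright RA, Glashan RW, Rogers HJ, Ahmad RA, Barham-Hall D, Higgins E, Kahn M (1982) A role of N-acetyltransferase phenotypes in bladder carcinogenesis: a pharmacogenetic epidemiological approach to bladder cancer. Lancet 2:842-846
Garcia-Closas M, Malats N, Silverman D, Dosemeci M, Kogevinas M, Hein DW, Tardon A, Serra C, Carrato A, Garcia-Closas R, Lloreta J, Castano-Vinyals G, Yeager M, Welch R, Chanock S, Chatterjee N, Wacholder S, Samanic C, Tora M, Fernandez F, Real FX, Rothman N (2005) NAT2 slow acetylation, GSTM1 null genotype, and risk of bladder cancer: results from the Spanish Bladder Cancer Study and meta-analyses. Lancet 366:649-659
Thompson EE, Kuttab-Boulos H, Witonsky D, Yang L, Roe BA, Di Rienzo A (2004) CYP3A variation and the evolution of salt-sensitivity variants. Am J Hum Genet 75:1059-1069
Wooding SP, Watkins WS, Bamshad MJ, Dunn DM, Weiss RB, Jorde LB (2002) DNA sequence variation in a 3.7-kb noncoding sequence 5′ of the CYP1A2 gene: implications for human population history and natural selection. Am J Hum Genet 71:528-542
Bersaglieri T, Sabeti PC, Patterson N, Vanderploeg T, Schaffner SF, Drake JA, Rhodes M, Reich DE, Hirschhorn JN (2004) Genetic signatures of strong recent positive selection at the lactase gene. Am J Hum Genet 74:1111-1120
Soranzo N, Bufe B, Sabeti PC, Wilson JF, Weale ME, Marguerie R, Meyerhof W, Goldstein DB (2005) Positive selection on a high-sensitivity allele of the human bitter-taste receptor TAS2R16. Curr Biol 15:1257-1265
Wooding S, Kim UK, Bamshad MJ, Larsen J, Jorde LB, Drayna D (2004) Natural selection and molecular evolution in PTC, a bitter-taste receptor gene. Am J Hum Genet 74:637-646
Toomajian C, Kreitman M (2002) Sequence variation and haplotype structure at the human HFE locus. Genetics 161:1609-1623
Toomajian C, Ajioka RS, Jorde LB, Kushner JP, Kreitman M (2003) A method for detecting recent selection in the human genome from allele age estimates. Genetics 165:287-297
Tang K, Wong LP, Lee EJ, Chong SS, Lee CG (2004) Genomic evidence for recent positive selection at the human MDR1 gene locus. Hum Mol Genet 13:783-797
Wang Z, Wang B, Tang K, Lee EJ, Chong SS, Lee CG (2005) A functional polymorphism within the MRP1 gene locus identified through its genomic signature of positive selection. Hum Mol Genet 14:2075-2087
Sabeti PC, Reich DE, Higgins JM, Levine HZ, Richter DJ, Schaffner SF, Gabriel SB, Platko JV, Patterson NJ, McDonald GJ, Ackerman HC, Campbell SJ, Altshuler D, Cooper R, Kwiatkowski D, Ward R, Lander ES (2002) Detecting recent positive selection in the human genome from haplotype structure. Nature 419:832-837
Takahashi M, Matsuda F, Margetic N, Lathrop M (2003) Automated identification of single nucleotide polymorphisms from sequencing data. J Bioinform Comput Biol 1:253-265
Schneider S, Roessli D, Excoffier L (2000) Arlequin version 2.000: a software for population genetic data analysis. Genetics and Biometry Laboratory, University of Geneva, Geneva
Stephens M, Donnelly P (2003) A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet 73:1162-1169
Sebastiani P, Lazarus R, Weiss ST, Kunkel LM, Kohane IS, Ramoni MF (2003) Minimal haplotype tagging. Proc Natl Acad Sci USA 100:9900-9905
Griffiths RC, Tavaré S (1994) Sampling theory for neutral alleles in a varying environment. Philos Trans R Soc Lond B Biol Sci 344:403-410
Nei M (1987) Molecular evolutionary genetics. Columbia University Press, New York
Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R (2003) DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics 19:2496-2497
Watterson GA (1975) On the number of segregating sites in genetical models without recombination. Theor Popul Biol 7:256-276
Tajima F (1989) Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics 123:585-595
Fu YX, Li WH (1993) Statistical tests of neutrality of mutations. Genetics 133:693-709
Fay JC, Wu CI (2000) Hitchhiking under positive Darwinian selection. Genetics 155:1405-1413
Kimura M (1968) Evolutionary rate at the molecular level. Nature 217:624-626
Hudson RR, Kreitman M, Aguade M (1987) A test of neutral molecular evolution based on nucleotide data. Genetics 116:153-159
McDonald JH, Kreitman M (1991) Adaptive protein evolution at the Adh locus in Drosophila. Nature 351:652-654
Cornuet JM, Luikart G (1996) Description and power analysis of two tests for detecting recent population bottlenecks from allele frequency data. Genetics 144:2001-2014
Lewontin RC (1964) The interaction of selection and linkage. II. Optimum models. Genetics 50:757-782
Hill WG, Robertson A (1968) The effects of inbreeding at loci with heterozygote advantage. Genetics 60:615-628
Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A, Faggart M, Liu-Cordero SN, Rotimi C, Adeyemo A, Cooper R, Ward R, Lander ES, Daly MJ, Altshuler D (2002) The structure of haplotype blocks in the human genome. Science 296:2225-2229
Hudson RR (2002) Generating samples under a Wright-Fisher neutral model. Bioinformatics 18:337-338
Schaffner SF, Foo C, Gabriel S, Reich D, Daly MJ, Altshuler D (2005) Calibrating a coalescent simulation of human genome sequence variation. Genome Res 15:1576-1583
Austerlitz F, Kalaydjieva L, Heyer E (2003) Detecting population growth, selection and inherited fertility from haplotypic data in humans. Genetics 165:1579-1586
Slatkin M, Bertorelle G (2001) The use of intraallelic variability for testing neutrality and estimating population growth rate. Genetics 158:865-874
Reeve JP, Rannala B (2002) DMLE+: Bayesian linkage disequilibrium gene mapping. Bioinformatics 18:894-895
Wright S (1969) Evolution and the genetics of populations. University of Chicago Press, Chicago, p 33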