Exact Tests of Hardy-Weinberg Equilibrium and Homogeneity of Disequilibrium across Strata
2006; Elsevier BV; Volume: 79; Issue: 6; Language: English
DOI: 10.1086/510257
ISSN: 1537-6605
Authors: Daniel J. Schaid, Anthony Batzler, Gregory D. Jenkins, Michelle A.T. Hildebrandt,
Topic(s): Evolution and Genetic Dynamics
Abstract: Detecting departures from Hardy-Weinberg equilibrium (HWE) of marker-genotype frequencies is a crucial first step in almost all human genetic analyses. When a sample is stratified by multiple ethnic groups, it is important to allow the marker-allele frequencies to differ over the strata. In this situation, it is common to test for HWE by using an exact test within each stratum and then using the minimum P value as a global test. This approach does not account for multiple testing, and, because it does not combine information over strata, it does not have optimal power. Several approximate methods to combine information over strata have been proposed, but most of them sum over strata a measure of departure from HWE; if the departures are in different directions, then summing can diminish the overall evidence of departure from HWE. An exact stratified test is more appealing because it uses the probability of genotype configurations across the strata as evidence for global departures from HWE. We developed an exact stratified test for HWE for diallelic markers, such as single-nucleotide polymorphisms (SNPs), and an exact test for homogeneity of Hardy-Weinberg disequilibrium. By applying our methods to data from Perlegen and HapMap (a combined total of more than five million SNP genotypes, with three to four strata and strata sizes ranging from 23 to 60 subjects), we illustrate that the exact stratified test provides more-robust and more-powerful results than those obtained by either the minimum of exact test P values over strata or approximate stratified tests that sum measures of departure from HWE. Hence, our new methods should be useful for samples composed of multiple ethnic groups.

Evaluating Hardy-Weinberg equilibrium (HWE) among marker-genotype proportions is basic to all studies of population genetic data. Some causes of departure from HWE are nonrandom mating, recent migrations, mutations, selection, undetected “silent” or deleted alleles in heterozygotes, and mixture of subpopulations that do not completely interbreed.
Because HWE is expected to occur for most large, randomly mating populations, departures from HWE are often interpreted as genotype errors. Genotypes that significantly depart from HWE are often removed from analyses, although one should be cautious when stretches of markers in linkage disequilibrium depart from HWE [1: Weir BS, Hill WG, Cardon LR. Allelic association patterns for a dense SNP map. Genet Epidemiol. 2004; 27: 442-450].

Current large-scale efforts to discover SNPs and to characterize their frequencies and correlation structure across the genome, as well as across different populations, use relatively small numbers of subjects from different ethnic groups. Hinds et al. [2: Hinds DA, Stuve LL, Nilsen GB, Halperin E, Eskin E, Ballinger DG, Frazer KA, Cox DR. Whole-genome patterns of common DNA variation in three human populations. Science. 2005; 307: 1072-1079] characterized >1.6 million SNPs among samples from three ethnic groups, using 23–24 subjects per group. The HapMap project has genotyped >3.7 million SNPs in four ethnic groups, using 45–60 independent subjects per group [3: The International HapMap Consortium. A haplotype map of the human genome. Nature. 2005; 437: 1299-1320]. When testing for HWE in these studies, researchers computed tests within ethnic group strata and used the smallest P value over all strata, to measure the quality of each SNP. A problem with this approach is that the sample size within each of the strata may not be sufficient to detect meaningful departures from HWE, in contrast to a test that combines the evidence for departure from HWE across all strata. Several methods have been proposed to combine information across strata, allowing for differences in allele frequencies, but none are exact tests. These proposed methods can lead to inflated type I error rates or loss of power.
For this reason, we developed an efficient algorithm to compute exact tests for HWE that combine information across strata for diallelic markers, such as SNPs.

To appreciate the limitations of past work on methods of testing HWE across strata, we briefly review some of the key aspects, because some points provide a deeper understanding of the issues and some developments are useful for our exact methods. For notation, we use A and B to represent the rare and common alleles, respectively, of a locus, with respective allele frequencies $p$ and $q = 1 - p$ ($p \le q$). As explained by Weir [4: Weir B. Genetic data analysis II. Sinauer Associates, Sunderland, MA, 1996], the frequencies of the three genotypes can be expressed in terms of the allele frequencies and a measure of departure from HWE (coefficient of disequilibrium $D$):

$$P_{AA} = p^2 + D, \quad P_{AB} = 2pq - 2D, \quad \text{and} \quad P_{BB} = q^2 + D.$$

Departure from HWE is then provided by $D = P_{AA} - p^2$ or, equivalently, by $D = (4P_{AA}P_{BB} - P_{AB}^2)/4$. This latter expression is more commonly used in the literature.

Haldane [5: Haldane JBS. An exact test for randomness of mating. J Genet. 1954; 52: 631-635] was the first to develop a stratified test for HWE. He did this by recognizing that $D$ is expected to be zero when HWE is true. When HWE holds, the allele counts are sufficient statistics, and so the probability of the genotype counts, conditional on the allele counts, allows one to compute the mean and variance of the parameter of interest. Let $N_{AA}$, $N_{AB}$, and $N_{BB}$ denote the counts of the genotypes. To estimate $D$ from a sample, it may be tempting to plug in the sample estimates, $\hat{P}_{AA} = N_{AA}/N$, $\hat{P}_{AB} = N_{AB}/N$, and $\hat{P}_{BB} = N_{BB}/N$. However, because genotype counts are negatively correlated, this would lead to a biased estimate for small samples; the bias diminishes as the total sample size $N$ increases. As emphasized by Smith [6: Smith C. A note on testing the Hardy-Weinberg Law. Ann Hum Genet. 1970; 33: 377-383], this may not be an important bias for large samples, but, when adding contributions across strata, each of small sample size, the bias can be amplified. Hence, Haldane used an unbiased estimate of $D$,

$$\hat{D} = \frac{4N_{AA}N_{BB} - N_{AB}(N_{AB} - 1)}{4N(N - 1)},$$

so that testing $D = 0$ can be based on the sample estimate $h = 4N_{AA}N_{BB} - N_{AB}(N_{AB} - 1)$. Haldane [5] derived an unbiased variance of $h$ when HWE is true. In contrast, Smith [6] derived the variance of $h$ for when there are departures from HWE, illustrating how the variance of $h$ depends on the population parameters $p$, $q$, and $D$. However, he did not derive an unbiased variance estimate; one cannot simply plug sample estimates into the variance formulas.

To combine the $h$ values over strata, Haldane first standardized each stratum's $h$ by its SE, $h_k/\sqrt{\mathrm{Var}(h_k)}$, and then summed these standardized terms over the $K$ strata to compute the combined statistic

$$T_{\mathrm{Haldane}} = \frac{1}{\sqrt{K}} \sum_{k=1}^{K} \frac{h_k}{\sqrt{\mathrm{Var}(h_k)}},$$

which has an approximate standard normal distribution when HWE is true. The weakness of this approach is that it standardizes first and sums second. A more powerful approach would be to first sum and then standardize, much like the way the Mantel-Haenszel test is constructed for testing a common odds ratio over strata [7: Mantel N, Haenszel W. Statistical aspects of the analysis of data from the retrospective study of disease. J Natl Cancer Inst. 1959; 22: 719-748] or the way NPL statistics can be optimally combined across pedigrees [8: McPeek MS. Optimal allele-sharing statistics for genetic mapping using affected relatives. Genet Epidemiol. 1999; 16: 225-249]. Hence, we propose the statistic

$$T = \frac{\sum_{k} h_k}{\sqrt{\sum_{k} \mathrm{Var}(h_k)}},$$

which also has an approximate standard normal distribution.
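The two ways of combining the $h_k$ values can be compared with a short Python sketch. It is illustrative only and rests on an assumption: Haldane's closed-form variance is not reproduced in the text, so the null mean and variance of $h$ are obtained here by enumerating the conditional distribution of heterozygote counts given the allele counts under HWE. All function names are hypothetical.

```python
from math import exp, lgamma, log, sqrt

def het_distribution(n_a, n):
    """P(N_AB = x | allele counts) under HWE, for n subjects carrying n_a copies of A."""
    n_b = 2 * n - n_a
    xs = list(range(n_a % 2, min(n_a, n_b) + 1, 2))  # x has the parity of n_a
    log_w = [x * log(2) - lgamma((n_a - x) // 2 + 1) - lgamma(x + 1)
             - lgamma((n_b - x) // 2 + 1) for x in xs]
    top = max(log_w)
    w = [exp(v - top) for v in log_w]
    total = sum(w)
    return {x: wi / total for x, wi in zip(xs, w)}

def h_stat(n_aa, n_ab, n_bb):
    """Haldane's h = 4 N_AA N_BB - N_AB (N_AB - 1)."""
    return 4 * n_aa * n_bb - n_ab * (n_ab - 1)

def h_null_moments(n_a, n):
    """Mean and variance of h under HWE, by enumeration (assumption, see lead-in)."""
    dist = het_distribution(n_a, n)
    h = {x: h_stat((n_a - x) // 2, x, n - (n_a + x) // 2) for x in dist}
    mean = sum(p * h[x] for x, p in dist.items())
    var = sum(p * (h[x] - mean) ** 2 for x, p in dist.items())
    return mean, var

def combined_statistics(strata):
    """Return (T, T_Haldane) for a list of (N_AA, N_AB, N_BB) tuples, one per stratum."""
    centred, variances = [], []
    for n_aa, n_ab, n_bb in strata:
        n = n_aa + n_ab + n_bb
        mean, var = h_null_moments(2 * n_aa + n_ab, n)
        if var > 0:  # a stratum with one possible configuration carries no information
            centred.append(h_stat(n_aa, n_ab, n_bb) - mean)
            variances.append(var)
    if not variances:
        return 0.0, 0.0
    t_sum_first = sum(centred) / sqrt(sum(variances))
    t_haldane = sum(c / sqrt(v) for c, v in zip(centred, variances)) / sqrt(len(centred))
    return t_sum_first, t_haldane
```

For example, `combined_statistics([(5, 0, 15), (4, 2, 14), (6, 1, 13)])` returns two positive values, because every stratum shows an excess of homozygotes.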
Positive values of $T$ imply an excess of homozygotes, and negative values an excess of heterozygotes, making it simple to interpret significant departures from HWE.

In contrast to Haldane's method, Smith [6] computed a weighted sum of the $h_k$ values, using weights proportional to the inverse of the variance. However, his derivation assumed that the allele frequencies are constant over strata, which is counter to what we wish to assume. The methods by Haldane and Smith are appropriate if the $h_k$ values are all in the same direction (positive or negative) over strata, but the values can cancel each other if this is not the case, which weakens power. For this reason, others have assumed that the ratio $\theta = P_{AB}^2/(4P_{AA}P_{BB})$ is constant over strata, much like the assumption of a constant odds ratio across stratified 2×2 tables in epidemiological studies [9: Olson JM. Testing the Hardy-Weinberg law across strata. Ann Hum Genet. 1993; 57: 291-295] [10: Olson JM, Foley M. Testing for homogeneity of Hardy-Weinberg disequilibrium using data sampled from several populations. Biometrics. 1996; 52: 971-979] [11: Nam JM. Testing a genetic equilibrium across strata. Ann Hum Genet. 1997; 61: 163-170]. Nonetheless, the resulting test for HWE across strata, derived by Olson [9, 10], is also based on a weighted sum of $h_k$ values. Nam [11] derived score statistics based on likelihoods that depend on the $\theta$ parameter and showed that his combined tests for HWE had properties similar to the test proposed by Olson. Hence, the variety of proposed tests for HWE that combine information across strata are all based on the stratum-specific $h_k$ values, with merely slightly different ways of weighting the contribution from each stratum. Simulations (not shown) suggest that the type I error rate and power of the different methods are similar, and our simple $T$ statistic would provide a powerful test for HWE when the sample sizes of the strata are not too small and departures from HWE are all in the same direction.

Because $h_k$ values summed over strata can cancel each other when they differ in sign, Troendle and Yu [12: Troendle JF, Yu KF. A note on testing the Hardy-Weinberg law across strata. Ann Hum Genet. 1994; 58: 397-402] proposed a statistic that is analogous to summing $h_k^2/\mathrm{Var}(h_k)$ over strata. The resulting statistic has a $\chi^2$ distribution with $K$ df. Although this method can have greater power when the $h_k$ values differ in sign, it is likely to have weak power in general, because of the many df.

An alternative approach is to test whether the $\theta_k$ values significantly differ over the strata, because significant heterogeneity implies departure from HWE. To compute this type of statistic under the null hypothesis of homogeneity (yet allowing departure from HWE), one needs to estimate a common $\theta$ parameter. Using estimating equations, Olson and Foley [10] derived a consistent estimator for $\theta$, whereas Nam [11] used an iterative maximum-likelihood method. Both of these approaches, however, can run into undefined parameter estimates: Olson's $\theta$ is undefined when there are no AA homozygotes across all strata (or no BB homozygotes), and similar problems occur for the maximum-likelihood estimator. In these cases, the test for homogeneity breaks down.

Because of the above complications with large-sample statistical tests for HWE across strata or for homogeneity of departures from HWE across strata, exact methods are appealing. Instead of summing a measure of departure from HWE, an exact test evaluates the combined evidence over strata by considering the probability of genotype configurations when the null hypothesis is true; extreme departures in different directions are rare under HWE, giving a small P value, yet a sum of measures of departure from HWE could, in fact, completely miss this situation. Exact tests also avoid numerical problems (e.g., division by zero), and they provide appropriate control of the type I error rates. To date, only Olson and Foley [10] considered exact methods. However, because their methods allowed for an arbitrary number of alleles, exact computations were not feasible. Rather, they needed to rely on Markov chain Monte Carlo methods. Because of the broad use of SNPs, we present efficient computational methods to compute exact tests both for HWE over strata and for homogeneity of departures from HWE over strata. We demonstrate our methods by applying them to the SNP genotype data from Perlegen [2] and HapMap [3]. The results illustrate the advantages of our exact stratified test for HWE over the minimum exact-test P value or other approximate methods. Furthermore, simulations confirm our empirical findings.

To derive an exact stratified test for HWE, we use well-known methods for computing the probability of a sample of genotypes when HWE is true. In this case, Fisher [4] showed that the allele counts $N_A$ and $N_B$ are sufficient statistics and that the probability of genotype counts, conditional on allele counts, can be expressed as

$$P(N_{AA}, N_{AB}, N_{BB} \mid N_A, N_B) = \frac{N!\, N_A!\, N_B!\, 2^{N_{AB}}}{N_{AA}!\, N_{AB}!\, N_{BB}!\, (2N)!}. \quad (1)$$

Because $N$, $N_A$, and $N_B$ are all fixed, the only random quantity is the number of heterozygotes, so expression (1) can be written as

$$P(N_{AB} = x) = \frac{N!\, N_A!\, N_B!}{(2N)!} \times \frac{2^x}{[(N_A - x)/2]!\; x!\; [N - (N_A + x)/2]!}. \quad (2)$$

An exact P value is computed by summing values from equation (2) over all values of $x$ that generate probabilities equal to or smaller than the probability of the observed number of heterozygotes, $N_{AB}$. As emphasized by Wigginton et al. [13: Wigginton JE, Cutler DJ, Abecasis GR. A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet. 2005; 76: 887-893], when $N_A$ is odd, the possible values of $x$ are 1, 3, …, $N_A$, and, when $N_A$ is even, the possible values of $x$ are 0, 2, …, $N_A$. Furthermore, equation (2) can be computed efficiently by recursion:

$$P(N_{AB} = x + 2) = P(N_{AB} = x)\, \frac{4\,[(N_A - x)/2]\,[N - (N_A + x)/2]}{(x + 2)(x + 1)}.$$

Now, to extend these ideas to strata, let $\tilde{N}_{AB}$ be the vector of observed counts of AB heterozygotes for the different strata, and let $\tilde{x}$ be a vector containing a configuration of possible values of heterozygotes for the different strata.
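As an illustration of equation (2) and its recursion within a single stratum, the following Python sketch builds the distribution of heterozygote counts from the smallest admissible $x$ upward and sums the probabilities that do not exceed that of the observed count. The function name and the floating-point tolerance are assumptions made for this example. Starting the recursion at the smallest $x$ is adequate for strata of tens of subjects, as in the data discussed here, but could overflow for very large samples.

```python
def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact HWE P value for one stratum from genotype counts (N_AA, N_AB, N_BB)."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab          # copies of allele A
    n_b = 2 * n_bb + n_ab          # copies of allele B
    if n_a > n_b:                  # the formula is symmetric; treat A as the rarer allele
        n_a, n_b = n_b, n_a

    # Unnormalised weights via P(x + 2) = P(x) * 4 * N_AA * N_BB / ((x + 2) * (x + 1)).
    x = n_a % 2                    # heterozygote count shares the parity of n_a
    weights = {x: 1.0}
    while x + 2 <= n_a:
        hom_rare = (n_a - x) // 2
        hom_common = n - (n_a + x) // 2
        weights[x + 2] = weights[x] * 4.0 * hom_rare * hom_common / ((x + 2) * (x + 1))
        x += 2

    total = sum(weights.values())
    probs = {k: w / total for k, w in weights.items()}
    p_obs = probs[n_ab]
    # Sum every outcome that is no more probable than the observed one.
    p_value = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, p_value)
```

With two subjects and genotypes AA and BB, `hwe_exact_pvalue(1, 0, 1)` gives 1/3: the only other configuration (two heterozygotes) has probability 2/3 and so does not enter the sum.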
The probability of $\tilde{x}$ under HWE is the product of expression (2) over the $K$ strata, $P(\tilde{x}) = \prod_{k=1}^{K} P(x_k)$. This allows us to compute an exact stratified P value by

$$P\ \text{value} = \sum_{\tilde{x} \in S} P(\tilde{x}),$$

where $S$ is the set of $\tilde{x}$ configurations that have probabilities equal to or less than that of the observed configuration: $S = \{\tilde{x} : P(\tilde{x}) \le P(\tilde{N}_{AB})\}$. If $m_k$ is the number of possible values of $x$ in stratum $k$, then the number of possible $\tilde{x}$ configurations is $\prod_k m_k$, which can be a very large number.

A naive approach to compute the exact P value is to evaluate all possible configurations, which is inefficient. Rather, we first compute $P(x)$ for all possible values of $x$ within each stratum. This avoids having to recompute $P(x)$ many times. Using recursion makes this fast, and using log-probabilities avoids numerical imprecision. Critically, we order the log-probabilities so that the summation for the P value can stop as soon as possible. To do this, we sort the log-probabilities into increasing order within each stratum, using quick sort. Then, we begin to evaluate different possible $\tilde{x}$ configurations by summing, across strata, the log-probabilities for values of $x$ in the $\tilde{x}$ vector. If this sum is less than or equal to the log-probability of the observed data, then we exponentiate it and add it to the running sum for the P value. The prior sorting of the log-probabilities within each stratum allows us to skip over computations that would generate $P(\tilde{x}) > P(\tilde{N}_{AB})$ and, hence, would not contribute to the sum for the P value. This is explained in greater detail by use of an example in appendix A. An advantage of our approach is that, when small P values are used as a quality-control (QC) filter (e.g., P<.001 as used elsewhere [2, 3]), we can stop computations early when the computed P value exceeds a specified threshold.

Olson and Foley [10] derived a test of homogeneity of disequilibrium across strata, on the basis of the assumption that $\theta = P_{AB}^2/(4P_{AA}P_{BB})$ is constant over strata. They showed that the sufficient statistics for this test are the allele counts within strata, as well as the total (across strata) genotype counts. As in the exact test for HWE, when conditioning on the sufficient statistics, we need to focus on only $\tilde{x}$ configurations of counts of heterozygotes across the strata. However, by additionally conditioning on the total genotype counts, we require the sum of the elements of the $\tilde{x}$ vector to equal the observed total number of heterozygotes. This leads to fewer possible $\tilde{x}$ vectors than those possible when testing HWE over strata. Under the assumption of constant $\theta$, the probability of an $\tilde{x}$ configuration is

$$P(\tilde{x}) = \frac{Q(\tilde{x})}{\sum_{\tilde{x}^* \in G} Q(\tilde{x}^*)}, \quad (3)$$

where $G$ is the set of possible $\tilde{x}$ configurations (each summing to the total number of observed heterozygotes) and

$$Q(\tilde{x}) = \prod_k \frac{N_k!}{N_{AA,k}!\, x_k!\, N_{BB,k}!} = \prod_k \frac{N_k!}{[(N_{A,k} - x_k)/2]!\; x_k!\; [(N_{B,k} - x_k)/2]!}.$$

An exact P value is the sum of the configuration probabilities that are equal to or less than that of the observed configuration (denoted $\tilde{x}_{obs}$),

$$P\ \text{value} = \sum_{\tilde{x} \in S} P(\tilde{x}),$$

where $S = \{\tilde{x} : P(\tilde{x}) \le P(\tilde{x}_{obs}),\ \tilde{x} \in G\}$. To efficiently compute the exact P value, we first enumerate possible $x_k$ values for each stratum (for now, ignoring the constraint that the $x_k$ values must sum to the total number of observed heterozygotes) and use recursion to determine the contribution of the $x_k$ values to $Q(\tilde{x})$.
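Returning to the exact stratified test of HWE (the product of expression (2) over strata), the sorting and early-stopping idea can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation from appendix A: it uses `math.lgamma` rather than the recursion to obtain per-stratum log-probabilities, and it prunes with a simple lower bound on the remaining strata. All names are hypothetical.

```python
from math import exp, lgamma, log

def log_probs(n_a, n):
    """Log P(N_AB = x) under HWE for every admissible x in one stratum."""
    n_b = 2 * n - n_a
    xs = list(range(n_a % 2, min(n_a, n_b) + 1, 2))
    log_w = [x * log(2) - lgamma((n_a - x) // 2 + 1) - lgamma(x + 1)
             - lgamma((n_b - x) // 2 + 1) for x in xs]
    top = max(log_w)
    log_total = top + log(sum(exp(v - top) for v in log_w))
    return {x: v - log_total for x, v in zip(xs, log_w)}

def stratified_hwe_pvalue(strata, tol=1e-9):
    """Exact stratified P value for a list of (N_AA, N_AB, N_BB) tuples."""
    tables = [log_probs(2 * aa + ab, aa + ab + bb) for aa, ab, bb in strata]
    log_obs = sum(t[ab] for t, (_, ab, _) in zip(tables, strata))
    # Increasing order within each stratum lets each loop stop at the first failure.
    ordered = [sorted(t.values()) for t in tables]
    # tail_min[k] = smallest log-probability attainable from strata k, k+1, ...
    tail_min = [0.0] * (len(ordered) + 1)
    for k in reversed(range(len(ordered))):
        tail_min[k] = tail_min[k + 1] + ordered[k][0]

    def visit(k, partial):
        if k == len(ordered):
            return exp(partial)          # partial <= log_obs is guaranteed here
        total = 0.0
        for lp in ordered[k]:
            if partial + lp + tail_min[k + 1] > log_obs + tol:
                break                    # every later value in this stratum is larger
            total += visit(k + 1, partial + lp)
        return total

    return min(1.0, visit(0, 0.0))
```

With a single stratum the result agrees with the ordinary exact test; for two strata of two subjects each, `stratified_hwe_pvalue([(1, 0, 1), (1, 0, 1)])` gives 1/9, the probability of the least likely joint configuration.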
To see how the recursion arises, let $q(x_k)$ denote the ratio of factorials for stratum $k$ (i.e., $Q(\tilde{x}) = \prod_k q(x_k)$). It is easy to verify that

$$q(x_k + 2) = q(x_k)\, \frac{[(N_{A,k} - x_k)/2]\,[(N_{B,k} - x_k)/2]}{(x_k + 2)(x_k + 1)}.$$

By precomputing these $q(x_k)$ values, we merely need to look up their values as we determine $Q(\tilde{x})$ for different $\tilde{x}$ configurations. Some differences between our method to compute the P value for homogeneity versus our method to compute the exact P value for HWE combined over strata are that (1) we need to consider all possible configurations, because the sum of $Q(\tilde{x})$ over all configurations is used in the denominator of $P(\tilde{x})$ (expression [3]), and (2) the constraint that the elements of $\tilde{x}$ must sum to the total number of observed heterozygotes reduces the number of possible $\tilde{x}$ vectors. This is used to our advantage. Further details of this algorithm are explained by an example in appendix A.

We applied our exact methods to SNPs in the Perlegen [2] and HapMap [3] data sets. For the Perlegen data, we used 1,585,674 SNPs from all chromosomes. For the X chromosome, we used males and females for the pseudoautosomal regions and only females for other regions on X. The Perlegen data have a total of 71 subjects from three ethnic groups: 23 African Americans, 24 European Americans, and 24 Han Chinese. Furthermore, the Perlegen data were “cleaned” by a number of criteria, including exact tests for HWE within each of the ethnic groups. SNPs were given a poor quality score if the smallest P value across the three ethnic groups was <.001.
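The exact test of homogeneity based on expression (3) can be sketched in Python as below. It is a plain enumeration under the sum constraint, intended only to make the definition concrete; the function names are hypothetical, `math.lgamma` stands in for the precomputed recursion on $q(x_k)$, and no attempt is made to reproduce the optimizations of appendix A.

```python
from math import exp, lgamma

def log_q_table(n_a, n):
    """log q(x) = log( N! / (N_AA! x! N_BB!) ) for every admissible x in one stratum."""
    n_b = 2 * n - n_a
    return {x: lgamma(n + 1) - lgamma((n_a - x) // 2 + 1) - lgamma(x + 1)
               - lgamma((n_b - x) // 2 + 1)
            for x in range(n_a % 2, min(n_a, n_b) + 1, 2)}

def homogeneity_pvalue(strata, tol=1e-9):
    """Exact homogeneity P value for a list of (N_AA, N_AB, N_BB) tuples."""
    tables = [log_q_table(2 * aa + ab, aa + ab + bb) for aa, ab, bb in strata]
    total_het = sum(ab for _, ab, _ in strata)
    log_q_obs = sum(t[ab] for t, (_, ab, _) in zip(tables, strata))

    log_qs = []  # log Q of every configuration in G (heterozygotes sum to total_het)

    def visit(k, remaining, partial):
        if k == len(tables):
            if remaining == 0:
                log_qs.append(partial)
            return
        for x, lq in tables[k].items():
            if x <= remaining:
                visit(k + 1, remaining - x, partial + lq)

    visit(0, total_het, 0.0)
    top = max(log_qs)
    denom = sum(exp(v - top) for v in log_qs)
    numer = sum(exp(v - top) for v in log_qs if v <= log_q_obs + tol)
    return numer / denom
```

As a small check, take one stratum of two subjects (both heterozygous) and one of three subjects (genotypes AA, BB, BB). Only two configurations keep the total of two heterozygotes, with $Q$ values 3 (observed) and 6, so `homogeneity_pvalue([(0, 2, 0), (1, 0, 2)])` is 1/3.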
Hence, the Perlegen data are useful to evaluate whether combining information across strata detects significant departure from HWE that was missed by using the minimum P value.

To provide a more complete comparison of using the minimum P value versus the exact stratified P value, we also applied our exact methods to SNPs in the HapMap data, using only the autosomes. These data have a total of 210 independent subjects from four ethnic groups: 60 Yoruba from Ibadan, Nigeria; 60 U.S. residents with northern and western European ancestry (CEPH samples); 45 Han Chinese; and 45 Japanese. Note that the offspring of “trios” were not used in our tests of HWE. For the HapMap data, instead of using the cleaned data, we used the “redundant-unfiltered” genotype data. This allows us to evaluate our methods on data that did not have genotypes removed because of prior tests of HWE within strata. For these data, various QC flags were used to indicate reasons why SNPs failed the QC criteria, including an exact test for HWE within each of the strata; P<.0001 in any of the strata was flagged as a failure. For our analyses, we did not eliminate SNPs that failed for this reason. Rather, we eliminated SNPs that failed the QC criteria for any reason not indicated by a HWE failure. For the duplicate samples, we coded a genotype as “missing” if the duplicates did not agree and then removed the duplicates for analyses. This resulted in the examination of 3,798,286 SNPs.

To compare the results from different statistical tests, we compare the values of −log10(P value), denoted lgP, so that small P values give large values of lgP. The contrast of using our exact stratified P value versus the minimum exact P value for the three strata of the Perlegen data is illustrated in figure 1. This figure illustrates that a large number of SNPs have an exact stratified lgP>4 yet a minimum P value over strata with lgP 12.
In all of these cases, the strata had no heterozygotes and either one or two rare homozygotes. These results empirically emphasize the inadequacy of the normal distribution for Haldane’s stratified statistic when there are sparse data, leading to P values that are likely much too small.

[Figure 4. Perlegen data. Exact versus Haldane's stratified tests: plot of −log10(P value).]

The HapMap data provide an unbiased comparison of using the minimum exact-test P value over strata versus using the exact stratified test because the “uncleaned” data were available. Figure 5 illustrates that the lgP value for the exact stratified test tends to be larger, sometimes much larger, than that based on the minimum of exact test P values, implying a significant gain in power by using the exact stratified test. One should be cautious, however, when interpreting figure 5, because almost 3.8 million points are plotted, and so the density of points that represent acceptable SNPs (i.e., P>.0001) cannot be easily viewed. When this threshold was used, there were 2,095 SNPs significant by the minimum exact-test P value and not by the exact stratified test and 15,147 significant by the exact stratified test and not by the minimum P value over strata. This illustrates the greater sensitivity of the exact stratified test. Note that these results are for the redundant-unfiltered genotype data. To evaluate the quality of the cleaned data that most investigators use, we also applied our methods to the 3,751,020 cleaned autosomal SNPs. For these SNPs, there were 2,006 that are significant by our exact stratified test, suggesting that these SNPs have suspicious quality. Note that we did not correct for multiple testing when using the minimum P value over the four strata; had we done so, the minimum P value would increase, accentuating the greater power of the exact stratified test.
Like the Perlegen results displayed in figure 2, we found for the HapMap data that the lgP values for the exact stratified test tended to be larger than the lgP values for the homogeneity test, emphasizing that using the homogeneity test as a way to test for HWE is not likely to be useful (results not shown). Also, for the HapMap data, Haldane's and Olson's stratified tests gave similar results (results not shown). To demonstrate the need to use exact methods when there are sparse data (because of small strata sizes and rare alleles), we performed a limited set of simulations. For these, we evaluated the type I error rates of the exact stratified test, our version of Haldane's test, Olson's test, and the omnibus χ2 statistic proposed by Troendle and Yu.12Troendle JF Yu KF A note on testing the Hardy-Weinberg law across strata.Ann Hum Genet. 1994; 58: 397-402Crossref PubMed Scopus (0) Google Scholar For all simulations, we used 25 subjects per stratum and either 3 or 5 strata. A total of 10,000 replicates were used for each simulation. The type I error rates presented in table 1 illustrate that the exact test is slightly conservative when the rare-allele frequency is P=.05 but gives the correct type I error rate when P=.20. In contrast, the other statistics have inflated type I error rates for P=.05, with the Troendle and Yu statistic12Troendle JF Yu KF A note on testing the Hardy-Weinberg law across strata.Ann Hum Genet. 1994; 58: 397-402Crossref PubMed Scopus (0) Google Scholar having grossly inflated type I error rates (likely from multiple df and sparse counts). Nonetheless, the asymptotic statistics gave approximately correct type I error rates for P=.20.Table 1Simulation Type I Error RatesNominal Type I Error RateNo. of Strata, Rare- Allele Frequency, and StatisticaEach stratum had 25 subjects. 
| No. of strata | Rare-allele frequency | Statistic | Nominal P = .05 | Nominal P = .01 | Nominal P = .001 |
|---|---|---|---|---|---|
| 3 | .05 | Exact | .0243 | .0044 | .0004 |
| 3 | .05 | Haldane/Olson | .0679 | .0374 | .0197 |
| 3 | .05 | Troendle and Yu | .0681 | .0602 | .0277 |
| 3 | .20 | Exact | .0475 | .0119 | .0013 |
| 3 | .20 | Haldane/Olson | .0428 | .0088 | .0014 |
| 3 | .20 | Troendle and Yu | .0548 | .0177 | .0041 |
| 5 | .05 | Exact | .0317 | .0071 | .0009 |
| 5 | .05 | Haldane/Olson | .0622 | .0276 | .0140 |
| 5 | .05 | Troendle and Yu | .1046 | .0986 | .0362 |
| 5 | .20 | Exact | .0532 | .0110 | .0010 |
| 5 | .20 | Haldane/Olson | .0474 | .0089 | .0010 |
| 5 | .20 | Troendle and Yu | .0593 | .0193 | .0038 |

a. Each stratum had 25 subjects. Haldane and Olson statistics gave identical results.

Simulations for power were conducted only for a rare-allele frequency of .20, because all tests have the correct type I error rate for this situation, again restricted to 25 subjects per stratum. Results for power are presented in table 2, for when the departure from HWE is in the same direction and of the same magnitude across all strata. In this case, the Haldane and Olson statistics had the greatest power, as expected, because they were derived under the assumption of constant departure from HWE across strata. However, the decreases in power for the other tests were generally small. Furthermore, the power was approximately the same for the exact test and the $\chi^2$ statistic of Troendle and Yu (ref. 12: Troendle JF, Yu KF. A note on testing the Hardy-Weinberg law across strata. Ann Hum Genet. 1994;58:397-402). Results for power when the departure from HWE differed over strata are presented in table 3. In this case, the exact test and Troendle and Yu's statistic had similar power that was greater than that for Haldane's and Olson's tests.

Table 2. Simulation Power When Departure from HWE Is the Same for All Strata

| No. of strata | Fraction of maximum HWD (a) | Statistic | Nominal P = .05 | Nominal P = .01 | Nominal P = .001 |
|---|---|---|---|---|---|
| 3 | .2 | Exact | .3635 | .1669 | .0462 |
| 3 | .2 | Haldane/Olson | .4088 | .2295 | .0884 |
| 3 | .2 | Troendle and Yu | .3563 | .1866 | .0687 |
| 3 | .3 | Exact | .6507 | .4090 | .1719 |
| 3 | .3 | Haldane/Olson | .7005 | .5084 | .2797 |
| 3 | .3 | Troendle and Yu | .6381 | .4328 | .2187 |
| 3 | .5 | Exact | .9572 | .8778 | .6885 |
| 3 | .5 | Haldane/Olson | .9737 | .9287 | .8154 |
| 3 | .5 | Troendle and Yu | .9550 | .8891 | .7392 |
| 5 | .2 | Exact | .4928 | .2681 | .0921 |
| 5 | .2 | Haldane/Olson | .5763 | .3623 | .1682 |
| 5 | .2 | Troendle and Yu | .4658 | .2775 | .1194 |
| 5 | .3 | Exact | .8096 | .6085 | .3377 |
| 5 | .3 | Haldane/Olson | .8783 | .7331 | .5149 |
| 5 | .3 | Troendle and Yu | .7836 | .6066 | .3758 |
| 5 | .5 | Exact | .9961 | .9805 | .9232 |
| 5 | .5 | Haldane/Olson | .9984 | .9932 | .9720 |
| 5 | .5 | Troendle and Yu | .9943 | .9796 | .9353 |

a. HWD = departure from HWE, in terms of fraction ($f$) of maximum departure ($D_{\max} = p(1-p)$, where $p$ is rare-allele frequency), so that the genotype frequencies are $P_{AA} = p^2 + D$, $P_{AB} = 2p(1-p) - 2D$, and $P_{BB} = (1-p)^2 + D$, where $D = f D_{\max}$. Rare-allele frequency was 0.20, with 25 subjects per stratum, and Haldane and Olson statistics gave identical results.

Table 3. Simulation Power When Departure from HWE Differs over Strata

| No. of strata | Fraction of maximum HWD (a) | Statistic | Nominal P = .05 | Nominal P = .01 | Nominal P = .001 |
|---|---|---|---|---|---|
| 3 | .2 | Exact | .2443 | .0924 | .0205 |
| 3 | .2 | Haldane/Olson | .1930 | .0785 | .0206 |
| 3 | .2 | Troendle and Yu | .2492 | .1120 | .0355 |
| 3 | .3 | Exact | .4470 | .2250 | .0694 |
| 3 | .3 | Haldane/Olson | .3392 | .1712 | .0554 |
| 3 | .3 | Troendle and Yu | .4517 | .2589 | .1060 |
| 3 | .5 | Exact | .8514 | .6640 | .3902 |
| 3 | .5 | Haldane/Olson | .6824 | .4778 | .2490 |
| 3 | .5 | Troendle and Yu | .8589 | .7062 | .4755 |
| 5 | .2 | Exact | .2939 | .1232 | .0305 |
| 5 | .2 | Haldane/Olson | .2180 | .0924 | .0244 |
| 5 | .2 | Troendle and Yu | .2905 | .1425 | .0492 |
| 5 | .3 | Exact | .5433 | .3068 | .1100 |
| 5 | .3 | Haldane/Olson | .3899 | .2052 | .0725 |
| 5 | .3 | Troendle and Yu | .5422 | .3338 | .1515 |
| 5 | .5 | Exact | .9339 | .8131 | .5712 |
| 5 | .5 | Haldane/Olson | .7569 | .5644 | .3151 |
| 5 | .5 | Troendle and Yu | .9325 | .8350 | .6426 |

a. HWD = departure from HWE, in terms of fraction ($f$) of maximum departure ($D_{\max} = p(1-p)$, where $p$ is rare-allele frequency), so that the genotype frequencies are $P_{AA} = p^2 + D$, $P_{AB} = 2p(1-p) - 2D$, and $P_{BB} = (1-p)^2 + D$, where $D = \text{sign} \cdot f D_{\max}$. For three strata, sign is positive for strata 1 and 3 and negative for stratum 2. For five strata, sign is positive for strata 3–5 and negative for strata 1–2. Rare-allele frequency was 0.20, with 25 subjects per stratum, and Haldane and Olson statistics gave identical results.

The time to compute the exact stratified test depends on the number of strata, $K$, and the number of rare alleles, $N_{A,k}$, within each stratum; the larger the values of $K$ and $N_{A,k}$, the more time the tests require for computation. To evaluate the practical time limits for computing the exact stratified test, we varied the number of strata from 2 to 5 and the sample size per stratum from $N_k = 20$ to 100 (constant over strata). For all situations, we evaluated the worst-case scenario by setting $N_{A,k}$ to its largest possible value, $N_{A,k} = N_k - 1$. All computations were performed on a Sun workstation (SUNW [Ultra-80]) with 4 GB RAM (random-access memory) and a 450-MHz processor. Timing results are given in seconds in table 4. For up to five strata and for sample sizes <50 per stratum, our software will compute within a few seconds (from ∼0.001 to 10 s).
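The dependence of computing time on the number of strata and the rare-allele counts can be made concrete by counting genotype configurations. The sketch below is not the algorithm implemented in hweStrata. It assumes a naive enumeration over all joint heterozygote counts, it reads the rare-allele count as a count among the 2N alleles of a stratum, and the helper names `het_counts` and `n_joint_configurations` are hypothetical.

```python
from math import prod


def het_counts(n_subjects: int, n_rare: int) -> range:
    """Feasible heterozygote counts in one stratum with fixed allele counts.

    Assumption: n_rare is the number of rare alleles among 2 * n_subjects alleles.
    The heterozygote count has the same parity as n_rare and cannot exceed
    either allele count.
    """
    n_common = 2 * n_subjects - n_rare
    return range(n_rare % 2, min(n_rare, n_common) + 1, 2)


def n_joint_configurations(sample_sizes, rare_counts) -> int:
    """Number of joint genotype configurations a naive enumeration would visit."""
    return prod(len(het_counts(n, a)) for n, a in zip(sample_sizes, rare_counts))


# Worst-case setting described in the text: rare-allele count = sample size - 1.
for n in (20, 50, 100):
    for k in (2, 3, 4, 5):
        total = n_joint_configurations([n] * k, [n - 1] * k)
        print(f"N per stratum = {n:3d}, strata = {k}: {total} configurations")
```

Under this naive view, each added stratum multiplies the work by the number of feasible heterozygote counts in that stratum, which is consistent with computing time rising steeply in both the number of strata and the sample size per stratum.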
In contrast to these worst-case scenarios, the average time per genotype for the HapMap data was 0.025 s. Although we illustrate computation times for up to 100 subjects per stratum, the exact test is likely not necessary for this situation, and the asymptotic tests should suffice, as long as the minor-allele frequencies are not too small.

Table 4. Timing of hweStrata

| Sample size per stratum | No. of strata | Time (s) |
|---|---|---|
| 20 | 2 | .00 |
| 20 | 3 | .01 |
| 20 | 4 | .01 |
| 20 | 5 | .11 |
| 50 | 2 | .00 |
| 50 | 3 | .02 |
| 50 | 4 | .44 |
| 50 | 5 | 9.78 |
| 100 | 2 | .01 |
| 100 | 3 | .11 |
| 100 | 4 | 6.15 |
| 100 | 5 | 304 |

Application of our exact stratified test for HWE to the Perlegen and HapMap data sets provides an empirical comparison of our new methods with the common approach that uses the minimum exact-test P value. Both data sets emphasize the greater power of the exact stratified test, which makes intuitive sense because it simultaneously evaluates HWE over all strata rather than independently testing each stratum. The exact stratified test also accounts for testing multiple strata; a Bonferroni correction would be needed to control the type I error rate when using the minimum P value over the strata. Although a number of approximate stratified tests of HWE have been proposed, our applications illustrate that our version of Haldane's test gives results nearly identical to those of Olson's stratified test for HWE, suggesting that the different ways of weighting the contribution from each stratum do not have major influences on the tests. By comparing results from Haldane's stratified test with those from the exact stratified test, we found that using the standard normal distribution to approximate the distribution of Haldane's test can give exceptionally small P values when there are sparse genotype counts, which suggests that the normal distribution is not adequate and that the exact stratified test is more reliable.
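The contrast between evaluating all strata jointly and taking a Bonferroni-corrected minimum P value can be illustrated with a brute-force sketch. It assumes the standard conditional distribution of the heterozygote count given the allele counts for a diallelic marker, treats strata as independent, and defines the P value as the total probability of joint configurations no more probable than the observed one. It is an illustration under those assumptions, not the optimized C implementation in hweStrata; the function names and the genotype counts in the demonstration are hypothetical.

```python
from itertools import product
from math import exp, lgamma, log, prod


def het_distribution(n_aa: int, n_ab: int, n_bb: int) -> dict:
    """Map each feasible heterozygote count to its probability under HWE,
    conditional on the observed allele counts in one stratum."""
    n = n_aa + n_ab + n_bb
    n_a = 2 * n_aa + n_ab
    n_b = 2 * n - n_a

    def log_prob(h: int) -> float:
        aa, bb = (n_a - h) // 2, (n_b - h) // 2
        return (lgamma(n + 1) - lgamma(aa + 1) - lgamma(h + 1) - lgamma(bb + 1)
                + h * log(2) + lgamma(n_a + 1) + lgamma(n_b + 1)
                - lgamma(2 * n + 1))

    return {h: exp(log_prob(h)) for h in range(n_a % 2, min(n_a, n_b) + 1, 2)}


def exact_p(strata, tol: float = 1e-7) -> float:
    """Exact P value over one or more strata, each given as (n_AA, n_AB, n_BB).

    Sums the probabilities of all joint configurations whose probability does
    not exceed that of the observed configuration (tol absorbs rounding ties).
    """
    dists = [het_distribution(*g) for g in strata]
    observed = prod(d[g[1]] for d, g in zip(dists, strata))
    cutoff = observed * (1 + tol)
    total = 0.0
    for probs in product(*(d.values() for d in dists)):
        joint = prod(probs)
        if joint <= cutoff:
            total += joint
    return min(1.0, total)


def bonferroni_min_p(strata) -> float:
    """Minimum per-stratum exact P value, Bonferroni-corrected for K strata."""
    return min(1.0, len(strata) * min(exact_p([g]) for g in strata))


# Made-up genotype counts (n_AA, n_AB, n_BB) for three strata of 25 subjects.
demo = [(3, 4, 18), (2, 5, 18), (4, 4, 17)]
print("exact stratified P:      ", exact_p(demo))
print("Bonferroni-corrected min:", bonferroni_min_p(demo))
```

Calling `exact_p` with a single stratum gives an ordinary per-stratum exact test, so the same function serves both sides of the comparison.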
Finally, both the Perlegen and HapMap data illustrated that the exact stratified test for HWE is much more powerful for detecting departures from HWE than the exact test for homogeneity, echoing the simulation results of Olson and Foley (ref. 10: Olson JM, Foley M. Testing for homogeneity of Hardy-Weinberg disequilibrium using data sampled from several populations. Biometrics. 1996;52:971-979).

Our simulation results confirmed that the exact stratified test provides the correct type I error rate, whereas the tests proposed by Haldane, Olson, and Troendle and Yu can have inflated type I error rates in the presence of sparse data (i.e., small strata and rare alleles). Furthermore, our simulations confirmed that the exact test provides the greatest power when the departure from HWE is in different directions across strata. Finally, the simulations suggest that when the strata sizes are not small and the frequency of the rare allele is at least 5%, the omnibus test of Troendle and Yu would be a good substitute for the exact stratified test.

Although our work was motivated by the relatively small ethnic groups within the Perlegen and HapMap data sets and by the potential for using these data sets for planning large-scale genome association studies in large "homogeneous" ethnic groups, our exact tests should prove useful for many genetic studies. Some examples are follow-up studies in multiple ethnic groups, each of which may not be large (note that association analyses would need to account for the different ethnic groups, such as by a Mantel-Haenszel stratified analysis), population genetic studies in multiple ethnic groups from a geographic region, or studies in which apparently homogeneous ethnic groups can be clustered into smaller ethnic subsets on the basis of many measured markers (ref. 14: Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet. 2006;38:904-909).

In conclusion, our simulations and the application of exact and approximate stratified tests for HWE to more than five million SNPs, with strata sizes ranging from 23 to 60 subjects, provide convincing evidence that the exact stratified test gives the most-robust and most-powerful results. Furthermore, efficient computational algorithms for SNP genotype data, which we developed in the C programming language, allow the exact stratified test to be computed within reasonable computing time for sample sizes on the order of the HapMap data (e.g., strata sizes ranging from 45 to 60 subjects over four strata). The C source code for our software, called "hweStrata," is available from our Web site.

This work was supported by U.S. Public Health Service, National Institutes of Health, contract grants GM065450 and GM61388 (The Pharmacogenetics Research Network). The constructive criticisms from two anonymous reviewers helped to improve the evaluation and presentation of the proposed exact stratified test.
Reference(s)