Common Variants in the Trichohyalin Gene Are Associated with Straight Hair in Europeans
2009; Elsevier BV; Volume: 85; Issue: 5; Language: English
10.1016/j.ajhg.2009.10.009
ISSN: 1537-6605
Authors: Sarah E. Medland, Dale R. Nyholt, Jodie N. Painter, Brian P. McEvoy, Allan F. McRae, Gu Zhu, Scott D. Gordon, Manuel A. R. Ferreira, Margaret J. Wright, Anjali K. Henders, Megan Campbell, David L. Duffy, Narelle K. Hansell, Stuart MacGregor, Wendy S. Slutske, Andrew C. Heath, Grant W. Montgomery, Nicholas G. Martin
Topic(s): RNA Research and Splicing
Abstract: Hair morphology is highly differentiated between populations and among people of European ancestry. Whereas hair morphology in East Asian populations has been studied extensively, relatively little is known about the genetics of this trait in Europeans. We performed a genome-wide association scan for hair morphology (straight, wavy, curly) in three Australian samples of European descent. All three samples showed evidence of association implicating the Trichohyalin gene (TCHH), which is expressed in the developing inner root sheath of the hair follicle, and explaining ∼6% of variance (p = 1.5 × 10^−31). These variants are at their highest frequency in Northern Europeans, paralleling the distribution of the straight-hair EDAR variant in Asian populations.

Hair morphology is one of the more conspicuous features of human variation and is particularly diverse among people of European ancestry, of whom around 45% have straight hair, 40% have wavy hair, and 15% have curly hair [1: Loussouarn G. Garcel A.L. Lozano I. Collaudin C. Porter C. Panhard S. Saint-Leger D. de La Mettrie R. Worldwide diversity of hair curliness: a new method of assessment. Int. J. Dermatol. 2007; 46: 2-6]. The degree of curliness is correlated with the distribution of hair keratins and cell type within the hair fiber, with the number of mesocortical cells decreasing as the curl intensifies [2: Thibaut S. Barbarat P. Leroy F. Bernard B.A. Human hair keratin network and curvature. Int. J. Dermatol. 2007; 46: 7-10]. Recent studies have identified Asian-specific alleles of the EDAR and FGFR2 genes that are associated with thick, straight hair, suggesting that these variants arose after the divergence of Asians and Europeans [3: Fujimoto A. Nishida N. Kimura R. Miyagawa T. Yuliwulandari R. Batubara L. Mustofa M.S. Samakkarn U. Settheetham-Ishida W. Ishida T. et al. FGFR2 is associated with hair thickness in Asian populations. J. Hum. Genet. 2009; 54: 461-465] [4: Fujimoto A. Kimura R. Ohashi J. Omi K. Yuliwulandari R. Batubara L. Mustofa M.S. Samakkarn U. Settheetham-Ishida W. Ishida T. et al. A scan for genetic determinants of human hair morphology: EDAR is associated with Asian hair thickness. Hum. Mol. Genet. 2008; 17: 835-843]. However, the genetic variants influencing hair curliness in Europeans (which has been shown to be highly heritable [5: Medland S.E. Zhu G. Martin N. Estimating the heritability of hair curliness in twins of European ancestry. TRHG. 2009; 12: 514-518]) are unknown.

We conducted genome-wide association analyses in three Australian family samples: one sample of adolescent twins and their siblings (1649 individuals from 837 families) and two samples of adult twin pairs (S1, 1945 individuals from 1210 families; S2, 1251 individuals from 845 families) ascertained from the general population (Table 1) [5]. In the adolescent sample, hair curliness was rated on a three-point scale (Straight, Wavy, or Curly). In the adult samples, participants reported whether their hair was Straight or Curly (S1) or Straight, Wavy, or Curly (S2). To account for the differences in phenotype collection and age across the samples, each sample was analyzed independently and meta-analysis was used for combining the three sets of results. These studies were performed with the approval of the appropriate ethics committees and the informed consent of all participants.

Table 1. Characteristics of the Three Cohorts
Characteristic | Cohort 1 | Cohort 2 | Cohort 3
Birth years | 1982–1996 | 1903–1964 | 1965–1972
Age mean (range) | 14 (9–24) | 47 (29–94) | 39 (30–42)
No. of females | 830 | 1483 | 697
No. of males | 819 | 462 | 554
Percentage of females with straight hair | 63.5% | 72.6% | 49.1%
Percentage of males with straight hair | 71.5% | 69.3% | 56.5%
Percentage of females with wavy hair | 27.9% | - | 35.7%
Percentage of males with wavy hair | 21.6% | - | 30.9%
Percentage of females with curly hair | 8.6% | 27.4% | 15.2%
Percentage of males with curly hair | 6.9% | 30.7% | 12.6%

The genotypic data used in the current study derive from a larger genotyping project involving seven waves of genotyping that drew participants from our 1988 and 1990 adult health and lifestyle studies [6: Hansell N.K. Agrawal A. Whitfield J.B. Morley K.I. Zhu G. Lind P.A. Pergadia M.L. Madden P.A. Todd R.D. Heath A.C. et al. Long-term stability and heritability of telephone interview measures of alcohol consumption and dependence. Twin Res. Hum. Genet. 2008; 11: 287-305] and adolescent melanoma risk factors study [7: Zhu G. Duffy D.L. Eldridge A. Grace M. Mayne C. O'Gorman L. Aitken J.F. Neale M.C. Hayward N.K. Green A.C. et al. A major quantitative-trait locus for mole density is linked to the familial melanoma gene CDKN2A: a maximum-likelihood combined linkage and association analysis in twins and their sibs. Am. J. Hum. Genet. 1999; 65: 483-492] [8: Zhu G. Montgomery G.W. James M.R. Trent J.M. Hayward N.K. Martin N.G. Duffy D.L. A genome-wide scan for naevus count: linkage to CDKN2A and to other chromosome regions. Eur. J. Hum. Genet. 2007; 15: 94-102]. The genotypic data from each project are described in Table 2. Standard quality-control filters were applied to the genotyping from each project, restricting the imputation to samples and SNPs with high data quality (Table 2). Individuals were screened for non-European ancestry, resulting in a sample of 16,140 genotyped individuals (Figure S2, available online). So that bias was not introduced to the imputed data, a set of SNPs common to the seven subsamples was used for imputation (n = 274,604). Imputation was undertaken with the use of the phased data from the HapMap samples of European ancestry (CEU; build 36, release 22) and MACH [9: Li Y. Abecasis G. Mach 1.0: Rapid haplotype reconstruction and missing genotype inference. Am. J. Hum. Genet. 2006; S79: 2290].

Table 2. Summary Information for the Seven Waves of Genotyping and the Quality Control Undertaken
Item | Project 1: ALCO CIDR | Project 2: ALCO deCODE | Project 3: MIG deCODE | Project 4: EUTWIN | Project 5: ADOL deCODE | Project 6: GL_CIDR | Project 7: WH deCODE
Primary phenotype | Alcohol use (population sample) | Alcohol use (population sample) | Migraine (case/control sample) | Lipid levels (population sample) | Melanoma risk factors (population sample) | Glaucoma (population sample) | Women's health (case/control sample)
Genotyping lab | CIDR | deCODE | deCODE | University of Helsinki | deCODE | CIDR | deCODE
Illumina SNP platform | HumanCNV370-Quadv3 | HumanCNV370-Quadv3 | Human610-Quad | Human 317K | Human610-Quad | Human610-Quad | Human610-Quad
No. of genotyped samples | 4241 | 2611 | 999 | 462 | 4391 | 657 | 2360
No. of genotyped SNPs | 343,955 | 344,962 | 592,385 | 318,210 | 592,392 | 589,296 | 562,193
BeadStudio GenCall score < 0.7 | 24,494 | 27,459 | 46,931 | NA (a) | 47,418 | 36,877 | 57,589
SNPs with call rate < 0.95 | 11,584 | 7537 | 8038 | 5021 | 8447 | 12,455 | 33,459
SNPs with HWE failure p < 10^−6 | 4318 | 1194 | 1221 | 672 | 841 | 15,474 | 1763
SNPs with MAF < 0.01 / only 1 observed allele | 7874 | 8976 | 33,347 | 264 | 33,347 | 28,607 | 24,509
No. of SNPs after QC | 323,093 | 321,267 | 530,922 | 312,937 | 529,379 | 531,042 | 518,948
Percentage of genotyped SNPs | 93.93% | 93.13% | 89.62% | 98.34% | 89.36% | 90.11% | 92.31%
For each project, DNA was extracted in accordance with standard protocols. Across projects, participants were genotyped on the Illumina 317K, 370K, or 610K SNP platforms, and genotypes were called with the Illumina BeadStudio software. After the quality control (QC) of the individual projects, the data from the seven waves of genotyping were integrated. As shown in Figure S1, a number of samples were duplicated among the various genotyping projects, allowing for cross-project QC. After integration of the data sets, the data were screened for missingness within individuals (>5%, taking into account the number of SNPs that were genotyped for each individual), pedigree and sex errors, and Mendelian errors (genotypes for all family members for a given SNP were removed upon detection of errors). After QC, in cases where one individual from a monozygotic twin pair had been genotyped, duplicate genotypes were assigned to the ungenotyped cotwin, resulting in a sample of 16,507 individuals. After screening for non-European ancestry (Figure S2), this resulted in a final sample of 16,140 individuals. HWE denotes Hardy-Weinberg equilibrium.
(a) GenCall data were not available for this sample.

So that we could take full advantage of the information available in the ordinal scale, the data were analyzed via a multifactorial threshold model that describes discrete traits as reflecting an underlying normal distribution of liability (or predisposition). Liability, which represents the sum of all the multifactorial effects, is assumed to reflect the combined additive effects of a large number of genes and environmental factors, each of small effect, and is characterized by phenotypic discontinuities that occur when the liability reaches a given threshold [10: Neale M.C. Cardon L.R. Methodology for Genetic Studies of Twins and Families. Kluwer Academic Publishers, Dordrecht, the Netherlands, 1992]. A total test of association was used, in which the dosage (MACH mldose) data for each SNP in turn were included within the threshold model, resulting in an additive test of association.
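The threshold model described above can be illustrated with a minimal sketch. It converts the ordered category proportions for Cohort 1 females in Table 1 into cutpoints on a standard normal liability scale, then shifts those cutpoints by an allelic effect times the allele dose. The function names and the effect size of 0.4 are hypothetical and chosen only for illustration, and treating the cohort-wide proportions as the zero-dose baseline is a simplifying assumption; the study itself fitted the model by maximum likelihood with covariates and family structure.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # liability is assumed to be standard normal


def thresholds_from_proportions(proportions):
    """Convert ordered category proportions into liability cutpoints (z units)."""
    cumulative = 0.0
    cuts = []
    for share in proportions[:-1]:  # the last category needs no upper cutpoint
        cumulative += share
        cuts.append(STD_NORMAL.inv_cdf(cumulative))
    return cuts


def category_probabilities(cuts, beta, dose):
    """Category probabilities after moving every cutpoint right by beta * dose."""
    shifted = [cut + beta * dose for cut in cuts]
    cdf = [0.0] + [STD_NORMAL.cdf(cut) for cut in shifted] + [1.0]
    return [upper - lower for lower, upper in zip(cdf, cdf[1:])]


# Cohort 1 females in Table 1: straight, wavy, curly.
cuts = thresholds_from_proportions([0.635, 0.279, 0.086])
ILLUSTRATIVE_BETA = 0.4  # hypothetical allelic effect, not an estimate from the study

for dose in (0, 1, 2):
    straight, wavy, curly = category_probabilities(cuts, ILLUSTRATIVE_BETA, dose)
    print(dose, round(straight, 3), round(wavy, 3), round(curly, 3))
```

Under this parameterization, each additional allele moves the cutpoints to the right, so the predicted share of straight hair rises with dose while the categories still sum to one.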
In addition, fixed effects of sex and age (both linear and quadratic effects) and age-by-sex interactions were included with the threshold models in all data analyses, such that the trait value for individual j from family i was parameterized as: $x_{ij} = \beta_{\mathrm{dose}} + \beta_{\mathrm{age}} + \beta_{\mathrm{age}^2} + \beta_{\mathrm{sex}} + \beta_{\mathrm{sex \times age}} + \mu$. The relatedness between the participants was explicitly modeled, accounting for the sex of relative pairs, and the phenotypic variances were constrained to unity. The association test statistic was computed by comparing the fit (minus twice log-likelihood) of the full model, which included the effect of the given SNP, to that of a nested model, in which the SNP effect had been dropped from the model. The difference in minus twice the log-likelihoods follows an asymptotic chi-square distribution with the degrees of freedom equal to the difference in estimated parameters between the two models (in this case one). The genomic inflation factors of the three samples ranged from 0.98 to 1.02 (Figure S3), indicating that the test correctly controlled for the relatedness of the participants and that potential technical and stratification artifacts had a negligible impact on the results.

Four highly correlated single-nucleotide polymorphisms (SNPs) (rs17646946, rs11803731, rs4845418, rs12130862; r^2 > 0.8, D′ > 0.95 within the HapMap CEU sample) on chromosome 1q21.3 (Figure 1B) reached our genome-wide significance threshold of 5 × 10^−8, which corrects for ∼1 million independent common variants in the genome [11: International HapMap Consortium. The International HapMap Project. Nature. 2003; 426: 789-796] (Table 3, Figure S4). The association was found in all three samples, suggesting that the effect is robust to the age differences between the samples and that the liability threshold model accounted for the differences in the phenotypic definition across the samples.
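The test statistics mentioned above reduce to a few lines of arithmetic. The sketch below shows a one-degree-of-freedom likelihood-ratio p value, a genomic inflation factor computed as the ratio of the observed to the expected median chi-square statistic, and the Bonferroni threshold for about one million independent variants. The function names are hypothetical, and the fit values passed in are whatever the model-fitting software reports; this is not the authors' code.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def lrt_p_value(minus2ll_nested, minus2ll_full):
    """P value for a 1-df likelihood-ratio test from two -2 log-likelihoods."""
    statistic = max(minus2ll_nested - minus2ll_full, 0.0)
    # For 1 df, P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


def genomic_inflation(chi2_statistics):
    """Observed median test statistic divided by its expected value under the null."""
    return median(chi2_statistics) / CHI2_1DF_MEDIAN


# Bonferroni correction for ~1 million independent common variants.
GENOME_WIDE_ALPHA = 0.05 / 1_000_000
```

An inflation factor close to 1, such as the 0.98 to 1.02 range reported here, indicates that the bulk of test statistics follow the null distribution.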
Meta-analysis of the three samples using an N (individuals)-weighted analysis in Metal (see Web Resources) resulted in highly significant p values for SNPs that are within this region and fall on a haplotype tagged by the directly genotyped SNP rs17646946 (p = 1.5 × 10^−31) (Figures 1A and 1B, Table 3). The association in the 1q21.3 region is centered on the Trichohyalin gene TCHH and accounted for ∼6% of the variance (Figure 1E, Table 3). Further analysis showed that association at the haplotype level did not offer additional predictive power. When the best SNPs were included as covariates in the analyses, no further evidence of association remained, indicating that they completely account for the signal at this locus (Figure S5). We found neither evidence for epistasis between these SNPs and any other SNP across the genome nor any heterogeneity between sexes (Figures S5 and S6). Finally, although analysis of copy-number variation (CNV) across the region in the adolescent sample found evidence for CNV in 18 individuals, it was too infrequent to explain the observed effect (Table S2).

Table S3 lists all SNPs with a combined p value of less than 1 × 10^−5 for the meta-analysis. A second region of suggestive association was observed on chromosome 4q21.21 (rs1268789; p = 6.58 × 10^−8), centered on the Fraser syndrome 1 gene FRAS1. We also examined evidence for association within the meta-analysis for the list of 170 candidate genes published by Fujimoto et al. [4] (Table S4).
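The N-weighted meta-analysis reported above can be sketched as a weighted sum of z-scores, with each sample weighted by the square root of its size. The sketch below assumes this standard sample-size-weighted scheme and recovers z-scores from the rounded per-sample p values and sample sizes for rs17646946 in Table 3; the function names are hypothetical, and because the inputs are rounded the result should only approximate the published combined p value.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_from_p(p_value, effect_sign):
    """Signed z-score for a two-sided p value and a direction of effect."""
    magnitude = -STD_NORMAL.inv_cdf(p_value / 2.0)
    return math.copysign(magnitude, effect_sign)


def sample_size_weighted_meta(results):
    """Combine (p_value, effect_sign, n) triples using sqrt(n) weights."""
    weighted_sum = 0.0
    total_n = 0
    for p_value, effect_sign, n in results:
        weighted_sum += math.sqrt(n) * z_from_p(p_value, effect_sign)
        total_n += n
    z = weighted_sum / math.sqrt(total_n)  # sum of squared weights equals total n
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p value
    return z, p


# rs17646946 in Table 3: per-sample p value, sign of beta, and sample size.
rs17646946 = [
    (1.24e-11, +1, 1649),  # adolescent sample
    (8.18e-13, +1, 1945),  # adult S1
    (7.91e-11, +1, 1251),  # adult S2
]
z, p = sample_size_weighted_meta(rs17646946)
print(f"Z = {z:.2f}, p = {p:.2e}")
```

Because all three samples show the same direction of effect, the combined evidence is much stronger than that of any single sample.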
In addition to the association observed in the TCHH region, strong association signals were observed in WNT10A (2q35; rs7349332; p = 1.36 × 10^−6), a gene associated with odonto-onycho-dermal dysplasia, which is characterized by dry hair and a broad range of ectodermal phenotypes [12: Adaimy L. Chouery E. Megarbane H. Mroueh S. Delague V. Nicolas E. Belguith H. de Mazancourt P. Megarbane A. Mutation in WNT10A is associated with an autosomal recessive ectodermal dysplasia: the odonto-onycho-dermal dysplasia. Am. J. Hum. Genet. 2007; 81: 821-828].

Table 3. Details of the Genome-wide Significant SNPs in the 1q21.3 Region across the Three Samples
Measure | rs17646946 | rs11803731 | rs4845418 | rs12130862
Position (bp) | 150,329,391 | 150,349,949 | 150,402,854 | 150,293,639
Genotyped or imputed | genotyped | imputed | imputed | imputed
Minor (reference) allele | A | T | C | T
Major allele | G | A | G | A
Minor allele frequency | 18.3% | 18.4% | 17.3% | 18.1%
Hardy-Weinberg equilibrium p value | 0.73 | 0.75 | 0.70 | 0.79
Rsq (imputation accuracy metric) | - | 0.92 | 0.96 | 0.98
Adolescent sample (n = 1649): allelic effect (β) (a) | 0.41 | 0.42 | 0.42 | 0.39
Adolescent sample (n = 1649): p value | 1.24 × 10^−11 | 1.76 × 10^−11 | 3.68 × 10^−11 | 7.59 × 10^−11
Adult S1 (n = 1945): allelic effect (β) (a) | 0.50 | 0.50 | 0.49 | 0.45
Adult S1 (n = 1945): p value | 8.18 × 10^−13 | 2.51 × 10^−12 | 2.22 × 10^−11 | 9.69 × 10^−11
Adult S2 (n = 1251): allelic effect (β) (a) | 0.44 | 0.44 | 0.43 | 0.42
Adult S2 (n = 1251): p value | 7.91 × 10^−11 | 1.37 × 10^−10 | 1.15 × 10^−9 | 8.16 × 10^−10
Cross-sample: explained variance (b), averaged across samples | 6.11% | 6.11% | 5.79% | 5.22%
Cross-sample: meta-analysis p value | 1.50 × 10^−31 | 3.18 × 10^−31 | 4.43 × 10^−29 | 3.12 × 10^−28
(a) The allelic β reported here should be interpreted with reference to the liability threshold model, which maps the data onto a standard normal distribution in which the cutpoints between categories are mapped against the z distribution. For example, a β of 0.41 indicates that the threshold dividing the straight from wavy categories is moved 0.41 z units to the right for each risk allele that an individual possesses.
(b) Calculated as [2p(1 − p)]β^2, in which p is the minor allele frequency and β is the additive allelic effect.

Of the four most-associated SNPs in the 1q21 region, we focused on rs11803731 (p = 3.2 × 10^−31), because this is a coding, nonsynonymous variant located in the third exon of TCHH (although more work is required for confirmation that this is the causal variant).
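The explained-variance footnote to Table 3, [2p(1 − p)]β^2, can be checked with a short worked example. The sketch below applies the formula to the rounded values for rs17646946 and averages across the three samples. Using the single tabulated minor allele frequency for every sample is an assumption, and the function name is hypothetical, so the result should land close to, but not necessarily exactly on, the tabulated figure.

```python
def explained_variance(maf, beta):
    """Variance explained by an additive SNP on the liability scale: 2p(1 - p) * beta^2."""
    return 2.0 * maf * (1.0 - maf) * beta ** 2


# rs17646946 in Table 3: minor allele frequency and per-sample allelic effects.
MAF = 0.183
BETAS = {"adolescent": 0.41, "adult S1": 0.50, "adult S2": 0.44}

per_sample = {name: explained_variance(MAF, beta) for name, beta in BETAS.items()}
average = sum(per_sample.values()) / len(per_sample)

for name, value in per_sample.items():
    print(f"{name}: {value:.2%}")
print(f"average: {average:.2%}")
```

Because the phenotypic variance is constrained to unity in the threshold model, the product can be read directly as a proportion of variance.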
The T allele at rs11803731 is the derived state and shows a striking geographic specificity to Europe and western-central Asia, reaching its highest frequency in Northern Europeans (Figure 1E), suggesting that the variant arose somewhere in this broad region. The modern frequency and distribution of de novo mutations will generally be determined by random genetic drift and migration. However, because rs11803731 influences a highly visible phenotype, it is an intuitively obvious target for natural or sexual selection [13: Chernova O. Evolutionary aspects of hair polymorphism. Biol. Bull. 2006; 33: 43-52]. The EDAR gene that controls hair thickness shows one of the most convincing signatures of positive selection in the East Asian genome [4]. rs11803731 is among the top 2.5% most-differentiated SNPs across the genome between Europeans and other HapMap II populations (gauged by the FST-based locus-specific branch length test [14: Shriver M.D. Kennedy G.C. Parra E.J. Lawson H.A. Sonpar V. Huang J. Akey J.M. Jones K.W. The genomic distribution of population substructure in four populations using 8,525 autosomal SNPs. Hum. Genomics. 2004; 1: 274-286]). Although previous analysis of extended haplotype homozygosity patterns in the Human Genome Diversity Project (HGDP) cohort also shows tentative evidence of genetic hitchhiking for the 1q21.3 region in some European populations [15: Pickrell J.K. Coop G. Novembre J. Kudaravalli S. Li J.Z. Absher D. Srinivasan B.S. Barsh G.S. Myers R.M. Feldman M.W. et al. Signals of recent positive selection in a worldwide sample of human populations. Genome Res. 2009; 19: 826-837] (Figure S7), the overall evidence of selection is ambiguous. However, the genetic signatures of positive selection at individual loci, as detected with current tests, will vary depending on the timing, the strength of the selective event, the genomic characteristics of the region, and the genetic architecture (number, frequency, and effect size of causal loci) of the phenotype in question [16: Sabeti P.C. Schaffner S.F. Fry B. Lohmueller J. Varilly P. Shamovsky O. Palma A. Mikkelsen T.S. Altshuler D. Lander E.S. Positive natural selection in the human lineage. Science. 2006; 312: 1614-1620] and thus may not be as obvious as those associated with other superficial traits. For example, the OCA2 gene region is a well-established target of selection [17: Voight B.F. Kudaravalli S. Wen X. Pritchard J.K. A map of recent positive selection in the human genome. PLoS Biol. 2006; 4: e72] and is known to influence human pigmentation traits, especially eye color [18: Duffy D.L. Montgomery G.W. Chen W. Zhao Z.Z. Le L. James M.R. Hayward N.K. Martin N.G. Sturm R.A. A three-single-nucleotide polymorphism haplotype in intron 1 of OCA2 explains most human eye-color variation. Am. J. Hum. Genet. 2007; 80: 241-252] [19: Sturm R.A. Duffy D.L. Zhao Z.Z. Leite F.P. Stark M.S. Hayward N.K. Martin N.G. Montgomery G.W. A single SNP in an evolutionary conserved region within intron 86 of the HERC2 gene determines human blue-brown eye color. Am. J. Hum. Genet. 2008; 82: 424-431].

The effect of the rs11803731 variant, replacement of a leucine by a methionine at position 790 of the TCHH protein, was predicted by in silico analyses with the programs PolyPhen [20: Ramensky V. Bork P. Sunyaev S. Human non-synonymous SNPs: server and survey. Nucleic Acids Res. 2002; 30: 3894-3900] and PMut [21: Ferrer-Costa C. Gelpí J. Zamakola L. Parraga I. de la Cruz X. Orozco M. PMUT: a web-based tool for the annotation of pathological mutations on proteins. Bioinformatics. 2005; 21: 3176-3178]. PolyPhen predicted the L790M change to be “benign,” whereas PMut predicted this change to be “neutral.” No results were returned for other prediction programs, including SIFT [22: Ng P.C. Henikoff S. SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003; 31: 3812-3814] (see Web Resources) (with either the SNP ID or the protein sequence used) and SNPs3D [23: Lee S.C. Kim I. Marekov L. O'Keefe E. Parry D. Steiner P. The structure of human trichohyalin. J. Biol. Chem. 1993; 268: 12164-12174] (see Web Resources), which contained no record of the rs11803731 SNP. Such predictions do not preclude a functional role, given that the effect of the SNP may be regulatory rather than structural, particularly as the amino acid replacement falls outside of α-helical regions [23]. Surface-exposed methionines can be oxidized by reactive oxygen species posttranslationally, which if left unrepaired can result in changes in protein structure and activity and can lead to altered protein regulation [24: Stadtman E.R. Moskovitz J. Levine R. Oxidation of methionine residues of proteins: biological consequences. Antioxid. Redox Signal. 2003; 5: 577-582]. Alternatively, rs11803731 may be associated with structural variation.
TCHH is a single-stranded α-helical protein with two or three highly repetitive regions, depending on the species (Figure S8). In sheep, the reference protein (CAA79165.1) is 1549 amino acids long, but variation in the number of complete and partial repeats in the C-terminal repeat region is seen across different strains [25: Fietz M.J. McLaughlan C.J. Campbell M.T. Rogers G.E. Analysis of the sheep trichohyalin gene: potential structural and calcium-binding roles of trichohyalin in the hair follicle. J. Cell Biol. 1993; 121: 855-865] [26: McLaren R.J. Rogers G.R. Davies K.P. Maddox J.F. Montgomery G.W. Linkage mapping of wool keratin and keratin-associated protein genes in sheep. Mamm. Genome. 1997; 8: 938-940]. In the human TCHH protein, repeat lengths range from approximately 6 to 30 amino acids, corresponding to 18 to 90 bp of DNA sequence. A number of SNPs and insertion or deletion polymorphisms are present, particularly in the first and third repeat regions (dbSNP; see Web Resources), and this gene might harbor allelic length variants, as seen in sheep and in another highly repetitive gene in the human chromosome 1q21 region, involucrin (IVL), where alleles differ across human populations in the number of both short tandem repeats and single base changes within repeated sequence [27: Djian P. Delhomme B. Green H. Origin of the polymorphism of the involucrin gene in Asians. Am. J. Hum. Genet. 1995; 56: 1367-1372] [28: Urquhart A. Gill P. Tandem-repeat internal mapping (TRIM) of the involucrin gene: repeat number and repeat-pattern polymorphism within a coding region in human populations. Am. J. Hum. Genet. 1993; 53: 279-286]. Such length variation has not been reported for TCHH, and it remains to be determined experimentally whether such variation is common, affects the structure or length of the protein, and/or is tagged by SNPs flanking the repeated regions.

In conclusion, we report a quantitative trait locus that affects hair form in Europeans. The association accounts for ∼6% of the variance in hair morphology in this group and falls within the Trichohyalin gene, which has a known role in hair formation. The patterns of allele frequencies are striking, with the highest frequency of these variants observed in Northern Europeans (Figure 1), paralleling the observation of the straight-hair EDAR variant in Asian populations (Figure S9).

We thank the twins and their families for their participation. We also thank Dixie Statham, Ann Eldridge, Marlene Grace, Kerrie McAloney (sample collection); Lisa Bowdler, Steven Crooks (DNA processing); David Smyth, Harry Beeby, and Daniel Park (IT support). Funding was provided by the Australian National Health and Medical Research Council (241944, 339462, 389927, 389875, 389891, 389892, 389938, 442915, 442981, 496739, 552485, 552498), the Australian Research Council (A7960034, A79906588, A79801419, DP0770096, DP0212016, DP0343921), the FP-5 GenomEUtwin Project (QLG2-CT-2002-01254), and the U.S. National Institutes of Health (NIH grants AA07535, AA10248, AA13320, AA13321, AA13326, AA14041, MH66206). A portion of the genotyping on which this study was based (Illumina 370K scans on 4300 individuals) was carried out at the Center for Inherited Disease Research, Baltimore (CIDR), through an access award to our late colleague Dr. Richard Todd (Psychiatry, Washington University School of Medicine, St Louis). Statistical analyses were carried out on the Genetic Cluster Computer, which is financially supported by the Netherlands Scientific Organization (NWO 480-05-003).
S.E.M., D.R.N., A.F.M., M.A.R.F., S.M., D.L.D., and G.W.M. are supported by the National Health and Medical Research Council (NHMRC) Fellowship Scheme.

Supplemental data: Document S1. Nine Figures and Four Tables.

Web Resources. The URLs for data presented herein are as follows:
1000 Genomes browser, http://www.1000genomes.org
dbSNP, http://www.ncbi.nlm.nih.gov/sites/entrez?db=snp
Genetic Cluster Computer, http://www.geneticcluster.org
MACH, http://www.sph.umich.edu/csg/abecasis/MACH/index.html
Metal, http://www.sph.umich.edu/csg/abecasis/Metal/index.html
Mx, http://www.vcu.edu/mx/
Pmut, http://mmb2.pcb.ub.es:8080/PMut/
PolyPhen, http://genetics.bwh.harvard.edu/pph/
SNPs3D, http://www.snps3d.org/
Sorting Intolerant from Tolerant (SIFT), http://sift.jcvi.org/