Genetics of Cardiovascular Diseases
2007; Lippincott Williams & Wilkins; Volume: 116; Issue: 15; Language: English
10.1161/circulationaha.106.661751
ISSN: 1524-4539
Authors: François Cambien, Laurence Tiret
Topic(s): Congenital heart defects research
Abstract

Genetics of Cardiovascular Diseases: From Single Mutations to the Whole Genome

François Cambien, MD, and Laurence Tiret, PhD
From INSERM UMR S 525 and Université Pierre et Marie Curie, Paris, France.
Originally published 9 Oct 2007. https://doi.org/10.1161/CIRCULATIONAHA.106.661751. Circulation. 2007;116:1714–1724

When the search for genes that predispose to cardiovascular diseases (CVD) started >20 years ago, it was anticipated that genetic polymorphisms might be analogous to the already known CVD risk factors and could be incorporated in a risk model such as the Framingham score1 to assess the risk of an individual and adopt preventive or therapeutic measures accordingly. However, despite years of intensive research, not a single genetic risk factor is used for risk assessment. The new strategy of genome-wide association (GWA) studies (for example, see http://www.wtccc.org.uk/) coupled with the availability of very large cohorts of patients2 is starting to reveal novel genetic factors that contribute to disease risk. Whether these variants will be clinically more useful than those that were derived from the study of candidate genes still needs to be demonstrated.
As time passes, the interest for genetic research on common CVD moves progressively from the direct expectation of risk stratification to the more fundamental understanding of disease origins and pathophysiology and their indirect diagnostic and therapeutic implications.

The objective of the present review is not to provide an exhaustive account of the numerous studies conducted on the genetics of CVD (eg, Arnett et al3), but to introduce a few basic notions required to understand the language of genetics and genomics (see Appendix) and illustrate with a limited number of examples the important insights provided by genetic research into the causes and mechanisms of CVD. We will also discuss the new GWA strategy and why this approach is likely to have a considerable impact on biomedicine and human disease understanding. Finally, we will try to explain the unsuccessful search for genetic markers of risk and why phenotypic biomarkers are likely to be clinically more useful.

The Basis of Genetic Variation

During the past decade, considerable progress has been achieved in the knowledge of the human genome and the characterization of its natural variability.4,5 The 20 000 to 25 000 protein coding genes that the human genome comprises represent only 30% of its sequence, the remainder being intergenic sequences that may contain important elements for the regulation of gene expression. In a typical human gene, 5% of the sequence is composed of coding exons that are in part translated into a protein, the remainder being covered by introns and regulatory regions located upstream (5′) and downstream (3′) of the coding sequence.

The most common type of human sequence variation consists of differences in individual base pairs termed single nucleotide polymorphisms (SNPs).
Other sequence variations comprise variable numbers of short or long repetitions of the same motif in tandem such as mini- and microsatellites,6 insertions or deletions of various lengths, and structural variants that affect large chromosomal regions.7 The vast majority of these sequence variations are located in nonfunctional regions of the genome and have no phenotypic impact; these are said to be neutral and are usually termed markers. However, when variations occur within coding sequences or regulatory regions, they may affect the protein sequence or the level of gene expression and translate into observable phenotypic effects.

Mendelian Versus Complex Inheritance

The spectrum of the genetic variants that predispose to CVD spans from rare, highly deleterious mutations responsible for Mendelian diseases to common polymorphisms with weak effects that, alone or in combination, modulate the risk of common diseases (the "common variant–weak effect–common disease" model). In this latter case, the term "complex disease" is often used to denote the fact that the pattern of familial aggregation differs from that of Mendelian inheritance of a single genetic defect.

From an epidemiological perspective, rare deleterious mutations (eg, those that cause familial hypercholesterolemia [FH]) confer an important risk of coronary heart disease (CHD) in mutation carriers, but their impact at the population level is low. Conversely, polymorphisms such as the apolipoprotein E (APOE) polymorphism, because they are frequent, may have a population impact that is far from negligible despite a weak effect at the individual level.
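The contrast between individual-level and population-level impact can be made concrete with the population attributable fraction. The sketch below is only an illustration: it uses Levin's formula, a standard epidemiological approximation that the review does not itself invoke, and the function name `attributable_fraction` is hypothetical. The APOE inputs are the figures the review quotes further on (E4 carriers about 20% of the population, about 40% higher CHD risk); the rare-mutation inputs are made-up values chosen only to show the contrast.

```python
def attributable_fraction(prevalence, relative_risk):
    """Levin's population attributable fraction.

    prevalence    -- proportion of the population carrying the variant
    relative_risk -- risk in carriers divided by risk in non-carriers
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie between 0 and 1")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Common variant, weak effect: APOE E4 figures quoted later in the review.
# (The review's comparison group is E3E3 homozygotes, so this is approximate.)
apoe_e4 = attributable_fraction(prevalence=0.20, relative_risk=1.4)

# Rare variant, strong effect: HYPOTHETICAL values, not taken from the review.
rare_mutation = attributable_fraction(prevalence=0.002, relative_risk=10.0)

print(f"APOE E4 attributable fraction:       {apoe_e4:.3f}")
print(f"Rare mutation attributable fraction: {rare_mutation:.3f}")
```

Under these assumptions the common, weak variant accounts for a larger share of cases than the rare, strong one, even though an individual carrier of the rare mutation is at far higher risk.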
This duality, which relates to the epidemiological notions of absolute, relative, and attributable risks, has important medical and public health implications but is less crucial when the interest lies in the identification of pathophysiological pathways.

Mutations Responsible for Mendelian Diseases

Mutations are usually identified by linkage analysis conducted in families with several affected members over different generations. Regions that potentially harbor a disease-causing gene are identified by testing of the cosegregation of the disease with genetic markers that tag specific regions of the genome. This strategy uses genetic markers (ie, panels of microsatellites or large sets of SNPs regularly spaced throughout the genome) and tests whether particular alleles are cotransmitted with the disease at a higher frequency than expected by chance. The success of linkage studies depends on the availability of phenotypically well-characterized families that include a sufficiently large number of informative affected individuals. When a disease-linked region of the genome has been successfully mapped by linkage analysis, finding the responsible gene and sequence variation is not trivial because the region may sometimes encompass tens or hundreds of genes. However, thanks to the improved annotation of the human genome sequence and the possible design of dense SNP arrays that target the regions of interest, the discovery of the responsible sequence mutation may be accelerated by linkage disequilibrium (LD) mapping.8 Although exceptions exist (eg, within isolated populations derived from a small number of founders), mutations that are associated with Mendelian diseases are rare (much <1%). Polymorphisms (>1%) are the focus of most contemporary genetic studies that target complex diseases.
Common SNPs are estimated to number >10 million in the human genome.9 Because polymorphisms have common alleles, numerous combinations of susceptibility alleles at several loci in a particular individual are possible, and some of them may affect the risk of CVD in a way that cannot be predicted from the separate effect of each variant. This is the major obstacle to the characterization of the genetics of complex traits and the rationale for the proposal to explore systems of genes rather than single genes.10,11 An important feature of polymorphisms, compared with rare deleterious mutations, is that they have an ancient origin. This explains why they are usually found in most human populations albeit often with different allele frequencies.

Because complex diseases do not follow a clear pattern of Mendelian inheritance, the strategy used to identify their genes of predisposition is usually not based on family studies but on a radically different approach called "genetic association" analysis. This approach relies on the existence of LD among physically close polymorphic sites in the genome, which implies that even if a polymorphism causally involved in the disease process is not directly observed, its association may be captured by a measured proxy polymorphism in LD with it. This is the basis of association studies that test the statistical association between genetic markers (the term "marker" denotes that no a priori causal role is assumed) and the disease in the population. The principle of genetic association studies is described in Figure 1. Initially, association studies focused on markers of candidate genes. Thanks to various initiatives, in particular the "HapMap" Project,13 increasingly dense genome-wide panels of common SNPs are now available that provide a powerful resource of markers (or tag SNPs) (Figure 1) for association studies.
Contemporary association studies often encompass sets of genes that encode components of biological systems, chromosome regions, or even the whole genome.

Figure 1. Principle of genetic association studies. The schema represents a genomic region that contains 12 SNPs. The 3 SNPs in black are genotyped directly (these are the tag SNPs). The 6 SNPs in gray are captured through linkage disequilibrium (LD) with the tag SNPs (as denoted by arrows). The 3 SNPs in white are neither genotyped nor captured by tag SNPs (uncaptured SNPs), and so disease association with any of these uncaptured SNPs would be missed. The gray star represents a SNP causally associated with disease. It has 2 alleles (S1 and S2) and is in LD with a tag SNP that has 2 alleles (M1 and M2). The LD is reflected by the fact that the 2 SNPs generate only 3 haplotypes instead of the 4 possible because the haplotype M1S2 is never observed. As a consequence of this LD, the association of the causal SNP with disease could be detected through an indirect association with the tag SNP. Adapted from Kruglyak,12 with permission from the publisher. Copyright © 2005, Nature Publishing Group.

The HapMap (Haplotype Map) Project

The primary goal of the International HapMap Project13 (http://www.hapmap.org/) was to create a public resource of common SNPs to capture most of the common human genome sequence variability. A second objective was to characterize the LD structure of the genome on the basis of the analysis of these SNPs. Because of the strong LD displayed by most regions of the genome, the combination of alleles at neighboring SNPs, called haplotypes, generates much less diversity than would be expected if they were uncorrelated.
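The Figure 1 situation, where a tag SNP (alleles M1/M2) and a causal SNP (alleles S1/S2) produce only 3 of the 4 possible haplotypes, can be expressed with the usual pairwise LD measures D, D′, and r². The following is a minimal sketch under stated assumptions: the haplotype frequencies are invented for illustration (the figure gives none), and the function name `ld_stats` is hypothetical.

```python
def ld_stats(p_m1s1, p_m1s2, p_m2s1, p_m2s2):
    """Return (D, D', r^2) for two biallelic SNPs from haplotype frequencies."""
    total = p_m1s1 + p_m1s2 + p_m2s1 + p_m2s2
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_m1 = p_m1s1 + p_m1s2  # frequency of allele M1 at the tag SNP
    p_s1 = p_m1s1 + p_m2s1  # frequency of allele S1 at the causal SNP

    # D: observed M1S1 frequency minus the frequency expected without LD.
    d = p_m1s1 - p_m1 * p_s1

    # D': D scaled by its largest possible magnitude given allele frequencies.
    if d >= 0:
        d_max = min(p_m1 * (1 - p_s1), (1 - p_m1) * p_s1)
    else:
        d_max = min(p_m1 * p_s1, (1 - p_m1) * (1 - p_s1))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two SNPs.
    denom = p_m1 * (1 - p_m1) * p_s1 * (1 - p_s1)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# ILLUSTRATIVE frequencies: haplotype M1S2 is never observed, as in Figure 1.
d, d_prime, r2 = ld_stats(p_m1s1=0.3, p_m1s2=0.0, p_m2s1=0.2, p_m2s2=0.5)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

With one haplotype absent, D′ reaches its maximum, while r² stays below 1 because the two SNPs have different allele frequencies. This is one reason a tag SNP may capture a causal SNP only partially.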
Recent studies have shown that the human genome is organized into a succession of distinct haplotype blocks that are ancestrally conserved.14–17 By resequencing the genome of 270 individuals from populations with African, Asian, and European ancestry, the HapMap Project has identified a set of SNPs that tag most of the common haplotypes in the human genome.18,19 This resource is used to search for polymorphisms associated with susceptibility to common diseases. For this purpose, genotyping arrays built with tag SNPs that encompass the whole genome or specific regions of interest are used; Figure 1 explains the principle.

Variants of "Intermediate" to Low Frequency Associated With Non-Mendelian Traits

Between the rare mutations responsible for Mendelian diseases and identified by family studies and the common polymorphisms targeted in current association studies, genetic variants that have a low frequency ( 3) probably exist in significant numbers. These variants are presently difficult to characterize because they do not generate evident familial patterns of disease that would make them identifiable by linkage studies, and they are missed in the current candidate gene or genome-wide sequencing strategies, which use a limited number of individuals for polymorphism screening. Rare functional variants are difficult to tag with common markers such as SNPs. Their systematic characterization is therefore out of the scope of studies that rely on LD such as GWA studies and will depend on the availability of new high-throughput sequencing technologies and large DNA banks of patients and controls. Rare variants associated with non-Mendelian traits may prove to be clinically important as they may confer a significant increase in risk and therefore constitute potential diagnostic and prognostic tools.
Interest for these variants has recently grown after the discovery of a number of them in the PCSK9 and ABCA1 genes.

Some Examples Related to Lipoproteins That Illustrate the Strength of Genetics to Unravel Mechanisms of Disease

Genetic studies have been instrumental in the understanding of the mechanisms involved in the regulation of plasma lipoproteins. We will focus on a few examples that illustrate the broad range of frequency and effects of gene variants that affect lipid metabolism.

The APOE Gene

The heritability of plasma low-density lipoprotein (LDL)–cholesterol (LDLc) has been estimated to be >50%.20 Epidemiological data show a striking parallel between plasma LDLc levels and the risk of CHD that is observed over a wide range of LDLc levels. This is why common polymorphisms that affect plasma LDLc may contribute to the risk of CHD. Such associations have been reported for several genes involved in lipid metabolism,21 the best example being APOE22; apoE plays an important role in the transport of lipids to tissue and cells. It is present in several lipoproteins and binds with high affinity to the LDL receptor. The APOE gene is polymorphic with 2 common nonsynonymous (amino acid changing) polymorphisms that generate 3 alleles (haplotypes) termed ε2, ε3, and ε4. These 3 alleles have variable frequencies across populations; ε3 is the most common and ε2 is the least common.23 The 3 corresponding encoded isoforms of the protein, E2, E3, and E4, have different functional properties; the E2 isoform is associated with lower, and the E4 isoform with higher, LDLc levels than E3.
In a recent metaanalysis, E4 carriers, who represent >20% of the population, were shown to have a 40% higher risk of CHD compared with E3E3 homozygotes, whereas the relationship between E2 and risk was less obvious.24 This is an example of genetic variation that has an important effect at the population level but has little relevance in the assessment of individual risk, at least when considered alone.

LDL Receptor Gene

Despite the relatively low frequency of FH compared with the common forms of hyperlipidemias, its study has provided important insights into the mechanisms of cholesterol metabolism and opened new perspectives for the prevention of CHD.25 Mutations in the coding sequence of the LDL receptor gene (LDLR) may considerably reduce or abolish the function of the LDL receptor and lead to an important rise of circulating LDLc, which in turn is associated with a commensurate increase in CHD risk. More than 700 different mutations of LDLR responsible for FH have been reported, some of them clustered in particular populations.26 Mutations affect the function of the receptor in various ways according to their type and their position in the protein sequence, and an important heterogeneity is present in clinical manifestations even in individuals who carry the same mutation as a consequence of differences in genetic and environmental backgrounds. Currently, the clinical diagnosis of FH is based on personal and family history, physical examination, and laboratory findings.
However, it has been suggested that the diagnosis of FH should be based on the identification of the genetic defect because statin therapy needs to be initiated in young carriers of a LDLR mutation even if their plasma LDLc is normal.27 However, no general agreement exists on this approach because the risk of CHD is about the same in phenotypically defined FH patients with or without mutation in the LDLR gene.28 The clinical benefit of the genetic diagnosis over the careful monitoring of LDLc levels, which is required anyway, is therefore questionable.

Familial Defective ApoB100

Familial defective apoB100 is another form of FH in which LDL binds defectively to the LDL receptor, which results in increased circulating LDLc levels and premature atherosclerosis.29 In contrast with the myriad of LDLR mutations that cause FH, the molecular defect responsible for familial defective apoB100 is a single mutation (R3500Q) in the gene encoding apoB, the main apolipoprotein in LDL that binds to the LDL receptor.30 Although the molecular diagnosis of familial defective apoB100 is theoretically easier than diagnosis of LDLR mutations that cause FH because a single variant is responsible for the trait, it is still the direct measurement of LDLc that appears the most appropriate to evaluate the risk of CHD and monitor the drug response in familial defective apoB100 patients.

Proprotein Convertase Subtilisin/Kexin 9

Recently, the careful study of families with several members affected by dominant forms of hypercholesterolemia despite absence of mutation in the LDLR gene and lack of the APOB3500 variant led to the mapping of a locus on chromosome 1p32 and the subsequent identification of missense mutations in the proprotein convertase subtilisin/kexin 9 (PCSK9) gene.31 PCSK9 was subsequently found to play a major role in the LDL/LDLR pathway, even if the exact mechanism of its influence remains incompletely understood.
Mice in which the PCSK9 gene has been inactivated exhibit an increased hepatic LDLR level, accelerated LDL clearance, and an important reduction of plasma LDLc.32 The PCSK9 mutations associated with FH are gain-of-function mutations (variants that confer an increased or extra functionality) that possibly affect the autocatalytic property of the pro-PCSK9 protein and promote the degradation of LDL receptors in hepatocytes. In addition to these extremely rare mutations, several more frequent nonsynonymous variants of the PCSK9 gene are associated with an impaired function of the protein that results in a reduction of plasma LDLc caused by accelerated LDL clearance. These variants of "intermediate" frequency have a substantial impact on plasma LDLc and CHD risk. For example, it has been estimated from the Atherosclerosis Risk in Communities (ARIC) study that 3% of African Americans were carriers of PCSK9 nonsynonymous variants, which were associated with a mean reduction of 30% of LDLc and a parallel significant reduction of CHD risk.33 This effect is comparable to the lowering effect of statins on LDLc. The PCSK9 gene also carries common noncoding polymorphisms that affect plasma LDLc; their effect at the individual level is much weaker than that of the coding variants of "intermediate" frequency just discussed,34 but their impact at the population level may be nonnegligible.

ATP-Binding Cassette Transporter 1

Another striking example of a Mendelian disorder that has contributed to the discovery of new processes involved in lipid metabolism and atherosclerosis is Tangier disease, a very rare recessive deficit of high-density lipoprotein–cholesterol (HDLc) metabolism caused by mutations in the ATP-binding cassette transporter 1 (ABCA1) gene.35–37 ABCA1 encodes a protein that regulates the cellular efflux of cholesterol and phospholipids to an apolipoprotein transporter.
Several mutations responsible for Tangier disease have been identified, all of which result in a complete or partial loss of function that leads to an accumulation of cellular cholesterol, low plasma HDLc levels, and increased risk of CHD. Apart from these very rare mutations, numerous coding variants of "intermediate" to low frequency in the ABCA1 gene may contribute to a significant fraction of the low HDLc levels in the population. In the Dallas Heart Study, 20 of 128 individuals in the bottom 5% of the HDLc distribution were carriers of nonsynonymous variants in the ABCA1 gene (unknown before as common SNPs) versus only 2 of 128 individuals in the top 5% of the HDL distribution.38 This finding was replicated in an independent study, and biochemical studies indicated that most of the variants associated with low HDLc were functionally important.38 The results that pertain to variants of "intermediate" or low frequency and the similar results for PCSK9 raise the interesting possibility that the contribution of rare variants to common traits may be more important than initially thought. 
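The Dallas Heart Study counts quoted above (20 of 128 carriers in the bottom 5% of the HDLc distribution versus 2 of 128 in the top 5%) can be summarized as an odds ratio with a one-sided Fisher exact test. This sketch is an illustration only and not the analysis reported by the study's authors. The function names are hypothetical, and only the Python standard library is used.

```python
from math import comb


def odds_ratio(a, b, c, d):
    """Odds ratio of the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_one_sided(a, b, c, d):
    """P(X >= a) under the hypergeometric null for the table [[a, b], [c, d]].

    Rows are the two groups; the first column counts variant carriers.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of carriers
    n = a + b + c + d     # total sample size
    upper = min(row1, col1)
    denom = comb(n, col1)
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / denom


# Counts taken from the text: carriers vs non-carriers of ABCA1 variants.
low_hdl_carriers, low_hdl_noncarriers = 20, 128 - 20
high_hdl_carriers, high_hdl_noncarriers = 2, 128 - 2

table = (low_hdl_carriers, low_hdl_noncarriers,
         high_hdl_carriers, high_hdl_noncarriers)
print(f"odds ratio      = {odds_ratio(*table):.2f}")
print(f"one-sided p     = {fisher_one_sided(*table):.2e}")
```

An exact test is used here because the carrier counts are small, particularly in the high-HDLc group, so a large-sample approximation would be less trustworthy.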
Common polymorphisms in the ABCA1 gene, including several nonsynonymous changes, have also been identified by systematically resequencing the gene in a limited number of individuals, and some of these polymorphisms have been shown to be associated with plasma HDLc or apoA1 in the population at large.39

Gene–Environment Interaction

The phenotypic expression of a genotype is dependent on a host of factors that include genetic background, the stage of development of the organism, age, gender, physiological and pathological conditions, the intake of food and drugs, and physical activity.40,41 The importance of these interactions considerably mitigates the concept of genetic determinism and provides perspectives for interference with the pejorative impact of genetics on disease susceptibility through modifiable factors.

From a research perspective, the presence of interaction complicates the detection of relevant associations that may be masked if they are not investigated in appropriate conditions. Except in the domain of pharmacogenetics, very little progress has been made in our understanding of gene–environment interaction. This is partly related to the difficulty of accurate measurement of most environmental factors (drug intake is a clear exception) as compared with genetic factors, and to the generally low power of studies to analyze combinations of factors in presence of interaction. Prospective studies might be more appropriate than case-control studies to investigate gene–environment interactions because they are less prone to biases as a result of modifications in environmental exposure induced by disease onset.42 Lack of appropriate accounting for gene–environment interactions may explain some of the failure to replicate genetic associations. Whether the recently initiated projects of huge biobanks such as the UK Biobank (http://www.ukbiobank.ac.uk/) will help resolve the pending issues of gene–gene and gene–environment interaction remains to be shown.
Actually, the pattern of interactions among factors that affect disease risk may be so complex that completely different approaches such as system genetics may be more helpful.10,11

Pharmacogenetics

The response to a drug is a phenotype that is under the influence of both genetic and nongenetic factors. For reasons that we shall attempt to explain below, many common genetic variants have a strong influence on drug efficacy and toxicity. This may obviously have an important impact on patient care. Pharmacogenetics has 3 major specificities when compared with most other areas of gene–environment studies: the direct medical relevance, the relative ease of exposure measurement, and the strength of the genetic effects.

CYP2D6 as an Example

For many drug-metabolizing enzymes, phenotyping tests were available prior to the possibility to directly assess their genetic variability at the molecular level. CYP2D6, with the extensive metabolizer and poor metabolizer inherited phenotypes, is an example. The poor metabolizer phenotype is associated with a considerable increase in the maximum concentration and area under the curve for a large number of drugs. These include the β-blockers metoprolol, timolol, and propranolol, for which the same dose leads to a greater lowering of heart rate and blood pressure in subjects with the poor metabolizer phenotype. The genetic variability of CYP2D6 is under the influence of a large number of genetic variants, some of them common, that may be simultaneously present in an individual and whose distribution may considerably vary across ethnic groups.43 Many metabolizing enzymes exhibit a similar pattern of genetic variability.44

Because CYP2D6 polymorphisms affect the metabolism of so many drugs, a tendency currently exists in the pharmaceutical industry to stop the development of therapeutic agents that are metabolized by CYP2D6.
One concern is that the generalization of this attitude to other drug-metabolizing enzymes might lead to the rejection of a large number of drugs that would be efficient and safe in subgroups of patients identified by genetic testing. A striking example is provided by the pharmacogenetics of warfarin, a drug that has been in use since the 1950s but would probably have been abandoned in early development nowadays as a consequence of its pharmacogenetic features.

The Pharmacogenetics of Warfarin

Warfarin and, more generally, vitamin K antagonists are widely used oral anticoagulants whose prescription is complicated by their narrow and highly variable therapeutic range. The dose requirement and risk of bleeding are influenced by intake of vitamin K, illness, age, gender, concurrent medication, body surface, and genetics. Besides the possible or demonstrated influence of a large number of genes,45 warfarin's effect is influenced by 2 major genes, one involved in its biotransformation (CYP2C9) and the other involved in its mechanism of action (VKORC1). The gene that encodes CYP2C9, the main metabolizing enzyme of warfarin, is highly polymorphic with many alleles that exhibit different functional properties and different frequencies across populations. In individuals of European descent, CYP2C9*1 is the most common allele, whereas CYP2C9*2 and CYP2C9*3 have a frequency of 12% and 8%, respectively, and a reduced activity relative to CYP2C9*1, which implies that carriers of the CYP2C9*2 or CYP2C9*3 form (≈40% of the Europeans) treated by warfarin would normally require a lower dose of the drug.46 In individuals of African and Asian origins, the CYP2C9*2 and CYP2C9*3 alleles are less frequent than in Europeans, but other functional alleles are found predominantly in these 2 ethnic groups that also affect the drug response. VKORC1, the other major gene that influences warfarin response, encodes the vitamin K cycle enzyme that controls regeneration of reduced vitamin K.
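The step from the CYP2C9 allele frequencies quoted above (12% for CYP2C9*2 and 8% for CYP2C9*3) to the proportion of carriers (≈40% of Europeans) can be checked with a short calculation. The sketch assumes Hardy-Weinberg proportions, an assumption the text does not state, and the function name `carrier_frequency` is hypothetical.

```python
def carrier_frequency(variant_allele_freqs):
    """Proportion of people carrying at least one variant allele.

    ASSUMES Hardy-Weinberg proportions: the two alleles of an individual
    are drawn independently from the population allele frequencies.
    """
    q = sum(variant_allele_freqs)  # combined frequency of all variant alleles
    if not 0.0 <= q <= 1.0:
        raise ValueError("combined allele frequency must lie between 0 and 1")
    # A non-carrier has two reference alleles, with probability (1 - q)^2.
    return 1.0 - (1.0 - q) ** 2


# Allele frequencies in Europeans as quoted in the text.
cyp2c9_carriers = carrier_frequency([0.12, 0.08])
print(f"carriers of CYP2C9*2 or CYP2C9*3: {cyp2c9_carriers:.0%}")
```

Under this assumption the result is a little over one third of individuals, the same order of magnitude as the rounded figure given in the text.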
Warfarin exerts its pharmacological effect by inhibition of VKORC1. The VKORC1 gene carries several common polymorphisms in its regulatory regions, such as the −1639G/A polymorphism (or similarly −1173T/C, which is in strong LD with it), which strongly correlate with warfarin response. A regression model that incorporates polymorphisms of the 2 genes as well as age, height, and gender has been proposed that accounts for >50% of the variability of warfarin response in Europeans and may be used as a dosing algorithm in this population.47

Large-Scale Genotyping of Drug-Metabolizing Enzymes

It is now possible to design genotyping devices that allow the simultaneous testing of a large number of variants that affect drug metabolism.48 Such tools may be very useful in the early development of drugs, and no major technological obstacle exists to improving them to a point where they will allow testing of most SNPs that affect drug metabolism. However, some limitations may reduce the clinical applicability of such tools. First, SNPs only represent a part of the genetic variation that affects drug metabolism. Variable number of tandem repeat or structural polymorphisms may not be easily tagged by SNPs. Second, a major gene effect (where a single variant dominates all other effects) cannot always be assumed, and it may be difficult to translate a complex pattern of variation that involves many different SNPs into an accurate prediction of drug response. Third, the previous point is further complicated by the possible influence of nongenetic cofactors.

Evolutionary Aspects of Drug-Metabolizing Enzymes

The conjunction of a strong effect and a high frequency that distinguishes the variants that affect drug metabolism from most of those that affect disease phenotypes is likely to have an evolutionary explanation. Many drug-metabolizing enzymes are highly genetically polymorphic within and across species. A good example is offered by the CYP2D gene family. In mice, 9 CYP2D genes exist.
In humans, only 1 CYP2D gene (CYP2D6) is present, and it is highly polymorphic. Because CYP2D enzymes have a high affinity for plant toxins, it has been proposed that they are essential for the survival of mice in their specific dietary environment. During hominization on the other hand, as a consequence of changes in food selection, the detoxifying potential of CYP2D enzymes became less essential for survival, and, with no selection pressure applied on CYP2D gene products, accumulation of mutations resulted in a high degree of polymorphism and ultimately in the degradation and loss of function of most CYP2D genes.43

Going Further With GWA Studies

Traditionally, genetic association studies focused on candidate genes selected on the basis of their biological function. After the recent availability of panels of SNPs that tag the whole genome and their incorporation into high-density genotyping microarrays, it is now possible to conduct GWA studies to investigate the genetic component of common diseases and quantitative traits without relying on any prior biological hypothesis. The rationale of this approach is that, if unknown disease-predisposing variants are present somewhere in the genome, they may be detected through their LD with tagging SNPs represented on the genotyping array (Figure 1). This approach offers a great potential for the discovery of new causes and mechanisms of disease.49 The GWA strategy h