Schizophrenia Genetics Comes of Age
2014; Cell Press; Volume: 83; Issue: 4; Language: English
10.1016/j.neuron.2014.08.015
ISSN: 1097-4199
Authors: Anna C. Need, David B. Goldstein
Topic(s): Genetics and Neurodevelopmental Disorders
Abstract: The new GWAS from the Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014) clearly validates a genetic approach to understanding schizophrenia. The challenge now is to track down the contributing genes and to develop appropriate models to elucidate the biological effects of the contributing variants.
Researchers from the Psychiatric Genomics Consortium (PGC) last week published one of the largest and most successful genome-wide association studies (GWAS) to date, including 36,989 patients with schizophrenia and 113,075 controls (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). With over 100 genomic regions implicated in the risk of schizophrenia, it seems a good time to assess how GWAS has changed psychiatric genetics and where the field should go from here.
First, it is important to put this study into context. The PGC scientists, as much as anyone, have helped to take a field plagued by inconsistency, and perhaps even a degree of charlatanism, and wrestle it into a position of statistical discipline and rigor.
Before GWAS, the standard practice for investigating schizophrenia genetics (as well as many other areas) was to pick a candidate gene (usually based on dopamine or glutamate pathways or linkage studies) and compare the frequency of genetic variants in cases and controls. Any difference with a p value < 0.05 would be reported as an association with schizophrenia, regardless of the number of polymorphisms that might have been tested. "Replications" were liberally defined, including studies of other variants in the same gene, different phenotypes, effects in the opposite direction, "trending" (i.e., > 0.05) p values, and associations seen in partitions of a data set. Beyond all of these obvious statistical transgressions, these studies often entirely ignored well-established causes of spurious associations such as population stratification. Labs would churn out separate papers for gene after gene with no correction for multiple testing, and, on top of all of that, there was a publication bias against negative findings. This practice resulted in many hundreds of reported variant associations with schizophrenia and scores of falsely implicated genes.
In contrast, those involved in GWAS established clear standards (McCarthy et al., 2008) that helped eliminate false-positive claims. This work has allowed schizophrenia researchers to clear the decks of the false-positive debris and concentrate on real associations. In this latest study, there are over 100 genomic regions within which we know there are definitive genetic risk factors for schizophrenia. Critically, this work gives us confidence to commit ourselves to a genetic approach to schizophrenia etiology, and it lends support to the idea of studying schizophrenia (including schizoaffective disorder) as a diagnosis.
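The scale of the multiple-testing problem in the candidate-gene era can be made concrete with a small calculation. The sketch below is illustrative only: it assumes that the tests are independent, the function names are hypothetical, and the figure of one million tests is a commonly used approximation for common variants, not a number taken from the PGC paper.

```python
# Illustrative sketch: how uncorrected testing at p < 0.05 inflates false positives.
# Assumption: all tests are independent and every null hypothesis is true.

def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n independent null tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p value cut-off that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


# Chance of at least one spurious "association" when 1, 10, or 100 variants
# are each tested at p < 0.05 without correction.
for n in (1, 10, 100):
    print(n, round(familywise_error_rate(0.05, n), 3))

# Assumption: roughly one million independent common-variant tests genome-wide.
print(bonferroni_threshold(0.05, 1_000_000))
```

Under these assumptions, testing even ten polymorphisms at p < 0.05 gives roughly a 40% chance of at least one false positive, which is why a far stricter per-test threshold is needed once the whole genome is scanned.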
That support for the diagnosis is itself important, as many have maintained that schizophrenia is too diverse to be studied as a single entity (Jablensky, 2006) and that it would be better to study endophenotypes such as working memory or fMRI patterns, or to separate patients based on symptoms or drug response. That alternative faces two challenges: gathering enough samples of any particular type, or with any particular set of measures, to perform a reasonably powered study and, more fundamentally, establishing whether the traits to be studied really are biologically related to schizophrenia. The recent PGC paper demonstrated that genetic results did not differ between groups that recruited samples based solely on clinician diagnosis and groups that performed lengthy research-based diagnostic interviews. This suggests that a diagnosis of schizophrenia is a perfectly good phenotype for genetic study. Indeed, the "hockey-stick model," in which very few real associations are found until a particular sample size is exceeded (Figure 1), is in keeping with GWAS of other complex traits such as hypertension (Newton-Cheh et al., 2009). For these reasons, and more, the work of the PGC represents a clear turning point in the history of schizophrenia genetics.
To understand what was discovered, it is essential to appreciate that GWAS typically identifies genomic regions, not precise risk factors. The vast majority of the 108 identified loci reflect the presence of real risk factors, presumably near the associated variants. Exactly which variants, and how many, are responsible for the detected risk, however, remains unknown.
The reason for this is that GWAS depends on "indirect association," leveraging association among variants in the genome to identify genome regions carrying inherited risk factors. The authors estimate that the 108 loci collectively "implicate" a total of 350 genes. If we include the extended MHC region (which was accepted as one large associated locus), the number of implicated protein-coding genes would exceed 600.
Analyzing these "implicated genes," the authors note some interesting findings. First, and to everyone's relief, the 108 loci include DRD2, the gene that codes for the dopamine D2 receptor, a key target of all existing antipsychotic drugs. Until now, there has been very little overlap between genes we hypothesized to be involved in schizophrenia based on the biology of antipsychotics and the genetic hits from GWAS and other genome-wide studies. While GWAS, as noted, only implicate regions, not genes, it seems more than a little unlikely that DRD2 is in one of the implicated regions by chance. Second, the authors note the presence of several genes encoding calcium channel subunits and proteins involved in synaptic plasticity within their associated regions, hinting at a convergence between their GWAS and studies of de novo mutations in patients with schizophrenia (Fromer et al., 2014; Purcell et al., 2014).
Following this success in identifying what we consider to be real association signals, the field continues to face significant challenges in translating these GWAS findings into novel insights into schizophrenia pathophysiology and, critically, into new validated drug targets. Perhaps most fundamental in this challenge is the question of exactly which genes are now implicated. The authors have focused attention on a set of 350 protein-coding genes that contain variants in linkage disequilibrium (LD) with the signals of risk that have been detected, or over 600 if the extended MHC region is included (as it should be). The first complication is that this set of genes is, in fact, arbitrary. To pinpoint this set of 350 genes, the authors identified the region enclosing all SNPs that are strongly associated with the index SNP, where "strongly associated" was defined as having an r² > 0.6. Under some assumptions (in particular, if the causal variants are assumed to be represented), this is well motivated. Thus, if we assume that the variants responsible for the association are usually common, this may capture most (though certainly not all) of the actual causal variants. On the other hand, this cut-off would often miss rarer variants that cause or contribute to the association signal. How commonly rare variants underlie GWAS signals remains a point of significant contention in the GWAS field (Anderson et al., 2011; Goldstein, 2011; Wray et al., 2011). Whatever the reality of the frequencies of the causal variants, it is clear that the association cut-off used would have a strong impact on the number of genes "implicated."
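The dependence of the "implicated" gene set on the r² cut-off can be sketched with toy data. The example below is a minimal illustration, not the authors' pipeline: it assumes r² is estimated as the squared Pearson correlation between genotype dosages (0, 1, or 2 copies of an allele), and the positions, genotypes, gene names, and function names are all hypothetical.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic SNP: r^2 is undefined, treat as 0
        return 0.0
    return sxy * sxy / (sxx * syy)


def ld_region(positions, genotypes, index, threshold):
    """(start, end) enclosing every SNP with r^2 > threshold to the index SNP."""
    in_ld = [pos for pos, g in zip(positions, genotypes)
             if r_squared(genotypes[index], g) > threshold]
    return min(in_ld), max(in_ld)


def genes_overlapping(region, genes):
    """Names of genes whose (start, end) interval overlaps the region."""
    start, end = region
    return [name for name, (g_start, g_end) in genes.items()
            if g_start <= end and g_end >= start]


# Hypothetical toy data: 5 SNPs genotyped in 8 individuals.
positions = [100, 200, 300, 400, 500]
genotypes = [
    [1, 0, 1, 0, 1, 0, 1, 0],  # position 100: uncorrelated with the index SNP
    [0, 0, 1, 1, 2, 2, 1, 0],  # position 200: strong LD with the index SNP
    [0, 0, 1, 1, 2, 2, 1, 1],  # position 300: index SNP
    [0, 1, 1, 1, 2, 1, 1, 1],  # position 400: moderate LD
    [0, 1, 0, 1, 1, 2, 0, 1],  # position 500: weak LD
]
genes = {"GENE_A": (150, 250), "GENE_B": (420, 480), "GENE_C": (50, 120)}

for cutoff in (0.6, 0.1):
    region = ld_region(positions, genotypes, index=2, threshold=cutoff)
    print(cutoff, region, genes_overlapping(region, genes))
```

In this toy case the stricter cut-off encloses a single hypothetical gene, while the more liberal one widens the region enough to enclose a second, without any change in the underlying association signal.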
For example, if we were to select a much more liberal cut-off of r² > 0.1 instead of 0.6, then the number of genes considered to be implicated in this study (excluding the MHC region) would be 800 instead of 350 (based on DBG lab internal association data in a separate data set of 750 samples; results would differ depending, in particular, on the number of samples used in the analysis). Since there is no way of showing with confidence that all risk factors would be captured by the r² > 0.6 cut-off, this simple illustration makes clear that even the number of genes implicated in this study remains unknown. Finally, it is worth noting that capturing causal variants in a region does not necessarily mean identifying the relevant genes, since some causal variants may be regulatory and not physically included in the sequence of a gene.
The authors also performed a screen for credible candidate variants, turning up very few in coding regions, with the interesting inclusion of a variant in the ADAMTSL3 gene that an earlier study reported as having a strong splicing effect, but for which the association evidence was unconvincing at that time (Need et al., 2009). Whether that represents a real causal contribution to risk remains to be resolved. Overall, however, this analysis is consistent with the idea that many of the contributing risk factors may be regulatory.
The most sobering lesson from the GWAS era is how difficult it is to move from a GWAS signal to the secure identification of causal variants. The NHGRI website (http://www.genome.gov/gwastudies) tracks independent GWAS signals for common diseases and traits and currently records 7,300 associations with a p value < 5 × 10⁻⁸.
Of these, only a tiny fraction has been tracked to causal variants, perhaps only around 20 in total, depending on the criteria used to declare variants causal. This emphasizes the significant challenge we have in moving from GWAS signals of risk to understanding the biological effects of particular mutations.
In moving from genetic associations to biology, there are two broad strategies available to the community. First, we could assume that we can identify (however imprecisely, as emphasized above) many of the implicated genes. We could then further assume that the causal variants that affect these genes are generally regulatory in nature (there is some support for this assumption, as the authors note, and from other work [Nicolae et al., 2010]), and we could focus on up- and downregulation of these genes in experimental models to look for relevant effects. The difficulty of this approach is clear: if it is generally true that the effects on risk are due to common regulatory variants of small effect, we will probably need to simultaneously perturb the expression of many genes to generate effects strong enough to be clearly identified in experimental models. The concern is only compounded when we consider that the number of genes "implicated" will only grow as the PGC continues to increase the sample size to capture smaller and smaller effect sizes. How long will it be until a significant fraction of the genes in the genome is encompassed by an associated region (Figure 1)?
The second approach would be to take encouragement from the GWAS findings that a genetic approach to schizophrenia works but, instead of trying to work directly with the GWAS results, to try to find rarer variants of stronger effect and to study those variants in experimental models.
It is worth emphasizing that in the effort to track down precise causal variants, the GWAS signals may provide valuable leads. In particular, there is already evidence of an overlap between GWAS signals and rare mutations of major effect discovered through trio sequencing.
It is also essential to appreciate that our experimental models of schizophrenia are very immature. However, encouragingly, there are new and exciting opportunities for modeling the effect of mutations. It is now possible to edit mutations of interest virtually at will into stem cells, to differentiate these to neurons, and to study the properties of those neurons, either individually or in selected combinations in in vitro cultures using multielectrode arrays or newer optogenetic approaches (Cho and Sohal, 2014; Cohen and Kohn, 2011; Venkatachalam et al., 2014). It is also possible to generate mouse models carrying mutations of interest, and even combinations of mutations, dramatically faster than was possible even two years ago. But in both of these cases, it is entirely unclear what phenotypes to look for, either in vitro or in vivo. Given the need to both validate appropriate models and study the implicated variants, it seems clear that starting with the mutations with the strongest effects we can find will be the surest path to real insight.
For these reasons, the PGC results clearly encourage us in our efforts to take a genetic approach to schizophrenia. But it may still be the case that the rarer mutations of larger effect are the ones that should be the focus of our experimental efforts.
We would like to thank Jonathan Keebler and Slavé Petrovski for LD calculations, and Mark Daly and Ben Neale for insightful comments on the manuscript.
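References
Anderson, C.A., Soranzo, N., Zeggini, E., and Barrett, J.C. (2011). PLoS Biol. 9, e1000580.
Cho, K., and Sohal, V.S. (2014). Hum. Mol. Genet. Published online May 13, 2014. https://doi.org/10.1093/hmg/ddu225
Cohen, M.R., and Kohn, A. (2011). Nat. Neurosci. 14, 811-819.
Fromer, M., Pocklington, A.J., Kavanagh, D.H., Williams, H.J., Dwyer, S., Gormley, P., Georgieva, L., Rees, E., Palta, P., Ruderfer, D.M., et al. (2014). Nature 506, 179-184.
Goldstein, D.B. (2011). PLoS Biol. 9, e1001008.
Jablensky, A. (2006). Mol. Psychiatry 11, 815-836.
McCarthy, M.I., Abecasis, G.R., Cardon, L.R., Goldstein, D.B., Little, J., Ioannidis, J.P., and Hirschhorn, J.N. (2008). Nat. Rev. Genet. 9, 356-369.
Need, A.C., Ge, D., Weale, M.E., Maia, J., Feng, S., Heinzen, E.L., Shianna, K.V., Yoon, W., Kasperaviciūte, D., Gennarelli, M., et al. (2009). PLoS Genet. 5, e1000373.
Newton-Cheh, C., Johnson, T., Gateva, V., Tobin, M.D., Bochud, M., Coin, L., Najjar, S.S., Zhao, J.H., Heath, S.C., Eyheramendy, S., et al.; Wellcome Trust Case Control Consortium (2009). Nat. Genet. 41, 666-676.
Nicolae, D.L., Gamazon, E., Zhang, W., Duan, S., Dolan, M.E., and Cox, N.J. (2010). PLoS Genet. 6, e1000888.
Purcell, S.M., Moran, J.L., Fromer, M., Ruderfer, D., Solovieff, N., Roussos, P., O'Dushlaine, C., Chambert, K., Bergen, S.E., Kähler, A., et al. (2014). Nature 506, 185-190.
Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014). Nature 511, 421-427.
Venkatachalam, V., Brinks, D., Maclaurin, D., Hochbaum, D., Kralj, J., and Cohen, A.E. (2014). J. Am. Chem. Soc. 136, 2529-2537.
Wray, N.R., Purcell, S.M., and Visscher, P.M. (2011). PLoS Biol. 9, e1000579.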