Article; Open access; Peer reviewed

Paralog dependency indirectly affects the robustness of human cells

2019; Springer Nature; Volume: 15; Issue: 9; Language: English

10.15252/msb.20198871

ISSN

1744-4292

Authors

Rohan Dandage, Christian R. Landry

Topic(s)

Photosynthetic Processes and Mechanisms

Summary

Article, 24 September 2019. Open Access. Source Data. Transparent process.

Rohan Dandage (1,2,3,4,5; orcid.org/0000-0002-6421-2067) and Christian R Landry (*,1,2,3,4,5; orcid.org/0000-0003-3028-6866)

1 Département de Biologie, Université Laval, Québec, QC, Canada
2 Département de Biochimie, Microbiologie et Bio-Informatique, Université Laval, Québec, QC, Canada
3 Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec, QC, Canada
4 The Québec Network for Research on Protein Function, Engineering, and Applications (PROTEO), Université Laval, Québec, QC, Canada
5 Centre de Recherche en Données Massives (CRDM), Université Laval, Québec, QC, Canada
* Corresponding author. Tel: +1 418 656 3954; E-mail: [email protected]

Molecular Systems Biology (2019) 15: e8871, https://doi.org/10.15252/msb.20198871

Abstract

The protective redundancy of paralogous genes partly relies on the fact that they carry out their functions independently. However, a significant fraction of paralogous proteins may form functionally dependent pairs, for instance, through heteromerization. As a consequence, one could expect these heteromeric paralogs to be less protective against deleterious mutations.
To test this hypothesis, we examined the robustness landscape of gene loss-of-function by CRISPR-Cas9 in more than 450 human cell lines. This landscape shows regions of greater deleteriousness to gene inactivation as a function of key paralog properties. Heteromeric paralogs are more likely to occupy such regions owing to their high expression and large number of protein–protein interaction partners. Further investigation revealed that heteromers may also be under stricter dosage balance, which may also contribute to the higher deleteriousness upon gene inactivation. Finally, we suggest that physical dependency may contribute to the deleteriousness upon loss of function, as revealed by the correlation between the strength of interactions between paralogs and their higher deleteriousness upon loss of function.

Synopsis

Meta-analysis of genome-wide loss-of-function screens in human cell lines reveals that deleteriousness of heteromeric paralogs can be largely explained by their inter-dependency and the unique features of paralog gene expression.

Not all paralogs provide robustness to gene inactivation.
Interacting paralogs are more deleterious upon gene inactivation.
The deleteriousness of the heteromeric interacting paralogs is caused by paralog dependency in an indirect manner.

Introduction

After a gene duplication event and before they become functionally distinct, paralogs are redundant and can mask each other's inactivating mutations (Pickett & Meeks-Wagner, 1995; Brookfield, 1997; Diss et al, 2014). This mutational robustness does not provide an advantage strong enough by itself to cause the maintenance of paralogs by natural selection unless the mutation rate or population size is exceptionally large (van Nimwegen et al, 1999). Nevertheless, paralogous genes affect how biological systems globally respond to loss-of-function (LOF) mutations.
For instance, the early analysis of the growth rate of the yeast gene deletion collection revealed that genes with duplicates are enriched among the ones that have a weak effect on fitness when deleted (Gu et al, 2003). Likewise, singletons (genes with no detectable homologous sequence in the genome) tend to be overrepresented among genes whose deletion is lethal. Further studies in yeast also showed that redundancy could be maintained for millions of years, making the impact of duplication long lasting (Dean et al, 2008). A parallel observation in humans showed that genes are less likely to be involved in diseases if they have a paralog, and the probability of disease association for a gene decreases with increasing sequence similarity with its closest homolog in the genome (Hsiao & Vitkup, 2008). These observations, along with smaller scale observations made in classical genetics (Pickett & Meeks-Wagner, 1995; Diss et al, 2014), strongly demonstrate that redundancy allows paralogs to compensate for each other's LOF at the molecular level.

The buffering ability of paralogs is, however, not universal (Ihmels et al, 2007), and opposite results have been reported. For instance, Chen et al (2013b) reported an enrichment of human diseases among paralogous genes, particularly among the ones with higher functional similarity. The authors explained this result with a model in which redundancy reduces the efficacy of purifying selection, leading to the maintenance of disease alleles that could have lower penetrance, for instance, through noise in gene expression. Other authors have shown that the retention of whole-genome duplicates could be biased toward genes that are more likely to bear autosomal-dominant deleterious mutations (Singh et al, 2012). In this case, the maintenance of paralogs would be associated with greater susceptibility to disease mutations, contrary to the robustness expected from gene redundancy.
Determining whether and how paralogs can compensate for each other's deleterious mutations therefore requires a better understanding of the mechanisms involved. This would improve our understanding of evolution and also accelerate the development of medical interventions because redundancy is often a major obstacle in this context (Lavi, 2015). The mechanisms by which paralogs compensate for each other's LOF mutations are in most cases not known in detail (Pickett & Meeks-Wagner, 1995; Diss et al, 2014), but likely involve active and passive mechanisms, from transcriptional to post-translational ones. For instance, it was shown for a small fraction of paralogous gene pairs that one member of a pair is upregulated by some feedback mechanism upon the deletion of the second copy (Kafri et al, 2005). Although it may have important consequences, the occurrence of this phenomenon is, however, very likely limited. Indeed, a systematic assessment of this mechanism at the protein level in yeast found that it could take place only for a very small set of paralogous genes (DeLuna et al, 2010).

Another potential mechanism of compensation takes place at the level of protein–protein interactions (PPI) (reviewed by Diss et al, 2014), whereby paralogs replace each other with respect to their binding partners through ancestrally preserved binding ability. Evidence for this mechanism was recently reported by Diss et al (2013, 2017). The model proposed is that paralogs appear to have different binding partners in wild-type cells because they mutually exclude each other from binding with potential partners. This is due to differences in the binding affinity or expression levels of the paralogs that tilt binding competition toward one paralog or the other. Upon deletion, the mutual exclusion is relieved and compensation becomes apparent. Results consistent with this observation were obtained by Ori et al (2016) in mammalian cells.
The authors showed that some paralogs can replace each other through changes in expression within protein complexes, supporting the view that paralogs have preserved the ability to interact with the same partners. Another study reported observations consistent with this model using proteomics analyses of cancer cell lines (Gonçalves et al, 2017). In this case, an increased copy number for one gene led to increased protein abundance and a decrease in abundance of its paralogs, as if a feedback mechanism was affecting the balance between paralogs. This feedback is likely due to post-translational regulation that leads to the degradation of the paralogs displaced from protein complexes, also called protein attenuation (Ishikawa et al, 2017; Taggart & Li, 2018). This observation suggests that the two paralogs would have overlapping binding partners and that the balance would be determined by their relative affinity and abundance, as observed in one recent meta-analysis study (Sousa et al, 2019). Finally, Rajoo et al (2018) examined the composition of the yeast nuclear pore complex and, similarly to the Diss et al study (Diss et al, 2013), found that paralogous proteins can at least partially replace each other in situ upon deletion and change in abundance.

A major determinant that limits the ability of paralogs to compensate is their functional divergence, which can be approximated by sequence divergence (Hsiao & Vitkup, 2008; Li et al, 2010). Other factors could also play a role, for instance, cross-dependency, which has been brought to light only recently. DeLuna et al (2010) looked at the protein abundance of yeast paralogs when their sister copies are deleted, and found that six of the 29 pairs studied displayed negative responsiveness: upon deletion, the remaining paralog showed a decreased protein abundance.
In half of these cases, the paralogs heteromerized (physically interacted with each other), suggesting that protein abundance may depend on their physical interactions. The control of protein abundance through interactions was also recently elucidated in the context of human cells (Sousa et al, 2019). The consequences of these decreases in abundance were not investigated further, but one could imagine that this would directly affect the compensating ability of paralogs, because the deletion of one copy of a pair leads to an LOF of the second, thereby essentially acting as a dominant negative effect.

A recent study by Diss et al (2017) directly examined paralog compensation at the level of protein–protein interactions. Among more than 50 pairs of paralogs, they showed that not all paralogs could compensate in the yeast protein interaction network. About 20 pairs showed dependency, i.e., one paralog lost some or all of its interaction partners upon the loss of the second. Diss et al found that dependent pairs were enriched for pairs that form heteromers and, in some cases, the dependency could be explained by a strong decrease in protein abundance upon deletion, consistent with the observation of DeLuna et al (2010).

Altogether, these observations raise the possibility that heteromerization of paralogs may reflect their physical and functional dependency, which as a consequence would reduce the ability of paralogous genes to compensate for each other's loss. One could therefore predict that the protection that paralogous genes provide against the effect of LOF mutations would be contingent on whether their products form heteromeric complexes with each other or not. Heteromeric paralogs would then have fitness effects that are closer to those of single-copy genes (singletons) than to those of typical duplicates.
Here, we examine these predictions by re-analyzing a set of well-curated pairs of human paralogous genes (Singh et al, 2015; Lan & Pritchard, 2016) and recent large-scale genome-wide CRISPR-Cas9 screens in which the effect of gene LOF on cell proliferation was examined in more than 450 cancer cell lines (Wang et al, 2015; DepMap, 2018) and a primary cell line (Shifrut et al, 2018). The meta-analysis of the effect of gene LOF on cell proliferation, mRNA expression from 374 cell lines, protein expression from 49 cell lines and protein–protein interactions (Table EV1) revealed patterns which strongly support our hypothesis that paralogs that assemble are less protective, but through factors other than heteromerization itself.

Results

Paralogous genes protect against the effect of gene LOF across all cell lines

We used two datasets of paralogous genes, one of relatively young paralogs, largely derived from small-scale duplications (Lan & Pritchard, 2016) and another set of relatively old paralogs most likely derived from whole-genome duplication (Data ref: Ohnolog, 2018; total of 3,132 pairs of paralogs, see Materials and Methods, Dataset EV1). We first examined whether paralogous genes protect against the deleterious effects of LOF mutations in a set of 455 human cell lines from three independent CRISPR-Cas9 genome-wide LOF screens (Table EV1). Such experiments yield a CRISPR score (CS) per gene, which is an estimate of the relative depletion of guide RNAs (gRNAs) during the genome-wide CRISPR-Cas9 screening experiment. CS therefore reflects the relative deleteriousness of LOF on cell proliferation (Fig EV1): a lower CS value indicates a more deleterious effect, and vice versa.
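
As a rough illustration of how a depletion-based score of this kind behaves, the following minimal sketch computes a per-gene score from gRNA read counts. It is not the authors' pipeline: the use of a log2 fold change with a pseudocount, the averaging across guides, and all names (`guide_log2_fold_change`, `gene_crispr_scores`, the toy genes and counts) are assumptions made for illustration only. The actual CS definitions are given in the Materials and Methods and in the source datasets.

```python
import math
from statistics import mean


def guide_log2_fold_change(initial_count, final_count, pseudocount=1.0):
    """Log2 fold change in abundance of one gRNA (assumed depletion measure).

    The pseudocount (an illustrative choice) avoids taking log of zero.
    """
    return math.log2((final_count + pseudocount) / (initial_count + pseudocount))


def gene_crispr_scores(guide_counts, guide_to_gene):
    """Average the per-guide log2 fold changes of each gene.

    guide_counts:  {guide_id: (initial_count, final_count)}
    guide_to_gene: {guide_id: gene_name}
    Returns {gene_name: score}; lower scores mean stronger depletion.
    """
    per_gene = {}
    for guide, (initial, final) in guide_counts.items():
        gene = guide_to_gene[guide]
        per_gene.setdefault(gene, []).append(guide_log2_fold_change(initial, final))
    return {gene: mean(values) for gene, values in per_gene.items()}


# Hypothetical toy data: guides for GENE_A are depleted, guides for GENE_B are not.
toy_counts = {
    "gA1": (100, 25),
    "gA2": (200, 50),
    "gB1": (100, 100),
    "gB2": (50, 50),
}
toy_guide_to_gene = {"gA1": "GENE_A", "gA2": "GENE_A", "gB1": "GENE_B", "gB2": "GENE_B"}

scores = gene_crispr_scores(toy_counts, toy_guide_to_gene)
for gene, score in sorted(scores.items()):
    print(gene, round(score, 3))
```

In this toy example the depleted gene receives a negative score and the unaffected gene a score of zero, which matches the reading of CS used in the text: the lower the score, the more deleterious the LOF.
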
These datasets are (i) CS1 from four cell lines (Wang et al, 2015), (ii) CS2 from 450 cell lines (Meyers et al, 2017; DepMap, 2018), (iii) CS2.1 from 450 cell lines (DepMap, 2018), and (iv) CS3 from 1 primary cell line (Shifrut et al, 2018; see Dataset EV2 for cell-line information, Dataset EV3 for gene-wise CS values). All the CS values capture the essentiality of the genes, which, in the case of cancer cell lines, is found to be largely independent of the role of the genes in carcinogenesis (Fig EV1). Because the estimation of the CS of paralogs could be confounded by gRNAs that match more than one gene due to sequence similarity, we recomputed scores for the CS1, CS2.1, and CS3 datasets by considering only the gRNAs that uniquely align to the genome (see Materials and Methods). Dataset CS2 and dataset CS2.1 constitute data from the same set of cell lines (biologically identical), but analyzed differently. CS2 takes copy-number variation effects in each cell line into account (used directly as computed by the authors; Meyers et al, 2017), while CS2.1 is analyzed by utilizing only the uniquely aligned gRNAs (see Materials and Methods). CS values among datasets CS1 and CS2/CS2.1 are well correlated, indicating reproducible measurements of fitness effects across platforms, methodologies, cell lines, and cell types (Appendix Fig S1). The weaker correlation with dataset CS3 values (Spearman correlation coefficient ranges from 0.19 to 0.21), however, could be attributed to the difference in physiology between the primary and cancer cell lines, although technical factors could also be responsible.

Figure EV1. Distribution of CS values in the 4 CS datasets
A–D. The locations of essential and non-essential genes [taken as a union set of genes reported by DepMap, 2018 and BAGEL (Hart & Moffat, 2016)] are denoted on the distributions.
The locations of the cancer drivers, oncogenes, and tumor suppressors are also denoted on the distribution (derived from Lever et al, 2019). Source data are available online for this figure.

As expected, we find that paralogs buffer the effect of gene LOF. Genes with paralogs have relatively higher CS values than singletons (see Materials and Methods for classification of singletons), for the three biologically independent datasets considered (Fig 1A). To confirm that these effects were systematic and were not driven by a few cell lines with strong effects, we compared the mean CS for paralogs and singletons across cell lines (see Fig 1B for analysis with CS2.1 and Appendix Fig S2A for analysis with the CS2 dataset). All cell lines systematically showed stronger buffering effects for the inactivation of paralogs compared to singletons, without exception. The same results were observed for the comparison of paralogs with genes that are neither in the set of paralogs nor classified as singletons, denoted as "unclassified" (see Fig 1C for analysis with CS2.1 and Appendix Fig S2B for the CS2 dataset). These results are therefore highly reproducible and cell-line independent. However, the trend showed some dependence on molecular features such as mRNA expression levels, as we discuss below.

Figure 1. The LOF of paralogs is less deleterious than that of singletons in human cell lines
A. LOF data derived from genome-wide CRISPR-Cas9 screening experiments. The deleteriousness of LOF of a gene on cell proliferation is estimated from the depletion of gRNAs in the experiment. The extent of depletion is measured as a CRISPR score (CS, see Materials and Methods). CS values across cell lines from three biologically independent datasets, CS1 (Wang et al, 2015), CS2/CS2.1 (Meyers et al, 2017; DepMap, 2018), and CS3 (Shifrut et al, 2018), are shown.
Genes that are not in the paralog datasets but that were not identified as singletons in the stringent identification of singletons are denoted as "unclassified". The relatively higher CS of paralogs compared to singletons indicates that they are relatively less deleterious. P-values from two-sided Mann–Whitney U tests are shown. On the violin plots, the medians of the distributions are shown by a horizontal black line and quartiles by a vertical thick black line. For clarity, the upper and lower tails of the distributions are not shown.
B, C. (B) Comparisons of CS values between paralogs and singletons and (C) between paralogs and unclassified genes (neither clearly a paralog nor a singleton). CS data for 4 (CS1) + 450 (CS2.1) + 1 (CS3) cell lines are shown. Each point represents the mean CS for a class (singleton, paralog, or unclassified) in an individual cell line. All points are below the diagonal (dashed gray line), showing that the effect is systematic and largely cell-line independent. Similar plots are shown for the CS2 dataset in Appendix Fig S2.
D. Older paralogs tend to be more essential than younger ones and are therefore less protective (i.e., more deleterious upon LOF). On the y-axis, the age groups are ordered by increasing distance of the phylogenetic node of duplication relative to the common ancestor, i.e., Opisthokonta. Sets of essential and non-essential genes were derived from the union of gene sets reported by DepMap (2018) and BAGEL (Hart & Moffat, 2016; see Materials and Methods). P-value from a two-sided Mann–Whitney U test is shown. The boxes represent the first and third quartiles (Q1 and Q3) of the distribution, and the upper and lower whiskers extend up to Q3 + 1.5*interquartile range and Q1 − 1.5*interquartile range, respectively. The central horizontal line represents the median of the distributions containing 65 data points in the case of essential paralogs and 235 data points in the case of non-essential paralogs.
Source data are available online for this figure. Source Data for Figure 1 [msb198871-sup-0011-SDataFig1.xlsx]

Older paralogs tend to be less protective

In order to determine the effect of paralog age on deleteriousness, we compared the essential and non-essential sets of genes in terms of their age group of duplications retrieved from Ensembl Compara (Herrero et al, 2016; see Materials and Methods). We find that, albeit with a weak difference, older paralogs are more likely to be classified as essential genes and thus have potentially more deleterious effects upon LOF than younger paralogs (Fig 1D, see Materials and Methods for the classification of essential genes). This result underscores similar findings from earlier studies showing that the more diverged paralogs are, the less likely they are to buffer each other's loss, in the context of human diseases or yeast gene deletions (Hsiao & Vitkup, 2008; Li et al, 2010; Plata & Vitkup, 2014).

Heteromeric paralogs emerge from ancestral homomers

The model in which paralogous genes are dependent on each other considers that interacting paralogs derive from ancestral homomeric proteins (Bridgham et al, 2008; Baker et al, 2013; Kaltenegger & Ober, 2015; Diss et al, 2017). We can assume that when the two paralogs individually form a homomer, the ancestral protein was most likely also a homomer. Therefore, we can infer that heteromers of paralogs are derived from ancestral homomers, if each paralog also forms a homomer. Homomeric, in the context of this study, refers to the assembly of a protein with itself, while heteromers of paralogs or heteromeric paralogs refer to paralogous proteins that assemble with each other. We used two sources of PPI, BioGRID (Chatr-Aryamontri et al, 2015, 2017) and IntAct (Orchard et al, 2014), to define homomeric genes or heteromeric gene pairs based on PPI (see Materials and Methods).
Further, the subsets were defined based on all PPI (henceforth, this dataset will be referred to as "all PPI") or direct physical interactions only (henceforth, this dataset will be referred to as "direct PPI"). Considering all PPI (see Materials and Methods for the difference between "all PPI" and "direct PPI"), paralogs are 8.13 times more likely to form heteromeric pairs (Fisher's exact test, P-value < 1.4e-14) if they also both form homomers than if neither of them does. The corresponding enrichment is 48.88-fold for heteromers defined by "direct PPI" only (P-value < 5.5e-18; see Appendix Table S1 for the numbers of pairs in each category). We can therefore generally assume that pairs of heteromeric paralogs are more likely to be derived from ancestral homomers, consistent with previous observations (Wagner, 2003; Pereira-Leal et al, 2007).

Paralogs that form heteromers have stronger effects on cell proliferation when inactivated

Next, we investigated the effect of LOF of paralogs that form heteromers and those that do not. Consistent with the dependency hypothesis, the LOF of heteromeric paralogs seems to cause a relatively more deleterious effect on cell proliferation than the LOF of non-heteromeric paralogs, across all 4 CS datasets (Fig 2A; a similar analysis with "direct PPI" is shown in Appendix Fig S3). We also observe that the effect is consistent across cell lines by looking at the mean CS of heteromers or non-heteromers within each cell line (Fig 2B), with a majority of cell lines showing stronger effects for the LOF of paralogs forming heteromers. This trend is clearly observed across all the CS datasets and irrespective of the source of the PPI used for the definition of the heteromeric paralogs (an analysis similar to that of Fig 2B, with the remaining combinations of PPI sources and CS datasets, is shown in Appendix Fig S4).
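
The two comparisons described above, a two-sided Mann–Whitney U test between classes of paralogs and a per-cell-line comparison of class means, can be sketched as follows. This is a minimal, standard-library illustration under stated assumptions, not the authors' code: the U test uses the normal approximation with tie correction and no continuity correction, and the function names, gene names, cell-line names, and CS values are all hypothetical.

```python
import math
from statistics import mean


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic of sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    variance = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u_x, 1.0
    z = (u_x - n1 * n2 / 2.0) / math.sqrt(variance)
    return u_x, math.erfc(abs(z) / math.sqrt(2.0))


def class_means_per_cell_line(cs, classes):
    """cs: {cell_line: {gene: CS}}, classes: {gene: label}.

    Returns {cell_line: {label: mean CS of the genes with that label}}.
    """
    result = {}
    for cell_line, gene_scores in cs.items():
        grouped = {}
        for gene, score in gene_scores.items():
            grouped.setdefault(classes[gene], []).append(score)
        result[cell_line] = {label: mean(v) for label, v in grouped.items()}
    return result


# Hypothetical toy data: two cell lines, four paralogous genes.
toy_classes = {"P1": "heteromer", "P2": "heteromer",
               "P3": "non-heteromer", "P4": "non-heteromer"}
toy_cs = {
    "line_1": {"P1": -1.0, "P2": -0.8, "P3": -0.1, "P4": 0.0},
    "line_2": {"P1": -0.6, "P2": -0.9, "P3": -0.2, "P4": 0.1},
}

means = class_means_per_cell_line(toy_cs, toy_classes)
# Fraction of cell lines in which heteromers have the lower (more deleterious) mean CS.
lower_fraction = mean(
    1.0 if m["heteromer"] < m["non-heteromer"] else 0.0 for m in means.values()
)
u_stat, p_value = mann_whitney_u(
    [toy_cs["line_1"][g] for g in ("P1", "P2")],
    [toy_cs["line_1"][g] for g in ("P3", "P4")],
)
print(means, lower_fraction, u_stat, p_value)
```

With real data, each point of a plot such as Fig 2B would correspond to one entry of `means`, and the test would be run on far larger gene sets than this toy example, for which the normal approximation is only indicative.
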
A similar analysis with paralogs that are both heteromers and homomers compared with paralogs which are only homomers shows that interacting paralogs are relatively more deleterious (Fig 2C). This trend is also clearly observed across all the CS datasets and irrespective of the source of the PPI used for the definition of the subsets of paralogs (an analysis similar to that of Fig 2B, with the remaining combinations of PPI sources and CS datasets, is shown in Appendix Fig S4). The effects are therefore not due to homomerization but due to heteromerization (Fig 2C). These results support the hypothesis that interacting pairs of paralogs are less likely to buffer each other's LOF.

Figure 2. The LOF of paralogs that form heteromers is more deleterious than the LOF of non-heteromers
A. The effect of LOF on cell proliferation (CS values) is relatively more deleterious in the case of heteromeric paralogs than non-heteromers, across all 4 CS datasets. P-values from two-sided Mann–Whitney U tests are shown. A similar plot for heteromers defined with direct PPI only is shown in Appendix Fig S3.
B. Mean CS values of heteromeric paralogs and non-heteromers (defined by "all PPI" from the BioGRID source) are shown across cell lines. Each point represents the mean CS value for a class in an individual cell line. All the points are above the diagonal (dashed gray line), showing that the effect is systematic and largely independent of cell line. Similar plots for both PPI sources and the CS2 dataset are shown in Appendix Fig S4.
C. Similar to panel (B), but comparing paralogs that form heteromers and homomers to those that form homomers only (defined by "all PPI" from the BioGRID source). This result shows that the difference between heteromers and non-heteromers is not caused by the fact that heteromers are also enriched for homomers. Similar plots for both PPI sources and the CS2 dataset are shown in Appendix Fig S4.
D. Paralogs that form heteromers tend to have been duplicated earlier in evolution. The age of the paralog pairs is shown in terms of synonymous substitutions per site (dS) (see Materials and Methods), a proxy for age. Data are shown for interactions derived from "all PPI", and those that are more likely to detect "direct PPI". P-values from two-sided Mann–Whitney U tests are shown.
E. Paralogs that form heteromers tend to be more deleterious upon LOF than other paralogs, largely independent of the age of the paralog. Data from CS2.1 are shown. In the legends, paralogs are ordered by their age. The CS values per class of paralogs (heteromer or not) and their age group are aggregated by taking the median across cell lines. Note that while heteromers are more deleterious in most of the age groups, a reverse trend is observed in 2 out of 10 age groups. Distributions of the CS values per class of paralogs (heteromer or not) and their age group for this analysis are shown in Appendix Fig S5A. A similar analysis with dataset CS2 and for heteromers detected with "direct PPI" only is shown in Appendix Fig S5B–D. P-values from two-sided Mann–Whitney U tests are shown.
Data information: On the violin plots (panels A and D), the medians of the distributions are denoted by a horizontal black line, while the quartiles of the distributions from the median value are indicated by a vertical thick black line. For clarity, the upper and lower tails of the distributions are not shown in panel (A).
Source data are available online for this figure. Source Data for Figure 2 [msb198871-sup-0012-SDataFig2.xlsx]

One potential confounding factor with this analysis is the fact that the frequency of heteromers could covary with the age of paralogs, which we showed above to affect at least partially the essentiality of the gene (Fig 1D).
Heteromers are indeed older than the non-heteromeric paralogs (Fig 2D), albeit only in the case of the heteromers defined by "all PPI". We therefore looked at CS values of paralog LOF corrected for age, by using age groups. We observed that for all age groups except two, CS values for heteromers are indeed lower than for non-heteromers, suggesting that this effect is largely independent of age (Fig 2E). The reason for the inconsistency in these two age groups is, however, unclear; potential confounding factors could be the DNA sequence divergence between paralogs and the ability of gRNAs to target one gene specifically.

Molecular functions enriched for heteromeric paralogs tend to be more critical for cell proliferation

It is possible that the effects detected are due to specific gene functions that would be particularly associated with heteromeric paralogs. We first examined whether heteromers of paralogs are enriched for particular functions among all paralogs. We found that heteromers of paralogs are enriched for gene sets containing proteins that have catalytic activity and are known to directly interact with or regulate each other, such as kinase binding (Breitkreutz et al, 2010), as well as DNA binding proteins from the histone deacetylase binding gene set (see Dataset EV4 for gene sets and GO terms used in the analysis, Dataset EV5 for enrichment

Reference(s)