Causal integration of multi‐omics data with prior knowledge to generate mechanistic hypotheses
2021; Springer Nature; Volume: 17; Issue: 1; Language: English
10.15252/msb.20209730
ISSN 1744-4292
Authors: Aurélien Dugourd, Christoph Kuppe, Marco Sciacovelli, Enio Gjerga, Attila Gábor, Kristina B. Emdal, Vítor Vieira, Dorte B. Bekker‐Jensen, Jennifer Kranz, Eric M. Bindels, Ana S.H. Costa, Abel Sousa, Pedro Beltrão, Miguel Rocha, Jesper V. Olsen, Christian Frezza, Rafael Kramann, Julio Sáez-Rodríguez
Topic(s): Microbial Metabolic Engineering and Bioproduction
Summary: Method, 27 January 2021, Open Access, Transparent process

Causal integration of multi-omics data with prior knowledge to generate mechanistic hypotheses

Author Information
Aurelien Dugourd (1,2,3,4; contributed equally; orcid.org/0000-0002-0714-028X)
Christoph Kuppe (3,4,5; contributed equally; orcid.org/0000-0003-4597-9833)
Marco Sciacovelli (6; contributed equally; orcid.org/0000-0003-2958-4292)
Enio Gjerga (1,2; orcid.org/0000-0001-8042-0395)
Attila Gabor (1)
Kristina B. Emdal (7)
Vitor Vieira (8)
Dorte B. Bekker-Jensen (7)
Jennifer Kranz (3,9,10)
Eric.M.J. Bindels (11)
Ana S.H. Costa (6,15)
Abel Sousa (12,13)
Pedro Beltrao (13; orcid.org/0000-0002-2724-7703)
Miguel Rocha (8)
Jesper V. Olsen (7; orcid.org/0000-0002-4747-4938)
Christian Frezza (*,6; orcid.org/0000-0002-3293-7397)
Rafael Kramann (*,3,4,5; orcid.org/0000-0003-4048-6351)
Julio Saez-Rodriguez (*,1,2,14; orcid.org/0000-0002-8552-8976)

1 Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Heidelberg University, Heidelberg, Germany
2 Faculty of Medicine, Joint Research Centre for Computational Biomedicine (JRC-COMBINE), RWTH Aachen University, Aachen, Germany
3 Faculty of Medicine, Institute of Experimental Medicine and Systems Biology, RWTH Aachen University, Aachen, Germany
4 Division of Nephrology and Clinical Immunology, Faculty of Medicine, RWTH Aachen University, Aachen, Germany
5 Department of Internal Medicine, Nephrology and Transplantation, Erasmus Medical Center, Rotterdam, The Netherlands
6 MRC Cancer Unit, Hutchison/MRC Research Centre, University of Cambridge, Cambridge, UK
7 Faculty of Health and Medical Sciences, Proteomics Program, Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Copenhagen, Denmark
8 Centre of Biological Engineering, University of Minho - Campus de Gualtar, Braga, Portugal
9 Department of Urology and Pediatric Urology, St. Antonius Hospital Eschweiler, Academic Teaching Hospital of RWTH Aachen, Eschweiler, Germany
10 Department of Urology and Kidney Transplantation, Martin Luther University, Halle (Saale), Germany
11 Department of Hematology, Erasmus MC, Rotterdam, The Netherlands
12 Institute for Research and Innovation in Health (i3s), Porto, Portugal
13 European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, UK
14 Molecular Medicine Partnership Unit, European Molecular Biology Laboratory, Heidelberg University, Heidelberg, Germany
15 Present address: Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA

*Corresponding author. Tel: +44 1223 330608; E-mail: [email protected]
*Corresponding author. Tel: +49 241 80 37750; E-mail: [email protected]
*Corresponding author. Tel: +49 622 15451210; E-mail: [email protected]

Molecular Systems Biology (2021) 17: e9730. https://doi.org/10.15252/msb.20209730

Abstract
Multi-omics datasets can provide molecular insights beyond the sum of individual omics. Various tools have been recently developed to integrate such datasets, but there are limited strategies to systematically extract mechanistic hypotheses from them. Here, we present COSMOS (Causal Oriented Search of Multi-Omics Space), a method that integrates phosphoproteomics, transcriptomics, and metabolomics datasets. COSMOS combines extensive prior knowledge of signaling, metabolic, and gene regulatory networks with computational methods to estimate activities of transcription factors and kinases as well as network-level causal reasoning.
COSMOS provides mechanistic hypotheses for experimental observations across multi-omics datasets. We applied COSMOS to a dataset comprising transcriptomics, phosphoproteomics, and metabolomics data from healthy and cancerous tissue from eleven clear cell renal cell carcinoma (ccRCC) patients. COSMOS was able to capture relevant crosstalks within and between multiple omics layers, such as known ccRCC drug targets. We expect that our freely available method will be broadly useful to extract mechanistic insights from multi-omics studies.

Synopsis
A new approach integrates multi-omics datasets with a prior knowledge network spanning signaling, metabolism and allosteric regulations. Application to a kidney cancer patient cohort captures relevant cross-talks among deregulated processes.

A causal multi-omics network is built by integrating multiple resources spanning signaling, metabolism and allosteric regulations.
Transcriptomics, phosphoproteomics and metabolomics data are integrated in a set of coherent mechanistic hypotheses using CARNIVAL, a tool contextualizing causal networks.
This set of coherent mechanistic hypotheses can be mined to identify disease mechanisms and therapeutic targets.
A network built for a cohort of kidney cancer patients shows coherence with other studies and known therapeutic targets.

Introduction
“Omics” technologies simultaneously measure thousands of molecules in biological samples, from DNA, RNA, and proteins to metabolites. Omics datasets are an essential component of systems biology and are made possible by the popularization of analytical methods such as next-generation sequencing or mass spectrometry. Omics data have enabled the unbiased characterization of the molecular features of multiple human diseases, particularly in cancer (preprint: Jelinek & Wu, 2012; Iorio et al, 2016; Subramanian et al, 2017).
It is becoming increasingly common to characterize multiple omics layers in parallel, with so-called “trans-omics analysis”, to gain biological insights spanning multiple types of cellular processes (Sciacovelli et al, 2016; Kawata et al, 2018; Vitrinel et al, 2019). Consequently, many tools have been developed to analyze such data (Tenenhaus et al, 2014; Argelaguet et al, 2018; Sharifi-Noghabi et al, 2019; Singh et al, 2019; Liu et al, 2019c), mainly by adapting and combining existing “single omics” methodologies to multiple parallel datasets. These methods identify groups of measurements and derive integrated statistics to describe them, effectively reducing the dimensionality of the datasets. They are useful to provide a global view of the data, but additional processing is required to extract mechanistic insights from them.

To extract mechanistic insights from datasets, some methods (such as pathway enrichment analysis) use prior knowledge about the players of the process being investigated. For instance, differential changes in the expression of the genes that constitute a pathway can be used to infer the activity of that pathway. Methods that a priori define groups of measurements based on known regulated targets (which we call footprints; Dugourd & Saez-Rodriguez, 2019) of transcription factors (TFs; Alvarez et al, 2016; Garcia-Alonso et al, 2019), kinases/phosphatases (Wiredja et al, 2017), and pathway perturbations (Schubert et al, 2018) provide integrated statistics that can be interpreted as a proxy of the activity of a molecule or process. These methods appear to estimate the status of processes more accurately than classic pathway methods (Cantini et al, 2018; Schubert et al, 2018; Dugourd & Saez-Rodriguez, 2019). Since each of these types of footprint methods works with a certain type of omics data, finding links between them could help to interpret them collectively in a mechanistic manner.
For example, one can use a network diffusion algorithm, such as TieDIE (Paull et al, 2013), to connect different omics footprints together (Drake et al, 2016). This approach provides valuable insights, but diffusion (or random walk) based algorithms do not typically take into account the available causal information (such as activation/inhibition), which is essential to extract mechanistic information. TieDIE partially addressed this problem by focusing the diffusion process on causally coherent subparts of a network of interest, but it is thus limited to local causality. Recently, we proposed the CARNIVAL tool (Liu et al, 2019b) to systematically generate mechanistic hypotheses connecting TFs through global causal reasoning supported by Integer Linear Programming. CARNIVAL connects activity-perturbed nodes such as drug targets with deregulated TF activities by contextualizing a signed and directed Prior Knowledge Network (PKN). We had hypothesized how such a method could potentially be used to connect footprint-based activity estimates across multiple omics layers (Dugourd & Saez-Rodriguez, 2019).

In this study, we introduce COSMOS (Causal Oriented Search of Multi-Omics Space). This approach connects TF and kinase/phosphatase activities (estimated with footprint-based methods) as well as metabolite abundances with a novel PKN spanning across multiple omics layers (Fig 1). COSMOS uses CARNIVAL’s Integer Linear Programming (ILP) optimization strategy to find the smallest coherent subnetwork causally connecting as many deregulated TFs, kinases/phosphatases, and metabolites as possible. The subnetwork is extracted from a novel integrated PKN of > 117,000 edges spanning signaling, transcriptional regulation, and metabolism. CARNIVAL’s ILP formulation effectively allows evaluation of the global network's causal coherence given a set of known TF and kinase/phosphatase activities and metabolite abundances.
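The notion of causal coherence on a signed, directed network can be illustrated with a small sketch. This is not CARNIVAL's ILP formulation: it is a brute-force enumeration on a toy network, and every node name, edge, and function name below is hypothetical. A path counts as coherent here when the upstream sign, multiplied by the signs of the edges along the path, equals the sign observed downstream.

```python
# Toy signed, directed prior knowledge network (hypothetical nodes and edges).
# Each edge is (source, target, sign): +1 = activation, -1 = inhibition.
EDGES = [
    ("KinaseA", "TF1", +1),
    ("KinaseA", "PhosphataseB", +1),
    ("PhosphataseB", "TF2", +1),
    ("TF1", "MetaboliteX", -1),
    ("TF2", "MetaboliteX", +1),
]


def simple_paths(edges, node, target, visited=()):
    """Yield every cycle-free path from node to target as a list of edges."""
    if node == target:
        yield []
        return
    for src, dst, sign in edges:
        if src == node and dst != node and dst not in visited:
            for rest in simple_paths(edges, dst, target, visited + (node,)):
                yield [(src, dst, sign)] + rest


def propagated_sign(start_sign, path):
    """Multiply the upstream sign by the sign of each edge along the path."""
    sign = start_sign
    for _, _, edge_sign in path:
        sign *= edge_sign
    return sign


def coherent_paths(edges, source, source_sign, target, target_sign):
    """Keep only paths whose propagated sign matches the observed target sign."""
    return [
        path
        for path in simple_paths(edges, source, target)
        if propagated_sign(source_sign, path) == target_sign
    ]


if __name__ == "__main__":
    # KinaseA is up (+1) and MetaboliteX is down (-1): which paths explain it?
    for path in coherent_paths(EDGES, "KinaseA", +1, "MetaboliteX", -1):
        print(" -> ".join([path[0][0]] + [dst for _, dst, _ in path]))
```

Exhaustive path enumeration does not scale to a network of more than 117,000 edges, which is one reason an optimization formulation is used in practice; the sketch only makes the sign-consistency rule concrete.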
While we showcase this method using transcriptomics, phosphoproteomics, and metabolomics inputs, COSMOS can theoretically be used with any other additional inputs, as long as they can be linked to functional insights (for example, a set of deleterious mutations).

As a case study, we generated transcriptomics, phosphoproteomics, and metabolomics datasets from kidney tumor tissue and corresponding healthy tissue out of 11 clear cell renal cell carcinoma (ccRCC) patients. We estimated changes of activities of TFs and kinases/phosphatases as well as metabolite abundance differences between tumor and healthy tissue. We integrated multiple curated resources of interactions between proteins, transcripts, and metabolites together to build a meta PKN. Next, we contextualized the meta PKN to a specific experiment. To do so, we identified causal pathways from our prior knowledge that connect the observed changes in activities of TFs, kinases, phosphatases, and metabolite abundances between tumor and healthy tissue. These causal pathways can be used as hypothesis generation tools to better understand the molecular phenotype of kidney cancer. We also refactored all functions to run the COSMOS approach into an R package.

Figure 1. Overview of analysis pipeline
From left to right: We sampled and processed 11 patient tumors and healthy kidney tissues from the same kidney through RNA-sequencing and 9 of those same patients through mass spectrometry to characterize their transcriptomics, phosphoproteomics, and metabolomics profiles. We calculated differential abundance for each detected gene, phosphopeptide, and metabolite. We estimated kinase and transcription factor activities using the differential analysis statistics and footprint-based methods.
We used the estimated activities alongside the differential metabolite abundances to contextualize (i.e., extract the subnetwork that better explains the phenotype of interest) a generic trans-omics causal prior knowledge network (meta PKN).

Results

Building the multi-omics dataset
To build a multi-omics dataset of renal cancer, we performed transcriptomics, phosphoproteomics, and metabolomics analyses of renal nephrectomies and adjacent normal tissues of 11 renal cancer patients (for details on the patients see Dataset EV1). First, we processed the different omics datasets to prepare for the analysis. For the transcriptomics dataset, 15,919 transcripts with average counts > 50 were kept for subsequent analysis. In the phosphoproteomics dataset, 14,243 phosphosites detected in at least four samples were kept. In the metabolomics dataset, 107 metabolites detected across 16 samples were kept. Principal component analysis (PCA) of each omics dataset independently showed a clear separation of healthy and tumor tissues on the first component (transcriptomics: 40% of explained variance (EV), phosphoproteomics: 26% of EV, metabolomics: 28% of EV, Fig EV1), suggesting that tumor samples displayed molecular deregulations spanning across signaling, transcription and metabolism. Each omics dataset was independently submitted to differential (tumor vs. healthy tissue) analysis using LIMMA (Ritchie et al, 2015). Consistent with the PCA, a volcano plot overlapping the results of the differential analysis of each omics showed that the transcriptomics dataset led to larger differences and smaller P-values than phosphoproteomics and metabolomics extracted from the same samples (Fig EV2). This is further apparent from the number of hits under a given false discovery rate (FDR, Benjamini & Hochberg, 1995) threshold. We obtained 6,699 transcripts and 21 metabolites significantly regulated with FDR < 0.05.
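As a reminder of how hits are counted under an FDR threshold, the Benjamini-Hochberg adjustment can be sketched in a few lines. This is a generic textbook version written for illustration, not the LIMMA code path used in the study; the function names and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def count_hits(p_values, fdr_threshold):
    """Count features whose adjusted P-value falls below the threshold."""
    return sum(q < fdr_threshold for q in benjamini_hochberg(p_values))


if __name__ == "__main__":
    example = [0.01, 0.04, 0.03, 0.005, 0.6]  # made-up P-values
    print(benjamini_hochberg(example))
    print(count_hits(example, 0.05), count_hits(example, 0.2))
```

Applying two thresholds to the same adjusted values, as in the last line, mirrors the comparison of hit counts at FDR < 0.05 and FDR < 0.2.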
While only 11 phosphosites were found under 0.05 FDR, 447 phosphosites had an FDR < 0.2. This result confirmed the deep molecular deregulations of tumors spanning across signaling, transcription, and metabolism. Then, the differential statistics for all tested (not just the ones under the FDR threshold) transcripts, phosphopeptides, and metabolites were used for further downstream analysis, as explained below.

Figure EV1. First two components of PCA of omic datasets of kidney tumor and healthy samples
PCA of metabolomics (107 metabolites), phosphoproteomics (14,243 phosphosites) and transcriptomics (15,919 transcripts) datasets for tumor and healthy tissue samples. For each omics dataset, PCA is run independently on normalised datasets and the first two components are plotted. Each omics shows a clear separation between tumor and healthy tissue.

Figure EV2. Differential analysis of kidney tumor and healthy sample
Combined volcano plot of metabolomics (107 metabolites), phosphoproteomics (14,243 phosphosites) and transcriptomics (15,919 transcripts), displaying the surface occupied by the points of each type of omic data. The x-axis represents the log2 fold change between healthy and tumour. The y-axis represents the −log10 P-value of an unpaired moderated t-statistic obtained using LIMMA.

Footprint-based transcription factor, kinase, and phosphatase activity estimation
We then performed computational footprint analysis to estimate the activity of proteins responsible for changes observed in specific omics datasets. By the term “activity”, we refer to a quantifiable proxy of the function of a protein, estimated based on the footprint left by said activity. This definition can apply, but is not limited to, an enzyme’s catalytic activity.
Footprint-based activity estimation (Dugourd & Saez-Rodriguez, 2019) relies on the concept that the measured abundances of molecules (such as phosphopeptides or transcripts) can be used as a proxy of upstream (direct or indirect) regulator activities responsible for those changes (Rhodes et al, 2005; Casado et al, 2013; Ochoa et al, 2016). In the case of TF activity estimation, this means that measured changes in the abundances of transcripts give us information about the changes of activities of the transcription factors that regulate their abundance. An activity estimation only depends on the changes of the abundances measured in its target transcripts, not its own transcript abundance. In this study, we used the VIPER algorithm (Alvarez et al, 2016) to estimate the activity of transcription factors and kinases/phosphatases based on transcript and phosphopeptide abundance changes, respectively. A total of 24,347 transcription factor (TF)-target interactions (i.e., transcripts under the direct regulation of a transcription factor) were obtained from DoRothEA (Garcia-Alonso et al, 2019), a meta-resource of TF-target interactions. Those TF-target interactions span 365 unique transcription factors. In parallel, 33,616 interactions between kinases/phosphatases and their phosphosite targets (i.e., phosphopeptides directly (de)phosphorylated by specific kinases (phosphatases)) were obtained from the OmniPath (Türei et al, 2016) kinase substrate network, a meta-resource focused on curated information on signaling processes. Only TFs and kinases/phosphatases with at least 10 and 5 detected substrates, respectively, were included. This led to the activity estimation of 328 TFs and 174 kinases.
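The footprint idea can be sketched with a deliberately simplified score. The snippet below is not the VIPER algorithm: it averages the differential statistics of a regulator's targets, each multiplied by its sign of regulation, and expresses the result in standard deviations of a permutation null. All names, the minimum-target filter value, and the toy data are assumptions made for illustration.

```python
import random
import statistics


def footprint_score(stats, targets):
    """Mean of target statistics, each multiplied by its sign of regulation."""
    measured = [(gene, sign) for gene, sign in targets.items() if gene in stats]
    return sum(stats[gene] * sign for gene, sign in measured) / len(measured)


def normalized_score(stats, targets, n_perm=1000, min_targets=3, seed=0):
    """Footprint score in standard deviations of a permutation null.

    Returns None when fewer than min_targets targets were measured,
    analogous to the minimum-substrate filter described in the text.
    """
    measured = {gene: sign for gene, sign in targets.items() if gene in stats}
    if len(measured) < min_targets:
        return None
    observed = footprint_score(stats, measured)
    rng = random.Random(seed)
    genes = sorted(stats)
    signs = list(measured.values())
    null = []
    for _ in range(n_perm):
        # Random target set of the same size, keeping the regulation signs.
        sampled = rng.sample(genes, len(signs))
        null.append(sum(stats[g] * s for g, s in zip(sampled, signs)) / len(signs))
    return (observed - statistics.mean(null)) / statistics.stdev(null)


if __name__ == "__main__":
    # Made-up differential statistics for 60 genes; g0..g4 are strongly up.
    stats = {f"g{i}": ((i * 7) % 11 - 5) * 0.3 for i in range(60)}
    for i in range(5):
        stats[f"g{i}"] = 5.0
    activated = {f"g{i}": +1 for i in range(5)}  # hypothetical regulon
    print(normalized_score(stats, activated))
```

Note that the regulator's own statistic never enters the score, which matches the point made above that an activity estimate depends only on the regulator's targets.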
In line with the results of the differential analysis, where fewer phosphosites were deregulated than transcripts, TF activities displayed a stronger deregulation than kinases. TF activity scores reached a maximum of 8.7 standard deviations (sd) for Transcription Factor Spi-1 Proto-Oncogene (SPI1) (compared to the null score distribution; sd compared to null is also referred to as a normalized enrichment score, NES), while kinase activity scores reached a maximum of 4.6 NES for Casein Kinase 2 Alpha 1 (CSNK2A1). In total, 102 TFs and kinases/phosphatases had an absolute score over 1.7 NES (P < 0.05) and were considered significantly deregulated in kidney tumor samples (Fig 2A). The presence of several known signatures of ccRCC corroborated the validity of our analysis. For instance, hypoxia (HIF1A), inflammation (STAT2, Fig 2B), and oncogenic (MYC, Cyclin Dependent Kinase 2 and 7 (CDK2/7), Fig 2C) markers were up-regulated in tumors compared to healthy tissues (Zeng et al, 2014; Schödel et al, 2016; Clark et al, 2020). Furthermore, among the suppressed TFs we identified, the HNF4A gene has been previously associated with ccRCC (Lucas et al, 2005).

Figure 2. Differentially regulated transcription factor, kinase, and phosphatase activities cancer vs. healthy tissue
Bar plot displaying the normalized enrichment score (NES, proxy of activity change) of the top 25 up- or down-regulated TF and top 25 up- or down-regulated kinase and phosphatase activities between kidney tumor and adjacent healthy tissue. Right panel shows the 10 most changing RNA abundances of the STAT2 regulated transcripts. Left panel shows the change of abundances of all STAT2 regulated transcripts that were used to estimate its activity change. X-axis represents log fold change of regulated transcripts multiplied by the sign of regulation (−1 for inhibition and 1 for activation of transcription).
Y-axis represents the significance of the log fold change (−log10 of P-value, LIMMA moderated unpaired t-test P-values). The black line is defined by the following function: y = abs(hAss − 1 + x/(x + vAss)) when fold change is negative, and y = abs(hAss − 1 + x/(x − vAss)) when fold change is positive. Here, abs() is the absolute value, hAss is the horizontal asymptote (hAss = 1.3), and vAss is the vertical asymptote (vAss = 0.3). Right panel sho
Reference(s)