Editorial Open access Peer-reviewed

State of the art in data integration and network biology (special issue)

2023; Wiley; Volume: 23; Issue: 21-22 Language: English

10.1002/pmic.202300325

ISSN

1615-9861

Authors

Andrew Emili

Topic(s)

Computational Drug Discovery Methods

Abstract

Data integration and network biology are intertwined computational approaches for examining disparate experimental data sets as a means of achieving a more holistic understanding of complex biological systems. Modern biomedical research produces an increasing amount and variety of high-throughput data types, ranging from hundreds of high-resolution protein structural models to spatiotemporally resolved expression profiles of tens of thousands of gene products. For example, proteomics, the large-scale study of protein structure and function, can provide pivotal insights into biological systems at the level of protein associations, folding, modifications, subcellular localization, and abundance within a cell or tissue over multiple time points. Unlike the genome, which is relatively static since an organism's genetic material tends to change little over the course of an individual's lifetime, the proteome is highly dynamic in response to changing physiological needs, as is metabolism, which reflects how different metabolite compounds are converted and utilized by diverse and overlapping enzymatic processes within a cell. Since different ‘omic layers provide orthogonal molecular information, integrating diverse datasets is essential to achieving a comprehensive understanding of how biological systems adapt to extrinsic stimuli and intrinsic cues, ranging from signal transduction cascades to adaptive metabolic remodeling. To assist with data integration, a plethora of bioinformatic pipelines have been devised to facilitate and automate the analysis of disparate data types, ranging from quantitative expression profiles through to post-translational modification landscapes. Network biology, the study of biological systems as dynamic networks of interacting components, has emerged as a powerful unifying framework for integrative data analysis.
Computational network analysis techniques adapted from graph theory that use nodes and edges to represent binary associations can extract insights or hypotheses regarding causal dependencies for further experimental assessment. For instance, protein-protein interaction networks can reveal how large sets of proteins or enzymes carry out their cellular roles together within larger adaptive biochemical systems, such as metabolic pathways. Graph-based network visualization software tools, such as Cytoscape, can be used to display multiple data types, while more sophisticated statistical methods, including machine learning algorithms, are increasingly being used to merge and interpret different experimental data types with the aim of identifying recurrent patterns that bridge disparate ‘omic layers. Such an integrative framework also allows for a deeper mechanistic investigation of pathobiology, which is essential for rational drug discovery and the development of personalized medicines. For example, combining the genomic information obtained from a cancer patient together with the proteomic profile of their tumor can reveal biochemical maladaptations driving neoplastic growth, leading to the identification of potentially actionable targets that can be neutralized therapeutically. Public annotation repositories that systematically curate the literature, such as NCBI, EMBL, UniProt, or STRING, can then be cross-referenced to validate the reliability of computational inferences against independent lines of evidence. Especially notable techniques for generating and leveraging a vast body of biomedical information to empower the emerging domain of integrative data science are covered in this Special Issue in the form of primary Research Articles, Review Articles, a Technical Brief, and an expert Viewpoint.
Diamandis and colleagues provide an informed perspective into the importance of integrating multidimensional models to decipher the vast intra-tumoral heterogeneity present in glioblastoma and likely other tumors of the central nervous system, while the review from Demir and associates offers a generalizable framework for leveraging prior biological information from curated sources, using case studies to illustrate the advantages and pitfalls for network-based integrative omics data analyses. In a clinically relevant flourish, Gramolini and coauthors describe disparate strategies for multi-omic analyses of diseased tissue samples that illustrate the utility of a network biology foundation to explore in an iterative step-wise manner the pathobiological basis of multifactorial cardiovascular disorders. Li et al. report primary research establishing the impact of integrating both intracellular and extracellular proteomic profiles to achieve a more in-depth mechanistic understanding of communication among multi-cellular systems, using a robust model of prostate cancer to illustrate key points, while the work of Sardiu and team illustrates the power of an integrative machine learning framework for identifying dynamic cytometric patterns that encapsulate key biological transitions. Crovella and colleagues review state-of-the-art computational methods, including innovative new machine-learning techniques such as deep learning, to predict protein-protein interactions on a large scale with high accuracy and coverage, while Foster and friends capably demonstrate the utility of powerful global mass spectrometry screening techniques, such as data-independent acquisition, as workhorse experimental platforms for large-scale interactome mapping efforts. In a complementary technical brief, Figeys and associates describe an economical isotope labeling approach for increasing the throughput of multi-sample proteomic surveys while ensuring data reliability.
Highlighting the flexible application of computational and experimental methods, Youssef and partners deploy quantitative multiplex co-fractionation mass spectrometry followed by comparative scoring to examine how Escherichia coli remodels its protein interaction networks in response to changing environmental conditions, which is accompanied by a commentary by Low exploring biochemical rewiring of metabolic networks. In terms of maladaptive metabolic switches, Seifert and coworkers describe techniques for measuring the marked shift towards succinate-producing microbes after the microbiome is challenged with a widely used antibiotic, while Li and colleagues review recent advances in the experimental and integrative computational characterization of protein citrullination, which modulates a diverse array of cellular functions, outlining its increasingly recognized relevance to biomedical research. We trust the accompanying dozen manuscripts presented in this Special Issue will serve as valuable resources to support, inform and guide our sage readers' efforts to exploit ‘omics data integration approaches and network biology tools to advance their own biomedical research programs.

Sincerely,
Andrew Emili, PhD
Professor in Network Biology
Knight Cancer Institute, Portland OR
