Compounds With Multitarget Activity: Structure-Based Analysis and Machine Learning
2020; Future Science Ltd; Volume: 2; Issue: 3; Language: English
DOI: 10.4155/fdd-2020-0014
ISSN: 2631-3316
Authors: Christian Feldmann, Jürgen Bajorath
Topic(s): Synthesis and biological activity
Abstract

Future Drug Discovery, Vol. 2, No. 3. Editorial. Open Access.

Christian Feldmann & Jürgen Bajorath*
Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology & Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
*Author for correspondence: Tel.: +49 228 7369 100; Fax: +49 228 7369 101; E-mail: bajorath@bit.uni-bonn.de; ORCID: https://orcid.org/0000-0002-0557-5714

Published online: 26 Jun 2020. https://doi.org/10.4155/fdd-2020-0014

Keywords: active compounds, biological targets, machine learning, multitarget activity, polypharmacology, promiscuity, protein classes, x-ray crystallography

In drug discovery, small molecules with multitarget activity (also termed 'promiscuity') are of increasing interest for interfering with derailed cellular signaling pathways or perturbed target networks implicated in complex multifactorial diseases [1–4]. The use of promiscuous compounds for therapeutic intervention gives rise to polypharmacology [3,4]. Going beyond the identification of new cellular targets for known active compounds and drug repurposing [5–7], an ultimate goal of polypharmacological drug discovery is the design and synthesis of new small molecules with predefined multitarget activity [3,4]. A prerequisite for consistently reaching this goal beyond serendipity is rationalizing compound promiscuity at the molecular level of detail [8], given that 'we can only design what we fundamentally understand'.
However, rationalizing promiscuity is generally complicated, from both an experimental and a computational perspective [8,9]. A variety of assay interference effects can give rise to false positive compound activity annotations [9–12], incompleteness of experimental measurements is a likely cause of false negatives [8,13], and varying levels of data confidence affect activity assignments [8,14]. Moreover, many puzzling structure–promiscuity relationships that are difficult to rationalize have been uncovered among extensively tested compounds with high-confidence data [15,16]. Structural analogs often display large differences in the number of targets they are active against [15,16]. Such analogs represent 'promiscuity cliffs' [16]. In network representations, promiscuity cliffs often form pathways of analogs with alternating low and high promiscuity, providing equally interesting and challenging instances of structure–promiscuity relationships [16] that are as yet little explored. The complexity of analyzing compound promiscuity makes it difficult to rationalize molecular origins of multitarget activity and translate relevant knowledge into practical compound design strategies.

Structure-based exploration of multitarget activity

The arguably highest level of confidence in studying compounds with multitarget activity at the atomic level of detail is achieved by analyzing x-ray structures of ligand–target complexes [17,18]. However, compared with compound activity data, structural data are sparse, despite the rapid increase in x-ray structures over the past decade. Currently, the Protein Data Bank (RCSB) [19] contains nearly 165,000 entries.

At the structural level, promiscuity can be viewed from a ligand [17,18] or target perspective [17,20]; these views are not mutually exclusive. Several studies have revealed that promiscuous ligands often bind to similar binding sites in proteins [17,21,22].
If proteins belong to the same family, they typically have similar binding sites (and functions), which favors compound binding to related targets. However, many promiscuous compounds interact with targets from different families, including distantly related or unrelated proteins [8]. Numerous x-ray structures of complexes between proteins and promiscuous compounds have been determined, but binding characteristics of promiscuous ligands across different targets are usually 'hidden'. A systematic analysis of the RCSB identified 702 compounds with molecular weight >300 Da in x-ray complexes with proteins from different families [23]. A subset of 168 of these multifamily ligands was also found in analog series from medicinal chemistry sources having a variety of target annotations. These analog series yielded 133 unique series-based scaffolds, providing structurally confirmed templates for polypharmacology-directed compound design [23].

Given the large number of structurally characterized multifamily ligands, an interesting question was how binding modes of such compounds in different protein environments might compare. Excluding endogenous ligands and molecules binding to metabolizing enzymes, 142 multifamily ligands were identified in the RCSB for which at least one numerically specified potency value (≤10 μM) was available in compound repositories (such as ChEMBL [25]) [24]. The 142 ligands from x-ray structures included a subset of 91 compounds that bound to functionally unrelated targets having different binding domains. These multifamily ligands were often active in the nanomolar range, typically more hydrophilic than randomly selected x-ray ligands and available in complexes with proteins from up to 16 families. Their binding modes and interaction patterns in different binding sites were systematically compared.
About half of these ligands, which included both rigid and flexible molecules, were found to bind with similar conformations to their targets (<1 Å atomic root mean square deviation) but formed different interaction 'hotspots' [24]. Hence, in these cases, similar binding modes of multifamily ligands produced different target-specific interaction patterns.

Ligand binding modes were further investigated by searching for x-ray ligands in complexes with proteins from different classes, termed multiclass ligands [26]. Protein classes were defined following the Gene Ontology scheme [27]. Accordingly, each of these classes combined a number of functionally related protein families. Thus, multiclass ligands represented promiscuous compounds binding to 'maximally diverse' protein environments. The systematic analysis identified 70 multiclass ligands that originated from 331 complex x-ray structures involving 255 unique proteins from 19 different Gene Ontology classes. Multiclass ligands, which included 25 previously identified multifamily ligands, were mostly hydrophilic in nature and also included both rigid and flexible compounds. Binding modes of individual ligands in proteins from different classes were subjected to pairwise shape comparisons. The resulting shape similarities differed significantly. Some multiclass ligands were found to adopt the same binding mode in different protein classes, whereas others displayed distinct binding modes. A given ligand could also display both similar and distinct binding modes in different proteins [26]. Hence, findings for multiclass ligands further extended observations made for multifamily ligands. In summary, x-ray structures confirmed 'specific' binding of promiscuous compounds to distinct targets and revealed a variety of binding characteristics.

Corroborating evidence through machine learning

Evidence for the presence of 'true' structure–promiscuity relationships was also provided by machine learning (ML).
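As an aside on the structural comparison above, the <1 Å atomic root mean square deviation (RMSD) criterion for similar binding conformations can be illustrated with a minimal sketch. It assumes that the two conformations of a ligand have already been superposed in a common frame and that atoms are listed in the same order; the function names, the coordinates and the use of heavy atoms only are hypothetical choices made for illustration and are not taken from the cited study [24].

```python
import math


def rmsd(coords_a, coords_b):
    """Atomic RMSD (in Å) between two conformations of the same ligand.

    Assumption: both coordinate lists are already superposed and atom i
    in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


def similar_conformation(coords_a, coords_b, threshold=1.0):
    """True if the RMSD falls below the threshold (default 1 Å)."""
    return rmsd(coords_a, coords_b) < threshold


# Hypothetical heavy-atom coordinates (Å) of one ligand bound to two targets.
conf_in_target_1 = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.2, 1.3, 0.0)]
conf_in_target_2 = [(0.1, 0.0, 0.0), (1.5, 0.2, 0.0), (2.2, 1.3, 0.3)]

print(round(rmsd(conf_in_target_1, conf_in_target_2), 3))
print(similar_conformation(conf_in_target_1, conf_in_target_2))
```

In practice, a superposition step would precede the RMSD calculation; it is omitted here to keep the sketch short.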
A variety of ML models have been introduced to flag compounds with assay interference potential and frequent hitters [28,29]. These models aim to identify compounds with undesired 'artificial' promiscuity. However, it was also possible to derive ML models that systematically distinguished truly promiscuous compounds from others [30]. For these studies, promiscuity cliffs originating from extensively assayed compounds and kinase inhibitors with high-confidence activity data served as a source of compounds with multitarget activity and nonpromiscuous or inactive analogs. This made it possible to generate training and test sets containing structurally analogous compounds that were either promiscuous or nonpromiscuous, which provided challenging conditions for ML. Different ML methods were applied to build classification models based upon chemical structure, and nearest neighbor analysis was carried out as a control. The models reached more than 70% prediction accuracy in distinguishing promiscuous and nonpromiscuous compounds, and nearest neighbor analysis often approached the accuracy level of ML. In addition, feature relevance analysis identified substructures in promiscuous test compounds that determined successful predictions [30]. Of note, the maximally achieved accuracy of less than 80% using different ML models indicated that predictions did not entirely depend on chemical structure and that secondary factors such as data incompleteness might play a role. However, taken together, the results of ML exercises provided firm evidence for the presence of structural patterns that systematically distinguished promiscuous from nonpromiscuous or inactive compounds.

Conclusion

Compounds with defined multitarget activity are of high relevance for drug discovery. However, promiscuity is complicated to assess and even more difficult to predict.
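To make the nearest neighbor control mentioned in the machine learning section more concrete, the following minimal sketch assigns each test compound the label of its most similar training compound and reports the resulting accuracy. It is an illustration under stated assumptions, not the protocol of the cited study [30]: compounds are represented as hypothetical sets of fingerprint feature indices, similarity is measured with the Tanimoto coefficient, and all data and function names are invented for the example.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient of two fingerprint feature sets."""
    if not features_a and not features_b:
        return 0.0
    shared = len(features_a & features_b)
    return shared / (len(features_a) + len(features_b) - shared)


def nearest_neighbor_label(query, training):
    """Label of the training compound most similar to the query (1-NN)."""
    _, label = max(training, key=lambda item: tanimoto(query, item[0]))
    return label


def accuracy(test, training):
    """Fraction of test compounds whose 1-NN label matches the true label."""
    hits = sum(nearest_neighbor_label(fp, training) == label for fp, label in test)
    return hits / len(test)


# Hypothetical toy data: (feature set, label) pairs for structural analogs.
training_set = [
    ({1, 2, 3, 4}, "promiscuous"),
    ({1, 2, 3, 9}, "nonpromiscuous"),
    ({5, 6, 7, 8}, "nonpromiscuous"),
]
test_set = [
    ({1, 2, 3, 4, 10}, "promiscuous"),
    ({5, 6, 7}, "nonpromiscuous"),
    ({1, 2, 9}, "promiscuous"),
]

print(accuracy(test_set, training_set))
```

The third test compound is misclassified because its closest analog carries the opposite label, which mirrors why sets built from promiscuity cliffs are challenging for similarity-based predictions.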
Systematic analyses of x-ray structures of ligand–target complexes have confirmed promiscuous binding events in distinct protein binding sites at the atomic level of detail. Moreover, they identified a variety of ligands capable of interacting with distantly related or unrelated targets and yielded template structures for multitarget drug design.

What have we learned about these compounds and their promiscuity-conferring binding modes? First, multifamily and multiclass ligands are far from being small, rigid and mostly hydrophobic chemical entities that might act as 'low-potency glue' in different binding sites. By contrast, many of them are flexible, hydrophilic and have high potency against distinct targets. Multifamily ligands often adopt similar binding modes in different targets but form distinct interaction patterns, which rationalize promiscuity. On the other hand, various multiclass ligands frequently bind with different shapes to their targets, providing another structural rationale for promiscuity.

Promiscuous compounds most likely possess characteristic structural features. ML also supports the presence of structural patterns that systematically differentiate between multitarget ligands and nonpromiscuous compounds. Otherwise, ML models discussed herein could not possibly be predictive. However, we are still unable to consistently grasp such patterns and generalize them. The sample size of structurally characterized ligands with activity against distinct targets is still too small to put us in a position to learn the molecular language of promiscuity. Ultimately, we will need to understand this language to confidently design multitarget ligands. However, we are beginning to understand promiscuous binding events at the atomic level of detail and identify leading molecular actors.
This will keep compound promiscuity analysis scientifically interesting, at different levels, and give us hope that we will be able to gain further knowledge concerning the molecular origins of promiscuity that can then be translated into viable compound design strategies.

Financial & competing interests disclosure

The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

No writing assistance was utilized in the production of this manuscript.

Open access

This work is licensed under the Attribution-NonCommercial-NoDerivatives 4.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/

References

Papers of special note have been highlighted as: • of interest; •• of considerable interest

1. Ramsay RR, Popovic-Nikolic MR, Nikolic K, Uliassi E, Bolognesi ML. A perspective on multitarget drug discovery and design for complex diseases. Clin. Transl. Med. 7(1), 3 (2018). • Review of multitarget drug discovery for treating multifactorial diseases.
2. Mei Y, Yang B. Rational application of drug promiscuity in medicinal chemistry. Future Med. Chem. 10(15), 1835–1851 (2018).
3. Bolognesi ML. Harnessing polypharmacology with medicinal chemistry. ACS Med. Chem. Lett. 10(3), 273–275 (2019).
4. Proschak E, Stark H, Merk D. Polypharmacology by design: a medicinal chemist's perspective on multitargeting compounds. J. Med. Chem. 62(2), 420–444 (2019). • Extensive review of multitarget compounds in practical medicinal chemistry.
5. Keiser MJ, Setola V, Irwin JJ et al. Predicting new molecular targets for known drugs. Nature 462(7270), 175–181 (2009).
6. Corsello SM, Bittker JA, Liu Z et al. The drug repurposing hub: a next-generation drug library and information resource. Nat. Med. 23(4), 405–408 (2017).
7. Pushpakom S, Iorio F, Eyers PA et al. Drug repurposing: progress, challenges and recommendations. Nature Rev. Drug Discov. 18(1), 41–58 (2019). •• Up-to-date review of drug repurposing strategies.
8. Hu Y, Bajorath J. Entering the 'big data' era in medicinal chemistry: molecular promiscuity analysis revisited. Future Sci. OA 3(2), FSO179 (2017). • Comprehensive account of compound promiscuity analysis.
9. Bajorath J. Activity artifacts in drug discovery and different facets of compound promiscuity. F1000Res. 3, 233 (2014).
10. Auld DS, Thorne N, Nguyen DT, Inglese J. A specific mechanism for nonspecific activation in reporter-gene assays. ACS Chem. Biol. 3(8), 463–470 (2008). •• Instructive example of an assay-based interference mechanism.
11. Baell J, Walters MA. Chemistry: chemical con artists foil drug discovery. Nature 513(7519), 481–483 (2014).
12. Aldrich C, Bertozzi C, Georg GI et al. The ecstasy and agony of assay interference compounds. ACS Cent. Sci. 3(3), 143–147 (2017).
13. Mestres J, Gregori-Puigjané E, Valverde S, Solé RV. Data completeness: the Achilles heel of drug–target networks. Nat. Biotechnol. 26(9), 983–984 (2008).
14. Stumpfe D, Tinivella A, Rastelli G, Bajorath J. Promiscuity of inhibitors of human protein kinases at varying data confidence levels and test frequencies. RSC Adv. 7(65), 41265–41271 (2017).
15. Hu Y, Jasial S, Gilberg E, Bajorath J. Structure–promiscuity relationship puzzles: extensively assayed analogs with large differences in target annotations. AAPS J. 19(3), 856–864 (2017).
16. Dimova D, Bajorath J. Rationalizing promiscuity cliffs. ChemMedChem 13(6), 490–494 (2018).
17. Sturm N, Desaphy J, Quinn RJ, Rognan D, Kellenberger E. Structural insights into the molecular basis of the ligand promiscuity. J. Chem. Inf. Model. 52(9), 2410–2421 (2012). •• Insights into promiscuity from the perspective of binding site similarity and ligand conformational adaptability.
18. Gilberg E, Bajorath J. Recent progress in structure-based evaluation of compound promiscuity. ACS Omega 4(2), 2758–2765 (2019).
19. Westbrook J, Feng Z, Chen L, Yang H, Berman HM. The protein data bank and structural genomics. Nucleic Acids Res. 31(1), 489–491 (2003).
20. Rao S, Gurbani D, Du G et al. Leveraging compound promiscuity to identify targetable cysteines within the kinome. Cell Chem. Biol. 26(6), 818–829 (2019).
21. Haupt VJ, Daminelli S, Schroeder M. Drug promiscuity in PDB: protein binding site similarity is key. PLoS ONE 8(6), e65894 (2013).
22. Ehrt C, Brinkjost T, Koch O. Binding site characterization: similarity, promiscuity, and druggability. Med. Chem. Comm. 10(7), 1145–1159 (2019).
23. Gilberg E, Stumpfe D, Bajorath J. X-ray structure-based identification of compounds with activity against targets from different families and generation of templates for multitarget ligand design. ACS Omega 3(1), 106–111 (2018).
24. Gilberg E, Gütschow M, Bajorath J. Promiscuous ligands from experimentally determined structures, binding conformations, and protein family-dependent interaction hotspots. ACS Omega 4(1), 1729–1737 (2019). • Binding mode characteristics of multifamily ligands from x-ray structures.
25. Gaulton A, Hersey A, Nowotka M et al. The ChEMBL database in 2017. Nucleic Acids Res. 45(D1), D945–D954 (2017).
26. Feldmann C, Bajorath J. X-ray structure-based chemoinformatic analysis identifies promiscuous ligands binding to proteins from different classes with varying shapes. Int. J. Mol. Sci. 21(1), e3782 (2020). • Binding shapes of multiclass ligands from x-ray structures.
27. Ashburner M, Ball CA, Blake JA et al. Gene ontology: tool for the unification of biology. Nat. Genet. 25(1), 25–29 (2000).
28. Jasial S, Gilberg E, Blaschke T, Bajorath J. Machine learning distinguishes with high accuracy between pan-assay interference compounds that are promiscuous or represent dark chemical matter. J. Med. Chem. 61(22), 10255–10264 (2018).
29. Stork C, Chen Y, Šícho M, Kirchmair J. Hit Dexter 2.0: machine-learning models for the prediction of frequent hitters. J. Chem. Inf. Model. 59(3), 1030–1043 (2019).
30. Blaschke T, Miljković F, Bajorath J. Prediction of different classes of promiscuous and nonpromiscuous compounds using machine learning and nearest neighbor analysis. ACS Omega 4(4), 6883–6890 (2019). • Machine learning provides evidence for structure–promiscuity relationships.

History: Received 25 May 2020; Accepted 26 May 2020; Published online 26 June 2020; Published in print July 2020

© 2020 Jürgen Bajorath