Generative Models for Artificially‐intelligent Molecular Design
2018; Wiley; Volume: 37; Issue: 1-2; Language: English
10.1002/minf.201880131
ISSN 1868-1751
Topic(s): Protein Structure and Dynamics
Abstract
In the midst of the fourth industrial revolution,[1] there is much excitement about the potential of artificial intelligence (AI) to further pharmaceutical research and development. How might we define intelligent behavior in the context of drug discovery? Essentially, an intelligent agent, man or machine, demonstrates an ability to solve problems, to learn from experience, and to deal with new situations. With regard to these three central criteria, certain machine learning modalities, specifically autonomous adaptive systems, may be considered instances of AI. To date, several such systems have been designed and developed to rationalize and articulate next steps in compound selection, synthesis, and testing.[2] This special issue of Molecular Informatics highlights some of the most recent advances in this exciting field of molecular research and discovery.
Machine learning models often outperform more traditional approaches for compound prioritization and selection.[3,4] Machine learning also holds promise for computationally demanding applications in theoretical chemistry relevant to drug discovery.[5] For example, Roitberg and coworkers have recently presented a deep neural network trained on molecular energies obtained from quantum mechanical calculations.[6] The accuracy of the resulting model proved comparable to that of the much more computationally expensive ab initio approach.
Importantly, machine learning as such does not constitute AI. Whilst most of the machine learning methods employed in drug discovery essentially implement algorithms and concepts coined more than two decades ago,[7] there are undoubtedly new twists and tricks to their modern application, which enable the emergence of weak-AI learning behaviors. One such re-imagining of an older concept is the burgeoning field of deep learning.
Certain characteristics of this set of methods enable and encourage their nascent applications in the field of molecular design.[4,8] Essentially, deep learning models are universal function estimators and hypothesis generators. Deep models form an abstract internal representation of the training domain in terms of context-sensitive features. They learn from basic data patterns (e.g., molecular graphs), thereby transferring the choice of a suitable molecular representation for a given training task from the user to the machine intelligence. Deep models perform cascaded nonlinear feature transformations of the training data, enabling the extraction of complex features for arbitrary input-output relationships (e.g., quantitative structure-activity relationships). This ability facilitates the development of generalizing models that are able to deal with novel data instances.
Certain deep models may be used for constructive machine learning. These models not only learn from a given set of training data and classify unlabeled instances post hoc, but are also able to generate new instances of the training domain in a constructive process, thereby exemplifying the three aspects of an artificially-intelligent system introduced above (problem solving, learning from experience, and coping with new situations). In this context, the umbrella term 'constructive learning' describes an entire class of problem-solving techniques, including generative deep networks, for which the ultimate learning goal is not necessarily to find the optimal model for the training data but rather to identify new instances (e.g., molecules) from within the applicability domain of the model that are likely to exhibit the desired properties.[9] In other words, these models learn from data, form their own abstract representation of the data, and use this representation to generate new data points (Figure 1). This idea lies at the heart of AI and seems ideally suited for de novo drug design.
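The loop described above (learn a data distribution, keep an internal representation of it, sample new instances from that representation) can be illustrated with a deliberately tiny stand-in for a deep generative network. The sketch below is a toy under stated assumptions, not a method from this editorial or from any contribution to the special issue: it fits a character-level bigram model to a handful of hypothetical SMILES strings and then samples new strings from it. The training set and the names `fit_bigram_model` and `sample_string` are illustrative only, and no chemical validity check is performed.

```python
import random
from collections import defaultdict

# Sentinel characters marking the start and end of a string (assumed not to occur in SMILES here).
START, END = "^", "$"


def fit_bigram_model(smiles_list):
    """Learn the 'internal representation': counts of character-to-character transitions."""
    counts = defaultdict(lambda: defaultdict(int))
    for smiles in smiles_list:
        chars = [START] + list(smiles) + [END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    # Freeze into plain dicts so lookups never create new entries.
    return {prev: dict(following) for prev, following in counts.items()}


def sample_string(model, rng, max_len=40):
    """Generate one new string by walking the learned transition table."""
    out, prev = [], START
    for _ in range(max_len):
        following = model[prev]
        candidates = list(following)
        weights = [following[c] for c in candidates]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        if nxt == END:
            break
        out.append(nxt)
        prev = nxt
    return "".join(out)


# Hypothetical toy training set (illustrative strings, not data from the editorial).
TRAINING_SMILES = ["CCO", "CCN", "CCCO", "CCOC", "CC(=O)O", "CCC(=O)N"]

model = fit_bigram_model(TRAINING_SMILES)
rng = random.Random(0)  # fixed seed so repeated runs give the same samples
samples = [sample_string(model, rng) for _ in range(5)]
print(samples)
```

Each sampled string reuses only character transitions seen during training, so it stays within the very narrow domain the model has learned, and some samples may simply reproduce training strings or be chemically invalid. A practical generator would replace the bigram table with a deep network and add validity and property filters.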
Designing drugs means creating ways to interfere with dynamic systems, namely living organisms.[10] There are numerous paths to success in this game, and we have certainly only just begun to understand the many opportunities of AI for pharmaceutical discovery.
Figure 1. Schematic of generative modeling. Data distributions are learned by a generative model, which is able to generate new data instances based on the learned internal representation of the training domain. Such an approach may be considered artificially-intelligent, bearing promise for drug design.
Despite all the justifiable excitement surrounding this inchoate field, one must be careful not to over-emphasize its benefits for drug discovery without diligent study and proper scrutiny, to paraphrase John Lennon's "Instant Karma!". With a relatively shallow pool of prospective applications to date, it seems advisable to explore the opportunities and limitations of constructive AI for molecular design, and to compare this new approach with conventional techniques. After all, we are witnessing a renaissance of the AI debate of two decades past. Then, as now, discussion abounded on false starts and strong foundations for functionally useful AI.[11]
The contributions compiled in this special issue of Molecular Informatics not only showcase applications of generative models in drug design but also highlight use cases of related techniques. It is the editor's hope that these articles will inspire and contribute to the current scientific debate on the role and future of AI in pharmaceutical research.
G. S. declares a potential financial conflict of interest in his role as life-science industry consultant and cofounder of inSili.com GmbH, Zurich.
The author thanks all contributors to this special issue of Molecular Informatics. Max Pillong created the cover art.
References