Open access article, peer reviewed

Reliability and reproducibility in computational science: implementing validation, verification and uncertainty quantification in silico

2021; Royal Society; Volume: 379; Issue: 2197; Language: English

10.1098/rsta.2020.0409

ISSN

1471-2962

Authors

Peter V. Coveney, Derek Groen, Alfons G. Hoekstra,

Topic(s)

Machine Learning in Materials Science

Abstract

Cite this article: Coveney Peter V., Groen Derek and Hoekstra Alfons G. 2021 Reliability and reproducibility in computational science: implementing validation, verification and uncertainty quantification in silico. Phil. Trans. R. Soc. A 379: 20200409. http://doi.org/10.1098/rsta.2020.0409

Peter V. Coveney (http://orcid.org/0000-0002-8787-7256), Centre for Computational Science, University College London, Gordon Street, London, UK; Derek Groen (http://orcid.org/0000-0001-7463-3765), Department of Computer Science, Brunel University London, London, UK; Alfons G. Hoekstra (http://orcid.org/0000-0002-3955-2449), Institute for Informatics, University of Amsterdam, Science Park 904, 1098 XH Amsterdam, Netherlands

Published: 29 March 2021. https://doi.org/10.1098/rsta.2020.0409

The objectivity of science is its crowning and distinguishing feature. Its stock-in-trade comprises experimental facts, observations and theories which do not depend on who reports them but rather on the notion that the same findings would be obtained by anyone else performing similar procedures. This is what is meant by scientific reproducibility. That, at least, is the aspiration. In practice, things are often less clear cut. Theories are built on observations and experimental data, through a process of logical thought based on mathematical methods. They can sometimes be found to be in error, either because the empirical data they depend on are subsequently shown to be wrong, or because there were errors in the logical development of the theory propounded. And then there are the many differences between a given experimental set-up and others used to measure the same thing, so error and uncertainty are inevitable.

In the modern era of science, computers have come to play a central role. Computer simulation is a way of extracting useful information from theories and the models built using them. Such models are typically impossible to analyse without computers. They produce results which may be designed for comparison against existing experimental measurements; but they are also capable of making predictions for which no experimental data are available. Owing to the sophistication of modern science, such calculations often require powerful computers if results of any kind are to be forthcoming.
And for situations in which it is thought that the theories and models are sufficiently accurate, one would like to use computer-based simulation in order to make actionable predictions: predictions whose credibility is sufficiently great that we can use them to make important decisions. Examples of actionable predictions occur in weather forecasting, environmental disasters, climate science, the design of advanced materials, drug discovery and clinical decision making. Are the methods we use today of sufficient reliability that they can generate actionable results? That question is what this theme issue is about.

Three notions inform the assessment of such reliability. First, validation: confirmation that the results are in agreement with experiment, the litmus test for whether a simulation is credible. Second, verification: confirmation that the software does what it is supposed to do, and does not contain any errors arising from an incorrect implementation or incorrect numerical methods. Third, uncertainty quantification: identification of the provenance of errors within the model, which stem from two distinct sources, one being systematic errors due to parameter estimation, the other arising from random errors which come from the use of random number generators in the code.

The purpose of the present theme issue is to survey the state of the art in this domain. The emphasis is on seeking to obtain reproducible scientific findings using computers, and on quantifying the level of uncertainty in the codes and procedures used in contemporary scientific research. The issue contains a total of fourteen papers, ranging from research articles and opinion pieces to reviews of aspects of the domain. The diversity of the topics underlines its trans-disciplinary character. There are papers by computer and computational scientists, alongside or together with contributions from authors from established scientific domains.
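The ensemble treatment of random error described above recurs throughout the issue, and can be made concrete with a small sketch. The "simulation" below is a hypothetical stand-in (a sine signal plus random-number-generator noise; the function and numbers are invented purely for illustration): averaging over an ensemble of replicas that differ only in their random stream turns the random error into a quantified standard error.

```python
import math
import random

def noisy_simulation(x, rng):
    # Hypothetical stand-in for a stochastic simulation: a deterministic
    # signal plus noise injected by the random number generator.
    return math.sin(x) + rng.gauss(0.0, 0.1)

def ensemble_estimate(x, n_members=200, seed=2021):
    # Run an ensemble of replicas differing only in their random stream,
    # and summarize the spread as (mean, standard error of the mean).
    rng = random.Random(seed)
    samples = [noisy_simulation(x, rng) for _ in range(n_members)]
    mean = sum(samples) / n_members
    var = sum((s - mean) ** 2 for s in samples) / (n_members - 1)
    return mean, math.sqrt(var / n_members)

mean, stderr = ensemble_estimate(1.0)
# mean should lie close to sin(1.0), with stderr quantifying the
# random (aleatoric) contribution to the uncertainty.
```

The same pattern, with each replica being an expensive simulation, underlies the ensemble-based approaches discussed in several of the contributions below.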
While uncertainty quantification is a well-established field so far as engineers and applied mathematicians are concerned, it is relatively uncommon in other branches of science, such as physics and chemistry, the life and medical sciences. Within this theme issue, there are papers which look at single-scale modelling and simulation methods, such as the well-known molecular dynamics approach, while others are concerned with uncertainty quantification within a multiscale context, in which multiple different single-scale methods are connected in order to bridge spatial and temporal domains. This is currently a research frontier in uncertainty quantification; unlike single-scale approaches, which can be implemented in an unobtrusive manner, quantifying uncertainty within multiscale models frequently requires semi-intrusive or more fully intrusive modifications to the existing modelling code base.

The practicalities of reproducing results from computational studies are made evident in the work presented by Krafczyk et al. [1]. They examine over three hundred computational studies and attempt to access the code used and recreate the results presented in the articles. Within their self-imposed time limits, they could not fully reproduce the results from any of these papers. In response, the authors propose a set of Reproducibility Principles and Guidelines to assist researchers in making their computational results reproducible. Complementing these, they outline the structure of a Reproduction Package: a set of documentation and files that allows the results of a simulation code to be reproduced in a straightforward manner. They provide several vignettes describing their efforts to reproduce results from published articles, highlighting issues that may be overlooked or neglected when reporting on computational research.
A prime example of good practice, both in evaluating the performance of a numerical model and in adherence to reproducibility standards, is provided by Clementi & Barba [2] in the field of nanoscale electrostatics. They commence by attempting to replicate the results for the resonance modes of silicon carbide obtained in two studies from the literature, and extend this to validation of their model against experimental data presented in one of the studies. They successfully achieve replication, up to fundamental differences between their modelling approach and those in the comparison papers; validation was also successful. Their use of reproducibility packages provides readers with access to all the digital artefacts needed to recreate the results presented in the study, including source code, input files and post-processing scripts.

Numerical simulations have become a cornerstone of research in many fields of science and engineering. With this prevalence, ensuring the reproducibility of simulation studies is key to maintaining confidence in such work. Although many such models are deterministic, understanding the sensitivity of outputs to input variation is of central importance. Volodina & Challenor [3] seek to overcome the computational expense of gaining such insight through multiple (i.e. ensemble) simulations by capturing the characteristics of a complex model with a cheaper Gaussian process emulator. Through demonstration of their methodology using a simple one-dimensional function and a climate model of cloud behaviour, they illustrate how the uncertainty characteristics of complex deterministic models can be assessed and interpreted.

In 'Towards validated multiscale simulations for fusion' [4], Luk et al. apply uncertainty quantification to the modelling of nuclear fusion for energy production using coupled multiscale simulations.
Time-scale bridging requires standardized procedures to determine scale separation and the existence of a steady state within the fastest evolving model. In the case of nuclear fusion, the turbulence model associated with the plasma instabilities needs to reach a steady state. The authors discuss and compare existing and newly introduced time-scale bridging methods by means of sensitivity analysis. Furthermore, quantitative probabilistic metrics are used to assess the validity of the predictions of the multiscale model by comparison with experimental data, using the Hellinger distance, Jensen-Shannon divergence and Wasserstein metric.

Wan et al. [5] discuss the quantification of uncertainty in simulations based on classical molecular dynamics. The paper addresses simulations in a wide range of applications, from binding affinity calculations for drug discovery to property prediction in condensed matter and materials science. Valuable insights are provided concerning the intrinsic stochasticity of molecular systems arising from their chaotic dynamics, which is a dominant contributor to the uncertainty of individual trajectories. The authors show that ensemble methods provide statistically reliable results and that the distributions predicted from such simulations are often non-Gaussian in nature.

Suleimenova et al. [6] investigate how human migration modelling depends on reliable handling of the many parameters which pervade such computer-based studies. They integrate the use of sensitivity analysis into the development of new simulation rule sets. Based on an agent-based simulation of migration, they use Sobol's method for sensitivity analysis to identify the most sensitive assumptions. They then refine the rule set with the aim of making these assumptions more detailed (e.g. through parameter splitting), and are able to reduce the sensitivity of these assumptions.
Their development approach is potentially more robust than conventional ones, as developers do not directly aim for error reduction but instead optimize for balanced sensitivity across the assumptions made within the models.

In 'Uncertainty quantification patterns for multiscale models', Ye et al. [7] present a conceptual framework of computing patterns that support the analysis of uncertainty in coupled models, irrespective of their source domain. The paper presents the basic templates for each uncertainty quantification pattern (UQP) and introduces the notion of semi-intrusive UQ, in which sub-models are treated as black boxes but UQ algorithms are applied to the coupling between the individual sub-models. The authors showcase their implementation through two applications, each of which has been coupled using the Multiscale Coupling Library and Environment (MUSCLE3).

Daub et al. [8] provide a tutorial which introduces a surrogate-model-based uncertainty quantification approach applied to an earthquake rupture simulator. It uses the mogp_emulator package to perform model calibrations, combining it with the FabSim3 automation toolkit to automatically execute and curate, on remote resources, the large number of surrogate model runs required. The tutorial has been performed successfully in various workshops, is fully open, and can be readily undertaken by readers within a matter of hours.

In 'Assessing uncertainties from physical parameters and modelling choices in an atmospheric LES model', Jansson et al. [9] apply a range of modern UQ methods to investigate uncertainties in a large eddy simulation. To do this, they use the stochastic collocation scheme with the EasyVVUQ package to calculate the Sobol indices for a range of parameters. They identify uncertainties caused by small random perturbations of the initial state, and find that the chosen advection scheme has a major influence on the resulting quantities of interest.
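Sobol's method, used by both Suleimenova et al. [6] and Jansson et al. [9], decomposes the variance of a model's output into contributions from each input. As a minimal sketch of the idea, here is the first-order "pick-freeze" estimator applied to a toy linear model invented for illustration because its indices are known analytically (S1 = 0.8, S2 = 0.2); real studies use dedicated packages rather than hand-rolled estimators like this.

```python
import random

def model(x1, x2):
    # Toy model, inputs uniform on [0, 1]: Var = 4/12 + 1/12,
    # so the first-order Sobol indices are S1 = 0.8 and S2 = 0.2.
    return 2.0 * x1 + x2

def sobol_first_order(n=50_000, seed=0):
    rng = random.Random(seed)
    A = [(rng.random(), rng.random()) for _ in range(n)]
    B = [(rng.random(), rng.random()) for _ in range(n)]
    fA = [model(*a) for a in A]
    mean = sum(fA) / n
    var = sum((y - mean) ** 2 for y in fA) / n
    indices = []
    for i in range(2):
        # "Pick-freeze": keep input i from sample A, redraw the rest from B;
        # the covariance of the two outputs isolates input i's variance share.
        fAB = [model(*[a[k] if k == i else b[k] for k in range(2)])
               for a, b in zip(A, B)]
        cov = sum(ya * yb for ya, yb in zip(fA, fAB)) / n - mean * mean
        indices.append(cov / var)
    return indices

S1, S2 = sobol_first_order()
```

An input whose index approaches zero can safely be fixed at a nominal value; a dominant index flags the assumption whose refinement matters most, which is precisely how Suleimenova et al. steer their rule-set development.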
Coveney & Highfield [10] discuss the importance of reproducibility in science and, more particularly, computational science. They assess where this can impact data analysis and simulation. The paper describes existing initiatives as well as new suggestions for ensuring increased trust in computer-based predictions. The authors draw attention to various limits on the applicability of computer simulation methods, and raise concerns about the lack of transparency of many artificial intelligence methods, which are often applied as 'black boxes' to solve complex problems without a clear understanding of their inherent limitations.

Fursin presents a new framework (collective knowledge, or CK) to decompose projects into reusable components [11]. Among other things, the approach facilitates the assembly of portable workflows, and helps to reproduce, compare and reuse research techniques from existing publications. To showcase the added value and generality of CK, the author applies it to six exemplary use cases, many of which are directly informed by industrial needs.

In the paper entitled 'VECMAtk: a scalable verification, validation and uncertainty quantification toolkit for scientific simulations', Groen et al. [12] introduce a toolkit that gives users access to a wide range of methods with which to scrutinize and assess all kinds of scientific simulations. It facilitates the efficient and straightforward execution of substantial sensitivity analysis and uncertainty quantification (UQ) investigations using remote supercomputers. The authors present applications across six different scientific domains, each of which highlights different aspects of the toolkit: advanced UQ algorithm support, code coupling with uncertainty taken into account, efficient execution of tens of thousands of ensemble-based simulation jobs, and the automated calculation of key sensitivity and uncertainty measures.
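The probabilistic validation metrics mentioned above, which score the agreement between simulated and measured distributions, are straightforward to compute once the data are binned into normalized histograms. The sketch below implements two of them, the Hellinger distance and the Jensen-Shannon divergence; the input histograms are hypothetical.

```python
import math

def hellinger(p, q):
    # Hellinger distance between two discrete distributions:
    # 0 for identical distributions, 1 for disjoint support.
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))

def js_divergence(p, q):
    # Jensen-Shannon divergence with base-2 logarithms, bounded in [0, 1]:
    # the average KL divergence of p and q from their mixture m.
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    def kl(r, s):
        return sum(a * math.log2(a / b) for a, b in zip(r, s) if a > 0)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

simulated = [0.1, 0.4, 0.5]   # hypothetical normalized histogram from a model
measured = [0.2, 0.3, 0.5]    # hypothetical normalized histogram from experiment
h = hellinger(simulated, measured)
jsd = js_divergence(simulated, measured)
```

Because both metrics are bounded, they give a dimensionless score of model-data agreement that can be compared across quantities of interest, which is what makes them attractive for the kind of quantitative validation Luk et al. perform.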
In 'The case for free and open source software in research and scholarship', Fortunato and Galassi [13] explore the close relation between free and open source software (FOSS) and academia at large. They resolve a range of common misconceptions among academics about free software and open source software, and provide a primer on FOSS suitable for researchers in any field. The paper includes a case study of the GNU Scientific Library project which demonstrates, among other things, how seemingly minor misconceptions about the openness of underlying libraries can give rise to existential and far-reaching problems in reproducing scientific results.

In his paper entitled 'The fundamental principles of reproducibility', Odd Erik Gundersen provides a survey of the literature on reproducibility and a clarification of its meaning in a computer science context [14]. Through the use of the scientific method, Gundersen identifies four types of transparency that enable reproducible software and distinguishes between two types of reproducibility: output reproducible and analysis reproducible. Overall, transparency and openness are identified as key drivers of reproducibility, which in turn promotes faster-paced and more assured scientific progress.

Data accessibility: This article has no additional data.

Authors' contributions: All authors contributed equally.

Competing interests: The authors have no competing interests.

Funding: The authors are grateful for funding from the European Commission for the VECMA project (grant no. 800925) and from the Alan Turing Institute in London, which enabled us to run an event in January 2020 under a similar title to this theme issue.

Acknowledgements: We are grateful to Dr Apostolos Evangelopoulos for his important support in the development of this theme issue, and to Dr Hugh Martin for his contribution, alongside Dr Evangelopoulos, to organizing the event at the Alan Turing Institute.
Footnotes: One contribution of 15 to a theme issue 'Reliability and reproducibility in computational science: implementing verification, validation and uncertainty quantification in silico'.

© 2021 The Author(s). Published by the Royal Society. All rights reserved.

References

1. Krafczyk M, Shi A, Bhaskar A, Marinov D, Stodden V. 2021 Learning from reproducing computational results: three reproducibility principles and the reproduction package. Phil. Trans. R. Soc. A 379, 20200069. (doi:10.1098/rsta.2020.0069)
2. Clementi NC, Barba LA. 2021 Reproducible validation and replication studies in nanoscale physics. Phil. Trans. R. Soc. A 379, 20200068. (doi:10.1098/rsta.2020.0068)
3. Volodina V, Challenor P. 2021 The importance of uncertainty quantification in model reproducibility. Phil. Trans. R. Soc. A 379, 20200071. (doi:10.1098/rsta.2020.0071)
4. Luk OO, Lakhlili J, Hoenen O, von Toussaint U, Scott BD, Coster DP. 2021 Towards validated multiscale simulations for fusion. Phil. Trans. R. Soc. A 379, 20200074. (doi:10.1098/rsta.2020.0074)
5. Wan S, Sinclair RC, Coveney PV. 2021 Uncertainty quantification in classical molecular dynamics. Phil. Trans. R. Soc. A 379, 20200082. (doi:10.1098/rsta.2020.0082)
6. Suleimenova D, Arabnejad H, Edeling WN, Groen D. 2021 Sensitivity-driven simulation development: a case study in forced migration. Phil. Trans. R. Soc. A 379, 20200077. (doi:10.1098/rsta.2020.0077)
7. Ye D, Veen L, Nikishova A, Lakhlili J, Edeling W, Luk OO, Krzhizhanovskaya VV, Hoekstra AG. 2021 Uncertainty quantification patterns for multiscale models. Phil. Trans. R. Soc. A 379, 20200072. (doi:10.1098/rsta.2020.0072)
8. Daub EG, Arabnejad H, Mahmood I, Groen D. 2021 Uncertainty quantification of dynamic earthquake rupture simulations. Phil. Trans. R. Soc. A 379, 20200076. (doi:10.1098/rsta.2020.0076)
9. Jansson F, Edeling W, Attema J, Crommelin D. 2021 Assessing uncertainties from physical parameters and modelling choices in an atmospheric large eddy simulation model. Phil. Trans. R. Soc. A 379, 20200073. (doi:10.1098/rsta.2020.0073)
10. Coveney PV, Highfield RR. 2021 When we can trust computers (and when we can't). Phil. Trans. R. Soc. A 379, 20200067. (doi:10.1098/rsta.2020.0067)
11. Fursin G. 2021 Collective knowledge: organizing research projects as a database of reusable components and portable workflows with common interfaces. Phil. Trans. R. Soc. A 379, 20200211. (doi:10.1098/rsta.2020.0211)
12. Groen D et al. 2021 VECMAtk: a scalable verification, validation and uncertainty quantification toolkit for scientific simulations. Phil. Trans. R. Soc. A 379, 20200221. (doi:10.1098/rsta.2020.0221)
13. Fortunato L, Galassi M. 2021 The case for free and open source software in research and scholarship. Phil. Trans. R. Soc. A 379, 20200079. (doi:10.1098/rsta.2020.0079)
14. Gundersen OE. 2021 The fundamental principles of reproducibility. Phil. Trans. R. Soc. A 379, 20200210. (doi:10.1098/rsta.2020.0210)
Article information. DOI: https://doi.org/10.1098/rsta.2020.0409. Published by the Royal Society. Print ISSN: 1364-503X. Online ISSN: 1471-2962. Manuscript accepted: 6 January 2021. Published online: 29 March 2021. Published in print: 17 May 2021, Volume 379, Issue 2197, theme issue 'Reliability and reproducibility in computational science: implementing verification, validation and uncertainty quantification in silico', compiled and edited by P. V. Coveney, D. Groen and A. G. Hoekstra.

Keywords: computational science, computer science, reproducibility, validation, verification, uncertainty quantification.

Subjects: computational chemistry, computational physics, computer modelling and simulation.
