Characterizing scientific failure
Putting the replication crisis in context

Science & Society | EMBO Reports (2019) 20: e48765 | Volume 20, Issue 9 | 5 August 2019
DOI: 10.15252/embr.201948765 | ISSN: 1469-3178 | Springer Nature

Stephan Guttinger (corresponding author), Center for Philosophy of Natural and Social Science, London School of Economics, London, UK
Alan C Love (corresponding author), Department of Philosophy and Minnesota Center for Philosophy of Science, University of Minnesota, Minneapolis, MN, USA
The ongoing debate about a "replication crisis" has put scientific failure in the spotlight, not only in psychological research and the social sciences but also in the life sciences. However, despite this increased salience of failure in research, the concept itself has so far received little attention in the literature (for an exception, see Ref. 1). The lack of a systematic perspective on scientific failure—a daily experience for researchers—hampers our understanding of this complex phenomenon and the development of efficient policies and measures to address it. Without a better grasp of the multiple dimensions of scientific failure, there is a risk that necessary measures will be neglected or that inadequate policies will be adopted because different kinds of failures require different responses. Developing a basic taxonomy of scientific failure will help to identify connections between different types of failures and benefit the formulation of policy measures for improving replicability.

Expected outcomes

Failure comes in many forms, but the type that matters most in science is the failure to observe an expected outcome. This is what scientists usually mean when they speak of an experiment that has failed. There are two primary reasons why researchers expect a particular outcome from an experiment: Either a prediction was made based on existing theories or models, or a specific outcome had been observed in a previous experiment. We label the first "failure of theorizing" and the second "failure of precedent" (Fig 1).

[Figure 1. A basic taxonomy of scientific failure. There are two primary reasons why expectations might fail: failure of theorizing and failure of precedent. These two types of failure can be divided further into sub-types: overgeneralization failures and methodological failures, the latter comprising materials, experimental design, and technique.]
These failures can be further distinguished into four different sub-types. Three of these are classified as methodological failures: the use of defective materials, faulty experimental design, or flawed techniques. In rare cases, scientists intentionally use flawed designs or techniques to force an expected outcome. The fourth sub-type is not a methodological failure but is related to overgeneralization. This can happen when researchers have a limited understanding of the parameter space of a discovery or when they rely on mistaken assumptions about the nature of a phenomenon under scrutiny.

The debates about reproducibility have mostly focused on methodological issues, especially in relation to questionable research practices, such as selective reporting or gathering data until statistical significance is achieved [2]. Little attention has been paid to overgeneralization failures. This, we argue, is problematic because overgeneralization failures are connected to methodological failures, which affects how we can and should deal with failure in science.

Methodological failures

Most commentators have identified methodological failures as the central drivers of the replication crisis. One component is defective materials; examples include contaminated cell lines or non-specific antibodies that bind indiscriminately to various targets. A second component of methodological failure is the use of faulty or inappropriate techniques. This includes not only failures of experimental manipulation, such as using the wrong buffer or not following the protocol, but also questionable applications of statistical methods; p-hacking and sample selection fall within this sub-type of scientific failure.
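To see why gathering data until statistical significance is achieved is so problematic, consider a minimal simulation sketch (illustrative only; all parameters are invented for the example): two groups are repeatedly sampled from the same distribution, so the null hypothesis is true by construction, yet re-testing after every new batch of observations and stopping at the first p < 0.05 inflates the false-positive rate well beyond the nominal 5%.

```python
# Illustrative sketch (not from the article): optional stopping inflates
# false-positive rates. Two groups are drawn from the SAME distribution,
# so every "significant" result is a false positive. We re-test after
# each batch of new observations and stop as soon as p < 0.05.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def false_positive_rate(n_experiments=2000, n_start=10, n_max=100, batch=10):
    hits = 0
    for _ in range(n_experiments):
        a = list(rng.normal(0, 1, n_start))
        b = list(rng.normal(0, 1, n_start))  # same distribution: null is true
        while True:
            _, p = stats.ttest_ind(a, b)
            if p < 0.05:            # "significant" -- stop and report
                hits += 1
                break
            if len(a) >= n_max:     # give up at the maximum sample size
                break
            a.extend(rng.normal(0, 1, batch))  # collect more data, re-test
            b.extend(rng.normal(0, 1, batch))
    return hits / n_experiments

# With a fixed, pre-specified sample size the rate would be close to 0.05;
# with optional stopping it is substantially higher (roughly 0.15-0.20 here).
print(f"false-positive rate with optional stopping: {false_positive_rate():.3f}")
```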
The third component of methodological failure is flawed experimental design. We distinguish this from "techniques" because it concerns not the execution of the experiment itself or the analysis of its results, but rather how the experiment and its controls were designed [3]. Controls allow researchers to check for artifacts and provide small-scale replications of existing data [4], which makes them crucial elements for generating trust in experimental results.

Although not distinguished as such, methodological sub-types of scientific failure have received significant attention in the recent discussion about reproducibility. These failures are attractive for policy debates because they are, to some degree, easily resolvable. Better quality control of materials, better and more transparent reporting of methods, better training of researchers, and lessening the pressure to publish can all help to reduce these types of failure and hence lead to more reproducible science [5].

Overgeneralization failures

In addition to methodological failures, there are failures of theorizing and precedent that originate from overgeneralization, the causes of which are complex. For example, overgeneralization can occur when the experimenter has a limited understanding of the parameter space of a discovery. Many natural phenomena, and living systems in particular, are governed by a complex network of causal variables and their interconnections. Without a clear understanding of this complex parameter space, researchers may overlook variables that affect the phenomenon of interest. In such a case, they may overgeneralize in modeling the phenomenon by assuming it will manifest across a range of conditions. This increases the risk that a repetition of the experiment will not produce the same results, because hidden or overlooked variables cause a different outcome. Such overgeneralization failures could be called "failures of modeling". Overgeneralization failures of this form can be found across all sciences, from biology to psychology and even in mathematics [6].

By way of example, assume that biologists have discovered a particular enzymatic activity for a protein and that other researchers are not able to reproduce this finding. One reason could be any of the methodological failures discussed above. Another reason, however, could be a lack of understanding of the factors responsible for the protein's behavior. It could, for instance, depend on a yet unidentified co-factor: The first group coincidentally had this co-factor present in their sample, whereas the second group did not. This means the latter will not be able to reproduce the results of the former. The problem is not faulty methods or defective materials, but an overgeneralization of the model regarding the conditions under which the protein is active. Once researchers know about and understand this additional parameter—through more research—they can overcome the problem by explicitly including this co-factor in their model.
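The co-factor scenario can be made concrete with a short sketch (the enzyme, the numbers, and the two labs are hypothetical): the true data-generating process depends on a co-factor that neither group's model includes, so the two labs obtain discrepant results even though neither makes any methodological error.

```python
# Illustrative sketch (hypothetical numbers): an overgeneralized model of
# enzyme activity omits a co-factor that the enzyme in fact requires.
import numpy as np

rng = np.random.default_rng(1)

def measured_activity(cofactor_present, n_replicates=5):
    """True data-generating process: activity requires the co-factor.
    The labs' (overgeneralized) model assumes activity is unconditional."""
    baseline = 10.0 if cofactor_present else 0.5   # arbitrary units
    return baseline + rng.normal(0, 0.3, n_replicates)  # measurement noise

# Lab A's reagents happen to contain the co-factor; Lab B's do not.
# Neither lab knows the co-factor matters, so it is neither reported
# nor controlled.
lab_a = measured_activity(cofactor_present=True)
lab_b = measured_activity(cofactor_present=False)

print(f"Lab A mean activity: {lab_a.mean():.1f}")  # ~10: "enzyme is active"
print(f"Lab B mean activity: {lab_b.mean():.1f}")  # ~0.5: "replication failed"
# The failure is one of modeling, not of method: once the co-factor is
# identified and added to the model (and the protocol), the discrepancy
# disappears.
```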
Another flavor of overgeneralization failure is based on flawed preconceptions about the nature of the phenomenon under study, or what might be called "failures of presupposition". In the first case, researchers work with a partial model that lacks relevant factors and their connections; in this second case, researchers assume their target is something that it is not. Importantly, researchers working with such flawed assumptions are likely to miss or ignore relevant factors and parameters.

Consider again protein biology as a case study. A standard view during the 20th century was that all proteins have a stable three-dimensional structure under physiological conditions. Over the past few decades, it has become clear that this fundamental assumption is incorrect; a newly discovered class of "intrinsically disordered proteins" (IDPs) displays varying degrees of disorder under physiological conditions even when they are in their active state [7]. The openness and dynamic nature of these polypeptides have practical consequences, especially for measuring protein concentrations in solution. Different assays—such as the Bradford assay or ultraviolet absorbance—rely, in part, on the assumption that all proteins are folded. However, because this does not apply to IDPs, there is a direct impact on the accuracy of these assays [8]. If different research groups use different assays, they might systematically under- or overestimate the amount of protein in a sample, which can lead to a lack of replication. The problem is not that researchers use faulty materials or protocols; it is the flawed presupposition that all proteins are folded that ultimately leads them to miss crucial parameters, such as different responses to different assays.
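A back-of-the-envelope sketch illustrates the point (the bias factors below are invented for illustration, not measured values): if two groups quantify the same IDP sample with assays that are differently mis-calibrated for disordered chains, any derived quantity normalized to protein concentration, such as specific activity, will diverge systematically between the groups.

```python
# Illustrative sketch (bias factors are invented, not measured values):
# two labs quantify the same IDP sample with different assays before
# computing a specific activity (activity per amount of protein).
true_conc = 1.0          # mg/ml, actual protein concentration
true_activity = 50.0     # units/ml, actual enzymatic activity

# Hypothetical assay responses for an IDP: each assay is calibrated for
# a "typical" folded protein, so its readout is off for a disordered one.
assay_bias = {"Bradford": 0.6, "UV absorbance (A280)": 0.8}

for assay, bias in assay_bias.items():
    apparent_conc = true_conc * bias
    specific_activity = true_activity / apparent_conc
    print(f"{assay}: apparent conc {apparent_conc:.2f} mg/ml -> "
          f"specific activity {specific_activity:.1f} units/mg")

# The two labs report different specific activities (~83 vs ~62 units/mg)
# for the same sample -- an apparent replication failure caused by the
# presupposition that all proteins respond like folded ones.
```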
Other forms of overgeneralization exist. One is related to publishing pressures: Choices made by authors to make their results sound more impactful, or the omission of important qualifications to meet word-count restrictions, can similarly lead to overgeneralization. These, however, can often be spotted and dealt with during peer review.

Connecting failures

The above examples highlight important connections between different types of failures in scientific practice. For example, overgeneralization failures related to modeling are connected to methodological failures: If researchers have a limited understanding of the parameter space of a discovery, they are at risk of using sub-optimal experimental designs or techniques simply because they do not know which parameters should be considered and controlled.

Importantly, many overgeneralization failures cannot be eliminated, because they are caused by a lack of knowledge that is typical of normal research. Scientists are usually moving at (or beyond) the edge of knowledge, where they have an incomplete understanding of the parameter space for a particular observation or phenomenon of interest. Science is an epistemically risky enterprise, and necessarily so. If researchers knew everything already, there would be no point in doing further investigation. Identifying and addressing overgeneralization failures is therefore one of the ways in which science progresses [6].

The connections between overgeneralization and methodological failures imply that failures of precedent cannot be eradicated simply by installing stricter quality controls or new incentive structures. The methodological failures of scientists are not just a consequence of flawed science policy; many are fueled by the nature of the research process itself and its inherent epistemic limitations. But this is not to say that these failures cannot be mitigated; even if they cannot be eliminated, measures can be put in place to counteract their potential negative effects. Overgeneralization failures can be addressed by more research into the parameter space of a phenomenon, such as seeking convergence between disparate lines of evidence from different approaches [9]. This is a time-consuming process that requires the integration of different studies and perspectives on the same phenomenon [10]. Therefore, the measures needed to deal with scientific failure include not only better quality control or reducing the pressure to publish, but also a change in how research priorities are established and how funding structures are implemented.

The importance of failure

We propose here a framework for the types of failures inherent to scientific practice. Further research is needed to better understand its contours, given that identifying these different types of failures and their connections is essential for responding to the replication crisis. Our analysis has shown that while existing or proposed changes in research policy are needed to address methodological issues, overgeneralization failures are different in kind and difficult to eliminate. These failures require other measures, such as longitudinal research studies that combine different methods, to generate stable, reproducible insights into a particular phenomenon and its parameter space. Funding and policy debates must take these additional types of failures into account when designing measures or implementing incentives to improve scientific inquiry.

References

1. Firestein S (2015) Failure: Why Science Is So Successful. Oxford: Oxford University Press
2. John LK, Loewenstein G, Prelec D (2012) Measuring the prevalence of questionable research practices with incentives for truth telling. Psychol Sci 23: 524–532
3. Begley CG (2013) Reproducibility: six red flags for suspect work. Nature 497: 433–434
4. Guttinger S (2018) Replications everywhere: why the replication crisis might be less severe than it seems at first. BioEssays 40: e1800055
5. Munafò MR, Nosek BA, Bishop DVM, Button KS, Chambers CD, Percie du Sert N, Simonsohn U, Wagenmakers E-J, Ware JJ, Ioannidis JPA (2017) A manifesto for reproducible science. Nat Hum Behav 1: 0021
6. Redish AD, Kummerfeld E, Morris RL, Love AC (2018) Opinion: Reproducibility failures are essential to scientific inquiry. Proc Natl Acad Sci USA 115: 5042–5046
7. Uversky VN (2016) Dancing protein clouds: the strange biology and chaotic physics of intrinsically disordered proteins. J Biol Chem 291: 6681–6688
8. Contreras Martos S, Nguyen HN, Nguyen PN, Hristozova N, Macossay-Castillo M, Kovacs D, Bekesi A, Oemig JS, Maes D, Pauwels K et al (2018) Quantification of intrinsically disordered proteins: a problem not fully appreciated. Front Mol Biosci 5: 83
9. Munafò MR, Smith GD (2018) Repeating experiments is not enough. Nature 553: 399–401
10. Kuorikoski J, Marchionni C (2016) Evidential diversity and the triangulation of phenomena. Philos Sci 83: 227–247