Article; Open access; Peer reviewed

Accelerating Materials Development via Automation, Machine Learning, and High-Performance Computing

2018; Elsevier BV; Volume: 2; Issue: 8; Language: English

10.1016/j.joule.2018.05.009

ISSN

2542-4785

Authors

J.P. Correa-Baena, Kedar Hippalgaonkar, Jeroen van Duren, Shaffiq A. Jaffer, Vijay Chandrasekhar, Vladan Stevanović, Cyrus Wadia, Supratik Guha, Tonio Buonassisi

Topic(s)

Advanced Memory and Neural Computing

Abstract

The convergence of high-performance computing, automation, and machine learning promises to accelerate the rate of materials discovery by ≥10 times, better aligning investor and stakeholder timelines. Infrastructure and human-capital investments are discussed, including equipment capabilities, data management, education, and incentives. As our field transitions from thinking “data poor” to thinking “data rich,” we envision a scientific laboratory where the process of materials discovery continues without disruptions, aided by computational power augmenting the human mind, and freeing the latter to perform research closer to the speed of imagination, addressing societal challenges in market-relevant timeframes.

Successful materials innovations can transform society. However, materials research often involves long timelines and low success probabilities, dissuading investors who have expectations of shorter times from bench to business. A combination of emergent technologies could accelerate the pace of novel materials development by ten times or more, aligning the timelines of stakeholders (investors and researchers), markets, and the environment, while increasing return on investment. First, tool automation enables rapid experimental testing of candidate materials. Second, high-performance computing concentrates experimental bandwidth on promising compounds by predicting and inferring bulk, interface, and defect-related properties. Third, machine learning connects the former two, where experimental outputs automatically refine theory and help define next experiments. We describe state-of-the-art attempts to realize this vision and identify resource gaps. We posit that over the coming decade, this combination of tools will transform the way we perform materials research, with considerable first-mover advantages at stake.

The development of novel materials has long been stymied by a mismatch of time constants (Figure 1). Materials development typically occurs over a 15- to 25-year time horizon, sometimes requiring synthesis and characterization of millions of samples. However, corporate and government funders desire tangible results within the residency time of their leadership, typically 2–5 years. The residency time for postdocs and students in a research laboratory is usually 2–5 years; when a project outlasts the residency of a single individual, seamless continuity of motivation and intellectual property is often the exception, not the rule. Market drivers of novel materials development, informed by business competition and environmental considerations, often demand solutions within a shorter time horizon.
This mismatch in time constants results in a historically poor return on investment of energy-materials (cleantech) research relative to comparable investments in medical or software development [1: Gaddy B.E., Sivaram V., Jones T.B., Wayman L. Venture capital and cleantech: the wrong model for energy innovation. Energy Policy. 2017; 102: 385-395]. To bridge this mismatch in time horizons and increase the success rate of materials research, both public- and private-sector actors endeavor to develop new paradigms for materials development [2: Aspuru-Guzik, A., and Persson, K. (2018). Materials acceleration platform: accelerating advanced energy materials discovery by integrating high-throughput methods and artificial intelligence. Mission Innovation: Innovation Challenge 6. http://nrs.harvard.edu/urn-3:HUL.InstRepos:35164974]. The U.S. Materials Genome Initiative focused on three “missing links”: computational tools to focus experimental efforts in the most promising directions, data repositories to aggregate learnings and identify trends, and higher-throughput experimental tools [3: Jain A., Ong S.P., Hautier G., Chen W., Richards W.D., Dacek S., Cholia S., Gunter D., Skinner D., Ceder G., Persson K.A. Commentary: the materials project: a materials genome approach to accelerating materials innovation. APL Mater. 2013; 1: 011002]. This call to action was mirrored in industry and by university- and laboratory-led consortia, many focused on simulation-based inverse design and discovery and properties databases. As these tools matured, the throughput of materials prediction often vastly outstripped experimentalists' ability to screen for materials with low rates of false negatives.

Today, a new paradigm is emerging for experimental materials research, which promises to enable more rapid discovery of novel materials [4: Nosengo N. The material code: machine-learning techniques could revolutionize how materials science is done. Nature. 2016; 533: 22] [5: De Luna P., Wei J., Bengio Y., Aspuru-Guzik A., Sargent E. Use machine learning to find energy materials. Nat. Mater. 2017; 552: 23]. Figure 2 illustrates one such prototypical vision, entitled “accelerated materials development and manufacturing.” Rapid, automated feedback loops are guided by machine learning, with an emphasis on value creation through end-product and industry transfer. There is a unique opportunity today to develop these capabilities in testbed fashion, with considerable improvements in research productivity and first-mover advantages at stake.

As is often the case with convergent technologies, one observes significant advances in individual “silos” before the leveraged ensemble effect bears its full impact. A historical example is three-dimensional (3D) printing, wherein 3D computer-aided design, computer-to-hardware interface protocols, and ink-jet printing technologies evolved individually, before being combined by Prof. Ely Sachs and his MIT team into the first 3D printer. The ability to observe emergent technologies within individual silos, and assemble them into an ensemble that is greater than the sum of its parts, mirrors the challenge in novel materials development today. The following paragraphs describe the discrete, emergent innovations in “siloed” domains that are presently converging, and promise to enable this paradigm shift within the next decade.

Today, the rate of theoretical prediction vastly outstrips the rate of experimental synthesis, characterization, and validation [7: Pyzer-Knapp E.O., Suh C., Gómez-Bombarelli R., Aguilera-Iparraguirre J., Aspuru-Guzik A. What is high-throughput virtual screening? A perspective from organic materials discovery. Annu. Rev. Mater. Res. 2015; 45: 195-216]. This emergence is enabled by three trends: faster computation, more efficient and accurate theoretical approaches and simulation tools, and the ability to screen large databases quickly, such as MaterialsProject.org. To better focus limited experimental bandwidth, there is increasing interest in simulating the “how” of synthesis, not just the “what”: capturing in computer models the full complexity of environmental factors (e.g., humidity), reaction energy barriers, and kinetic limitations (so-called “non-equilibrium” synthesis) [8: US Department of Energy (2016) Basic Research Needs for Synthesis Science. Report of the Basic Energy Sciences Workshop on Basic Research Needs for Synthesis Science for Energy Relevant Technology. May 2–4, 2016]. In parallel, theorists seek to rationally design materials with combinations of properties: first, by predicting combinations of properties (e.g., chemical, microstructural, interface, surface) in one simulation framework and/or database, then connecting material predictions with device performance and reliability predictions, then extending this framework to both known and not-yet-discovered compounds, and ultimately, solving the inverse problem [9: Phillips C.L., Littlewood P. Preface: special topic on materials genome. APL Mater. 2016; 4: 053001] [10: Zunger A. Inverse design in search of materials with target functionalities. Nat. Rev. Chem. 2018; 2: 0121] [11: Roch, L.M., Häse, F., Kreisbeck, C., Lars, T.T-M., Yunker, P.E., Hein, J.E., Aspuru-Guzik, A. ChemOS: An Orchestration Software to Democratize Autonomous Discovery. https://chemrxiv.org/articles/ChemOS_An_Orchestration_Software_to_Democratize_Autonomous_Discovery/5953606].

Historically, slow vacuum-based deposition methods have inhibited materials development.
Modern vacuum-based tools, including combinatorial approaches and large-scale, fast serial deposition/reactions, enable meaningful rate increases for materials and device synthesis [12: Eid J., Liang H., Gereige I., Lee S., Van Duren J. Combinatorial study of NaF addition in CIGSe films for high efficiency solar cells. Prog. Photovolt. 2015; 23: 269-280] [13: Jeon M.K., Cooper J.S., McGinn P.J. Investigation of PtCoCr/C catalysts for methanol electro-oxidation identified by a thin film combinatorial method. J. Power Sources. 2009; 192: 391-395]. Variants of existing deposition methods (e.g., close-space sublimation) offer higher growth rates, point-defect control, and precise stoichiometry and impurity control for process-compatible materials. Solution synthesis has gained acceptance with the emergence of higher-quality precursors and materials, including CdS quantum dots, polymer solar cells, and lead-halide perovskites [7] [14: Kojima A., Teshima K., Shirai Y., Miyasaka T. Organometal halide perovskites as visible-light sensitizers for photovoltaic cells. J. Am. Chem. Soc. 2009; 131: 6050-6051]. The growing diversity of precursors (from molecular to nanoparticle), synthesis control (including solvent engineering), and thin-film synthesis methods (lab-based spin coating to industrially compatible large-area printing) makes this a powerful and flexible platform to deposit a range of new materials. Emergence of 3D printed materials provides another ubiquitous alternative. At laboratory scale, throughputs for such rapid synthesis routes [7] [15: Graetzel M., Janssen R.A.J., Mitzi D.B., Sargent E.H. Materials interface engineering for solution-processed photovoltaics. Nature. 2012; 488: 304-312] can be up to an order of magnitude greater than vacuum-based techniques, and remain to be explored for multinary materials with novel microstructures.

With declining component costs and greater adoption of standards, the ability to rapidly combine discrete devices into components and systems in a modular and flexible manner is emerging.

Often, theoretical predictions are made for “ideal” materials systems. However, real samples contain defects (e.g., impurities, structural defects) that can harm (or, occasionally, benefit) bulk and interface properties. To mitigate the risk of defect-induced false negatives during high-throughput materials screening, it is desirable to identify classes of materials less adversely affected by defects (so-called “defect tolerant” [16: Yin W.-J., Shi T., Yan Y. Unusual defect physics in CH3NH3PbI3 perovskite solar cell absorber. Appl. Phys. Lett. 2014; 104: 063903] [17: Brandt R.E., Stevanović V., Ginley D.S., Buonassisi T. Identifying defect-tolerant semiconductors with high minority-carrier lifetimes: beyond hybrid lead halide perovskites. MRS Commun. 2015; 5: 265-275]), and rapidly diagnose and decouple the effects of defects on material performance. A notable recent example is the serendipitous discovery of lead-halide perovskites for optoelectronic applications [14] [15]. In addition to being amenable to high-throughput solution-phase deposition, lead-halide perovskites also required orders of magnitude less research effort to achieve similar performance improvements to traditional inorganic thin-film materials (Figure 3). It is suspected that the ease of improving performance owes in part to the increased defect tolerance of lead-halide perovskites, which results in improved bulk-transport properties. Determining the underlying physics of and developing design rules for defect tolerance may inform screening criteria for new materials, especially with new computational tools such as Generative Adversarial Networks that are state-of-the-art in anomaly detection [19: Zenati H., Foo C., Lecouat B., Manek G., Chandrasekhar V. Efficient GAN-based anomaly detection. arXiv. 2018; (1802.06222)] [20: Guimaraes G.L., Sanchez-Lengeling B., Outeiral C., Farias P.L.C., Aspuru-Guzik A. Objective-Reinforced Generative Adversarial Networks (ORGAN) for sequence generation models. arXiv. 2017; (1705.10843)]. The next step lies in focusing experimental effort on candidates capable of rapid performance improvements during early screening and development, and wider process tolerance in manufacturing.

In relation to the beneficial aspects of defects and impurities, recent theory advancements [21: Freysoldt C., Grabowski B., Hickel T., Neugebauer J., Kresse G., Janotti A., Van de Walle C.G. First-principles calculations for point defects in solids. Rev. Mod. Phys. 2014; 86: 253], in combination with computational tools to rapidly assess and predict solubility and electrical properties of defects [22: Goyal A., Gorai P., Peng H., Lany S., Stevanovic V. A computational framework for automation of point defect calculations. Comput. Mater. Sci. 2017; 130: 1], allow high-throughput screening of materials for applications where the desired functionality is enabled by the defects and/or dopants (e.g., thermoelectrics, transparent electronics).

Characterization tools have also benefited from high-performance computing, automation, and machine learning. For instance, one high-resolution X-ray photoelectron spectroscopy (XPS) spectrum could take an entire day with technology from the 1970s, while the same measurement today requires less than an hour. Today, advanced statistics and machine learning promise to further accelerate the rate of learning [23: Kusne A.G., Gao T., Mehta A., Ke L., Nguyen M.C., Ho K.-M., Antropov V., Wang C.-Z., Kramer M.J., Long C., Takeuchi I. On-the-fly machine-learning for high-throughput experiments: search for rare-earth-free permanent magnets. Sci. Rep. 2014; 4: 6367] [24: Iwasaki Y., Kusne A.G., Takeuchi I. Comparison of dissimilarity measures for cluster analysis of X-ray diffraction data from combinatorial libraries. NPJ Comput. Mater. 2017; 3: 4]. Tools now exist that can acquire multiple XPS spectra on a single sample (e.g., with composition gradients), and automated spectral analysis of large datasets is now possible, enabling estimation of unknown materials in a compositional map. Others seek to replace spectroscopy with rapid non-destructive testing; several bulk and interface properties can be simultaneously diagnosed by using Bayesian inference in combination with non-destructive device testing, enabling ≥10 times faster (and in certain cases, more precise) diagnosis than traditional characterization tools [25: Brandt R.E., Kurchin R.C., Steinmann V., Kitchaev D., Roat C., Levcenco S., Ceder G., Unold T., Buonassisi T. Rapid photovoltaic device characterization through Bayesian parameter estimation. Joule. 2017; 1: 843-856]. This kind of parameter estimation can be applied to finished components, devices, and systems, and has the potential to not only enable faster troubleshooting, but also to accurately estimate intrinsic material properties [26: Somnath S., Law K.J.H., Morozovska A.N., Maksymovych P., Kim Y., Lu X., Alexe M., Archibald R., Kalinin S.V., Jesse S., Vasudevan R.K. Ultrafast current imaging by Bayesian inversion. Nat. Commun. 2018; 9: 513] [27: Li L., Yang Y., Zhang D., Ye Z.-G., Jesse S., Kalinin S.V., Vasudevan R.K. Machine learning–enabled identification of material phase transitions based on experimental data: exploring collective dynamics in ferroelectric relaxors. Sci. Adv. 2018; 4: eaap8672] as well as ultimate performance potential, thus informing the decision to pursue or abandon further investment in a given candidate material even at early stages of materials screening.

Machine learning comprises a broad class of approaches, which may play several different roles in the future materials development cycle. First, a common application of machine learning is for materials selection, in which historical experimental observations are used to inform predictions of future properties (attributes) of unknown compounds, or to discover new ones [28: Ward L., Agrawal A., Choudhary A., Wolverton C. A general-purpose machine learning framework for predicting properties of inorganic materials. NPJ Comput. Mater. 2016; 2: 16028]. Such an approach has been realized to help discover novel active layers in organic solar cells [29: Lopez S.A., Sanchez-Lengeling B., de Goes Soares J., Aspuru-Guzik A. Design principles and top non-fullerene acceptor candidates for organic photovoltaics. Joule. 2017; 1: 857-870], light-emitting diodes [30: Gómez-Bombarelli R., Aguilera-Iparraguirre J., Hirzel T.D., Duvenaud D., Maclaurin D., Blood-Forsythe M.A., Chae H.S., Einzinger M., Ha D.-G., Wu T., et al. Design of efficient molecular organic light-emitting diodes by a high-throughput virtual screening and experimental approach. Nat. Mater. 2016; 15: 1120-1127], and metal alloys [31: Conduit B.D., Jones N.G., Stone H.J., Conduit G.J. Design of a nickel-base superalloy using a neural network. Mater. Des. 2017; 131: 358-365] [32: Senkov O.N., Miller J.D., Miracle D.B., Woodward C. Accelerated exploration of multi-principal element alloys with solid solution phases. Nat. Commun. 2015; 6: 6529], among many others [33: Green M.L., Choi C.L., Hattrick-Simpers J.R., Joshi A.M., Takeuchi I., Barron S.C., Campo E., Chiang T., Empedocles S., Gregoire J.M., et al. Fulfilling the promise of the materials genome initiative with high-throughput experimental methodologies. Appl. Phys. Rev. 2017; 4: 011105]. Second, machine-learning tools can help extract greater and more accurate information from diagnosis, as detailed in the previous section. Third, machine-learning tools may help close the automation loop between diagnosis and synthesis, shown in Figure 2, by reducing the degree of human intervention and reliance on heuristics. For example, when relationships between experimental inputs and diagnosis outputs can be inferred by neural networks, detailed process and device models may no longer be needed to predict outcomes and optimize processes. All three applications of machine learning to the materials development cycle benefit from the availability of more data, to train and sharpen the predictive capacity of such tools. Achieving predictability without losing physical insights is an emergent challenge and research opportunity.
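The third role described above, closing the loop between diagnosis and synthesis, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `run_experiment` is a hypothetical, noise-free stand-in for an automated synthesis-plus-diagnosis step with an invented response peaking at 150 °C; the surrogate is a simple kernel-weighted average rather than a neural network; and the exploration bonus is a heuristic, not a calibrated uncertainty.

```python
import math


def run_experiment(temperature_c):
    """Hypothetical stand-in for automated synthesis plus diagnosis.

    Returns an invented, noise-free figure of merit that peaks at 150 °C.
    """
    return 1.0 - ((temperature_c - 150.0) / 100.0) ** 2


def surrogate(history, x, length_scale=15.0):
    """Predict the outcome at x from past (input, output) pairs.

    Returns a kernel-weighted mean of past outputs and a heuristic
    uncertainty that approaches 1 far away from any past experiment.
    """
    weights = [math.exp(-((x - xi) / length_scale) ** 2) for xi, _ in history]
    total = sum(weights)
    mean = sum(w * yi for w, (_, yi) in zip(weights, history)) / (total + 1e-9)
    uncertainty = 1.0 / (1.0 + total)
    return mean, uncertainty


def choose_next(history, candidates, kappa=1.0):
    """Pick the untested candidate with the best mean-plus-bonus score."""
    tested = {xi for xi, _ in history}
    untested = [x for x in candidates if x not in tested]

    def score(x):
        mean, uncertainty = surrogate(history, x)
        return mean + kappa * uncertainty

    return max(untested, key=score)


def closed_loop(n_rounds=10):
    """Alternate between model-guided selection and (simulated) experiment."""
    candidates = [float(t) for t in range(50, 255, 5)]  # 50, 55, ..., 250 °C
    history = [(x, run_experiment(x)) for x in (50.0, 250.0)]  # two seed runs
    for _ in range(n_rounds):
        x_next = choose_next(history, candidates)
        history.append((x_next, run_experiment(x_next)))  # result feeds the model
    return max(history, key=lambda pair: pair[1])


best_temperature, best_merit = closed_loop()
print(best_temperature, best_merit)
```

In a real laboratory the call to `run_experiment` would be replaced by instrument control and measurement, and the surrogate by whichever model suits the data volume; the structure of the loop (select, run, record, refit) is the only part meant to carry over.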
Such methods may also increase learning from diagnosis, by consolidating research output in singular databases, drawing automated inferences from the data, and in the future perhaps aggregating the experience and knowledge base via natural language processing of existing research papers and materials property databases [34: Kim E., Huang K., Saunders A., McCallum A., Ceder G., Olivetti E. Materials synthesis insights from scientific literature via text extraction and machine learning. Chem. Mater. 2017; 29: 9436-9444].

Materials synthesis equipment today is becoming increasingly remotely operable, enabling research and operation by an investigator who is not physically near the deposition equipment. This opens up two related opportunities with far-reaching consequences. First, large, expensive synthesis equipment can be grouped together with massively parallel characterization equipment to form synthesis centers of the future, which are operated by remote users and researchers and managed by an on-site professional staff. Akin to the Software Cloud concept, where one's computing and data are stored across machines worldwide in a seamless manner, a Hardware Cloud would enable a user to deposit, measure, and carry out research (with real-time feedback through in situ characterization tools) across a number of networked materials-processing systems distributed nationally or internationally in a seamless manner. This also leads to the second opportunity: to be able to store, curate, access, process, and diagnose all data gathered in these networked experiments in Public or Private Clouds [3] [35: Zakutayev A., Wunder N., Schwarting M., Perkins J.D., White R., Munch K., Tumas W., Phillips C. An open experimental database for exploring inorganic materials. Sci. Data. 2018; 5: 180053] [36: White R.R., Munch K. Handling large and complex data in a photovoltaic research institution using a custom laboratory information management system. MRS Proceedings. 2014; 1654 (Mrsf13-1654-nn11-04). https://doi.org/10.1557/opl.2014.31] [37: Ren F., Ward L., Williams T., Laws K.J., Wolverton C., Hattrick-Simpers J., Mehta A. Accelerated discovery of metallic glasses through iteration of machine learning and high-throughput experiments. Sci. Adv. 2018; 4: eaaq1566] (protocols and formats for such science data collectives are discussed in the following paragraphs). This will help address two emerging issues: (1) increasing the efficient availability of data across a wide number of experiments and experimental platforms for post-analysis; and (2) making available for analysis data that indicate “what did not work”; such data are not easily available but are instrumental in the learning process, and have their own value in increasing the collective efficiency of research progress.

Realizing the vision shown in Figure 2 requires a sustained commitment over several years to develop software, hardware, and human resources, and to connect these new capabilities in testbed fashion. Given ample investment in machine-learning methods development, a pressing challenge is how to down-select and apply the most appropriate machine-learning methods to enable the “automated feedback loop” shown in Figure 2.
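One hedged way to frame the down-selection question raised above is to compare candidate models by leave-one-out cross-validation, which remains usable when only a handful of samples exist. The Python sketch below is illustrative only: the dataset is invented (an exactly linear trend), the three candidate models are deliberately simple stand-ins chosen here rather than methods named in the article, and all function names are hypothetical.

```python
def fit_mean(train):
    """Baseline model: always predict the mean of the training outputs."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y


def fit_line(train):
    """Ordinary least-squares straight line through (x, y) pairs."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx if sxx else 0.0
    return lambda x: mean_y + slope * (x - mean_x)


def fit_nearest(train):
    """1-nearest-neighbour model: copy the output of the closest input."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def loocv_mse(fit, data):
    """Leave-one-out cross-validated mean squared error of one model."""
    squared_errors = []
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # hold out sample i
        squared_errors.append((fit(train)(x) - y) ** 2)
    return sum(squared_errors) / len(squared_errors)


def down_select(models, data):
    """Return the name of the lowest-error model and all scores."""
    scores = {name: loocv_mse(fit, data) for name, fit in models.items()}
    return min(scores, key=scores.get), scores


# Invented sparse dataset: six samples following y = 2x + 1 exactly.
data = [(float(x), 2.0 * x + 1.0) for x in range(6)]
models = {"mean": fit_mean, "linear": fit_line, "nearest": fit_nearest}
best_model, scores = down_select(models, data)
print(best_model, scores)
```

Cross-validated error is only one possible selection criterion; interpretability, robustness to uncontrolled inputs, and data requirements may weigh just as heavily in practice.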
Compared with other widely recognized applications of machine learning today (e.g., vision recognition, natural language processing, and board gaming), materials research often involves sparse datasets (e.g., small sample sizes and number of experimental inputs and outputs, for training and fitting) and less well-constrained “rules” (e.g., complex physics and chemistry, non-binary inputs and outputs, large experimental errors, uncontrolled input variables, and incomplete characterization of outputs, to name a few). These realities make the typical materials science problem (e.g., layer-by-layer atomic assembly of a thin film) decidedly more complex and less well defined than a match of “Go,” where the rules and playing board are constrained. Deep machine learning (DML) appears well poised to address this complexity [38: Lecouat B., Foo C., Zenati H., Chandrasekhar V. Semi-supervised deep learning with GANs: revisiting manifold regularization. arXiv. 2018; (1805.08957)]. Computation speed can be improved by developing “pre-trained” neural networks that incorporate the underlying physics and chemistry common to materials synthesis, performance, and defects, bringing DML within reach of commonly available hardware and software. A balance must be found between achieving actionable results and inferring physical insight from “black-box” computational methods, to advance both engineering and scientific objectives, and minimize unintended consequences. There is a need to apply “white-box” (i.e., opposite of black-box) machine-learning methods to materials science problems. One possible approach may be application of semi-supervised deep learning algorithms, which learn with lots of unlabeled data and very little labeled data [39: Hutchinson M.L., Antono E., Gibbons B.M., Paradiso S., Ling J., Meredig B. Overcoming data scarcity with transfer learning. arXiv. 2017; (1711.05099)]. Lastly, the ability of machine-learning tools to adapt to uncontrolled and changing experimental conditions is essential. Promising developments include online deep learning, which builds neural networks on the fly, gradually adding neurons (e.g., as baseline experimental conditions change, or as new physics becomes dominant) [40: Ramasamy S., Rajaraman K., Krishnaswamy P., Chandrasekhar V. Online deep learning: growing RBM on the fly. arXiv. 2018; (1803.02043)].

Investment in standards governing data formatting and storage would facilitate data entry into machine-learning software. Standards embed contextual know-how, hierarchy, and rational thought. Some communities have implemented standards governing raw and processed data, e.g., crystallography, genetics, and geography. However, in most materials research communities, there are no universally accepted and implemented data standards. Several materials databases have been created, often specialized by material class or application, and with varying protocols for updating information and enforcing data hygiene. Furthermore, these databases often lack the ability to quickly and accurately predict device-relevant combinations of properties (e.g., chemical, mechanical, optoelectronic, microstructural, surface, interface). Several data standards have been proposed [41: Hill J., Mannodi-Kanakkithodi A., Ramprasad R., Meredig B. Materials data infrastructure and materials informatics. In: Shin D., Saal J. Computational Materials System Design. Springer, 2017: 193-225] [42: Ananthakrishnan R., Chard K., Foster I., Tuecke S. Globus platform-as-a-service for collaborative science applications. Concurr. Comput. 2015; 27: 290-305] [43: Chard K., Dart E., Foster I., Shifflett D., Tuecke S., Williams J. The modern research data portal: a design pattern for networked, data-intensive science. PeerJ Comput. Sci. 2018; 4: e144]; widespread uptake may hinge on pervasive adoption of data-management systems described in the next paragraph, with FAIR guiding principles in mind [44: Wilkinson M.D., Dumontier M., Aalbersberg I.J., Appleton G., Axton M., Baak A., Blomberg N., Boiten J.-W., da Silva Santos L.B., Bourne P.E., et al. The FAIR guiding principles for scientific data management and stewardship. Sci. Data. 2016; 3: 160018]. In the absence of data standards, it is possible that the burden of data aggregation will shift onto natural language processors [34], i.e., computer programs designed to extract relevant data from available media (e.g., publications, reports, presentations, and theses).

Investment in data-management tools (e.g., informatics systems) is needed to manage data obtained from lab equipment and store records, coordinate tasks, and enforce protocols. On one hand, such systems have been shown to be of high value for well-defined research problems and tool sets. For early-stage materials

Reference(s)