Article Open access Peer-reviewed

Continuous improvement and adaptation of predictive models in smart manufacturing and model management

2021; Institution of Engineering and Technology; Volume: 3; Issue: 1; Language: English

10.1049/cim2.12009

ISSN

2516-8398

Authors

Florian Bachinger, Gabriel Kronberger, Michael Affenzeller

Topic(s)

Air Quality Monitoring and Forecasting

Abstract

IET Collaborative Intelligent Manufacturing, Volume 3, Issue 1, pp. 48-63. ORIGINAL RESEARCH PAPER, Open Access.

Continuous improvement and adaptation of predictive models in smart manufacturing and model management

Florian Bachinger (corresponding author, florian.bachinger@fh-hagenberg.at): Josef Ressel Center for Symbolic Regression, University of Applied Sciences Upper Austria, Hagenberg, Austria; Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, Hagenberg, Austria; Institute for Application-Oriented Knowledge Processing (FAW), Johannes Kepler University, Linz, Austria. Correspondence: Florian Bachinger, Josef Ressel Center for Symbolic Regression, University of Applied Sciences Upper Austria, Softwarepark 11, Hagenberg 4232, Austria.

Gabriel Kronberger: Josef Ressel Center for Symbolic Regression, University of Applied Sciences Upper Austria, Hagenberg, Austria; Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, Hagenberg, Austria.

Michael Affenzeller: Heuristic and Evolutionary Algorithms Laboratory, University of Applied Sciences Upper Austria, Hagenberg, Austria; Institute for Formal Models and Verification, Johannes Kepler University, Linz, Austria.

First published: 23 March 2021. https://doi.org/10.1049/cim2.12009

Abstract

Predictive models are increasingly deployed within smart manufacturing for the control of industrial plants.
With this arises the need for long-term monitoring of model performance and for the adaptation of models if surrounding conditions change and the desired prediction accuracy is no longer met. The heterogeneous landscape of application scenarios, machine learning frameworks, hardware-restricted IIoT platforms, and the diversity of enterprise systems require flexible, yet stable and error-resilient solutions that allow the automated adaptation of prediction models. Recommendations are provided for the application and management of predictive models in smart manufacturing. Typical causes of concept drift in real-world smart manufacturing applications are analysed, and essential steps in data and prediction model management are highlighted to ensure reliability and efficiency in such applications. For this purpose, recommendations and a reference architecture for model management are provided. In addition, experimental results of two model adaptation strategies on an artificial dataset are shown.

1 INTRODUCTION

The ever increasing digitalisation efforts of businesses in virtually all kinds of disciplines have paved the way for an age of information technology. Over the last few decades, artificial intelligence (AI) has had its ups and downs, but recent years have seen a renaissance and a surge of novel applications in more traditional businesses. With the increasing availability of data through the application of sensor platforms, AI has arrived in the areas of industrial production and manufacturing [1]. These trends are commonly identified under the buzzwords Industry 4.0 and Smart Manufacturing.

Industry 4.0 introduces the information and telecommunications revolution to manufacturing and industrial applications. Factories utilize information networks to connect devices such as production machines and sensors over the industrial Internet of things (IIoT) and collect massive quantities of data. Local data from the production process is connected to data from enterprise resource planning (ERP), production planning (PPS) and manufacturing execution (MES) systems. Thus, human operators are equipped with up-to-date information about multiple plants at their fingertips, allowing educated decision-making in their planning to achieve globally optimal solutions for the whole system [1-4].

Smart Manufacturing is a subset of the Industry 4.0 concept which applies AI methods, big data analytics, machine learning, and simulations to optimize the manufacturing process. Detailed simulations of plants are used as a safe testing environment for the evaluation of machine learning prediction models. The need for safe testing environments is emphasised by the example of robots or autonomous transport vehicles using reinforcement learning. This approach requires that autonomous robots, or learning agents in general, operate within a safe environment where their possible actions and effects can be explored and observed, respectively. However, free exploration of possibly sub-optimal actions in real-world manufacturing scenarios might lead to significant failures in the plant. The creation of highly detailed simulation models of the production system, the so-called digital twins, is therefore essential for the evaluation of control policies learned by the agent [4, 5]. To achieve this high level of detail in simulations, individual models, including, for example, kinetics models, process models or deformation models, are often combined and applied together to fully encompass the manufacturing process.
Such chained, interconnected models effectively provide a digital twin of the production process, but raise new challenges in the management of these individual heterogeneous models. Individual models can either be derived from first principles, when the causal effects within a system are fully known, or created by machine learning methods based solely on data. Herein, we focus on the management of predictive models used for the control of cyber-physical systems in smart manufacturing environments [6] that combine multiple heterogeneous models to create a digital twin of the production process. We employ a practitioner's approach to the problem of model adaptation, necessitated by the occurrence of concept drift, by proposing additional steps and tools in data and model management to implement machine-learning-framework-agnostic continuous model improvement.

In Section 2 we discuss literature related to the area of model management. Section 3 details the role of prediction models used for control in smart manufacturing; we discuss concept drift, highlight the importance of continuous model monitoring and adaptation, and describe a model management system for smart manufacturing. Section 4 provides a characteristic smart production process, which requires multiple interconnected models of individual components to simulate and control the entire production process. Based on this process we discuss common problems and analyse possible causes of concept drift in industrial applications. In Section 5 we showcase two different methods of model adaptation and show how model management systems can help in the continuous improvement and adaptation of prediction models. This example serves as motivation for Section 6, in which we provide generalizable recommendations for continuous model improvement. Section 7 discusses existing work and available tools in the area of data and model management for machine learning applications which can implement the aforementioned recommendations. Finally, we draw conclusions in Section 8.

2 PRIOR WORK

Kumar et al. [7] identify end-to-end machine learning as one of the open problems in their review of data management in machine learning. Related work has led to systems such as MLflow [8], ModelDB [9] or DVC [10], which support the machine learning workflow by keeping track of experiments, algorithm parameters, models and metrics. However, these systems provide no integration of enterprise knowledge systems, model monitoring or automated model adaptation. Bharadwaj et al. [11, 12] describe DataHub, a system for the centralised, versioned storage of datasets that provides a UI for the visualisation of changes between data versions. This is an essential prerequisite for retrospective error analysis, but it would additionally require connecting dataset versions to model versions, which is not directly supported in DataHub. Ease.ml/ci [13] is an advanced model evaluation system that builds upon a continuous integration platform and allows the user to define unit tests for machine learning experiments. Such tests can ensure that models improve over their predecessors in predictive accuracy. Clipper [14], Velox [15] and TensorFlow-Serving [16] are centralised services which provide the execution environment for individual models and provide APIs to request predictions. Clipper even supports heterogeneous execution environments for models through containerisation.
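To make the experiment-tracking functionality of such systems concrete, the following minimal sketch logs parameters, an evaluation metric and the resulting model for a single run using MLflow's tracking API. The dataset and model choice are synthetic placeholders and are not taken from the paper.

```python
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in for recorded process data (inputs X, prediction target y).
rng = np.random.default_rng(seed=1)
X = rng.normal(size=(500, 4))
y = X @ np.array([1.5, -2.0, 0.5, 0.0]) + rng.normal(scale=0.1, size=500)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

with mlflow.start_run(run_name="virtual-sensor-baseline"):
    n_estimators = 200
    mlflow.log_param("n_estimators", n_estimators)      # algorithm parameter
    model = RandomForestRegressor(n_estimators=n_estimators).fit(X_train, y_train)
    rmse = float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))
    mlflow.log_metric("rmse", rmse)                      # evaluation metric
    mlflow.sklearn.log_model(model, "model")             # versioned model artefact
```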
As an alternative to centralised prediction serving, models can also be deployed directly onto the target systems which require model predictions. Kriens et al. [17] propose a definition of model metadata that describes model capabilities and software dependencies to allow the provisioning of execution environments for models. Recent efforts towards model management systems for smart manufacturing [18-22] try to apply these findings in an industrial setting. Weber et al. [20] state that current model management systems, when applied in smart manufacturing environments, lack the integration of external knowledge bases to provide domain context and lack proper mechanisms to retire models at the end of their lifetime. We see additional challenges motivated by the complex nature of production processes, which require interconnected and interdependent prediction models created for individual components and subsequent monitoring and model adaptation to react to concept drift.

3 APPLICATION OF PREDICTION MODELS IN SMART MANUFACTURING

The following subsections discuss possible application scenarios of prediction models in smart manufacturing. We describe the so-called concept drift effect and list aspects of model monitoring and model adaptation which facilitate continuous model improvement. Additionally, we highlight aspects of machine learning that are important for the handling of model adaptation and are addressed in the recommendations for data preparation and storage in Section 7.1. In this section we provide arguments and examples to motivate the introduction of model management systems in smart manufacturing environments to enable long-term monitoring and continuous improvement of prediction models.

3.1 Predictive models in smart manufacturing

The applications of predictive models in smart manufacturing environments are manifold. Models can be employed, for example, to predict customer preferences and behaviour [23], in predictive maintenance strategies to reduce downtimes and cost [24], to control complex machine state dynamics in manufacturing, or to predict product quality [25]. In this analysis, we focus on the application of predictive control using machine learning.

3.1.1 Predictive models as virtual sensors

In predictive control of production plants, as illustrated in Figure 1, the output of the predictive model is used as an additional input of the control algorithm, which in turn operates the plant. In an online setting, the output of the plant is fed back into the model, creating a learning or self-adapting system. This generalised illustration also covers, for example, predictive maintenance scenarios where the model's prediction is used by the controller to optimally schedule the required plant downtime for maintenance tasks. One frequent application scenario of prediction models is the so-called virtual sensor, wherein the model is trained on target features that were recorded by physical sensors in an offline phase. The physical sensor is subsequently replaced by the predictive model in the online phase of the plant. Virtual sensors can be used to replace costly or impractical physical sensors or to provide insights in cases where the application of a physical sensor is unfeasible in production. For example, semiconductor manufacturing applies so-called virtual metrology to predict product properties based on process variables and logistic information on the production that is available without further cost [25].
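A minimal sketch of this information flow (cf. Figure 1) is given below; all component names, coefficients and setpoints are hypothetical placeholders rather than the models used in the paper.

```python
from dataclasses import dataclass

@dataclass
class PlantState:
    """Observable process variables; the fields are illustrative."""
    temperature: float
    pressure: float
    feed_rate: float

def virtual_sensor(state: PlantState) -> float:
    """Stand-in prediction model replacing a physical quality sensor.
    A trained model would be loaded here; a linear surrogate keeps the sketch runnable."""
    return 0.8 * state.temperature - 0.1 * state.pressure + 0.05 * state.feed_rate

def controller(predicted_quality: float, setpoint: float) -> float:
    """Simple proportional control action derived from the model output."""
    return 0.5 * (setpoint - predicted_quality)

# One pass through the loop; in the online setting the plant's response would be
# fed back to the model (the dashed feedback loop in Figure 1).
state = PlantState(temperature=180.0, pressure=2.1, feed_rate=12.0)
action = controller(virtual_sensor(state), setpoint=150.0)
print(f"control action: {action:.2f}")
```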
FIGURE 1 Information flow diagram for control of a plant using prediction models. The optional feedback loop, displayed as a dashed line, differentiates between the online and offline setting.

3.1.2 Pre-processing and information on the data as part of the model

Machine learning methods and the resulting data-driven predictive models only yield valid predictions if input data ranges or values are equal or similar to the values seen during the building phase [26], that is, as long as there are no changes in the environment which affect the model's, the controller's or the plant's behaviour. Outside these tested and validated parameter ranges, high uncertainty or erroneous predictions could lead to incorrect controller outputs, unstable processes and, in the worst case, significant failure of the plant. To prevent these scenarios, the ranges of input data and the certainty of model predictions need to be monitored, and models have to be adapted when necessary. Thus, any pre-processing applied to the training dataset, such as feature engineering, null value replacement, resampling or transformations, must be applied identically to training, test and online data. Consequently, pre-processing code and the valid ranges of input data must be considered part of the prediction model and have to be integrated into any model management system.

3.2 Concept drift in smart manufacturing

Concept drift refers to a change of the modelled process or system. In the literature, concept drift is often characterised based on simplified two-dimensional examples showing abrupt, gradual, incremental or reoccurring drift [27]. Webb et al. [28] extend this base characterisation with a taxonomy of concept drift types and provide a further classification of concept drift. Models have to be adapted, retuned or retrained to accommodate the change in concept of the target system. Gama et al. distinguish between blind adaptation, where models adapt to drift implicitly by learning on a sliding window, and informed adaptation, where concept drift is flagged by some detector. Informed adaptation is further divided into global replacement, where the full model is discarded and a new model is created, and local replacement, where models can be retrained or adapted to react to local changes in the data [27]. Regardless of the type of concept drift and the model adaptation method, the change inside the retrained or new model must be tracked by a model management system. Comparison of model versions and of the changes after model adaptation might provide additional insights about the target system.

3.2.1 Detection of drift

Concept drift can be detected by monitoring the predictive accuracy of models. However, this requires a comparison of predictions to the actual values of the modelled variables. In applications where models are used as virtual sensors, this is not easily possible because the actual values are not measured. In such cases, concept drift can be detected only indirectly by monitoring observable variables (inputs, internal state, outputs) of the process. The deterioration of predictive accuracy caused by concept drift subsequently affects the process and leads to sub-optimal processing conditions. Deviations in observable variables, relative to the values when the model was initially deployed, are indicative of concept drift.
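A minimal sketch, assuming scikit-learn, illustrates both points: pre-processing is stored together with the model so it is applied identically online, and the input ranges observed during training are kept as part of the model artefact so that deviating online inputs can be flagged. All data and variable names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic training data standing in for recorded process variables.
rng = np.random.default_rng(seed=2)
X_train = rng.normal(loc=[180.0, 2.0, 12.0], scale=[5.0, 0.2, 1.0], size=(300, 3))
y_train = X_train @ np.array([0.8, -3.0, 0.5]) + rng.normal(scale=0.5, size=300)

# Pre-processing (scaling) is bundled with the model so that training, test and
# online data pass through exactly the same transformation.
model = Pipeline([("scale", StandardScaler()), ("reg", Ridge())]).fit(X_train, y_train)

# Valid input ranges observed during training become part of the model artefact.
valid_ranges = np.stack([X_train.min(axis=0), X_train.max(axis=0)])

def predict_with_range_check(x: np.ndarray) -> tuple[float, bool]:
    """Return the prediction and a flag indicating whether x lies inside the trained ranges."""
    inside = bool(np.all((x >= valid_ranges[0]) & (x <= valid_ranges[1])))
    return float(model.predict(x.reshape(1, -1))[0]), inside

pred, in_range = predict_with_range_check(np.array([200.0, 2.1, 12.5]))  # temperature out of range
print(pred, in_range)
```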
For example, changes in the quality of input materials caused by upstream processes, or tool wear, often have an effect on the process that can be observed even though there is no data on the root cause of the problem. Observables that should be monitored in a smart production environment are, for example, the frequency and duration of down-times, failure rates, or product quality (often recorded only second-hand via the rate of product returns or complaints per production batch). Information about such observables is often only available in external enterprise systems such as ERP, PPS or MES. The integration of these external systems into the model management system, as triggers or as an information base, is imperative to implement automated model adaptation strategies.

3.2.2 Effects of concept drift in smart manufacturing

Concept drift occurs in almost all smart manufacturing applications. The reason is that production plants have a long lifetime over which many changes in upstream and downstream processes occur naturally. Potential causes are, for example, changing suppliers of parts or input materials, improved upstream processing leading to changed input material quality, changes to the product mix as a reaction to customer demands, a changed frequency and schedule of tool maintenance, or changing suppliers for spare parts or consumables. It is clear that machine learning models must adapt to such changes as well. We cannot assume that there is a single best model for the application which can be kept as is over a longer duration in which such changes occur. Instead, it is necessary to assume a continuously changing environment and to prepare infrastructure and processes for monitoring and iterative or continuous adaptation of machine learning models. This makes it possible to build increasingly better machine learning models as more and more data about the process and its environment is collected.

In applications, we often encounter systems of interrelated models, whereby individual models capture individual entities or effects in the modelled system or process (cf. Section 4.1). Concept drift might affect only certain parts of the system and necessitate updates of only selected models. Ideally, the interfaces between the individual models are well-defined and stable, so that updates of individual models are possible. However, we also require validation strategies to ensure that input data ranges, prediction accuracy and prediction ranges remain compatible with all interconnected models. Correspondingly, over time many different versions of predictive models are created, each representing the best model for a certain component and point in time, and these versions must be managed (cf. Section 3.3) to ensure the continuous improvement of predictive models.

3.3 Model management systems for smart manufacturing

In the literature discussing concept drift, model management refers to ensemble learning methods, wherein a concept drift adaptation algorithm employs multiple individually weighted models to react to changing concepts [27]. In this work, however, we use the term model management to refer to end-to-end machine learning lifecycle systems [7]. Model management systems (MMS) (i) facilitate the monitoring of machine learning experiments, (ii) store machine learning artefacts such as data, scripts, resulting models and evaluation metrics, and (iii) are, in many cases, agnostic with regard to the machine learning framework or algorithms.
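The following sketch illustrates the kind of record such a system might keep per model version, covering points (i)-(iii) above; the fields are our own illustration and not a prescribed schema.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class ModelRecord:
    """Illustrative per-version entry in a model management store."""
    model_id: str
    version: int
    framework: str                      # e.g. "scikit-learn", "tensorflow", symbolic regression
    dataset_version: str                # link to the exact training data snapshot
    preprocessing_ref: str              # reference to the pre-processing code that belongs to the model
    valid_input_ranges: dict[str, tuple[float, float]]
    metrics: dict[str, float]           # evaluation metrics, e.g. {"rmse": 0.42}
    created_at: datetime = field(default_factory=datetime.utcnow)
    deployed_to: list[str] = field(default_factory=list)   # edge devices currently serving this version

record = ModelRecord(
    model_id="virtual-sensor-quality",
    version=3,
    framework="scikit-learn",
    dataset_version="2021-02-batch-17",
    preprocessing_ref="preprocess.py@a1b2c3",
    valid_input_ranges={"temperature": (160.0, 200.0)},
    metrics={"rmse": 0.42},
)
```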
However, such systems typically target data scientists and cover only the machine learning lifecycle of data preparation, training and, sometimes, model deployment in the form of prediction serving.

3.3.1 Model selection and evaluation

Typical MMS, as detailed in Sections 6.2 and 7.2, allow the management of machine learning experiments and the monitoring of predictive model quality over multiple experiment iterations and model versions. However, during the selection of a suitable prediction model for an application scenario, predictive accuracy is only one of multiple relevant metrics. In practice, model interpretability or explainability [29], the complexity of implementation and deployment of models, and the flexibility to combine multiple different models are important as well. Therefore, modules for inspection and validation, as well as for the automatic deployment of machine learning models, are essential components of MMS [30]. Ideally, it should be easy to verify that a predictive model works as expected, and the deployment should be almost effortless. This makes it possible to iterate quickly and deploy improved models frequently. Additionally, safety mechanisms are required to ensure that model deployments are successful and to allow rollback to previous model versions when problems occur.

3.3.2 Model extension and adaptation

For example, assume that the MMS is notified by an external knowledge source about the introduction of a new product type, which requires a specific adaptation of the processing recipe. Here we would like to keep the model which works well for the existing product types and simply extend it to support the new product type as well. With white-box models, such extensions are usually easier to accomplish. White-box models can be inspected, validated and verified by domain experts [31]. Correspondingly, they can be extended easily to add model capabilities. For black-box models, the idea of transfer learning can be used for such adaptations [26], whereby a model with its optimised parameters is re-trained with an extended dataset. Such re-training usually requires less computational effort than fully training a new model from scratch. Therefore, the MMS must be aware of different model types and select appropriate adaptation methods for the occurring changes. In the case of white-box models, changes over multiple versions can be easily tracked by the MMS and differences between model versions can be highlighted. It is possible to document why specific parts of the model have been added or changed in accordance with changes that occurred in the data, process or system. These analysis and documentation capabilities also increase the trust of human operators in model predictions.

3.3.3 Architecture of an MMS for smart manufacturing

Our extended system architecture for MMS is based upon existing MMS designs as described in the literature. An MMS utilizes machine learning algorithms and persists machine learning artefacts such as data, scripts and models in an integrated store. To such an existing system, we add capabilities for continuous model adaptation and for the assurance of functional safety in smart manufacturing environments. Figure 2 illustrates the system architecture, wherein the MMS deploys prediction models to edge devices located at the plants and subsequently monitors their performance. Computations for model adaptation can be executed directly on the edge when the devices are powerful enough. Alternatively, the necessary computation can be executed on the MMS.
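As an illustration of an adaptation step that is cheap enough to run directly on an edge device, the following sketch warm-starts an existing model on newly arrived data instead of training from scratch; scikit-learn's partial_fit is used here as a stand-in for whatever incremental-update mechanism the deployed framework offers, and all data is synthetic.

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(seed=3)

# Initial offline training on historical process data.
X_hist = rng.normal(size=(1000, 3))
y_hist = X_hist @ np.array([1.0, -0.5, 0.3]) + rng.normal(scale=0.1, size=1000)
scaler = StandardScaler().fit(X_hist)
model = SGDRegressor(random_state=3).fit(scaler.transform(X_hist), y_hist)

# Later, a small batch of new data arrives at the edge; the underlying relationship
# has shifted slightly (simulated concept drift), and the model is adapted incrementally
# starting from its current, already optimised parameters.
X_new = rng.normal(size=(50, 3))
y_new = X_new @ np.array([1.2, -0.5, 0.3]) + rng.normal(scale=0.1, size=50)
model.partial_fit(scaler.transform(X_new), y_new)   # warm-started update, no full retraining
```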
When model adaptation is executed on the edge, the edge devices send the results of the adaptation (i.e. extended datasets and updated prediction models) back to the MMS. The MMS serves multiple plants, and new data trickles in row- or batch-wise through IIoT platforms and protocols such as MQTT. On the arrival of new data, the predictive accuracy of the models is evaluated, which might trigger model adaptation. A typical sequence of model adaptation using such an architecture is shown in Figure 3, whereby the model validation component runs simulations with the updated models to ensure the functional safety of the updated model prior to deployment, similar to an automated software test.

FIGURE 2 Architecture for the integration of MMS in smart manufacturing environments.

FIGURE 3 Exemplary sequence of steps executed by the components of the MMS when new data arrives and the need to adapt the model is detected.

4 EXEMPLARY SMART PRODUCTION PROCESS

In this section, we describe an exemplary smart production process, comparable to real-life applications, that includes multiple interconnected, heterogeneous machine learning models and complex interactions of different input and environment variables. It serves as an example for the analysis of possible causes of concept drift and motivates the subsequent recommendations for handling these cases in Section 6. The example highlights some of the challenges that present themselves in the application of predictive models in smart manufacturing scenarios, especially in the area of interconnected models and the resulting requirements for model evaluation and compatibility.

4.1 Process description

The machine of this exemplary process is a continuous production machine. Work pieces are fed into one end and go through a multi-step production process in which they are scanned, heat treated and cooled down, resulting in a continuous stream of final output products, as visualised in Figure 4.

FIGURE 4 Individual components of an exemplary heat treating process for work pieces. Prediction models are combined into a digital twin of the whole system and allow simulation of the full process and optimisation of key parameters.

Optimal parameters for the control of the plant are primarily dictated by the composition or type of the input material and the shape of the work piece. In this exemplary process, these properties and dimensions are not known beforehand. The machine contains physical sensors that scan the work piece, and a machine learning prediction model is applied to classify the input work piece and identify material properties such as its composition. As visualised in Figure 4, the model passes this classification and the associated heat transfer coefficient to the heating simulation. Simultaneously, a camera and an object detection model provide the exact work piece dimensions.

The complete heat treating process of the production machine could be simulated accurately by a finite element analysis (FEA) model. However, the computational resources required to complete a detailed simulation in time are not available on the edge device of the IIoT platform due to the restricted hardware of the controller system. Instead, as displayed in Figure 4, this part is subdivided into the heating simulation and the heat distribution model. The heating simulation calculates a quick, less accurate, low-resolution run.
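A toy sketch of this subdivision, with entirely made-up physics: a cheap low-resolution heating simulation produces a coarse temperature profile, and a data-driven refinement step (standing in for the heat distribution model described next) upsamples and corrects it.

```python
import numpy as np

def coarse_heating_simulation(heat_transfer_coeff: float, steps: int = 10) -> np.ndarray:
    """Quick, low-resolution surrogate of the expensive FEA run (illustrative physics)."""
    t = np.linspace(0.0, 1.0, steps)
    ambient, target = 20.0, 800.0
    return target - (target - ambient) * np.exp(-heat_transfer_coeff * t)

def refine_heat_distribution(coarse_profile: np.ndarray, piece_thickness_mm: float) -> np.ndarray:
    """Stand-in for the learned heat distribution model: upsamples the coarse profile
    and applies a thickness-dependent correction (a trained model would be used here)."""
    fine_t = np.linspace(0.0, 1.0, 10 * len(coarse_profile))
    upsampled = np.interp(fine_t, np.linspace(0.0, 1.0, len(coarse_profile)), coarse_profile)
    return upsampled * (1.0 - 0.002 * piece_thickness_mm)

profile = refine_heat_distribution(coarse_heating_simulation(heat_transfer_coeff=3.5),
                                   piece_thickness_mm=12.0)
print(f"predicted temperature at the end of heating: {profile[-1]:.1f} °C")
```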
The subsequent heat distribution model uses the low-resolution heating simulation and the work piece dimensions to predict the full heat distribution. In the online control setting of this physical production process, the heat distribution model receives the current surface temperature of the work piece as feedback and is therefore able to adapt its prediction. This can be used to learn product characteristics such as the heating coefficient online, even for novel materials. A subsequent quality model estimates the target product characteristics by combining information about the heat distribution, the material composition and the production recipe.

A human operator is tasked with supervising the machine and is able to adjust machine and process parameters. The operator is supported by the system's quality controller component. This controller tunes parameters of the production recipe, taking into account the product characteristics predicted by the quality model and the energy efficiency predicted by the cost model. The operator keeps the machine running at minimal energy consumption and high material throughput, whilst adhering to the defined quality requirements for the product. An experienced operator is able to judge the output quality by inspecting product samples and to change control parameters on the fly based on this inspection. The machine has a high number of configuration parameters with wide ranges of valid values and no single optimal combination. Configuring the machine's parameters can therefore be categorised as a multi-objective and multi-modal problem. The operator's choice is often made by habit and personal preference.

We assume that, in our example, training data for the quality model and the cost model are recorded in offline test runs under controlled parameters, where the individual effects of heat distribution and material properties could be systematically observed. In the subsequent online application, the quality controller is able to react to feedback from the operator but is not able to observe the output quality directly, thus complicating concept drift detection.
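To illustrate the multi-objective trade-off the quality controller navigates, the following toy sketch scalarises a predicted quality score and a predicted energy cost over a grid of candidate recipe temperatures; the response curves, weights and parameter ranges are invented for illustration and do not correspond to the models in the paper.

```python
import numpy as np

def predicted_quality(recipe_temp: float) -> float:
    """Stand-in for the quality model (illustrative response curve)."""
    return 1.0 - 0.0004 * (recipe_temp - 820.0) ** 2

def predicted_energy_cost(recipe_temp: float) -> float:
    """Stand-in for the cost model (energy consumption grows with temperature)."""
    return 0.01 * recipe_temp

def choose_recipe(quality_weight: float = 0.7, cost_weight: float = 0.3) -> float:
    """Scalarise the two objectives and pick the best candidate recipe temperature."""
    candidates = np.linspace(750.0, 900.0, 151)
    scores = (quality_weight * np.array([predicted_quality(t) for t in candidates])
              - cost_weight * np.array([predicted_energy_cost(t) for t in candidates]))
    return float(candidates[int(np.argmax(scores))])

print(f"suggested recipe temperature: {choose_recipe():.1f} °C")
```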

Reference(s)