Creating of an Algorithm for Analyzing Data from Industrial Equipment
2024; Springer Science+Business Media; Language: English
DOI: 10.1007/978-3-031-51127-1_34
ISSN: 1876-1119
Authors: Ilya Kovalev, Ramil Nezhmetdinov, Dieter Kalin
Topic(s): Economic and Technological Systems Analysis
Abstract: This article discusses an approach to developing a software module for analyzing data from process equipment. Existing systems are reviewed; they are either complex, vendor-locked solutions or simple data-collection systems that lack a module for proactive analytics and anomaly detection. The article proposes an algorithm for analyzing data from industrial equipment using machine learning. The k-means method is used to cluster anomalies in the process data, and its advantages and disadvantages are described. The system architecture is also described: a ClickHouse database accessed through a JDBC driver, Apache Airflow for managing and scheduling data processing tasks, and the PySpark and scikit-learn libraries. After the process data from the dataset (a reciprocating compressor) is pre-processed, a Random Forest model is trained on the labeled data and then classifies the data as anomalous or non-anomalous. The paper concludes with an interpretation of the results.
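The k-means step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper reports using PySpark and scikit-learn, whereas this version is plain Python so that it runs without dependencies. The feature pair (pressure, temperature), the synthetic readings, the injected outliers, the choice of initial centroids, and the 0.98 distance quantile used as a cutoff are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm on a list of equal-length tuples."""
    centroids = list(initial_centroids)
    k = len(centroids)
    labels = []
    for _ in range(max_iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break  # assignments are stable, so the algorithm has converged
        centroids = updated
    return centroids, labels


def flag_anomalies(points, centroids, quantile=0.98):
    """Flag points whose distance to the nearest centroid exceeds a quantile cutoff."""
    distances = [min(math.dist(p, c) for c in centroids) for p in points]
    cutoff = sorted(distances)[int(quantile * (len(distances) - 1))]
    return [d > cutoff for d in distances]


# Synthetic (pressure, temperature) readings for two hypothetical operating regimes.
rng = random.Random(42)
normal = [(rng.gauss(5.0, 0.2), rng.gauss(80.0, 1.0)) for _ in range(100)]
normal += [(rng.gauss(8.0, 0.2), rng.gauss(95.0, 1.0)) for _ in range(100)]
outliers = [(6.5, 120.0), (2.0, 60.0), (11.0, 70.0)]  # injected by hand
readings = normal + outliers

# Seed the centroids with one reading from each regime to keep the run deterministic.
centroids, labels = kmeans(readings, [readings[0], readings[100]])
flags = flag_anomalies(readings, centroids)

print("centroids:", centroids)
print("flagged readings:", sum(flags))
```

A quantile cutoff always flags a fixed share of the readings, so some normal points are marked alongside the injected outliers. This is one reason a supervised stage, such as the Random Forest classifier trained on labeled data that the abstract describes, is a natural follow-up. The features here are also left unscaled, so temperature dominates the distance; a real pipeline would likely standardize them first.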