Indirect monitoring and early detection of faults in trains' motors

Article, open access, peer reviewed

2017; Institution of Engineering and Technology; Volume: 8; Issue: 2; Language: English

10.1049/iet-est.2017.0056

ISSN

2042-9746

Authors

Anuj Jain, Sami Barmada, Emanuele Crisostomi, Francesco Romanò, Fabrizio Tavano, Mauro Tucci

Topic(s)

Machine Fault Diagnosis Techniques

Summary

IET Electrical Systems in Transportation, Volume 8, Issue 2, p. 86-94. Research Article, Free Access.

Indirect monitoring and early detection of faults in trains' motors

Anuj Jain, Indian Institute of Technology, Roorkee, India
Sami Barmada, Department of Energy, Systems, Territory and Constructions Engineering, University of Pisa, Pisa, Italy
Emanuele Crisostomi (corresponding author, emanuele.crisostomi@unipi.it), Department of Energy, Systems, Territory and Constructions Engineering, University of Pisa, Pisa, Italy
Francesco Romano, Trenitalia S.p.A., Firenze, Italy; now with Titagarh Firema Adler S.p.A, Milano, Italy
Fabrizio Tavano, Trenitalia S.p.A., Firenze, Italy
Mauro Tucci, Department of Energy, Systems, Territory and Constructions Engineering, University of Pisa, Pisa, Italy

First published: 01 June 2018. https://doi.org/10.1049/iet-est.2017.0056. Citations: 3.

Abstract

This study investigates the ability of temperature sensors installed in the traction core of trains to detect incipient faults early. For instance, the breaking of a bearing is known to be critical as it may cause an increase of the temperature in the motor compartment, which in turn may eventually lead to a winding fault in the induction motor. The technique proposed in this contribution is characterised by extreme generality, since the most frequent incipient faults lead to a temperature increase that, if properly analysed, can be a tool for preventive maintenance. In particular, the measured data, provided by the main Italian railway company, are processed by two different methodologies, which are characterised by positive, yet different, performances. The results show that preventive maintenance with the proposed approach is feasible.

1 Introduction

Modern high-speed trains are fully equipped with sensors measuring physical quantities that are automatically recorded and sent to data acquisition centres [1]. Such data are often used to make a precise diagnosis after an event has made unscheduled maintenance necessary. An example of this, which is of interest in this paper, is the measurement of the temperature of the engines of a train.
Fast and/or out-of-range temperature variations have been recorded when a fault occurs in the engine compartment. In this case, the objective is to detect the fault early, before more serious consequences occur. This is a very challenging task, as changes in the temperature are also caused by a number of regular (non-faulty) events, including, among others, the train velocity, acceleration, outside temperature, driving direction and the engine position (relative to the direction).

In particular, a critical subsystem that needs accurate monitoring and preventive maintenance is the traction core, composed of the engine, the bearings and the gearbox. In the authors' experience, among the possible failures of these three components, the most frequent and critical one is the failure of a bearing: if it is not detected quickly, it can cause additional damage to the other components as well. This consideration is based on experience with the large fleet of E464 locomotives that are widely used for commuter services by Trenitalia S.p.A., the main railway company in Italy.

There are many non-invasive techniques to monitor the functioning of induction motors, and they usually rely on easily measured electrical or mechanical quantities, such as voltage, current, external magnetic field, speed and vibrations [2]. In particular, the literature on early fault detection in induction motors mainly focuses on stator current measurements [3-5] and vibration measurements [6, 7]. It is worth mentioning that the motors of the E464 regional trains are not equipped with the current and vibration sensors that are commonly used in high-speed trains. In contrast, E464 locomotives are equipped with temperature sensors that can record the signals during train operation.
Real-time temperatures have been investigated more rarely in traction applications, although there is a body of literature on infrared thermography for condition monitoring of induction motors [8, 9]. Since temperatures are available for monitoring the locomotives of the E464 fleet, this work investigates the predictive power of the temperature signals for detecting motor faults, in particular the failure of a bearing.

In [10-13], it is underlined that a temperature increase (thermal stress) can have negative effects on both the bearings and the motor. In particular, a high temperature can deteriorate the bearing lubrication, causing abnormal friction that may eventually lead to bearing damage and, in turn, accelerate the ageing of the winding insulation, leading to a winding fault inside the induction motor. In addition, as remarked in [14, 15] with specific reference to the condition monitoring of bearings, temperature rises can be attributed to several causes, namely winding temperature rise, motor operating speed, temperature distribution within the motor, lubricant viscosity, and the amount of lubricant. Accordingly, a rise in temperature can be caused by different factors and, at the same time, can be used as an indicator of a number of severe faults.

According to [10], about 30% of induction motor faults are stator winding faults, and a small percentage of them are directly caused by an initial fault in the insulation. If the insulation has a defect, partial discharges are created, which cause overheating that can be detected from the temperatures before a short circuit occurs; on the contrary, if a short circuit occurs suddenly without a previous temperature increase, the insulation had major defects from fabrication, and at this stage nothing can be done in terms of predictive maintenance.
It is of paramount importance for railway companies to develop the ability to perform online monitoring of the engine temperatures and preventive maintenance, in order to prevent engine faults (due to the above-mentioned components) from occurring, as they give rise to serious traffic disruptions and lead to high maintenance and rescheduling costs. In particular, we use available temperature measurements (of both the stator windings and the gearbox), recorded with a sampling time of 3 min and made remotely available on a database, to perform such a predictive monitoring analysis. We analyse the time-domain behaviour of the recorded temperatures, for both faulty and non-faulty conditions, and employ computational intelligence techniques to identify non-conventional patterns corresponding to faulty operating conditions. In this case study, simple variance-based (or fixed threshold-based) techniques fail to reveal faulty conditions, and are thus inapplicable for preventive maintenance. On the other hand, more sophisticated and dynamic techniques provide promising results (at least on the available data) and appear to allow pre-emptive measures to fix the faults without the need for online rescheduling of the transportation service.

In [16], a wireless communication-based system is proposed and the authors mention a diagnostic tool that uses engine temperatures in real-time operation, but no specific details about the model are published in that paper. In fact, the scientific literature on this specific topic is scarce, as it is difficult to find detailed studies of the algorithms and models used, in particular for the analysis of engine-temperature data in real time. One of the motivations of our paper is to begin filling this gap; in this perspective, the objective of this paper is to investigate two possible strategies and provide some promising results obtained on real data.
The techniques used in this paper are feedforward neural networks (FFNN) and Hotelling's multivariate control chart. As will be shown later, neither technique is new; however, the implementation of the procedure on such a specific set of data, the definition of the thresholds and the choice of inputs and outputs (for the FFNN) are original contributions. Moreover, a semi-supervised (one-class) method (Hotelling) and a supervised method (FFNN) are intentionally selected because they differ in both function and mechanism, which helps to differentiate the results and improve the performance of the analysis when they are used in parallel. This is a general rule of fusion of classifiers and it is confirmed by experimental results [17].

2 System description and measurements

The locomotives under study are the Trenitalia E464 class, constructed by Bombardier (http://www.bombardier.com/en/home.html), operating on regular commuter service (see Fig. 1). Each locomotive is equipped with four induction motors (see Fig. 2). All motors have (i) squirrel cage rotors, (ii) double start stators, (iii) four poles, (iv) a maximum power of 895 kW, and (v) a rated voltage of 1090 V.

Fig. 1 Trenitalia E464 class, from http://www.bombardier.com/en/home.html

Fig. 2 One of the induction motors in the locomotive

In addition, the locomotive is equipped with two inverters, each of which powers one of the two stator windings of the four motors, all connected in parallel. During the normal operation of the train, sensors are used to regularly monitor the temperature of the motor windings, the temperature of the gearbox lubricant (both acquired with standard PT100 Resistance Temperature Detectors), and the train speed (acquired through a pulse generator).
The pulse generator has a maximum operating frequency of 30 kHz, and the temperatures are recorded with a sampling time of 3 min. In more detail, the temperature probes are located in the stator laminated core, so that the sensitive element reaches one of the stator teeth and attains a temperature very close to the winding temperature; the probes in the gearbox are directly immersed in oil. Fig. 3 shows a schematic of the position of the PT100 probes.

Fig. 3 Temperature probes positioning

For redundancy reasons, two temperature sensors are used on each motor and gearbox, leading to a total of 16 temperatures that are measured for each locomotive every 3 min. Some examples of recorded data are provided in Figs. 4-7.

Fig. 4 Example of the temperatures recorded for the motor in a normal operation

Fig. 5 Example of the temperatures recorded for the gearbox in a normal operation

Fig. 6 Example of the temperatures recorded for the motor when a fault occurs

Fig. 7 Example of the temperatures recorded for the gearbox when a fault occurs

In particular, Figs. 4 and 5 show a sequence of 1500 samples for a regular operation of a train, with temperatures recorded for the four engines and, in the same style, for the gearboxes, respectively. On the other hand, Figs. 6 and 7 show a sequence of 1500 samples that ends with a fault, as can be seen more clearly towards the end of Fig. 6, where the temperature of one motor ultimately becomes too large. A number of useful remarks can be made by observing the data:

(i) The temperatures have some periodic patterns. This is due to the fact that temperatures increase during a trip of the train and decrease again when the train stops before the next trip. Also, sometimes trains have long pauses (e.g. some trains do not travel at night time), and consequently all temperatures converge to a value close to the environmental temperature.

(ii) In general, it is possible to observe that the temperatures of the motors are greater than those of the gearboxes.

(iii) In some cases, all sensors provide very similar temperature values (see the sequence of Figs. 4 and 5), while in other circumstances their readings differ more (see the sequence of Figs. 6 and 7). While this fact is not correlated with the chance of a fault, the very different variance of the vector of temperature readings from train to train complicates the prediction of a fault in practice.

(iv) The fault becomes evident from the temperatures of the motors, while it is less evident from the temperatures of the gearboxes (see Fig. 6 in contrast with Fig. 7, which refer to the same operation of the same train). However, in our experience, the sensors in the gearboxes are nevertheless of paramount importance for the prediction of the fault.

(v) In some cases, some sensors provide evidently wrong readings (e.g. negative temperatures). This could occur for single values (more likely due to an error in the data transmission process) or for longer sequences of values (more likely due to a fault in a single sensor). In our analysis, we have cleaned the data by neglecting (single) values (of single sensors) that are out of the historical range of temperatures, and in the case of long sequences of values out of range, we automatically send a warning to the control room regarding the likely failure of a sensor.

It is worth mentioning that the proposed method is general, i.e. it can also take into account quantities such as the train velocity, acceleration, outside temperature, driving direction, and the engine position (already mentioned in the introduction); however, they were not available for all the data sets, so the authors decided not to use them in order to test and compare all the runs on the same basis.

3 Methodologies

In Fig. 8, it is possible to see, by visual inspection, that the eventual failure is about to happen. In particular, during normal operation, the temperatures of motor 4 are lower than the temperatures of motor 3, as indicated using green ellipses in Fig. 8. In the periods before the fault occurs, the temperatures of motor 3 start following a slightly different pattern, as indicated using an orange ellipse. However, it is not straightforward to design an algorithm that can automatically (i.e. in an unsupervised fashion) recognise the faulty condition in real time. In particular, it should be remarked that simple, obvious solutions, such as a point-wise comparison of single temperatures with a safety threshold, are in general ineffective.

Fig. 8 Visual inspection of the temperatures recorded for the motor when a fault occurs

Given the multivariable nature of the application (i.e. 16 different temperature signals are constantly monitored in real time), in this paper we compare the ability of two different multivariate methods to recognise the faulty condition in advance; namely, Hotelling's multivariate control chart [18] and FFNNs [19]. One of the main challenges of predictive anomaly detection is that the class distribution of the data is in general unbalanced: observations of abnormal behaviours are scarce, while most of the observations represent nominal behaviours. Under an unbalanced class distribution, as in this case, most classification methods perform poorly.
A common solution consists in creating a model of the nominal behaviour and monitoring the deviations from the nominal conditions. Both of the methods analysed in this paper adopt this strategy (though using different approaches), as described in more detail in the remainder of this section. Other methods exist to overcome the problem, one example being the use of cost matrices [20]. Unfortunately, this method, like many others, is not efficient if one of the two classes (anomalous vs nominal) contains a very small number of points (tens vs thousands). The anomaly in our case is a catastrophic event, and such data cannot be used to learn examples of anomalous behaviours. For this reason, we follow the most convenient and recommended approach (for both Hotelling and FFNN), which is to model the nominal behaviour and monitor the deviation from it. In particular, our approach is original for the FFNN model: instead of using it as a classifier for detecting the anomalies, we use it as a regression algorithm with the temperatures of three motors as input and the remaining motor as output.

3.1 Hotelling multivariate control chart

The Hotelling control chart performs a dimensionality reduction of the multivariate data to a scalar parameter denoted as the $T^2$ statistic, which represents the square of the Mahalanobis distance [21] of the observation vector from the vector containing the mean values of the variables in nominal conditions. As reported in many studies [22-24], the $T^2$ statistic is able to capture the changes in multivariate data, revealing the deviations from the nominal behaviour. In principle, this allows one to select safety thresholds (UCL, upper control limit; LCL, lower control limit) on the control chart more efficiently than on the original data.
Although it was originally proposed as a tool for quality control as early as 1947, the Hotelling control chart is still being applied in many process control applications, and can be regarded as a precursor of one-class classification methods (i.e. methods that only model a single class of the data [25]). For these reasons, the Hotelling control chart is widely used for early detection of incipient faults; as an example, [3, 4] proposed the use of the Hotelling control chart for incipient fault detection in induction motors, by monitoring the stator current.

The construction of the control chart includes two phases: in the first phase, historical data are analysed and the safety thresholds are computed; phase two corresponds to the monitoring of the real-time process. In phase one, a faultless historic dataset of the process should be defined by experts with the confidence that it represents mainly nominal behaviours of the process. The historic dataset is used to create a statistic of the nominal behaviour, consisting of a nominal mean vector and a covariance matrix.

Let the historic dataset of phase one be represented by the matrix $X \in \mathbb{R}^{N \times p}$, containing $N$ observations of nominal states of the process, which consist of row vectors of $p$ variables. We denote the row vectors of $X$ as $x_i$, where $i = 1, \dots, N$. The sample mean vector of the data is defined as

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \qquad (1)$$

In order to define the covariance matrix, we need to construct the demeaned (zero-mean) data matrix $X_0$:

$$X_0 = X - \mathbf{1}_N \bar{x} \qquad (2)$$

where $\mathbf{1}_N$ is the column vector of $N$ ones. Then, the covariance matrix of the data is defined as

$$C = \frac{1}{N-1} X_0^{\mathsf{T}} X_0 \qquad (3)$$

where we use the transpose notation $(\cdot)^{\mathsf{T}}$. The multivariate statistics $\bar{x}$ and $C$ represent the nominal behaviour of the process, and we assume that $C$ is full rank. The scalar $T^2$ statistic is defined as the following function of a single observation vector $x$:

$$T^2(x) = (x - \bar{x})\, C^{-1} (x - \bar{x})^{\mathsf{T}} \qquad (4)$$

The $T^2$ statistic is small when the observation vector represents nominal states, while it increases when the observation vector deviates from the nominal behaviour.
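As an illustration of phase one and of the statistic in (4), the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names are hypothetical, the covariance is normalised by N - 1 (an assumption, since the normalisation is not stated in the text), and the inverse in (4) is applied by solving a linear system with Gaussian elimination rather than by forming the inverse matrix explicitly.

```python
from typing import List, Sequence


def mean_vector(X: Sequence[Sequence[float]]) -> List[float]:
    """Sample mean of the rows of X (one row per nominal observation)."""
    n, p = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(p)]


def covariance(X: Sequence[Sequence[float]], mean: Sequence[float]) -> List[List[float]]:
    """Covariance of the demeaned data; the 1/(N-1) factor is an assumption."""
    n, p = len(X), len(mean)
    X0 = [[row[j] - mean[j] for j in range(p)] for row in X]
    return [[sum(r[a] * r[b] for r in X0) / (n - 1) for b in range(p)]
            for a in range(p)]


def solve(A: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve A z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is (numerically) singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def t_squared(x: Sequence[float], mean: Sequence[float],
              C: Sequence[Sequence[float]]) -> float:
    """Squared Mahalanobis distance of x from the nominal mean."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    z = solve(C, d)  # z = C^{-1} d
    return sum(di * zi for di, zi in zip(d, z))
```

In the setting of this paper, each row of X would hold the 16 temperatures of one sample and X would contain the nominal window; an observation equal to the nominal mean gives a value of zero, and the value grows as the observation moves away from the mean along directions of low nominal variance.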
In order to define the safety thresholds UCL and LCL of the control chart, in phase one we calculate the mean value $\mu_T$ and the standard deviation $\sigma_T$ of the $T^2$ values obtained with the nominal observations $x_i$, for $i = 1, \dots, N$, i.e.:

$$\mu_T = \frac{1}{N} \sum_{i=1}^{N} T^2(x_i), \qquad \sigma_T = \operatorname{std}\{T^2(x_i)\}_{i=1}^{N} \qquad (5)$$

Then we define the safety thresholds as

$$\mathrm{UCL} = \mu_T + k\,\sigma_T, \qquad \mathrm{LCL} = \mu_T - k\,\sigma_T \qquad (6)$$

where $k > 0$ is the chosen multiple of the standard deviation, and where the actual concern regards only the upper limit, as in faulty conditions the temperatures exceed the upper limit.

In phase two, we use the process statistics and the control limits extracted during phase one in order to detect an anomalous behaviour in new data to be monitored. In particular, during phase two new observation vectors are measured, and the corresponding $T^2$ values are calculated as in (4). The Hotelling control chart is a monitoring tool that plots the $T^2$ values as consecutive points in time and compares them against the control limits. The process is considered 'out of control', and an anomalous behaviour is detected, when the $T^2$ values continuously exceed the control limits.

We now briefly discuss how we implemented it in our specific application. The observation vector is the vector of the 16 temperatures measured simultaneously. In particular, we have used the initial 600 samples (i.e. 30 h) of normal operation of the train to calculate the nominal values of the $T^2$ statistic, i.e. its average value and its standard deviation as in (5). These values, as well as the control limits defined in (6), are characteristic of a specific train. Note that the specific values depend on the train, on its typical route, and also on some specific installation parameters (e.g. motors and sensors). In our experience, it is very important to continuously update the nominal statistics to take into account physiological variations of the nominal parameters, for instance due to different environmental temperatures. At the same time, the parameters cannot be updated too frequently, to avoid including possible incipient faulty conditions in the computation of the safety thresholds.
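The threshold computation of (5) and (6) and the 'continuously exceed' rule of phase two can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the multiplier k of the standard deviation and the required number of consecutive exceedances are left as parameters rather than taken from the paper, and the sample standard deviation (`statistics.stdev`) is an assumed choice.

```python
from statistics import mean, stdev
from typing import Optional, Sequence, Tuple


def control_limits(t2_nominal: Sequence[float], k: float = 3.0) -> Tuple[float, float]:
    """Return (LCL, UCL) as mean -/+ k standard deviations of nominal T^2 values.

    The default k = 3.0 is a hypothetical choice, not a value from the paper.
    """
    mu = mean(t2_nominal)
    sigma = stdev(t2_nominal)  # sample standard deviation (assumption)
    return mu - k * sigma, mu + k * sigma


def first_alarm(t2_stream: Sequence[float], ucl: float, run_length: int) -> Optional[int]:
    """Index of the sample at which T^2 has exceeded UCL for run_length
    consecutive samples, or None if that never happens."""
    run = 0
    for i, value in enumerate(t2_stream):
        run = run + 1 if value > ucl else 0  # an in-limit sample resets the run
        if run >= run_length:
            return i
    return None
```

Requiring a run of consecutive exceedances, rather than a single one, trades detection delay for fewer false alarms: isolated spikes above the UCL reset the counter and never raise an alarm.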
Accordingly, we shift the window of 600 samples every 50 new samples (i.e. every 2.5 h), when the new safety thresholds of (6) are duly recomputed.

3.2 Feedforward neural network (NN)

Feedforward NNs are a class of universal approximators, as they can approximate arbitrarily well functions from $\mathbb{R}^p$ to $\mathbb{R}$ with a finite number of neurons in a single layer [19]. NNs are widely used for detection of incipient faults and predictive diagnostics in induction motors; for instance, they are used to detect and classify faults using vibration signals [6, 7] and stator current signals [5]. In this work, we consider networks with one hidden layer, depicted in Fig. 9, trained by backpropagation with a stochastic gradient descent optimisation algorithm, where the objective is to minimise the mean-square error (MSE). The output of the NN is described by the following expression:

$$y(x) = \sum_{i=1}^{H} v_i\, \varphi\!\left(w_i^{\mathsf{T}} x + b_i\right) \qquad (7)$$

where $x$ is the input column vector; $w_i$, $b_i$, and $v_i$ represent the weight vector, the bias and the output weight of neuron $i$, respectively; and $\varphi$ is the activation function (a non-constant, bounded, and monotonically increasing continuous function). Finally, $H$ is the number of neurons in the hidden layer.

One typical problem of NNs, especially when large sets of data are available as in this case, is overfitting, where, roughly speaking, the NN learns the data and not the structure underlying the data. To avoid this and to improve the ability of the NN to generalise, we used a validation set, a subset of the training data not used for training, to stop the training once the MSE on the validation set has not improved for a fixed number of consecutive epochs (max_fail).
In particular, we have chosen a sigmoidal function as the activation function, we have determined the optimal number of neurons by using a three-fold cross-validation, we set the validation set as a randomly selected set of 20% of the size of the training set, we set the value of max_fail equal to 15, and, similarly to the Hotelling solution, we used a training set of 600 samples representing nominal operation. As further data preprocessing, it is worth mentioning that the Matlab implementation of the FFNN automatically maps the input and output data to the range [-1, 1].

Fig. 9 Architecture of the one hidden layer NN

In our specific application, we decided to use four NNs in parallel, each of which had the temperatures of three motors as input (p = 12) and the average temperature of the four sensors of the remaining motor as output. The rationale of this choice is that, under the assumption that the temperature signals remain more or less the same (as a whole), by knowing the temperatures of three motors one may learn how to predict the temperatures of the fourth one. However, such a pattern breaks when a fault occurs. In particular, one motor starts heating, and the other three motors lose their ability to predict its temperature. As a result of the cross-validation analysis, we found that the optimal number of neurons in the hidden layer H is between 3 and 5, depending on the particular run and the particular set of input-output signals.

In the monitoring stage, new observations are measured and given as input to the four trained NNs. The output of each NN is compared with the average of the measured temperatures of the motor to be predicted, and an absolute error, AE, is calculated and compared with an upper threshold. The threshold in this case is defined as three standard deviations of the AE calculated in the training phase.
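The monitoring stage of one of the four networks can be sketched as follows, assuming the weights have already been trained. This is a simplified illustration, not the authors' Matlab implementation: the function names are hypothetical, the logistic sigmoid is one possible sigmoidal activation, the sample standard deviation is an assumed choice, and training, early stopping and the mapping of the data to [-1, 1] are omitted.

```python
import math
from statistics import stdev
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic sigmoid, one possible sigmoidal activation (assumption)."""
    return 1.0 / (1.0 + math.exp(-z))


def nn_output(x: Sequence[float], W: Sequence[Sequence[float]],
              b: Sequence[float], v: Sequence[float]) -> float:
    """One-hidden-layer output: sum_i v_i * sigmoid(w_i . x + b_i)."""
    return sum(
        vi * sigmoid(sum(wij * xj for wij, xj in zip(wi, x)) + bi)
        for wi, bi, vi in zip(W, b, v)
    )


def absolute_errors(inputs: Sequence[Sequence[float]], targets: Sequence[float],
                    W: Sequence[Sequence[float]], b: Sequence[float],
                    v: Sequence[float]) -> List[float]:
    """AE between predicted and measured average temperature of the target motor."""
    return [abs(nn_output(x, W, b, v) - t) for x, t in zip(inputs, targets)]


def ae_threshold(train_errors: Sequence[float]) -> float:
    """Three standard deviations of the AE observed on the training data."""
    return 3.0 * stdev(train_errors)


def flag_samples(errors: Sequence[float], threshold: float) -> List[bool]:
    """True where the AE exceeds the threshold."""
    return [e > threshold for e in errors]
```

With four such networks, each predicting a different motor from the other three, the network whose AE exceeds its threshold gives an indication of the motor whose temperatures can no longer be explained by the others.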
3.3 Performance evaluation

With both methodologies, and as is typical of most fault prediction approaches, it is important to decide whether we are interested in receiving many alarms (which might include false alarms as well), or whether we wish to receive an alarm only when the algorithm is highly confident that a fault has actually occurred. In this specific case, we are interested in being very conservative when giving an alarm, as a false alarm is also very expensive from the point of view of the train company (in fact, the train company might want to stop the train while running, with a number of inconveniences for the passengers). Thus, we are specifically interested in minimising the chance of false alarms. However, the counterpart of being conservative is that a fault may be recognised with some delay (i.e. some extra time is required to make sure that a failure has actually occurred and that it is not a false alarm).

4 Experimental results

Final results corresponding to the analyses of ten different trains (with three faults) are summarised in Table 1. In addition, the results for two single realisations (i.e. the specific correct and faulty sequences previously shown in Figs. 4-7), obtained with Hotelling and NNs, are shown in Figs. 10-13.

Fig. 10 Hotelling analysis of a normal operation of a train

Fig. 11 Absolute error index of the four NNs during a normal operation of a train

Fig. 12 Hotelling analysis. Here a fault occurs towards the end (the vertical thick black line indicates that a too large temperature has been reached). The vertical red line corresponds to the instant of time when Hotelling predicts the fault and recommends stopping the train

Fig. 13 NN analysis. Here a fault occurs towards the end (the vertical thick black line indicates that a too large temperature has been reached). The vertical red line corresponds to the instant of time when the NN predicts the fault and recommends stopping the train

Table 1. Performance of the two algorithms in the early detection of the fault

Performance/method | Hotelling | NNs
false positives | 0 | 0
true negatives | 0 | 0
shortest early prediction of a fault, min | 6 | 33
largest early prediction of a fault, min | 402 | 42
ability to assess the exact motor where the fault has occurred | no | yes

As can be noticed in Figs. 10 and 12, the $T^2$ statistic frequently exceeds the UCL threshold (horizontal dashed line). To avoid false positives (i.e. fault alarms when not required), we decided to raise the alarm only when a sequence of 20 consecutive samples is found to continuously exceed the threshold. On the other hand, NNs do not seem to ever provide false positives, and in general they do not require special care in tuning particular parameters. This is a great advantage, as some tuning procedures (e.g. the previous choice of 20 consecutive out-of-bounds samples before Hotelling recognises a faulty condition in practice) may be regarded as empirical. Also, Fig. 13 shows that all four NNs recognise the occurrence of a fault. This is due to the fact that the high temperatures of a single motor affect both the NN where they correspond to the output si
