Wind Power Pattern Forecasting Based on Projected Clustering and Classification Methods
2015; Electronics and Telecommunications Research Institute; Volume: 37; Issue: 2; Language: English
10.4218/etrij.15.2314.0070
ISSN 2233-7326
Authors: Heon Gyu Lee, Minghao Piao, Yong Ho Shin
Topic(s): Evaluation Methods in Various Fields
ETRI Journal, Volume 37, Issue 2, p. 283-294. Article, Free Access.
First published: 01 April 2015. https://doi.org/10.4218/etrij.15.2314.0070. Citations: 6.
This work was supported by the Postal Technology R&D program of MSIP (10039146, Development of Implementation Technology for SMART Post).
Heon Gyu Lee (hg_lee@etri.re.kr) is with the IT Convergence Technology Research Laboratory, ETRI, Daejeon, Rep. of Korea. Minghao Piao (bluemhp@cbnu.ac.kr) is with the Database Laboratory, Chungbuk National University, Cheongju, Rep. of Korea. Yong Ho Shin (corresponding author, yhshin@ynu.ac.kr) is with the School of Business, Yeungnam University, Gyeongsan, Rep. of Korea.
Abstract

A model that precisely forecasts how much wind power is generated is critical for making decisions on power generation and infrastructure updates. Existing studies have estimated wind power from wind speed using forecasting models such as ANFIS, SMO, k-NN, and ANN. This study applies a projected clustering technique to identify wind power patterns of wind turbines; profiles the resulting characteristics; and defines hourly and daily power patterns using wind power data collected over a year-long period. The wind power pattern prediction stage uses, as classifier input, a time-interval feature alongside the existing temperature and wind direction inputs; this feature is essential for producing representative patterns through projected clustering. During this stage, the feature is applied to the wind speed, which is the most significant input of a forecasting model. The test results show that nine hourly power patterns and seven daily power patterns are produced for the Korean wind turbines used in this study. When the hourly and daily power patterns are forecast using the temperature, wind direction, and time-interval features for the wind speed, the ANFIS and SMO models show excellent performance.

I. Introduction

Wind power is the generation of electric power using mechanical energy converted through a wind turbine [1]. A wind generator can theoretically convert at most 59.3% of wind energy, but in practice only 20% to 40% can be converted owing to loss factors such as the blade shape, mechanical friction, and generator efficiency. In addition, the level of electric power generated from wind is irregular and sometimes cannot meet the required power supply. To make matters worse, the power may change on a large scale.
Because these problems stem from the wind resource itself, it is very hard to forecast the quantity of wind power generation accurately. An accurate analysis model for predicting wind power generation can reduce the cost of keeping the supply and demand of electric power in balance and can support timely decisions on infrastructure updates for the wind power industry. Therefore, more exact techniques for predicting power patterns are indispensable for the efficient operation and planning of wind power generation, and mathematical methods such as data mining are used in such prediction techniques [2]–[3].

In general, wind power generation prediction builds a power pattern model from related data and forecasts power patterns by applying the built model [4]. A variety of prediction models for wind power generation use wind speed as input data. At the learning stage of the prediction model, the relation between wind speed and wind power is learned, so that the amount of wind power can be predicted for a specified wind speed. The difference between the predicted wind power and the actual measurement is the prediction error.

There have been many comparative studies on predicting wind power based on wind speed. Li and others [5] applied regression and an artificial neural network (ANN) model and showed that the ANN performed better than regression. Üstüntas and Sahin [6] estimated the power curve using a cluster center fuzzy logic (CCFL) model, which demonstrated the lowest prediction errors. Kusiak and others [2] suggested five non-parametric models for monitoring wind farm power; that is, a neural network (NN), M5 tree, representative tree, bagging tree, and k-nearest neighbor (k-NN), the last of which showed the best performance. Unlike these studies, Schlechtingen and others [7] used ambient temperature and wind direction in addition to wind speed as prediction model inputs and applied four data mining techniques.
The applied prediction models were CCFL, NNs, k-NN, and an adaptive neuro-fuzzy inference system (ANFIS). Their test results showed that the prediction model errors were reduced when the temperature and wind direction were used in addition to the wind speed. When the four prediction models were compared with each other, ANFIS showed the lowest prediction errors and the possibility of early detection of abnormal power output. Rahmani and others [8] applied a hybrid technique of ant colony optimization and particle swarm optimization to wind speed and environment temperature for short-term wind energy forecasting. The mean absolute percentage error was used to assess the accuracy of the model.

In summary, the related studies indicate that ANN, k-NN, and ANFIS are good techniques for power generation and power curve estimation. However, existing studies applying these techniques did not analyze generation patterns that vary with time (hour, day, and month). They estimated the power generation output (kW) with respect to the wind speed input (m/s) by building a power curve based on a prediction model. This is the main difference from our suggested model, which forecasts power generation for each time stamp. Moreover, since they estimate the power output based only on wind speed, temperature, and direction, most of them are limited to short-term forecasting. Recently, Azad and others [9] proposed statistical-based and NN-based approaches to predict the hourly wind speed data of the subsequent year in their long-term wind speed forecasting study. The proposed approaches exhibited small error rates, with values in the range of 0.8 m/s to 0.9 m/s.
Even though wind speed and the output power of a wind generator have a proportional relation (Schlechtingen [7] argued that temperature and wind direction also affect wind power), their study predicts not wind power but long-term hourly wind speed.

Wind power patterns vary significantly in power generation depending on the season and time. Therefore, it is necessary to create accurate wind power patterns under different time conditions and to analyze their characteristics. For this study, we generate wind power patterns of wind turbines and use a projected clustering technique to profile the resulting characteristics. Traditional clustering algorithms discover clusters using all data dimensions. Most of the time, however, time-series data, such as power patterns, are characterized by time-interval features in a subset of the dimensions. As the number of dimensions increases, the distance measures of the clustering algorithms become more and more meaningless [10]. Thus, applying projected clustering methods makes it possible to select both the clusters of similar power patterns and the feature vectors, which are the subset of all dimensions used to form each cluster.

Figure 1 shows the framework of wind power pattern profiling and forecasting suggested in this study. From Fig. 1, the following steps can be identified:

1. Data generation and preprocessing.
   a) The recorded wind power value based on the time of use of the wind turbines is calculated using the values of different time granularities, such as hour and day.
   b) Temperature, wind speed, and direction data are included to generate the training and testing datasets; the wind speed is the value measured at the same time as the wind power.
2. Power patterns clustering.
   a) From the wind power patterns of the wind turbines, the patterns with high similarity are grouped through a projected cluster analysis.
   b) Representative power patterns are then created from each group, and class labels are created for the groups.
3. Building the prediction models.
   a) As a result of the cluster analysis, we can generate the wind speed values corresponding to the time intervals by detecting those intervals that belong to the subsets of the time dimensions applied to the algorithms.
   b) These time-interval features for the wind speed are employed as inputs for prediction model learning along with the temperature and wind direction.
   c) As the output of the prediction model, class labels corresponding to the representative power patterns of each group are assigned to the new data.

Figure 1. Wind power pattern profiling and forecasting framework.

This study makes three contributions. Firstly, representative power patterns of wind turbines operated in Korea are found, and their features are then profiled by subspace projection methods. Secondly, this study proposes a way to select task-relevant features instead of using the wind speed over the whole time dimension (selecting time-interval features, temperature, and wind direction through subspace discovery over the whole time dimension) so as to build a highly accurate prediction model. Finally, we apply the existing techniques from wind power prediction research and, upon assessment, suggest the most appropriate method for predicting the profiled representative power patterns.

The rest of this paper is organized as follows. In Section II, we review previous studies of projected clustering approaches and point out their characteristics. In Section III, we introduce state-of-the-art classifiers for predicting power patterns. In Section IV, we present the experimental results and discuss issues. Finally, in Section V, we provide some concluding remarks.

II. Projected Clustering Methods

In this study, we use projected clustering approaches for discovering representative power patterns.
Projected clustering is a method for detecting clusters with the highest similarity from subsets of all data dimensions. The biggest difference between projected clustering and traditional clustering methods is that a projected clustering approach does not consider all the dimensions given during the clustering process; instead, it detects various subsets of dimensions, on the basis that the subsets differ from each other and that each contains meaningful clusters [11]. For example, if time is a dimension and power generation is an object in data that record generated wind power values, then the projected clustering shown in Fig. 2 can detect the time intervals (t7 through t15), which are the subsets that discriminate between different clusters. Projected clustering approaches can be divided into three paradigms based on how the subsets are detected [12].

Figure 2. Different power patterns of different clusters.

The first paradigm is to divide a data space into grid-cells (cell-based) and form clusters of sufficient density from the cells. The basic concept is to first define grid-cell sets, then assign objects to suitable cells, and then calculate the density of each cell. Next, cells with a density at or below a certain threshold are removed, and clusters are built from series of cells with high density. A popular cell-based method, CLIQUE [13], is a grid-based clustering algorithm that detects clusters in subsets of the dimensions. When a large volume of multi-dimensional data points is given, the data space, in general, is not uniformly occupied by the data points. The clustering of this approach discriminates between sparse and crowded regions (or units) in space and detects the overall distribution pattern of the datasets. A cluster in CLIQUE is defined as a maximal set of connected dense units.
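The cell-based idea described above can be illustrated with a minimal sketch. It is not CLIQUE or SCHISM themselves: it assumes values already scaled to [0, 1], a fixed grid, and a single subspace chosen by the caller, and it omits the bottom-up search over subspaces. The names `dense_cells`, `connected_clusters`, `xi` (intervals per dimension), and `tau` (density threshold) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def dense_cells(points, dims, xi, tau):
    """Return the grid cells, restricted to the dimensions in `dims`,
    that hold more than a fraction `tau` of all points.

    Assumption: every coordinate is scaled to [0, 1]; `xi` is the number
    of equal-width intervals per dimension.
    """
    counts = Counter()
    for p in points:
        # Map each selected coordinate to its interval index.
        cell = tuple(min(int(p[d] * xi), xi - 1) for d in dims)
        counts[cell] += 1
    n = len(points)
    return {cell for cell, c in counts.items() if c / n > tau}


def connected_clusters(cells):
    """Group dense cells that share a face (indices differ by 1 in one dimension)."""
    remaining = set(cells)
    clusters = []
    while remaining:
        start = remaining.pop()
        cluster, stack = {start}, [start]
        while stack:
            cell = stack.pop()
            for i in range(len(cell)):
                for step in (-1, 1):
                    neighbour = cell[:i] + (cell[i] + step,) + cell[i + 1:]
                    if neighbour in remaining:
                        remaining.remove(neighbour)
                        cluster.add(neighbour)
                        stack.append(neighbour)
        clusters.append(cluster)
    return clusters


# Illustrative (made-up) data: six 2-D points, clustered on dimension 0 only.
points = [(0.05, 0.9), (0.08, 0.1), (0.12, 0.5), (0.15, 0.3), (0.9, 0.2), (0.5, 0.5)]
dense = dense_cells(points, dims=(0,), xi=10, tau=0.2)
clusters = connected_clusters(dense)
```

In this toy input the points are spread out along dimension 1 but concentrated along dimension 0, which is the situation where restricting the density computation to a subset of dimensions pays off.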
SCHISM [14] finds subsets by using a "support" and Chernoff–Hoeffding bound concept and determines the interesting subsets using a depth-first search and backtracking.

The second paradigm is density-based projected clustering. Here, a cluster is defined to be a dense area (that is, a group of points that are closely packed together, whereby each point has many nearby neighbors) separated by sparsely populated areas (that is, low-density areas). Though the overall clustering concept is based on DBSCAN [15], the density calculation here considers only the relevant dimensions. The representative algorithm is FIRES [16], which applies an efficient filter-refinement method. First, base-clusters are created, and those that fail to meet the given density conditions are removed in the filtering stage. Next, the base-clusters are merged to create the maximal-dimensional projected cluster approximations. Lastly, the final refined clusters are built during the refinement stage. SUBCLU [17] is a DBSCAN-based greedy algorithm for projected clustering. Unlike grid-based approaches, it can detect clusters with arbitrary shapes.

The third paradigm is a clustering-oriented approach. As the data dimension increases in the clustering of high-dimensional data, clustering that considers all dimensions can degrade performance remarkably owing to the presence of sparse data. PROCLUS [18], a well-known algorithm of this paradigm, does not start from a single-dimensional space; instead, it begins by searching for an initial estimate of the clusters in the high-dimensional space. A weight is assigned to each dimension for each cluster, and the updated weights are used to create the clusters again in the next iteration. STATPC [19] detects relevant subsets based on objects and builds candidate subspaces, which are refined to build locally optimal projected clusters.
Finally, a greedy search algorithm is used to review all subspaces and build optimal clusters. Table 1 shows the properties of the clustering algorithms used in our study (all properties of the clustering algorithms are stated in [12]). The important parameter settings and performance evaluation results of the algorithms are described in detail in Section IV.

Table 1. Properties of the three paradigms.
Paradigm | Algorithm | Properties
Cell-based | CLIQUE | Fixed threshold and grid size, pruning by monotonicity property
Cell-based | SCHISM | Enhanced CLIQUE by variable threshold, using heuristics for pruning
Density-based | FIRES | Variable density threshold, based on filter-refinement architecture to drop irrelevant base-clusters
Density-based | SUBCLU | Fixed density threshold, pruning by monotonicity property
Clustering-oriented | PROCLUS | Fixed cluster number, iteratively improving result like k-means, partitioning
Clustering-oriented | STATPC | Statistical tests, reducing result size by redundancy elimination

III. Classification Model for Predicting Representative Power Patterns

Feature vectors for the classifier's supervised learning include prior information such as the temperature, the wind direction, and the wind speed in the form of time-interval features; class labels are the representative power patterns built through clustering. Among all the features used for the supervised learning, the wind speed affects the power output the most, and the wind speed and wind power are measured for the same hour. Therefore, the model considers only the wind speed values within the time-dimensional subset (the time-interval features) selected during the clustering stage. This acts as relevant feature selection and dimensionality reduction, which helps to build accurate and fast classifiers. For instance, Fig. 3 describes the time dimension involved in building three clusters, C_0, C_1, and C_2.
Time intervals (f1, f2, f3, f4), that is, the subsets of all time dimensions that are applied to the clustering of the wind power patterns, are applied to the wind speed equally, and only the wind speed values corresponding to these time intervals are extracted as a feature vector. The classifiers used in the study are the sequential minimal optimization (SMO) algorithm, which shows excellent performance, and ANFIS, k-NN, and ANN, which were all evaluated in related papers [8]–[9].

Figure 3. Example of time-interval features for wind speed.

1. ANFIS

ANFIS [20] is a kind of ANN based on a Takagi–Sugeno fuzzy inference system. Since it integrates both NNs and fuzzy logic principles, it has the potential to capture the benefits of both in a single framework. Its inference system corresponds to a set of fuzzy IF–THEN rules that have the learning capability to approximate nonlinear functions. Because wind power prediction involves uncertainty in the input/output variables, power generation is determined through learning. Employing ANFIS can therefore help the accuracy of prediction, because the prediction system is difficult to model mathematically and contains nonlinearity. Figure 4 shows the ANFIS structure utilized in this study, and the layer properties and learning procedure are as follows:

Figure 4. Structure of ANFIS.

■ Layer 1. A given node i has […] and […], where x is the input value of node i, A_i indicates the fuzzy set related to the function of the node, and O_{1,i} is the membership function that represents the membership degree of the input value x for A_i; the membership function can be chosen in various ways, as in (1), and can be written as (2) through a parameter adjustment.
[…] (1)
[…] (2)
■ Layer 2. A T-norm computation is conducted, and the membership functions are multiplied together, as presented in (3) below.
[…] (3)
■ Layer 3. The i-th rule is normalized, as computed in (4) below.
[…] (4)
■ Layer 4.
The output function of each rule is multiplied by the compatibility obtained from layer 3, as in (5). The parameters p_i, q_i, and r_i are determined in such a way as to minimize errors.
[…] (5)
■ Layer 5. The output is calculated through the above process.
[…] (6)

2. SVM by SMO

The SMO algorithm [21] is well suited to carrying out the optimization of the support vector machine (SVM), for which a different normalization value can be given to each class for imbalanced learning. SMO is an algorithm for solving the quadratic programming problem that arises during the training of support vector machines and is widely used for training such machines. The learning stage of the SMO algorithm detects an optimal hyperplane using the training data, and this hyperplane is then used to classify the test data. Although the SMO algorithm provides the polynomial (Poly) and radial basis function (RBF) kernels, the RBF kernel is generally used in many cases for the following reasons. The RBF kernel can handle nonlinear relationships between classes and attributes, and it has fewer hyper-parameters that influence the complexity of the model selection than the Poly kernel. The RBF kernel function is as follows:

$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right), \quad \gamma > 0$ (7)

For the test, the RBF kernel was chosen as the kernel function. A grid-search approach using a 10-fold cross validation was carried out to determine the optimal values of parameters C and γ for each dataset, and as a result, the parameter range was determined as […] to […].

3. k-NN

k-NN is a basic instance-based learner that finds the training instance closest in Euclidean distance to the given test instance and predicts the same class as this training instance [22]. This paper used the IBk algorithm, that is, the k-NN classifier provided by Java WEKA [23]. The number of nearest neighbors can be specified explicitly in an object editor or determined automatically using a leave-one-out cross validation, subject to an upper limit given by the specified value.
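The nearest-neighbor rule just described can be sketched in a few lines. This is a plain-Python illustration under stated assumptions, not WEKA's IBk implementation: the function name `knn_predict`, the feature layout (wind speed, temperature), and the pattern labels are all hypothetical, and ties are resolved in favor of the label whose member is encountered first when scanning from the nearest neighbor outward.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=1):
    """Predict the class of `query` by majority vote among its k nearest
    training instances, using Euclidean distance."""
    # Indices of training instances ordered from nearest to farthest.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up training data: (wind speed in m/s, temperature in °C) -> pattern label.
train_X = [(3.0, 10.0), (3.5, 12.0), (9.0, 25.0)]
train_y = ["P1", "P1", "P2"]

nearest = knn_predict(train_X, train_y, (8.5, 24.0))        # k = 1
majority = knn_predict(train_X, train_y, (8.5, 24.0), k=3)  # all three vote
```

With k = 1 the prediction follows the single closest instance, whereas a larger k lets more distant instances outvote it, which is one reason the choice of k is usually validated rather than fixed by hand.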
The predictions from more than one neighbor can be weighted according to their distance from the test instance, and two different formulas are implemented for converting the distance into a weight. The number of training instances maintained by the classifier can be restricted by setting the window size option. As new training instances are added, the oldest ones are removed to maintain the number of training instances at this size. Parameter k is tested by setting the default value of 1 for each class.

4. ANN

An ANN is useful for handling complicated nonlinearity, and the multilayer perceptron (MLP) NN is currently utilized for time-series forecasting. An ANN model consists of learning, parameter tuning, verification, and forecasting steps. At the learning step, the structure of the NN is determined by learning the nonlinear relationship between the input and output variables using the backpropagation algorithm. The verification stage attempts to predict using the structure determined by learning and minimizes the error of the ANN model learning. The accuracy of the forecasted wind power patterns is verified by analyzing the performance error with the mean absolute error (MAE). In the study, an MLP model provided by Java WEKA [23] was used. The nodes in this network are all sigmoid (except when the class is numeric, in which case the output nodes become unthresholded linear units).

IV. Experimental Results and Discussion

1. Datasets and Data Preprocessing

Wind power generation data were collected from three wind turbines with different regional characteristics in Korea for a year, throughout the four seasons of 2010. Two of them (WT1 and WT2) were operated on land, and the other (WT3) on an island. Because the wind power values, saved every ten minutes, can be converted into values of various time granularities (day, week, and month), wind power patterns of different time units can be built.

Definition 1. A time schema (TS) is defined as the time granularity and its domain.
The form of the schema is as follows:
[…] (8)
where G is a time granularity, and D is the domain value of time granularity G as a set of positive integer numbers. In the case of […], […] is valid for expressing the twentieth day, whereas […] is not valid.

Definition 2. The power pattern of each turbine can be described as below for the given TS:
[…] (9)
where w is the turbine identifier used to measure the power generation, and p_w is the total power generation during the given TS. If […], then the power patterns are expressed in hour units, and one-day power patterns have a total of 24 dimensions. Figure 5 shows an example of hourly power patterns.

Figure 5. Hourly power patterns of two turbines.

The above hourly power patterns show the change in power generation of two turbines over a day-long period. The wind energy data used in this paper are shown in Table 2, while the distribution of the training and test datasets preprocessed by the two types of time units is shown in Table 3.

Table 2. Extracted features from raw data.
Feature | Description
Timestamp |
Wind turbine ID | Turbine ID code
Wind power | Wind power generation measured by hour
Wind speed | Wind speed measured by hour
Ambient temp. | Ambient temperature of turbines measured
Wind direction | Wind direction measured

Table 3. Data distribution after data preprocessing.
Data type | Wind turbine ID | Training (80%) | Testing (20%) | Total
Type 1 | WT1 | 5,792 | 1,448 | 7,240
Type 1 | WT2 | 5,768 | 1,442 | 7,210
Type 1 | WT3 | 4,832 | 1,208 | 6,040
Type 2 | WT1 | 285 | 71 | 356
Type 2 | WT2 | 280 | 70 | 350
Type 2 | WT3 | 216 | 54 | 270

2. Projected Clustering for Profiling Power Patterns

Building representative power patterns through a clustering analysis of the wind power patterns can characterize how the wind power generation patterns of the objects inside each cluster change over time. This section describes the cluster analysis results of only SCHISM, FIRES, and PROCLUS, which showed excellent performance among the six algorithms introduced in Table 1.
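The hourly patterns of Definition 2, which are the input to this clustering stage, could be assembled from the ten-minute readings roughly as follows. This is a sketch under assumptions rather than the authors' preprocessing code: the record layout `(turbine_id, timestamp, power)` and the function name `hourly_patterns` are hypothetical, the hourly value is taken to be the sum of the ten-minute values (following the "total power generation" wording of Definition 2), and hours without readings are left at zero.

```python
from collections import defaultdict
from datetime import datetime


def hourly_patterns(readings):
    """Aggregate ten-minute readings into one 24-dimensional pattern
    per turbine and calendar day.

    readings: iterable of (turbine_id, timestamp, power) tuples.
    Returns a dict mapping (turbine_id, date) to a list of 24 hourly totals.
    """
    patterns = defaultdict(lambda: [0.0] * 24)
    for turbine_id, ts, power in readings:
        # Dimension index = hour of day; value = total power in that hour.
        patterns[(turbine_id, ts.date())][ts.hour] += power
    return dict(patterns)


# Made-up readings for illustration only.
readings = [
    ("WT1", datetime(2010, 1, 1, 0, 0), 5.0),
    ("WT1", datetime(2010, 1, 1, 0, 10), 7.0),
    ("WT1", datetime(2010, 1, 1, 13, 20), 4.0),
    ("WT2", datetime(2010, 1, 1, 13, 20), 9.0),
]
patterns = hourly_patterns(readings)
wt1_day = patterns[("WT1", datetime(2010, 1, 1).date())]
```

Each resulting list is one object for the projected clustering algorithms, with the 24 hours of the day as its dimensions; the same grouping with a day-of-month index would give the daily patterns.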
The projected clustering algorithms are run with Java WEKA's OpenSubspace [24], which is a data mining tool. OpenSubspace supports up-to-date performance evaluators to facilitate studies on projected clustering. For unsupervised learning methods such as clustering, it is difficult to provide appropriate parameter settings without prior knowledge of the data. In the case of the k-means algorithm, for instance, users have difficulty defining the appropriate number of clusters in advance. OpenSubspace supports parameter bracketing for selecting the most appropriate parameter values. However, because most parameters are set as a range and not as a specific value, more repeated runs are needed than in traditional methods to determine the optimal range. For example, the PROCLUS algorithm uses parameter C as the number of clusters and parameter D as the number of dimensions. If the range of these two parameters is set at […] by the user, then a total of 20 individual results need to be analyzed to find the optimal parameter pair (C, D). The optimal parameter settings of each algorithm for the two types of data, corresponding to the hour and day in the test, are given in Tables 4 and 5, respectively.

Table 4. Parameter bracketing for Type 1 (hour).
Algorithm | Parameter | From | Offset | Op | Steps | To
Cell-based (SCHISM) | TAU | 0.1 | 0.1 | + | 10 | 1.0
Cell-based (SCHISM) | XI | 1 | 1 | + | 24 | 24
Cell-based (SCHISM) | U | 0.05 | 0 | + | 1 | 0.05
Total number of experiments: 240 (steps: 10 × 24 × 1)
Density-based (FIRES) | BASE_DBSCAN_EPSILON | 1.0 | 0 | + | 1 | 1.0
Density-based (FIRES) | BASE_DBSCAN_MINPTS | 100 | 0 | + | 1 | 100
Density-based (FIRES) | GRAPH_K | 15 | 1 | + | 4 | 18
Density-based (FIRES) | GRAPH_MINCLU | 1 | 1 | + | 4 | 4
Density-based (FIRES) | GRAPH_MU | 1 | 1 | + | 4 | 4
Density-based (FIRES) | GRAPH_SPLIT | 0.66 | 0 | + | 1 | 0.66
Density-based (FIRES) | POST_DBSCAN_EPSILON | 300 | 0 | + | 1 | 300
Density-based (FIRES) | POST_DBSCAN_MINPTS | 24 | 0 | + | 1 | 24
Density-based (FIRES) | PRE_MINIMUMPERCENT | 10 | 0 | + | 1 | 10
Total number of experiments: 64 (steps: 4 × 4 × 4)
Clustering-oriented (PROCLUS) | averageDimensions | 1 | 1 | + | 24 | 24
Clustering-oriented (PROCLUS) | numberOfClusters | 2 | 1 | + | 8 | 9
Total number of experiments: 192 (steps: 24 × 8)

Table 5. Parameter bracketing for Type 2 (day).
Algorithm | Parameter | From | Offset | Op | Steps | To
Cell-based (SCHISM) | TAU | 0.1 | 0.1 | + | 10 | 1.0
Cell-based (SCHISM) | XI | 1 | 1 | + | 30 | 30
Cell-based (SCHISM) | U | 0.05 | 0 | + | 1 | 0.05
Total number of experiments: 300 (steps: 10 × 30 × 1)
Density-based (FIRES) | BASE_DBSCAN_EPSILON | 1.0 | 0 | + | 1 | 1.0
Density-based (FIRES) | BASE_DBSCAN_MINPTS | 100 | 0 | + | 1 | 100
Density-based (FIRES) | GRAPH_K | 15 | 1 | + | 4 | 18
Density-based (FIRES) | GRAPH_MINCLU | 1 | 1 | + | 4 | 4
Density-based (FIRES) | GRAPH_MU | 1 | 1 | + | 4 | 4
Density-based (FIRES) | GRAPH_SPLIT | 0.66 | 0 | + | 1 | 0.66
Density-based (FIRES) | POST_DBSCAN_EPSILON | 300 | 0 | + | 1 | 300
Density-based (FIRES) | POST_DBSCAN_MINPTS | 30 | 0 | + | 1 | 30
Density-based (FIRES) | PRE_MINIMUMPERCENT | 10 | 0 | + | 1 | 10
Total number of experiments: 64 (steps: 4 × 4 × 4)
Clustering-oriented (PROCLUS) | averageDimensions | 1 | 1 | + | 30 | 30
Clustering-oriented (PROCLUS) | numberOfClusters | 2 | 1 | + | 9 | 10
Total number of experiments: 270 (steps: 30 × 9)

Because the projected clustering algorithm groups similar power generation patterns over the time dimensions in the training datasets and then determines which of the defined clusters each test data object belongs to, it combines clustering and classification. Therefore, an evaluation measure for a traditional clustering method, such as the sum of the squared error or normalized mutual information [25], is inappropriate. The present study used the precision, recall, F1-value, and accuracy to evaluate the three clustering algorithms. Formal definitions of these measures are given below.

$\text{Precision} = \frac{TP}{TP + FP}$ (10)
$\text{Recall} = \frac{TP}{TP + FN}$ (11)
$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$ (12)
$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$ (13)

In (10)–(13), we use the following abbreviations: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). All tests for the three clustering algorithms use 10-fold cross validation. In addition, the original classes used to evaluate the algorithms in the cluster analysis stage are the wind turbine IDs (WT1, WT2, WT3) of the three regions. Table 6 shows the results of the evaluators based on the clustering methods for the two datasets.

Table 6. Description of the summary results.
Data | Algorithm | Precision | Recall | F1 | Class
[…] | Cell-based (SCHISM) | 0.748 | 0.873 | 0.805 | WT1
 | | 0.688 | 0.550 | 0.611 | WT2
 | | 0.647 | 0.423 | 0.512 | WT3
 | Density-based (FIRES) | 0.762 | 0.912 | 0.830 | WT1
 | | 0.655 | 0.475 | 0.551 | WT2
 | | 0.824 | 0.538 | 0.651 | WT3
 | Clustering-oriented (PROCLUS) | 0.809 | 0.873 | 0.881 | WT1
 | | 0.769 | 0.750 | 0.937 | WT2
 | | 0.579 | 0.423 | 0.900 | WT3
[…] | Cell-based (SCHISM) | 0.748 | 0.873 | 0.805 | WT1
 | | 0.688 | 0.550 | 0.611 | WT2
 | | 0.647 | 0.423 | 0.512 | WT3
 | Density-based (FIRES) | 0.787 | 0.833 | 0.810 | WT1
 | | 0.694 | 0.625 | 0.658 | WT2
 | | 0.500 | 0.462 | 0.480 | WT3
 | Clustering-oriented (PROCLUS) | 0.88 | 0.863 | 0.871 | WT1
 | | 0.882 | 0.925 | 0.871 | WT2
 | | 0.565 | 0.500 | 0.531 | WT3

In both data types, the test results show that the PROCLUS method, which is a clustering-oriented approach, achieves a good performanc
Reference(s)