Article | Open access | Peer reviewed

Robust Similarity Measure for Spectral Clustering Based on Shared Neighbors

2016; Electronics and Telecommunications Research Institute; Language: English

10.4218/etrij.16.0115.0517

ISSN

2233-7326

Authors

Xiucai Ye, Tetsuya Sakurai

Topic(s)

Face and Expression Recognition

Abstract

ETRI Journal, Volume 38, Issue 3, pp. 540-550. First published: 01 June 2016. Citations: 9. Xiucai Ye (corresponding author, yexiucai2013@gmail.com) and Tetsuya Sakurai (sakurai@cs.tsukuba.ac.jp) are with the Department of Computer Science, University of Tsukuba, Ibaraki, Japan.

Spectral clustering is a powerful tool for exploratory data analysis. Many existing spectral clustering algorithms typically measure the similarity by using a Gaussian kernel function or an undirected k-nearest neighbor (kNN) graph, which cannot reveal the real clusters when the data are not well separated. In this paper, to improve the spectral clustering, we consider a robust similarity measure based on the shared nearest neighbors in a directed kNN graph. We propose two novel algorithms for spectral clustering: one based on the number of shared nearest neighbors, and one based on their closeness. The proposed algorithms are able to explore the underlying similarity relationships between data points, and are robust to datasets that are not well separated. Moreover, the proposed algorithms have only one parameter, k. We evaluated the proposed algorithms using synthetic and real-world datasets. The experimental results demonstrate that the proposed algorithms not only achieve a good level of performance, they also outperform the traditional spectral clustering algorithms.

I. Introduction

Spectral clustering has become one of the most popular clustering methods, and its performance is superior to that of traditional clustering methods such as K-means [1]. Spectral clustering has been applied successfully in a large number of fields, including image segmentation [2], load balancing [3], video retrieval [4], and bioinformatics [5]. In spectral clustering, a similarity matrix is first constructed, and the spectrum of the similarity matrix is then used to reveal the cluster structure of the data. The performance of spectral clustering relies heavily on the similarity measure used to construct the similarity matrix [6]–[11]. Complex data are often high dimensional, heterogeneous, and without prior knowledge, and it is therefore a fundamental challenge to define a pairwise similarity for effective spectral clustering.
The Gaussian kernel function is widely used as a similarity measure for spectral clustering; it calculates the pairwise similarity as sij = exp(−d(xi, xj)^2 / (2σ^2)), where d(xi, xj) is the Euclidean distance between data points xi and xj, and σ is the kernel parameter. The Gaussian kernel function is simple to calculate and results in a positive definite similarity matrix, which simplifies the analysis of the eigenvectors. However, it is difficult to determine the optimal value of the kernel parameter σ, which reflects the neighborhood of the data points. Many researchers have focused on determining the optimal value of the kernel parameter σ in the Gaussian kernel function to improve spectral clustering. Ng and others [6] presented a method to choose this parameter automatically by comparing the clustering results based on certain criteria. However, the chosen optimal value is a global value, which is not suitable for datasets with an arbitrary structure. Because a local scaling parameter is more effective than a global one, a family of spectral clustering algorithms that find a local scaling parameter has been reported [8]–[12]. Zelnik-Manor and Perona [7] proposed a self-tuning spectral clustering algorithm, which locally scales this parameter based on the local statistics of the neighborhood of each data point. Chang and Yeung [8] proposed a robust path-based similarity measure based on M-estimation from robust statistics. Zhang and others [9] use the local density between data points to scale the parameter, which has the effect of amplifying intra-cluster similarity. Cao and others [11] satisfied the requirements of a similarity measure used in spectral clustering by measuring the similarity based on the maximum flow between data points.

Instead of focusing on the kernel parameter, many spectral clustering algorithms measure the similarity based on the k-nearest neighbor (kNN) graph [1], [12]. In the kNN graph, point xi is connected to point xj if xi is among the k nearest neighbors of xj, or if xj is among the k nearest neighbors of xi. According to the kNN graph, the pairwise similarity sij is computed (for example, by the Gaussian kernel) if point xi is connected to point xj; otherwise, sij = 0. In similarity measure methods based on a kNN graph, the main parameter σ is replaced by k, which is easier to choose because it is an integer and usually takes a small value. Furthermore, the similarity matrix S is sparse, which makes solving for the eigenvectors computationally efficient. In spectral clustering, the applied kNN graph is undirected owing to the symmetry of the similarity matrix. However, an undirected kNN graph usually introduces redundant connections, which may lead to incorrect clustering results.

Spectral clustering algorithms based on a local scaling parameter and an undirected kNN graph, respectively, are effective for various clustering tasks. However, they cannot reveal the real clusters for certain complex datasets, particularly those that are not well separated, that is, those in which the clusters are not far apart. The presence of noise can also reduce separation within the data. Because many real-world datasets are not well separated, it is in general difficult to find the correct clusters. Thus, a clustering algorithm that is robust to data that are not well separated is both important and desirable. An interesting alternative to a distance-based similarity measure is to use information regarding the shared nearest neighbors.
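As a minimal illustration of the two conventional measures above, the following Python sketch builds a dense Gaussian-kernel similarity matrix with a global σ and the same kernel restricted to an undirected kNN graph. The function names and the brute-force distance computation are illustrative assumptions, not part of the original algorithms.

```python
import numpy as np

def gaussian_similarity(X, sigma):
    """Dense Gaussian-kernel similarity: sij = exp(-d(xi, xj)^2 / (2 sigma^2))."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def knn_gaussian_similarity(X, k, sigma):
    """Gaussian similarity kept only on an undirected kNN graph; sij = 0 elsewhere."""
    n = X.shape[0]
    S = gaussian_similarity(X, sigma)
    dists = np.sqrt(np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1))
    connected = np.zeros((n, n), dtype=bool)
    for i in range(n):
        nn = np.argsort(dists[i])[1:k + 1]   # k nearest neighbors of xi (excluding xi itself)
        connected[i, nn] = True
    connected |= connected.T                 # undirected: keep the edge if either endpoint lists the other
    return np.where(connected, S, 0.0)
```

The undirected symmetrization in the last step is exactly the source of the redundant connections discussed above.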
In most cases, two data points belong to the same cluster not only because they are close to each other, but also because they have many shared nearest neighbors that connect them to the same cluster. In clustering methods based on shared nearest neighbors, two data points have a higher similarity if they have more shared nearest neighbors. These methods are more robust to datasets that are not well separated, have been reported to be effective in agglomerative clustering algorithms [13], [14], and are less sensitive to dimensionality than a conventional distance measure [15]. Specifically, Zhang and others [9] and He and others [10] used the shared nearest neighbors to locally scale the kernel parameter σ in the Gaussian kernel function for spectral clustering.

In addition to information regarding shared nearest neighbors, other similarity measures have also been proposed for spectral clustering to replace the distance-based similarity measure. Beauchemin [16] proposed a similarity measure for spectral clustering based on a K-means density estimation embedded within a subbagging procedure. Zhu and others [17] constructed a random-forest-based affinity graph to learn discriminative feature subspaces for a similarity measure in spectral clustering. A good similarity measure for constructing the similarity matrix can significantly improve the results of spectral clustering. Other researchers have considered improving the spectral clustering after the similarity matrix is constructed. For example, the Nyström method is applied to the eigenvector extraction problem to reduce the computation cost [18]. Genetic algorithms have also been utilized in the clustering process to improve the results of spectral clustering [19], [20].

In this paper, we focus on the similarity measure for similarity construction in spectral clustering. We use the shared nearest neighbors in a directed kNN graph as a measure of similarity to improve the spectral clustering. We propose two novel algorithms for spectral clustering: one based on the number of shared nearest neighbors (SC-nSNN) and one based on their closeness (SC-cSNN). The main differences between the proposed algorithms and the related existing clustering algorithms are summarized as follows.

(1) In spectral clustering, the commonly used kNN graph is undirected. We are the first to use a directed kNN graph to construct a similarity matrix for spectral clustering. The proposed algorithms, SC-nSNN and SC-cSNN, use a directed kNN graph to find the nearest neighbors, which does not introduce redundant connections and reflects the true neighborhood of the data points more accurately than an undirected kNN graph.

(2) Some traditional clustering algorithms use shared neighbors to construct a similarity matrix, and clustering is then conducted directly on the similarity matrix obtained. The definition of shared nearest neighbors in the proposed algorithms is based on a directed kNN graph and thus differs from that used in the existing algorithms. In addition, in the proposed algorithms, clustering is not performed directly on the similarity matrix obtained.

(3) In [9] and [10], the authors used shared neighbors to modify the parameter in the Gaussian kernel function, but the shared neighbors were not found using a kNN graph.
The proposed algorithms directly measure the similarity based on the shared nearest neighbors, do not use a Gaussian kernel function, are able to explore the underlying similarity relationships between data points, and are robust to datasets that are not well separated. Despite these advantages, SC-nSNN and SC-cSNN require only one parameter, k, the number of nearest neighbors. In particular, SC-cSNN is less sensitive to k than SC-nSNN, and both are less sensitive to k than spectral clustering algorithms based on an undirected kNN graph.

This paper is an extension of [21], which introduced the basic idea of the SC-cSNN method. In this paper, we further develop the SC-cSNN algorithm and introduce another algorithm, SC-nSNN; each of these measures has its own particular advantage for certain types of datasets. In this study, we conducted a complexity analysis of these two efficient algorithms and tested them on more comprehensive synthetic and real-world datasets.

The remainder of this paper is organized as follows. We begin with an overview of spectral clustering in Section II. The two proposed algorithms are presented in Section III. The experimental results are presented in Section IV, and we provide some concluding remarks in Section V.

II. Overview of Spectral Clustering

1. Spectral Clustering Algorithm

Given a set of n data points in R^d, the objective of spectral clustering is to divide these data points into K clusters. The steps of a general algorithm for spectral clustering can be summarized as follows [1], [6], [8].

1) Construct a similarity matrix S, which has the pairwise similarities sij as its entries. The similarity measures used to calculate sij are introduced later.
2) Compute the normalized Laplacian matrix L from the similarity matrix S as L = D^(-1/2) S D^(-1/2), where D is an n × n diagonal matrix with dii = Σj sij on the diagonal.
3) Compute the K largest eigenvectors of the normalized Laplacian matrix L, and form the matrix V using these eigenvectors as its columns.
4) Form the matrix U by normalizing the rows of V, such that uij = vij / (Σj vij^2)^(1/2).
5) Let each row of U represent a data point in R^K and cluster these points using the K-means method.
6) Assign each data point xi to cluster c if the corresponding row i in U is assigned to this cluster.

2. Similarity Measure

In spectral clustering, the similarity measure used to construct the similarity matrix S in the first step underlies all of the following steps and is crucial to the clustering result. If the Gaussian kernel function is applied, the pairwise similarity is calculated as

sij = exp(−d(xi, xj)^2 / (2σ^2)),   (1)

where d(xi, xj) is the Euclidean distance between the data points xi and xj, and σ is the kernel parameter. Zelnik-Manor and Perona [7] locally scaled the kernel parameter σ and calculated the pairwise similarity as

sij = exp(−d(xi, xj)^2 / (σi σj)),   (2)

where σi = d(xi, xk) and xk is the k-th nearest neighbor of xi. Zhang and others [9] used the number of shared nearest neighbors to scale the kernel parameter, and calculated the pairwise similarity as

sij = exp(−d(xi, xj)^2 / (2σ^2 (Ns(xi, xj) + 1))),   (3)

where Ns(xi, xj) is the number of shared neighbors between xi and xj. The shared neighbors are found according to the ε-neighborhood graph, in which xk is a neighbor of xi if d(xi, xk) < ε. However, the radius ε is difficult to determine because a real dataset usually has clusters of different densities. The similarity measure in [10] is very similar to (3). Other methods for improving the similarity measure in spectral clustering include the use of an undirected kNN graph, by which sij is given by (1) when point xi is connected to point xj, and sij = 0 otherwise.
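A minimal sketch of steps 2)–6) of the general algorithm, assuming a dense similarity matrix S built by one of the measures in this section; the function name and the use of NumPy/SciPy for the eigendecomposition and the K-means step are illustrative choices.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def spectral_clustering(S, K):
    """Steps 2)-6): normalized Laplacian, K largest eigenvectors, row normalization, K-means."""
    d = S.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))   # guard against isolated points
    L = D_inv_sqrt @ S @ D_inv_sqrt                              # L = D^(-1/2) S D^(-1/2)
    _, eigvecs = np.linalg.eigh(L)                               # eigenvalues in ascending order
    V = eigvecs[:, -K:]                                          # K largest eigenvectors as columns
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    U = V / np.maximum(norms, 1e-12)                             # unit-length rows
    _, labels = kmeans2(U, K, minit='++')                        # cluster the rows of U
    return labels
```

Only the construction of S changes between the similarity measures discussed in this paper; the remaining steps are shared.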
III. Similarity Measure for Spectral Clustering Based on Shared Neighbors

In this section, we present our two novel algorithms for measuring the similarity for spectral clustering, SC-nSNN and SC-cSNN.

1. Construction of Directed kNN Graph

We consider measuring the similarity of data points based on a directed kNN graph. In a directed kNN graph, an edge is connected from xi to xj if xj is one of the k nearest neighbors of xi; we then say that xj is a direct successor of xi and that xi is a direct predecessor of xj, which is denoted as xi → xj. Further, let xi ↔ xj denote the case in which xi and xj are among the k nearest neighbors of each other. Let Ni be the set that contains the direct successors of xi, and let Pj be the set that contains the direct predecessors of xj. Thus, Ni contains the k nearest neighbors of xi, and xj is a shared nearest neighbor of the data points in Pj.

In a directed kNN graph, each data point has exactly k nearest neighbors, whereas some data points have more than k nearest neighbors in an undirected kNN graph. As an example, in Fig. 1, data points are clustered based on (a) an undirected 3NN graph and (b) a directed 3NN graph. Note that in the undirected 3NN graph, point 7 has more than three nearest neighbors. For point 7, the edges between point 7 and points 8, 9, and 10 are redundant, which leads to an incorrect clustering result. The two clusters obtained based on this undirected 3NN graph are {1, 2, 3, 4, 5, 6} and {7, 8, 9, 10}, where point 7 is assigned to the wrong cluster. Unlike the undirected kNN graph, the directed kNN graph does not suffer from redundant connections. Instead, the problem with using the directed kNN graph is its asymmetry. For example, in Fig. 1(b), point 7 is a nearest neighbor of point 8, whereas point 8 is not a nearest neighbor of point 7. Because the similarity matrix in spectral clustering is symmetric, traditional similarity measures cannot be applied to a directed kNN graph. We show that the proposed algorithms introduce novel similarity measures that can be applied to a directed kNN graph, through which the data points in Fig. 1 can be assigned to the two correct clusters, {1, 2, 3, 4, 5, 6, 7} and {8, 9, 10}.

Figure 1. (a) Undirected 3NN graph and (b) directed 3NN graph.

2. Definition of Shared Nearest Neighbors

Because Ni is the set containing the k nearest neighbors of xi in a directed kNN graph, the set of shared nearest neighbors between xi and xj is Ni ∩ Nj. The pairwise similarity sij is measured based on this set. In general, Ni does not include xi. Thus, Ni ∩ Nj does not include the two measured points xi and xj themselves. Whether the set of shared nearest neighbors should account for xi and xj depends on the relationship between xi and xj, which affects the similarity measure. We use the example shown in Figs. 2(a) and 2(b) to explain the reason for this. In Figs. 2(a) and 2(b), we show the three nearest neighbors of points 1 and 2 for two different cases. In Fig. 2(a), points 1 and 2 are the nearest neighbors of each other, but this is not the case in Fig. 2(b). If we do not consider the relationship between points 1 and 2, the sets of shared nearest neighbors of points 1 and 2 in the two cases are the same. However, the similarity of points 1 and 2 in Fig. 2(a) is clearly higher than it is in Fig. 2(b) because points 1 and 2 are the nearest neighbors of each other in Fig. 2(a).

Figure 2. Three nearest neighbors of points 1 and 2 in three different cases.
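A minimal sketch of the directed kNN graph construction described above, returning the successor sets Ni and predecessor sets Pj; the brute-force distance computation and the function name are illustrative assumptions.

```python
import numpy as np

def directed_knn_graph(X, k):
    """Return successor sets N[i] (the k nearest neighbors of xi, nearest first) and
    predecessor sets P[j] (all points that list xj among their k nearest neighbors)."""
    n = X.shape[0]
    dists = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    N = []                       # N[i]: indices of the k nearest neighbors of xi
    P = [[] for _ in range(n)]   # P[j]: direct predecessors of xj
    for i in range(n):
        order = np.argsort(dists[i])
        neighbors = [j for j in order if j != i][:k]
        N.append(neighbors)
        for j in neighbors:
            P[j].append(i)
    return N, P
```

Unlike the undirected construction, no symmetrization step is applied, so every point keeps exactly k successors.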
In this paper, we redefine the set of shared nearest neighbors of two points in a directed kNN graph by considering the relationship between the two measured points. Here, Ni is the set of nearest neighbors of xi and does not include xi, and SNN(xi, xj) is the set of shared nearest neighbors of points xi and xj, which we redefine as

SNN(xi, xj) = (Ni ∩ Nj) ∪ {xv} if xi ↔ xj, and SNN(xi, xj) = Ni ∩ Nj otherwise,   (4)

where xv is a virtual data point that represents xi as a nearest neighbor of xj and xj as a nearest neighbor of xi. Based on (4), the shared nearest neighbors of points 1 and 2 in Fig. 2(a) differ from those in Fig. 2(b): in Figs. 2(a) and 2(b), SNN(x1, x2) is {3, 4, xv} and {3, 4}, respectively.

3. Measurement of Pairwise Similarity

A. Similarity Measure Based on Number of Shared Neighbors

We measure the pairwise similarity sij based on the set of shared nearest neighbors SNN(xi, xj), as defined in (4). In many clustering methods based on shared nearest neighbors, two data points have a higher similarity if they have more shared nearest neighbors [15]. We first consider a similarity measure based on the number of shared nearest neighbors, which we propose for the SC-nSNN algorithm. For a statistical analysis, we consider the pairwise similarity in the range [0, 1]. In SC-nSNN, the pairwise similarity sij is calculated as

sij = |SNN(xi, xj)| / k,   (5)

where |SNN(xi, xj)| is the number of shared nearest neighbors in SNN(xi, xj), and k is the maximum number of nearest neighbors shared between points xi and xj in the directed kNN graph. According to (5), the similarity of points 1 and 2 in Fig. 2(a) is higher than that in Fig. 2(b).

B. Similarity Measure Based on Closeness of Shared Neighbors

If we only consider the number of shared nearest neighbors, we may neglect the closeness of the data points. Thus, we also consider a similarity measure based on the closeness of the shared nearest neighbors, which we propose for the SC-cSNN algorithm. In Figs. 2(b) and 2(c), we show the three nearest neighbors of points 1 and 2 in two different cases. In Fig. 2(b), points 1 and 2 have two shared neighbors, whereas in Fig. 2(c), they have one shared neighbor. If we only consider the number of shared nearest neighbors, the pairwise similarity of points 1 and 2 in Fig. 2(b) is higher than that in Fig. 2(c). However, in Fig. 2(c), points 1 and 2 are very close, and thus should have a higher pairwise similarity than in Fig. 2(b).

The orders of the shared nearest neighbors in SNN(xi, xj) relative to the two measured data points xi and xj reflect the closeness of the shared nearest neighbors. For example, in Fig. 2(b), point 3 is the third-nearest neighbor of points 1 and 2, and point 4 is the second-nearest neighbor of points 1 and 2; neither of the two shared neighbors is the first-nearest neighbor of either point 1 or point 2. In Fig. 2(c), although points 1 and 2 have only one shared neighbor, this shared neighbor is the first-nearest neighbor of both points 1 and 2. To measure the pairwise similarity sij, we first weigh the shared nearest neighbors in SNN(xi, xj) according to their orders relative to the data points xi and xj. Let wij denote the weight of the shared nearest neighbors in SNN(xi, xj), and let xr be one of the shared nearest neighbors in SNN(xi, xj), where xr is the p-th nearest neighbor of xi and the q-th nearest neighbor of xj. In the directed kNN graph, the weight wij is calculated from these orders as given in (6), so that shared neighbors that are close (low-order) neighbors of both xi and xj contribute more to wij. For example, in Fig. 2(b), the weight w12 is determined by the orders (3, 3) of point 3 and (2, 2) of point 4 relative to points 1 and 2.
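Before turning to the closeness-based weighting of (6) and (7) below, the shared-neighbor set of (4) and the SC-nSNN similarity of (5) can be sketched as follows, assuming the reconstruction sij = |SNN(xi, xj)| / k and representing the virtual point by a sentinel tuple; the helper names are illustrative.

```python
def shared_neighbors(i, j, N):
    """SNN(xi, xj) per (4): common direct successors, plus a virtual point when
    xi and xj are among the k nearest neighbors of each other."""
    snn = set(N[i]) & set(N[j])
    if j in N[i] and i in N[j]:
        snn.add(('virtual', min(i, j), max(i, j)))  # sentinel standing in for the virtual data point
    return snn

def snn_count_similarity(i, j, N, k):
    """SC-nSNN similarity, assuming sij = |SNN(xi, xj)| / k as in (5)."""
    return len(shared_neighbors(i, j, N)) / k
```

Here N is the list of successor sets produced by the directed kNN graph construction sketched earlier.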
Note that the maximum value of wij is obtained when |SNN(xi, xj)| = k and all of the shared neighbors in SNN(xi, xj) have the same orders relative to the data points xi and xj; this maximum value follows from (6). For a statistical analysis, we again consider the pairwise similarity in the range [0, 1]. In SC-cSNN, the pairwise similarity sij is obtained by normalizing wij by this maximum value, as given in (7). Note that the only parameter of sij in (5) and (7) is the number of nearest neighbors, k, which is usually small compared to n. The similarity matrices S obtained by (5) and (7) are sparse because sij = 0 if SNN(xi, xj) = ∅, which often occurs when k is small compared to n.

4. Algorithms

In the proposed SC-nSNN and SC-cSNN algorithms, to reduce the computational cost, we only measure the pairwise similarity between the data pairs that have shared nearest neighbors; otherwise, sij = 0 because SNN(xi, xj) = ∅. Let (xp, xq) denote a data pair. The data points in set Pj have the shared nearest neighbor xj; that is, xj ∈ SNN(xp, xq) for every data pair (xp, xq) in Pj. Thus, we measure the pairwise similarity for each data pair in Pj. We let Tj denote the set that contains the possible data pairs in Pj. For example, if Pj = {x1, x2, x3}, we have Tj = {(x1, x2), (x1, x3), (x2, x3)}. The number of data pairs in Tj is |Pj|(|Pj| − 1)/2. By searching each Pj, we can find all data pairs that have shared neighbors. After finding a data pair (xp, xq) in Tj, we find their shared nearest neighbors SNN(xp, xq), as defined in (4), from the sets of direct successors Np and Nq.

For SC-nSNN, we measure the pairwise similarity between xp and xq based on SNN(xp, xq) according to (5). For SC-cSNN, because we need the orders of the shared nearest neighbors to calculate the similarity, we record the orders of each data point xj relative to the data points in Pj in the order set Rj. For example, if xi ∈ Pj and xj is the r-th nearest neighbor of xi, we record this order r in the order set Rj. After finding the shared nearest neighbors in SNN(xp, xq), we first find the orders of the shared nearest neighbors according to their order sets, and then measure the similarity according to (6) and (7). For example, for a shared nearest neighbor xj ∈ SNN(xp, xq), we look up the order set Rj of xj and retrieve the orders of xj relative to xp and xq from Rj. The SC-cSNN algorithm is summarized in Algorithm 1. The SC-nSNN algorithm is similar to the SC-cSNN algorithm, except for steps 3) and 5) in Algorithm 1: in the SC-nSNN algorithm, the calculation of Ri is unnecessary, and the pairwise similarity is calculated according to (5).

Algorithm 1.
1) Calculate Ni and Pi for each data point.
2) Search each Pi to find the data pairs with shared nearest neighbors and record these in set Ti.
3) Calculate the orders of each data point xi relative to the data points in Pi and record these in set Ri.
4) Find the shared nearest neighbors SNN(xp, xq) for each data pair (xp, xq) in Ti.
5) Calculate each pairwise similarity spq to construct the similarity matrix S by searching Ri for each shared nearest neighbor xi in SNN(xp, xq), according to (6) and (7).
6) Compute the normalized Laplacian matrix L based on S.
7) Cluster the data points into K clusters based on the K largest eigenvectors of L by using the K-means method.

Steps 6) and 7) were introduced in Section II. The main contribution of SC-nSNN and SC-cSNN is the proposed similarity measure for spectral clustering, which improves the clustering because the similarity matrix S is crucial to the spectral clustering results.
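A minimal sketch of the pair enumeration in Algorithm 1, assuming successor and predecessor sets N and P as produced by the earlier graph-construction sketch. It fills the matrix with the SC-nSNN similarity of (5), since the exact closeness weighting of (6) and (7) is not reproduced here, and the function name is illustrative.

```python
import numpy as np
from itertools import combinations

def snn_similarity_matrix(N, P, k):
    """Build the sparse symmetric similarity matrix by visiting only the data pairs
    that share at least one nearest neighbor (the pairs collected in the sets Tj)."""
    n = len(N)
    S = np.zeros((n, n))
    successor_sets = [set(nbrs) for nbrs in N]
    for j in range(n):
        for p, q in combinations(sorted(P[j]), 2):          # the data pairs in Tj
            if S[p, q] == 0.0:                              # measure each pair only once
                shared = len(successor_sets[p] & successor_sets[q])
                if q in successor_sets[p] and p in successor_sets[q]:
                    shared += 1                             # virtual point for mutual kNN, per (4)
                S[p, q] = S[q, p] = shared / k              # SC-nSNN similarity per (5)
    return S
```

Pairs that never co-occur in any predecessor set are never visited, which is what keeps S sparse.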
5. Complexity Analysis

We first show the computational costs of the key steps in the proposed algorithms. The complexity of constructing the directed kNN graph and computing the sets Ni, Pi, and Ri for each data point is O(n^2) with a brute-force neighbor search. The complexity of computing the sets Ti is O(Σi |Pi|^2), where |Pi| is the number of data points in Pi and is small. Instead of searching every data pair to measure its pairwise similarity, we search each Pi to find the data pairs that have shared nearest neighbors, and then measure the similarity only for these data pairs. The similarity is based on the shared nearest neighbors in SNN(xp, xq). In Ni, the k nearest neighbors of xi are arranged in order of their identification numbers, so the complexity of finding the shared nearest neighbors in SNN(xp, xq) is O(k). Thus, the complexity of finding the shared nearest neighbors for measuring the similarity in SC-nSNN is O(mk), where m is the number of data pairs that have shared neighbors. Note that m depends on k and is usually much less than n^2. In SC-cSNN, there is additional complexity for searching the order set Ri, but it is only O(k) for each data pair.

Consequently, the computational complexity of the proposed algorithms is primarily due to the construction of the directed kNN graph, which is O(n^2). However, there are known algorithms that can reduce the complexity of constructing the kNN graph [22]. The complexity of constructing the undirected kNN graph is also O(n^2), which includes finding the k nearest neighbors of each data point and symmetrizing the graph. Note that, compared to algorithms based on an undirected kNN graph, the main additional computational cost of the proposed similarity measures is O(mk).

In general, the complexity of computing the eigenvectors of a dense matrix is O(n^3). For most spectral clustering algorithms that use a local scaling parameter in the Gaussian kernel function to calculate the similarity [7], [9], the normalized Laplacian matrices are dense. The proposed similarity measures and the undirected kNN graph-based spectral clustering both yield sparse normalized Laplacian matrices because their similarity matrices are sparse. The complexity of computing the eigenvectors of a sparse normalized Laplacian matrix L can be reduced by applying sparse eigensolvers [23]; for example, when applying variants of the Lanczos/Arnoldi factorization (as implemented in the solver ARPACK), the computational cost is proportional to the number of restarted Arnoldi iterations and to the Arnoldi length used to compute the first K eigenvectors of L.
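As a sketch of the sparse eigensolver route mentioned above, the following uses SciPy's ARPACK-based eigsh to obtain the K largest eigenvectors of the sparse normalized Laplacian; the function name and the small guard against isolated points are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def top_k_eigenvectors(S_sparse, K):
    """K largest eigenvectors of the sparse normalized Laplacian L = D^(-1/2) S D^(-1/2),
    computed with ARPACK (via eigsh) instead of a dense O(n^3) eigendecomposition."""
    d = np.asarray(S_sparse.sum(axis=1)).ravel()
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))    # guard against isolated points
    L = sp.diags(d_inv_sqrt) @ S_sparse @ sp.diags(d_inv_sqrt)
    _, V = eigsh(L, k=K, which='LA')                     # K largest (algebraic) eigenpairs
    return V
```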
IV. Experimental Results

We evaluated the proposed SC-nSNN and SC-cSNN algorithms on both synthetic and real-world datasets. We compared the proposed algorithms with other state-of-the-art similarity measure methods for spectral clustering, including path-based spectral clustering (SC-PA) [8], self-tuning spectral clustering (SC-ST) [7], spectral clustering with local density adaptive similarity (SC-DA) [9], spectral clustering based on the importance of shared nearest neighbors (SC-SNNI) [10], and spectral clustering based on a kNN graph (SC-kNN) [1]. Among them, SC-DA and SC-SNNI utilize the information of shared neighbors to modify the parameter of the Gaussian kernel function for a similarity measure. To show the benefit of spectral clustering, we also compared these spectral clustering algorithms with the K-means method.

For a quantitative comparison, we adopted the normalized mutual information (NMI) as the evaluation criterion because it is widely used to evaluate the performance of clustering algorithms [24]. The NMI criterion is defined as

NMI(X, Y) = I(X, Y) / (H(X) H(Y))^(1/2),   (8)

where I(X, Y) is the mutual information between two random variables X and Y, and H(X) is the entropy of X. In addition, NMI(X, Y) has a range of [0, 1], and NMI(X, Y) = 1 when X and Y are identical. Let C denote the predicted clustering configuration obtained by a clustering algorithm, and let C′ denote the true clustering configuration.

When the performance of the clustering algorithms is evaluated based on the NMI criterion, C and C′ are regarded as the two evaluated random variables. Here, I(C, C′) and H(C) in NMI(C, C′), as given by (8), are calculated as follows:

I(C, C′) = Σi Σj P(ci ∩ c′j) log [ P(ci ∩ c′j) / (P(ci) P(c′j)) ],   (9)

H(C) = −Σi P(ci) log P(ci),   (10)

P(ci) = |ci| / n,   (11)

P(ci ∩ c′j) = |ci ∩ c′j| / n,   (12)

where |·| is the cardinality of a cluster, and n is the total number of data points. In addition, P(ci) denotes the probability that a data point belongs to cluster ci, and P(ci ∩ c′j) denotes the probability that a data point belongs to the intersection of clusters ci and c′j. A larger value of NMI indicates a better clustering result. The largest value of NMI is 1, which occurs when all of the data points are assigned to their correct clusters.

Note that the proposed algorithms measure the similarity based on the shared k nearest neighbors, which can be found according to various distance metrics, such as the Euclidean distance or the cosine metric. In these experiments, we use the Euclidean distance, consistent with the metric used in the compared clustering algorithms. For the following experiments, we present the best result of each spectral clustering algorithm obtained after exploring its parameters, as suggested in the related papers. Because the results of K-means (in the last step of spectral clustering) depend on the initialization, each algorithm is repeated 20 times with random initializations. We report the average results with the standard deviation (std) for all compared algorithms. We also evaluated the sensitivity of the algorithms to the parameter k, and show the average results and the 95% confidence interval when varying k. All experiments were conducted in MATLAB 8.5.0 (R2015a) on Mac OS X 10.10.3 with a Core i7 (i7-4650U) CPU and 8 GB of RAM.

1. Clustering Results on Synthetic Data

We conducted the experiments on three types of 2D synthetic datasets and a high-dimensional
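The NMI criterion of (8)–(12) used throughout this section can be computed from two label vectors as in the following sketch; the function name is an illustrative assumption.

```python
import numpy as np

def nmi(labels_pred, labels_true):
    """Normalized mutual information per (8)-(12), computed from two label vectors."""
    labels_pred = np.asarray(labels_pred)
    labels_true = np.asarray(labels_true)
    n = labels_pred.size
    I = 0.0
    for c in np.unique(labels_pred):
        for cp in np.unique(labels_true):
            p_joint = np.sum((labels_pred == c) & (labels_true == cp)) / n   # (12)
            if p_joint > 0:
                p_c = np.sum(labels_pred == c) / n                           # (11)
                p_cp = np.sum(labels_true == cp) / n
                I += p_joint * np.log(p_joint / (p_c * p_cp))                # (9)

    def entropy(labels):
        _, counts = np.unique(labels, return_counts=True)
        p = counts / n                                                       # (11)
        return -np.sum(p * np.log(p))                                        # (10)

    return I / np.sqrt(entropy(labels_pred) * entropy(labels_true))          # (8)
```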
