Artigo Acesso aberto Revisado por pares

Automatic Switching of Clustering Methods based on Fuzzy Inference in Bibliographic Big Data Retrieval System

2014; Volume: 14; Issue: 4 Linguagem: Inglês

10.5391/ijfis.2014.14.4.256

ISSN

2093-744X

Autores

Maslina Zolkepli, Fangyan Dong, Kaoru Hirota,

Tópico(s)

Text and Document Classification Technologies

Resumo

An automatic switch among ensembles of clustering algorithms is proposed as a part of the bibliographic big data retrieval system by utilizing a fuzzy inference engine as a decision support tool to select the fastest performing clustering algorithm between fuzzy C-means (FCM) clustering, Newman-Girvan clustering, and the combination of both. It aims to realize the best clustering performance with the reduction of computational complexity from O( $n^3$ ) to O(n). The automatic switch is developed by using fuzzy logic controller written in Java and accepts 3 inputs from each clustering result, i.e., number of clusters, number of vertices, and time taken to complete the clustering process. The experimental results on PC (Intel Core i5-3210M at 2.50 GHz) demonstrates that the combination of both clustering algorithms is selected as the best performing algorithm in 20 out of 27 cases with the highest percentage of 83.99%, completed in 161 seconds. The self-adapted FCM is selected as the best performing algorithm in 4 cases and the Newman-Girvan is selected in 3 cases.The automatic switch is to be incorporated into the bibliographic big data retrieval system that focuses on visualization of fuzzy relationship using hybrid approach combining FCM and Newman-Girvan algorithm, and is planning to be released to the public through the Internet.

Referência(s)