Feature Selection-Ranking Methods in a Very Large Electric Database

Book chapter · Open access · Peer reviewed

2004; Springer Science+Business Media; Language: English

10.1007/978-3-540-24694-7_30

ISSN

1611-3349

Authors

Manuel Mejía-Lavalle, Guillermo Rodríguez-Ortiz, Gustavo Figueroa, Eduardo F. Morales

Topic(s)

Fuzzy Logic and Control Systems

Abstract

Feature selection is a crucial activity when knowledge discovery is applied to very large databases, as it reduces dimensionality and therefore the complexity of the problem. Its main objective is to eliminate attributes to obtain a computationally tractable problem without affecting the quality of the solution. Several feature selection methods have been proposed, some of them tested only on small academic datasets. In this paper we evaluate different feature selection-ranking methods on a very large real-world database related to a Mexican electric energy client-invoice system. Most research on feature selection methods evaluates only accuracy and processing time; here we also report on the amount of discovered knowledge and stress the issue of the boundary that separates relevant from irrelevant features. The evaluation was done using the Elvira and Weka tools, which integrate and implement state-of-the-art data mining algorithms. Finally, we propose a promising feature selection heuristic based on the experiments performed.
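
As a rough illustration of what a feature selection-ranking method does, the sketch below scores each discrete attribute by information gain with respect to the class and sorts the attributes by that score. Information gain is chosen here only as a familiar example of a ranking metric; the abstract does not say which metrics the authors evaluated, and the attribute names, rows, and class labels are hypothetical toy data, not taken from the paper or from the Elvira or Weka tools.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in class entropy obtained by splitting on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_features(rows, labels, names):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = [
        (name, information_gain([row[i] for row in rows], labels))
        for i, name in enumerate(names)
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: attribute names and values are illustrative only.
names = ["tariff", "region", "meter_type"]
rows = [
    ("A", "north", "x"),
    ("A", "south", "y"),
    ("B", "north", "x"),
    ("B", "south", "y"),
]
labels = ["yes", "yes", "no", "no"]

ranking = rank_features(rows, labels, names)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A ranking like this still leaves open where to cut the list, which is the boundary between relevant and irrelevant features that the abstract highlights; any threshold applied to these scores would be a further assumption.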