A survey of data mining and knowledge discovery process models and methodologies
2010; Cambridge University Press; Volume: 25; Issue: 2; Language: English
DOI: 10.1017/s0269888910000032
ISSN: 1469-8005
Authors: Gonzalo Mariscal, Óscar Marbán, C. Fernandez
Topic(s): Imbalanced Data Classification Techniques
Abstract: Up to now, many data mining and knowledge discovery methodologies and process models have been developed, with varying degrees of success. In this paper, we describe the most used (in industrial and academic projects) and most cited (in scientific literature) data mining and knowledge discovery methodologies and process models, providing an overview of their evolution throughout the history of data mining and knowledge discovery and setting down the state of the art in this topic. For every approach, we provide a brief description of the proposed knowledge discovery in databases (KDD) process, discussing its special features and its outstanding advantages and disadvantages. In addition, a global comparison of all presented data mining approaches is provided, focusing on the different steps and tasks through which every approach interprets the whole KDD process. As a result of the comparison, we propose a new data mining and knowledge discovery process, named the refined data mining process, for developing any kind of data mining and knowledge discovery project. The refined data mining process is built on specific steps taken from the analyzed approaches.
Reference(s)