Peer-reviewed article

LAZY LEARNER ON DECISION TREE FOR RANKING

2008; World Scientific; Volume 17, Issue 01; Language: English

10.1142/s0218213008003819

ISSN

1793-6349

Authors

Yuhong Yan, Liang Han

Topic(s)

Data Mining Algorithms and Applications

Abstract

This paper aims to improve probability-based ranking (measured, e.g., by AUC) under the decision-tree paradigm. We observe that probability-based ranking sorts samples by their class probabilities; ranking is therefore a relative evaluation among samples. This motivates us to use a lazy learner to explicitly yield a unique set of class probabilities for each test sample, based on its similarities to the training samples within its neighborhood. We embed lazy learners at the leaves of a decision tree to assign class probabilities. This yields our first model, the Lazy Distance-based Tree (LDTree). We then improve this model by growing the tree a second time, and call the resulting model the Eager Distance-based Tree (EDTree). In addition to the benefits of lazy learning, EDTree also exploits the finer resolution of a larger tree structure. We compare our models with C4.5, C4.4 and their variants in AUC on a large suite of UCI data sets. The improvement shows that our method follows a new path to better ranking performance.
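The core idea in the abstract — replacing a leaf's single frequency-based probability with a per-sample, distance-weighted estimate — can be illustrated with a minimal sketch. This is not the authors' implementation; the function name, the Euclidean distance, the inverse-distance weighting, and the neighborhood size `k` are all assumptions used only to show why each test sample reaching the same leaf can receive a unique class probability.

```python
import math
from collections import defaultdict

def leaf_class_probs(leaf_samples, x, k=5):
    """Hypothetical sketch of a lazy, distance-based leaf estimator.

    Instead of the one frequency-based probability a C4.5-style leaf
    assigns to every sample that reaches it, weight the leaf's training
    samples by their similarity to the test sample x.

    leaf_samples: list of (feature_vector, class_label) pairs that
                  were routed to this leaf during training
    x:            feature vector of the test sample
    k:            neighborhood size within the leaf (assumed parameter)
    """
    # Distance from x to every training sample stored at the leaf.
    dists = [(math.dist(x, feats), label) for feats, label in leaf_samples]
    dists.sort(key=lambda pair: pair[0])
    neighbors = dists[:k]

    # Inverse-distance weights: closer training samples contribute more,
    # so two different test samples in the same leaf get different
    # probabilities -- exactly what relative ranking needs.
    weights = defaultdict(float)
    for d, label in neighbors:
        weights[label] += 1.0 / (d + 1e-9)  # small epsilon avoids division by zero

    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}
```

Two test samples falling into the same leaf but lying near different training samples will thus be ranked differently, whereas a plain frequency-based leaf would tie them.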