Article Open access National production

Multi-relational data mining in Microsoft SQL Server 2005

2006; WIT Press; Volume: 1; Language: English

10.2495/data060151

ISSN

1746-4463

Authors

C. L. Curotto, Nelson F. F. Ebecken, Hendrik Blockeel

Topic(s)

Bayesian Modeling and Causal Inference

Abstract

Most real-life data are relational by nature. Integrating data mining with databases is an essential goal to be achieved. Microsoft SQL Server (MSSQL) seems to provide an interesting and promising environment for developing aggregated multi-relational data mining algorithms by using nested tables and the plug-in algorithm approach. However, it is currently unclear how these nested tables can best be used by data mining algorithms. In this paper we look at how Microsoft Decision Trees (MSDT) handles multi-relational data, and we compare it with the multi-relational decision tree learner TILDE. In the experiments we perform, MSDT achieves predictive accuracy as good as that of TILDE, but the trees it produces either ignore the relational information or use it in a way that yields non-interpretable trees. As such, one could say that its explanatory power is reduced compared to that of a multi-relational decision tree learner. We conclude that it may be worthwhile to integrate a multi-relational decision tree learner in MSSQL.
