Comparison of tree-based ensemble models for regression
2022; Korean Statistical Society; Volume: 29; Issue: 5; Language: English
DOI: 10.29220/csam.2022.29.5.561
ISSN: 2383-4757
Topic(s): Statistical Methods and Bayesian Inference
Abstract: When multiple classification and regression trees are combined, tree-based ensemble models, such as random forest (RF) and Bayesian additive regression trees (BART), are produced. In this study, we compare the model structures and performances of these ensemble models in regression settings. RF fits each tree to a bootstrapped sample and, at each node, selects the splitting variable from a randomly chosen subset of the predictors. The BART model is specified as a sum of trees and is fitted using the Bayesian backfitting algorithm. Through extensive simulation studies, we investigate the strengths and drawbacks of the two methods in the presence of missing data, high-dimensional data, or highly correlated data. In the presence of missing data, BART performs well in general, whereas RF provides adequate coverage. BART outperforms RF on high-dimensional, highly correlated data. However, RF has the shorter computation time in all of the scenarios considered. The performance of the two methods is also compared using two real data sets that represent the aforementioned situations, and the same conclusion is reached.
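To make the RF mechanism described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes depth-1 trees (stumps), so the single root node is the only place where a random predictor subset is drawn; a full RF grows deeper trees and redraws the subset at every node. The names `fit_stump`, `fit_forest`, `predict_forest`, and `mtry`, as well as the toy data, are hypothetical and chosen only for illustration. BART's Bayesian backfitting is not sketched here.

```python
import random
from statistics import mean


def fit_stump(X, y, features):
    """Fit a depth-1 regression tree, searching only the given features."""
    best = None  # (sse, feature, threshold, left_mean, right_mean)
    for j in features:
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            lm, rm = mean(left), mean(right)
            sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    if best is None:
        # Every candidate feature is constant in this sample: predict the mean.
        m = mean(y)
        return (0, float("inf"), m, m)
    return best[1:]


def predict_stump(stump, row):
    j, t, lm, rm = stump
    return lm if row[j] <= t else rm


def fit_forest(X, y, n_trees=50, mtry=2, seed=0):
    """Bagged stumps with a random predictor subset per (root) node."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap: sample rows with replacement
        features = rng.sample(range(p), mtry)       # random subset of predictors
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], features))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_stump(s, row) for s in forest)


# Toy data (assumed): y is a step function of the first predictor;
# the other two predictors carry no signal.
X = [[i / 10, (i * 7 % 10) / 10, (i * 3 % 10) / 10] for i in range(20)]
y = [0.0 if i < 10 else 1.0 for i in range(20)]
forest = fit_forest(X, y)
low = predict_forest(forest, [0.2, 0.5, 0.5])
high = predict_forest(forest, [1.8, 0.5, 0.5])
```

Because roughly a third of the stumps never see the informative predictor, the averaged predictions are pulled toward the overall mean rather than matching the step exactly; `low` should still fall well below `high`.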