TY - JOUR
T1 - Stochastic tree ensembles for regularized nonlinear regression
AU - He, Jingyu
AU - Hahn, P. Richard
PY - 2023/3
Y1 - 2023/3
N2 - This article develops a novel stochastic tree ensemble method for nonlinear regression, referred to as accelerated Bayesian additive regression trees, or XBART. By combining regularization and stochastic search strategies from Bayesian modeling with computationally efficient techniques from recursive partitioning algorithms, XBART attains state-of-the-art performance at prediction and function estimation. Simulation studies demonstrate that XBART provides accurate point-wise estimates of the mean function and does so faster than popular alternatives, such as BART, XGBoost, and neural networks (using Keras) on a variety of test functions. Additionally, it is demonstrated that using XBART to initialize the standard BART MCMC algorithm considerably improves credible interval coverage and reduces total run-time. Finally, two basic theoretical results are established: the single tree version of the model is asymptotically consistent and the Markov chain produced by the ensemble version of the algorithm has a unique stationary distribution.
KW - Machine learning
KW - Markov chain Monte Carlo
KW - Regression trees
KW - Supervised learning
KW - Bayesian
UR - http://www.scopus.com/inward/record.url?scp=85112688498&partnerID=8YFLogxK
U2 - 10.1080/01621459.2021.1942012
DO - 10.1080/01621459.2021.1942012
M3 - RGC 21 - Publication in refereed journal
SN - 0162-1459
VL - 118
SP - 551
EP - 570
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 541
ER -