Thesis on Machine Learning in Empirical Asset Pricing
機器學習與實證資產定價的研究
Student thesis: Doctoral Thesis
Detail(s)
Awarding Institution | |
---|---|
Supervisors/Advisors | |
Award date | 17 Oct 2022 |
Link(s)
Permanent Link | https://scholars.cityu.edu.hk/en/theses/theses(068d2ee3-e7ef-464a-b009-1978205c8d10).html |
---|---|
Abstract
This dissertation proposes two machine learning (econometric) models for empirical asset pricing. The first is the Panel Tree (P-Tree), applied to the stock market, and the second is the Benchmark Combination Model (BCM), applied to the corporate bond market. The results are presented in two chapters.
In Chapter 1, we introduce P-Tree, a class of interpretable tree-based models with global split criteria for analyzing panel data. P-Tree splits the cross-section of asset returns to generate stochastic discount factors and diversified test portfolios. P-Tree prevents overfitting and visualizes nonlinearities among both macroeconomic and asset-specific variables. We find that long-term reversal, volume volatility, and market equity interact to drive cross-sectional return variation in U.S. equities, and that inflation is the most critical regime-switching variable interacting with firm characteristics. P-Trees consistently outperform extant models in pricing individual asset and portfolio returns, while delivering profitable and transparent trading strategies with a 2.46% monthly alpha and an annualized out-of-sample Sharpe ratio of 1.71.
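To make the idea of a global split criterion concrete, the following is a minimal sketch and not the thesis's actual algorithm. It assumes a single split on one characteristic, equal-weighted leaf portfolios, and a criterion equal to the maximum squared Sharpe ratio attainable from the leaf portfolios. The function names, the candidate thresholds, and the simulated data are all hypothetical.

```python
import numpy as np


def leaf_returns(returns, chars, threshold):
    """Equal-weighted returns of the two leaf portfolios formed by one split.

    returns, chars: arrays of shape (T, N) for T periods and N assets.
    Assets with chars <= threshold go to the first leaf, the rest to the second.
    A leaf that is empty in a period is given a return of 0 for that period.
    """
    low = chars <= threshold
    r_low = np.where(low, returns, 0.0).sum(axis=1) / np.maximum(low.sum(axis=1), 1)
    r_high = np.where(~low, returns, 0.0).sum(axis=1) / np.maximum((~low).sum(axis=1), 1)
    return np.column_stack([r_low, r_high])


def squared_sharpe(port_returns):
    """Maximum squared Sharpe ratio spanned by the columns: mu' Sigma^+ mu."""
    mu = port_returns.mean(axis=0)
    sigma = np.cov(port_returns, rowvar=False)
    return float(mu @ np.linalg.pinv(sigma) @ mu)


def best_split(returns, chars, candidates=(0.2, 0.4, 0.6, 0.8)):
    """Score every candidate threshold on the whole panel and keep the best.

    The criterion is "global" in the sense that each candidate is judged by
    the portfolios it produces over all periods, not by a local fit.
    """
    scores = {c: squared_sharpe(leaf_returns(returns, chars, c)) for c in candidates}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Simulated panel (an assumption for illustration, not data from the thesis).
    rng = np.random.default_rng(0)
    T, N = 120, 50
    chars = rng.uniform(size=(T, N))
    returns = 0.01 * (chars > 0.6) + rng.normal(scale=0.05, size=(T, N))
    threshold, scores = best_split(returns, chars)
    print("chosen threshold:", threshold)
    print("scores:", scores)
```

A full tree would repeat this search over many characteristics and over existing leaves, and could condition on macroeconomic variables; those steps are omitted here.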
In Chapter 2, we propose BCM to estimate and decompose asset risk premia in empirical asset pricing. The BCM pricing kernel is a weighted combination of basis portfolios sorted on many asset characteristics. With a no-arbitrage objective, our approach minimizes cross-sectional pricing errors and identifies the sources of risk premia. Using a 45-year sample of U.S. corporate bonds, we first find that BCM outperforms prevailing factor models in pricing corporate bonds. Second, we find that credit ratings, maturity, short-term reversal, momentum, and variance are the primary sources of bond risk premia. Finally, incorporating machine learning forecasts into BCM yields strong evidence of return predictability.
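As an illustration of a pricing kernel built from a weighted combination of basis portfolios, here is a minimal sketch under stated assumptions. It is not the BCM estimator itself. It assumes a linear kernel m_t = 1 - w'(F_t - E[F]) and excess returns, so that the no-arbitrage condition E[m R] = 0 becomes E[R] = Cov(R, F) w, and it picks w by ordinary least squares with an identity weighting matrix. The function name and the simulated data are hypothetical.

```python
import numpy as np


def sdf_weights(test_returns, basis_returns):
    """Least-squares weights w that minimise || E[R] - Cov(R, F) w ||^2.

    test_returns:  (T, N) excess returns of the assets to be priced.
    basis_returns: (T, K) excess returns of the basis portfolios.
    Returns the weights w (K,) and the cross-sectional pricing errors (N,).
    """
    T = test_returns.shape[0]
    mu = test_returns.mean(axis=0)
    r_centered = test_returns - mu
    f_centered = basis_returns - basis_returns.mean(axis=0)
    cov_rf = r_centered.T @ f_centered / (T - 1)  # (N, K) sample covariances
    w, *_ = np.linalg.lstsq(cov_rf, mu, rcond=None)
    errors = mu - cov_rf @ w
    return w, errors


if __name__ == "__main__":
    # Simulated returns (an assumption for illustration, not data from the thesis).
    rng = np.random.default_rng(0)
    basis = 0.005 + rng.normal(scale=0.03, size=(240, 3))
    loadings = rng.uniform(0.5, 1.5, size=(3, 10))
    tests = basis @ loadings + rng.normal(scale=0.01, size=(240, 10))
    w, errors = sdf_weights(tests, basis)
    print("weights:", w)
    print("root mean squared pricing error:", np.sqrt(np.mean(errors ** 2)))
```

Under these assumptions, each term Cov(R_i, F_k) w_k can be read as the part of asset i's expected excess return attributed to basis portfolio k, which is one simple way to think about decomposing risk premia by source.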