Contributions to frequentist model averaging in econometrics
Student thesis: Doctoral Thesis
It is common, both in the literature and in practical applications, for data analysts to choose one best model from a class of models and then proceed with their analysis as though the final “winner” were the true model that generated the data. This approach has been criticized for two main reasons. First, it ignores the uncertainty introduced by the model selection process, which leads to underestimated variability and overly optimistic confidence intervals. Second, it is unstable and carries the risk of choosing a very poor model. To overcome these shortcomings of model selection, model averaging methods, such as Bayesian model averaging and frequentist model averaging, have been proposed. Instead of picking a single model, model averaging combines a set of competing models so that model uncertainty is incorporated into the conclusions about the unknown parameters of interest. Bayesian model averaging has flourished in the theoretical literature since the 1970s, and it has been demonstrated empirically to outperform model selection in certain respects in some applications. In practice, however, the computational burden and the difficulty of assigning prior distributions to the unknown parameters impede the broad application of Bayesian model averaging. Frequentist model averaging, which has a more recent history than Bayesian model averaging, has drawn increasing attention in the theoretical literature and is making significant progress in practical applications. Moreover, it has been shown in some empirical applications that frequentist model averaging consistently produces forecasts superior to those of model selection and Bayesian model averaging. The primary purpose of this thesis is to examine the performance of frequentist model averaging methods in econometrics. Specifically, three settings are considered: discrete choice models, parametric models of the Lorenz curve, and the expectile regression model.
Among discrete choice models, we focus on the ordered probit model and the nested logit model because they are widely used in a range of fields, such as sociology, psychology and marketing. The ordered probit model is used when the alternative choices have a natural ordering, whereas the nested logit model is better suited to situations where the alternative choices can be grouped into several clusters. The Lorenz curve is a graphical representation of the distribution of income or wealth, plotting the cumulative share of income or wealth against the cumulative share of the population, and is often used to examine the degree of income inequality within an economic entity. Several parametric forms for fitting the Lorenz curve have been proposed in the literature. The expectile regression model is useful for forecasting the Expectile Value at Risk, a newly proposed measure of the risk of loss on a specific portfolio of financial assets. In the context of these three models, we examine the finite-sample performance of commonly used frequentist model averaging methods through several Monte Carlo studies. Real data sets on sociology, income distribution, and excess stock returns are then used to study the practical performance of the model averaging methods.
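As a rough illustration of what "combining a set of competing models" can look like in the frequentist setting, the sketch below uses smoothed-AIC weights, in which each candidate model receives a weight proportional to exp(-AIC/2). This is only one weighting scheme from the literature and is chosen here as an assumption; the abstract does not say which averaging methods the thesis examines. The function names, AIC values, and forecasts are all hypothetical.

```python
import math


def smoothed_aic_weights(aic_values):
    """Return smoothed-AIC weights, w_m proportional to exp(-AIC_m / 2)."""
    best = min(aic_values)
    # Subtract the smallest AIC before exponentiating for numerical stability;
    # the shift cancels when the weights are normalized.
    raw = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def average_estimates(estimates, weights):
    """Weighted average of per-model estimates of the same quantity."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical AIC values and point forecasts from three candidate models.
aic = [102.3, 100.0, 107.9]
forecasts = [1.8, 2.1, 1.2]

weights = smoothed_aic_weights(aic)
combined = average_estimates(forecasts, weights)
```

Model selection by AIC would keep only the second model's forecast. The averaged forecast instead lies between the individual forecasts, with the lowest-AIC model receiving the largest weight.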
- Average., Statistical methods, Econometrics