Project Details
Description
Model selection has always been an integral part of statistical practice. It is so widely practised that
there can be few statisticians who have not employed criteria such as the AIC or BIC to choose
between models. Unfortunately, practitioners do not commonly recognize the additional uncertainty
introduced by model selection in the process of statistical modeling. In reality, properties of estimators
and tests subsequent to model selection depend on the way the model has been selected in addition to
the stochastic nature of the chosen model. However, practitioners usually take into account only
the latter and report estimates obtained from the chosen model as if they were unconditional, when they
are actually conditional estimates. An extensive literature on post-model selection inference has shown
that this under-reporting of uncertainty can be a very serious problem if the additional variability introduced by model
selection is ignored.

Many statisticians have argued that a simple way to overcome the aforementioned under-reporting
problem is by model averaging. A model average estimator compromises across a set of competing
models, and in doing so incorporates model uncertainty into the conclusions about the unknown
parameters. Model averaging has long been a popular technique among Bayesian statisticians. Lately
there have also been several seminal developments from a frequentist standpoint. The proposed
project is motivated by some of the unanswered questions in this emerging literature. Of the numerous
interesting avenues of research in this growing area, five have been selected for particular attention in
the proposed project. For some of these selected topics the Principal Investigator (P.I.) has carried out
preliminary analysis and succeeded in deriving the key theoretical results.

Part 1 of the project develops frequentist model average estimators in discrete choice models based on
scores from a variety of Focused Information Criteria (FIC), with special attention paid to
multinomial, ordinal and nested logit models. Monte Carlo studies will be undertaken to compare
model average estimators using different weight choices. Empirical analysis using real data will also
be performed. Part 2 of the project deals with truncated and censored regression models and examines
model combining in a similar manner to Part 1. Part 3 of the project considers the use of the Mallows
criterion for model averaging in linear regression. A recent working paper by the P.I. shows that the
Mallows criterion continues to possess the optimality property established in Hansen (2007,
Econometrica) even if some of crucial assumptions are relaxed. The whole matter concerning the use
of the Mallows criterion requires a more thorough investigation, especially when the observations are
dependent, and the project will take some steps in this direction. Part 4 of the project focuses on the
threshold regression model, and develops a model combining scheme such that the model weights are
selected by minimizing the trace of the unbiased estimate of the MSE matrix of the model average
estimator. Model selection in the face of incomplete data has received considerable attention in recent
years, and Part 5 of the project is devoted to an investigation of the properties of model average
estimators with weights based on model selection scores developed for different incomplete data
circumstances. We also consider the scenario where model averaging is preceded by imputation of the
missing data. Monte Carlo studies will be conducted to examine the performance of model average
estimators subject to different methods of missing data correction and based on different model weight
choices.
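To make the Mallows criterion of Part 3 concrete, the following is a minimal sketch, assuming Hansen's (2007) setup of nested linear regression models with weights on the unit simplex and the error variance estimated from the largest model. The simulated data, the two candidate models, and the grid search over weights are illustrative choices, not part of the project description.

```python
import numpy as np

# Simulated example: x2 has a weak effect, so neither the small nor the
# large model is obviously preferable and averaging is attractive.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.8 * x1 + 0.1 * x2 + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), x1])        # small (restricted) model
X2 = np.column_stack([np.ones(n), x1, x2])    # large (unrestricted) model

def fitted(X, y):
    """Least-squares fitted values for a candidate model."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ beta

yhat1, yhat2 = fitted(X1, y), fitted(X2, y)
k1, k2 = X1.shape[1], X2.shape[1]

# Error variance estimated from the largest candidate model.
sigma2 = np.sum((y - yhat2) ** 2) / (n - k2)

def mallows(w):
    """Mallows criterion C(w) = ||y - yhat(w)||^2 + 2*sigma^2*k(w),
    where w is the weight on the small model and k(w) the averaged
    parameter count."""
    yhat = w * yhat1 + (1.0 - w) * yhat2
    return np.sum((y - yhat) ** 2) + 2.0 * sigma2 * (w * k1 + (1.0 - w) * k2)

# With only two models the simplex is the interval [0, 1], so a fine
# grid search suffices; larger model sets require quadratic programming.
grid = np.linspace(0.0, 1.0, 1001)
w_star = grid[np.argmin([mallows(w) for w in grid])]
print(f"weight on small model: {w_star:.3f}")
```

By construction the selected weight can do no worse than committing to either candidate model, which is the sense in which the criterion compromises across the model set.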
| Project number | 7002428 |
| --- | --- |
| Grant type | SRG |
| Status | Finished |
| Effective start/end date | 1/04/09 → 30/06/09 |