Conditional inference in generalized linear mixed models : model identification and robust estimation
基於條件分佈的廣義線性混合模型的統計推斷 : 模型識別和穩健估計
Student thesis: Doctoral Thesis
This thesis considers statistical inference problems in generalized linear mixed models (GLMMs). In particular, model identification and robust residual maximum likelihood (REML) estimation for GLMMs are studied in detail. The formulation and estimation of GLMMs are first reviewed, and the differences between conditional likelihood based and marginal likelihood based methods are then discussed. Simulation results indicate that both methods are promising when the sample size is relatively large. The REML estimation method is effective in reducing the negative bias in the estimates of the variance component parameters when the sample size is small.
To address the problem of model selection in GLMMs, a model identification instrument based on the conditional Akaike information (cAI) is developed. In particular, an asymptotically unbiased estimator of the cAI (denoted as cAICC) is derived as the model selection criterion, which takes the estimation uncertainty in the variance component parameter into consideration. The relationship between bias correction and generalized degrees of freedom for GLMMs is also explored. Simulation results show that the estimator performs well. An adjusted model selection criterion (denoted as cAICA), which is based on heuristic arguments, is also proposed as an alternative tool for model identification. Both criteria achieve a high proportion of correct model identification for GLMMs. Three real data sets (epilepsy seizure count data, polio incidence data and US strike data) are used to illustrate the proposed model identification methods.
To limit the effect of outliers, a robust version of REML estimation for the Poisson log-linear mixed model is developed. The method not only provides robust estimates of the fixed effect and variance component parameters, but also gives robust predictions of the random effects. Theoretical and numerical aspects of the estimators are examined. Simulation results show that the proposed method is effective in limiting the effect of outliers under different contamination schemes. The epilepsy seizure count data are used to illustrate the method.
The robust REML estimation method is then extended to the k-component Poisson mixture model with random effects. The behavior of the estimator is studied, and formulae for the asymptotic covariance matrix are derived. A simulation study shows that the proposed robust REML estimator performs comparably to the conventional REML estimator on regular data and outperforms it in the presence of outliers. The urinary tract infections data are used to demonstrate the proposed robust estimation method.
Following similar lines of derivation, the methodologies developed here could be extended to a general class of hierarchical generalized linear models and generalized additive models. These topics are left as future research directions.
- Linear models (Statistics)