Project Details
Description
Penalised regression refers to those estimation methods in regression where the
criterion used to estimate coefficients includes a penalty term on those coefficients. An
important distinguishing feature of many penalised regression methods is that they are
able to combine automatic variable selection and coefficient estimation simultaneously.
This is important because both are at the heart of statistical analysis. Moreover,
penalised regression methods also handle cases where the number of parameters to be
estimated exceeds the sample size. These very strong features make penalised
regression an attractive approach, and a sizable body of literature on this subject has
developed and continues to grow. Examples of penalised regression methods that have
achieved great success include the least absolute shrinkage and selection operator
(LASSO) and its adaptive variant, the elastic net (EN), and the smoothly clipped
absolute deviation (SCAD) estimator. Different branches of applied statistics have also
developed their own distinct types of penalised regression methods; for example, the
octagonal shrinkage and clustering algorithm for regression (OSCAR) was developed
mainly in response to statistical problems encountered in genetic studies.

Despite this vast intellectual progress, the literature still remains relatively silent on
several important issues. For example, penalised regression has not yet been extended
to statistical problems characterized by length-biased data. Also, despite the great
popularity of penalised regression in biostatistics, in fields like econometrics where
variable selection is always a major issue, the methodology has so far been largely
ignored. Moreover, there remains a considerable amount of work that could usefully
be undertaken to explore the properties of some recently proposed penalised regression
methods, especially under non-standard conditions.

With this motivation, this project is designed to extend the penalised regression method
in several important directions. To achieve this, the project is divided into five
parts. Part 1 focuses on the development of penalised regression methods in a class of
censored regression models known collectively in econometrics as Tobit models. Here,
a major issue to be addressed relates to the combination of penalised regression with
Heckman’s popular two-step estimation procedure in univariate Tobit
models. Also to be considered are ways of implementing penalised regression in the
bivariate Tobit model, with a specific focus on some of the recently proposed semiparametric
methods for this model. Part 2 of the project considers penalised regression
under length-biased sampling, where the sample observations are not selected
randomly, but with probability proportional to their length. Exploiting the principal
investigator’s recent work on modeling length-biased data using a varying-coefficient
quantile approach, we will develop penalised regression methods within the same
context, although our initial analysis will likely focus on the linear regression model as
an exploratory step towards obtaining results in the more complex contexts. Parts 3
and 4 will extend the Pairwise Absolute Clustering and Sparsity (PACS) method, an
improved variant of OSCAR, to problems characterized by censored and missing
data; in particular, Part 4 considers penalised regression in conjunction with the
estimating equations approach with missing data as developed in a recent paper by the
principal investigator. Part 5 applies penalised regression to a large Hong Kong
housing data set to illustrate the usefulness of this technique in social science research.
Some theoretical work for this project is already underway, and the investigators have
succeeded in obtaining some preliminary results. The amount requested will be used
mainly for the recruitment of research support staff to conduct simulation and real data
analysis.
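
As an illustration of how a penalty term can combine variable selection with coefficient estimation, the following is a minimal sketch of the LASSO fitted by cyclic coordinate descent. It is not the project's method: the function names, the 1/(2n) scaling of the squared-error loss, the fixed iteration count and the simulated data are all assumptions made for this example.

```python
import random


def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * sum_i (y_i - x_i'b)^2 + lam * sum_j |b_j|.

    X is a list of rows (hypothetical input layout); every column is
    assumed to contain at least one non-zero entry.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X @ beta, with beta starting at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Scaled inner product of column j with the partial residual
            # (the residual with coordinate j's own contribution added back).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_ss[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_ss[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Simulated example (assumed data): only the first and fourth predictors matter.
random.seed(0)
true_beta = [3.0, 0.0, 0.0, -2.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in true_beta] for _ in range(200)]
y = [sum(b * x for b, x in zip(true_beta, row)) + random.gauss(0.0, 1.0) for row in X]

beta_hat = lasso_coordinate_descent(X, y, lam=0.5)
print([round(b, 3) for b in beta_hat])
```

In this sketch the soft-thresholding step is what sets small coefficients exactly to zero, so the irrelevant predictors are typically dropped while the retained coefficients are shrunk towards zero by roughly the penalty level.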
Project number | 7008134 |
---|---|
Grant type | SRG |
Status | Finished |
Effective start/end date | 1/05/12 → 13/03/15 |