Improved and Knowledge-informed Lasso Family Statistical Learning with Applications
Student thesis: Doctoral Thesis
Detail(s)
Award date: 23 Nov 2023
Link(s)
Permanent Link: https://scholars.cityu.edu.hk/en/theses/theses(e5399c31-6e0d-4b53-a289-19a6eb857b5c).html
Abstract
Industrial process quality predictions are critical for smart manufacturing since they directly affect profitability. Inferential sensors for such predictions have been studied for over three decades in process systems engineering. The datasets collected from industrial processes pose challenges for learning and prediction due to their massive size and high dimensionality, and one of the critical tasks in developing data-driven inferential sensors is selecting relevant variables as model inputs.
Sparse statistical learning methods such as the least absolute shrinkage and selection operator (Lasso) offer promising solutions for selecting relevant variables. As a close alternative, least angle regression (LARS) provides a forward-selection route to sparse regression modeling at a computational cost comparable to ordinary least squares. Although the Lasso family is popular for variable selection in regression modeling, these methods face challenges in stability and optimality.
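For reference, the Lasso referred to throughout is the usual $\ell_1$-penalized least-squares estimator; the display below is a standard rendering in our notation, not copied from the thesis:

$$ \hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{p}} \; \frac{1}{2n}\,\lVert y - X\beta \rVert_2^2 \;+\; \lambda\,\lVert \beta \rVert_1 $$

A single penalty weight $\lambda$ both zeroes out coefficients (variable selection) and shrinks the surviving estimates, which is exactly the coupling that the optimality critique below concerns.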
Firstly, the Lasso family has several drawbacks when applied to process data, which are usually collinear due to material and energy balances. For instance, LARS tends to select only one variable out of a collinear set; the remaining variables in the set are not selected even though they are highly relevant to the quality variable being predicted. Secondly, the set of variables the Lasso selects can be sensitive to the training samples: even minor changes in the data can produce a different selection without any significant change in model quality. Additionally, the variable selection results can disagree with process knowledge. Lastly, a notable drawback of the Lasso family stems from its acclaimed advantage of determining the zero coefficients and estimating the non-zero coefficients simultaneously with a single hyperparameter value: the value that drives some coefficients to zero does not necessarily yield optimal estimates for the non-zero coefficients of the remaining variables.
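The collinearity and instability issues are easy to reproduce empirically. Below is a minimal sketch of ours (not code from the thesis) using scikit-learn's LassoLars: with two nearly identical predictors, typically only one enters the model, and which one survives can flip under a small bootstrap perturbation of the training data.

```python
# Minimal illustration (ours, not from the thesis): LARS-based Lasso with
# two nearly collinear predictors that are both relevant to the response.
import numpy as np
from sklearn.linear_model import LassoLars

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)      # near-duplicate of x1 (collinear)
x3 = rng.normal(size=n)                  # irrelevant variable
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.1 * rng.normal(size=n)   # both collinear variables matter

for trial in range(3):
    idx = rng.choice(n, size=n, replace=True)          # minor change: bootstrap resample
    model = LassoLars(alpha=0.05).fit(X[idx], y[idx])
    print("trial", trial, "selected:", np.flatnonzero(np.abs(model.coef_) > 1e-8))
```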
In this dissertation, three methods are presented to tackle these issues. Firstly, a new stable Lasso formulation and selection criterion is proposed, utilizing cross-validation for effective model structure learning. The stable Lasso algorithm produces reliable and consistent model structures across cross-validation folds, particularly for steady-state and dynamic inferential sensor modeling. Secondly, a knowledge-informed Lasso (KILasso) formulation is introduced to ensure that variables flagged by process knowledge are retained in the sparse model, aligning it with domain expertise and facilitating more accurate predictions. Lastly, a novel two-step sparse learning approach is presented in which variable selection and model parameter estimation each receive an optimally tuned hyperparameter; for example, KILasso can be paired with ridge regression to form the knowledge-informed Lasso-ridge (KILR) algorithm, sketched below. Together, these methods improve sparse statistical learning by ensuring stability, leveraging process knowledge, and decoupling variable selection from parameter estimation.
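The exact KILasso and KILR formulations are developed in the thesis body, not in this abstract. As a hedged sketch of the two-step idea only, the code below approximates knowledge-informed selection by forcing a user-specified set of variables into the Lasso support, then re-estimates the coefficients with a separately cross-validated ridge penalty; `kilr_sketch` and `must_keep` are illustrative names of ours, not the thesis's API.

```python
# Two-step sketch (an approximation, not the thesis's algorithm):
# step 1 selects variables, step 2 re-estimates their coefficients,
# and each step tunes its own hyperparameter by cross-validation.
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV

def kilr_sketch(X, y, must_keep):
    # Step 1: Lasso-based selection with its own CV-tuned penalty; the
    # knowledge-informed variables in must_keep are retained regardless.
    lasso = LassoCV(cv=5).fit(X, y)
    support = sorted(set(np.flatnonzero(lasso.coef_)) | set(must_keep))
    # Step 2: ridge refit on the selected variables with a separately
    # CV-tuned penalty, decoupling estimation from selection.
    ridge = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X[:, support], y)
    return support, ridge

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 10))
y = X[:, 0] - 0.5 * X[:, 3] + 0.1 * rng.normal(size=300)
support, ridge = kilr_sketch(X, y, must_keep=[3])   # suppose process knowledge flags x3
print("selected columns:", support)
print("ridge coefficients:", ridge.coef_)
```

Tuning the ridge penalty separately from the selection penalty is what frees the non-zero coefficient estimates from the single-hyperparameter compromise noted above.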
The superiority of the proposed methods is validated on three case studies: an industrial boiler process, the Dow Chemical challenge problem, and a cooling-load prediction task for commercial buildings.