Development and Applications of Bayesian Compressive Sensing (BCS)-Based Active Learning Methods  


Student thesis: Doctoral Thesis




Awarding Institution
Award date: 9 Aug 2023


With rapid advances in artificial intelligence and computing, and continuous innovation in learning algorithms and theories, machine learning has become an active research topic in many fields. Machine learning algorithms generally construct models from sampling data to replace the original models in response prediction and decision making, which improves the efficiency of achieving specific objectives. The quality of a trained model depends greatly on the number and locations of the sampling data. In engineering practice, however, sampling data are often sparse and limited because of time or budget limits, technical or access constraints, and similar restrictions. A main challenge that hinders the application of machine learning algorithms in practical engineering is therefore how to train a high-quality model when only sparse sampling data are available, especially for highly non-stationary data or complicated systems. Moreover, most machine learning algorithms cannot self-evaluate the reliability of the trained models. This poses another challenge: how to determine whether the number of sampling data is sufficient to achieve the target reliability of specific objectives.

Active learning is a special case of machine learning in which a learning algorithm adaptively identifies the optimal next sampling point to improve the desired outputs until a target reliability is achieved. It is often used in applications where sampling data are expensive to obtain. This study aims to address the above-mentioned challenges in engineering applications by developing Bayesian compressive sensing (BCS)-based active learning methods. BCS is a novel sampling theory originating from signal processing, and it has recently been used for spatial data interpolation from sparse measurements. One major advantage of BCS is that it not only predicts the response at unsampled locations but also quantifies the uncertainty associated with these predictions at the same time. Leveraging this advantage, this study combines BCS with a series of learning functions to develop BCS-based active learning methods that adaptively determine the minimum number of sampling data required, and their optimal sampling locations, for achieving the target reliability of specific objectives. Because BCS is a non-parametric method, it is directly applicable to both stationary and non-stationary data. The developed BCS-based active learning methods tackle the challenges brought by small sample data in the following applications: planning of multi-stage site investigation, optimal design of the spatial locations of precipitation stations, reliability analysis, and global optimization.
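The adaptive sampling loop described in this paragraph can be illustrated with a minimal sketch. It is not the BCS formulation developed in the thesis: as a stand-in, it assumes a Bayesian linear model on a cosine basis with a Gaussian prior on the weights, and it uses the largest predictive standard deviation as the learning function. The names `cosine_basis`, `posterior`, and `active_learning`, the prior, the noise variance, and the stopping threshold are all hypothetical choices made for illustration only.

```python
import numpy as np

def cosine_basis(x, n_terms):
    """Basis matrix with columns cos(k*pi*x), k = 0..n_terms-1, for x in [0, 1]."""
    k = np.arange(n_terms)
    return np.cos(np.pi * np.outer(x, k))

def posterior(x_obs, y_obs, n_terms, noise_var=1e-4):
    """Posterior mean and covariance of the basis weights (Gaussian prior, assumed)."""
    k = np.arange(n_terms)
    prior_prec = np.diag((1.0 + k) ** 2)  # assumed prior favouring low-frequency terms
    B = cosine_basis(x_obs, n_terms)
    cov = np.linalg.inv(prior_prec + B.T @ B / noise_var)
    mean = cov @ B.T @ y_obs / noise_var
    return mean, cov

def active_learning(f, grid, n_init=3, n_terms=15, target_sd=0.05, max_samples=30):
    """Add the most uncertain grid point until the target uncertainty is met."""
    idx = list(np.linspace(0, len(grid) - 1, n_init).astype(int))
    B_grid = cosine_basis(grid, n_terms)
    while True:
        mean, cov = posterior(grid[idx], f(grid[idx]), n_terms)
        pred = B_grid @ mean
        # Predictive variance of the fitted function at every grid point.
        var = np.einsum("ij,jk,ik->i", B_grid, cov, B_grid)
        sd = np.sqrt(np.maximum(var, 0.0))
        if sd.max() <= target_sd or len(idx) >= max_samples:
            return idx, pred, sd
        idx.append(int(np.argmax(sd)))  # learning function: largest uncertainty

# Usage on a hypothetical one-dimensional test function.
grid = np.linspace(0.0, 1.0, 101)
idx, pred, sd = active_learning(lambda x: np.sin(2 * np.pi * x) + 0.5 * x, grid)
print(len(idx), "samples selected; max predictive sd =", sd.max())
```

The loop stops either when the largest predictive standard deviation falls below the target or when the sampling budget is exhausted, which mirrors the idea of sampling only until a target reliability is reached.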

Site investigation is an integral and vital part of underground construction. An active learning method for planning multi-stage geotechnical site investigation in a cross-section is first proposed using the Voronoi diagram, BCS, and information entropy. The proposed method can automatically determine the necessary number of samples and their optimal sampling locations for achieving a target reliability of the interpretation results for a cross-section. Subsequently, BCS is extended to spatio-temporal data modeling. A spatio-temporal BCS (ST-BCS) method for interpolating spatio-temporally varying precipitation data from sparse measurements is developed and applied to real monthly precipitation data in Hong Kong. The proposed ST-BCS method is also combined with information entropy to develop an active learning method for the optimal design of the spatial locations of precipitation stations, which improves the accuracy of spatio-temporal precipitation interpolation. BCS is further extended to the high-dimensional data modeling required in the response surface method (RSM), and the proposed BCS-based RSM is able to accurately reconstruct a highly nonlinear response surface from a small number of sampling points. Active learning reliability methods that combine adaptive BCS (ABCS) with reliability analysis algorithms, namely Monte Carlo simulation (MCS) and subset simulation (SS), are also developed. These methods, denoted ABCS-MCS and ABCS-SS (for rare events), respectively, can adaptively determine the minimum number of sampling points required and their sampling locations for achieving a target accuracy of reliability analysis. Moreover, the proposed BCS-based RSM is applied to develop an active learning method for unconstrained efficient global optimization (EGO). The proposed BCS-based EGO is illustrated using both analytic functions and engineering application examples, and it is shown to perform well.
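As background for the reliability analysis mentioned above, the following minimal sketch shows plain Monte Carlo simulation of a failure probability, without any surrogate model. It is not the ABCS-MCS method of the thesis. The limit-state function `limit_state`, the two standard normal input variables, and the sample size are hypothetical and were chosen so that the exact answer is known in closed form.

```python
import math
import numpy as np

def limit_state(x, beta=2.0):
    """Hypothetical linear limit state; failure is defined as g(x) < 0."""
    return beta * math.sqrt(2.0) - x[:, 0] - x[:, 1]

def mcs_failure_probability(g, n_samples, rng):
    """Crude Monte Carlo estimate of P[g(X) < 0] for standard normal X."""
    x = rng.standard_normal((n_samples, 2))
    failed = g(x) < 0.0
    pf = failed.mean()
    # Coefficient of variation of the estimator; infinite if no failure is observed.
    cov = math.sqrt((1.0 - pf) / (n_samples * pf)) if pf > 0 else math.inf
    return pf, cov

rng = np.random.default_rng(0)
pf, cov = mcs_failure_probability(limit_state, 200_000, rng)

# For this limit state, (x1 + x2) / sqrt(2) is standard normal,
# so the exact failure probability is Phi(-beta).
pf_exact = 0.5 * math.erfc(2.0 / math.sqrt(2.0))
print(f"estimated pf = {pf:.5f}, exact pf = {pf_exact:.5f}, CoV = {cov:.3f}")
```

Because the coefficient of variation scales with the inverse square root of the number of samples times the failure probability, direct simulation needs many evaluations of the limit state for small failure probabilities. This is the cost that a surrogate trained on few sampling points is meant to reduce.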