Statistical Process Monitoring for High-Dimensional Processes

統計過程控制中的高維過程監測研究

Student thesis: Doctoral Thesis

Detail(s)

Award date: 6 Dec 2022

Abstract

Statistical process monitoring offers tools for detecting changes in data streams and processes. These tools have been applied in manufacturing, public health surveillance, the service industry, and other fields. Widely used sensors and internet technology create data-rich environments, so hundreds of measurements related to the process of interest are often available for monitoring. The growing number of features and the complicated dependence structures in high-dimensional data challenge the applicability and performance of existing approaches. This dissertation proposes several new statistical process monitoring methods for detecting location changes in high-dimensional processes, particularly when the processes are inherently nonstationary.

The correlation structure among high-dimensional variables can be difficult to estimate accurately, even when a large amount of data is available. Furthermore, the underlying distribution of the process can be time-dependent, which makes estimation harder still. The curse of dimensionality also reduces the detection power of monitoring methods. Because nonstationarity is common in many processes, we propose solutions for detecting changes in high-dimensional nonstationary processes. Chapters 2 and 3 focus on detecting changes in the dynamic mean of a process. Chapters 4 and 5 then propose methods for monitoring sparse shifts in the mean vector, treating dynamic variance as common-cause variability.

In Chapter 2, we use functional data analysis to model nonstationary mean levels, since the time series of each variable can often be represented by a curve (or profile). Functional data analysis treats each profile as a linear combination of basis functions and can therefore model dynamic processes efficiently. After modeling the processes with a set of basis functions, we propose control charts based on the extracted functional principal components and the residuals to detect location changes in the process online. A health-related case study shows that the monitoring scheme detects changes efficiently.
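As a rough illustration of the idea in this paragraph, the following Python sketch fits a basis expansion to in-control profiles, extracts principal components, and charts both the component scores (a T²-type statistic) and the residuals (a squared prediction error). It is not the dissertation's method: the Legendre basis, the number of components, the empirical-quantile control limits, the simulated data, and the names `fit_fpca_charts` and `chart_statistics` are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial.legendre import legvander


def chart_statistics(model, profile):
    """Return (T2, SPE) for one profile under a fitted model."""
    centered = profile - model["mean"]
    scores = centered @ model["loadings"]            # principal component scores
    t2 = np.sum(scores ** 2 / model["eigvals"])      # variation inside the PC space
    residual = centered - model["loadings"] @ scores
    spe = residual @ residual                        # variation left in the residuals
    return t2, spe


def fit_fpca_charts(profiles, n_basis=6, n_components=2, alpha=0.01):
    """Phase I fit on in-control profiles (rows are profiles, columns are time points)."""
    n, m = profiles.shape
    grid = np.linspace(-1.0, 1.0, m)
    # Assumed basis: Legendre polynomials, orthonormalised on the grid so that
    # PCA on the coefficients equals PCA on the smoothed curves.
    basis, _ = np.linalg.qr(legvander(grid, n_basis - 1))
    mean_profile = profiles.mean(axis=0)
    coefs = (profiles - mean_profile) @ basis        # basis coefficients, shape (n, n_basis)
    _, s, vt = np.linalg.svd(coefs, full_matrices=False)
    model = {
        "mean": mean_profile,
        "loadings": basis @ vt[:n_components].T,     # principal component curves
        "eigvals": s[:n_components] ** 2 / (n - 1),
    }
    # Assumed control limits: empirical quantiles of the Phase I statistics.
    stats = np.array([chart_statistics(model, p) for p in profiles])
    model["t2_limit"], model["spe_limit"] = np.quantile(stats, 1 - alpha, axis=0)
    return model


# Simulated example: sinusoidal mean with a random slope, then a level shift.
rng = np.random.default_rng(0)
grid = np.linspace(-1.0, 1.0, 50)
in_control = (np.sin(np.pi * grid)
              + rng.normal(size=(200, 1)) * grid
              + rng.normal(scale=0.1, size=(200, 50)))
model = fit_fpca_charts(in_control)

shifted = np.sin(np.pi * grid) + 2.0 + rng.normal(scale=0.1, size=50)
t2, spe = chart_statistics(model, shifted)
alarm = (t2 > model["t2_limit"]) or (spe > model["spe_limit"])
```

Charting the residuals alongside the scores matters here because a shift that is not aligned with the retained components would otherwise go unnoticed.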

Chapter 3 extends Chapter 2 and is inspired by the same case study. Here we address the problem of building a reliable monitoring scheme for a high-dimensional process when historical data from the target process are limited but sufficient data can be collected from other processes that are correlated with it. We propose a transfer learning-based estimator that increases parameter estimation accuracy for the target process, and we address the questions of what knowledge to transfer from related processes and how to transfer it. A multivariate control chart based on the transfer-learned parameters is developed for online monitoring. An extensive simulation study demonstrates the numerical performance of the proposed method.
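The abstract does not specify the form of the transfer learning-based estimator, so the Python sketch below only illustrates the general idea under simple assumptions: the target mean vector is estimated as a blend of the small target sample mean and a large source sample mean, and cross-validation on the target data decides how much to borrow. The function `transfer_mean`, the weight grid, and the simulated processes are hypothetical.

```python
import numpy as np


def transfer_mean(target, source, weights=np.linspace(0.0, 1.0, 11), n_folds=5):
    """Blend target and source mean vectors.

    The blending weight is chosen by K-fold cross-validation on the target
    data, so an unhelpful source should receive a weight near zero.
    """
    source_mean = source.mean(axis=0)
    folds = np.array_split(np.arange(len(target)), n_folds)
    cv_error = np.zeros(len(weights))
    for held_out in folds:
        train_mean = np.delete(target, held_out, axis=0).mean(axis=0)
        for k, w in enumerate(weights):
            estimate = (1 - w) * train_mean + w * source_mean
            cv_error[k] += np.sum((target[held_out] - estimate) ** 2)
    best = weights[np.argmin(cv_error)]
    return (1 - best) * target.mean(axis=0) + best * source_mean, best


# Simulated example: 10 target observations in 50 dimensions, one related
# source process and one unrelated source process.
rng = np.random.default_rng(1)
target = rng.normal(size=(10, 50))
related = rng.normal(loc=0.05, size=(500, 50))
unrelated = rng.normal(loc=100.0, size=(500, 50))

mean_related, w_related = transfer_mean(target, related)
mean_unrelated, w_unrelated = transfer_mean(target, unrelated)
```

In this toy setting the selected weight answers "what to transfer" in a crude way: information is borrowed from the related process and rejected from the unrelated one. The blended estimate could then replace the sample mean in a multivariate control chart.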

Chapter 4 proposes two change-point-based control charts for detecting mean shifts in heteroscedastic processes, where the underlying variance changes over time. The charting statistics are sensitive to large, sparse changes in the mean of high-dimensional processes. A post-signal diagnosis method is proposed to estimate the change point and identify the assignable causes. An advantage of the proposed scheme is that it can start monitoring with limited in-control data. Two real cases illustrate the practicability of the proposed methods for monitoring high-dimensional heteroscedastic processes.
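The sketch below shows, in Python, one way a change-point statistic can target sparse mean shifts while tolerating unequal variances. It is an assumption-laden illustration rather than either of the charts proposed in Chapter 4: it scans candidate change points, computes a Welch-type standardized mean difference per variable (so the two segments may have different variances), and takes the maximum over variables. The function name, the minimum segment length, and the diagnosis cutoff of 6 are hypothetical choices.

```python
import numpy as np


def max_type_change_point(data, min_segment=10):
    """Scan candidate change points; return (statistic, change point, z-scores).

    For each split, z[j] is a Welch-type standardized difference between the
    mean of variable j after and before the split. The maximum of |z| over
    variables is sensitive to shifts that affect only a few variables.
    """
    n, _ = data.shape
    best_stat, best_tau, best_z = -np.inf, None, None
    for tau in range(min_segment, n - min_segment + 1):
        before, after = data[:tau], data[tau:]
        se = np.sqrt(before.var(axis=0, ddof=1) / tau
                     + after.var(axis=0, ddof=1) / (n - tau))
        z = (after.mean(axis=0) - before.mean(axis=0)) / se
        stat = np.max(np.abs(z))
        if stat > best_stat:
            best_stat, best_tau, best_z = stat, tau, z
    return best_stat, best_tau, best_z


# Simulated example: 100 variables whose standard deviation drifts over time,
# with a mean shift in variables 3 and 7 starting at observation 40.
rng = np.random.default_rng(2)
n, p = 60, 100
scale = 1.0 + 0.5 * np.sin(2 * np.pi * np.arange(n) / n)
data = rng.normal(size=(n, p)) * scale[:, None]
data[40:, [3, 7]] += 5.0

stat, tau_hat, z = max_type_change_point(data)
# Post-signal diagnosis with an illustrative cutoff: flag strongly shifted variables.
shifted = np.flatnonzero(np.abs(z) > 6.0)
```

For online use, the same scan would be repeated as each new observation arrives and compared with a control limit. How that limit is set is not described in the abstract and is left out here.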

The performance of the methods proposed in Chapter 4 is limited by the shift size. In Chapter 5, to improve sensitivity to small and sparse mean changes in high-dimensional processes, we propose a rank-based EWMA monitoring scheme. Heteroscedasticity is again treated as common-cause variability. A bootstrap algorithm determines the control limits so that a pre-specified false alarm probability is achieved. A post-signal diagnosis strategy clusters the shifted variables and estimates a time window for the change point. The simulation results verify the robustness of the proposed method under heteroscedasticity and its efficiency in detecting small changes.
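To make the ingredients of this paragraph concrete, here is a minimal Python sketch of a rank-based EWMA chart with a bootstrap control limit. It is a simplified stand-in, not the proposed scheme: each new value is ranked against an in-control reference sample, the standardized ranks are smoothed by an EWMA per variable, the charting statistic is the largest absolute EWMA, and the limit is the quantile of bootstrap run maxima that gives a chosen false alarm probability over a fixed run length. The thesis's treatment of heteroscedasticity and its diagnosis step are not reproduced, and all names, parameters, and data are assumptions.

```python
import numpy as np


def rank_scores(reference, x):
    """Standardized rank of each coordinate of x within the reference sample."""
    m = reference.shape[0]
    less = (reference < x).sum(axis=0)
    equal = (reference == x).sum(axis=0)
    u = (less + 0.5 * equal) / m              # roughly uniform on [0, 1] in control
    return np.sqrt(12.0) * (u - 0.5)          # mean 0, variance about 1 in control


def ewma_path(reference, stream, lam=0.1):
    """Charting statistic over time: max over variables of |EWMA of rank scores|."""
    ewma = np.zeros(reference.shape[1])
    stats = []
    for x in stream:
        ewma = (1 - lam) * ewma + lam * rank_scores(reference, x)
        stats.append(np.max(np.abs(ewma)))
    return np.array(stats)


def bootstrap_limit(reference, run_length, alpha=0.05, n_boot=200, lam=0.1, seed=0):
    """Limit such that about alpha of bootstrap in-control runs would signal."""
    rng = np.random.default_rng(seed)
    m = reference.shape[0]
    maxima = [
        ewma_path(reference, reference[rng.integers(0, m, size=run_length)], lam).max()
        for _ in range(n_boot)
    ]
    return np.quantile(maxima, 1 - alpha)


# Simulated example: 30 variables, a shift in variable 5 from time 50 onward.
rng = np.random.default_rng(3)
reference = rng.normal(size=(200, 30))
limit = bootstrap_limit(reference, run_length=100)

stream = rng.normal(size=(100, 30))
stream[50:, 5] += 1.5
stats = ewma_path(reference, stream)
alarms = np.flatnonzero(stats > limit)        # time indices at which the chart signals
```

Because the EWMA accumulates evidence over time, the chart can react to shifts that would be too small for a single-observation statistic, which is the motivation the paragraph gives for moving beyond the Chapter 4 charts.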

Overall, this dissertation develops new methods for monitoring high-dimensional processes and thereby contributes to the field of statistical process monitoring. The proposed methods are built on assumptions that accommodate various nonstationary conditions, which makes them applicable to a wide range of real applications.

Research areas

  • Statistical process monitoring, Control charts, Nonstationary processes, High-dimensional