Statistical Analysis of Incomplete Failure Time Data in Reliability


Student thesis: Doctoral Thesis



  • Xun XIAO


Supervisors
  • Min XIE (Supervisor)
  • Nozer Darabsha SINGPURWALLA (Supervisor)
Award date: 15 Dec 2015


Owing to rapid technological development, contemporary systems have become much more reliable than before. Although failure time data, from both experiments and the field, have been used to characterize system reliability for over half a century, it is now much harder to collect sufficient and exact failure time data from such highly reliable systems. Incomplete failure time data thus arise naturally. Here, 'incomplete' means the data are either inexact or insufficient owing to limitations of time, budget, and measurement techniques. Clearly, there is a need for statistical tools capable of analyzing incomplete failure time data and extracting the maximum possible information from the limited data, while assuring a sound interpretation of system reliability. As reliability data come in various forms, our study is confined to the three most common types, described below.
Failure time data: The failure time of an individual system is usually modelled as a non-negative random variable following a specific distribution, e.g., the exponential, Weibull, or lognormal. Given a sample of this random variable, we aim to estimate the distribution parameters and derived quantities such as the mean time to failure and the variance. Owing to the nature of most sampling techniques and collection procedures, the data are usually incomplete (truncated or censored).
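As a minimal illustration of parameter estimation from censored failure time data (not the dissertation's own procedure), consider the exponential distribution under Type-I right censoring, where the MLE of the failure rate has a simple closed form; all parameter values below are assumptions chosen for the sketch.

```python
import math
import random

random.seed(42)

# Hypothetical setup: n units with exponential lifetimes, the test terminated
# at a fixed censoring time (Type-I censoring). The exponential MLE is
#   rate_hat = (number of observed failures) / (total time on test).
true_rate = 0.5       # assumed failure rate (failures per unit time)
censor_time = 3.0     # test is stopped here; later failures are censored
n = 10000

lifetimes = [random.expovariate(true_rate) for _ in range(n)]
observed = [min(t, censor_time) for t in lifetimes]        # censored sample
failures = sum(1 for t in lifetimes if t <= censor_time)   # exact failures seen

rate_hat = failures / sum(observed)
print(round(rate_hat, 3))
```

Even though some lifetimes are only known to exceed the censoring time, the total-time-on-test statistic still recovers the rate consistently.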
Recurrent failure time data: The failures of a repairable system are not isolated but recurrent. Recurrent failure times are usually modelled by a counting process, such as a Poisson process or a general renewal process. Given realizations of the stochastic process, the aim is to estimate process parameters such as the intensity and the mean time between failures. Again, the data might be incomplete (censored or grouped).
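A sketch of the simplest case (illustrative, not the dissertation's code): a homogeneous Poisson process simulated from its exponential interarrival times, with the intensity recovered by the MLE N(T)/T; the intensity and horizon are assumed values.

```python
import random

random.seed(7)

rate = 2.0        # assumed intensity (failures per unit time)
horizon = 5000.0  # observation window [0, T]

# A Poisson process has i.i.d. exponential interarrival times, so the event
# times are cumulative sums of exponential variates up to the horizon.
t, arrivals = 0.0, []
while True:
    t += random.expovariate(rate)
    if t > horizon:
        break
    arrivals.append(t)

rate_hat = len(arrivals) / horizon   # MLE of the intensity: N(T) / T
mtbf_hat = 1.0 / rate_hat            # estimated mean time between failures
print(round(rate_hat, 2), round(mtbf_hat, 2))
```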
Degradation data: Degradation analysis involves measuring some physical characteristic of a product that can be directly related to its failure. Failure is usually defined as the degradation reaching a predetermined threshold. The degradation of the characteristic can be modelled by a smooth function of time or by a continuous stochastic process indexed by time, and the failure time distribution can then be derived by extrapolating the degradation path to the threshold.
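The threshold-crossing idea can be sketched with a Wiener process with drift (a model used later in this dissertation); the failure time is the first passage time of the path to the threshold. The parameter values below are assumptions, and the simulation uses a simple Euler discretization.

```python
import random

random.seed(1)

# Degradation path X(t) = drift*t + sigma*B(t); failure occurs when X(t)
# first reaches the threshold. For this model E[T] = threshold / drift.
drift, sigma, dt = 0.5, 0.2, 0.01
threshold = 5.0

def first_passage_time():
    x, t = 0.0, 0.0
    while x < threshold:
        x += drift * dt + sigma * random.gauss(0.0, 1.0) * dt ** 0.5
        t += dt
    return t

times = [first_passage_time() for _ in range(2000)]
mean_ft = sum(times) / len(times)
print(round(mean_ft, 2))   # should be near threshold / drift = 10
```

For a Wiener process with positive drift, this first passage time follows an inverse Gaussian distribution, which is why closed-form failure time distributions are available for such degradation models.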
This dissertation is composed of three parts, corresponding to statistical analyses of each of the above three types of data. We also examine the relationships between the different types of data, with particular emphasis on the relationship between stochastic processes and failure time data.
While failure time data arise in many areas, we are concerned with such data arising from life testing in reliability engineering and from survival analysis in biostatistics. Owing to limitations of time, budget, and inspection policy, collecting exact failure time data can be extremely costly; a common compromise is to use censored data. The first part of this dissertation examines parametric inference for failure time data with interval censoring. An iterative single-point imputation algorithm, called the quantile-filling algorithm, is developed for the parametric analysis of interval-censored data. The algorithm generalizes the moment-matching approach for exact data to the censored case. Extensive numerical studies compare the proposed method with alternatives such as maximum likelihood estimation (MLE) and minimum χ² distance estimation.
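The quantile-filling algorithm itself is not reproduced in this abstract; the following is only a minimal sketch of the single-point-imputation idea for exponential interval-censored data. Each censoring interval is replaced by a conditional quantile (here the conditional median) under the current parameter estimate, the parameter is re-estimated from the imputed sample, and the two steps are iterated. The data, the distribution choice, and the stopping rule are all assumptions for illustration.

```python
import math

# Hypothetical interval-censored observations (a, b): each failure is known
# only to lie in [a, b].
intervals = [(0.5, 1.5), (1.0, 3.0), (0.2, 0.8), (2.0, 4.0),
             (0.1, 1.0), (1.5, 2.5), (0.4, 2.0), (3.0, 5.0)]

def conditional_median(a, b, rate):
    # Median of an Exp(rate) variable conditioned on [a, b]:
    # solve F(t) = F(a) + 0.5 * (F(b) - F(a)) for t.
    fa = 1 - math.exp(-rate * a)
    fb = 1 - math.exp(-rate * b)
    return -math.log(1 - (fa + 0.5 * (fb - fa))) / rate

rate = 1.0                      # initial guess
for _ in range(100):            # iterate imputation and re-estimation
    imputed = [conditional_median(a, b, rate) for a, b in intervals]
    new_rate = len(imputed) / sum(imputed)   # exponential MLE on imputed data
    if abs(new_rate - rate) < 1e-10:
        break
    rate = new_rate
print(round(rate, 3))
```

Each imputed value lies strictly inside its censoring interval by construction, which is what makes the subsequent complete-data estimation step well defined.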
Recurrent failure time data arise in repairable systems. The simplest model in this context is the homogeneous Poisson process (HPP); inference on an HPP from recurrent failure time data is easily shown to be equivalent to inference on an exponential distribution from failure time data. In the second part, we estimate the failure rates of several different Poisson processes simultaneously from failure count data, which can be regarded as a special case of right-censored data. The MLE for this problem is shown to be inadmissible. As an improved alternative, a James-Stein-type estimator for the simultaneous estimation of failure rates is proposed and shown to dominate the MLE, i.e., its risk is uniformly smaller. A case study and simulation results support the theoretical justification and reveal the superiority of the new estimator. As a more general model for recurrent failure time data, inference for nonhomogeneous Poisson processes (NHPP) is also addressed in this part: we propose a semiparametric spline framework for estimating the intensity function under a monotonicity constraint.
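The dissertation's estimator is not reproduced here, but the shrinkage idea can be illustrated with a classical James-Stein-type estimator for several Poisson means, the Clevenson-Zidek estimator, which shrinks the raw counts toward zero and dominates the MLE under normalized squared-error loss when two or more rates are estimated jointly. The counts below are hypothetical.

```python
# Clevenson-Zidek shrinkage for k Poisson means (illustrative, not the
# dissertation's estimator): delta_i = (1 - (k - 1) / (S + k - 1)) * x_i,
# where S is the total count. The MLE is simply x_i itself.
counts = [3, 5, 2, 7, 1]               # hypothetical failure counts, one per process
k, s = len(counts), sum(counts)        # k = 5 processes, S = 18 total failures
shrink = 1 - (k - 1) / (s + k - 1)     # common shrinkage factor, here 18/22
estimates = [shrink * x for x in counts]
print([round(e, 3) for e in estimates])  # [2.455, 4.091, 1.636, 5.727, 0.818]
```

Every raw count is pulled toward zero by the same factor; the more processes are pooled, the stronger the shrinkage, which is the mechanism behind the risk improvement over the MLE.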
Degradation data have become quite popular in reliability engineering, as more information on product failure can be collected over the whole lifetime. In this part, we investigate the modelling, estimation, and optimal design issues arising in accelerated destructive degradation tests (ADDT) with random initial values, from the viewpoint of stochastic processes. Degradation models based on the Wiener process are proposed for both the non-accelerated and accelerated cases. For the non-accelerated test, the closed-form MLE and optimal design procedures are presented. For the ADDT, since a closed-form MLE is not available, a numerical algorithm is used to maximize the likelihood function. Based on these results, we investigate the optimal design of ADDT by minimizing the asymptotic variance of a specific statistic (the p-th quantile of the failure time distribution).
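For the basic (non-accelerated, fixed-initial-value) Wiener model, the closed-form MLEs are easy to state: with a path observed on an equally spaced grid, the drift estimate is X(T)/T and the diffusion estimate is the normalized sum of squared de-trended increments. A sketch under assumed parameter values, not the dissertation's accelerated model:

```python
import random

random.seed(3)

# Simulate one path of X(t) = mu*t + sigma*B(t) on a grid with step dt,
# then recover (mu, sigma^2) from the increments via the closed-form MLEs.
mu, sigma, dt, n = 1.2, 0.4, 0.1, 20000

increments = [mu * dt + sigma * random.gauss(0.0, 1.0) * dt ** 0.5
              for _ in range(n)]
total_time = n * dt

mu_hat = sum(increments) / total_time   # MLE of drift: X(T) / T
sigma2_hat = sum((d - mu_hat * dt) ** 2 for d in increments) / total_time
print(round(mu_hat, 2), round(sigma2_hat, 3))
```

Under acceleration the drift (and possibly the diffusion) becomes a function of the stress level, which is what breaks the closed form and forces numerical maximization of the likelihood.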
The latter two types of data are closely related to stochastic processes: recurrent failure time data are realizations of a counting process, and degradation data are sample paths of a continuous process. Both can be transformed into failure time data under certain conditions. When the counting process has independent increments, the exact recurrent failure times can be decomposed into segments that can themselves be regarded as failure time data, and degradation data can be linked to failure times via the hitting time of the continuous stochastic process. Conversely, the failure of an individual can be regarded as a 0-1 counting process with only one jump, so the cumulative failure count of several products can be modelled by a general counting process obtained as the superposition of several 0-1 counting processes. From this viewpoint, we propose a general binomial model for reliability growth modelling. The new model fits certain data sets better than the commonly used NHPP models, and its inference is closely related to failure time data with truncation and censoring.