Reliability Analysis and Optimization of Software System Considering Fault Detection and Correction Processes

考慮錯誤檢測和移除過程的軟件系統可靠性分析及其優化

Student thesis: Doctoral Thesis

Author(s)

Detail(s)

Awarding Institution
Supervisors/Advisors
  • Min XIE (Supervisor)
  • Nozer Darabsha SINGPURWALLA (Supervisor)
Award date: 13 Sep 2017

Abstract

Computer systems play a critical role in many scientific and industrial applications, most of which are either safety-critical or performance-critical. As hardware reliability improves, software reliability is becoming increasingly important. Defects remaining in the software after release are the fundamental cause of field system failures, and the number of remaining faults changes over time during the software testing phase. Studying the dynamic behaviour of defects therefore provides valuable information for software development management.

To assess reliability and to deploy testing work that meets the safety requirements, one can follow these three steps.
• Fault Dynamic Modelling: Process raw data, including test record data, static code metrics data, and staffing data, and fit them with candidate models.
• Fault Dynamic Prediction: Predict the failure behaviour with possible models. Model selection should be conducted if several models are adopted.
• Test Scheduling: Develop and evaluate the optimized test scheduling strategies with the selected model.

Accordingly, to estimate reliability metrics and to derive an optimal strategy for a particular purpose, this thesis consists of three successive parts corresponding to the three steps above.

The absence of evidence (of a fault) is not evidence of absence (of a fault). One way to judge whether faults have been adequately identified is to collect as much direct or indirect evidence as one can. Processing and modelling the fault detection time, correction time, and test effort data are ways to obtain such evidence. However, combining all pieces of evidence is challenging.

In the first part, we developed a black-box software reliability modelling framework for fault detection and correction processes with a new type of dataset (i.e. semi-grouped data). Compared with existing modelling methods, our approach is much easier to implement and more stable in estimation and prediction. We applied the proposed model to predict the test process under different considerations, e.g., test-effort dependence and multi-release situations. The parameters of these models are estimated by the maximum likelihood method.
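As a rough illustration of what a paired fault detection and correction model can look like, the sketch below uses a common textbook form rather than the framework developed in the thesis: detection follows a Goel-Okumoto mean value function, and each detected fault is assumed to wait an exponentially distributed time before it is corrected. The function names and the parameter values (a, b, mu) are hypothetical.

```python
import math

def detected(t, a, b):
    """Expected cumulative faults detected by time t (Goel-Okumoto form).

    a: expected total number of faults, b: per-fault detection rate.
    """
    return a * (1.0 - math.exp(-b * t))

def corrected(t, a, b, mu):
    """Expected cumulative faults corrected by time t.

    Assumes each detected fault is corrected after an exponential(mu)
    delay, with mu != b (assumption made for this sketch only).
    """
    return a * (1.0 - (mu * math.exp(-b * t) - b * math.exp(-mu * t)) / (mu - b))

# Hypothetical parameter values, chosen only for illustration.
a, b, mu = 100.0, 0.05, 0.2
for t in (0, 10, 50, 100):
    print(t, round(detected(t, a, b), 2), round(corrected(t, a, b, mu), 2))
```

Under these assumptions the corrected curve always lags the detected curve, and the gap between them is the expected backlog of detected but uncorrected faults.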

The objective of Fault Dynamic Prediction is to foresee possible situations and to evaluate a particular management strategy. In the second part, we considered Bayesian inference with MCMC, which is widely used in reliability analysis and whose confidence intervals can be obtained directly from the posterior. Because MCMC is time-consuming, we derived a variational Bayes method to accelerate the computation of the posterior. In addition to the analytical methods, we developed a simulation approach to explore the impact of several factors on fault dynamics, including fault removal time delay, fault introduction rate, and testing and debugging staffing.

Many optimization models have been developed for software projects to improve the efficiency of testing as well as to maximize the benefits. In the third part, we deal with two critical scheduling issues. The first is release time determination. Different approaches are applied in different situations, including cost-criterion optimization, multi-attribute utility theory (MAUT), and a simulation approach. When assessing the cost, the posterior distributions are used to account for parameter uncertainty. The second is the dynamic allocation of testing and debugging effort. A simulation-based optimization approach is developed to deal with this problem.
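To make the cost-criterion idea concrete, here is a minimal sketch of a classical release time calculation. It is not the thesis's formulation: it assumes a Goel-Okumoto detection curve, point estimates for the parameters (no posterior uncertainty), and a hypothetical three-term cost made up of fixing faults during testing, fixing faults in the field, and testing time. All names and numbers are illustrative assumptions.

```python
import math

def expected_cost(T, a, b, c_test, c_field, c_time):
    """Expected total cost if the software is released at time T."""
    m = a * (1.0 - math.exp(-b * T))  # faults expected to be found by T
    return c_test * m + c_field * (a - m) + c_time * T

def optimal_release_time(a, b, c_test, c_field, c_time):
    """Release time minimizing expected_cost under the stated assumptions.

    Setting the derivative to zero gives T* = ln(a*b*(c_field - c_test)/c_time) / b.
    If that ratio is not above 1, further testing never pays off and T* = 0.
    """
    ratio = a * b * (c_field - c_test) / c_time
    return math.log(ratio) / b if ratio > 1.0 else 0.0

# Hypothetical values: 100 expected faults, detection rate 0.05 per unit time,
# cost 1 per fault fixed in testing, 5 per field fault, 2 per unit of test time.
T_star = optimal_release_time(100.0, 0.05, 1.0, 5.0, 2.0)
print(round(T_star, 2), round(expected_cost(T_star, 100.0, 0.05, 1.0, 5.0, 2.0), 2))
```

A Bayesian variant in the spirit of the paragraph above would average the expected cost over posterior samples of a and b before minimizing, instead of plugging in single values.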

    Research areas

  • Software reliability