Diagnostics, prognostics and health assessment of hard disk drives
Student thesis: Doctoral Thesis
The hard disk drive (HDD) is perhaps the most important storage device in modern society. It is estimated that 90% of the new information produced in the world is stored on magnetic media, mostly on HDDs. HDDs are widely used as key components in personal computers, data centers, and mobile phones. Designing and manufacturing highly reliable HDDs is therefore essential. Although an HDD is regarded as a very reliable device, its failure can still result in serious data loss and service downtime, which can cause substantial financial losses to users and communities. The first objective of this thesis is therefore to develop a prognostic model for predicting HDD failures, which can provide advance warnings to users. To help manufacturers design more reliable HDDs, we also carried out intensive investigations of new failure mechanisms in the latest generation of HDDs and proposed solutions for optimizing designs and manufacturing processes. On the basis of these failure mechanisms, we further developed new reliability evaluation methods that allow HDD manufacturers to reduce the cost and duration of the reliability evaluation process. The major work is outlined below.
Predicting impending HDD failures is essential for preventing data loss. Although manufacturers have developed self-monitoring, analysis and reporting technology (SMART) to monitor HDD health status, SMART as implemented in HDDs is less effective than it should be because it lacks an efficient prediction algorithm. To address this problem, we developed a two-step parametric method in Chapter 3. The method handles failure prediction in two steps: anomaly detection and failure prediction. In addition, we derived a new cost function to adjust the prediction rate. This is important for balancing the failure detection rate against the false alarm rate, as well as for providing advance warnings of HDD failures to users. Results on simulated and real data sets demonstrated the superiority of the new algorithm.
Besides lacking an efficient prediction algorithm, SMART relies on parameters that are themselves ineffective at identifying HDD failures. This is particularly true of the latest generation of HDDs, whose failure mechanisms remain largely unclear. In Chapters 4–6, we investigate new failure mechanisms related to the head-disk interface and the head stack assembly, the two most critical components in HDDs. In Chapter 4, the degradation mechanisms of the head-disk interface, including wear mechanism shift and the propagation of head-disk interface instability, were identified by analyzing degradation behaviors. On the basis of these degradation mechanisms, a stochastic model is applied in Chapter 5 to characterize head wear, which is the main reliability issue in latest-generation HDDs. To meet the limited time available for reliability evaluation, an accelerated degradation strategy was adopted, and an accelerated degradation model based on the stochastic wear model was developed. Using this model, the parameters and failure times under light loads could be predicted effectively. The results showed that our model is much more efficient than traditional regression methods at estimating failure times.
In Chapter 6, we explored failure mechanisms related to the head stack assembly. Low-frequency vibrations of the head stack assembly were found to induce vigorous head vibration or off-track motion. These behaviors can cause read/write failures, especially when the physical clearance between head and disk is small. To solve these problems, both experimental and finite element analyses were carried out. Based on this extensive analysis, solutions for optimizing the design and manufacturing process of the head stack assembly were proposed. The results show that these solutions are effective in improving product performance.
- Maintenance and repair, Hard disks (Computer science), Data disk drives