Exploring the Benefits of Resource Disaggregation for Service Reliability in Data Centers

Research output: Journal Publications and Reviews; RGC 21 - Publication in refereed journal; peer-review

4 Scopus Citations

Detail(s)

Original language: English
Pages (from-to): 1651-1666
Journal / Publication: IEEE Transactions on Cloud Computing
Volume: 11
Issue number: 2
Online published: 16 Feb 2022
Publication status: Published - Apr 2023

Abstract

By overcoming the “server box” barrier, resource disaggregation in data centers (DCs) can significantly improve resource utilization. This may then provide a more cost-efficient approach for resource upgrade and expansion. The advantages of resource disaggregation have been explored in earlier research to improve the efficiency of resource usage. This paper investigates the potential benefits of resource disaggregation from the aspect of reliability, which has not been considered before. Resource disaggregation gives rise to a new failure pattern. For example, in a conventional server, the failure of one type of resource leads to the failure of the entire server, so that other types of resources in the same server also become unavailable. After disaggregating, the failure of different types of resources becomes more isolated so that other resources are still available. In this paper, we model the reliability of a resource allocation request in a server-based or disaggregated DC based on whether the request is allocated with only working resources or is also provisioned with backup resources. We then consider a resource allocation problem to maximize the number of requests accepted with guaranteed reliability. This is formulated as an integer linear programming (ILP) problem, and a more straightforward heuristic approach is also proposed. Our numerical studies demonstrate that it may be possible to significantly improve service reliability with this resource disaggregation approach. © 2022 IEEE.
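
The failure-isolation argument in the abstract can be illustrated with a small numerical sketch. This is not the paper's reliability model or its ILP formulation. It is a textbook series/parallel calculation under assumptions made here for illustration only: component failures are independent, a request needs one unit of each resource type, at most one backup is provisioned, and network or interconnect failures are ignored. The function names and the reliability value 0.9 are hypothetical.

```python
from math import prod


def series(reliabilities):
    """All components must work: multiply their reliabilities."""
    return prod(reliabilities)


def with_backup(r, copies=2):
    """At least one of `copies` identical, independent units must work."""
    return 1.0 - (1.0 - r) ** copies


def server_based(reliabilities, backup=False):
    """Assumed model: one resource failure takes down the whole server,
    so a backup has to be a second complete server."""
    r_server = series(reliabilities)
    return with_backup(r_server) if backup else r_server


def disaggregated(reliabilities, backup=False):
    """Assumed model: resource types fail in isolation,
    so each type can be backed up on its own."""
    if backup:
        return prod(with_backup(r) for r in reliabilities)
    return series(reliabilities)


# Hypothetical per-resource reliabilities (e.g. CPU, memory, storage).
component_reliability = [0.9, 0.9, 0.9]

if __name__ == "__main__":
    for backup in (False, True):
        print(
            f"backup={backup}: "
            f"server-based={server_based(component_reliability, backup):.6f}, "
            f"disaggregated={disaggregated(component_reliability, backup):.6f}"
        )
```

Under these assumptions the two architectures give the same reliability when no backup is provisioned. With one backup, the disaggregated case comes out higher (about 0.970 versus 0.927 for the values above), because a backup replaces only the failed resource type rather than a whole server.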

Research Area(s)

  • Computer architecture, Data center, Data centers, Graphics processing units, Hardware, ILP, reliability, resource disaggregation, Resource management, Servers
