Bottleneck-Aware Non-Clairvoyant Coflow Scheduling with Fai
Research output: Journal Publications and Reviews (RGC: 21, 22, 62) › 21_Publication in refereed journal › peer-review
Detail(s)
Original language | English |
---|---|
Number of pages | 14 |
Journal / Publication | IEEE Transactions on Cloud Computing |
Online published | 16 Nov 2021 |
Publication status | Online published - 16 Nov 2021 |
Abstract
Coflow scheduling is critical to data-parallel applications in data centers. While schemes like Varys can achieve optimal performance, they require a priori information about coflows, which is hard to obtain in practice. Existing non-clairvoyant solutions like Aalo generalize the least attained service (LAS) scheduling discipline to address this issue. However, they fail to identify the bottleneck flows in a coflow and tend to allocate excessive bandwidth to non-bottleneck flows, leading to bandwidth wastage and inferior overall performance. To this end, we present Fai, which strives to improve overall coflow performance by accelerating the bottleneck flows without prior knowledge. Fai employs bottleneck-aware scheduling: it adopts loose coordination to update coflow priorities and flow rates based on total bytes sent. In addition, Fai detects bottleneck flows based on each flow's rate and bytes sent, and de-allocates bandwidth from the other flows so that they match the bottleneck rate without affecting the coflow completion time (CCT). The saved bandwidth is then distributed among coflows according to their priority to improve overall performance. Testbed evaluation on a 40-node cluster shows that Fai improves the average (P95) CCT by 1.73× (3.43×) compared to Aalo. Large-scale trace-driven simulations also show that Fai outperforms Aalo substantially.
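The rate-matching idea in the abstract can be illustrated with a minimal Python sketch. It is not Fai's actual algorithm, which the abstract does not specify. The `Flow` type, the `match_bottleneck` function, and the detection rule (treat the flow with the lowest current rate as the bottleneck, breaking ties by fewest bytes sent) are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    flow_id: str
    rate: float       # currently allocated rate, in any consistent unit
    bytes_sent: int   # bytes sent so far (the only size information available)


def match_bottleneck(flows):
    """Cap every flow of one coflow at the presumed bottleneck rate.

    Assumption (not from the paper): the bottleneck is the flow with the
    lowest current rate, ties broken by fewest bytes sent.
    Returns (new_rates, saved_bandwidth).
    """
    if not flows:
        return {}, 0.0
    bottleneck = min(flows, key=lambda f: (f.rate, f.bytes_sent))
    # Sending faster than the bottleneck cannot finish the coflow sooner,
    # so every flow is throttled down to the bottleneck rate.
    new_rates = {f.flow_id: bottleneck.rate for f in flows}
    # Bandwidth freed by throttling, available to other coflows.
    saved = sum(f.rate - bottleneck.rate for f in flows)
    return new_rates, saved


if __name__ == "__main__":
    coflow = [Flow("a", 10.0, 100), Flow("b", 4.0, 50), Flow("c", 6.0, 70)]
    print(match_bottleneck(coflow))
```

In this toy example, flow `b` is taken as the bottleneck, so flows `a` and `c` are throttled to its rate and the difference is reported as saved bandwidth. How that bandwidth is shared among coflows by priority is left out, since the abstract gives no rule for it.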
Research Area(s)
- Bandwidth, Bottleneck-aware, Cloud computing, Coflow completion time, Coflow scheduling, Data centers, Datacenter networks, Fabrics, Job shop scheduling, Processor scheduling, Uplink
Citation Format(s)
Bottleneck-Aware Non-Clairvoyant Coflow Scheduling with Fai. / Liu, Libin; Gao, Chengxi; Wang, Peng; Huang, Hongming; Li, Jiamin; Xu, Hong; Zhang, Wei.
In: IEEE Transactions on Cloud Computing, 16.11.2021.