TY - GEN
T1 - Adaptive Erasure Coded Data Maintenance for Consensus in Distributed Networks
AU - Jia, Yulei
AU - Xu, Guangping
AU - Sung, Chi Wan
AU - Mostafa, Salwa
PY - 2021
Y1 - 2021
N2 - Distributed data services usually rely on consensus protocols, such as Paxos and Raft, to provide fault-tolerance and data consistency across distributed data centers and even edge networks. In consensus protocols, erasure coded replication has appealing storage and network cost savings compared with full copy replication, which help achieve low latency, high fault-tolerance and high throughput. However, the liveness level will inevitably decrease when erasure codes are naively applied in consensus protocols. To keep the original liveness level, an existing protocol, called CRaft, switches from erasure coded replication to full copy replication when the number of failures exceeds a certain threshold. Such a solution, however, degrades system performance sharply. To tackle this problem, this work proposes a novel protocol called HRaft to enable graceful degradation on storage and network efficiency when failures happen. Without using full copy replication, it replenishes some coded blocks in healthy servers to reduce storage and network costs and to keep data consistency. The performance of the proposed protocol will be evaluated by deploying it into practical networks.
AB - Distributed data services usually rely on consensus protocols, such as Paxos and Raft, to provide fault-tolerance and data consistency across distributed data centers and even edge networks. In consensus protocols, erasure coded replication has appealing storage and network cost savings compared with full copy replication, which help achieve low latency, high fault-tolerance and high throughput. However, the liveness level will inevitably decrease when erasure codes are naively applied in consensus protocols. To keep the original liveness level, an existing protocol, called CRaft, switches from erasure coded replication to full copy replication when the number of failures exceeds a certain threshold. Such a solution, however, degrades system performance sharply. To tackle this problem, this work proposes a novel protocol called HRaft to enable graceful degradation on storage and network efficiency when failures happen. Without using full copy replication, it replenishes some coded blocks in healthy servers to reduce storage and network costs and to keep data consistency. The performance of the proposed protocol will be evaluated by deploying it into practical networks.
KW - consensus protocol
KW - erasure codes
KW - fault tolerance
KW - network storage
KW - Paxos
KW - Raft
UR - https://www.scopus.com/pages/publications/85123013348
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85123013348&origin=recordpage
U2 - 10.1109/SRDS53918.2021.00042
DO - 10.1109/SRDS53918.2021.00042
M3 - RGC 32 - Refereed conference paper (with host publication)
SN - 978-1-6654-3820-9
T3 - Proceedings of the IEEE Symposium on Reliable Distributed Systems
SP - 345
EP - 346
BT - Proceedings - 2021 40th International Symposium on Reliable Distributed Systems, SRDS 2021
PB - IEEE
T2 - 40th International Symposium on Reliable Distributed Systems, SRDS 2021
Y2 - 20 September 2021 through 23 September 2021
ER -