TY - GEN
T1 - Data Allocation for Approximate Gradient Coding in Edge Networks
AU - Li, Haojun
AU - Chen, Yi
AU - Shum, Kenneth W.
AU - Sung, Chi Wan
PY - 2023
Y1 - 2023
N2 - To leverage the computing power in an edge network, one can divide a machine learning task into several subtasks and assign them to several computing devices. Under a master-worker architecture, the master divides the data and distributes it to several workers. In each iteration, the master asks the workers to compute some function of the data stored locally at each worker. For example, in gradient-based learning, this function can be the partial gradient function. Since the workers have different computing resources, the speed of distributed learning is hindered by workers with long latency, called stragglers. Gradient coding solves the straggler problem by allowing the master to recover the desired feedback information in the presence of s stragglers. If the total number of workers is n, the master can just wait for the n-s fastest workers. In this paper we consider the problem of data allocation so that the gradient vector can be approximately obtained by the master node with small error. A block repetition scheme is proved to be the optimal data allocation scheme if we want to minimize the average recovery error. © 2023 IEEE.
AB - To leverage the computing power in an edge network, one can divide a machine learning task into several subtasks and assign them to several computing devices. Under a master-worker architecture, the master divides the data and distributes it to several workers. In each iteration, the master asks the workers to compute some function of the data stored locally at each worker. For example, in gradient-based learning, this function can be the partial gradient function. Since the workers have different computing resources, the speed of distributed learning is hindered by workers with long latency, called stragglers. Gradient coding solves the straggler problem by allowing the master to recover the desired feedback information in the presence of s stragglers. If the total number of workers is n, the master can just wait for the n-s fastest workers. In this paper we consider the problem of data allocation so that the gradient vector can be approximately obtained by the master node with small error. A block repetition scheme is proved to be the optimal data allocation scheme if we want to minimize the average recovery error. © 2023 IEEE.
UR - https://www.scopus.com/pages/publications/85171430793
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85171430793&origin=recordpage
U2 - 10.1109/ISIT54713.2023.10206830
DO - 10.1109/ISIT54713.2023.10206830
M3 - RGC 32 - Refereed conference paper (with host publication)
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 2541
EP - 2546
BT - 2023 IEEE International Symposium on Information Theory (ISIT)
PB - IEEE
T2 - 2023 IEEE International Symposium on Information Theory, ISIT 2023
Y2 - 25 June 2023 through 30 June 2023
ER -