TY - GEN
T1 - Data Encoding for Byzantine-Resilient Distributed Gradient Descent
AU - Data, Deepesh
AU - Song, Linqi
AU - Diggavi, Suhas
N1 - Full text of this publication does not contain sufficient affiliation information. With consent from the author(s) concerned, the Research Unit(s) information for this record is based on the existing academic department affiliation of the author(s).
PY - 2018/10
Y1 - 2018/10
N2 - We consider distributed gradient computation, where both data and computation are distributed among m worker machines, t of which can be Byzantine adversaries, and a designated (master) node computes the model/parameter vector, iteratively using gradient descent (GD). The Byzantine adversaries can (collaboratively) deviate arbitrarily from their gradient computation. To solve this, we propose a method based on data encoding and (real) error correction to combat the adversarial behavior. We can tolerate up to t ≤ [m-1/2] corrupt worker nodes, which is information-theoretically optimal. Our method does not assume any probability distribution on the data. We develop a sparse encoding scheme which enables computationally efficient data encoding. We demonstrate a trade-off between the number of adversaries tolerated and the resource requirement (storage and computational complexity). As an example, our scheme incurs a constant overhead (storage and computational complexity) over that required by the distributed GD algorithm, without adversaries, for t ≤ m/3 .
AB - We consider distributed gradient computation, where both data and computation are distributed among m worker machines, t of which can be Byzantine adversaries, and a designated (master) node computes the model/parameter vector, iteratively using gradient descent (GD). The Byzantine adversaries can (collaboratively) deviate arbitrarily from their gradient computation. To solve this, we propose a method based on data encoding and (real) error correction to combat the adversarial behavior. We can tolerate up to t ≤ [m-1/2] corrupt worker nodes, which is information-theoretically optimal. Our method does not assume any probability distribution on the data. We develop a sparse encoding scheme which enables computationally efficient data encoding. We demonstrate a trade-off between the number of adversaries tolerated and the resource requirement (storage and computational complexity). As an example, our scheme incurs a constant overhead (storage and computational complexity) over that required by the distributed GD algorithm, without adversaries, for t ≤ m/3 .
UR - https://www.scopus.com/pages/publications/85062861362
UR - https://www.scopus.com/record/pubmetrics.uri?eid=2-s2.0-85062861362&origin=recordpage
U2 - 10.1109/ALLERTON.2018.8636017
DO - 10.1109/ALLERTON.2018.8636017
M3 - RGC 32 - Refereed conference paper (with host publication)
T3 - Annual Allerton Conference on Communication, Control, and Computing
SP - 863
EP - 870
BT - 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
PB - IEEE
T2 - 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton 2018)
Y2 - 2 October 2018 through 5 October 2018
ER -