Tong Zhang: "On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data"
Title: On the Unreasonable Effectiveness of Federated Averaging with Heterogeneous Data
Abstract:
Existing theory predicts that data heterogeneity will degrade the performance of the Federated Averaging (FedAvg) algorithm in federated learning. In practice, however, the simple FedAvg algorithm converges very well. This paper explains the seemingly unreasonable effectiveness of FedAvg, which contradicts previous theoretical predictions. We find that the key assumption of bounded gradient dissimilarity in previous theoretical analyses is too pessimistic to characterize data heterogeneity in practical applications. For a simple quadratic problem, we demonstrate that there exist regimes where large gradient dissimilarity does not have any negative impact on the convergence of FedAvg. Motivated by this observation, we propose a new quantity, the average drift at optimum, to measure the effects of data heterogeneity, and explicitly use it to present a new theoretical analysis of FedAvg. We show that the average drift at optimum is nearly zero across many real-world federated training tasks, whereas the gradient dissimilarity can be large. Our new analysis suggests that FedAvg can have identical convergence rates in homogeneous and heterogeneous data settings, and hence leads to a better understanding of its empirical success.
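The quadratic observation above can be illustrated with a minimal sketch. This is not the paper's construction: it assumes scalar client objectives f_i(w) = 0.5 * a * (w - c_i)^2 that share the same curvature a but have different minimizers c_i, and it uses an informal reading of "average drift at optimum" (the average displacement of the clients after local steps started from the global optimum). All names and constants (local_gd, fedavg, centers, a, lr) are hypothetical choices for illustration.

```python
# Toy FedAvg on scalar quadratics f_i(w) = 0.5 * a * (w - c_i)**2.
# Assumption: every client shares the curvature `a`; only the minimizers c_i differ.

def local_gd(w, c, a, lr, steps):
    """Run `steps` gradient-descent steps on one client's quadratic."""
    for _ in range(steps):
        w -= lr * a * (w - c)  # gradient of 0.5 * a * (w - c)^2 is a * (w - c)
    return w

def fedavg(w0, centers, a, lr, local_steps, rounds):
    """Each round: every client runs local GD from the shared model, then the server averages."""
    w = w0
    for _ in range(rounds):
        w = sum(local_gd(w, c, a, lr, local_steps) for c in centers) / len(centers)
    return w

centers = [-10.0, 2.0, 11.0]           # heterogeneous client minimizers (illustrative values)
a, lr, local_steps = 1.0, 0.1, 5
w_star = sum(centers) / len(centers)   # minimizer of the average objective

w_final = fedavg(20.0, centers, a, lr, local_steps, rounds=50)

# Mean squared client gradient at the global optimum: large when the c_i are spread out.
grad_dissimilarity = sum((a * (w_star - c)) ** 2 for c in centers) / len(centers)

# Informal "average drift at optimum": start every client at w_star, run local steps,
# and measure how far the averaged model moves away from w_star.
avg_drift = sum(local_gd(w_star, c, a, lr, local_steps) for c in centers) / len(centers) - w_star

print(w_final, w_star, grad_dissimilarity, avg_drift)
```

In this toy setting the client gradients at the optimum are far from zero, yet the client displacements cancel when averaged, so FedAvg converges to the global minimizer despite the heterogeneity. With client-specific curvatures the cancellation would no longer be exact.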
Bio:
Tong Zhang is a professor of Computer Science and Mathematics at The Hong Kong University of Science and Technology. Previously, he was a professor at Rutgers University, and worked at IBM, Yahoo, Baidu, and Tencent. His research interests include machine learning algorithms and theory, statistical methods for big data, and their applications. He is a fellow of ASA, IEEE, and IMS, and he has served on the editorial boards of leading machine learning journals and the program committees of top machine learning conferences. Tong Zhang received a B.A. in mathematics and computer science from Cornell University and a Ph.D. in Computer Science from Stanford University.