

"Personalized Federated Learning with Communication Compression"

In contrast to training traditional machine learning (ML) models in data centers, federated learning (FL) trains ML models over local datasets contained on resource-constrained heterogeneous edge devices. Existing FL algorithms aim to learn a single global model for all participating devices, which, due to the heterogeneity of the data across the devices, may not be helpful to all of them. Recently, Hanzely and Richtárik (2020) proposed a new formulation for training personalized FL models aimed at balancing the trade-off between the traditional global model and the local models that could be trained by individual devices using their private data only. They derived a new algorithm, called Loopless Gradient Descent (L2GD), to solve it and showed that this algorithm leads to improved communication complexity guarantees in regimes where more personalization is required.
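For context, the Hanzely and Richtárik (2020) formulation referenced above trains one model $x_i$ per device and penalizes its distance from the average model $\bar{x} = \frac{1}{n}\sum_i x_i$: it minimizes $\frac{1}{n}\sum_i f_i(x_i) + \frac{\lambda}{2n}\sum_i \|x_i - \bar{x}\|^2$, where $\lambda \geq 0$ controls the amount of personalization ($\lambda = 0$ gives purely local models, while large $\lambda$ forces all local models toward a single global one). Below is a minimal sketch of plain L2GD for this objective, without the communication compression studied in the new paper; the function names and signature are illustrative, not taken from the paper.

import numpy as np

def l2gd(grad_fns, x0, lam, p, eta, num_steps, rng=None):
    """Minimal sketch of L2GD (Hanzely and Richtarik, 2020) for the objective
        min_{x_1,...,x_n}  (1/n) sum_i f_i(x_i) + (lam/(2n)) sum_i ||x_i - xbar||^2,
    where xbar is the average of the local models x_i.

    grad_fns : list of callables, grad_fns[i](x_i) returns grad f_i(x_i)
    x0       : (n, d) array of initial local models
    p        : probability of an averaging (communication) step
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    n = x.shape[0]
    for _ in range(num_steps):
        if rng.random() < p:
            # communication step: pull every local model toward the average,
            # using the unbiased estimator (lam / p) * grad psi(x)
            xbar = x.mean(axis=0)
            x -= eta * (lam / (p * n)) * (x - xbar)
        else:
            # local step: each device takes a gradient step on its own loss,
            # using the unbiased estimator (1 / (1 - p)) * grad f(x)
            g = np.stack([grad_fns[i](x[i]) for i in range(n)])
            x -= eta * g / ((1.0 - p) * n)
    return x

The coin flip is what makes the method "loopless": an averaging (communication) step happens only with probability p, so choosing p small trades more local work for fewer communications, which is the regime relevant to the improved communication complexity mentioned above.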

We've had several papers accepted to the 36th Annual Conference on Neural Information Processing Systems (NeurIPS 2022), which will run during November 28 - December 9 in New Orleans, USA:

1) "Accelerated Primal-Dual Gradient Method for Smooth and Convex-Concave Saddle-Point Problems with Bilinear Coupling"
2) "The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization"
3) "A Damped Newton Method Achieves Global $O(1/k^2)$ and Local Quadratic Convergence Rate" - Slavomir Hanzely (*), Dmitry Kamzolov, Dmitry Pasechnyuk, Alexander Gasnikov, and Martin Takáč
4) "Variance Reduced ProxSkip: Algorithm, Theory and Application to Federated Learning"
5) "Theoretically Better and Numerically Faster Distributed Optimization with Smoothness-Aware Quantization Techniques"
6) "Communication Acceleration of Local Gradient Methods via an Accelerated Primal-Dual Algorithm with an Inexact Prox"
7) "Distributed Methods with Compressed Communication for Solving Variational Inequalities, with Theoretical Guarantees"
8) "BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression"
9) "The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization"
10) "Optimal Gradient Sliding and its Application to Optimal Distributed Optimization Under Similarity"
11) "Optimal Algorithms for Decentralized Stochastic Variational Inequalities"
12) "EF-BV: A Unified Theory of Error Feedback and Variance Reduction Mechanisms for Biased and Unbiased Compression in Distributed Optimization"

(*) Members of my Optimization and Machine Learning Lab at KAUST.

"Minibatch Stochastic Three Points Method for Unconstrained Smooth Minimization"

In this paper, we propose a new zero-order optimization method, called minibatch stochastic three points (MiSTP), to solve an unconstrained minimization problem in a setting where only an approximation of the objective function evaluation is possible. It is based on the recently proposed stochastic three points (STP) method (Bergou et al., 2020). At each iteration, MiSTP generates a random search direction in a similar manner to STP, but chooses the next iterate based solely on the approximation of the objective function rather than its exact evaluations. We also analyze our method's complexity in the nonconvex and convex cases and evaluate its performance on multiple machine learning tasks.
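To make the mechanism concrete, here is a minimal sketch of the three-point step with minibatch objective estimates, under the assumption that a single minibatch estimate is reused to compare all three candidate points at each iteration; the name batch_loss, its signature, and the sphere-sampled search directions are illustrative choices, not details taken from the paper.

import numpy as np

def mistp(batch_loss, data, x0, alpha, batch_size, num_steps, rng=None):
    """Minimal sketch of a minibatch stochastic three points step.

    batch_loss(x, batch) is assumed to return the average loss of the
    model x on the given minibatch (an approximation of the objective).
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    d = x.size
    for _ in range(num_steps):
        # random search direction (here: uniform on the unit sphere)
        s = rng.standard_normal(d)
        s /= np.linalg.norm(s)
        # one minibatch approximation of the objective for this iteration
        idx = rng.choice(len(data), size=batch_size, replace=False)
        batch = [data[i] for i in idx]
        # compare the current point and the two trial points x +/- alpha * s
        candidates = [x, x + alpha * s, x - alpha * s]
        values = [batch_loss(c, batch) for c in candidates]
        # keep whichever of the three points has the smallest approximate value
        x = candidates[int(np.argmin(values))]
    return x

Note that the exact objective is never evaluated: the winner among the three candidate points is decided purely by the minibatch approximation, which is the difference from STP highlighted in the abstract.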
