Federated Learning (FL) is a machine learning paradigm in which multiple data owners collaboratively train a model under the coordination of a central server, while keeping all data decentralized. This paradigm allows models to be trained effectively without leaking private data. However, federated learning is vulnerable to various kinds of failures, which may result from intentional (malicious) attacks or from unintentional (non-malicious) faults.
To address these problems, a research team led by Tianye ZHANG published their new research on 15 July 2025 in Frontiers of Computer Science, co-published by Higher Education Press and Springer Nature.
The team proposed FedCare, a real-time visual diagnosis approach for handling failures in federated learning systems. The functionality of FedCare includes the identification of failures, the assessment of their nature (malicious or non-malicious), the study of their impact, and the recommendation of adequate defense strategies.
FedCare provides a tailored visual diagnosis pipeline whose core functionality is delivered by three major modules: 1) a failure diagnosis module that leverages an ensemble method to identify failures and assess whether they are malicious in nature; 2) an impact diagnosis module that investigates the impact of failures on the global model and identifies group activities of highly relevant clients; 3) a model refinement module that recommends defense strategies to the analyst for the identified failures and helps to improve the performance of the FL system. In particular, the first two modules provide in-depth insights into different failures and thus offer evidence and guidelines for the third module. The team integrated these modules into a visual interface for the supervisory control of federated learning systems, an approach they describe as more reliable, safe, and trustworthy than fully automatic methods.
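To make the idea of ensemble-based failure identification more concrete, the following is a minimal illustrative sketch in Python. It is not FedCare's implementation: the three detectors (distance to the coordinate-wise median update, cosine distance to that median, and update norm), the median/MAD outlier rule, the 3.5 threshold, the majority vote, and all function and client names are assumptions chosen for illustration only.

```python
import math
from statistics import median


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def cosine_distance(a, b):
    """1 - cosine similarity; a zero vector is treated as maximally uninformative."""
    na, nb = l2_norm(a), l2_norm(b)
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / (na * nb)


def upper_outliers(scores, threshold=3.5):
    """Flag scores whose modified z-score (median/MAD based) exceeds the threshold.

    Only the upper tail is flagged, since every detector below is defined so
    that a higher score means "more anomalous". The threshold is an assumption.
    """
    med = median(scores)
    mad = median(abs(s - med) for s in scores)
    if mad == 0.0:
        # No spread among the scores: conservatively flag nothing.
        return [False] * len(scores)
    return [0.6745 * (s - med) / mad > threshold for s in scores]


def flag_suspicious_clients(updates, min_votes=2):
    """Return the clients flagged by at least `min_votes` of three detectors.

    `updates` maps a (hypothetical) client name to its model update vector.
    """
    names = list(updates)
    vectors = [updates[n] for n in names]
    # Robust reference update: the coordinate-wise median over all clients.
    reference = [median(col) for col in zip(*vectors)]
    detector_scores = [
        # Detector 1: Euclidean distance to the reference update.
        [l2_norm([x - r for x, r in zip(v, reference)]) for v in vectors],
        # Detector 2: directional disagreement with the reference update.
        [cosine_distance(v, reference) for v in vectors],
        # Detector 3: magnitude of the update itself.
        [l2_norm(v) for v in vectors],
    ]
    votes = [0] * len(names)
    for scores in detector_scores:
        for i, is_outlier in enumerate(upper_outliers(scores)):
            votes[i] += is_outlier
    return [n for n, v in zip(names, votes) if v >= min_votes]


# Toy round: four similar updates and one sign-flipped, scaled-up update.
updates = {
    "client_0": [0.10, -0.20, 0.05],
    "client_1": [0.12, -0.18, 0.04],
    "client_2": [0.09, -0.22, 0.06],
    "client_3": [0.11, -0.19, 0.05],
    "client_4": [-1.00, 2.00, -0.50],
}
suspects = flag_suspicious_clients(updates)
print(suspects)
```

In this toy round only client_4 is flagged, because all three detectors agree on it. A sketch like this can only say that an update looks anomalous; deciding whether the cause is malicious or non-malicious is the kind of judgment that FedCare, as described above, supports through its diagnosis modules and the analyst's visual inspection.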
The team reported the performance of the federated learning system with and without the diagnosis of FedCare in two case studies. In both cases, the federated learning system converged faster and its accuracy improved significantly when FedCare was used.
In the future, they plan to apply FedCare to other federated learning tasks, e.g., recommendation and prediction. More complicated models are expected to be incorporated. Additionally, they plan to use more anomaly (contribution) assessment metrics and defense strategies, so that FedCare can be practically applied to diagnose more kinds of malicious attacks and non-malicious failures.
DOI: 10.1007/s11704-024-3735-7