Communication

Privacy-Preserving Federated Learning: Balancing Accuracy and Data Protection in Distributed Machine Learning

Malik A. M. Al Sweity, Zlata Kim, Daniil Marshev

The Bonch-Bruevich Saint Petersburg State University of Telecommunications,
St. Petersburg, 193232, Russian Federation

DOI: 10.31854/2307-1303-2025-13-2-52-68

EDN: YHQXCK



Abstract: Problem statement. With the growing volume of sensitive data and increasingly strict requirements for its protection, traditional centralized machine learning methods are becoming unacceptable because of the risks of leaks and breaches of confidentiality. The problem is particularly acute in areas such as healthcare and finance, where transferring personal data to a central server is not permissible. One promising solution is federated learning, which allows global models to be trained without transferring the source data, but maintaining a balance between model accuracy and privacy remains a key challenge. Methods. To address the problem, an approach is proposed that combines the FedAvg aggregation algorithm with differential privacy mechanisms, namely gradient clipping and the addition of Gaussian noise on the client side. Experimental validation was performed on the MNIST dataset using a convolutional neural network with various DP parameters. Results. With optimal settings (σ = 0.5, ε ≈ 3), an accuracy of 97.80 % was achieved, only about 1 percentage point below centralized training (98.79 %). Secure aggregation with 10 clients over 5 rounds showed an accuracy of 93.21 %. The analysis revealed a clear dependence of accuracy on the privacy parameters, which allows the system to be flexibly tuned to specific requirements. Practical significance. The proposed methodology provides a transparent and reproducible assessment of the “accuracy-privacy” trade-off, which makes it applicable in real systems that handle sensitive data. The results can serve as a basis for adapting privacy-preserving federated learning in medical, financial, and other mission-critical applications where confidentiality is a priority.
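
For illustration, a minimal sketch of the client-side mechanism described in the abstract (update clipping followed by Gaussian noise) and of FedAvg aggregation is given below. It is not the authors' implementation: the clipping norm, the weighting of clients by local dataset size, and all function names are illustrative assumptions; only σ = 0.5 is taken from the reported settings.

```python
# Hypothetical sketch of client-side differential privacy + FedAvg aggregation.
# Assumptions: clip_norm = 1.0 and client_sizes are illustrative; sigma = 0.5 follows the abstract.
import numpy as np

def dp_client_update(local_update, clip_norm=1.0, sigma=0.5, rng=None):
    """Clip a client's model update to clip_norm, then add Gaussian noise (on the client side)."""
    if rng is None:
        rng = np.random.default_rng()
    norm = np.linalg.norm(local_update)
    clipped = local_update * min(1.0, clip_norm / (norm + 1e-12))   # update clipping
    noise = rng.normal(0.0, sigma * clip_norm, size=clipped.shape)  # Gaussian mechanism
    return clipped + noise

def fedavg(client_updates, client_sizes):
    """FedAvg: average client updates weighted by local dataset size."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * u for w, u in zip(weights, client_updates))

# Toy round with 10 simulated clients holding a 100-dimensional model.
rng = np.random.default_rng(0)
updates = [dp_client_update(rng.normal(size=100), rng=rng) for _ in range(10)]
global_update = fedavg(updates, client_sizes=[600] * 10)
print(global_update.shape)
```

In this sketch the server only ever sees clipped and noised updates, which is what allows the accuracy-privacy trade-off to be tuned by varying σ and the clipping norm.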

Keywords: federated learning, differential privacy, machine learning, data protection, accuracy-privacy trade-off, secure aggregation.

Reference for citation

Al Sweity M. A. M., Kim Z., Marshev D. Privacy-Preserving Federated Learning: Balancing Accuracy and Data Protection in Distributed Machine Learning // Telecom IT. 2025. Vol. 13. Iss. 2. PP. 52‒68 (in Russian). DOI: 10.31854/2307-1303-2025-13-2-52-68. EDN: YHQXCK

This article is distributed under a Creative Commons Attribution 4.0 License.

The article metadata are distributed under a CC0 1.0 Universal license.

 