Malik A. M. Al Sweity, Zlata Kim, Daniil Marshev
The Bonch-Bruevich Saint Petersburg State University of Telecommunications, St. Petersburg, 193232, Russian Federation
DOI 10.31854/2307-1303-2025-13-2-52-68
EDN YHQXCK
Abstract: Problem statement. With the growing volume of sensitive data and increasingly strict requirements for its protection, traditional centralized machine learning is becoming unacceptable because of the risk of leaks and breaches of confidentiality. The problem is particularly acute in areas such as healthcare and finance, where transferring personal data to a central server is not permitted. Federated learning is a promising solution that allows a global model to be trained without transferring the source data, but maintaining a balance between model accuracy and privacy remains a key challenge. Methods. The proposed approach combines the FedAvg aggregation algorithm with differential privacy mechanisms, namely gradient clipping and the addition of Gaussian noise on the client side. Experimental validation was performed on the MNIST dataset using a convolutional neural network with various DP parameters. Results. With optimal settings (σ = 0.5, ε ≈ 3), an accuracy of 97.80% was achieved, only about one percentage point below centralized training (98.79%). Secure aggregation with 10 clients over 5 rounds yielded an accuracy of 93.21%. The analysis revealed a clear dependence of accuracy on the privacy parameters, which allows the system to be flexibly tuned to specific requirements. Practical significance. The proposed methodology provides a transparent and reproducible assessment of the accuracy-privacy trade-off, making it applicable to real systems that handle sensitive data. The results can serve as a basis for adapting privacy-preserving federated learning to medical, financial, and other mission-critical applications where confidentiality is a priority.
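A minimal sketch (Python with NumPy) of the mechanism described above: each client clips its model update and adds Gaussian noise before sending it, and the server averages the noisy updates FedAvg-style. This is an illustration of how such a pipeline can look, not the authors' implementation; the names clip_norm, sigma, num_clients and the toy data are hypothetical, and only sigma = 0.5 echoes the setting reported in the abstract.

import numpy as np

rng = np.random.default_rng(0)

def dp_client_update(local_update, clip_norm=1.0, sigma=0.5):
    # Clip the client's update to an L2 norm of clip_norm, then add
    # Gaussian noise scaled by sigma * clip_norm (client-side DP step).
    norm = np.linalg.norm(local_update)
    clipped = local_update * min(1.0, clip_norm / (norm + 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=clipped.shape)
    return clipped + noise

def fedavg(updates):
    # FedAvg aggregation: unweighted mean of the client updates
    # (equal-sized client datasets are assumed in this toy example).
    return np.mean(updates, axis=0)

# Toy round: 10 clients perturb a shared 100-dimensional update.
num_clients, dim = 10, 100
true_update = 0.05 * rng.normal(size=dim)
client_updates = [dp_client_update(true_update + 0.01 * rng.normal(size=dim))
                  for _ in range(num_clients)]
global_update = fedavg(client_updates)
print("L2 error after noisy aggregation:", np.linalg.norm(global_update - true_update))

Averaging across clients attenuates the injected noise (its standard deviation shrinks by roughly 1/√num_clients), which helps explain why the accuracy loss reported in the abstract stays small at moderate σ.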
Keywords: federated learning, differential privacy, machine learning, data protection, accuracy-privacy trade-off, secure aggregation.
Reference for citation
Al Sweity M. A. M., Kim Z., Marshev D. Privacy-Preserving Federated Learning: Balancing Accuracy and Data Protection in Distributed Machine Learning // Telecom IT. 2025. Vol. 13. Iss. 2. PP. 52‒68 (in Russian). DOI: 10.31854/2307-1303-2025-13-2-52-68. EDN: YHQXCK