A Systematic Comparison of Horizontal Federated Learning Algorithms Based on Random Forests in a Medical Setting
The medical industry generates vast amounts of data suitable for machine learning during patient-clinician interactions in hospitals. However, because of data protection regulations such as the General Data Protection Regulation (GDPR), patient data cannot be shared freely across institutions. In these cases, federated learning (FL) is a viable option: a global model learns from multiple data sites without the data leaving those sites. In this paper, we focus on random forests (RFs), chosen for their effectiveness in classification tasks and their widespread use throughout the medical industry, and compare two popular federated random forest aggregation algorithms on horizontally partitioned data.