MPC and Federated Learning: complementary tools for secure collaboration
Customers often ask how Multi-Party Computation compares to Federated Learning. In this post we explain the differences and synergies.
Federated Learning (FL) offers a way to train AI/ML models while keeping the input data decentralized. By sharing local training parameters rather than the actual data, FL aims to protect participants’ privacy. However, the privacy properties of FL have limitations, as a 2021 survey published in ‘Future Generation Computer Systems’ explains:
“Federate Learning still has some privacy threats because the adversaries can partially reveal each participants’ training data in the original training dataset based on their uploaded parameter.”
In short: machine learning parameters are exchanged between the parties participating in the training. It is possible to infer information from these parameters, which makes the residual risk hard to evaluate. This could cause delays in setting up the collaboration.
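To make the inference risk concrete, here is a minimal sketch in Python. It assumes a deliberately simple setting that is not taken from the survey: a linear model with a squared-error loss, and a participant who uploads the gradient computed on a single record. All names (`local_gradient`, `reconstruct_input`, the example values) are hypothetical. In this toy case the uploaded gradient reveals the private record exactly; attacks on real models are more involved and usually recover only partial information.

```python
# Toy illustration (assumed setting): a linear model pred = w.x + b
# with squared-error loss, updated on ONE private record.

def local_gradient(w, b, x, y):
    """Gradient of (w.x + b - y)^2 with respect to w and b."""
    err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
    grad_w = [2 * err * xi for xi in x]  # d loss / d w_i = 2 * err * x_i
    grad_b = 2 * err                     # d loss / d b   = 2 * err
    return grad_w, grad_b

def reconstruct_input(grad_w, grad_b):
    """An observer divides grad_w by grad_b to get x back (needs grad_b != 0)."""
    return [gw / grad_b for gw in grad_w]

# Shared model state, known to every participant.
w = [0.5, -1.0, 0.25]
b = 0.1

# Private record that never leaves the participant.
x_private = [3.0, 1.5, -2.0]
y_private = 4.0

# Only the gradient is uploaded ...
grad_w, grad_b = local_gradient(w, b, x_private, y_private)

# ... yet the private features can be recovered from it.
x_recovered = reconstruct_input(grad_w, grad_b)
print(x_recovered)
```

Averaging over larger batches or adding noise makes this kind of reconstruction harder, but how much information remains in the parameters is exactly the residual risk that is hard to quantify.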
Multi-Party Computation (MPC), by contrast, is not limited to training AI/ML models; it supports a broad set of computations, e.g. tabular joins, queries and general statistics. It offers stronger privacy guarantees because inputs remain encrypted during the entire processing. This eliminates the risk of inference attacks on intermediate results.
However, the stronger privacy property of MPC comes at higher computational cost. For example, training a decision tree with MPC on 10,000 records takes minutes, and training a neural net takes hours, while pure FL can handle these tasks faster. (Try this user-friendly open-source MPC example for training a Convolutional Neural Network on the well-known MNIST handwritten digits training set of 60,000 images.)
Whether you choose Federated Learning or MPC for your next project depends on your needs. If you need to train an AI/ML model on a large dataset and you don’t have the strictest privacy requirements, FL is a great option. If you need to support a broad range of analyses, queries and statistics, or the data is very sensitive, you will want to consider MPC.
And as a bonus, MPC can help when you need to train a large model and require strict privacy: use FL to train on the local datasets, and use MPC to combine the parameters!
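The combination can be sketched with additive secret sharing, one common MPC building block. This is an illustrative single-process simulation, not a description of any particular product or protocol: the modulus, the fixed-point scale and all function names are assumptions, and a real deployment would run the parties on separate machines with a hardened protocol. Each participant splits its locally trained parameters into random shares, so no single party sees another party's parameters, yet the shares still add up to the correct aggregate.

```python
import secrets

MODULUS = 2**61 - 1   # assumed modulus for the share arithmetic
SCALE = 10**6         # assumed fixed-point scale for float parameters

def encode(value):
    """Map a float to a fixed-point integer modulo MODULUS."""
    return round(value * SCALE) % MODULUS

def decode(value):
    """Map a field element back to a float (upper half = negative)."""
    if value > MODULUS // 2:
        value -= MODULUS
    return value / SCALE

def share(secret, n_parties):
    """Split a secret into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def secure_average(local_params):
    """Average parameter vectors using only sums of shares."""
    n = len(local_params)
    dim = len(local_params[0])
    # received[j][k]: running sum of the shares party j holds for coordinate k
    received = [[0] * dim for _ in range(n)]
    for params in local_params:
        for k, value in enumerate(params):
            for j, s in enumerate(share(encode(value), n)):
                received[j][k] = (received[j][k] + s) % MODULUS
    # Each party publishes only its summed shares; adding them gives the total.
    totals = [sum(received[j][k] for j in range(n)) % MODULUS
              for k in range(dim)]
    return [decode(t) / n for t in totals]

# Hypothetical parameters from three locally trained models.
local_params = [[0.10, -0.20], [0.30, 0.40], [-0.10, 0.10]]
global_params = secure_average(local_params)
print(global_params)
```

Only the averaged parameters are ever reconstructed, so the individual updates that inference attacks rely on are not exposed to the other participants.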
What's your data collaboration challenge? Let us know in the comments or contact us directly.