Secure aggregation on top of Federated Learning: Same faces, blurred masks
Photo credits: DC and Marvel

This synthesis draws on the paper "Practical Secure Aggregation for Privacy-Preserving Machine Learning".

In the previous episode, we saw that federated learning (FL) is a machine learning setting where multiple participants collaborate to train a shared model without the need to share data. However, research has shown that keeping data local during the training process does not guarantee complete privacy.

In fact, one of the most prominent challenges is data reconstruction: the aggregator, which collects the local model parameters from all participants in order to aggregate them into the global model, may be able to reveal extensive information about each participant's private dataset, defeating the entire purpose of federated learning.

Today, we will synthesize two techniques that can be used to overcome this limitation:

I. Secure aggregation,

II. Differential privacy.

So, fasten your seat belt!


I. What is secure aggregation?

Secure aggregation (SA) is a protocol that has been proposed to address the problem above by preventing the aggregator from analyzing the participants' individual model updates. Current implementations of SA in FL frameworks fall under one of two main categories:

1- Multi-party computation


The first category is multi-party computation: the privacy of the locally trained models is protected by applying techniques from cryptography. More specifically, each pair of participants cooperates to generate a random mask vector (a vector of length n, where n is the number of model parameters) and uses it to obfuscate their own updates before sending them to the server. These pairwise masks have a special property: once the masked models of all the clients are summed up at the aggregator (server), the masks cancel out, so the server learns the aggregate of all models without seeing any individual client update.
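To make the cancellation property concrete, here is a minimal NumPy sketch (all names and sizes are illustrative and not taken from any particular framework): each pair of clients shares one random mask that one of them adds and the other subtracts, so the server's sum of masked updates equals the sum of the true updates.

```python
# Minimal sketch (NumPy) of pairwise masking: each pair of clients (i, j)
# agrees on a random mask vector m_ij; client i adds it, client j subtracts it,
# so every mask cancels when the server sums the masked updates.
import numpy as np

rng = np.random.default_rng(0)
n_clients, n_params = 3, 5

# Each client's true local model update (stand-in for real gradients/weights).
updates = [rng.normal(size=n_params) for _ in range(n_clients)]

# Pairwise masks: one shared random vector per unordered pair (i, j), i < j.
pair_masks = {(i, j): rng.normal(size=n_params)
              for i in range(n_clients) for j in range(i + 1, n_clients)}

def masked_update(i):
    """Client i obfuscates its update: +m_ij for j > i, -m_ji for j < i."""
    masked = updates[i].copy()
    for (a, b), m in pair_masks.items():
        if a == i:
            masked += m
        elif b == i:
            masked -= m
    return masked

# The server only ever sees masked updates, yet their sum equals the true sum.
server_sum = sum(masked_update(i) for i in range(n_clients))
assert np.allclose(server_sum, sum(updates))
```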


Figure 1. Secure aggregation / Source: keynote for the Secure Aggregation paper

Challenges with multi-party computation

The overhead of secure aggregation creates a substantial bottleneck in scaling secure federated learning. Imagine thousands or even millions of clients in a network, each requiring pairwise masks of length n to be generated and coordinated, where the mask vectors are as big as the neural network model itself: the overhead grows quadratically with the number of users. Fortunately, the Diffie-Hellman key exchange saves the day. Instead of sending gigantic vectors between clients, this algorithm allows two parties to derive the same mask vector from a single exchanged integer each, without ever revealing the mask to the server.
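As a rough illustration of that idea, the toy sketch below (with deliberately small, insecure parameters p and g chosen only for readability) shows two clients exchanging a single public integer each, deriving the same shared secret, and expanding it with a pseudorandom generator into a full-length mask.

```python
# Toy sketch of the Diffie-Hellman idea: two clients exchange one public integer
# each, derive the same shared secret, and expand it with a PRG into a full-length
# mask vector -- so the big mask itself is never sent over the network.
# (Toy parameters only; real deployments use proper groups and key sizes.)
import numpy as np

p, g = 2**61 - 1, 5          # toy prime modulus and generator (NOT secure sizes)
n_params = 5

# Each client keeps a private key and publishes only g^key mod p.
alice_priv, bob_priv = 123456789, 987654321
alice_pub = pow(g, alice_priv, p)
bob_pub = pow(g, bob_priv, p)

# Both sides compute the same shared secret from the other's public integer.
alice_secret = pow(bob_pub, alice_priv, p)
bob_secret = pow(alice_pub, bob_priv, p)
assert alice_secret == bob_secret

# The shared secret seeds a PRG that expands into the pairwise mask vector.
def expand_mask(seed, length):
    return np.random.default_rng(seed).normal(size=length)

mask_alice = expand_mask(alice_secret, n_params)
mask_bob = expand_mask(bob_secret, n_params)
assert np.allclose(mask_alice, mask_bob)   # identical masks, never transmitted
```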


In a federated learning setup, users can drop out at any moment for several reasons, such as low battery or poor connectivity. As a result, the pairwise masks they generated with other users will not be canceled out, and the output will not be recoverable since it is still masked. Hence, the secure aggregation protocol must be robust enough to operate in environments where users can drop out at any stage of the protocol execution.
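The sketch below (same illustrative setup as the earlier masking example) shows exactly what goes wrong: when one client drops out after the masks have been agreed on, the masks it shared with the survivors no longer cancel and the server's sum stays corrupted.

```python
# Sketch of the dropout problem: if client 2 never sends its masked update,
# the pairwise masks it shared with clients 0 and 1 no longer cancel,
# and the server's sum stays corrupted by those leftover masks.
import numpy as np

rng = np.random.default_rng(1)
n_clients, n_params = 3, 4
updates = [rng.normal(size=n_params) for _ in range(n_clients)]
pair_masks = {(i, j): rng.normal(size=n_params)
              for i in range(n_clients) for j in range(i + 1, n_clients)}

def masked_update(i):
    out = updates[i].copy()
    for (a, b), m in pair_masks.items():
        if a == i:
            out += m
        elif b == i:
            out -= m
    return out

surviving = [0, 1]                       # client 2 dropped out mid-protocol
partial_sum = sum(masked_update(i) for i in surviving)
true_partial = sum(updates[i] for i in surviving)

# The leftover masks shared with the dropped client never cancel:
assert not np.allclose(partial_sum, true_partial)
```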


Figure 2. Dropout risk in secure aggregation / Source: keynote for the Secure Aggregation paper

Existing SA protocols

Over the years, several secure aggregation protocols have been proposed, each with its own strengths and weaknesses:

  • SecAgg offers strong privacy guarantees and good dropout resilience (it tolerates up to 1/3 of user devices dropping out midway through the protocol), but it has significant computation and communication costs, since each client generates shares for every other client participating in each FL round (see the sketch after this list).
  • SecAgg+ is an improved version of its predecessor; the key difference is that each client generates shares only for a fixed number of close neighbors rather than for all clients, reducing the computation and communication costs.
  • HybridAlpha employs differential privacy, a technique we will address in the next sections, together with functional encryption, which lets participants encrypt their data while a specific function can still be computed on the encrypted inputs, so the aggregator learns nothing about an individual user's data.
  • FastSecAgg introduces a novel secret sharing technique called FastShare, which lowers the computation and communication costs while maintaining a strong privacy guarantee.
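The "shares" mentioned above typically come from t-out-of-n secret sharing: each client splits the seed of its masks among the others, so that any t surviving clients can help the server reconstruct a dropped client's masks, while fewer than t shares reveal nothing. Below is a minimal, generic Shamir-style sketch with a toy field size and illustrative parameters; it is not the exact construction used by any of the protocols above.

```python
# Minimal Shamir t-out-of-n secret sharing sketch: a client splits its mask seed
# into n shares so that any t of them reconstruct it (e.g. if the client drops out),
# while fewer than t shares reveal nothing. Toy field size, for illustration only.
import random

PRIME = 2**61 - 1  # all arithmetic is done modulo a prime

def make_shares(secret, t, n):
    """Split `secret` into n shares; any t of them can reconstruct it."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, k, PRIME) for k, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

seed = 424242
shares = make_shares(seed, t=3, n=5)
assert reconstruct(shares[:3]) == seed      # any 3 of the 5 shares are enough
assert reconstruct(shares[1:4]) == seed
```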


2- Trusted Execution Environments

The second approach to performing secure aggregation is to leverage a hardware-based trusted execution environment that ensures the security of sensitive data shared by multiple parties. In the case of federated learning, the server can only examine the result of the computation, not the intermediate inputs to the computations performed in the secure area of the main processor. However, using such hardware is not straightforward, due to complicated and lengthy configuration steps as well as a lack of documentation. Trusted execution environments only allow the creation of small, fixed-size secure memory regions in order to reduce the attack surface; consequently, it is not possible to place the weights of thousands of clients in the secure region simultaneously for aggregation.


II. Differential privacy

As mentioned above, secure aggregation is a first step towards ensuring data privacy by preventing the aggregator from seeing individual client updates. However, an adversary may still be able to perform a number of attacks, such as a membership inference attack, to learn whether a record was used in the training set. Fortunately, this issue can be mitigated thanks to differential privacy, an approach for preventing information leakage. More specifically, after each local training round at the client level, noise is added to the model parameters using a random perturbation algorithm, without affecting their overall pattern. Still, such an approach brings a trade-off between privacy and model performance: the more noise you add, the less accurate the model becomes, especially for small datasets.


1- How does it work?

Generally speaking, we can think of differential privacy as a small distortion of the model updates. Consider, for example, D = {x1, x2, x3, …, xn}, a data set that contains n datapoints, and D' = {x1, x2', x3, …, xn}, a neighbouring data set to D with a small distortion (a difference of only one datapoint). When we apply the same differential privacy algorithm to both data sets, we get similar output distributions, as illustrated in Figure 3.
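Here is a small sketch of that idea using a simple noisy counting query as the algorithm (the Laplace mechanism below is a standard textbook choice, not something prescribed by the figure): releasing the noised count of D and of D' many times gives two closely overlapping distributions, so an observer can hardly tell which data set was used.

```python
# Sketch of the neighbouring-datasets idea: D and D' differ in one record, and an
# epsilon-DP mechanism (here, a Laplace-noised count) gives almost the same output
# distribution on both datasets.
import numpy as np

rng = np.random.default_rng(42)
D  = np.array([1, 0, 1, 1, 0, 1, 0, 1])   # n datapoints
D2 = D.copy(); D2[1] = 1                   # neighbouring data set: one point changed

epsilon, sensitivity = 1.0, 1.0            # a counting query changes by at most 1

def noisy_count(data):
    # Laplace mechanism: add noise with scale = sensitivity / epsilon
    return data.sum() + rng.laplace(scale=sensitivity / epsilon)

# Repeated releases from D and D' have closely overlapping distributions.
samples_D  = [noisy_count(D)  for _ in range(10_000)]
samples_D2 = [noisy_count(D2) for _ in range(10_000)]
print(np.mean(samples_D), np.mean(samples_D2))   # means differ by ~1, spreads overlap
```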


Figure 3. An example of differential privacy

More specifically, differential privacy introduces a privacy loss parameter, ε, which controls how much noise (or distortion) is added to the raw data set.

To make things more concrete, let's consider a column that contains 0 or 1 as values. For each value, we flip a coin. If it is heads, the value remains as it is. Otherwise, we flip the coin again: if it is heads, we save the value as 1; if it is tails, we save it as 0. In real-life applications, the noise-adding process is more sophisticated than a simple coin flip, and the amount of randomness is controlled through ε: as ε grows, we gain in terms of model performance but lose privacy, and vice versa.
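This coin-flip procedure is classic randomized response; a small sketch (illustrative only) shows that each reported bit is plausibly deniable, while the overall proportion of 1s can still be estimated after de-biasing.

```python
# Randomized response, exactly as described above: keep the true bit on heads,
# otherwise flip again and report heads -> 1, tails -> 0. Each reported value is
# deniable, yet the true proportion of 1s can still be estimated.
import random

def randomized_response(true_bit):
    if random.random() < 0.5:      # first flip: heads -> keep the real value
        return true_bit
    return 1 if random.random() < 0.5 else 0   # second flip decides the answer

true_bits = [random.randint(0, 1) for _ in range(100_000)]
reported = [randomized_response(b) for b in true_bits]

# De-bias: P(report 1) = 0.5 * P(true 1) + 0.25, so estimate = 2 * mean - 0.5
estimate = 2 * (sum(reported) / len(reported)) - 0.5
print(estimate, sum(true_bits) / len(true_bits))   # close, despite per-user noise
```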

In federated learning, differential privacy is achieved either by adding Gaussian noise to the global weights (known as central differential privacy) or at the client level, where each client adds Gaussian noise to its local weights (known as local differential privacy).
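A minimal sketch of the two placements of noise follows; the noise scales are illustrative and not calibrated to any particular ε. In local DP each client perturbs its own weights before sending them, while in central DP the aggregator perturbs the averaged weights once.

```python
# Sketch of the two placements of Gaussian noise: local DP (each client perturbs
# its own weights before sending) vs central DP (the aggregator perturbs the
# averaged weights once). Noise scales are illustrative, not calibrated.
import numpy as np

rng = np.random.default_rng(7)
n_clients, n_params, sigma = 10, 6, 0.1
local_weights = [rng.normal(size=n_params) for _ in range(n_clients)]

# Local DP: noise is added on-device, so even the server never sees exact updates.
noisy_local = [w + rng.normal(scale=sigma, size=n_params) for w in local_weights]
global_local_dp = np.mean(noisy_local, axis=0)

# Central DP: the (trusted) aggregator averages exact updates, then adds noise once.
global_central_dp = np.mean(local_weights, axis=0) + rng.normal(scale=sigma, size=n_params)

print(global_local_dp)
print(global_central_dp)
```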

2- Differential privacy tools in Python

There are various Python libraries that support differential privacy. For instance, Diffprivlib is an open-source library introduced by IBM, and TensorFlow Privacy is provided by Google. PyDP wraps Google's differential privacy library and contains a variety of ε-differentially private algorithms. Moreover, Facebook introduced Opacus to train PyTorch models with differential privacy.
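As a hedged example of the last option, the sketch below uses Opacus's PrivacyEngine.make_private as documented for Opacus 1.x, with a dummy model and dataset; the exact arguments may differ across versions, so treat it as an assumption to verify against the current documentation.

```python
# Hedged sketch of differentially private training with Opacus (API as of
# Opacus 1.x -- make_private() and its arguments may differ in other versions,
# so check against the current docs). Model and data here are dummies.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
data = TensorDataset(torch.randn(256, 10), torch.randint(0, 2, (256,)))
loader = DataLoader(data, batch_size=32)

privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.1,   # how much Gaussian noise is added to clipped gradients
    max_grad_norm=1.0,      # per-sample gradient clipping bound
)

criterion = nn.CrossEntropyLoss()
for x, y in loader:
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
```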

3- Challenges of differential privacy

  • Differential privacy is mainly applicable to large data sets, where a certain amount of model inaccuracy can be tolerated; this is much harder to achieve with small data sets.
  • There is no universally good value of ε (the privacy loss parameter). If ε = 0, we have perfect privacy, but the data is completely distorted and unusable; if ε is very large, there is hardly any privacy at all. Consequently, it is hard to find the value of ε that strikes the right trade-off.

Preserving the privacy of users' data is the main purpose of federated learning; thus, secure aggregation should be a priority in all FL systems and frameworks in order to build a successful relationship of trust with the participating clients.

Multi-party computation, trusted execution environments, and differential privacy all have their own limitations and challenges. Nevertheless, with the accelerating adoption of federated learning by companies, engineers will eventually overcome these obstacles, and federated learning will become the default setting for training ML models, especially in the healthcare and finance sectors, where data privacy matters most.

In the next episode, we will reveal our proposed Blockchain-orchestrated FL architecture. For once in history, the Joker and Batman will be associates. So, stick around! :)


