Federated Learning for ML
Srikanth Devarajan
Transformative Technology Leader | AI & Cloud Innovation | Product Strategy & Execution | Creative Problem-Solver
Machine Learning (ML) solutions are hungry for data. However, various data-sharing concerns stand between the data providers and the ML teams seeking useful data. Understandably, privacy takes the highest precedence when dealing with data sharing. At the same time, stringent privacy regulations come with a price in model accuracy and computation. Overly redacted data is completely useless for machine learning; at the other extreme, giving up privacy invites risk. So ML development teams must consider privacy-preserving ML, i.e., increasing privacy without a trade-off in model accuracy.
Privacy challenges cannot come at the cost of advancement
Federated learning helps us manage these privacy issues. It is a decentralized form of machine learning that can help governmental agencies and other data-producing organizations improve modeling accuracy while keeping an uncompromising check on data privacy.
Federated learning distributes the machine learning computation by sending the model out to many devices, so that training is localized within the data's source realm. The trained models are then combined on a central server. In this approach, every data-supplying client receives the model architecture and some instructions for the training. The model is trained on local infrastructure or devices, and only the weights are returned to the central server.
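The round described above can be sketched in a few lines of plain Python. This is an illustrative toy (a one-parameter linear model and a hand-rolled federated-averaging step), not any real framework's API; all names and the data are made up for the example.

```python
# Minimal sketch of one federated-averaging (FedAvg) round on a toy
# linear model y = w * x. Raw data never leaves the client functions;
# only the trained weight and a sample count are returned.

def local_train(weights, data, lr=0.1, epochs=5):
    """Train on one client's local data; only the weight leaves."""
    w = weights
    for _ in range(epochs):
        # gradient of mean squared error for y = w * x
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w, len(data)  # updated weight + sample count, never raw data

def federated_round(global_w, clients):
    """Server sends the model out, then averages the returned weights."""
    updates = [local_train(global_w, data) for data in clients]
    total = sum(n for _, n in updates)
    # weighted average: clients with more data count for more
    return sum(w * n for w, n in updates) / total

# Two clients whose private data happens to follow y = 2x; the data
# stays inside each client's list and is never pooled centrally.
clients = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]
w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
print(round(w, 2))  # converges toward the true slope 2.0
```

The weighting by sample count mirrors the standard FedAvg design choice: a client holding more data should pull the global model proportionally harder.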
The key point is that the source data never leaves the source devices and is never pooled in one location. This is very different from the traditional architecture of gathering a data set in a central location and then training a model on it.
Though the above pattern improves privacy, because an interceptor sees only model weights rather than raw data, it is not 100% foolproof. It is sometimes possible to reverse engineer the weights and fish information back out about the raw data. To reduce such possibilities, the pattern can employ additional mechanisms, such as establishing a neutral governance team to securely aggregate and average the weights into the central model, with process locks in place to deter that governance team from seeing or tampering with the model data received from the localized models.
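One way to make aggregation "secure" in the sense above is pairwise additive masking: each pair of clients agrees on a random mask that one adds and the other subtracts, so every individual upload looks random to the server, yet the masks cancel in the sum. The sketch below is a toy illustration of that idea only; real secure-aggregation protocols additionally handle key agreement and client dropouts, which this example ignores.

```python
# Toy sketch of secure aggregation via pairwise additive masks.
# Each pair (i, j) shares a random mask; client i adds it to its
# upload and client j subtracts it. The server sees only masked
# values, but their sum equals the sum of the true updates.
import random

def masked_updates(true_updates, seed=42):
    rng = random.Random(seed)  # stand-in for a pairwise shared secret
    n = len(true_updates)
    masked = list(true_updates)
    for i in range(n):
        for j in range(i + 1, n):
            mask = rng.uniform(-100, 100)  # secret shared by i and j
            masked[i] += mask              # client i adds the mask
            masked[j] -= mask              # client j subtracts it
    return masked

true_updates = [0.5, 1.2, -0.3]   # each client's private weight delta
uploads = masked_updates(true_updates)

# Individual uploads reveal little, yet the aggregate is exact.
print(round(sum(uploads), 6))     # masks cancel: 0.5 + 1.2 - 0.3 = 1.4
```

Because no single masked upload resembles its true update, even the aggregating party learns only the combined result, which is exactly the property the governance-team pattern is trying to enforce.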
TensorFlow supports federated learning through TensorFlow Federated (TFF), an open-source framework for machine learning on decentralized data. TFF enables developers to simulate federated learning algorithms on their own models and data. For example, prediction models for mobile keyboards can be trained this way without uploading sensitive typing data to servers.
Here are some links to starting points and complete examples. In addition, the building blocks provided by TFF can be used to implement non-learning computations, such as federated analytics.
To conclude, data providers are hesitant to share data, and given the regulations and privacy constraints, that hesitation is understandable. However, privacy challenges cannot come at the cost of advancement. Federated learning grew out of Edge-AI/IoT edge computing and is still in its early stages. As the pattern matures, it can enable ML development teams and data teams to find a win-win: meaningful solutions that do not compromise security, privacy, or legal obligations.
Good Luck.