Google AI Making Progress in Federated Learning with Formal Differential Privacy Guarantees

Google AI has been involved in a flurry of research papers that have piqued my interest over the last few weeks, namely this one, this one, and this.

If you enjoy articles about A.I. at the intersection of breaking news, join AiSupremacy here. I cannot continue to write without community support (follow the link below).

https://aisupremacy.substack.com/subscribe

AiSupremacy is a Newsletter at the intersection of A.I. and breaking news. You can keep up to date with the articles here.

Since AiSupremacy is not Synced (a great resource for AI academic research summaries), we aren’t going to go into all of them. However, let’s try to unpack this:

Google AI Blog

While my Datascience Learning Center usually digs into the technical details, we can also understand this fairly easily with a bit of context.

Tracking the evolution of federated learning, differential privacy, and DP-FTRL is crucial to making sure A.I. is secure and anonymized, and to allowing machine learning to work in more sensitive data fields.

Fair warning: this article is going to be a bit more technical, so if the topic doesn’t interest you, just skip it. I personally do my best to give credit where credit is due with regards to the A.I. work that Google, Microsoft, and Facebook do, which is significant for the field as a whole.

What is Federated Learning (FL)?

  • Federated learning is a machine learning technique that trains an algorithm across multiple decentralized edge devices or servers holding local data samples, without exchanging them.
  • In 2017, Google introduced federated learning (FL), an approach that enables mobile devices to collaboratively train machine learning (ML) models while keeping the raw training data on each user's device, decoupling the ability to do ML from the need to store the data in the cloud.
  • Since its introduction, Google has continued to actively engage in FL research and deployed FL to power many features in Gboard, including next word prediction, emoji suggestion and out-of-vocabulary word discovery. Federated learning is improving the “Hey Google” detection models in Assistant, suggesting replies in Google Messages, predicting text selections, and more.
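
The collaborative training loop described above can be sketched in a few lines. The following is a minimal NumPy simulation of federated averaging on a toy linear-regression problem, not Google's production system; the client data, model, and learning rate here are all invented for illustration.

```python
import numpy as np

def client_update(weights, x, y, lr=0.1):
    """One local gradient step on a linear model; the raw (x, y) never leaves the client."""
    grad = 2 * x.T @ (x @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(weights, clients):
    """The server averages locally computed model updates, never the raw data."""
    updates = [client_update(weights, x, y) - weights for x, y in clients]
    return weights + np.mean(updates, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(5):
    x = rng.normal(size=(20, 2))
    clients.append((x, x @ true_w))  # each client holds its own local dataset

w = np.zeros(2)
for _ in range(100):
    w = federated_round(w, clients)  # w converges toward true_w
```

The key property is visible in `federated_round`: only model-update vectors cross the network, while each client's `(x, y)` stays local.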

Differential Privacy (DP)

  • Differential privacy is a system for publicly sharing information about a dataset by describing the patterns of groups within the dataset while withholding information about individuals in the dataset.
  • While FL allows ML without raw data collection, differential privacy (DP) provides a quantifiable measure of data anonymization, and when applied to ML can address concerns about models memorizing sensitive user data. This too has been a top research priority, and has yielded one of the first production uses of DP for analytics with RAPPOR in 2014, our open-source DP library, Pipeline DP, and TensorFlow Privacy.
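
To make "quantifiable anonymization" concrete, here is a hedged sketch of one classic DP building block, the Gaussian mechanism: clip each person's contribution so no individual can move the result by much, then add noise scaled to that worst-case influence. The dataset and parameters are invented for illustration; this is not code from Google's libraries.

```python
import numpy as np

def dp_mean(values, lo, hi, noise_multiplier, rng):
    """Release a differentially private mean: clip each contribution to [lo, hi]
    so one person shifts the mean by at most (hi - lo) / n, then add Gaussian
    noise scaled to that sensitivity."""
    clipped = np.clip(values, lo, hi)
    sensitivity = (hi - lo) / len(values)  # max effect of one person on the mean
    noise = rng.normal(0.0, noise_multiplier * sensitivity)
    return clipped.mean() + noise

rng = np.random.default_rng(42)
ages = np.array([23, 35, 29, 41, 37, 52, 30, 27])
private_mean = dp_mean(ages, lo=0, hi=100, noise_multiplier=1.0, rng=rng)
```

The noise hides any single individual's presence while the aggregate pattern (the rough average age) survives, which is exactly the trade-off the definition above describes.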

What is DP-FTRL?

Check it out on GitHub

  • Google AI says that through a multi-year, multi-team effort spanning fundamental research and product integration, it has deployed a production ML model using federated learning with a rigorous differential privacy guarantee.
  • For this proof-of-concept deployment, they utilized the DP-FTRL algorithm to train a recurrent neural network to power next-word prediction for Spanish-language Gboard users. To Google's knowledge, this is the first production neural network trained directly on user data announced with a formal DP guarantee (technically ρ = 0.81 zero-concentrated differential privacy, zCDP, discussed in detail below). Further, the federated approach offers complementary data-minimization advantages, and the DP guarantee protects all of the data on each device, not just individual training examples.
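
The ρ = 0.81 figure is a zCDP parameter, which can be translated into the more familiar (ε, δ)-DP language. A standard conversion (due to Bun and Steinke) is ε ≤ ρ + 2·√(ρ·ln(1/δ)); the sketch below applies it, with the choice of δ being my own illustrative value, not one from the paper.

```python
import math

def zcdp_to_eps(rho, delta):
    """Standard rho-zCDP to (epsilon, delta)-DP conversion:
    epsilon = rho + 2 * sqrt(rho * ln(1 / delta))."""
    return rho + 2 * math.sqrt(rho * math.log(1 / delta))

# Translate Google's reported rho = 0.81 at an illustrative delta.
eps = zcdp_to_eps(0.81, 1e-10)
```

Smaller δ (a tighter failure probability) yields a larger ε, so a single ρ value summarizes a whole curve of (ε, δ) guarantees.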

Read the Paper on DP-FTRL

This research is especially important as Google AI, Google Brain, and other teams like DeepMind move into healthcare AI and healthcare data.


Data Minimization and Anonymization in Federated Learning

Along with fundamentals like transparency and consent, the privacy principles of data minimization and anonymization are important in ML applications that involve sensitive data.

Read Federated Learning and Privacy

Federated learning systems structurally incorporate the principle of data minimization. FL only transmits minimal updates for a specific model training task (focused collection), limits access to data at all stages, processes individuals’ data as early as possible (early aggregation), and discards both collected and processed data as soon as possible (minimal retention).

Another principle that is important for models trained on user data is anonymization, meaning that the final model should not memorize information unique to a particular individual's data, e.g., phone numbers, addresses, credit card numbers. However, FL on its own does not directly tackle this problem.

In 2022, Apple, Google, and Microsoft are renewing their emphasis on privacy as they move into new areas of our private and personal data and expand their cloud computing services.


Jeff Dean heads AI at Google, particularly Research and Health. With Google moving further into the healthcare sector, this kind of research becomes more important.

The Challenging Path to Federated Learning with Differential Privacy

In 2018, Google AI introduced the DP-FedAvg algorithm, which extended the DP-SGD approach to the federated setting with user-level DP guarantees, and in 2020 Google deployed this algorithm to mobile devices for the first time. This approach ensures the training mechanism is not too sensitive to any one user's data, and empirical privacy auditing techniques rule out some forms of memorization.
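
The core of that user-level guarantee is clip-then-noise aggregation: bound each user's entire model update (not just one training example), average, and add Gaussian noise. The sketch below shows the idea under my own invented parameters; the real DP-FedAvg pipeline involves much more (server-side optimizer state, privacy accounting, secure aggregation).

```python
import numpy as np

def dp_fedavg_aggregate(updates, clip_norm, noise_multiplier, rng):
    """User-level DP aggregation: clip each user's whole update to a bounded
    L2 norm, average, then add Gaussian noise scaled to the clip bound so no
    single user can dominate the aggregated result."""
    clipped = []
    for u in updates:
        norm = np.linalg.norm(u)
        clipped.append(u * min(1.0, clip_norm / max(norm, 1e-12)))
    mean = np.mean(clipped, axis=0)
    noise_std = noise_multiplier * clip_norm / len(updates)
    return mean + rng.normal(0.0, noise_std, size=mean.shape)

rng = np.random.default_rng(7)
updates = [rng.normal(size=4) for _ in range(100)]  # 100 simulated user updates
agg = dp_fedavg_aggregate(updates, clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

Clipping the whole update vector per user, rather than per example, is what makes the guarantee user-level: it bounds the total influence of everything on one device.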

As Google has moved into the hardware, smartphone, and device business in recent years, this technology also becomes more salient to its business model.

The amplification-via-sampling argument is essential to providing a strong DP guarantee for DP-FedAvg, but in a real-world cross-device FL system, ensuring devices are subsampled precisely and uniformly at random from a large population would be complex and hard to verify. The challenge is that devices choose when to connect (or "check in") based on many external factors (e.g., requiring the device to be idle, on unmetered WiFi, and charging), and the number of available devices can vary substantially.

Read Google AI Blog

About My Work

Did you know I also run other related Newsletters? The front pages of my Newsletters are small treasure troves of articles now.

I’ve also recently started a Newsletter of bite-size A.I. news articles.

Join A.I. Survey Newsletter

To my knowledge, I’m among just a handful of indie media startups developing multiple Newsletters simultaneously.


FL with DP

Achieving a formal privacy guarantee requires a protocol that does?all?of the following:

  • Makes progress on training even as the set of devices available varies significantly with time.
  • Maintains privacy guarantees even in the face of unexpected or arbitrary changes in device availability.
  • For efficiency, allows client devices to locally decide whether they will check in to the server in order to participate in training, independent of other devices.
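
The third requirement above is what DP-FTRL enables: because its guarantee does not depend on uniform random sampling, each device can apply a purely local participation rule. The following is a speculative illustration of what such a rule might look like; the specific conditions and the minimum-separation parameter are my own invention, not Google's actual client logic.

```python
def should_check_in(is_idle, on_unmetered_wifi, is_charging,
                    current_round, last_participated_round,
                    min_separation=4):
    """Local participation rule: device-state conditions plus a minimum number
    of rounds between contributions, decided entirely on-device with no
    server-side uniform sampling."""
    eligible = is_idle and on_unmetered_wifi and is_charging
    rested = current_round - last_participated_round >= min_separation
    return eligible and rested
```

A spacing constraint of this kind limits how often any one device contributes, which is the sort of structure a privacy accountant can reason about even when availability is otherwise arbitrary.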

While Google says it has reached the milestone of deploying a production FL model using a mechanism that provides a meaningfully small zCDP, its research journey continues.

They are still far from being able to say this approach is possible (let alone practical) for most ML models or product applications, and other approaches to private ML exist. They note that they are excited to continue the journey toward maximizing the value that ML can deliver while minimizing potential privacy costs to those who contribute training data.

I was excited to read in Synced that a research team from Cornell University and Google Brain has introduced FLASH, a model family that achieves quality on par with fully-augmented Transformers while maintaining linear scalability over context size on modern accelerators.

Read Synced Blog on FLASH

NOTE FROM THE AUTHOR


AiSupremacy is the fastest-growing Substack Newsletter in AI at the intersection of breaking news. It’s ranked #1 in Machine Learning as of January 22nd, 2022.

Thanks for reading!

Michael Spencer

A.I. Writer, researcher and curator - full-time Newsletter publication manager.
