Privacy-Preserving Machine Learning with Fully Homomorphic Encryption
Most LLMs (Large Language Models) today are trained primarily on publicly available data, which limits their applicability in domains with strict data privacy requirements. While training data shapes these models' capacity, other factors such as model architecture, training techniques, and computational resources are significant as well.
The ability to operate on private data is crucial for many use cases, and unfortunately the one-size-fits-all LLM approach may not always be suitable.
However, techniques exist to adapt LLMs to work with private data securely. These include:
- Federated learning
- Differential privacy
- Zero-knowledge machine learning (ZKML)
- Fully homomorphic encryption (FHE)
Applications of Fully Homomorphic Encryption (FHE) for on-chain use cases have already emerged, thanks to organizations like Zama and Inco.
Now there's potential to bring these primitives to privacy-preserving machine learning (PPML) without compromising data privacy. This benefits model training on private, exclusive, proprietary data, as well as inference on encrypted weights.
What is FHE?
FHE allows:
- Computations to be performed directly on encrypted data, without ever decrypting it
- Encrypted results that, once decrypted by the key holder, match the output of the same computation on the plaintext
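To make this property concrete, here is a minimal sketch of a homomorphic computation using Zama's Concrete library (pip install concrete-python). The function, input ranges, and values are illustrative assumptions for this post, not a definitive implementation:

from concrete import fhe

# Mark both inputs as encrypted; the compiled circuit operates on ciphertexts
@fhe.compiler({"x": "encrypted", "y": "encrypted"})
def weighted_sum(x, y):
    return 2 * x + y

# An inputset of representative values lets the compiler size the circuit
inputset = [(3, 1), (7, 4), (0, 9), (15, 15)]
circuit = weighted_sum.compile(inputset)

# Encrypt, evaluate homomorphically, decrypt; the evaluator never sees 5 or 2
result = circuit.encrypt_run_decrypt(5, 2)
assert result == 2 * 5 + 2  # identical to the plaintext computation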
FHE is particularly relevant for areas where data protection is crucial, including:
- Genomic LLMs for inference or training foundation models
- Patient-specific medical data
- Tailored customer support solutions using personal data
- Secure R&D using proprietary information and intellectual property
- Secure multi-party computation for jointly developing models without sharing individual datasets
As the demand for generative AI in sensitive domains continues to grow, the importance of these techniques cannot be overstated. By allowing computations on encrypted data, FHE enables the development of powerful ML models that can leverage private and proprietary data without compromising privacy. This opens up a wide range of possibilities for industries such as healthcare, finance, and government, where data privacy is of utmost importance. As research in this field advances, we can expect to see more widespread adoption of privacy-preserving machine learning, unlocking the full potential of AI while ensuring the protection of sensitive information.
The drawback, however, is that FHE is inherently compute-intensive, and coupling it with AI/ML could really exacerbate the compute requirements for training or fine-tuning on encrypted data. But there is promise here. For example, until recently most zk proofs were too intensive to run comfortably on local hardware; ICICLE and other GPU acceleration libraries have since emerged (though they're CUDA-based and today lack support for the Apple chipset). More on that here: https://blog.ezkl.xyz/post/acceleration/ Last, FHE FPGAs are also scheduled to hit the market in 2025, and several companies are working on them, including Cornami, Intel, Duality, Fabric, etc.
For a much more detailed overview of privacy-preserving machine learning (PPML), check out this great post from Bagel.
This series of posts by Daniel Huynh of Mithril Security is definitely worth digging into: https://towardsdatascience.com/homomorphic-encryption-intro-part-1-overview-and-use-cases-a601adcff06c
Also check out Zama's Concrete ML: https://docs.zama.ai/concrete-ml
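To give a feel for the workflow, here is a minimal, hedged sketch of encrypted inference with Concrete ML's scikit-learn-style interface (pip install concrete-ml). The dataset, model choice, and parameters are illustrative assumptions, and the exact API may vary between versions, so treat the docs above as authoritative:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from concrete.ml.sklearn import LogisticRegression

# Toy stand-in for sensitive data (e.g., the medical use cases above)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train in the clear, quantize (FHE requires integer arithmetic),
# then compile the model into an FHE circuit
model = LogisticRegression(n_bits=8)
model.fit(X_train, y_train)
model.compile(X_train)

# Inference runs on encrypted inputs; only the key holder can read the result
y_pred = model.predict(X_test[:5], fhe="execute")
print(y_pred)

Note the asymmetry: training here happens on plaintext, and only inference is encrypted. Training directly on encrypted data remains far more expensive, which is exactly the compute challenge discussed above.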
Special thanks to Remi Gai, Bidhan R., Sree Duggirala, and Shrey J. for helping me with this post!
Passionate, strategic, business development servant leader for technology platform partnerships & GTMs.
11 months ago: OpenFHE has made good strides toward unifying FHE libraries and schemes for some time now. I agree that a HAL for acceleration via Intel, new FPGAs, and other compute improvements is needed. Perhaps those new to this world (crypto, AI) and the old-timers (academia, Duality, Microsoft Research…) should come together more often?
Cofounder @ Async Labs | Glean | Facebook | Uber
1 year ago: While I'm familiar with all of the techniques listed, what's unclear to me is whether operating in an encrypted space truly hides identity, which is crucial to privacy. For example, can a chat over medical records avoid matching the patient name when the inference is also happening in encrypted space? In my understanding, FHE is great at data security while keeping the data directly analyzable by those with the private key; hence the use cases in the medical industry. Differential privacy's goal revolves more toward privacy (the inability to identify a unique feature) but comes at the cost of significant quality deterioration. ZKML is great in an adversarial environment, and federated learning for keeping raw data on the edge while sharing learnt patterns, although in the case of LLMs the weights memorize lots of raw data. Lots of nuances here... interesting to think about more, though.
Machine Learning Intern | AI, Cloud Computing, Python Programming | Leveraging tech skills for solving complex problems facing mankind.
1 year ago: The capacity of Fully Homomorphic Encryption (FHE) to facilitate computations on encrypted data without the need for decryption, thereby preserving data privacy throughout computational procedures such as model training and inference, is undeniably revolutionary. The prospective uses of FHE in areas such as genomic LLMs, analysis of patient-specific medical data, and secure multi-party computation offer great potential for advancing these crucial sciences. Indeed, as you correctly highlighted, the computationally demanding nature of FHE presents a notable obstacle, especially when combined with AI/ML workloads. However, the development of GPU acceleration libraries such as ICICLE is a positive advancement in addressing these difficulties, though there are currently restrictions in terms of support for the Apple chipset.
Principal PM Lead at Microsoft with expertise in Machine Learning and AI
1 year ago: Great read, thanks. Never thought of encrypted weights; it makes a lot of sense. It surely comes at a computational cost, but that should shrink with advancements in silicon and local compute for inference.
Strategic Solution Architect in Healthcare – Leadership, Innovation, and Sustainable Partnerships for Success
1 year ago: Acknowledging the complexities of integrating Fully Homomorphic Encryption for LLMs. It's indeed cutting-edge but challenging. How do you see technologies like Secure Multi-Party Computation fitting with these priorities?