Privacy-Preserving Machine Learning with Fully Homomorphic Encryption

Most large language models (LLMs) today are trained primarily on publicly available data, which can limit their applicability in domains with strict data privacy requirements. While training data shapes these models' capabilities, other factors such as model architecture, training techniques, and computational resources matter as well.

The ability to operate on private data is crucial for many use cases such as:

  • Medical
  • Finance
  • Government
  • Legal

Unfortunately, the one-size-fits-all LLM approach may not always be suitable.

However, techniques exist to adapt LLMs to work with private data securely. These include:

  • Fine-tuning models on encrypted or anonymized datasets
  • Using federated learning
  • Privacy-preserving methods like differential privacy
  • Using Full/Partial Homomorphic Encryption
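
To give a flavor of one of these techniques, here is the classic Laplace mechanism from differential privacy sketched in a few lines of Python. This is a minimal illustration, not a production implementation; the dataset, the predicate, and the epsilon value in the usage below are hypothetical choices for the demo.

```python
import math
import random

def laplace_noise(scale):
    # inverse-CDF sampling of the Laplace(0, scale) distribution
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(values, predicate, epsilon=1.0):
    # a counting query has sensitivity 1, so adding Laplace(1/epsilon)
    # noise to the true count satisfies epsilon-differential privacy
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)

# hypothetical usage: how many records have a value of at least 50?
noisy = dp_count(range(100), lambda v: v >= 50, epsilon=1.0)
```

Each query returns the true answer plus calibrated noise, so individual records cannot be singled out, at the cost of some accuracy per query.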

Applications of Fully Homomorphic Encryption (FHE) for on-chain use cases have emerged, thanks to organizations like Zama and Inco.

Now there's potential to bring these primitives to privacy-preserving machine learning (PPML) without compromising data privacy. This benefits model training on private, exclusive, proprietary data, as well as inference on encrypted weights.

What is FHE?

FHE allows:

  • Computations on encrypted data without decryption
  • The result remains encrypted, only accessible with the secret key
  • Encrypted data is secure through all computational processes (e.g., model training, inference)
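
To make the first two bullets concrete, here is a toy implementation of the Paillier cryptosystem, which is partially rather than fully homomorphic but illustrates the core idea: arithmetic on ciphertexts maps to arithmetic on the hidden plaintexts. The tiny primes are for demonstration only and provide no real security.

```python
import math
import random

def keygen(p=10_007, q=10_009):
    # p and q are tiny demo primes -- real deployments use much larger ones
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)            # modular inverse of lambda mod n
    return (n,), (lam, mu, n)       # public key, secret key

def encrypt(pk, m):
    (n,) = pk
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:      # r must be coprime to n
        r = random.randrange(1, n)
    # with generator g = n + 1: c = (1 + n)^m * r^n mod n^2
    return (pow(1 + n, m, n2) * pow(r, n, n2)) % n2

def decrypt(sk, c):
    lam, mu, n = sk
    n2 = n * n
    # L(x) = (x - 1) / n recovers the plaintext from c^lambda mod n^2
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def add_encrypted(pk, c1, c2):
    # multiplying ciphertexts adds the underlying plaintexts
    (n,) = pk
    return (c1 * c2) % (n * n)

def mul_plain(pk, c, k):
    # raising a ciphertext to a plaintext power scales the plaintext by k
    (n,) = pk
    return pow(c, k, n * n)
```

These two operations are already enough to evaluate a linear model on encrypted features; fully homomorphic schemes additionally support multiplication between ciphertexts, which is what makes general computation (and hence model training or deep inference) possible.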

FHE is particularly relevant for areas where data protection is crucial, including:

  • Genomic LLMs for inference or training foundation models
  • Patient-specific medical data
  • Tailored customer support solutions using personal data
  • Secure R&D using proprietary information and intellectual property
  • Secure multi-party computation for jointly developing models without sharing individual datasets
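
The last item can be illustrated with additive secret sharing, the simplest building block of secure multi-party computation: each party holds a random-looking share of a value, and sums can be computed share-wise without any party ever seeing the underlying inputs. A minimal sketch (the modulus and party count are arbitrary demo choices):

```python
import random

MOD = 2**61 - 1  # a prime modulus for share arithmetic

def share(secret, n_parties):
    # split `secret` into n additive shares that sum to it mod MOD
    shares = [random.randrange(MOD) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    return sum(shares) % MOD

def add_shared(shares_a, shares_b):
    # each party adds its two shares locally; no party sees the inputs
    return [(a + b) % MOD for a, b in zip(shares_a, shares_b)]
```

Summing model updates this way is the idea behind secure aggregation in federated learning: the server reconstructs only the aggregate, never any individual contribution.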

As the demand for generative AI in sensitive domains continues to grow, the importance of these techniques cannot be overstated. By allowing computations on encrypted data, FHE enables the development of powerful ML models that can leverage private and proprietary data without compromising privacy. This opens up a wide range of possibilities for industries such as healthcare, finance, and government, where data privacy is of utmost importance. As research in this field advances, we can expect to see more widespread adoption of privacy-preserving machine learning, unlocking the full potential of AI while ensuring the protection of sensitive information.

The drawback, however, is that FHE is inherently compute-intensive, and coupling it with AI/ML can greatly exacerbate compute needs when training or fine-tuning on encrypted data. But there is promise here. Until recently, most zk proofs were too intensive to run comfortably on local machines; since then, ICICLE and other GPU acceleration libraries have emerged (though they are CUDA-based and currently lack support for Apple silicon). More on that here: https://blog.ezkl.xyz/post/acceleration/ Lastly, FHE FPGAs are scheduled to hit the market in 2025, and several companies are working on them, including Cornami, Intel, Duality, and Fabric.

For a much more detailed overview of privacy-preserving machine learning (PPML), check out this great post from Bagel:

https://twitter.com/bagel_network/status/1765016627530154292

This series of posts by Daniel Huynh of Mithril Security is definitely worth digging into: https://towardsdatascience.com/homomorphic-encryption-intro-part-1-overview-and-use-cases-a601adcff06c

Also check out Zama's Concrete ML: https://docs.zama.ai/concrete-ml

Special thanks to Remi Gai, Bidhan R., Sree Duggirala, and Shrey J. for helping me with this post!

Mike Schrock

Passionate, strategic, business development servant leader for technology platform partnerships & GTMs.

11 months ago

OpenFHE has made good strides toward unifying FHE libraries and schemes for some time. I agree a HAL for acceleration via Intel, new FPGAs, and other compute improvements is needed. Perhaps those new to this world (crypto, AI) and the old-timers (academia, Duality, Microsoft Research…) should come together more often?

Sahil Thaker

Cofounder @ Async Labs | Glean | Facebook | Uber

1 year ago

While I'm familiar with all of the techniques listed, what's unclear to me is whether operating in an encrypted space truly hides identity, which is crucial to privacy. For example, can a chat over medical records avoid matching the patient name when the inference is also happening in encrypted space? In my understanding, FHE is great at data security while remaining analyzable directly by those with the private key; hence the use cases in the medical industry. Differential privacy's goal revolves more around privacy (the inability to identify a unique feature) but comes at the cost of significant quality deterioration. ZKML is great in an adversarial environment, and federated learning for keeping raw data on the edge while sharing learnt patterns, although in the case of LLMs the weights memorize lots of raw data. Lots of nuances here... interesting to think more though.

Yash N.

Machine Learning Intern | AI, Cloud Computing, Python Programming | Leveraging tech skills for solving complex problems facing mankind.

1 year ago

The capacity of Fully Homomorphic Encryption (FHE) to facilitate computations on encrypted data without the need for decryption, thereby preserving data privacy throughout computational procedures such as model training and inference, is undeniably revolutionary. The prospective uses of FHE in areas such as genomic large language models (LLMs), analysis of patient-specific medical data, and secure multi-party computation offer great potential for advancing these crucial sciences. Indeed, as you correctly highlighted, the computationally demanding characteristics of FHE present a notable obstacle, especially when combined with AI/ML activities. However, the development of GPU acceleration libraries such as ICICLE is a positive advancement in addressing these difficulties, though there are currently restrictions in terms of support for the Apple chipset.

Lena Dubov

Principal PM Lead at Microsoft with expertise in Machine Learning and AI

1 year ago

Great read, thanks. Never thought of encrypted weights, makes a lot of sense. It surely comes at a computational cost, but it should shrink with advancements in silicon and local compute for inference.

Häusler-Leutgeb Michael

Strategic Solution Architect in Healthcare – Leadership, Innovation, and Sustainable Partnerships for Success

1 year ago

Acknowledging the complexities of integrating Fully Homomorphic Encryption for LLMs. It's indeed cutting-edge but challenging. How do you see technologies like Secure Multi-Party Computation fitting with these priorities?
