登录查看更多内容

How to keep the servers blind?

Hardik Katyarmal

Building LattIQ | Ex-CRED, Flipkart

发布日期: 2024年5月23日

Imagine the legendary late Charlie Munger, sitting in his office, wanting to research a new stock on his Bloomberg Terminal for the latest market analysis. As a seasoned expert, Charlie knows the value of keeping his investment strategies under wraps. He didn't want anyone, not even the powerful Bloomberg Terminal, to know which stocks he is interested in. So, how did he achieve this? Private Information Retrieval.

What is Private Information Retrieval (PIR)?

If a layman were to design a PIR system how would they do it? A trivial implementation of PIR is for the client to download the whole database from the servers and query it locally. Problem solved! Technical or not, you can tell that's not a very intelligent way to do it. Why? The download costs are too high. What if data changes frequently? Do I download it every day?

The trivial solution may not be practical, but that gives us an objective function to optimise. Download/ Upload costs or the network bandwidth usage between a client and a server. Now the trick is just to find intelligent ways to maintain privacy of query parameters while we reduce bandwidth usage. Are there ways where I don't need to download the whole dump and still be able to privately query to retrieve data of my interest? Can we do it at half the cost? Can we do it at lower? The answer to all of these is YES!

Intelligent ways of implementing PIR

To keep Charlie's investment strategies under wraps, researchers have developed sophisticated cryptographic protocols to implement PIR without the hefty upload/ download costs. Popular approaches include:

Using homomorphic encryption, a technique that allows computations to be performed on encrypted data without needing to decrypt it first. This means Charlie can send encrypted queries to the Bloomberg Terminal, which processes these queries without knowing the contents, and then sends back the encrypted results for Charlie to decrypt locally.
Another method is secret sharing, where the query is split into parts and sent to multiple servers. Each server processes its part of the query without knowing the whole picture. When Charlie receives the parts back, he combines them to get his answer. This way, no single server has enough information to determine what Charlie is looking for. It still remains open to the threat of these distributed servers colluding to infer what you retrieved.
Another method is garbled circuits, where the server doesn't store the data in its plain form. Instead, for each data element, the server creates two garbled values. These garbled values are essentially random encodings that cloak the actual data. One encoding corresponds to the real data element, while the other acts as a dummy value. Crucially, the server itself doesn't know which garbled value represents the real data. The client uses a special key to create a garbled query that interacts with the garbled circuit. The server processes these queries within the circuit, evaluating both the real and dummy values without knowing which is which. The client then uses its key to decrypt the results, filtering out the dummy values and revealing only the actual data that matches the client's query.

Huawei IT Products & Solutions 1 年前

Domain Name System, Simplified!

Rajesh Dangi 2 年前

Route Server.

Theophilus Bittok 7 个月前

How far has this tech gotten?

The latest developments on PIR have astonished the cryptography world. Alexandra Henzinger and her teammates have established protocols called SimplePIR & DoublePIR that work with a single server (thus eradicating the need to trust multiple servers not to collude), reduces the client "hint" to be downloaded before querying a 1GB database upto 16MB, has limited network costs to <240KB per query, can achieve a throughput of 10GB/s/core which is pretty close to the theoretical limit considering memory bandwidth of a machine.

The reference paper also compares all previously proposed PIR schemes like SealPIR, OnionPIR, FastPIR, SpiralStream, XOR PIR on communication bandwidth, throughput and client hint size. Check it out.

Where can we apply PIR in the real world?

PIR has the potential to turn competitors into collaborators. Getting private access to confidential data with no risk of leaking querying parameters sounds like a pretty sweet deal.

A few examples where this could be helpful:

Cybersecurity: Tech companies querying another company's threat intelligence data to identify potential vulnerabilities without revealing their own security gaps.
Finance: Investment firms querying transaction databases to analyse market trends without exposing their specific interests or strategies.
Journalism: Journalists querying a database for sensitive political information without revealing the subject of their investigation, ensuring the safety and confidentiality of their sources.

Most businesses have implemented ML algorithms to optimise their decision making. However, they have limited 1st Party training data and that limits how accurately a ML algorithm can perform. More often than not, complimentary datasets that can give you incremental goodness sit with your competitors that operate in the same/ adjacent space.

A cryptographic protocol that can turn Enemies into Frenemies, or even Friends? Do we need more validation that advancement of tech is better for the society?

#data #privacy #cryptography #PETs #cleanrooms

How to keep the servers blind?

Hardik Katyarmal

Building LattIQ | Ex-CRED, Flipkart

What is Private Information Retrieval (PIR)?

Intelligent ways of implementing PIR

领英推荐

How far has this tech gotten?

Where can we apply PIR in the real world?

更多精彩文章

社区洞察

其他会员也浏览了

ECC Memory: The Unsung Hero of Server Stability ?????

Consistent Hashing

Don’t just run DNS, run it FAST

1 Gbps uplink speed now available countrywide

Optimizing HTTP/2 server push

Accelerated your secure server with TLS1.3 Server 10Gbps IP Core (TLS10GS-IP)

What is DNS?

Azure App Service - Backup and Restore over Azure Virtual Network

The Damning Experience with IBM Cloud Servers: A Saga of Negligence and Incompetence

What is Private Information Retrieval (PIR)?

Intelligent ways of implementing PIR

领英推荐

How far has this tech gotten?

Where can we apply PIR in the real world?

Not your average movie review: The Arrival

2024年8月26日

Randomness to the Rescue

2024年6月11日

Footprints of a Digital Traveller

2024年3月21日

Privacy: Does India really care?

2023年6月23日

The Last Bite: Farewell to 3P cookies

2023年6月21日

The Market of Attention

2023年6月16日

社区洞察

其他会员也浏览了

ECC Memory: The Unsung Hero of Server Stability ?????

Consistent Hashing

Don’t just run DNS, run it FAST

1 Gbps uplink speed now available countrywide

Optimizing HTTP/2 server push

Accelerated your secure server with TLS1.3 Server 10Gbps IP Core (TLS10GS-IP)

What is DNS?

Azure App Service - Backup and Restore over Azure Virtual Network

The Damning Experience with IBM Cloud Servers: A Saga of Negligence and Incompetence