Decentralized AI and Data Privacy
Networked technologies make modern life possible, and data plays a crucial part in making those technologies work together. Organizations use networked technologies and data to provide us with products and services, but the data they use is ours. Every time we interact with these organizations, we leave a digital footprint of private data behind. To build and maintain consumers’ trust in new technologies like AI and Machine Learning, it is critical that organizations give that private data the right level of protection.
AI is becoming pervasive, helping devices like mobile phones, tablets, and computers behave more intelligently. AI-enabled devices understand our text or voice commands and take care of mundane, repetitive activities for us, adding convenience and efficiency to our lives. AI, especially involving neural networks, is good at clustering, classification, and pattern recognition. Use cases that are ideal for neural network processing include computer vision, speech recognition, natural language processing (NLP), optical character recognition, and recommendation engines.
All of these use cases employ artificial neural networks through AI models, more specifically Machine Learning (ML) models that are trained using annotated or labeled data. These ML models have one thing in common: they follow a centralized AI paradigm, in which data is uploaded to a central server and models are trained and run against that data. This paradigm has two issues that are getting a lot of attention these days from both consumers and lawmakers, and for good reason: data privacy and data security. We have all heard about the many data breaches of the past few years; in 2020 alone, dozens of major breaches were reported across almost every industry:
Some of the hacking targets were prominent, well-known brands: an audio streaming service, a healthcare provider, a health benefit management company, a camera manufacturer, a popular online game for kids, a hotel reservation platform, an immigration law firm, a national BBQ chain, and a video game publisher, to name a few.
Here are some examples of the types of data that were exposed:
- In one of the healthcare-provider cyberattacks, patient names, addresses, dental diagnoses and treatment information, patient account numbers, billing information, bank account numbers, the name of the patient’s dentist, and health insurance information were exposed
- Another attack on a health benefits provider disclosed names, addresses, dates of birth, phone numbers, email addresses, vision insurance account/identification numbers, health insurance account/identification numbers, Medicaid or Medicare numbers, driver’s license numbers, and birth or marriage certificates
- A home meal delivery service was also breached, and the stolen user data was put up for sale on the dark web. The data included names, email addresses, phone numbers, scrambled passwords, and the last four digits of credit card numbers.
None of this inspires confidence in letting someone else manage our data privacy and security for us. These stories are no longer rare or sporadic events; they have become commonplace despite the best attempts by companies and social platforms to safeguard our personal data. We willingly share our personal data with the organizations we interact with for convenience, and we trust them to protect our privacy and our data. With breaches becoming almost regular occurrences, that trust has taken a major hit and is eroding fast, even as the adoption of predictive AI and personalization accelerates in the name of better customer experience.
Realizing that privacy protections are critical to maintaining consumer trust in technologies like AI, companies have begun to take the necessary steps. Organizations and AI experts are developing more robust approaches to data security and protection, aiming to increase customer trust while safeguarding the data-driven innovations they depend on. Scrutiny from customers and governments over how businesses manage and use our data has increased and will continue to grow. Some governments have already passed laws that give consumers the power to exercise their privacy rights and control their personal data.
In the United States, California passed the California Consumer Privacy Act (CCPA) in 2018. Under the law, as a California resident and consumer, you have the right to ask businesses to disclose what personal information they have about you and what they do with it, to delete your personal information, and to direct them not to sell your personal information. You also have the right to be notified, before or at the point businesses collect your personal information, of the types of personal information they are collecting and what they may do with that information. [1]
In the European Union, the General Data Protection Regulation (GDPR) also covers consumers’ data privacy rights [2]:
- Transparency about how data is being used
- Access to personal information if the owner asks for it
- The ability to request that data be deleted or corrected for accuracy
- The right to object to data processing and restrict processing
- The right to have their data provided in a standard format that can be transferred elsewhere
Model learning does not have to be centralized, as it largely is today, with major cloud providers and social network platforms collecting data in a central location and running their ML models against it. In fact, Google introduced a decentralized approach called Federated Learning in 2016, in a paper titled "Communication-Efficient Learning of Deep Networks from Decentralized Data". In federated learning, data from devices like mobile phones and tablets is not uploaded to a remote server; it stays locally on the device, preserving the user’s data privacy. ML algorithms run locally on the device, and only the resulting model updates are sent to the server.
Federated Learning makes it possible to gain insight from highly sensitive data, such as people’s text messages or their medication history, because the primary objective is aggregate statistical insight rather than access to the raw data.
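To make the idea concrete, here is a minimal, illustrative sketch of federated averaging in Python using NumPy and synthetic data. The client datasets, the model (a simple linear regression), and the round counts are assumptions for illustration, not Google’s production implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Train on one device's private data; only the updated weights ever leave the device."""
    w = weights.copy()
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

# Hypothetical private datasets held by three devices (never uploaded to the server).
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(3)]

global_w = np.zeros(3)
for _ in range(10):  # federated training rounds
    # Each client trains locally on its own data and returns only a model update.
    updates = [local_update(global_w, X, y) for X, y in clients]
    # Federated averaging: the server aggregates the updates into a new global model.
    global_w = np.mean(updates, axis=0)

print("Global model after federated averaging:", global_w)
```

The key point of the sketch is that the server only ever sees the averaged weight vectors, never the raw `(X, y)` data held on each device.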
In 2017, Apple, with its focus on privacy, unveiled a new machine learning framework for developers called Core ML. It supports several essential ML tools, including all types of neural networks (deep, recurrent, and convolutional), linear models, and tree ensembles. According to Apple’s documentation, Core ML optimizes on-device performance by leveraging the CPU, GPU, and Neural Engine while minimizing memory footprint and power consumption. Running a model strictly on the user’s device removes any need for a network connection, which helps keep the user’s data private and the developer’s app responsive.
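As a rough sketch of this on-device workflow, the snippet below converts a small toy PyTorch model to the Core ML format with the coremltools Python package; the model architecture, input shape, and file name are illustrative assumptions, not Apple’s sample code. An app would bundle the resulting model and run predictions entirely on the device.

```python
import torch
import coremltools as ct

# A toy image classifier standing in for a real, trained model (illustrative only).
torch_model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 8, kernel_size=3, stride=2),
    torch.nn.ReLU(),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
    torch.nn.Linear(8, 10),
)
torch_model.eval()

# Core ML conversion works from a traced (or scripted) model plus the input shape.
example_input = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(torch_model, example_input)
mlmodel = ct.convert(traced, inputs=[ct.TensorType(shape=example_input.shape)])

# The saved model can be added to an Xcode project and evaluated on-device,
# so the user's images never need to leave the phone.
mlmodel.save("ToyClassifier.mlmodel")
```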
Another option being explored is the integration of Blockchain and AI technologies. Blockchain can be used to share models and data. It offers encrypted data storage on a decentralized system and uses a secured, protected database that only authorized users can access. It keeps a tamper-proof audit trail of all transactions (end-to-end traceability). In addition to decentralization, it could bring transparency and verifiability to the mix. However, Blockchain comes with technical challenges: transactions and queries are slow, and scaling to large amounts of data is also a problem. These realities will need to be considered in designing any Blockchain and AI integration.
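The tamper-evident audit trail is the easiest of these properties to illustrate. The Python sketch below is a simplified stand-in, not a full blockchain (which adds consensus, replication, and access control on top); it chains records with SHA-256 hashes so that altering any earlier record breaks every link that follows. The event names and fields are hypothetical.

```python
import hashlib
import json
import time

def make_block(payload, prev_hash):
    """Create a record whose hash commits to its payload and to the previous record's hash."""
    block = {"timestamp": time.time(), "payload": payload, "prev_hash": prev_hash}
    body = json.dumps(block, sort_keys=True).encode()
    block["hash"] = hashlib.sha256(body).hexdigest()
    return block

def verify(chain):
    """Recompute every hash; tampering with any earlier record invalidates the chain."""
    for i, block in enumerate(chain):
        body = {k: v for k, v in block.items() if k != "hash"}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if digest != block["hash"]:
            return False
        if i > 0 and block["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

# Hypothetical audit trail of model and data access events.
chain = [make_block({"event": "model_v1_registered"}, prev_hash="0" * 64)]
chain.append(make_block({"event": "dataset_access", "user": "analyst_42"}, chain[-1]["hash"]))

print("Audit trail intact:", verify(chain))        # True
chain[0]["payload"]["event"] = "model_v1_deleted"  # simulate tampering
print("Audit trail intact:", verify(chain))        # False
```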
References:
[1] https://oag.ca.gov/privacy/ccpa
[2] https://gdpr.eu/consumers-gdpr-data-privacy-rights/
Disclaimer: The views, thoughts, and opinions expressed in this article belong solely to the author, and not necessarily to the author’s employer.