Dealing with User Profiling while Preserving User Privacy and Security | Solving the User Security Problem with Web3 and Related Tech: Part 4
In this series of articles, we are looking into how various new-age technologies can increase user security online.
Now, we discussed various types of security challenges in the first part and how companies can minimize data collection in the second part.
In the third part, we discussed how the emerging technologies mentioned in the second part can help protect user identity in the digital ecosystem.
In this part, we discuss user profiling and how emerging technologies can limit its adverse impact on users from both a privacy and a security perspective.
The Patel v. Facebook lawsuit was filed in 2015 in the U.S. District Court for the Northern District of California. The case was brought on behalf of Facebook users in Illinois; one of the original plaintiffs was a user named Carlo Licata.
The lawsuit alleged that Facebook's "Tag Suggestions" feature, which utilized facial recognition technology to automatically identify and tag individuals in photos, violated the Illinois Biometric Information Privacy Act (BIPA). The plaintiffs argued that Facebook collected and stored biometric data without obtaining the required informed consent from users.
Let us understand the background a little bit. In 2010, Facebook implemented a new feature, Tag Suggestions, that used facial recognition technology to identify individuals in photographs uploaded by a user who had enabled the function. This feature allowed for automatic tagging of other users in uploaded photographs. It went beyond the original manual tagging, which let a user identify other users in their photographs with a link to those users' profiles.
Now, I am pretty sure that most readers here came across this feature but may not be aware of what Facebook did with the tagging behind the scenes. Facebook scanned images when they were uploaded and extracted the geometric data points that make a face unique, such as the distances between the eyes, nose, and ears, to create a face signature or map. It then compared that map to templates of users' faces saved in its databases and suggested tagging the matching user in the photograph. So, basically, Facebook created a facial biometric database.
Now, coming back to the lawsuit.
Later, the court granted class certification, allowing the case to proceed as a class-action lawsuit. This meant that the lawsuit represented a class of Facebook users in Illinois who were similarly affected by the alleged BIPA violations.
Facebook defended its practices, asserting that its facial recognition technology provided users with valuable features, such as automatic tagging, and that the company had not violated BIPA. The defense also argued that users had the ability to turn off the facial recognition feature if they wished.
Eventually, Facebook settled the lawsuit, initially agreeing to pay $550 million in early 2020, an amount later raised to $650 million. The settlement covered approximately 1.6 million Illinois Facebook users who could claim a share of the settlement fund.
Okay, good story.
Now, I have a question.
Why does a Social Media company need to create a biometric database of its users?
They do not.
If we consider the kind of services social media and related platforms provide, they absolutely do not have any need to collect and store biometric data.
It is not that we are storing our digital assets with social media platforms or that social media platforms are giving us access to the magical lands of Narnia.
I mean, with all that data collection, social media has yet to solve the fake account and bot problem. Then why do they collect all the data?
User Profiling.
So, what is user profiling?
User profiling involves the systematic collection, analysis, and interpretation of data to create a detailed representation of an individual's behavior, preferences, and characteristics.
Now, one important point about user profiling is that the ultimate goal is not understanding user behavior but rather predicting and influencing user behavior.
This is what makes user profiling so dangerous.
If user profiling gives these platforms the ability to "nudge" us to click on an advertisement, does it not also give them the ability to "nudge" us to vote for a specific political party or hate a particular community?
This is where the Cambridge Analytica scandal comes in.
I am not writing about the Cambridge Analytica scandal for the first time, and this will not be the last time given how important this event was in exposing the menace of user profiling.
The Cambridge Analytica scandal, which emerged in 2018, revealed that the political consulting firm Cambridge Analytica had improperly obtained and exploited the personal data of millions of Facebook users. The company harvested this data through a third-party app that claimed to be a personality quiz, allowing them to access not only the information of users who took the quiz but also that of their friends.
The data was then allegedly used to create targeted political advertisements during the 2016 U.S. presidential election, the Brexit referendum in the United Kingdom, and a few other large-scale political events.
It was reported that Cambridge Analytica utilized the psychographic profiles derived from the Facebook data to identify potential voters, understand their preferences, and tailor political messages to influence their decision-making.
So, we now know that unchecked user profiling can be used for large-scale mass manipulation.
And it is not only about that: large-scale user profiling by digital platforms is fast becoming a surveillance tool, even when the platforms do not intend it that way.
Digital platforms, especially social media, have all the qualities of an effective surveillance tool:
- Comprehensive data on users enables precise user profiling.
- User tracking mechanisms, such as cookies, device identifiers, and tracking pixels, allow platforms to follow users across various online activities.
- The combination of data from multiple sources facilitates granular tracking, creating a detailed map of users' digital footprints and behaviors.
- Through machine learning and artificial intelligence, these platforms can make predictions about users' preferences, interests, and future actions. This behavioral analysis adds a predictive dimension to surveillance, allowing platforms to anticipate user actions.
- Most platforms collect location data, either explicitly through user consent or implicitly through IP addresses and device sensors. The continuous monitoring of users' locations contributes to the creation of detailed movement profiles, turning platforms into location-based surveillance tools.
- Platforms that facilitate social interactions and connections also indirectly create detailed social network maps. Analyzing these connections provides insights into users' social circles, affiliations, and relationships, contributing to a form of social surveillance.
So, these platforms can track where you are and what you are doing, and with significant precision, predict what you will do in the future.
Now, take some time to deliberate on the implications.
What kinds of mass manipulation are enabled by large-scale user profiling: hate-mongering toward specific communities, limiting access or reach for people from specific communities or who hold specific socio-political views, the rise of authoritarian regimes?
What is the limit?
Okay, how can emerging technologies help us limit user profiling, and where profiling is unavoidable, preserve user privacy and limit the sharing of sensitive personal data?
User Profiling While Prioritizing User Privacy and Security
So,
What are the problems with the current way of user profiling?
Unsafe Centralization of Data
User profiling generally involves centralized storage of user data, raising the risk of data breaches. Data breaches can have severe consequences, exposing personal details, financial information, and other sensitive data. The compromised data may then be sold on the dark web or used for various malicious purposes.
Apart from that, centralized data repositories increase the risk of mass surveillance by both governmental and non-governmental entities. The aggregation of extensive user profiles in one location can facilitate unwarranted monitoring and tracking of individuals, potentially infringing on civil liberties and privacy rights.
Lack of clear and discrete permissions
Another issue with the current system is that users are not asked for discrete permissions for data sharing.
Platforms often request broad permissions from users for data collection to enable a range of features and services. Users are often presented with lengthy and complex terms of service and privacy policies, making it challenging for them to fully understand the extent of data collection and how their information will be used. Broad, vague language in these agreements can allow platforms to collect more data than users might reasonably expect.
Broad permissions can enable platforms to repurpose user data for purposes beyond the original intent. This might involve sharing data with third-party advertisers, conducting market research, or building comprehensive user profiles. Users may not be aware of these secondary uses when granting permissions.
In short, vague permissions allow platforms to collect much more data than users think they are sharing and use the data beyond the scope users think they are allowing.
But vagueness is not limited just to scope.
Perpetual access to data
In the current system, users simply do not have any control over their data after it is shared with platforms; that is, the sharing is perpetual.
Once users share their data on digital platforms, it can become permanent, residing in databases and archives even if the user deletes the original content. There are cases where user data is kept even after users delete their profiles.
Many digital platforms engage in data sharing with third-party entities, such as advertisers, business partners, or other service providers. Once data is shared with these external parties, users often have limited control over how it is used or whether it is further shared with other entities.
Moreover, platforms may update their terms of service or privacy policies over time. Users may inadvertently agree to new terms that allow broader data usage without fully understanding the implications. This can result in a shift of control over shared data without clear and explicit consent from users.
To summarize, in the current system users cannot retract their permissions, and even if they could, it might not have much effect, as there is no way to track the data once it is shared.
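How could discrete, time-bound, revocable consent look in practice? Here is a minimal, hypothetical sketch in Python. The record structure and every field name are my own illustrative assumptions, not any platform's actual API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical consent record: one data field, one declared purpose,
# an expiry date, and a revocation flag -- in contrast to today's
# broad, perpetual grants.
@dataclass
class ConsentGrant:
    user_id: str
    data_field: str        # one field, not "all your data"
    purpose: str           # one declared use, not "and our partners"
    expires_at: datetime   # consent lapses instead of living forever
    revoked: bool = False

    def is_valid(self, purpose: str) -> bool:
        # every access is checked against scope, expiry, and revocation
        return (not self.revoked
                and purpose == self.purpose
                and datetime.now(timezone.utc) < self.expires_at)

grant = ConsentGrant("alice", "location", "route_suggestions",
                     datetime.now(timezone.utc) + timedelta(days=30))
assert grant.is_valid("route_suggestions")       # in-scope use passes
assert not grant.is_valid("ad_targeting")        # out-of-scope use fails
grant.revoked = True
assert not grant.is_valid("route_suggestions")   # retraction actually works
```

The point of the sketch is the check itself: access is denied by default unless the purpose matches, the grant has not expired, and it has not been revoked.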
No economic cost for data collection
Now, the lack of significant economic implications for platforms engaging in data collection and user profiling has contributed to unchecked practices and the potential misuse of data.
Many platforms operate on advertising-driven revenue models where user data serves as a valuable commodity. The more comprehensive and detailed the user profiles, the more targeted and effective advertising or product placements can be. This financial incentive encourages platforms to collect and utilize user data extensively without significant economic consequences for doing so.
Some platforms may structure their operations in a way that limits their liability for the misuse of user data. This might involve complex contractual arrangements, terms of service agreements, or legal disclaimers that shield the platform from significant legal and financial consequences, even in cases of data breaches or privacy violations.
The point is that the current system has an imbalance: platforms see economic upside in unlimited data collection, but they do not have to pay anything for collecting the data. Also, infrastructure cost is a fixed cost and does not depend on the scope of data use.
In short, platforms have economic incentives to collect as much data as possible for user profiling, and then even stronger incentives to use that data for any purpose they deem profitable.
So, there needs to be a system where platforms face economic costs not only for data collection, but costs linked to what data is being collected, the duration of data use, and the scope of data use.
IMHO - fines are not enough, and consent is not enough. Unless there is a cost associated with data collection and use, it is quite impossible to curb unchecked user profiling.
Use of Emerging Tech to protect users
In the second article in this series, we talked about various available technologies that can enable privacy preservation and online security for users.
We mentioned Zero-Knowledge Proofs (ZKPs), Decentralized Identity Solutions, Federated Learning, Homomorphic Encryption, Differential Privacy, and Secure Multi-Party Computation (SMPC).
Please refer to that article to get a feel for these technologies.
Let us see how these technologies can help in the present context.
Now, the first problem we discussed is the unsafe centralization of data.
One of the ways of avoiding centralized collection of data for user profiling is federated learning. The primary goal of federated learning is to enable model training without requiring raw data to be centralized. Federated learning aims to preserve privacy by allowing devices to train models locally on their data and only share model updates or aggregated information with a central server.
Now, federated learning can be adapted for targeted advertising or targeted product placement without centralizing data, avoiding all the issues linked with data centralization.
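To make this concrete, here is a toy sketch of federated averaging (FedAvg) in plain Python. The one-parameter "model" and the device data are invented for illustration; real systems use frameworks such as TensorFlow Federated, but the privacy property is the same: only model parameters leave the device, never the raw data.

```python
# Toy federated averaging (FedAvg): each "device" fits a shared
# 1-parameter model on its own data locally; the server only ever
# sees parameter updates, never the raw interaction logs.

def local_update(local_data, global_param, lr=0.5):
    # one gradient step of a squared-error model toward the local mean
    grad = sum(global_param - x for x in local_data) / len(local_data)
    return global_param - lr * grad

def federated_round(devices, global_param):
    # server averages the locally computed parameters only
    updates = [local_update(d, global_param) for d in devices]
    return sum(updates) / len(updates)

# raw data stays on each device (hypothetical click-rate samples)
devices = [[0.1, 0.2, 0.15], [0.8, 0.9], [0.4, 0.5, 0.45]]
param = 0.0
for _ in range(50):
    param = federated_round(devices, param)

# param converges to the average of the local means, learned without
# any device ever uploading its samples
```

The same pattern scales to neural networks: gradients or weight deltas are aggregated centrally while training data stays on users' devices.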
Now, there can be scenarios where data centralization cannot be avoided. In these situations, homomorphic encryption can prove to be useful. Homomorphic encryption allows the processing of encrypted data without decrypting it. This means that even if data is stored in a centralized location, it is meaningless without the decryption keys. This can help preserve user privacy in cases of data breaches and unauthorized access.
Also, homomorphic encryption can be useful when sharing data with third parties for further processing.
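As a toy illustration of computing on encrypted data, here is a minimal Paillier-style additively homomorphic sketch: multiplying two ciphertexts yields a ciphertext of the *sum* of the plaintexts, so a server can aggregate values it cannot read. The tiny primes are for readability only; a real deployment would use at least 2048-bit moduli and a vetted library.

```python
import math
import random

def L(x, n):
    return (x - 1) // n

# toy key generation (never use primes this small in practice)
p, q = 2357, 2551
n = p * q
n2 = n * n
g = n + 1
lam = math.lcm(p - 1, q - 1)
mu = pow(lam, -1, n)   # valid simplification when g = n + 1

def encrypt(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2), n) * mu) % n

a, b = encrypt(30), encrypt(12)
# additive homomorphism: multiply ciphertexts, plaintexts add
assert decrypt((a * b) % n2) == 42
```

A profiling server could, for example, sum encrypted engagement counters across users and return only the encrypted total, with decryption keys held elsewhere.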
Okay, now we need to cover two more issues - limiting the use of data and the economic perspective.
Now, these issues require tracking individual pieces of data, from collection to use for different purposes.
For that, we can implement an NFT-based system where each piece of data is linked to individual NFTs and each transfer of the piece of data involves a blockchain transaction.
This way, an economic cost is associated not only with acquiring the data but also with every transfer of it.
Yes, due to network fees, this kind of system is expensive, but it should be. Neither the acquisition of user data nor its transfer should be free, IMHO.
Let us end our discussion here.
Find out more about Sam:
"User profiling is a serious concern in today's digital landscape. Cost implications could be a game-changer in minimizing its adverse effects." - Sam Ghosh