How Big Data was created, or why we have neither privacy nor transparency.

Why are we currently in a situation where the lack of privacy and of transparency has become a central legal issue? Obviously, it is because of rapid technological development, but for our discussion of transparency, privacy and profiling it is worth digging a bit deeper.

To understand a bit better how technology has changed our world so radically in recent years, we shall briefly review the two major technological developments that made this transformation possible:

One is the development of computer hardware; the other is the development of software that enables many computers to work as one.

But to understand Big Data and these two technological milestones, we must first talk about Moore's Law, named after Intel co-founder Gordon Moore and based on a prediction he made in 1965 in an article for Electronics magazine titled 'Cramming More Components onto Integrated Circuits'. He stated:

The complexity for minimum component costs has increased at a rate of roughly a factor of two per year. Certainly, over the short term, this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. 

The prediction was later popularized with 18 months as the doubling period for general computing power. Moore's Law explains, in part, the sustained exponential growth behind the Big Data era.

This kind of growth implies numbers that quickly become enormous, something Ray Kurzweil illustrates in his book "The Age of Spiritual Machines: When Computers Exceed Human Intelligence" through the story of the inventor of chess in India.


When the inventor of chess presented his game to the emperor, the emperor was so impressed that he told the inventor to name any reward he wanted.

The inventor only wanted rice to feed his family, and he used the chessboard to show how much rice he would like: one grain of rice on the first square, two on the second, four on the third, eight on the fourth, doubling the amount on each square until the last square of the board.

On the first half of the chessboard, the human brain can still imagine the number of rice grains, but on the second half the numbers become too big to grasp: trillions, quadrillions and quintillions.

By the time the doubling reaches the last square, the board holds more than 18 quintillion grains of rice, more rice than has been produced in the entire history of the world.
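
As a quick check of the arithmetic, here is a minimal Python sketch of the chessboard doubling (it assumes only the standard 64-square board):

```python
# A minimal sketch of the chessboard arithmetic: square n holds 2**(n - 1)
# grains, so the totals are exact powers of two.
last_square = 2 ** 63          # grains on square 64
total = 2 ** 64 - 1            # 1 + 2 + 4 + ... + 2**63

print(f"Grains on the last square: {last_square:,}")  # 9,223,372,036,854,775,808
print(f"Grains on the whole board: {total:,}")        # 18,446,744,073,709,551,615
```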

Moore's Law was formulated in 1965, and roughly 18 months was predicted as the doubling time for the number of transistors in use.


Moore's Law shows exponential growth in transistors, with a doubling approximately every 18 months, and how the price per transistor has fallen over the same period. See Hutcheson, D. (2015) 'Graphic: transistor production has reached astronomical scales', available at: https://spectrum.ieee.org/computing/hardware/transistor-production-has-reached-astronomical-scales (accessed 12th December, 2018). Source: VLSI Research.

After 32 of these doublings since 1965, we are now on the second half of the chessboard. From this point on, it became possible to digitize almost everything, and the immense number of computers enabled us to store all of this new data.
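
To give a sense of scale, here is a rough back-of-the-envelope sketch, assuming the popularized 18-month doubling period:

```python
# A rough illustration: 32 doublings at roughly 18 months each take about
# 48 years and multiply capacity by 2**32.
doublings = 32
years = doublings * 1.5
growth_factor = 2 ** doublings

print(f"{doublings} doublings take about {years:.0f} years")  # 48 years
print(f"Cumulative growth factor: {growth_factor:,}")         # 4,294,967,296
```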

But there was a challenge: how could we access and manipulate data stored across many different computers? We needed 'the cloud'.

The era of big data computing started in 2007, when it became widely possible to 'upload data to the cloud' because effective software for distributed storage and processing became available, allowing thousands of computers to work as one.

In 2003, Google published a paper describing a foundational innovation called the Google File System (GFS). This software allowed Google to access and manage huge amounts of data across thousands of computers.

At this time, Google's main goal was to organize all the world's information through its search engine. However, it could not do that without its second foundational innovation, MapReduce, published in 2004.

These two innovations allowed Google to process and explore huge quantities of data in a manageable way.
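
To make the idea concrete, here is a toy, single-machine sketch of the MapReduce pattern (a word count, the classic example). The function names `map_step` and `reduce_step` are illustrative only and are not part of Google's or Hadoop's actual API:

```python
from collections import defaultdict

# A map step turns each input record into (key, value) pairs, and a reduce
# step aggregates all values that share a key. In a real cluster, both steps
# run in parallel across thousands of machines.

def map_step(document):
    """Emit (word, 1) for every word in a document."""
    return [(word, 1) for word in document.split()]

def reduce_step(pairs):
    """Sum the counts for each word."""
    counts = defaultdict(int)
    for word, count in pairs:
        counts[word] += count
    return dict(counts)

documents = ["big data big platforms", "big data analytics"]
all_pairs = [pair for doc in documents for pair in map_step(doc)]
print(reduce_step(all_pairs))
# {'big': 3, 'data': 2, 'platforms': 1, 'analytics': 1}
```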

Google shared these two foundational innovations with the open-source community so that it could build on their insights. Even better, the community improved on the software and, as a result, Hadoop was created in 2006.

Hadoop is open-source software that allows hundreds of thousands of computers to work as one giant computer. It was developed by Mike Cafarella and Doug Cutting.


Facebook, LinkedIn, and Twitter already existed in 2006, and they started building on Hadoop straight away. This is why these platform companies became global in 2007.

With Hadoop, easily accessible storage capacity exploded, making 'big data' available to all. Thanks to Hadoop, internet platforms could store all their data across many computers while still being able to access it. Furthermore, they could store every click of every user on every web page. This gave them a much better understanding of what users were doing over time, thus providing the basis for Big Data Analytics.
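
As a hypothetical illustration of why keeping every click matters, the sketch below aggregates a few invented click events per user; the field names are made up and do not reflect any platform's real schema:

```python
from collections import Counter

# Once raw click events are kept rather than discarded, even a simple
# aggregation starts to reveal behavior over time.
clicks = [
    {"user": "alice", "page": "/shoes", "timestamp": "2007-06-29T10:00:00"},
    {"user": "alice", "page": "/shoes", "timestamp": "2007-06-30T09:12:00"},
    {"user": "bob",   "page": "/books", "timestamp": "2007-06-30T11:45:00"},
]

# Pages each user keeps returning to: the raw material of a behavioral profile.
pages_per_user = Counter((c["user"], c["page"]) for c in clicks)
print(pages_per_user)  # Counter({('alice', '/shoes'): 2, ('bob', '/books'): 1})
```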

Thanks in part to Hadoop, other companies were born in 2007, including Airbnb. That same year Amazon launched the Kindle and the first iPhone was released. According to AT&T, mobile data traffic on its national wireless network increased by more than 100,000% from January 2007 to December 2014.

The year 2007 was a turning point for the global economy, paving the way for a new category of companies that reshaped how people and machines communicate, create, collaborate and think.

Since 2007, the big internet platform companies have been able to store all their data in one place and, through Big Data Analytics, gain a deeper knowledge of the market than traditional companies.

Furthermore, more customers on a platform mean a better service (e.g. social media or Airbnb), a network effect that favors larger platforms, which within a few years can act as monopolies owing to their market dominance.

For users, the main consequence was the benefit of a number of new services, but at the same time a total loss of control over their personal data and the possibility of being analyzed and profiled through big data analytics and machine learning algorithms.

This means that company decisions started to be made via automated decision-making processes based on the profiling of individuals and groups. In some cases, this was for advertising purposes; in other cases, these automated decisions could become life-changing owing to biased results and discrimination.

In still other cases, owing to the lack of privacy and to market domination, these automated decision-making processes can distort fair markets (through price tailoring) as well as fair elections (the Cambridge Analytica case).

There are many examples of assessments being made based on online automated decision-making processes. For example, Facebook can predict your political views, your race, religion, and sexual orientation, and even predict when you are going to die. Facebook can predict individual future behaviors, allowing third parties to target these individuals with advertisements that can change their decisions entirely. Facebook calls it ‘improved marketing efficiency’.

Another example is Amazon's Alexa 'Hunches' feature, which predicts future needs based on a user's behavior in order to make suggestions, and can even infer a user's health status by analyzing their voice and coughing, followed by advertisements for sore-throat products. Insurance companies are also collecting data from social networks to predict how much users' healthcare could cost them.

The central question is whether a data subject has real control over her personal data and whether the General Data Protection Regulation (GDPR) protects her from these inherent risks.

The bad news: even if profiles are personal data, that does not mean individuals have full access to them, and it is not possible to rectify any of the assessments.

Data subjects' transparency rights are described in Articles 13–15 GDPR. The right to be informed (Articles 13–14 GDPR) is a data controller's duty and covers data provided directly by the data subject, observed data and data obtained from third parties. The right of access (Article 15 GDPR), by contrast, has to be exercised by the data subject.

Does Article 15 GDPR allow individuals access to their profiles? The phrase 'envisaged consequences' suggests that the data controller has to explain to the data subject the consequences of the automated decision-making before the data are processed. And, in the absence of an explicit time frame for exercising the right, the right of access is limited to explanations of system functionality. This is an ex-ante explanation.

We can access only input data, not output data (profiles). If we look at the diagram below, we can only access up to point C).

[Diagram: stages of an automated decision-making process, from input data up to point C) and on to point F), the resulting profiles]

From point C) to point F), companies are protected by other legislation, such as the Trade Secrets Directive, and are allowed to be opaque. If they decide to be transparent at these points, it is an ethical decision.
