Hitchhiker's Guide To Privacy Engineering Chapter 5: How Websites Work?
Mert Can Boyar
Founder at Compliance Detective | Director at Bilgi University Privacy Innovation Lab | Author of Hitchhiker's Guide to Privacy Engineering
Author's Salute - Community Update #5
Hello dear reader, welcome to the fifth issue of my "creative privacy" passion project, where we try to discover experiences that harmonize creativity and privacy.
Last week, I was a guest on the Shifting Privacy Left Podcast, where we discussed #creativeprivacy and HGPE. You can check out the podcast episode here.
If you wonder what happened to Red's brother during the last episode, jump into the new chapter to discover how the story unfolds in Chapter V: The Approaching Darkness.
The website of HGPE is live: a single source of truth where you can read about privacy engineering, creative privacy, creative AI, comic book storylines, reading episodes, videos and soundtracks.
What you will learn?
Understanding the architecture and the technologies used in websites is mandatory for privacy engineers. This chapter helps you get the full picture of data processing on modern websites.
This chapter will consist of three parts, and the first part will take a look at these:
- HTML
- CSS
- JavaScript
- Web Server
- Browser
- HTTP
- Email Server
You can check out the first part of this chapter here.
How do websites work?
For you to see the Hitchhiker's Guide to Privacy Engineering, the content hosted on the GitLab server is sent over the Internet piece by piece, in the form of several thousand data packets.
Data packets traveled over cables and radio waves and through routers and switches from our web server to your computer or device.
Days passed, and Red was not convinced that her brother, Ray, was dead. She knew he was out there, somewhere in the galaxy. “For the first time in my life in the Academy, I am reluctant to attend any of the classes. They are wasting my time and I need to find out what happened. I need to find a way to talk to the Grandmaster. Now.” - Journals of Order of Epoch, 2341 Anno Domini
Your computer or smartphone received those packets and passed them to your device's browser, and your browser interpreted the data within the packets in order to display the text you are reading now.
Here is what goes on under the hood, in milliseconds, between your browser and the HGPE server after you open this website:
DNS query
When your browser started to load this webpage, it likely first made a DNS query to find out this website's IP address.
TCP handshake
Your browser opened a connection with that IP address.
TLS handshake
Your browser also set up encryption between the Hitchhiker's Guide to Privacy Engineering web server and your device so that attackers cannot read the data packets that travel between those two endpoints.
HTTP request
Your browser requested the content that appears on this webpage.
There must be something that I am not seeing and I need to find out what. It's no mere luck that among the voices, however crazy it sounds, I can hear Ray’s voice as well. Master Talia believes that my mind is playing tricks on me due to my stress. But no, I don’t really feel that way. I don’t know how, but I am sure that it's him calling for help. Or is it a warning? - Journals of Order of Epoch, 2345 Anno Domini
HTTP response
Servers transmit the content in the form of HTML, CSS, and JavaScript code, broken up into a series of data packets. Once your device received the packets and verified it had received all of them, your browser interpreted the HTML, CSS, and JavaScript code contained in the packets to render this article about how the Internet works.
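The five steps above can be sketched in a few lines of JavaScript. This is only an illustration of the order of events, not how a browser is actually implemented: the function name describePageLoad, the step labels and the example address are assumptions made for this sketch.

```javascript
// Sketch: list the steps a browser goes through before it can show a page.
// describePageLoad and the step labels are illustrative names, not a browser API.
function describePageLoad(pageUrl) {
  const url = new URL(pageUrl);
  const secure = url.protocol === "https:";
  // url.port is an empty string when the scheme's default port is used.
  const port = url.port || (secure ? "443" : "80");
  const path = url.pathname + url.search;

  const steps = [
    `DNS query: resolve ${url.hostname} to an IP address`,
    `TCP handshake: open a connection to that IP on port ${port}`,
  ];
  if (secure) {
    steps.push(`TLS handshake: set up encryption with ${url.hostname}`);
  }
  steps.push(`HTTP request: GET ${path} HTTP/1.1 (Host: ${url.host})`);
  steps.push("HTTP response: receive HTML, CSS and JavaScript, then render");
  return steps;
}

// Hypothetical address, used only for illustration.
const steps = describePageLoad("https://hgpe.example/chapter-5?lang=en");
steps.forEach((step, i) => console.log(`${i + 1}. ${step}`));
```

Note that a plain http:// address skips the TLS step entirely, which is exactly the case where the packets can be read in transit.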
Now let's take a deeper look at how websites actually work.
HyperText Markup Language (HTML)
HTML (HyperText Markup Language) is the markup language used to create web pages. It tells the web browser how to structure and display text and images on a page.
HTML is a formal standard (long published as a World Wide Web Consortium (W3C) recommendation and today maintained by WHATWG as the HTML Living Standard) and is generally adhered to by all major web browsers, including both desktop and mobile web browsers.
In its most basic form, HTML uses tags to mark up the text on a web page. It is the core markup language of the web and is used to present text, images, audio, and video on a webpage.
“Alright, I know you more than anyone in the Academy. So you sneaking out in the middle of the night to go after the signals is a very high probability. I believe I just wanted to protect you, you know,” said Kyle. The anger on Red’s face seemed to dissolve into a more innocent and caring expression. “I hate it when I feel like you can read my mind. This was the first idea that occurred to me the second you told me that the signal was lost.” - Journals of Order of Epoch, 2341 Anno Domini
HTML for web pages has two main sections: the head and the body. The head contains information about the page, while the body contains the content.
Here's a basic example of a Login Page in HTML code:
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <title>Login Form</title>
    <link rel="stylesheet" type="text/css" href="css/style.css">
</head>
<body>
    <h2>Login Page</h2><br>
    <div class="login">
    <!-- POST keeps the credentials out of the URL, browser history and server logs -->
    <form id="login" method="post" action="login.php">
        <label for="Uname"><b>User Name</b></label>
        <input type="text" name="Uname" id="Uname" placeholder="Username">
        <br><br>
        <label for="Pass"><b>Password</b></label>
        <input type="password" name="Pass" id="Pass" placeholder="Password">
        <br><br>
        <!-- type="submit" so that the button actually sends the form -->
        <input type="submit" name="log" id="log" value="Log In Here">
        <br><br>
        <input type="checkbox" id="check">
        <span>Remember me</span>
        <br><br>
        Forgot <a href="#">Password</a>
    </form>
    </div>
</body>
</html>
HTML5 is the latest major version of the specification.
Cascading Style Sheets (CSS)
CSS stands for Cascading Style Sheets. It is used to style HTML documents.
CSS is a collection of formatting rules used to style and lay out web pages. You can use CSS to change the text's color, size, and font, and to add background colors and images. CSS can also create responsive layouts that adapt to different screen sizes.
After adding the CSS to our HTML code, the login page would look like this.
Here is the CSS code we used to make the login page we built with HTML more lively and colorful.
body {
    margin: 0;
    padding: 0;
    background-color: #0096FF;
    font-family: Arial, sans-serif;
}
.login {
    width: 382px;
    overflow: hidden;
    /* 20px above the box, centered horizontally */
    margin: 20px auto 0;
    padding: 80px;
    background: #0096FF;
    border-radius: 15px;
}
h2 {
    text-align: center;
    color: #FFFFFF;
    padding: 20px;
}
label {
    color: #FFFFFF;
    font-size: 17px;
}
/* The two text fields share the same look */
#Uname,
#Pass {
    width: 300px;
    height: 30px;
    border: none;
    border-radius: 3px;
    padding-left: 8px;
}
#log {
    width: 300px;
    height: 30px;
    border: none;
    border-radius: 17px;
    padding-left: 7px;
    color: blue;
}
span {
    color: white;
    font-size: 17px;
}
a {
    float: right;
    background-color: white;
}
Using CSS, you can control exactly how HTML elements look in the browser, presenting your markup using whatever design you like.
JavaScript
JavaScript can manipulate HTML elements and add interactivity to a page.
For instance, you can use JavaScript to validate forms or track users' actions on a website. So yeah, a big part of web tracking is powered by JavaScript.
Red arrived at Master Talia’s lab, where she saw Master Ya’zz and Master Talia talking; actually, it was more like an argument. When Red entered the building, Master Ya’zz’s cybernetics warned him that someone had entered his proximity. “It seems you have a visitor. We can continue this later on, I need to get to Master’s Keep to coordinate the extraction,” said Master Ya’zz before leaving the lab. He stopped in front of Red, gave a warm smile, and looked back to Master Talia. “This is our only chance, Talia. We should act now.” Then he moved past, touching Red’s shoulder to show her sympathy. - Journals of Order of Epoch, 2341 Anno Domini
Some privacy zealots even block all JavaScript, much like running an ad blocker, to minimize their digital footprints.
If you want your web page to do more than just sit there and display static information, you need to implement JavaScript to make it more interactive.
So think of JavaScript as the third layer of the standard web technologies, sitting on top of HTML and CSS.
So let's put JavaScript into action, shall we?
We have already created a beautiful login page using HTML and CSS. Now we would like to verify whether the inputs we collect are indeed in the format we requested. We can set up a validation mechanism through JavaScript. This gives us a measure to help ensure the accuracy of the data collected from data subjects.
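// Checks whether the value of a form field looks like an email address.
// inputText is the <input> element to validate.
function ValidateEmail(inputText) {
  // Basic pattern: local part, "@", domain, and a top-level domain of 2+ characters.
  var mailformat = /^\w+([.-]?\w+)*@\w+([.-]?\w+)*(\.\w{2,})+$/;
  if (mailformat.test(inputText.value)) {
    alert("Valid email address!");
    inputText.focus();
    return true;
  } else {
    alert("You have entered an invalid email address!");
    // Put the cursor back in the field so the user can correct it.
    inputText.focus();
    return false;
  }
}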
We can go even further and use more complex snippets of JavaScript to measure how many people come to the page where the login form is located but do not provide any input.
So we might want to get some metrics and measure how well our registration funnel is working. By adding a tracking code, we can start measuring incoming visitors. To comply with the relevant data protection legislation, we will need to obtain users' consent for the tracking we want to carry out here. We can start the user tracking by adding the following piece of code to our registration form page.
<script type="text/javascript">
  // {$VERILOGY_URL} and {$IDSITE} are template placeholders for the tracker URL and site ID.
  var _paq = window._paq = window._paq || [];
  _paq.push(['trackPageView']);
  _paq.push(['enableLinkTracking']);
  (function() {
    var u="https://{$VERILOGY_URL}/";
    _paq.push(['setTrackerUrl', u+'matomo.php']);
    _paq.push(['setSiteId', {$IDSITE}]);
    var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
    g.type='text/javascript'; g.async=true; g.src=u+'verilogy.js'; s.parentNode.insertBefore(g,s);
  })();
</script>
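One way to honor the consent requirement mentioned above is to build the tracking command queue only after the visitor has agreed. The sketch below is an assumption about how such a gate could look, not the tracker's own API: buildTrackingQueue, the hasConsent flag and the example URL are hypothetical, while the command names mirror the ones pushed to _paq in the snippet above.

```javascript
// Sketch: only queue tracking commands once the visitor has opted in.
// buildTrackingQueue, the consent flag and the example values are hypothetical.
function buildTrackingQueue(hasConsent, trackerUrl, siteId) {
  if (!hasConsent) {
    // No consent: nothing is queued, so no tracking request would be sent.
    return [];
  }
  return [
    ["setTrackerUrl", trackerUrl + "matomo.php"],
    ["setSiteId", siteId],
    ["trackPageView"],
    ["enableLinkTracking"],
  ];
}

// Without consent the queue stays empty; with consent it holds four commands.
console.log(buildTrackingQueue(false, "https://analytics.example/", 1).length);
console.log(buildTrackingQueue(true, "https://analytics.example/", 1));
```

On a real page, the returned commands would be pushed to _paq, and the tracker script loaded, only in the branch where consent has been recorded.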
So there you go: your first step toward understanding why you'd use JavaScript and what kind of things you can do with it.
Web Server
Web servers are computers that store and serve content like websites, databases, applications, music, images, and videos to users.
The web server is responsible for hosting and publishing websites and for connecting them to the internet. Depending on the contents and features of the website, it can be hosted on-premise or by intergalactic cloud service providers.
Red walked inside the lab. “Hello, Master Talia, sorry to interrupt but my class was canceled and I thought you were the only one who can give me an update on the situation. What was that about?” “Red, I know you want answers. Things are more complicated than you know. We have a plan, yet I believe the risks are too high,” replied Master Talia. “How can I help?” asked Red. “The funny thing is you are already helping. Come, let’s start today's training and I will tell you more,” replied Master Talia. - Journals of Order of Epoch, 2341 Anno Domini
The location of the server and who manages it are very important in terms of both security and data privacy. Determining where the web server is located tells us a great deal about intergalactic data transfers.
Servers also respond to DNS queries and perform other important tasks to keep the Internet up and running. Most servers are kept in large data centers, which are located throughout the world.
A web server connects to the Internet and supports physical data interchange with other devices connected to the web. It includes several parts that control how web users access hosted files.
Static Web Server
This is a computer with an HTTP server. It is called "static" because the server sends its hosted files to your browser exactly as they are.
Dynamic Web Server
This includes some extra software, most commonly an application server and a database. It is called "dynamic" because the application server updates the hosted files before sending content to your browser via the HTTP server.
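The difference between the two can be sketched with two small functions. Everything here is made up for illustration (the hosted file, the in-memory "database" and the function names serveStatic and serveDynamic); a real server would read files from disk and query an actual database.

```javascript
// Sketch: the same request handled by a "static" and a "dynamic" server.
// The file contents, the user table and both function names are invented.
const hostedFiles = {
  "/index.html": "<h1>Welcome, {{name}}!</h1>",
};
const database = { 42: { name: "Red" } };

// Static: return the stored file exactly as it is.
function serveStatic(path) {
  const file = hostedFiles[path];
  return file === undefined ? "404 Not Found" : file;
}

// Dynamic: the application layer fills the file with data from the database
// before it is handed to the HTTP server.
function serveDynamic(path, userId) {
  const file = hostedFiles[path];
  if (file === undefined) return "404 Not Found";
  const user = database[userId] || { name: "guest" };
  return file.replace("{{name}}", user.name);
}

console.log(serveStatic("/index.html"));      // <h1>Welcome, {{name}}!</h1>
console.log(serveDynamic("/index.html", 42)); // <h1>Welcome, Red!</h1>
```

From a privacy perspective, the dynamic response is the interesting one: it is the point where personal data from the database ends up in the page sent to the browser.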
Master Talia synced Red’s cyberware with the amplifier, and when she saw that the data stream had started she told Red: “Never let yourself near the darkness when you are in there, alright? This was the risk I was telling you about earlier. If you let the voices cover you in the darkness you might not wake.” “Don’t worry Master Talia, I will be careful,” said Red and closed her eyes. “Are you ready? I am amplifying the resonance in three, two, one,” said Master Talia. But as soon as the resonance amplified, Red couldn’t even hear the countdown and found herself in a dark, dream-like world. - Journals of Order of Epoch, 2341 Anno Domini
Browser
A Web browser or browser is a program that retrieves and displays pages from the Web, and lets users access further pages through hyperlinks.
The information is transferred using the Hypertext Transfer Protocol, which defines how text, images, and video are transmitted on the web. This information needs to be shared and displayed in a consistent format so that people using any browser, anywhere in the world can see the information.
Browsers translate the code used to create websites into text, images, video, and other elements. In other words, a browser takes something most of us don’t understand, like HTML and JavaScript, and turns it into the content we come to enjoy online.
Most devices with an internet connection come with a browser application. Web browsers aren’t the only way to access the internet, but they’re the primary way most of us access information and services online.
It was pitch black, and Red felt alone and cold inside the darkness. She tried to control her breath and focus on the white noise that she could hear in the background. As she focused more, the noise amplified and started to illuminate the darkness. She caught her breath and calmed herself down. She called out “brother?”. Her voice also created a new shape in the air, a new energy flow that gave off its own light. But this light was different from the other shapes, as it was brighter than the rest. And just as this new shape appeared as a manifestation of Red’s voice in the darkness, all the other voices stopped, as if a sentient being had realized something and become alert. - Journals of Order of Epoch, 2341 Anno Domini
HTTP
HTTP (Hypertext Transfer Protocol) is the protocol that transfers data between a web server and a client. HTTP headers accompany that data and are mainly intended for communication between the server and client in both directions.
In 1989, while working at CERN, Tim Berners-Lee wrote a proposal to build a hypertext system over the internet, on top of the existing TCP and IP protocols.
The evolution of HTTP has led to the creation of many applications and has driven the adoption of the protocol. The environment in which HTTP is used today is quite different from that of the early 1990s.
HTTP headers let the client and the server pass additional information with an HTTP request or response.
Request headers
A request header is an HTTP header that can be used in an HTTP request to provide information about the request context so that the server can tailor the response.
Response headers
A response header holds additional information about the response.
Representation headers
A representation header contains information about the body of the resource.
Payload headers
A payload header contains information about payload data.
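To make the header categories above more concrete, here is a minimal sketch that splits a raw HTTP request into its request line and its request headers. The function parseHttpRequest and the sample request are illustrative assumptions, not taken from a real capture.

```javascript
// Sketch: split a raw HTTP request into its request line and headers.
// parseHttpRequest and the sample request are illustrative only.
function parseHttpRequest(rawRequest) {
  const [head] = rawRequest.split("\r\n\r\n"); // headers end at the first blank line
  const [requestLine, ...headerLines] = head.split("\r\n");
  const headers = {};
  for (const line of headerLines) {
    const separator = line.indexOf(":");
    if (separator === -1) continue; // skip malformed lines
    // Header names are case-insensitive, so normalize them to lower case.
    const name = line.slice(0, separator).trim().toLowerCase();
    headers[name] = line.slice(separator + 1).trim();
  }
  return { requestLine, headers };
}

const sample =
  "GET /chapter-5 HTTP/1.1\r\n" +
  "Host: hgpe.example\r\n" +
  "User-Agent: ExampleBrowser/1.0\r\n" +
  "Accept-Language: en-US\r\n" +
  "Referer: https://search.example/?q=privacy\r\n" +
  "\r\n";

const parsed = parseHttpRequest(sample);
console.log(parsed.requestLine);
console.log(parsed.headers);
```

Even this tiny request tells the server which browser is in use, the preferred language and the page the visitor came from, which is why request headers matter to privacy engineers.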
Databases
Databases come in relational and non-relational forms and allow you to store and query all your real-time and historical data.
It's important to remember that the type of hosting we choose such as on-premise or managed will trigger different compliance requirements.
“Something is wrong with the Grandmaster. He was the reason why we got ambushed by Lilith. As if he wanted to be found by the spawns of Lilith,” said Ray. “Are you okay? Where are you?” replied Red, but Ray’s voice stopped suddenly as the other dark energy fields started to break into their bright white energy field. They surrounded Red like a dark cloud, and just as the thunderous voices overwhelmed her, Master Talia shut down the amplifier. Red felt a sudden pain in her head after disconnecting from the amplifier; she went into shock and lost consciousness. - Journals of Order of Epoch, 2341 Anno Domini
Let's assume that we run our databases on PostgreSQL, hosted on Amazon Web Services, so that our website can scale easily. Among the server regions AWS offers, we can choose Frankfurt with GDPR compliance in mind.
At this point, we should not forget that when servers are located abroad, under different data protection legislation, transferring personal data to them carries risks.
We may want to receive reports on our number of users on a weekly basis.
For this, we can run certain queries on the user tables in our PostgreSQL database. Limiting who can run such queries is very important for data management and access control. For example, we can query our database to find out how many new users registered for our product last week. With the following sample SQL query, we can generate reports about the users in our database.
-- List the rows of the call table, ordered by user and then by registration time.
SELECT *
FROM call
ORDER BY
    call.user_id ASC,
    call.register_time ASC;
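The sample query lists rows; the weekly report itself only needs a count. As a rough sketch of that counting step, the JavaScript below works on rows that have already been fetched. The records are invented, user_id and register_time mirror the column names in the sample query, and countNewUsers is a hypothetical helper.

```javascript
// Sketch: count how many users registered during the last seven days.
// The records are invented; only the field names come from the sample query.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function countNewUsers(rows, now, days = 7) {
  const windowStart = now.getTime() - days * MS_PER_DAY;
  const newUserIds = new Set();
  for (const row of rows) {
    const registeredAt = new Date(row.register_time).getTime();
    if (registeredAt >= windowStart && registeredAt <= now.getTime()) {
      newUserIds.add(row.user_id); // a Set avoids counting the same user twice
    }
  }
  return newUserIds.size;
}

const rows = [
  { user_id: 1, register_time: "2023-03-01T09:00:00Z" },
  { user_id: 2, register_time: "2023-03-07T18:30:00Z" },
  { user_id: 3, register_time: "2023-03-09T12:00:00Z" },
];

// Report date fixed for reproducibility: 10 March 2023.
console.log(countNewUsers(rows, new Date("2023-03-10T00:00:00Z"))); // prints 2
```

Reporting an aggregate like this, rather than handing out the underlying rows, is also a simple way to keep personal data out of the weekly report.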
Email Server
An email server, also called a mail server, is essentially a computer system that sends and receives emails. When you send an email, it goes through a series of servers to reach its final destination.
Over 300 billion emails are sent and received daily around the world, making it one of the most popular forms of communication.
For a computer system to function as a mail server, it must have mail server software installed. This software then allows the system administrator to manage and create email accounts for any of the domains hosted on the server.
An email server is a complete system made up of different applications or services. Based on the type of action they perform, email servers can be categorized into incoming and outgoing email servers:
Incoming mail servers, also known as POP3 (Post Office Protocol version 3) or IMAP (Internet Message Access Protocol) servers
Outgoing mail servers, also known as SMTP servers (Simple Mail Transfer Protocol)
SMTP
This protocol handles outgoing mail requests and sends emails. SMTP is short for Simple Mail Transfer Protocol, and an SMTP server is the outgoing mail server. We can think of SMTP as moving your email on and across networks.
POP / IMAP
For incoming mail servers, there are two main varieties: POP3 and IMAP. POP3 servers are best known for downloading the contents of the inbox to your computer’s hard drive. IMAP servers, short for Internet Message Access Protocol, are used for two-way synchronization of the entire mailbox.
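As a quick reference, the three protocols can be summarized in a small lookup table. The port numbers below are the commonly registered defaults for mail clients; an individual provider may use different ones, and the action names and describeMailAction are hypothetical labels for this sketch.

```javascript
// Sketch: which protocol and ports a mail client typically uses for each action.
// Ports are the commonly registered defaults; a given provider may differ.
const mailProtocols = {
  send:     { protocol: "SMTP", role: "outgoing", startTlsPort: 587, implicitTlsPort: 465 },
  download: { protocol: "POP3", role: "incoming", startTlsPort: 110, implicitTlsPort: 995 },
  sync:     { protocol: "IMAP", role: "incoming", startTlsPort: 143, implicitTlsPort: 993 },
};

function describeMailAction(action) {
  const entry = mailProtocols[action];
  if (!entry) {
    throw new Error(`Unknown mail action: ${action}`);
  }
  return (
    `${entry.protocol} (${entry.role} mail server): ` +
    `port ${entry.startTlsPort} with STARTTLS or ${entry.implicitTlsPort} with implicit TLS`
  );
}

console.log(describeMailAction("send"));
console.log(describeMailAction("sync"));
```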
How to secure your emails?
A secure email server works just like a regular email server, except it uses advanced security protocols to protect your emails:
- Strong encryption: A secure mail server only uses secure connections, such as Transport Layer Security (TLS), to transmit data, and can be combined with end-to-end encryption (E2EE) of the message itself.
- Mail and sender authentication: Using anti-spoofing protocols such as Sender Policy Framework (SPF), you can verify whether an email was sent by servers authorized for the sender's domain. Complementary protocols such as DKIM add a cryptographic signature that shows the message has not been tampered with.
- Anti-phishing measures: Anti-phishing measures implemented on an email server prevent phishing attacks and mitigate the impact of an attack.
- Server location: A secure email server should be located in a country with strong data protections, allowing you to benefit from high levels of privacy.
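To give a feel for what sender authentication looks like in practice, here is a minimal sketch that reads an SPF record, which a domain publishes as a DNS TXT record. The function parseSpfRecord and the sample record are illustrative assumptions; a real SPF check would also resolve "include" targets and compare the sending IP address against each range.

```javascript
// Sketch: read the mechanisms of an SPF record (published as a DNS TXT record).
// parseSpfRecord and the sample record are illustrative only.
function parseSpfRecord(record) {
  const [version, ...terms] = record.trim().split(/\s+/);
  if (version !== "v=spf1") {
    throw new Error("Not an SPF record");
  }
  const results = { "+": "pass", "-": "fail", "~": "softfail", "?": "neutral" };
  return terms.map((term) => {
    // A leading qualifier is optional; "+" (pass) is the default.
    const hasQualifier = "+-~?".includes(term[0]);
    const qualifier = hasQualifier ? term[0] : "+";
    const body = hasQualifier ? term.slice(1) : term;
    // Split on the first colon only, since IPv6 ranges contain colons too.
    const colon = body.indexOf(":");
    return {
      mechanism: colon === -1 ? body : body.slice(0, colon),
      value: colon === -1 ? null : body.slice(colon + 1),
      result: results[qualifier],
    };
  });
}

// Sample record using documentation-only addresses and domains.
const spf = parseSpfRecord("v=spf1 ip4:192.0.2.0/24 include:_spf.example.com -all");
console.log(spf);
```

Read from left to right, this sample record says: accept mail from the listed IP range and from the servers named by the included domain, and reject everything else.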
This is the end of the first part of the How the Internet Works chapter.
Next week we will continue with the second part and discover how web applications work.
Comic Book Issue #5 - Want to read the plot and immerse yourself in the story even more?
Chapter 5: The Approaching Darkness --> Jump into the story from here, as it is written and hosted on the Gitbook platform.
Early Access is available to the privacy engineering materials. You can still witness the story of our protagonist Red, and her journey to find her brother through environmental storytelling while you learn about privacy engineering.
Listen to the story of Red in the Reading Episodes released every week with original music. You can check out the HGPE Trailer, Chapter 1: The Prologue, Chapter 2: The Battle for Earth, Chapter 4: The Academy, and Chapter 5: The Approaching Darkness reading sessions on YouTube with subtitles. The narrator is supported by the original soundtrack, which is also composed by me, so I would love to hear what you think about them.