What happens when you type www.google.com and press enter
This is a fascinating question for anyone who uses the web. Everybody uses the web every day, but have you ever stopped to consider what happens in the background when you request a page? The picture above shows the landing page of Google, which is what appears when you type www.google.com and press enter. Many activities are carried out in the background to bring that display to you.
This article outlines the different things that happen in the background before you get the output above.
Before I begin, let me explain some important terms that will be relevant to this article:
Server: A computer or computer program; the word can refer to either hardware or software. Servers provide functionality for other devices. In the context of this article, the term “server(s)” refers to the computer system(s) hosting https://www.google.com. There are many types of servers, but we will cover only the ones within the scope of this article.
Client: Also a computer or computer program, but one that can access services and functionalities hosted on a server. Most familiarly, clients are the personal devices — laptops, smartphones, etc. — that we use to access services through the internet, among other things. In the context of this article, usage of the term “client” will refer to the web browser.
Protocol: Or, more specifically, communication protocol: a general term for a system of rules, or methods, for transmitting data between two devices. The Open Systems Interconnection (OSI) model, the conceptual model used to describe telecommunications between computers, encompasses a myriad of protocols.
Although there are many protocols, I want to focus on two of them, TCP and IP, usually written together as TCP/IP (Transmission Control Protocol/Internet Protocol).
TCP/IP is a set of standardized rules that allows computers to communicate on a network such as the Internet.
Firewall: A network security device that monitors and filters incoming and outgoing network traffic based on an organization’s previously established security policies. At its most basic, a firewall is essentially the barrier that sits between a private internal network and the public Internet. A firewall’s main purpose is to allow non-threatening traffic in and to keep dangerous traffic out.
HTTPS/SSL: Hypertext Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer Protocol (HTTP). It is used for secure communication over a computer network and is widely used on the Internet. In HTTPS, the communication is encrypted using Transport Layer Security (TLS) or, formerly, the Secure Sockets Layer (SSL).
Now, let us dive into the main objective of this article. First, what does www.google.com stand for?
Step 1: URL Parsing
protocol://hostname:port/path_of_filename === https://www.google.com/
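From these terms: the protocol tells the browser how to talk to the server, the hostname identifies which server to contact, the port identifies the service on that server, and the path identifies the resource being requested.
In our case: the protocol is https, the hostname is www.google.com, the port is left out and so defaults to 443 (the standard HTTPS port), and the path is /.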
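The same split can be reproduced with Python's standard library. This is only a sketch of the idea, not how a browser implements it; the helper name `parse_url` and the `DEFAULT_PORTS` table are introduced here for illustration.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this article.
DEFAULT_PORTS = {"http": 80, "https": 443}


def parse_url(url):
    """Split a URL into protocol, hostname, port, and path."""
    parts = urlsplit(url)
    # When no port is written in the URL, fall back to the scheme's default.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "port": port,
        "path": parts.path or "/",
    }


print(parse_url("https://www.google.com/"))
# {'protocol': 'https', 'hostname': 'www.google.com', 'port': 443, 'path': '/'}
```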
Step 2: DNS Lookup
The browser sends www.google.com to a DNS server to request the IP address of www.google.com. Using the dig command on Linux, the IP address of www.google.com came back as 142.250.179.164 (the address can differ depending on where and when you run the lookup). With this IP address, the browser knows which machine to contact; the URL in the address bar stays https://www.google.com.
www.google.com === 142.250.179.164 – both lead to the same servers.
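A program can ask the operating system's resolver the same question that dig asks. The sketch below uses `socket.getaddrinfo` from the Python standard library; the function name `resolve` is an assumption made for this example, and the addresses it returns for a real hostname will vary by location and time.

```python
import socket


def resolve(hostname, port=443):
    """Return the unique IP addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("www.google.com")` on a machine with internet access should return one or more addresses, possibly including IPv6 ones. An address passed in directly, such as `resolve("127.0.0.1")`, comes back unchanged because no lookup is needed.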
Step 3 – TCP/IP
Finally, our web browser is ready to go. Having resolved the IP address associated with www.google.com, the browser begins communicating with the corresponding server through port number 443, the default port for HTTPS. The communication between the browser and server occurs over what is referred to as Transmission Control Protocol/Internet Protocol (TCP/IP). This protocol suite is not strictly mandatory, but it is the standard when it comes to web infrastructure. An alternative transport-layer protocol, User Datagram Protocol (UDP), is faster but less reliable: packet delivery is not acknowledged or retried. UDP is typical of streaming services where instant content takes priority; TCP is used almost everywhere else.
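To make the TCP step concrete, here is a minimal sketch that opens a TCP connection with Python's `socket` module. So that it runs without internet access, it connects to a throwaway listener on the local machine rather than to Google; the function names are assumptions made for this example.

```python
import socket


def open_tcp_connection(host, port, timeout=5.0):
    """Open a TCP connection; the TCP handshake completes inside this call."""
    return socket.create_connection((host, port), timeout=timeout)


def demo_local_connection():
    """Connect to a temporary local listener and send it five bytes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        with open_tcp_connection("127.0.0.1", port) as client:
            server_side, _ = listener.accept()
            with server_side:
                client.sendall(b"hello")
                return server_side.recv(5)
```

In the scenario of this article, the equivalent call would be `open_tcp_connection("www.google.com", 443)`, which needs a working internet connection.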
Step 3 – SSL
The first thing the web browser sends to the resolved IP address of www.google.com is a message containing its Transport Layer Security (TLS) version along with a list of supported cipher suites and compression methods. TLS is a cryptographic protocol that uses asymmetric cryptography to establish a shared key and symmetric encryption for the data that follows, keeping communicated data private, authenticated, and reliable.
Upon receiving this initial communication, the server chooses its preferred TLS version and cipher suite and responds with a certificate that includes the server’s TLS public key. Back at the client side, the browser validates the certificate and uses this public key to encrypt a pre-master key that is sent back to the server.
If the public key sent to our browser was authentic, then the server is able to decrypt the pre-master key with its TLS private key. Upon proof of successful decryption, the browser and server have effectively established a trusted connection and a shared symmetric key for sending messages back and forth. (This describes the classic RSA key exchange; TLS 1.3 replaces it with an ephemeral Diffie-Hellman exchange, but the result is the same.)
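The client side of this exchange can be sketched with Python's `ssl` module. The function names below are assumptions made for this example, and the cipher list only approximates what a real browser would offer, since browsers ship their own TLS libraries and settings.

```python
import socket
import ssl


def make_client_context():
    """Build a client-side TLS context that verifies the server certificate."""
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older versions
    return context


def offered_cipher_names(context):
    """Names of the cipher suites enabled for this client context."""
    return [cipher["name"] for cipher in context.get_ciphers()]


def negotiated_parameters(hostname, port=443, timeout=5.0):
    """Perform a real handshake and report what was agreed (needs network)."""
    context = make_client_context()
    with socket.create_connection((hostname, port), timeout=timeout) as raw:
        # server_hostname enables SNI and checks the name against the certificate.
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            return tls.version(), tls.cipher()[0]
```

With internet access, `negotiated_parameters("www.google.com")` returns the TLS version and cipher suite the server picked from the client's offer.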
Step 4 – Load Balancer
A site as busy as Google is served by many web servers rather than one, so incoming requests have to be split among them. A load balancer is the intermediary responsible for handling this traffic-splitting work. It is software that can be configured either on the same server that hosts the web content or on a server all its own. One common and free load balancer is HAProxy. HTTP request traffic is split up by a program such as HAProxy according to a load-balancing algorithm. There are various load-balancing algorithms, each with its own advantages and disadvantages.
Backtracking in our example, the resolved IP address of www.google.com was actually the IP address of the load balancer server. The web browser completed the TLS handshake with this load balancer server, making it the TLS termination proxy. Almost like a post office, this server, which we’ll imagine is configured with a round-robin algorithm on HAProxy, was the receiver of our HTTP GET request. HAProxy took the request, pulled up the IP address of the next web server in its queue, and forwarded the request to it.
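The round-robin idea itself is small enough to sketch in a few lines. This is an illustration of the algorithm only, not HAProxy's implementation; the class name and the backend addresses are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backend servers in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = cycle(list(backends))

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._backends)


# Hypothetical private addresses for three web servers behind the balancer.
balancer = RoundRobinBalancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print([balancer.pick() for _ in range(4)])
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1']
```

Round-robin ignores how busy each server is, which is one reason other algorithms (for example, picking the server with the fewest active connections) exist.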
Step 5 – Firewall
Through the TLS handshake, our browser came to an agreement with the load balancer server as to how to encrypt messages as they are passed back and forth. TLS achieves three crucial security purposes (privacy, integrity, and identification), yet it fails to account for a fourth: honesty. An encrypted, authenticated message can still carry malicious content, and deciding which traffic to let through is the firewall’s job. Contextualizing firewalls in our example, at this point our GET request has already passed one firewall, installed on the load balancer. It will next pass another, installed on whichever host server it is distributed to.
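At its simplest, a firewall checks each incoming connection against an ordered list of rules and applies the first one that matches. The sketch below illustrates that idea under simplifying assumptions (rules match only on source address and destination port); the `Rule` class, the `decide` function, and the example address ranges are all hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    action: str  # "allow" or "deny"
    source: str  # CIDR block the rule applies to
    port: int    # destination port the rule applies to


def decide(rules, source_ip, dest_port, default="deny"):
    """Return the action of the first matching rule, or the default."""
    for rule in rules:
        if rule.port == dest_port and ip_address(source_ip) in ip_network(rule.source):
            return rule.action
    return default


RULES = [
    Rule("deny", "203.0.113.0/24", 443),  # hypothetical blocked range
    Rule("allow", "0.0.0.0/0", 443),      # HTTPS from anyone else
]

print(decide(RULES, "198.51.100.4", 443))  # allow
print(decide(RULES, "198.51.100.4", 22))   # deny (no rule matches, default applies)
```

Denying by default means that only traffic explicitly allowed by a rule gets through, which matches the firewall's purpose described earlier.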
Step 6 – Host Server
The host server is a web stack consisting of multiple parts, traditionally set up along the lines of what is termed the LAMP (Linux, Apache, MySQL, PHP/Perl/Python) model.
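Delivery of a web page works as follows: the web server (Apache) receives the HTTP request; for static content it reads the requested file from disk, and for dynamic content it hands the request to application code (PHP or Python), which may query the database (MySQL); the resulting page is then returned to the client in an HTTP response.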
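As a rough sketch of the web-server part of such a stack, the following uses Python's built-in `http.server` in place of Apache. It serves one hard-coded page at the root path and answers 404 for anything else; the page content and all names are assumptions made for this example, and no database is involved.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<!DOCTYPE html><html><body><h1>Hello</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/":
            self._send(200, PAGE)
        else:
            self._send(404, b"Not Found")

    def _send(self, status, body):
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=0):
    """Create a server on localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), PageHandler)


def fetch_from_local_server(path="/"):
    """Start the server, send one GET request to it, and shut it down again."""
    server = make_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()
```

`fetch_from_local_server("/")` plays the role of the browser here: it returns the status code together with the bytes of the page.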
Step 7 – Page Rendering
It’s been a long journey, but our web browser has finally received the web page we requested. After pulling up the HTML file configured at the root of www.google.com, the host server sent it back to the web browser in an HTTP response message.
The initial status line of this response message includes a status code indicating the success of the handled request. Upon successful retrieval and delivery of the web page, the host server signals 200. Other common status codes include 301 (page redirection) and 404 (page not found).
In the response header, the host server states information about the delivered page such as its type (HTML, in our case) and size.
Finally, in the response message body, the host server delivers the actual, entire HTML code itself. This is what the browser has been looking for since the start! Now it shows off, using its HTML and CSS engines to parse the code, build its Document Object Model (DOM), and render the page. Any JavaScript in the file is run. When all is said and done, the browser (Firefox, in our case) displays a beauty, a joy, a realization of our dream: the Google home page.
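The three parts of the response (status line, headers, and body) can be pulled apart with a few lines of Python. The sample response below is made up for illustration and is far smaller than what Google actually sends; `parse_response` is a name chosen for this sketch and handles only simple, well-formed responses.

```python
BODY = b"<html><body><h1>Hi</h1></body></html>"

# A hypothetical HTTP/1.1 response: status line, headers, blank line, body.
SAMPLE_RESPONSE = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Content-Length: " + str(len(BODY)).encode("ascii") + b"\r\n"
    b"\r\n" + BODY
)


def parse_response(raw):
    """Split a raw HTTP response into status code, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


code, reason, headers, body = parse_response(SAMPLE_RESPONSE)
print(code, reason)             # 200 OK
print(headers["content-type"])  # text/html; charset=UTF-8
```

The body is what gets handed to the HTML parser; the headers tell the browser how to interpret it before parsing starts.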
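A quick rundown of the process just described: the browser parses the URL, looks up the IP address through DNS, opens a TCP/IP connection on port 443, completes the TLS handshake, and sends its HTTP GET request, which passes through the load balancer and the firewalls to a host server; the host server returns the page, and the browser renders it.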
Now you can fathom what actually happens when you type www.google.com and press enter. It's all web infrastructure. Kindly share so others can learn too.