WHAT HAPPENS WHEN YOU TYPE A URL IN YOUR WEB BROWSER AND PRESS “ENTER”?

Surfing the internet ...

A curious young blue tabby Maine Coon cat is standing on a chair in front of a table, looking at a laptop screen while browsing the internet.

Wait! Have you ever wondered what happens behind the scenes when you visit a website, blog, or social media platform for the first time, or how your favorite artist's videos show up in your browser after you type the URL? Do you?

Let me brief you. Your browser returns a requested page in a matter of seconds, often less. It feels instant, but a lot of complex web infrastructure goes to work just to make you smile :). In this post, we’ll go through the processes involved in loading a website in your browser, from the initial stage (DNS request and IP address lookup) to the final stage (web page display).


Technical terms used:

· URL (Uniform Resource Locator): A web address that specifies the location of a resource on the internet, such as a webpage or a file.

· IP Address (Internet Protocol Address): An identifying number assigned to a specific device or network. It allows computers to send and receive information from one another when connected to the internet.

· Cache (Caching): A technique used in computing to temporarily store copies of frequently accessed or computationally expensive data to improve overall system performance and response times.
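
To make the caching idea a bit more concrete, here is a minimal Python sketch. The function `slow_square` and the `call_count` counter are made up for illustration; they simply stand in for any expensive piece of work whose result is worth keeping around.

```python
from functools import lru_cache

call_count = 0  # counts how often the expensive work really runs


@lru_cache(maxsize=128)
def slow_square(n):
    """Stand-in for an expensive computation (hypothetical example)."""
    global call_count
    call_count += 1
    return n * n


first = slow_square(12)   # computed, then stored in the cache
second = slow_square(12)  # answered from the cache, no recomputation
print(first, second, call_count)
```

The second call returns the same answer without running the function body again, which is the whole point of a cache.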


TABLE OF CONTENTS

· IP Address Lookup and DNS Request: Initial Stage

· TCP/IP: The Protocols that Power the Internet

· HTTP/HTTPS (SSL certified): The Importance of Secure Connections

· Load Distribution and Security (Load Balancer and Firewall)

· CDN (Content Delivery Network)

· Web and Application Servers (Codebase)

· Database: Data storage and retrieval

· Summary

· Conclusion


IP Address Lookup and DNS Request: Initial Stage


This is the starting point of the whole process, and it takes several steps to turn the domain name into an IP address.

Note: The full sequence below only runs when no cached answer is available, for example the first time the URL is typed on a device. After that, cached results are reused until they expire, which makes the process quicker.

Steps involved in DNS:

1. For example, when a user types ‘https://facebook.com’ into their web browser, the browser first checks its own cache for the IP address. If it isn't there, the browser checks the operating system cache, and if that also fails, the query is handed to the resolver (usually run by our Internet Service Provider).

2. If the resolver doesn’t find the IP address in its own cache, it queries a DNS root nameserver. The root server responds with the address of the nameserver for the TLD (Top-Level Domain). The TLD is the last part of the domain name, after the final dot. For example, the TLD in “https://facebook.com” is “.com”, and in “https://tediyangcodes.tech” it is “.tech”. Other TLDs include .net, .edu, etc.

3. The resolver then makes a request to the .com TLD server. If the domain is registered, the TLD server responds with the addresses of the domain's authoritative name servers, not with the IP address of the website itself. Top companies usually have many name servers attached to the domain name.

4. After getting this information from the TLD server, the resolver sends a query to one of the domain’s name servers.

5. On success, the IP address is returned to the resolver from the domain’s name server. The DNS resolver then responds to the web browser with the IP address of the domain requested initially.

6. The web browser then makes an HTTP request to that IP address, and in response, the server at that address returns the webpage to be rendered in the browser. But hold on, let's not jump ahead too fast.
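
The steps above can be sketched as a toy simulation in Python. Nothing here talks to a real DNS server: the lookup tables, the server names, and the IP address (taken from a range reserved for documentation) are all invented sample data, and `lookup` is a hypothetical helper that only mimics the order in which the caches and servers are consulted.

```python
# Invented sample data standing in for the real DNS hierarchy.
ROOT_SERVER = {"com": "tld-server-for-com"}  # root knows the TLD servers
TLD_SERVERS = {"tld-server-for-com": {"example.com": "ns1.example.com"}}
NAME_SERVERS = {"ns1.example.com": {"example.com": "192.0.2.10"}}

# The three caches that are checked before any server is asked.
browser_cache, system_cache, resolver_cache = {}, {}, {}
CACHES = [
    ("browser cache", browser_cache),
    ("system cache", system_cache),
    ("resolver cache", resolver_cache),
]


def lookup(domain):
    """Return (ip, trace), where trace lists every place that was consulted."""
    trace = []
    for label, cache in CACHES:
        trace.append(label)
        if domain in cache:
            return cache[domain], trace

    # Cache miss everywhere: the resolver walks root -> TLD -> name server.
    tld = domain.rsplit(".", 1)[-1]
    trace.append("root server")
    tld_server = ROOT_SERVER[tld]
    trace.append("TLD server")
    name_server = TLD_SERVERS[tld_server][domain]
    trace.append("domain name server")
    ip = NAME_SERVERS[name_server][domain]

    # Remember the answer so the next lookup is quicker.
    for _, cache in CACHES:
        cache[domain] = ip
    return ip, trace


first_ip, first_trace = lookup("example.com")
second_ip, second_trace = lookup("example.com")
print(first_ip, first_trace)
print(second_ip, second_trace)
```

The first lookup walks the whole chain, while the second one stops at the browser cache. Real caches also expire their entries after a while, which this sketch leaves out.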

NB. The lookup returns an error if the device isn’t connected to the internet.


TCP/IP: The Protocols that Power the Internet

A diagram of how the TCP/IP model divides data into packets and sends it through 4 different layers.

To be clear, TCP/IP is a set of communication protocols that forms the backbone of the internet. It provides the fundamental framework for transmitting data between devices over a network, ensuring that information is delivered reliably and efficiently. The TCP/IP suite is organized into four layers (application, transport, internet, and link), each serving a specific purpose in the communication process. On the sending side, a message is broken into packets and each layer wraps the data with its own header; on the receiving side, the layers unwrap the packets and reassemble the original message. Common application-layer protocols include HTTP (Hypertext Transfer Protocol), FTP (File Transfer Protocol), and SMTP (Simple Mail Transfer Protocol). Among these, HTTP is the most common on the web, and it is usually sent over an encrypted connection as HTTPS (Hypertext Transfer Protocol Secure). Your web browser mostly sends requests using HTTPS to prevent sensitive information from being exposed on the internet. Trust me, you don’t want someone to know your credit card details.
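
To picture the layering, here is a small Python sketch of a message being wrapped on the way out and unwrapped on the way in. It is a deliberately simplified model, not a real network stack: the class names, port numbers, and IP and MAC addresses are all made up, and a real stack would also split large messages across several packets.

```python
from dataclasses import dataclass


@dataclass
class Segment:  # transport layer (TCP): adds port numbers
    src_port: int
    dst_port: int
    payload: bytes


@dataclass
class Packet:  # internet layer (IP): adds IP addresses
    src_ip: str
    dst_ip: str
    payload: Segment


@dataclass
class Frame:  # link layer: adds hardware (MAC) addresses
    src_mac: str
    dst_mac: str
    payload: Packet


def send(message):
    """Wrap application data layer by layer, top to bottom."""
    segment = Segment(src_port=50000, dst_port=443, payload=message)
    packet = Packet(src_ip="192.0.2.1", dst_ip="192.0.2.10", payload=segment)
    return Frame(src_mac="aa:aa:aa:aa:aa:aa", dst_mac="bb:bb:bb:bb:bb:bb", payload=packet)


def receive(frame):
    """Unwrap in the opposite order: frame -> packet -> segment -> data."""
    packet = frame.payload
    segment = packet.payload
    return segment.payload


# Application layer: the HTTP request the browser wants to send.
request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
frame = send(request)
delivered = receive(frame)
print(delivered == request)
```

Each layer only looks at its own header and hands the rest along, which is why the application at the other end receives exactly the bytes that were sent.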


HTTP/HTTPS (SSL certified): The Importance of Secure Connections

Once it has the IP address for the domain, the browser uses the scheme of the URL, the "http://" or "https://" part, to decide how to talk to the server. HTTPS, or Hypertext Transfer Protocol Secure, is the secure version of standard HTTP. It encrypts the data exchanged between a web server and a user's browser, making it much harder for attackers to intercept and steal sensitive information like login credentials and payment details.
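
As a rough illustration of how the scheme drives the connection, the Python sketch below splits a URL with the standard library and picks the conventional port for each scheme (80 for HTTP, 443 for HTTPS). The helper name `describe` and the `DEFAULT_PORTS` table are just for this example.

```python
from urllib.parse import urlsplit

# Conventional default ports for the two schemes.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe(url):
    """Pull out the pieces of a URL that decide how to connect."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Use the port written in the URL, else the scheme's default.
        "port": parts.port or DEFAULT_PORTS[parts.scheme],
        "encrypted": parts.scheme == "https",
    }


info = describe("https://facebook.com")
print(info)
```

An "http://" URL would come back with port 80 and `encrypted` set to `False`, which is what the "Not secure" warning in the address bar is telling you.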


Address bar labels: "Not secure" vs. "Secure"

NB. Please never share important details (credit cards, PINs, …) on an unsecured website, or on a secure website with an untrusted domain.


SSL, which stands for Secure Sockets Layer (or TLS, Transport Layer Security), is a standard security protocol designed to establish secure and encrypted communication channels over a computer network, commonly the Internet. SSL provides a secure connection between a client (typically a web browser) and a server, ensuring that the data exchanged between them remains private and protected from potential eavesdropping or tampering by malicious actors. The key aspects of SSL/TLS include:

1. Symmetric and Asymmetric Encryption: Data in transit is encrypted, so even if it is intercepted, it is indecipherable without the appropriate decryption keys. Asymmetric encryption uses a private key, held by the website owner, and a public key, available to anyone accessing the website, to set up the connection. A shared symmetric key then encrypts the data exchanged during the session.

2. Authentication: This ensures that users are connected to the right and legitimate servers.

3. Data integrity: SSL/TLS ensures the integrity of data by using hash-based checks. This means that any tampering or modification of the data during transit will be detected.

4. SSL/TLS Handshake: The SSL/TLS handshake is a process that occurs when a client and a server establish a secure connection. It involves multiple steps, including:

a. ClientHello: The client initiates the handshake by sending a message that lists the protocol versions and encryption algorithms (cipher suites) it supports.

b. ServerHello: The server responds by selecting a protocol version and cipher suite that both sides support.

c. Certificate Exchange: The server presents its digital certificate to prove its identity.

d. Key Exchange: The client and server exchange information to derive the symmetric session key.

e. Finished: Both parties confirm the completion of the handshake.
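
You rarely write the handshake yourself; a TLS library does it for you. The Python sketch below uses the standard `ssl` module to show the client-side defaults (certificate and hostname checks) and how you could ask a connection which protocol version and cipher suite were negotiated. The helper `tls_session_info` is a hypothetical name for this example, and it needs internet access, so it is defined here but not called.

```python
import socket
import ssl

# Client-side context with the library's recommended defaults.
context = ssl.create_default_context()

checks = {
    # The server must present a certificate that chains to a trusted CA.
    "verifies_certificate": context.verify_mode == ssl.CERT_REQUIRED,
    # The certificate must match the host name we asked for.
    "checks_hostname": context.check_hostname,
}
print(checks)


def tls_session_info(host, port=443):
    """Connect, let the library run the handshake, report what was agreed."""
    with socket.create_connection((host, port), timeout=5) as sock:
        # wrap_socket performs the handshake steps listed above.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return {
                "protocol": tls.version(),   # negotiated protocol version
                "cipher": tls.cipher()[0],   # name of the agreed cipher suite
                "subject": tls.getpeercert().get("subject"),
            }
```

Calling `tls_session_info` with a host name of your choice on a connected machine would return the negotiated details for that site; the exact values depend on the server, so none are shown here.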

SSL has evolved into its successor, TLS (Transport Layer Security), but the term "SSL" is still commonly used to refer to the broader concept of secure communication over the Internet. The use of SSL/TLS is critical for securing sensitive data, such as login credentials, personal information, and financial transactions, during online communication.


Load Distribution and Security (Load Balancer and Firewall)

Oh, thank God, I’ve made a connection and can transfer data now. “Sends request... waiting… waiting… waiting... 1 min later, finally I got the data.” But why did it take that long? This is where a Load Balancer may come into play.

A Load Balancer is vital in evenly distributing incoming traffic across multiple servers, preventing any single server from becoming overloaded. This optimization is instrumental in enhancing website performance and mitigating the risk of server crashes.

Especially for high-traffic websites like google.com, facebook.com, and instagram.com, load balancing is essential to ensure consistently high performance and reliability. Upon receiving a user's request, the load balancer assigns it to a specific server and repeats this for each subsequent request. The load balancer decides which server handles each request using one of several algorithms, such as Round-Robin, Weighted-Round-Robin, Least-Connection, Least-Response-Time, and Randomized. Popular load balancers include HAProxy, Amazon Elastic Load Balancing (ELB), and NGINX.
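
Two of the algorithms named above, Round-Robin and Least-Connection, are simple enough to sketch in a few lines of Python. This is only an illustration of the selection logic, not how HAProxy, ELB, or NGINX are implemented; the class names and the server names `app-1` to `app-3` are invented.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands requests to servers in a fixed rotating order."""

    def __init__(self, servers):
        self._rotation = cycle(servers)

    def pick(self):
        return next(self._rotation)


class LeastConnectionsBalancer:
    """Hands each request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # On a tie, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request finishes and its connection closes."""
        self.active[server] -= 1


servers = ["app-1", "app-2", "app-3"]

rr = RoundRobinBalancer(servers)
rr_order = [rr.pick() for _ in range(5)]
print(rr_order)

lc = LeastConnectionsBalancer(servers)
a = lc.pick()   # every server is idle, so the first one is chosen
b = lc.pick()   # app-1 is now busy, so the next idle server is chosen
lc.release(a)   # the first request finishes
c = lc.pick()   # app-1 is idle again and comes first among idle servers
print([a, b, c])
```

Round-Robin ignores how busy each server is, while Least-Connection adapts to it, which matters when some requests take much longer than others.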

However, as traffic is directed to different servers, security concerns arise. This is where the Firewall comes into play. The firewall acts as a protective barrier, regulating and monitoring incoming and outgoing network traffic based on predetermined security rules. It serves as a gatekeeper, ensuring that only authorized and safe connections are established while protecting against potential security threats. Without a firewall, a network is more vulnerable to a wide range of security risks, malware, and unauthorized access.
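
The "predetermined security rules" can be pictured as an ordered list that is checked top to bottom, with everything else denied by default. The Python sketch below assumes that first-match, default-deny model, which is common but not the only one; the rule set is invented, and the IP addresses come from ranges reserved for documentation.

```python
from ipaddress import ip_address, ip_network

# (action, source network, destination port); a port of None matches any port.
RULES = [
    ("deny", ip_network("203.0.113.0/24"), None),  # block one range entirely
    ("allow", ip_network("0.0.0.0/0"), 443),       # HTTPS from anywhere
    ("allow", ip_network("0.0.0.0/0"), 80),        # HTTP from anywhere
    ("allow", ip_network("10.0.0.0/8"), 22),       # SSH from the internal network only
]


def decide(src_ip, dst_port):
    """Return the action of the first matching rule, or deny by default."""
    address = ip_address(src_ip)
    for action, network, port in RULES:
        if address in network and (port is None or port == dst_port):
            return action
    return "deny"


print(decide("198.51.100.7", 443))  # ordinary visitor over HTTPS
print(decide("203.0.113.9", 443))   # visitor from the blocked range
print(decide("198.51.100.7", 22))   # outsider trying SSH
print(decide("10.1.2.3", 22))       # internal machine using SSH
```

Because the first matching rule wins, the order of the list matters: the blocked range is listed before the general "allow HTTPS" rule so that it takes priority.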

In essence, the combination of load balancing and a robust firewall mechanism contributes to both the optimal performance and security of a web infrastructure, creating a balanced and protected environment for handling varying levels of user traffic.


CDN (Content Delivery Network)

A Content Delivery Network (CDN) is a distributed network of servers strategically located across various geographical regions to deliver web content efficiently to users. The primary purpose of a CDN is to enhance the performance, reliability, and availability of web services by reducing latency and minimizing the load on a single server. When a user requests content, the CDN serves it directly from the cache of a nearby edge server or, if it isn't cached there, retrieves it from the origin server. Edge servers are placed in different locations, often referred to as Points of Presence (PoPs), and store cached copies of static content from the origin server.

CDNs usually cache static content like images, stylesheets, and scripts on edge servers. This minimizes the need to fetch the same content repeatedly from the origin server, reducing latency and improving load times.


Web and Application Servers (Codebase)

After establishing the connection and successfully receiving the user's request, the web server has to deliver the requested web page.

A web server is software, or the hardware it runs on, that plays a central role in delivering web content to users by processing and responding to their requests. It serves as the foundation for hosting websites, applications, and other online services. It responds to client requests over the Hypertext Transfer Protocol (HTTP) or its secure variant, HTTPS. Most of the content delivered by web servers is HTML pages, images, CSS files, etc.

Web servers handle both static and dynamic content. Static content remains unchanged and is directly delivered to clients. Dynamic content is generated on the fly, often by applications running on the server (e.g., content management systems, web frameworks). There are several popular web servers, including Apache HTTP Server, Nginx, Microsoft Internet Information Services (IIS), and others. Each has its strengths and is commonly used in different scenarios.
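
The difference between static and dynamic content is easy to see in a tiny Python WSGI application (WSGI is the standard interface between Python web apps and web servers). The paths `/about.html` and `/time`, the page contents, and the `call` helper are all made up; `call` just feeds the app a fake request so the example runs without opening a network port.

```python
from datetime import datetime, timezone
from wsgiref.util import setup_testing_defaults

# Static content: stored once, returned unchanged on every request.
STATIC_FILES = {"/about.html": b"<h1>About us</h1>"}


def app(environ, start_response):
    path = environ["PATH_INFO"]
    if path in STATIC_FILES:
        body = STATIC_FILES[path]
    elif path == "/time":
        # Dynamic content: built fresh for each request.
        now = datetime.now(timezone.utc).isoformat()
        body = f"<p>Server time: {now}</p>".encode()
    else:
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not found"]
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [body]


def call(path):
    """Send a fake request to the app and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


print(call("/about.html"))
print(call("/time")[0])
print(call("/missing")[0])
```

To try it in a browser, the standard library's `wsgiref.simple_server.make_server("", 8000, app).serve_forever()` would serve the same app locally on port 8000.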

An Application Server is a software framework or platform that provides an environment for running and managing applications. Its primary purpose is to host, deploy, and execute applications, handling the business logic and facilitating communication between the application and other components such as databases, web servers, and clients.

In modern web applications, a common approach involves the combined use of web servers and application servers to efficiently handle various elements of the application stack. This cooperative strategy is referred to as either the two-tier architecture or, when a distinct database server is part of the setup, the three-tier architecture.


Database: Data storage and retrieval

Hold on..., where is all this data coming from? Is the server generating it?

Now I’ll walk you through the last stage in our simple web infrastructure: the Database Management System. A Database Management System (DBMS) is software that provides an interface for managing and interacting with databases. It facilitates the creation, organization, retrieval, updating, and management of data in a structured manner. The DBMS acts as an intermediary between the database and the users or applications, offering a systematic way to store and retrieve data. One widely used example of a Database Management System is Microsoft SQL Server. Others are Oracle Database, MySQL, PostgreSQL, MongoDB (a NoSQL database management system), SQLite, Redis, etc.
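
To see creation, storage, updating, and retrieval in action, here is a short Python sketch using SQLite, which ships with Python's standard library. The `users` table and its rows are invented sample data, and the database lives in memory, so nothing is written to disk.

```python
import sqlite3

# An in-memory database: it disappears when the connection closes.
conn = sqlite3.connect(":memory:")

# Creation: define the structure of the data.
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT UNIQUE)"
)

# Storage: insert rows, using placeholders instead of string formatting.
conn.executemany(
    "INSERT INTO users (name, email) VALUES (?, ?)",
    [("Ada", "ada@example.com"), ("Linus", "linus@example.com")],
)

# Updating: change an existing row.
conn.execute(
    "UPDATE users SET name = ? WHERE email = ?", ("Ada L.", "ada@example.com")
)

# Retrieval: read the data back.
rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
print(rows)

conn.close()
```

An application server would run queries like these on behalf of each request and turn the rows into the page or JSON that goes back to the browser.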

Image illustrating how the web server, application server, and database work together.


Summary

A simple web infrastructure typically includes the fundamental components necessary for hosting and delivering web content.

Upon entering a URL, your computer initiates a DNS request to a DNS server, seeking the resolution of the IP address linked to the domain name. Once the IP address is acquired, your computer establishes a connection with the website server, and the exchange of data between your computer and the website server adheres to the TCP/IP protocols.

The website server uses web server software to deliver the requested web page and, when necessary, an application server to execute server-side code. To distribute traffic, thwart unauthorized access, and ensure secure data transmission, the website may use a load balancer, a firewall, and HTTPS (SSL/TLS) encryption.

In the broader context, web applications commonly rely on a database for the storage and retrieval of data. Additionally, a Content Delivery Network (CDN) plays a crucial role in optimizing content delivery. The CDN strategically caches and distributes static content, such as images and scripts, across a network of edge servers. This reduces latency by serving content from servers closer to the user, enhancing overall performance and providing a better user experience.

In conclusion, the web infrastructure is a sophisticated and interconnected system, made up of components like DNS, TCP/IP, web servers, application servers, security measures, databases, and CDNs, all working together to ensure the seamless delivery of web content to users globally.

Illustration showing all the stages when you visit a URL in the web browser

