What happens when you type google.com in your browser and press Enter


(https://miro.medium.com/max/1400/1*1NuSjTdxAYYaYiLjeP3avg.jpeg)

Introduction

When you type a URL into your browser and press Enter, a complex series of events takes place in the background before the webpage is loaded on your screen. This process involves multiple parties, including your computer, the Domain Name System (DNS), the web server hosting the website, and the browser itself. In this blog post, we'll explore each step in detail to understand what happens when you type https://www.google.com in your browser and press Enter.

(https://miro.medium.com/max/1276/1*oz1W26rGL77Hso_STfQEXg.png)

URL parsing

Before anything is sent over the network, the browser parses the URL into its parts. A URL includes a protocol identifier (HTTP or HTTPS, for Web traffic) and a resource name, such as [www.microsoft.com](https://www.microsoft.com/), optionally followed by a port, a path, and a query string. URL parsing is also a function of traffic management and load-balancing products, which scan URLs to determine how to forward traffic across different links or to different servers.
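As a rough illustration of the parsing step, the sketch below splits a URL into its components with Python's standard `urllib.parse` module. The helper name `describe_url` and the default-port table are assumptions made for this example; a real browser's parser handles many more cases.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a client needs before it can connect."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,                                   # protocol identifier
        "host": parts.hostname,                                   # name to resolve via DNS
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),    # explicit or default port
        "path": parts.path or "/",                                # empty path means the root
        "query": parts.query,
    }


info = describe_url("https://www.google.com/search?q=dns")
print(info)
```

The `host` value is what gets handed to the DNS lookup, and the `scheme` decides whether a TLS handshake is needed later.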

DNS Lookup

When you type a URL into your browser, the first thing that happens is a Domain Name System (DNS) lookup. The DNS is like a phone book for the internet, translating human-readable domain names like google.com into machine-readable IP addresses like 216.58.194.174. This is important because computers communicate with each other using IP addresses, not domain names.

The DNS lookup process involves several steps:

  • Your computer sends a DNS query to a DNS server, either one provided by your internet service provider (ISP) or a public DNS service like Google DNS or Cloudflare DNS.
  • The DNS server checks its cache to see if it has the IP address for the domain name in its records. If it does, it returns the IP address to your computer.
  • If the DNS server doesn't have the IP address in its cache, it starts at the top of the DNS hierarchy and sends a query to a root DNS server.
  • The root DNS server refers the query to the top-level domain (TLD) DNS server for the domain name (in this case, the .com TLD server).
  • The TLD server refers the query to the authoritative DNS server for the domain name (in this case, Google's DNS server).
  • The authoritative DNS server responds to the query with the IP address for the domain name.
  • The DNS server caches the IP address for future use, and your computer uses it to establish a connection with the web server hosting the website.
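The steps above can be sketched as a toy resolver. Nothing here touches the network: the `ROOT`, `TLD`, and `AUTHORITATIVE` dictionaries and the server labels in them are made-up stand-ins for real name servers, and the `Resolver` class is a hypothetical name used only for this illustration.

```python
# Toy stand-ins for the DNS hierarchy (all labels are illustrative).
ROOT = {"com": "tld-com"}                                 # root knows the TLD servers
TLD = {"tld-com": {"google.com": "ns-google"}}            # TLD knows the authoritative server
AUTHORITATIVE = {"ns-google": {"google.com": "216.58.194.174"}}


class Resolver:
    """A minimal caching resolver that walks root -> TLD -> authoritative."""

    def __init__(self):
        self.cache = {}
        self.upstream_queries = 0

    def resolve(self, name: str) -> str:
        # Step 1: answer from the cache when possible.
        if name in self.cache:
            return self.cache[name]

        # Step 2: ask the root which server handles this TLD.
        tld_label = name.rsplit(".", 1)[-1]
        tld_server = ROOT[tld_label]
        self.upstream_queries += 1

        # Step 3: ask the TLD server for the authoritative server.
        auth_server = TLD[tld_server][name]
        self.upstream_queries += 1

        # Step 4: ask the authoritative server for the address.
        ip = AUTHORITATIVE[auth_server][name]
        self.upstream_queries += 1

        # Step 5: cache the answer for future lookups.
        self.cache[name] = ip
        return ip


resolver = Resolver()
first = resolver.resolve("google.com")
queries_after_first = resolver.upstream_queries
second = resolver.resolve("google.com")   # served from the cache
print(first, queries_after_first, resolver.upstream_queries)
```

The second lookup does not add to `upstream_queries`, which is the whole point of the cache.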

(https://miro.medium.com/max/1400/1*yfKwsrIgreeB02WLsF0dwg.jpeg)

DNS Lookup journey

So, what exactly happened after I hit the Enter key?

1. My **browser** looks at its DNS cache to see if it has been there before and knows the IP address mapped to it. Let's say it doesn't.

2. My **computer** also looks into its local DNS cache to see if it knows an IP address mapped to that domain. Nope.

3. My home **router** comes in to try its local DNS cache. Still no luck.

4. We go out to ask my **ISP's DNS server** if that domain is in its cache. It is not there, but that recursive DNS server can help us resolve it.

5. The resolver goes to ask the **root name servers** about the domain name. Let's say it is *example.com*.

6. Root name servers know all the **TLD (top-level domain) name servers**. Since we came with a *.com* domain, the root refers our query to a TLD name server that handles *.com* domains.

7. The .com TLD name server knows the **authoritative name server** that stores the DNS records for the domain *example.com*, so it refers the query onward.

8. The authoritative name server responds with an *A record* (an address record, which holds an IP address) mapped to the domain name.

9. The IP address is then passed all the way back to our browser. Each hop in between that has a DNS cache stores it on the way, so the next time we or someone else asks about example.com, the answer comes back faster.

10. The browser opens a TCP connection to the IP address, which is the address of the server hosting *example.com*, then sends an HTTP request. If the server is up and running, it sends back an HTTP response to our browser.
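The cache layers in steps 1 to 4 and the caching on the way back in step 9 can be sketched as a chain of dictionaries. This is a simplified model: the function name `lookup`, the layer labels, and the placeholder address `192.0.2.1` (from a range reserved for documentation) are all assumptions for the sake of the example.

```python
def lookup(name, caches, resolve_upstream):
    """Check each cache from nearest to farthest; fall back to full resolution.

    caches is a list of (label, dict) pairs ordered nearest first.
    Returns the IP address and the label of whoever answered.
    """
    for index, (label, cache) in enumerate(caches):
        if name in cache:
            ip = cache[name]
            answered_by = label
            missed = caches[:index]      # nearer layers that did not know the answer
            break
    else:
        # No cache had it, so the recursive resolver walks root -> TLD -> authoritative.
        ip = resolve_upstream(name)
        answered_by = "authoritative"
        missed = caches

    # Every layer that missed stores the answer on the way back.
    for _, cache in missed:
        cache[name] = ip
    return ip, answered_by


layers = [("browser", {}), ("os", {}), ("router", {}), ("isp", {})]
# Placeholder upstream resolver returning a documentation-only address.
fake_upstream = lambda name: "192.0.2.1"

first_ip, first_source = lookup("example.com", layers, fake_upstream)
second_ip, second_source = lookup("example.com", layers, fake_upstream)
print(first_source, second_source)
```

On the first call every layer misses and the answer comes from upstream; on the second call the browser cache answers immediately.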

TCP / IP

![TCP/IP process](https://miro.medium.com/max/868/1*AKwC9h1wKki1_dzdzOXLvA.gif)

Establish a connection

Once your computer has the IP address for the website, it establishes a connection with the web server hosting the website using the Transmission Control Protocol (TCP). TCP is a reliable protocol that ensures that packets of data are transmitted and received correctly and in order.

The connection process involves a three-way handshake:

  • Your computer sends a SYN packet to the web server, indicating that it wants to establish a connection.
  • The web server responds with a SYN-ACK packet, indicating it's willing to establish a connection.
  • Your computer sends an ACK packet, indicating that the connection has been established.

Once the connection is established, your computer and the web server can begin to communicate.
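To make the three packets concrete, here is a small simulation of the handshake. In practice the operating system performs these steps when a program opens a TCP socket; the `Segment` class, the function name, and the initial sequence numbers 100 and 300 are arbitrary choices for this sketch. It also shows one detail the list above leaves out: each side acknowledges the other's sequence number plus one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """A stripped-down TCP segment: only the fields the handshake needs."""
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    # 1. Client -> server: SYN carrying the client's initial sequence number.
    syn = Segment("SYN", seq=client_isn)

    # 2. Server -> client: SYN-ACK with its own sequence number,
    #    acknowledging the client's SYN (which consumes one sequence number).
    syn_ack = Segment("SYN-ACK", seq=server_isn, ack=syn.seq + 1)

    # 3. Client -> server: ACK acknowledging the server's SYN.
    ack = Segment("ACK", seq=syn_ack.ack, ack=syn_ack.seq + 1)

    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
for segment in segments:
    print(segment)
```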

SSL

(https://miro.medium.com/max/1400/1*Y4AzZOlwNgaOrbwSDHibeg.png)

SSL handshake

How do SSL certificates work?

SSL (Secure Sockets Layer), along with its modern successor TLS (Transport Layer Security), works by ensuring that any data transferred between users and websites, or between two systems, cannot be read by third parties. It uses encryption algorithms to scramble data in transit, which prevents attackers from reading it as it is sent over the connection. This data includes potentially sensitive information such as names, addresses, credit card numbers, or other financial details.

The process works like this:

1. A browser or server attempts to connect to a website (i.e., a web server) secured with SSL.

2. The browser or server requests that the web server identifies itself.

3. The web server sends the browser or server a copy of its SSL certificate in the response.

4. The browser or server checks to see whether it trusts the SSL certificate. If it does, it signals this to the web server.

5. The web server then returns a digitally signed acknowledgment to start an SSL-encrypted session.

6. Encrypted data is shared between the browser or server and the web server.

HTTPS

HTTPS requires a TLS certificate to be installed on your server. You can apply certificates to different protocols, like HTTP (web), SMTP (email), and FTP. An SSL or TLS certificate is tied to a randomly generated key pair (public and private): the certificate carries the public key, and the private key stays on your server. The client verifies the certificate and its public key, and the server uses the private key to prove its identity and in the decryption process.

HTTP on its own is unencrypted, but when it is layered on top of TLS (Transport Layer Security) the traffic is encrypted. That combination is what we call HTTPS.

(https://miro.medium.com/max/1400/1*AR83GZhw_6WMuGozCtXmHQ.jpeg)

HTTPS Stack

When your browser connects to an HTTPS server, the server will answer with its certificate. The browser checks if the certificate is valid:

1. the owner information needs to match the server name that the user requested

2. the certificate needs to be signed by a trusted certification authority

If one of these conditions is not met, the user is informed about the problem.
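Both checks can be seen in Python's standard `ssl` module, whose default client context enables them out of the box. The sketch below only inspects the context settings, so it runs without a network connection; the helper `fetch_peer_subject` is a hypothetical name, is not called here, and would need internet access to work.

```python
import socket
import ssl

# The default client context loads the system's trusted certificate authorities.
context = ssl.create_default_context()

# Check 2: the certificate chain must lead to a trusted certification authority.
requires_trusted_ca = context.verify_mode == ssl.CERT_REQUIRED

# Check 1: the name in the certificate must match the requested server name.
checks_server_name = context.check_hostname

print(requires_trusted_ca, checks_server_name)


def fetch_peer_subject(hostname: str, port: int = 443):
    """Open a TLS connection and return the subject of the server certificate.

    If either check fails, wrap_socket raises an ssl.SSLError subclass
    instead of returning a connection.
    """
    with socket.create_connection((hostname, port), timeout=5) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
            return tls_sock.getpeercert().get("subject")
```

A browser does the equivalent of this internally and shows a warning page when verification fails.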

(https://miro.medium.com/max/1400/1*neEbP2ZXFVasPmSPFo_J2w.png)

When HTTPS is used, a series of handshake steps takes place.

The client sends an initial hello message to the server. The server responds with its certificate so the client can verify that it is the desired server.

The next step is to exchange encryption keys and agree on ciphers.

At this point the communication becomes encrypted, and the actual request and response can proceed. The initial handshake steps take place in a matter of milliseconds.


Load Balancer

(https://miro.medium.com/max/1120/1*s2lnov3kDTN3seSpB8BWmQ.png)

Load balancing

As an organization meets the demand for its applications, the load balancer plays the role of the traffic cop in the network, deciding which servers can handle that traffic. This [traffic management](https://dzone.com/articles/load-balancers-and-high-volume-traffic-management-1#:~:text=Load%20balancers%2C%20also%20referred%20to,performance%20of%20websites%20and%20applications.&text=Load%20balancers%20are%20responsible%20for%20the%20traffic%20distribution.) is intended to deliver a good user experience. Load balancers [monitor the health](https://blogs.tensult.com/2020/01/29/how-to-configure-verify-and-update-health-checks-of-classic-load-balancer/) of web servers and backend servers to ensure they can handle requests. If necessary, they remove unhealthy servers from the pool until they are restored. Some even trigger the creation of new virtualized application servers to cope with increased demand and maintain response times. The most effective load balancers operate with workloads across multiple environments (on-premises and cloud) and diverse infrastructures (bare metal servers, VMs, and containers).

A load balancer is an intermediary responsible for handling this traffic-splitting work. A load balancer is software that can be configured either on the same server as that hosting web content or on a server all its own. One common and free load balancer software is [HAProxy](https://www.haproxy.org/).

(https://miro.medium.com/max/1400/1*GCvWoR1uAoF9dEPtf9cWnA.png)

HAProxy

HTTP request traffic is split up by a program such as HAProxy according to a load-balancing algorithm. There are various types of load-balancing algorithms, each with its own advantages and disadvantages. One example is round-robin load balancing, which sends requests to servers in turn according to a queue. Another is least connections, which sends a new request to the server currently handling the fewest connections. You can read about [more load-balancing algorithms](https://devcentral.f5.com/articles/intro-to-load-balancing-for-developers-ndash-the-algorithms).
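The two algorithms named above are short enough to sketch directly. This is an illustration of the ideas only, not how HAProxy implements them; the class names and the server names `web-01` to `web-03` are made up for the example.

```python
import itertools


class RoundRobin:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # On a tie, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a connection to the server is closed.
        self.active[server] -= 1


rr = RoundRobin(["web-01", "web-02", "web-03"])
rr_order = [rr.pick() for _ in range(4)]

lc = LeastConnections(["web-01", "web-02"])
lc_first = lc.pick()       # both idle, so the first server wins the tie
lc_second = lc.pick()      # web-01 is busy, so web-02 is chosen
lc.release("web-02")       # web-02 finishes its request
lc_third = lc.pick()       # web-02 now has the fewest connections

print(rr_order, lc_first, lc_second, lc_third)
```

Round-robin ignores how busy each server is, which keeps it simple; least connections adapts to uneven request durations at the cost of tracking state per server.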

Firewall

(https://miro.medium.com/max/636/1*exUmkkFw6shiin4xJw0-hw.png)

Firewalls are software or hardware that work as a filtration system for the data attempting to enter your computer or network. [Firewalls scan packets for malicious code](https://www.n-able.com/blog/malware-analysis-steps) or attack vectors that have already been identified as established threats. Should a data packet be flagged and determined to be a security risk, the firewall prevents it from entering the network or reaching your computer.

Firewalls in load balancing configuration are sandwiched between Server Load Balancing systems. Traffic from the Internet is directed to one firewall within a group of firewalls. Traffic from the organization's internetwork is distributed in a similar fashion.
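As a much simpler cousin of the packet scanning described above, the sketch below filters traffic by source address and destination port, with a blocklisted network standing in for an already identified threat. The rule table, the function name `decide`, and the addresses (taken from ranges reserved for documentation) are assumptions for illustration; real firewalls inspect far more than these two fields.

```python
import ipaddress

# Ordered rules: (action, source network, destination port or None for any port).
RULES = [
    ("deny", ipaddress.ip_network("203.0.113.0/24"), None),   # known-bad network
    ("allow", ipaddress.ip_network("0.0.0.0/0"), 443),        # HTTPS from anywhere
    ("allow", ipaddress.ip_network("0.0.0.0/0"), 80),         # HTTP from anywhere
]


def decide(src_ip: str, dst_port: int) -> str:
    """Return the action of the first matching rule, denying by default."""
    address = ipaddress.ip_address(src_ip)
    for action, network, port in RULES:
        if address in network and (port is None or port == dst_port):
            return action
    return "deny"


print(decide("203.0.113.7", 443))     # blocklisted source
print(decide("198.51.100.20", 443))   # ordinary HTTPS request
print(decide("198.51.100.20", 22))    # no rule allows this port
```

Rule order matters here: the deny rule comes first so that a blocklisted source cannot slip through the broader allow rules.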

Host server

A hosting server is a generic term for a type of server that hosts or houses websites and/or related data, applications, and services. It is a remotely accessible Internet server with complete Web server functionality and resources.

A hosting server is also known as a Web hosting server.

(https://miro.medium.com/max/518/1*LJOL10wI320kST6SBI0njQ.png)

LAMP stack

- **L** --- Linux --- the operating system on which the host server runs. Pick your favorite distribution.

- **A** --- Apache --- the HTTP web server. This is the software that handles HTTP request/response messages and ultimately delivers the static web page. Apache is one of the most common HTTP web servers, although others such as Nginx are equally capable.

- **M** --- MySQL --- the database server. This is the database software, typically SQL-based, that stores information such as user accounts. MySQL is a free and popular one, but again, any database software works. A typical website will be configured with multiple database servers, with one configured as a "primary" database having exclusive write privileges whose changes are echoed out to "replica" databases having only read privileges. This setup is referred to as a "primary-replica" model.

- **P** --- PHP/Python --- the application server. Web servers are fine for delivering static, unchanging web pages, but lack the capability of producing the dynamic content crucial to modern sites. PHP and Python are two high-level languages supported by web servers that can handle dynamic content, but other options include JavaScript and Ruby.

Delivery of a web page works as follows:

- A GET request is received by the web server. The web server pulls up the file configured at the given location (in our example, the HTML file configured at the root (`/`) of the machine).

- If the file contains dynamic content, the application server is run (i.e., the corresponding Python scripts are run). The result of these scripts is inserted into the web page.

- If the dynamic content involves stored data, the Python scripts query the database server (probably through Python libraries such as [MySQLdb](https://mysqlclient.readthedocs.io/#) or [SQLAlchemy](https://www.sqlalchemy.org/)).

- The web server delivers the web page.
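The delivery steps above can be sketched in a few lines. To keep the example self-contained, it swaps MySQL for Python's built-in `sqlite3` with an in-memory database, and the `PAGES` table, the `users` table, and the function names are all invented for this illustration.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL database server (an assumption
# made so the sketch runs without any external service).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT)")
db.execute("INSERT INTO users VALUES ('ada')")

# "Files" the web server is configured to serve, keyed by request path.
PAGES = {
    "/": "<h1>Hello, {user_count} user(s)</h1>",   # contains dynamic content
    "/about": "<h1>About</h1>",                     # purely static
}


def render_dynamic(template: str) -> str:
    """Application-server step: query stored data and fill in the page."""
    (count,) = db.execute("SELECT COUNT(*) FROM users").fetchone()
    return template.format(user_count=count)


def handle_get(path: str):
    """Web-server step: find the page, run dynamic code if needed, deliver it."""
    template = PAGES.get(path)
    if template is None:
        return 404, "<h1>Not Found</h1>"
    if "{" in template:
        return 200, render_dynamic(template)
    return 200, template


print(handle_get("/"))
print(handle_get("/about"))
print(handle_get("/missing"))
```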

Page rendering

The initial status line of this response message includes a status code indicating the success of the handled request. Upon successful retrieval and delivery of the web page, the host server signals `200`. Other common status codes include `301` (page redirection) and `404` (page not found).
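For reference, the three status codes mentioned above map to standard reason phrases, which Python exposes through `http.HTTPStatus`. The helper `status_line` is a hypothetical name, and the `HTTP/1.1` version in it is an assumption for the example.

```python
from http import HTTPStatus


def status_line(code: int) -> str:
    """Build the first line of an HTTP/1.1 response for a given status code."""
    return f"HTTP/1.1 {code} {HTTPStatus(code).phrase}"


lines = [status_line(code) for code in (200, 301, 404)]
for line in lines:
    print(line)
```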

