What happens when you type google.com in your browser and press Enter
Paul Ajeigbe
Passionate Software/Web Developer with expertise in HTML, CSS, JavaScript, React, MongoDB, Express, Node, and UX/UI design. Dedicated to user-centric solutions and innovation. Volunteer and Tech Coach. Let's connect!
Introduction
When you type a URL into your browser and press Enter, a complex series of events takes place in the background before the webpage is loaded on your screen. This process involves multiple parties, including your computer, the Domain Name System (DNS), the web server hosting the website, and the browser itself. In this blog post, we'll explore each step in detail to understand what happens when you type https://www.google.com in your browser and press Enter.
URL parsing
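Before any network traffic is sent, the browser parses the URL to work out what it is being asked to fetch. A URL includes a protocol identifier (the scheme, such as HTTP or HTTPS for web traffic) and a resource name, such as [www.microsoft.com](https://www.microsoft.com/), optionally followed by a port, a path, and a query string. Traffic management and load-balancing products also parse URLs to determine how to forward traffic across different links or to different servers.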
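As a rough illustration of what URL parsing produces, here is a minimal sketch using Python's standard `urllib.parse` module. The helper name `describe_url` and the default-port table are assumptions made for this example; a real browser's parser handles many more cases.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this post.
DEFAULT_PORTS = {"http": 80, "https": 443}


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a client needs before connecting."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,                 # protocol identifier
        "host": parts.hostname,                 # resource name to resolve via DNS
        "port": parts.port or DEFAULT_PORTS.get(parts.scheme),
        "path": parts.path or "/",              # an empty path means the site root
        "query": parts.query,
    }


print(describe_url("https://www.google.com"))
```

The host is what gets handed to DNS in the next step, while the scheme decides whether the connection is wrapped in TLS.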
DNS Lookup
When you type a URL into your browser, the first thing that happens is a Domain Name System (DNS) lookup. The DNS is like a phone book for the internet, translating human-readable domain names like google.com into machine-readable IP addresses like 216.58.194.174. This is important because computers communicate with each other using IP addresses, not domain names.
The DNS lookup process involves several steps:
DNS Lookup journey
So, what exactly happened after I hit the Enter key?
1. My **browser** looks at its DNS cache to see if it has been there before and knows the IP address mapped to the domain. Let's say it doesn't.
2. My **computer** also looks into its local DNS cache to see if it knows an IP address mapped to that domain. Nope.
3. My home **router** comes in to try its local DNS cache. Still no luck.
4. We go out to ask my **ISP's DNS server** if that domain is in its cache. It is not there, but that recursive DNS server (the resolver) can resolve it for us.
5. The resolver asks the **root name servers** about the domain name. Let's say it is *example.com*.
6. Root name servers know all the **TLD (top-level domain) name servers**. Since we came with a *.com* domain, the root server refers the resolver to a TLD name server that handles *.com* domains.
7. The .com TLD name server knows the **authoritative name server** that stores the DNS records for the domain *example.com*, so it refers the resolver there.
8. The authoritative name server responds with an *A record* (an address record, which holds an IP address) mapped to the domain name.
9. The IP address is then passed all the way back to our browser. Every hop in between that has a DNS cache stores it on the way, so the next time we or someone else asks about example.com, the answer comes faster.
10. The browser opens a TCP connection to the IP address, which is the address of the server hosting *example.com*, then sends an HTTP request. If the server is up and running, it sends an HTTP response back to our browser.
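The cache-by-cache lookup described in the steps above can be sketched as a small simulation. This is not a real DNS client: the cache labels, the `resolve` helper, and the address `192.0.2.10` (a documentation-only address) are all hypothetical, and the root and TLD referrals are collapsed into a single authoritative lookup.

```python
# Hypothetical stand-in for the authoritative name server's A records.
AUTHORITATIVE_RECORDS = {"example.com": "192.0.2.10"}


def resolve(name, caches, authoritative):
    """Check each cache from nearest to farthest, then the authoritative data.

    `caches` is a list of (label, dict) pairs ordered browser -> OS -> router -> ISP.
    Returns (ip, source), or (None, "not found") when nothing knows the name.
    """
    for index, (label, cache) in enumerate(caches):
        if name in cache:
            ip, source = cache[name], label
            missed = caches[:index]          # only nearer caches missed
            break
    else:
        ip = authoritative.get(name)
        if ip is None:
            return None, "not found"
        source = "authoritative name server"
        missed = caches                      # every cache missed
    # On the way back, each cache that missed stores the answer.
    for _label, cache in missed:
        cache[name] = ip
    return ip, source


caches = [
    ("browser cache", {}),
    ("OS cache", {}),
    ("router cache", {}),
    ("ISP resolver cache", {}),
]
print(resolve("example.com", caches, AUTHORITATIVE_RECORDS))  # first lookup walks the full chain
print(resolve("example.com", caches, AUTHORITATIVE_RECORDS))  # second lookup is answered from a cache
```

Real caches also expire entries after the record's time-to-live (TTL), which this sketch leaves out.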
TCP / IP
TCP/IP process
Establish a connection
Once your computer has the IP address for the website, it establishes a connection with the web server hosting the website using the Transmission Control Protocol (TCP). TCP is a reliable protocol that ensures that packets of data are transmitted and received correctly and in order.
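The connection process involves a three-way handshake:
1. Your computer sends a SYN (synchronize) segment to the web server.
2. The web server replies with a SYN-ACK (synchronize-acknowledge) segment.
3. Your computer answers with an ACK (acknowledge) segment.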
Once the connection is established, your computer and the web server can begin to communicate.
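To make the connection step concrete, here is a small sketch that opens a TCP connection on the local machine using Python's standard `socket` module. The operating system performs the handshake when the client connects; the function names and the tiny uppercase-echo server are assumptions made for the example, not part of any real web server.

```python
import socket
import threading


def run_echo_once(server_sock: socket.socket) -> None:
    """Accept one connection and send the received bytes back in upper case."""
    conn, _addr = server_sock.accept()      # returns once the OS has completed the handshake
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def tcp_round_trip(message: bytes) -> bytes:
    """Start a loopback server, connect to it over TCP, and exchange one message."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server_sock:
        server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        server_sock.listen(1)
        port = server_sock.getsockname()[1]

        worker = threading.Thread(target=run_echo_once, args=(server_sock,))
        worker.start()

        # create_connection triggers the TCP handshake with the listening socket.
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(message)
            reply = client.recv(1024)
        worker.join()
    return reply


print(tcp_round_trip(b"hello"))
```

A browser does the same thing against the server's IP address on port 80 or 443, and then sends its HTTP request over the established connection.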
SSL
SSL handshake
How do SSL certificates work?
SSL (and its successor, TLS) works by ensuring that any data transferred between users and websites, or between two systems, cannot be read by third parties. It uses encryption algorithms to scramble data in transit, which prevents attackers from reading it as it is sent over the connection. This data includes potentially sensitive information such as names, addresses, credit card numbers, or other financial details.
The process works like this:
1. A browser or server attempts to connect to a website (i.e., a web server) secured with SSL.
2. The browser or server requests that the web server identify itself.
3. The web server sends the browser or server a copy of its SSL certificate in response.
4. The browser or server checks whether it trusts the SSL certificate. If it does, it signals this to the web server.
5. The web server then returns a digitally signed acknowledgment to start an SSL-encrypted session.
6. Encrypted data is shared between the browser or server and the web server.
HTTPS
HTTPS requires a TLS certificate to be installed on your server. Certificates can be used with different protocols, such as HTTP (web), SMTP (email), and FTP. An SSL or TLS certificate is tied to a randomly generated key pair: the certificate contains the public key and is sent to the client for verification, while the private key stays on your server and is used to prove the server's identity and to establish the encrypted session.
HTTP on its own is an unencrypted protocol, but when it is paired with TLS (Transport Layer Security) the traffic becomes encrypted.
HTTPS Stack
When your browser connects to an HTTPS server, the server will answer with its certificate. The browser checks if the certificate is valid:
1. The owner information needs to match the server name that the user requested.
2. The certificate needs to be signed by a trusted certification authority.
If one of these conditions is not met, the user is informed about the problem.
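The two certificate checks above map onto settings in Python's standard `ssl` module. The sketch below is only an illustration of a client that enforces them, not how any particular browser is implemented; `make_verifying_context` and `fetch_peer_subject` are hypothetical helper names, and the second function needs network access, so it is defined here but not called.

```python
import socket
import ssl


def make_verifying_context() -> ssl.SSLContext:
    """Build a client context that enforces both certificate checks."""
    # create_default_context() loads the system's trusted certification
    # authorities, requires a valid certificate, and checks the host name.
    return ssl.create_default_context()


def fetch_peer_subject(host: str, port: int = 443):
    """Connect over TLS and return the subject of the server's certificate.

    Raises ssl.SSLCertVerificationError if either check fails.
    """
    context = make_verifying_context()
    with socket.create_connection((host, port), timeout=5) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert().get("subject")


context = make_verifying_context()
print(context.check_hostname)                    # owner information must match the requested name
print(context.verify_mode == ssl.CERT_REQUIRED)  # certificate must chain to a trusted authority
```

Calling `fetch_peer_subject("www.google.com")` on a machine with internet access would perform the handshake and return the certificate's subject fields.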
When HTTPS is used, a handshake takes place before any page data is sent.
The client opens the exchange with a hello message to the server. The server responds with its own hello message and its certificate, which the client uses to verify that it is talking to the desired server.
The next step is to agree on a cipher and exchange encryption keys.
At this point the communication becomes encrypted, and the request for the page can proceed. The handshake steps take place in a matter of milliseconds.
Load balancer
Load balancing
As demand for an organization's applications grows, the load balancer plays the role of the traffic cop in the network, deciding which servers can handle that traffic. This [traffic management](https://dzone.com/articles/load-balancers-and-high-volume-traffic-management-1#:~:text=Load%20balancers%2C%20also%20referred%20to,performance%20of%20websites%20and%20applications.&text=Load%20balancers%20are%20responsible%20for%20the%20traffic%20distribution.) is intended to deliver a good user experience. Load balancers [monitor the health](https://blogs.tensult.com/2020/01/29/how-to-configure-verify-and-update-health-checks-of-classic-load-balancer/) of web servers and backend servers to ensure they can handle requests. If necessary, they remove unhealthy servers from the pool until they are restored. Some even trigger the creation of new virtualized application servers to cope with increased demand and maintain response times. The most effective load balancers operate with workloads across multiple environments (on-premises and cloud) and diverse infrastructures (bare metal servers, VMs, and containers).
A load balancer is the intermediary responsible for handling this traffic-splitting work. It can be software that runs either on the same server that hosts the web content or on a server of its own. One common and free software load balancer is [HAProxy](https://www.haproxy.org/).
HAProxy
HTTP request traffic is split up by a program such as HAProxy according to a load-balancing algorithm. There are various load-balancing algorithms, each with its own advantages and disadvantages. One example is round-robin load balancing, which sends requests to servers in turn, cycling through the list. Another is least connections, which sends a new request to the server currently handling the fewest connections. You can read about more load-balancing algorithms (https://devcentral.f5.com/articles/intro-to-load-balancing-for-developers-ndash-the-algorithms).
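The two algorithms mentioned above can be sketched in a few lines of Python. This is a toy model for illustration only, not how HAProxy is implemented; the class names and the server names `web-01` and `web-02` are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in turn, wrapping around at the end of the list."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        return next(self._servers)


class LeastConnectionsBalancer:
    """Send each new request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        """Call when a request finishes so the connection count stays accurate."""
        self.active[server] -= 1


round_robin = RoundRobinBalancer(["web-01", "web-02"])
print([round_robin.pick() for _ in range(3)])

least_conn = LeastConnectionsBalancer(["web-01", "web-02"])
first = least_conn.pick()
second = least_conn.pick()
least_conn.release(second)        # the second server finishes its request
print(first, second, least_conn.pick())
```

Round robin is simple and works well when requests cost about the same, while least connections adapts better when some requests take much longer than others.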
Firewall
Firewalls are software or hardware that work as a filtration system for the data attempting to enter your computer or network. [Firewalls scan packets for malicious code](https://www.n-able.com/blog/malware-analysis-steps) or attack vectors that have already been identified as established threats. Should a data packet be flagged and determined to be a security risk, the firewall prevents it from entering the network or reaching your computer.
Firewalls in load balancing configuration are sandwiched between Server Load Balancing systems. Traffic from the Internet is directed to one firewall within a group of firewalls. Traffic from the organization's internetwork is distributed in a similar fashion.
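The packet filtering described in this section can be sketched as a rule check applied to each incoming packet. This is a deliberately simplified model: real firewalls inspect far more than a source address and a port, and the `Packet` type, the blocked network, and the allowed ports are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    source_ip: str
    dest_port: int


# Hypothetical rule set: one known-bad network and the two web ports.
BLOCKED_NETWORKS = [ip_network("203.0.113.0/24")]
ALLOWED_PORTS = {80, 443}


def allow(packet: Packet) -> bool:
    """Return True if the packet may enter the network."""
    source = ip_address(packet.source_ip)
    if any(source in network for network in BLOCKED_NETWORKS):
        return False                      # sender matches an established threat
    return packet.dest_port in ALLOWED_PORTS


print(allow(Packet("198.51.100.7", 443)))   # ordinary HTTPS traffic
print(allow(Packet("203.0.113.9", 443)))    # sender is on the block list
print(allow(Packet("198.51.100.7", 22)))    # port is not open to the internet
```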
Host server
A hosting server is a generic term for a type of server that hosts or houses websites and/or related data, applications, and services. It is a remotely accessible Internet server with complete Web server functionality and resources.
A hosting server is also known as a Web hosting server.
LAMP stack
- **L** (Linux): the operating system on which the host server runs. Pick your favorite distribution.
- **A** (Apache): the HTTP web server. This is the software that handles HTTP request/response messages and ultimately delivers the static web page. Apache is one of the most common HTTP web servers, although others such as Nginx are equally capable.
- **M** (MySQL): the database server. This is the database software, typically SQL-based, that stores information such as user accounts. MySQL is a free and popular one, but again, any database software works. A typical website will be configured with multiple database servers, with one configured as a "primary" database having exclusive write privileges, whose changes are echoed out to "replica" databases that only have read privileges. This setup is referred to as a "primary-replica" model.
- **P** (PHP/Python): the application server. Web servers are fine for delivering static, unchanging web pages, but lack the capability of producing the dynamic content crucial to modern sites. PHP and Python are two high-level languages supported by web servers that can handle dynamic content; other options include JavaScript and Ruby.
Delivery of a web page works as follows:
- A GET request is received by the web server. The web server pulls up the file configured at the given location (in our example, the HTML file configured at the root (`/`) of the site).
- If the file contains dynamic content, the application server is run (i.e., the corresponding Python scripts are run). The result of these scripts is inserted into the web page.
- If the dynamic content involves stored data, the Python scripts query the database server (probably through Python libraries such as [MySQLdb](https://mysqlclient.readthedocs.io/#) or [SQLAlchemy](https://www.sqlalchemy.org/)).
- The web server delivers the web page.
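The delivery steps above can be sketched end to end in a few lines. To keep the example self-contained, it swaps in Python's built-in `sqlite3` module for the MySQL database server and a `string.Template` for the HTML file; the `users` table, the `handle_get` function, and the sample row are hypothetical.

```python
import sqlite3
from string import Template

# Stand-in for the HTML file configured at the root of the site.
PAGE = Template("<html><body><h1>Hello, $name!</h1></body></html>")


def make_database() -> sqlite3.Connection:
    """Create an in-memory database standing in for the MySQL server."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
    db.commit()
    return db


def handle_get(path: str, db: sqlite3.Connection):
    """Return (status_code, body) for a GET request to `path`."""
    if path != "/":
        return 404, "<html><body>Not found</body></html>"
    # Dynamic content: query the stored data, then insert it into the page.
    row = db.execute("SELECT name FROM users WHERE id = ?", (1,)).fetchone()
    return 200, PAGE.substitute(name=row[0])


status, body = handle_get("/", make_database())
print(status)
print(body)
```

In a real LAMP deployment the web server, application code, and database usually run as separate processes or machines, but the order of the steps is the same.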
Page rendering
The first line of the server's HTTP response, the status line, includes a status code indicating whether the request was handled successfully. Upon successful retrieval and delivery of the web page, the host server signals `200`. Other common status codes include `301` (page redirection) and `404` (page not found).
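As a small illustration, the status line can be split into its parts and the code looked up with Python's standard `http.HTTPStatus` enum. The `parse_status_line` helper is a hypothetical name used only for this sketch.

```python
from http import HTTPStatus


def parse_status_line(line: str):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


version, code, reason = parse_status_line("HTTP/1.1 200 OK")
print(version, code, reason)

# The standard reason phrases for the codes mentioned above.
for status_code in (200, 301, 404):
    print(status_code, HTTPStatus(status_code).phrase)
```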