Demystifying TCP …

TCP is one of the most widely used protocols on the internet, and also one of the most complex. It is the product of research spanning more than 20 years.

TCP was designed for reliable data delivery over a lossy network.

Need for TCP

TCP peers (the TCP sending and receiving processes) may reside anywhere on the network, separated by tens of intermediate routers. Those routers introduce problems of their own, such as packet loss and slow forwarding, and the network itself (its bandwidth, for example) limits the rate of communication. In a nutshell, any number of factors can disrupt the data flow between the TCP sender and the TCP receiver. The network is like an open ocean: anything can happen at any time! Suppose you are downloading a 1 GB software file from a server. The loss of even 1 byte in the network would leave the entire downloaded copy useless. These challenges required the engineering community to design a protocol that could deal with the uncertainty of the open network, and that is why TCP was born.

Challenges the TCP designers and researchers faced:

1. How does the receiver detect that a packet is malformed?

2. How can the sender determine whether the receiver has received a packet?

3. How long should the sender wait for an ACK from the receiver?

4. What if the ACK itself is lost?

5. How does the receiver cope when packets arrive out of sequence?

6. What if the receiver is slower than the sender, or the receiver receives duplicate copies of a packet?

7. What if the network itself is slow, or recovers only over a period of time?

8. At what rate should the sender send packets to the receiver?

TCP was not built overnight: it is the outcome of around 20 years of research, documented in hundreds of research papers and RFCs.

Let us touch on some of the fundamental features of TCP in the remainder of this article.

TCP is a Connection-Oriented Protocol

This is the least understood feature of TCP. A connection does not mean that a dedicated wire is laid between the TCP sender and receiver. It simply means that the communicating parties know each other and keep complete track of how much data has already been exchanged, how much is pending, how much is yet to be sent, and how much can be sent at the moment. TCP on both sides maintains data structures and updates them as data arrives from the other end. Remember that between the two communicating parties lies the insecure, unreliable, lossy internet. TCP creates the illusion that the peers are communicating directly, as if nothing (internet, routers, switches, etc.) stood between them. TCP acts immediately to neutralize, or adjust to, any adverse effect of the open, unreliable network between the peers. Before any data is exchanged, TCP performs a 3-way handshake with the peer to make sure the connection is fully established. The 3-way handshake ensures that the connection is duplex and that both parties are willing to communicate with each other; in other words, that it is not one-sided love.
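The 3-way handshake can be pictured with a toy simulation. This is only a sketch, not real TCP code: the `Segment` class and the `three_way_handshake` function are hypothetical names, and the initial sequence numbers are passed in by hand for readability, whereas real stacks pick them unpredictably.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    """A toy TCP segment carrying only the fields the handshake needs."""
    flags: str            # "SYN", "SYN+ACK" or "ACK"
    seq: int              # sequence number of this segment
    ack: Optional[int]    # next byte expected from the peer (None on the first SYN)


def three_way_handshake(client_isn: int, server_isn: int) -> List[Segment]:
    """Return the three segments exchanged to open a connection."""
    # 1. Client -> Server: "I want to talk, my numbering starts at client_isn".
    syn = Segment("SYN", seq=client_isn, ack=None)
    # 2. Server -> Client: acknowledges the SYN (a SYN consumes one sequence
    #    number) and announces the server's own starting number.
    syn_ack = Segment("SYN+ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. Client -> Server: acknowledges the server's SYN. Both directions are
    #    now synchronized, which is what makes the connection duplex.
    ack = Segment("ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


if __name__ == "__main__":
    for segment in three_way_handshake(client_isn=1000, server_isn=5000):
        print(segment)
```

After the third segment, each side knows the other's starting sequence number, and that shared state is all a "connection" really is.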

TCP is a Byte-Oriented Protocol

Each TCP peer keeps track of the data exchanged over the connection at the byte level. That is, the status of each byte is recorded: sent, yet to be sent, acknowledged, yet to be acknowledged, resent, and so on. Secondly, TCP does not know message boundaries; it sees application data as a stream of bytes. If the application sends two different messages one after the other, TCP cannot know where the first message ends and where the second begins. For example, if the application sends “Hello Abhishek”, for TCP it is just a sequence of 14 bytes (including the space). TCP can deliver the message to the recipient application as one unified message, “Hello Abhishek”, or broken into several parts:

[Hell] [o Abhishek] (two parts)

[He] [llo Abhi] [she] [k] (four parts), and so on.

An application running on top of TCP must not assume that it will receive messages of fixed sizes. TCP guarantees in-order delivery of bytes to the recipient application, but it does not guarantee in how many parts a message will be delivered. When the receiving TCP decides to deliver data to the application depends on many factors, such as the rate at which data arrives, the time elapsed since the last byte was received, the average RTT, and so on.
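Because TCP hands over a byte stream, the application has to mark message boundaries itself. One common approach, which this article does not prescribe, is to prefix every message with its length. The sketch below assumes a 4-byte big-endian length header; `frame` and `FrameDecoder` are hypothetical names chosen for illustration.

```python
import struct
from typing import List

HEADER = struct.Struct("!I")  # assumed framing: 4-byte big-endian length prefix


def frame(message: bytes) -> bytes:
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class FrameDecoder:
    """Rebuilds whole messages from arbitrarily split chunks of a byte stream."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes; return every message that is now complete."""
        self._buffer += chunk
        messages = []
        while len(self._buffer) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buffer)
            end = HEADER.size + length
            if len(self._buffer) < end:
                break  # the message body has not fully arrived yet
            messages.append(bytes(self._buffer[HEADER.size:end]))
            del self._buffer[:end]
        return messages


if __name__ == "__main__":
    stream = frame(b"Hello Abhishek") + frame(b"Bye")
    decoder = FrameDecoder()
    # Deliver the stream in uneven pieces, as TCP is free to do.
    for chunk in (stream[:3], stream[3:10], stream[10:20], stream[20:]):
        for message in decoder.feed(chunk):
            print(message)
```

However the stream is cut, the decoder returns the two original messages intact and in order.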

Reliable Data Delivery

The biggest strength and one of the most prominent features of TCP is its ability to deliver data reliably. The TCP sender retransmits a data segment if it detects that the segment was lost in the network. To detect segment loss, TCP uses a system of acknowledgements (ACKs): the TCP receiver sends an acknowledgment back to the TCP sender for the segments it receives. Once it has received the ACK, the sender knows that the segment reached the receiving TCP successfully; if no ACK arrives in time, the sender assumes the segment was lost and sends it again.
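The ACK-and-retransmit idea can be sketched with a heavily simplified model that sends one segment at a time and waits for its ACK. Real TCP keeps many segments in flight and uses cumulative ACKs and timers, so treat this only as an illustration. The function name `send_reliably` and the `lost_attempts` parameter are hypothetical, and the sketch assumes ACKs themselves are never lost.

```python
from typing import Iterable, List, Tuple


def send_reliably(segments: List[bytes],
                  lost_attempts: Iterable[int]) -> Tuple[List[bytes], int]:
    """Deliver segments in order over a channel that drops some transmissions.

    lost_attempts holds the 0-based numbers of the transmission attempts that
    the simulated network drops. Returns the data the receiver got and the
    total number of transmissions the sender needed.
    """
    lost = set(lost_attempts)
    received: List[bytes] = []
    attempt = 0
    for segment in segments:
        acked = False
        while not acked:
            dropped = attempt in lost
            attempt += 1
            if dropped:
                # No ACK comes back. In real TCP the retransmission timer
                # would expire here, so the sender transmits the segment again.
                continue
            received.append(segment)  # the receiver gets the segment...
            acked = True              # ...and its ACK reaches the sender
    return received, attempt


if __name__ == "__main__":
    data, transmissions = send_reliably([b"a", b"b", b"c"], lost_attempts={1, 2})
    print(data, transmissions)
```

In the usage example the second segment is dropped twice, so three segments cost five transmissions, yet the receiver still ends up with all of them in order.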

TCP Data Flow and Window Management

As I said earlier, TCP keeps track of each byte of data. It does this through window management and by stamping each byte with an increasing numeric identifier called a sequence number (a 32-bit value that eventually wraps around). TCP is a sliding window protocol.

Both the TCP sender and the TCP receiver maintain a window, called the send window and the recv window respectively. Since a TCP connection is duplex by default, the sender is also a receiver and the receiver is also a sender; hence each party has both a send window and a recv window, one for each direction. The windows are nothing but circular byte buffers. The send window of the sender is paired with the recv window of the receiver.

The TCP send and recv windows can each be divided into compartments. The bracketed ranges below are example byte numbers.

The TCP send window is logically partitioned into 4 compartments:

  1. Bytes sent and acknowledged [28, 30]
  2. Bytes sent but not yet acknowledged [31, 33]
  3. Bytes not sent, but the receiver is ready to receive them [34, 36]
  4. Bytes not sent, and the receiver is not yet ready to receive them [37, 43]
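The four compartments can be told apart with three numbers. The sketch below is an illustration under assumptions, not the layout of any particular stack: `classify_send_byte` is a hypothetical helper, the variable names loosely follow the SND.UNA and SND.NXT variables of the TCP specification, and sequence-number wraparound is ignored.

```python
def classify_send_byte(seq: int, snd_una: int, snd_nxt: int, window: int) -> str:
    """Name the send-window compartment that byte number seq falls into.

    snd_una: oldest byte sent but not yet acknowledged
    snd_nxt: next byte to be sent
    window:  bytes the receiver allows in flight, counted from snd_una
    """
    if seq < snd_una:
        return "sent and acknowledged"
    if seq < snd_nxt:
        return "sent, not yet acknowledged"
    if seq < snd_una + window:
        return "not sent, receiver ready"
    return "not sent, receiver not ready"


if __name__ == "__main__":
    # Values chosen to reproduce the example byte ranges listed above.
    for seq in (28, 30, 31, 33, 34, 36, 37, 43):
        print(seq, classify_send_byte(seq, snd_una=31, snd_nxt=34, window=6))
```

When an ACK arrives, `snd_una` moves forward, and that is exactly what "sliding the window" means: bytes leave compartment 2 for compartment 1 and room opens up at the far edge.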

Similarly, the recv window is logically partitioned into 3 compartments:

  1. Bytes received and acknowledged [29, 30]
  2. Bytes not yet received, but which the sender is permitted to send [31, 36]
  3. Bytes not yet received, and which the sender is not permitted to send [37, 43]

Certain rules govern how the send and receive windows of the TCP peers slide. I will not go into the specifics of those rules here, but the essence is this: TCP keeps track of each byte of data on both the sending and the receiving side using this window management system. Both windows slide to the right as data is successfully delivered to the recipient. As I said, the windows are implemented as circular byte queues. I will cover the window sliding rules in a separate article; this article aims to look at TCP from a higher level.

TCP Flow and Congestion Control

The last and one of the most essential features of TCP is its ability to control congestion in the network. TCP can detect when the network is congested, and once it does, it takes immediate steps to mitigate the congestion. At the very least, TCP itself should not add to congestion that has arisen for other reasons.

Congestion could arise for two reasons:

1. The TCP receiver machine is slow or overwhelmed, so it cannot process data at the rate at which the TCP sender is sending.

2. The network itself is congested due to other factors: devices throttling the network bandwidth, a lossy network, low-bandwidth links, slow intermediate machines, etc.

TCP employs different strategies to deal with slow networks and slow TCP receivers. When TCP throttles the sender because the TCP recipient is slow, the procedures are referred to as flow control. When TCP throttles the sender because the network is slow, the procedures are referred to as congestion control.

How Flow Control is Implemented:

The TCP receiver advertises the size of its recv window in every ACK that it sends to the TCP sender. On receiving this advertisement, the TCP sender sets the size of its send window to the advertised value. By definition, the send window determines the number of bytes the TCP sender can send in one go, that is, without waiting for an acknowledgment. Thus the TCP receiver controls the size of the TCP sender’s send window, which in turn controls the rate at which the TCP sender can send data to the receiver. This is called window-based flow control.

An overwhelmed TCP receiver reduces its recv window size and advertises the reduced size in its ACKs to the TCP sender, which makes the sender slow down. Both peers also advertise the size of their respective recv windows to each other during the connection establishment phase (the three-way handshake). Thus flow control procedures are triggered by feedback from the TCP receiver.
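Window-based flow control can be sketched as a round-by-round simulation. This is a toy model under stated assumptions: `simulate_flow_control` and its parameters are hypothetical names, one "round" stands for one send-then-ACK exchange, and the receiving application is assumed to drain a fixed number of bytes per round.

```python
from typing import List


def simulate_flow_control(total_bytes: int, buffer_size: int,
                          app_read_per_round: int) -> List[int]:
    """Return how many bytes the sender was allowed to send in each round.

    Each round the sender sends up to the last advertised window, then the
    receiving application drains some bytes and the receiver advertises the
    free space left in its buffer.
    """
    if buffer_size <= 0 or app_read_per_round <= 0:
        raise ValueError("buffer_size and app_read_per_round must be positive")
    buffered = 0                  # bytes waiting in the receiver's buffer
    advertised = buffer_size      # window announced during the handshake
    remaining = total_bytes
    sent_per_round: List[int] = []
    while remaining > 0:
        burst = min(remaining, advertised)   # never exceed the advertisement
        remaining -= burst
        buffered += burst
        sent_per_round.append(burst)
        buffered -= min(buffered, app_read_per_round)  # the application reads
        advertised = buffer_size - buffered  # carried back in the next ACK
    return sent_per_round


if __name__ == "__main__":
    print(simulate_flow_control(10_000, buffer_size=4_000, app_read_per_round=1_000))
    print(simulate_flow_control(10_000, buffer_size=4_000, app_read_per_round=4_000))
```

With a slow application the sender is throttled to the application's reading rate after the first burst; with a fast one it can keep sending full windows.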

How Congestion Control is Implemented:

Congestion Control Procedures (CCP) are triggered by the sender without any explicit assistance or feedback from the TCP receiver. Whereas flow control deals with slow receivers and is driven by the receiver, congestion control deals with slow networks and is driven by the TCP sender alone. Without CCP, a slow network would drop packets, which would only trigger the TCP sender to retransmit the lost segments, making the situation even worse. CCP enables the TCP sender to adapt itself to the ever-changing state of the network.

TCP congestion control procedures involve two algorithms:

1. Slow Start

Goal: to determine the maximum rate at which the TCP sender can inject segments into the network without experiencing packet loss.

The slow start algorithm is triggered on the TCP sender side when:

  1. A new connection has just been established
  2. The retransmission timeout (RTO) for a data segment expires (packet loss)
  3. The TCP sender has not sent any data and has stayed idle for some time

2. Congestion Avoidance

TCP constantly tries to send as many data bytes as possible into the network while respecting:

  1. The traffic-carrying capacity of the network, and
  2. The receiver’s capability

In the congestion avoidance (CA) phase, the TCP sender keeps probing the network for any additional bandwidth or capacity it can offer to the connection but, unlike in slow start, it does not probe aggressively.
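The interplay of slow start and congestion avoidance can be sketched by tracking the congestion window once per round trip. This is a simplified model under stated assumptions, not a faithful implementation of any TCP variant: the window is counted in whole segments, growth happens once per RTT, a retransmission timeout is the only loss signal, and `simulate_cwnd` and its parameters are hypothetical names.

```python
from typing import Iterable, List


def simulate_cwnd(rounds: int, ssthresh: int,
                  timeout_rounds: Iterable[int] = ()) -> List[int]:
    """Return the congestion window (in segments) at the start of each round trip."""
    timeouts = set(timeout_rounds)
    cwnd = 1
    history: List[int] = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in timeouts:
            # Retransmission timeout: remember half the window as the new
            # threshold and restart from one segment (slow start again).
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: exponential growth, capped at the threshold.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: cautious linear growth.
            cwnd += 1
    return history


if __name__ == "__main__":
    print(simulate_cwnd(rounds=12, ssthresh=8, timeout_rounds={6}))
```

The window doubles up to the threshold, then creeps up by one segment per round trip, and collapses back to one segment when a timeout occurs. In a real connection the amount actually sent is further limited by the receiver's advertised window.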

[Figure: TCP Graph]

[Figure: TCP Congestion Control Flowchart]


Learning TCP in Depth

One common complaint I hear from learners is that they cannot find in-depth material on TCP internals anywhere, neither on YouTube nor on any MOOC site. All sources just touch TCP at the surface (like this article :p). Luckily, there is a course that explains TCP and its internal machinery at the RFC level. Please check out the 9-hour-long course here. It has been awarded the Best Seller tag on Udemy from day one.
