A TCP option that improves Frontends and Backends latency

99% of network latency issues are caused by the application logic, which here includes the libraries and frameworks the app uses. Sometimes, however, the remaining 1% is the kernel. Here is one example where a TCP socket option can improve backend and frontend network latency.

When an app writes to a socket connection, whether it's the frontend sending a request or the backend sending a response, the raw bytes are copied from the user-space application to kernel memory, where TCP/IP processing takes place.

Each connection has a receive buffer, where data from the other party arrives, and a send buffer, where data from the application is placed before it is sent to the network. Both buffers live in kernel space.

Data written by the app to the kernel is not sent immediately but buffered in the kernel. The kernel's hope is to accumulate enough data to fill a full TCP segment. The maximum payload a segment can carry is called the MSS, or maximum segment size, typically around 1460 bytes on Ethernet (the 1500-byte MTU minus roughly 40 bytes of TCP/IP headers).

The reason for the buffering is segment overhead. Each segment carries roughly 40 bytes of TCP/IP headers, and sending a few bytes of payload under such a large header makes inefficient use of network bandwidth. Hence the buffering, classic computer science stuff.
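To put numbers on that overhead, here is a quick back-of-the-envelope calculation, assuming ~40 bytes of combined TCP and IP headers and a 1460-byte MSS:

```python
# Fraction of each segment's bytes spent on headers rather than payload,
# assuming ~40 bytes of combined TCP + IP headers (no options).
HEADER_BYTES = 40

def overhead_ratio(payload_bytes: int) -> float:
    """Header bytes as a fraction of the total segment size."""
    return HEADER_BYTES / (HEADER_BYTES + payload_bytes)

print(f"{overhead_ratio(1):.0%}")     # a 1-byte payload → 98% overhead
print(f"{overhead_ratio(1460):.1%}")  # a full MSS → 2.7% overhead
```

A one-byte write wastes nearly the whole segment on headers, while a full-MSS segment keeps the overhead negligible, which is exactly what the kernel's buffering is trying to achieve.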

So by default the kernel delays sending segments out through the network in the hope that the application will write more data to fill a full MSS. The algorithm that decides when to delay and for how long is called Nagle's algorithm: while previously sent data is still unacknowledged, small segments are held back and coalesced.
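In rough pseudocode, the per-write decision looks something like this. This is a simplified sketch of the idea, not the kernel's actual implementation:

```python
MSS = 1460  # typical maximum segment size on Ethernet

def nagle_should_send_now(pending_bytes: int, unacked_in_flight: bool) -> bool:
    """Simplified Nagle decision for a write sitting in the send buffer."""
    if pending_bytes >= MSS:
        return True   # a full segment is always worth sending
    if not unacked_in_flight:
        return True   # idle connection: a small write goes out immediately
    return False      # small write + data in flight: buffer until an ACK arrives
```

The delay, in other words, is bounded by the round trip of the outstanding ACK rather than a fixed timer.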

You can disable this delay by setting the TCP_NODELAY socket option when creating the connection. This causes the kernel to send whatever is in the send buffer even if it's only a few bytes. This is great because sometimes a few bytes is all we have. It essentially favors low latency over network efficiency.
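In Python, for example, the option is set on the socket before connecting; the constant names mirror those used with C's setsockopt(2):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Disable Nagle's algorithm: small writes go out immediately
# instead of waiting to coalesce into a full MSS.
sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

# Verify the option took effect (non-zero means enabled).
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))

# sock.connect(("example.com", 443))  # then connect and write as usual
```

The connect call is commented out here since the host is just a placeholder; the option itself can be set before the connection is established.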

For the backend, applications can benefit from enabling this option (disabling the delay), especially when writing responses back to the client, because responses go through the send buffer. Delaying segments just because they are not full can slow down response writes. NodeJS has enabled this option in this PR.
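On the backend, the option is typically applied to each accepted connection. A minimal loopback sketch in Python (the addresses, port choice, and `pong` payload are made up for illustration):

```python
import socket

# Listening socket on an ephemeral loopback port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)

client = socket.create_connection(server.getsockname())
conn, _addr = server.accept()

# Disable Nagle on the accepted connection so small responses
# are written to the wire without waiting to fill a segment.
conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

conn.sendall(b"pong")      # a tiny response goes out immediately
data = client.recv(4)
print(data)

for s in (conn, client, server):
    s.close()
```

In a real server the setsockopt call would sit in the accept loop, right after each accept.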

For the frontend, apps can benefit from this option too. In 2016, Daniel Stenberg, the creator of cURL, spent hours debugging a TLS latency issue only to find that the kernel was sitting on a few bytes of TLS data in a segment, waiting for a full MSS. That led the cURL project to set TCP_NODELAY by default.


Avish kaarthik

Software Engineer

2 months ago

@Hussein Nasser This is nice... But how much overhead does this cause? Since the Nagle algorithm sends on any of these three triggers: 1. a full send buffer (a full segment), 2. an ACK received for previously sent data, 3. a timeout. Considering TCP is reliable, there will almost always be an ACK for the previously sent data; there could be a delay in that ACK, during which the buffers can fill. Still seems fine for any normal HTTP application that doesn't need an instant reply. Could be wrong here, just a hypothesis.

Mohammad Toyib

Symfony Developer

2 months ago

Very informative

Meghana Jagadeesh

Founder & CEO at GoCodeo | Making software development smarter with AI | Speaker on GenAI & tech leadership

2 months ago

The detailed explanation here is brilliant. Perfectly demonstrates the value of going beyond libraries and frameworks to optimize performance. App logic is often blamed, but kernel-level optimizations like TCP_NODELAY are equally crucial for minimizing latency.
