Penetration Testing: Generated Code Attacks
Chris Blake, MSc, FIP
Director & Founder @ Firesand Ltd. | MSc, FIP, CISSP
Introduction
This is the fourth article in the series, and it discusses how the concepts in some of our previous articles, including some of our primer articles, are connected. Many security vulnerabilities lie at the boundary between two systems: for example, where one system generates code or commands for another system to execute and, in particular, where the generating system uses external input to generate that code. The external input does not have to be direct user input; it could come from a configuration file, or from data (e.g., cookie values, HTTP headers, etc.) that can be manipulated by an attacker.
Audience
Who is this article for?
Understanding application layer security threats is important for a wide range of professions, including:
Pre-Reading
As reference for this article, it is worth reading the following articles that we have already published:
What is Generated Code?
To understand generated code attacks, we first need to understand what generated code is. In simple terms, generated code is code created by one application for another to execute.
Many applications (be they mobile apps, desktop applications, web applications, or APIs) are built from multiple interacting technologies. Where one component (c1) talks to another component (c2) by passing it some form of programming code, which c2 then interprets and executes, you have generated code.
In this scenario, c1 will have generated code in one way or another. This is extremely common, and there are several familiar scenarios:
There are many other possibilities, especially these days with the increasing use of JSON and Infrastructure-as-Code (IaC).
In all of these scenarios, the receiving component, c2, trusts that the data supplied by c1 is non-malicious and generally - due to the generic nature of the components - it would be impractical for c2 to make a distinction between malicious and non-malicious data.
The following diagram highlights the basic scenario:
So, What are Generated Code Attacks?
Generated Code Attacks are where an attacker exploits the trust between two or more systems (i.e. between c1 and c2 as above). The receiving component (c2), often a database, receives a wide range of commands from the sender (c1). It is difficult in practice, or even impossible, for c2 to distinguish a malicious DROP or DELETE command from a legitimate one.
So, all of the aforementioned generated code scenarios (and many more) can be mapped to well-known generated code attacks and vulnerabilities, as follows:
Of course a buffer overrun in general also follows a similar pattern, but at a lower-level in the technology stack. A component is sending data to another component for interpretation and processing.
The receiving component trusts the data it is sent - which is the crux of the problem. Practically, in most scenarios, because the receiving component is often a generalised system and an attacker would necessarily send recognised commands, it is difficult for the receiving component to know the difference between a legitimate request and a malicious one.
The following diagram expands upon the previous, highlighting how there is an additional condition needed for an attack to succeed - externally supplied input (i1) that is used in the generation of the generated code:
The key point is that i1 is benign in most cases. In an attack, however, i1 is malicious input.
The following presents two examples, based on the above sample scenarios, one discussing SQL Injection and the other discussing XSS.
SQL Injection
In this case, c1 could be a web application, c2 would be a database (e.g. MySQL, Oracle, SQL Server etc). Suppose c1 is generating a SQL statement along the lines of:
SELECT product_name, product_description FROM TProduct WHERE product_name = '<user_supplied_input>'
This would return product details in some kind of retail / eCommerce web application, where <user_supplied_input> is the input i1.
If i1 is malicious, a number of attacks are possible. The attacker can inject a UNION SELECT to exfiltrate data from the database; in practice, an attacker who can inject arbitrary SQL in a scenario such as this could exfiltrate all information from the database. Additionally, they could potentially launch destructive attacks using DROP and DELETE commands. If the database server supports something like xp_cmdshell (a SQL Server feature that allows the execution of shell commands), the attacker could interact with the underlying OS that the database server is running on, and would almost certainly be able to gain a remote shell (i.e., remote access to the database server).
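The UNION SELECT case can be reproduced in a few lines. The following is a minimal sketch, not a real application: it uses Python with an in-memory SQLite database as a stand-in for c2, and the TUser table, its columns, and the sample rows are hypothetical names introduced purely for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database standing in for c2 (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE TProduct (product_name TEXT, product_description TEXT);
        CREATE TABLE TUser (username TEXT, password_hash TEXT);
        INSERT INTO TProduct VALUES ('Widget', 'A small widget');
        INSERT INTO TUser VALUES ('alice', 'not-a-real-hash');
        """
    )
    return conn


def build_query(user_supplied_input: str) -> str:
    """c1: generate SQL by concatenating i1 straight into the statement (vulnerable)."""
    return (
        "SELECT product_name, product_description FROM TProduct "
        "WHERE product_name = '" + user_supplied_input + "'"
    )


def search_products(conn: sqlite3.Connection, user_supplied_input: str) -> list:
    """c2 executes whatever c1 generated; it has no way to judge intent."""
    return conn.execute(build_query(user_supplied_input)).fetchall()


if __name__ == "__main__":
    db = make_demo_db()
    # Benign i1: the generated statement does what the developer intended.
    print(search_products(db, "Widget"))
    # Malicious i1: the quote closes the string literal, UNION SELECT pulls rows
    # from another table, and "--" comments out the trailing quote.
    print(search_products(db, "' UNION SELECT username, password_hash FROM TUser --"))
```

With the malicious input, the database receives a syntactically valid statement and returns the contents of TUser instead of product details, which is exactly the trust problem described above.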
XSS
In this case, c1 could be a web application and c2 would be the user's browser (e.g. Chrome, Firefox, Edge etc). Suppose c1 is a C# web application (ASP.NET) generating HTML content using the code below:
Response.Write("<p> Your search for " + Request.QueryString["q"] + " returned the following results: </p>");
If an attacker were to specify i1 (injecting into the q query string parameter) along the lines of:
<script>alert(1);</script>
This would cause the C#.NET code to generate the HTML content of:
<p> Your search for <script>alert(1);</script> returned the following results: </p>
This would be received by the browser and executed. Of course, this particular example is fairly benign (if slightly annoying, by introducing a pop-up alert box).
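To see how i1 stops being data and becomes code from c2's point of view, the sketch below mirrors the C# line in Python and then lists the elements an HTML parser finds in the generated page. It is illustrative only: the function names are hypothetical, and Python's html.parser is used as a rough stand-in for a browser's parser, not as an accurate model of one.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Records the name of every start tag the parser encounters."""

    def __init__(self) -> None:
        super().__init__()
        self.tags: list[str] = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def render_search_header(q: str) -> str:
    """c1: Python stand-in for the Response.Write call, concatenating i1 into HTML."""
    return "<p> Your search for " + q + " returned the following results: </p>"


def elements_seen_by_parser(html_text: str) -> list[str]:
    """c2 stand-in: return the elements a parser recognises in the generated HTML."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    # Benign i1: only the paragraph element the developer wrote.
    print(elements_seen_by_parser(render_search_header("shoes")))
    # Malicious i1: the parser now also sees a script element.
    print(elements_seen_by_parser(render_search_header("<script>alert(1);</script>")))
```

The receiving side cannot tell that the script element originated from the query string rather than from the application's own template.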
Addressing Generated Code Attacks and Why Care
At Firesand, we are often asked questions such as: "How do I stop SQL Injection?", "How do I stop XSS?", "How do I prevent XXE?" and so on. Our advice is not to focus on these specific attacks, as they are all instances of a wider class of problem: Generated Code Attacks. If you defend really well against one instance, you still do not know whether you have defended against any of the others. Whereas, if you resolve the Generated Code Attack problem, you fix not only the instance you are concerned about, but also any others that you have not yet considered.
As most, if not all, generated code attacks rely on being able to supply unexpected input to a system, the primary defence against generated code attacks is input validation. Always ensure that input is in the expected form before you accept and process it. For example, in the aforementioned SQL Injection and XSS scenarios, the supplied input would be expected to consist of lower and upper case alphanumeric characters (possibly with white space); the exact valid input set is case-specific, of course! If the input is anything other than the expected form, it must be rejected. In doing so, most generated code attacks will fail.
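As a minimal sketch of this allow-list approach, assuming the valid input set really is ASCII letters, digits, and spaces, validation could look like the following in Python. The function name and the 64-character length limit are assumptions added for illustration; the right pattern and limits depend on the application.

```python
import re

# Allow-list taken from the example in the text: letters, digits, and spaces only.
ALLOWED_INPUT = re.compile(r"[A-Za-z0-9 ]+")
MAX_LENGTH = 64  # assumption: an arbitrary illustrative limit, not from the article


def validate_search_term(value: str) -> str:
    """Return the value unchanged if it is in the expected form, otherwise reject it."""
    if len(value) > MAX_LENGTH or ALLOWED_INPUT.fullmatch(value) is None:
        raise ValueError("input rejected: not in the expected form")
    return value


if __name__ == "__main__":
    print(validate_search_term("Blue Widget 2"))  # accepted
    for payload in ("' UNION SELECT 1, 2 --", "<script>alert(1);</script>"):
        try:
            validate_search_term(payload)
        except ValueError as error:
            print(payload, "->", error)
```

Rejecting outright, rather than trying to strip out the offending characters, keeps the check simple and avoids guessing what the sender meant.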
Another line of defence is to sanitise (via encoding) output data when dealing with data being sent to external systems (e.g., use URL Encoding and/or HTML Entity Encoding when a web application sends data back to a browser for rendering).
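The encoding step can be sketched with Python's standard library: html.escape for HTML Entity Encoding and urllib.parse.quote for URL Encoding. The function names and the /search path are hypothetical, and this is a Python stand-in for the search page example rather than the ASP.NET code itself.

```python
import html
from urllib.parse import quote


def render_search_header(q: str) -> str:
    """HTML-entity-encode i1 before placing it into the generated HTML."""
    return (
        "<p> Your search for "
        + html.escape(q)
        + " returned the following results: </p>"
    )


def build_search_link(q: str) -> str:
    """URL-encode i1 before placing it into a generated link (hypothetical path)."""
    return "/search?q=" + quote(q, safe="")


if __name__ == "__main__":
    payload = "<script>alert(1);</script>"
    # The angle brackets become &lt; and &gt;, so the browser renders text
    # instead of creating a script element.
    print(render_search_header(payload))
    # Reserved characters are percent-encoded in the generated URL.
    print(build_search_link(payload))
```

The encoding used has to match the context the data lands in: entity encoding for HTML body content, percent-encoding for URL components.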
Conclusion
This series has underscored the critical intersections where security vulnerabilities emerge in the architecture of modern software systems, particularly through the lens of generated code. As we have explored, the vulnerabilities primarily stem from the trust placed in automated processes that generate and execute code across disparate systems. This trust, while practically necessary, opens up avenues for attack through SQL injections, XSS attacks, and more, as detailed through the examples provided.
The fundamental challenge is to ensure that all code generated by one system and consumed by another is thoroughly scrutinised and sanitised. Security architects, developers, and testers must incorporate robust validation mechanisms at every stage of code generation and execution to safeguard against malicious inputs that can lead to catastrophic breaches. As technology continues to evolve at a rapid pace, the complexity of these interactions will only increase. The emergence of new paradigms such as IaC and the proliferation of APIs across microservices architectures amplify the need for stringent security protocols.
In essence, many well-known security issues arise from this core concept: a system automatically generates code that another system trusts and executes. If the generating system fails to properly validate its input, this trust can be exploited, allowing attacks to propagate to the final receiving system.
Therefore, to ensure that an application or system does not fall foul of a wide range of security vulnerabilities, it must validate all inputs into any form of code generation.
About the Author
Chris Blake has over 20 years of experience in the information and cyber security field, and is a passionate and qualified Enterprise Security Architect and Privacy Professional who leads and delivers innovative solutions at Firesand Limited, a company he co-founded in 2016. His specialities include application security, enterprise security architecture, and privacy, with a strong track record of building and implementing ISO 27001 compliant and certified information security practices, application security programmes, and enterprise security architectures. He has a thirst for continual learning and a commitment to excellence, as demonstrated by his academic and professional credentials from prestigious institutions such as the University of Oxford, (ISC)2, IAPP, SABSA, The Open Group, and ISACA.
Chris holds an MSc in Software and Systems Security from the University of Oxford, and an array of professional certifications: CISSP, ISSAP, CSSLP, CCSP, SSP.NET, SSP.JAVA, CISA, CISM, CIPP/E, CIPM, CIPT, FIP, SCF, TOGAF, CPSA, and CEH.
Chris' experience spans multiple sectors: Retail & eCommerce; Financial Services, Banking, & Payments; i-Gaming; Energy (Oil & Gas); Property Management & PropTech; and Data Science; as well as Defence.
His areas of interests include: penetration testing; regulation & privacy, including the impact on society; access control in software; security automation in development; application of cryptography; security architecture; risk modelling & analysis; HTTP architecture & web security; IoT Security.