Stack-Based Buffer Overflow
Introduction
Have you ever read, heard, or seen someone posting about a buffer overflow vulnerability? You were surprised by how critical it was, and you rushed to install the patch. After that you felt relaxed. But wait, there is a missing piece of the puzzle: what actually happens under the hood? Isn't it time to really understand it? In this article we are going to explain the basics of stack-based buffer overflows, and hopefully we will make it simple for everyone.
First Question, What does a computer do?
If you have no idea what a computer does, let us change that first before diving into the mud. A computer simply executes instructions. Yes, a program is just a series of instructions that the computer executes so you get the final result. How are these instructions written?
Software engineers and programmers use high-level programming languages to write the big ugly code, which is then translated into the machine language that the computer executes. Wait a sec, did he say machine language? Do not worry, it is just computer instructions encoded as a series of 0s and 1s. For example, a computer may have an instruction to move the value 10 into one of its registers, MOV EAX,10, but remember that our computers only understand 0s and 1s, so this instruction is presented to the computer as the bytes "\xB8\x0A\x00\x00\x00" (written here in hex). 0xB8 is the opcode for MOV EAX with a 32-bit value, and the remaining four bytes are the value 10 stored least significant byte first. Back to simplicity: when your computer sees these bytes, it moves the value 10 into the EAX register.
Basic components of a computing system
Since we are going to work with memory, the CPU and registers, it is essential to have a minimum understanding of these components. My words may not be perfectly accurate technically, but they will be enough to get you on the same page.
A CPU is what executes the code. We have already established that it understands only machine language, which is a series of 0s and 1s. The CPU needs to store data, and it can do this via:
- Registers: storage locations inside the CPU that it can access directly to hold small values. The general-purpose registers are 32 bits wide on 32-bit x86 and 64 bits wide on x64.
Note:
These are 32-bit registers, however parts of them can also be referred to as 16-bit and 8-bit registers. For example, let us look at EAX:
EAX: the full 32 bits
AX: the lower 16 bits (bits 0-15)
AH: the high 8 bits of AX (bits 8-15)
AL: the low 8 bits of AX (bits 0-7)
For memory, remember that we address it in bytes. So if you see the address 0x1234, it means we are looking at the byte at offset 0x1234. On 32-bit systems an address is 32 bits wide, so there are 2 to the power 32 possible addresses, which gives 4 GiB of addressable memory.
Note:
You will often hear people use the term x86. It refers to Intel's family of processor architectures and their instruction set. In this article it means the 32-bit variant and its assembly language.
Second Question, What is a computer program?
It all starts from the disk on your computer. An executable file (an EXE on Windows) is sitting there on the disk. The fun starts when you attempt to run the program (MS Office, games, Notepad, browsers and all). The operating system loads the program into memory to be executed by the CPU. That is the short version of the story, and maybe we will dive into the details in another article.
So what did we load into memory? The program is the code we want the CPU to execute for us. That is the simplest way to think about it, but we will try to explain more now. Let us divide the memory of the running program into multiple parts:
- Text Segment
- Data Segment
- Stack Segment
- Heap Segment
Text Segment is the actual code of the executable. It contains the code that will be executed by the CPU. Your logic as a programmer is contained in that segment: it may print something to the screen, or it may be more complex and open a TCP connection somewhere, then send "We are alive in 2020". So simply, it is code in machine language understood by the CPU.
Data Segment contains the data used by your program, such as its global variables. Any program is expected to deal with some sort of data. That data needs to be stored with the program so it is available when needed.
Stack Segment is the segment used for function calls. When your program wants to call another function, it needs a way to preserve the current flow and the state of the registers, and also to provide the function being called with the parameters it needs to work on. All of this happens on the stack, which is a part of memory with a dynamic size, as it grows and shrinks during the execution of the program. We will look at the details of the stack more closely, as it is crucial for understanding stack-based buffer overflows. Just remember this for now: on Intel x86, the stack grows downward, toward lower addresses.
Finally, the heap segment is another dynamically sized memory segment of the program. Imagine your program needs to allocate space at run time to store some data. That is where it uses the heap, which grows upward in the address space, the opposite of the stack.
The Role of the Stack
So far we have said that the stack is part of the program's memory, that it has a dynamic size and that it grows downward. A stack of plates is a good picture to start with: the last plate you put on the stack is the first one you take off. How is this relevant for a computer program?
Let us start with the following simple program. It is just pseudocode, not written in any specific programming language:
main_function
{
variable_1 = 2 , variable_2 = 2
magic_space(variable_1,variable_2);
}
We have a main function in which we define two variables, then we call another function from inside the main function and pass the two variables as arguments to it. When this call happens, the CPU needs to execute the code of the magic_space function. That function needs to know where to read its parameters from, and it needs space for its local variables. The program also needs to know where execution should continue after magic_space finishes.
How does the magic_space function get its parameters? ANS: They are pushed onto the stack.
It needs space for its local variables. Where? ANS: Space is allocated on the stack for local variables.
How can execution continue from the point where we stopped? ANS: The address of the instruction that follows the call to magic_space is pushed onto the stack too. So when that function is done, we restore that value from the stack into the instruction pointer register.
Note: The instruction pointer (EIP on 32-bit x86) is the register that holds the address of the next instruction the CPU will fetch from memory and execute.
So we are using the stack to save some info, to tell another piece of code how to access its parameters, and to allocate space for its variables in that portion of memory. It is good to have these variables on the stack because once the function is done they are simply removed (de-allocated).
Function Stack Frame
Let us look at what the stack looks like when we call a function from our main function.
Note: Please get a cup of coffee.
So what is the idea? We call a function from within our code and we have the following things to worry about:
- The parameters to be passed to the function.
- The function local variables.
- The return address to be saved as we need to continue the execution of the caller function after finishing executing the called function.
- A way to index the function's variables and parameters. This is very important to the called function, as it can't read randomly. For this, the frame base pointer register (EBP) will be used, as we will see.
A few notes to remember before we start:
- The stack grows toward low addresses in memory, which is the opposite of heap memory, which grows upward [specific to Intel architecture].
So, let us assume a function is being called within our code. What will the flow be?
- The caller pushes the callee's parameters onto the stack in reverse order, from right to left (this is the usual 32-bit C calling convention, cdecl).
- The call instruction pushes the return address onto the stack, and then moves execution to the first instruction of the called function.
- The callee saves the old frame base pointer on the stack [pushes it] and updates the base pointer register to point at the current top of the stack, that is, it copies ESP into EBP [EBP forms a linked list from one stack frame to the previous one and so on]. This is also known as the function prologue, and you will see it at the beginning of most x86 functions.
- The called function now allocates space for its local variables, moving down in memory. It achieves this by subtracting the required size from ESP.
Here is an example of a stack frame:
[V is the return address, X is the saved base pointer, and var is a local variable of the function]
Do you want to have a look at my drawing skills?
Zeros and Ones
In this section we will look at a real example where we have some code and see how the function's stack frame is laid out on the stack. At a later stage we will see how we can make the machine suffer due to the buffer overflow vulnerability in that code.
Let us write the following simple C code:
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
    char buffer[20];             /* local buffer: 20 bytes on the stack */

    strcpy(buffer, argv[1]);     /* copies the input with no length check */
    return 0;
}
This is very simple C code. It takes the program's input and copies it into a local buffer, nothing more. It is not a problem if you do not understand the code. Just get the logic: we get user input and then copy it into a variable.
Let us link what we learned so far to this section. This is code written in a high-level programming language, so first we translate it into machine language. Compilers are used to achieve this, and in this case we are going to use gcc, which is available on Linux for free.
gcc example.c -m32 -fno-stack-protector -o EXAMPLE
This produces the machine language file, which we call an executable. Notice the options used with the gcc command: -m32 tells it we are compiling a 32-bit executable, and -fno-stack-protector tells it not to insert stack protector checks (canaries), which would otherwise detect the overwrite and abort the program before the function returns. Note that this flag does not mark the stack as executable. That is a separate linker option, -z execstack, which becomes relevant once we want to run code from the stack. Is it possible that at some stage we will put actual code on the stack? Ha! Yes, remember that to do bad things you often need to execute code.
Now let us dump the machine language instructions and see what is really there at the computer level.
objdump -d EXAMPLE -M intel
(only part of the output is shown, as the full listing is very long)
0000051d <main>:
 51d: 8d 4c 24 04       lea ecx,[esp+0x4]
 521: 83 e4 f0          and esp,0xfffffff0
 524: ff 71 fc          push DWORD PTR [ecx-0x4]
 527: 55                push ebp
 528: 89 e5             mov ebp,esp
 52a: 53                push ebx
 52b: 51                push ecx
 52c: 83 ec 20          sub esp,0x20
 52f: e8 30 00 00 00    call 564 <__x86.get_pc_thunk.ax>
 534: 05 a4 1a 00 00    add eax,0x1aa4
 539: 89 ca             mov edx,ecx
 53b: 8b 52 04          mov edx,DWORD PTR [edx+0x4]
 53e: 83 c2 04          add edx,0x4
 541: 8b 12             mov edx,DWORD PTR [edx]
 543: 83 ec 08          sub esp,0x8
 546: 52                push edx
 547: 8d 55 e4          lea edx,[ebp-0x1c]
 54a: 52                push edx
 54b: 89 c3             mov ebx,eax
 54d: e8 5e fe ff ff    call 3b0 <strcpy@plt>
 552: 83 c4 10          add esp,0x10
 555: b8 00 00 00 00    mov eax,0x0
 55a: 8d 65 f8          lea esp,[ebp-0x8]
 55d: 59                pop ecx
 55e: 5b                pop ebx
 55f: 5d                pop ebp
 560: 8d 61 fc          lea esp,[ecx-0x4]
 563: c3                ret
The first column is the relative address of the instruction in the code segment. The second column is the actual machine code of the instruction in hex (the opcode and its operands). The third is the mnemonic representation of the instruction (assembly language).
Now launch the program via gdb (GNU debugger) and let us trace through it.
Note:
GDB is a Linux debugger that can be used to trace/debug executable files. It is vital when dealing with binary files on Linux.
!!! Start GDB with our executable !!!
# gdb -q ./EXAMPLE
Set the disassembly flavor to Intel syntax.
!!! Set the disassembly flavor to intel !!!
(gdb) set disassembly-flavor intel
Disassemble the main function of the executable, turning the hex code into readable x86 instructions. Then set up a breakpoint at the call to the strcpy function. Execution will stop at that breakpoint, just before the call is made, which is very useful for debugging.
!!! Disassemble the main function and place a breakpoint at the call to the strcpy function !!!
(gdb) set disassembly-flavor intel
(gdb) disassemble main
Dump of assembler code for function main:
   0x5655551d <+0>:  lea ecx,[esp+0x4]
   0x56555521 <+4>:  and esp,0xfffffff0
   0x56555524 <+7>:  push DWORD PTR [ecx-0x4]
   0x56555527 <+10>: push ebp
   0x56555528 <+11>: mov ebp,esp
   0x5655552a <+13>: push ebx
   0x5655552b <+14>: push ecx
   0x5655552c <+15>: sub esp,0x20
   0x5655552f <+18>: call 0x56555564 <__x86.get_pc_thunk.ax>
   0x56555534 <+23>: add eax,0x1aa4
   0x56555539 <+28>: mov edx,ecx
   0x5655553b <+30>: mov edx,DWORD PTR [edx+0x4]
   0x5655553e <+33>: add edx,0x4
   0x56555541 <+36>: mov edx,DWORD PTR [edx]
   0x56555543 <+38>: sub esp,0x8
   0x56555546 <+41>: push edx
   0x56555547 <+42>: lea edx,[ebp-0x1c]
   0x5655554a <+45>: push edx
   0x5655554b <+46>: mov ebx,eax
   0x5655554d <+48>: call 0x565553b0 <strcpy@plt>     << call to strcpy
   0x56555552 <+53>: add esp,0x10
   0x56555555 <+56>: mov eax,0x0
   0x5655555a <+61>: lea esp,[ebp-0x8]
   0x5655555d <+64>: pop ecx
   0x5655555e <+65>: pop ebx
   0x5655555f <+66>: pop ebp
   0x56555560 <+67>: lea esp,[ecx-0x4]
   0x56555563 <+70>: ret
End of assembler dump.
(gdb) break *0x5655554d
Breakpoint 5 at 0x5655554d
Run the program and give it the string "Hello World!" as an argument. Notice how execution stops at our breakpoint.
!!! Run the program with the argument Hello World! and notice that it stops at our breakpoint before the call to the strcpy function !!!
(gdb) run "Hello World!"
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/malhyari/bo/EXAMPLE "Hello World!"

Breakpoint 5, 0x5655554d in main ()
Let us examine what we have on the stack before the call. Always go back to the image that shows the structure of the function stack frame, as it helps a lot when examining the stack.
!!! A look at the stack before the call happens !!!
!!! Remember the stack frame image. What is on top of the stack? !!!
!!! The strcpy function takes 2 arguments: the first is the buffer to copy to and the 2nd !!!
!!! is the string to copy into that buffer !!!
!!! Check the value of ESP at the breakpoint !!!
(gdb) info register esp
esp            0xffffd110       0xffffd110
!!! Check the 8 words at the top of the stack !!!
(gdb) x/8xw $esp
0xffffd110:     0xffffd12c      0xffffd3c4      0x00000000      0x56555534
0xffffd120:     0xf7fb63fc      0x56556fd8      0xffffd200      0x565555bb
!!! Remember we push from right to left, so first we push a pointer to the string !!!
!!! we will copy and then we push a pointer to where we will store the value. As !!!
!!! per this, the second word on the stack should point to the Hello World string. !!!
!!! Let us check what is there at address 0xffffd3c4 !!!
(gdb) x/s 0xffffd3c4
0xffffd3c4:     "Hello World!"
The instruction pointer register is very important. See what happens to this precious thing.
!!! Next, the call instruction will push the address of the instruction after the call !!!
!!! (the value EIP would have had next) onto the stack so we know where to return. !!!
Dump of assembler code for function main:
   0x5655551d <+0>:  lea ecx,[esp+0x4]
   0x56555521 <+4>:  and esp,0xfffffff0
   0x56555524 <+7>:  push DWORD PTR [ecx-0x4]
   0x56555527 <+10>: push ebp
   0x56555528 <+11>: mov ebp,esp
   0x5655552a <+13>: push ebx
   0x5655552b <+14>: push ecx
   0x5655552c <+15>: sub esp,0x20
   0x5655552f <+18>: call 0x56555564 <__x86.get_pc_thunk.ax>
   0x56555534 <+23>: add eax,0x1aa4
   0x56555539 <+28>: mov edx,ecx
   0x5655553b <+30>: mov edx,DWORD PTR [edx+0x4]
   0x5655553e <+33>: add edx,0x4
   0x56555541 <+36>: mov edx,DWORD PTR [edx]
   0x56555543 <+38>: sub esp,0x8
   0x56555546 <+41>: push edx
   0x56555547 <+42>: lea edx,[ebp-0x1c]
   0x5655554a <+45>: push edx
   0x5655554b <+46>: mov ebx,eax
=> 0x5655554d <+48>: call 0x565553b0 <strcpy@plt>
   0x56555552 <+53>: add esp,0x10     <<<< this address is pushed onto the stack as the return address
   0x56555555 <+56>: mov eax,0x0
   0x5655555a <+61>: lea esp,[ebp-0x8]
   0x5655555d <+64>: pop ecx
   0x5655555e <+65>: pop ebx
   0x5655555f <+66>: pop ebp
   0x56555560 <+67>: lea esp,[ecx-0x4]
   0x56555563 <+70>: ret
End of assembler dump.
(gdb) si
0x565553b0 in strcpy@plt ()
(gdb) info register esp
esp            0xffffd10c       0xffffd10c
(gdb) x/8xw 0xffffd10c
0xffffd10c:     0x56555552      0xffffd12c      0xffffd3c4      0x00000000
0xffffd11c:     0x56555534      0xf7fb63fc      0x56556fd8      0xffffd200
From this point onward control moves to the strcpy function. A typical function starts by pushing the old frame base pointer onto the stack, setting up a new one and allocating space for its own variables. When it finishes execution, the RET instruction pops the saved return address from the stack into EIP so execution can continue from where it stopped.
Here is what the stack looks like while strcpy is being executed.
Being Evil Part 1- Crashing the Application
So far we have learned a lot about the stack and how executables work. Looking at the last image in the previous section, we have the following inputs in our hands:
- A user input to the program.
- The strcpy function operating on that input and copying it into a local buffer. That function does not do any check on the length of the input!
- The structure of the stack we have been discussing for a long time now.
How can we be evil?
What if the user inputs more than 20 bytes of data?
In our case strcpy will copy the user input into the variable buffer. The variable buffer has 20 bytes of space. What happens if we provide more than 20 bytes of input? strcpy will keep copying, and we will overwrite whatever sits above those 20 bytes, at higher addresses. From the diagram, notice how sensitive the info above that variable is.
Notice that we will overwrite the data the main function needs in order to return, up to and including its return address. What does that mean?
- If we overwrite it with some junk, then we have destroyed the logic of a valid return. The program will simply crash.
- If we overwrite it with some reasonable value, then we may gain control of the program, and this opens the window to executing some code there.
Now let us test and see if it crashes.
!!! Running with 20 bytes (19 characters plus the terminating NUL byte): everything is fine and the program exits normally !!!
(gdb) run AAAAAAAAAAAAAAAAAAA
Starting program: /home/malhyari/bo/EXAMPLE AAAAAAAAAAAAAAAAAAA
[Inferior 1 (process 2772) exited normally]
(gdb)
!!! Running with 30 bytes (29 characters plus the terminating NUL byte): the program crashes !!!
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
Starting program: /home/malhyari/bo/EXAMPLE AAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Program received signal SIGSEGV, Segmentation fault.
0x56555563 in main ()
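Note that 0x56555563 is the address of the ret instruction in main. The stack frame of the main function has been corrupted, so it no longer knows where to return. In this particular build, the first thing past the buffer is the saved ECX (at ebp-0x8), which main uses to restore ESP (lea esp,[ecx-0x4]) just before ret, so ret tries to read its return address from a bogus location and faults. Let us execute again, but this time set up a breakpoint on the instruction right after the strcpy call.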
!!! Look at the stack. What is 41? !!!
!!! 0x41 is the ASCII value of the character A. !!!
(gdb) run AAAAAAAAAAAAAAAAAAAAAAAAAAAAA
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/malhyari/bo/EXAMPLE AAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Breakpoint 8, 0x56555552 in main ()
(gdb) disassemble main
Dump of assembler code for function main:
   0x5655551d <+0>:  lea ecx,[esp+0x4]
   0x56555521 <+4>:  and esp,0xfffffff0
   0x56555524 <+7>:  push DWORD PTR [ecx-0x4]
   0x56555527 <+10>: push ebp
   0x56555528 <+11>: mov ebp,esp
   0x5655552a <+13>: push ebx
   0x5655552b <+14>: push ecx
   0x5655552c <+15>: sub esp,0x20
   0x5655552f <+18>: call 0x56555564 <__x86.get_pc_thunk.ax>
   0x56555534 <+23>: add eax,0x1aa4
   0x56555539 <+28>: mov edx,ecx
   0x5655553b <+30>: mov edx,DWORD PTR [edx+0x4]
   0x5655553e <+33>: add edx,0x4
   0x56555541 <+36>: mov edx,DWORD PTR [edx]
   0x56555543 <+38>: sub esp,0x8
   0x56555546 <+41>: push edx
   0x56555547 <+42>: lea edx,[ebp-0x1c]
   0x5655554a <+45>: push edx
   0x5655554b <+46>: mov ebx,eax
   0x5655554d <+48>: call 0x565553b0 <strcpy@plt>
=> 0x56555552 <+53>: add esp,0x10
   0x56555555 <+56>: mov eax,0x0
   0x5655555a <+61>: lea esp,[ebp-0x8]
   0x5655555d <+64>: pop ecx
   0x5655555e <+65>: pop ebx
   0x5655555f <+66>: pop ebp
   0x56555560 <+67>: lea esp,[ecx-0x4]
   0x56555563 <+70>: ret
End of assembler dump.
(gdb) x/20wx $esp
0xffffd100:     0xffffd11c      0xffffd3b3      0x00000000      0x56555534
0xffffd110:     0xf7fb63fc      0x56556fd8      0xffffd1f0      0x41414141
0xffffd120:     0x41414141      0x41414141      0x41414141      0x41414141
0xffffd130:     0x41414141      0x41414141      0x00000041      0xf7df9e81
0xffffd140:     0xf7fb6000      0xf7fb6000      0x00000000      0xf7df9e81
Being Evil Part 2 - ShellCode
So we saw how we can crash the application because the strcpy function does not check the length of the input coming from the user. This is why so many bad things come from not validating user input and from writing poor code.
In the next article we will build on this to craft a payload that overwrites the instruction pointer with some interesting value, one that lets us take control of execution. As the last piece of the puzzle, we will put shellcode on the stack to be executed and spawn a reverse shell (shellcode is just machine language code that does something). Since we deal with memory directly in our attacks, we need to put raw byte values there, which we write in hex.
I hope you liked this and now understand the basics of this topic. Stay tuned for the next article.
Further Readings
Will keep it very simple. You can look at this course for x86:
And always remember. Google is there !