A memory location is more important to a computer than the data stored in that location…

Solving The C Coding Challenge:

A dream doesn't become reality through magic; it takes sweat, determination and hard work...


magic_pointer_challenge_screenshot


Question:

Add one line to the code above, so that the program prints a[2] = 98 instead of the assigned value a[2] = 1024, followed by a new line.

NOTE: We have some restrictions to the challenge...

  • You are not allowed to use the variable a, in your new line of code.
  • You are not allowed to modify the variable p.
  • You can only write one statement.
  • You are not allowed to code anything else than the expected line of code at the expected line number.
  • Your code should be written in line 19, before the ;
  • Do not remove anything from the initial code (not even the comments) and don’t change anything but the line of code you are adding (don’t change the spaces to tabs!)


How can we modify the value stored at a[2] without directly accessing the array at the given index?

When a computer executes a program, it does not know the variable names; it only works with memory addresses, which it reads and writes at runtime as it executes the program's instructions.


If we have access to any memory address that has 'write' permission, we can actually manipulate the value stored in that location without making any reference to the variable name used to reference that address in memory.


To solve the above challenge, we'll need to look into the assembly code of the program by disassembling the executable file obtained from compiling the source code of the program.

Then we will check how memory is allocated to the declared variables. With a pointer variable, we can reach other addresses within the memory space allocated to the running program. Once we have an address in that space, we can modify the value stored there without directly using the variable that was originally used to store it.

You can obtain your own copy of the source code here to use and experiment.


Program Environment

  • Language: C
  • OS: Ubuntu 20.04.1 LTS
  • Compiler: gcc 9.4.0

We will compile the source code which is stored in a file named magic.c to generate an executable file that we will use.

You can use the command below to compile yours.


gcc magic.c -o magic

        

After compilation, we will disassemble our executable file (in my case it is called magic) using the objdump tool as follows.



objdump -j .text -M intel -d magic
        


I specified the -j .text option to limit the output to the .text section, which holds the program's code, and -M intel to display the assembly in Intel syntax. The -d option disassembles the executable file, magic.


objdump_disassemble_c_code_screenshot


The highlighted lines contain all the information we will need to solve this challenge but let's explain what happened before our data gets fetched into the registers.



push rbp saves the caller's base pointer on the stack so that it can be restored when main returns.


mov rbp,rsp copies the rsp register to the rbp register, so rbp becomes the base address used to calculate the relative memory locations of the data used in the main function.


sub rsp,0x30 reserves 48 bytes (0x30) of stack memory, enough to execute the instructions in the main function. We have three declarations in our code: int n (4 bytes), int a[5] (4 * 5 = 20 bytes), and the pointer variable int *p (8 bytes), which sum to 32 bytes. The remaining 16 bytes hold the 8-byte value that the next two instructions store at [rbp-0x08], plus padding.


mov rax,QWORD PTR fs:0x28 loads the 8-byte value at offset 0x28 of the thread-local storage area (addressed through the fs register) into the rax register. gcc uses this value as a stack-protection canary.


mov QWORD PTR [rbp-0x08],rax copies the value of rax to the location 8 bytes below the address in rbp.

NOTE: after sub rsp,0x30, the rsp register sits 48 bytes (0x30) below rbp, and every local variable is addressed at a negative offset from rbp.


From this point, the program starts loading our data into the registers


loading_data_to_memory_source_code_screenshot


mov DWORD PTR [rbp-0x18],0x400 is the assignment a[2] = 1024 (0x400). This shows that a[2] is located 24 bytes (0x18) below rbp. From this we can work out the locations of the other elements of a, because array elements are stored contiguously (side by side) in memory, 4 bytes apart: a[0] at rbp-0x20, a[1] at rbp-0x1c, a[3] at rbp-0x14, and a[4] at rbp-0x10.



loading_data_to_memory_assembly_code_screenshot


lea rax,[rbp-0x2c] computes the effective address rbp-0x2c (44 bytes below rbp) and loads it into the rax register. In other words, the processor takes the address of n (&n) and puts it into rax, which will then be assigned to the pointer p.



mov QWORD PTR [rbp-0x28],rax stores the address held in rax (the address of n) in the 8-byte slot located 40 bytes (0x28) below rbp, which is the pointer p.


Now, we have all we need to crack this challenge!


The goal here is not to get the actual address values stored in the registers, as these values can change on every run. Instead, our goal is to find the spatial distance between the declared variables relative to the base address register, because these distances are fixed once the program has been compiled. A different compiler, compiler version, or set of flags may lay the variables out differently.


So, the spatial distance between n (rbp-0x2c) and a[2] (rbp-0x18) is,

44 bytes - 24 bytes = 20 bytes.


memory_address_location_screenshot


Now, we have a pointer p whose value is the address of n (p = &n), and n is of type int, which is 4 bytes on this 64-bit machine. This means that although p itself is stored at a different address in memory (0x28 = 40 bytes below rbp), it points to where n is located.

So the expression p + 1, one step to the right of p, points to the address 4 bytes above the variable n. Evaluating p + 1 does not modify p itself.


i.e. (p + 1) points to rbp - (44 - 4) bytes = rbp - 40 bytes
        


Since each step moves 4 bytes, and the spatial distance between n and a[2] is 20 bytes, moving 5 steps (20 bytes / 4 bytes = 5) rightward from p gives the address 44 - 4 * 5 = 24 bytes below rbp, which is exactly the address of a[2].


(p + 5) == &p[5] == &a[2] == rbp-0x18 (24 bytes below rbp)
        

Dereferencing this address gives us access to the value stored at this location, which we can then manipulate without using the array a or modifying p.

Final Solution

To change the value of a[2] from 1024 to 98 without using the array a or modifying p, we have:


*(p + 5) = 98; // Or p[5] = 98;
        


Variables in a programming language are an abstraction that makes memory addresses readable to humans. In the compiled program, the variable names are replaced by the memory addresses they refer to, so if we can reference those memory locations by any other means at runtime, we can manipulate the value stored in that particular location.

And in C, C++, etc., we have the flexibility of using pointers to access valid memory addresses at will.

You can check out my code on GitHub.
