Prolog And Epilog Code
abhinav Ashok kumar
Curating Insights & Innovating in GPU Compiler | Performance Analyst at Qualcomm | LLVM Contributor | Maintain News Letter | AI/ML in Compiler
In assembly language, a function prolog is a short sequence of instructions at the start of a function. The prolog prepares the stack and registers for use within the function.
Similarly, the function epilog appears at the end of the function and restores the stack and registers to the state they were in before the function was called.
The prolog and epilog are not part of the assembly language itself; they are a convention used by assembly language programmers and by the compilers of many high-level languages. They are fairly rigid, having much the same form in each function.
The primary functions of prolog code are:
1. To save the calling function’s general-purpose registers in the calling function’s save area.
2. To obtain the dynamic storage area for this function.
3. To chain this function’s save area to the calling function’s save area, in accordance with the MVS linkage convention.
The primary functions of epilog code are:
1. To relinquish this function’s dynamic storage area.
2. To restore the calling function’s general-purpose registers.
3. To return control to the calling function.
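The save-and-chain bookkeeping listed above can be pictured with a toy model in C. This is only a sketch: the SaveArea struct, the four-element register array, and the names model_prolog and model_epilog are hypothetical and do not reflect the real MVS save-area layout.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a "save area" with a back-chain pointer and room for
   a few registers. Field names and sizes are hypothetical. */
typedef struct SaveArea {
    struct SaveArea *caller;   /* back chain to the calling function's area */
    long saved_regs[4];        /* stand-in for general-purpose registers */
} SaveArea;

long model_regs[4];                         /* stand-in register file */
static SaveArea root_area;                  /* save area of the outermost caller */
SaveArea *model_current = &root_area;       /* save area of the running function */

/* Prolog: save the caller's registers in the caller's save area, then
   chain the new function's area to it. */
void model_prolog(SaveArea *mine) {
    assert(mine != NULL);
    for (int i = 0; i < 4; i++)
        model_current->saved_regs[i] = model_regs[i];
    mine->caller = model_current;
    model_current = mine;
}

/* Epilog: unchain this function's area and restore the caller's
   registers from the caller's save area. */
void model_epilog(void) {
    assert(model_current->caller != NULL);
    model_current = model_current->caller;
    for (int i = 0; i < 4; i++)
        model_regs[i] = model_current->saved_regs[i];
}
```

Calling model_prolog on entry and model_epilog on exit leaves the register array exactly as the caller had it, no matter what the callee did to it in between.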
Let's Understand Them in More detail
There are two major syntaxes for x86 assembly: Intel and AT&T. AT&T syntax is the default for the GNU toolchain on Linux, while Intel syntax is used by many Windows assemblers and debuggers. The two formats yield exactly the same machine code; however, there are a few differences in style and format:
Intel syntax:
mov dest, source ; copies data from the source to the destination
mov ebp, esp ; copies data from the source register "esp" to the destination register "ebp"
AT&T syntax:
mov source, dest # copies data from the source to the destination
mov %esp, %ebp # copies data from the source register "esp" to the destination register "ebp"
AT&T format uses a % before registers while Intel does not. AT&T format uses a $ before literal values while Intel does not.
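One way to see the AT&T operand order in practice is GCC-style inline assembly, which GCC and Clang parse as AT&T syntax by default. The sketch below assumes an x86 target and a GCC-compatible compiler; the function name mov_copy is hypothetical, and on other targets it falls back to a plain C assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Copies *src to *dst with a single x86 mov instruction.
   In AT&T syntax the source operand (%1) is written first and the
   destination operand (%0) second. */
void mov_copy(const int *src, int *dst) {
    assert(src != NULL && dst != NULL);
    int in = *src;
    int out;
#if defined(__x86_64__) || defined(__i386__)
    __asm__ ("movl %1, %0" : "=r"(out) : "r"(in));
#else
    out = in;   /* non-x86 fallback so the sketch still compiles */
#endif
    *dst = out;
}
```

Written in Intel syntax, the same instruction would list the destination first and drop the % prefixes.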
In assembly, functions are usually called with the instruction ‘CALL address’, where address is the location of the function's code. Nearly every function begins with a similar prolog (the start of the function code) and ends with a similar epilog (the end of the function).
Prolog: The structure of the prolog looks like this (AT&T syntax):
push %ebp
mov %esp, %ebp
Epilog: The structure of the epilog looks like this:
leave
ret
Let's take a program that just prints hello world:
#include <stdio.h>
#include <stdlib.h>
int hello(char *s);
int main()
{
    int len = hello("hello world");
    printf("\nLength of string is %d\n", len);
}
int hello(char *s)
{
    return (printf("%s", s));
}
The assembly can be generated by passing the -S flag to gcc:
gcc -S hello.c -o hello.s
    .file   "a.c"
    .text
    .def    printf; .scl    3;  .type   32; .endef
    .seh_proc   printf
printf:
    pushq   %rbp
    .seh_pushreg    %rbp
    pushq   %rbx
    .seh_pushreg    %rbx
    subq    $56, %rsp
    .seh_stackalloc 56
    leaq    48(%rsp), %rbp
    .seh_setframe   %rbp, 48
    .seh_endprologue
    movq    %rcx, 32(%rbp)
    movq    %rdx, 40(%rbp)
    movq    %r8, 48(%rbp)
    movq    %r9, 56(%rbp)
    leaq    40(%rbp), %rax
    movq    %rax, -16(%rbp)
    movq    -16(%rbp), %rbx
    movl    $1, %ecx
    movq    __imp___acrt_iob_func(%rip), %rax
    call    *%rax
    movq    %rbx, %r8
    movq    32(%rbp), %rdx
    movq    %rax, %rcx
    call    __mingw_vfprintf
    movl    %eax, -4(%rbp)
    movl    -4(%rbp), %eax
    addq    $56, %rsp
    popq    %rbx
    popq    %rbp
    ret
    .seh_endproc
    .def    __main; .scl    2;  .type   32; .endef
    .section .rdata,"dr"
.LC0:
    .ascii "hello world\0"
.LC1:
    .ascii "\12Length of string is %d\12\0"
    .text
    .globl  main
    .def    main;   .scl    2;  .type   32; .endef
    .seh_proc   main
main:
    pushq   %rbp
    .seh_pushreg    %rbp
    movq    %rsp, %rbp
    .seh_setframe   %rbp, 0
    subq    $48, %rsp
    .seh_stackalloc 48
    .seh_endprologue
    call    __main
    leaq    .LC0(%rip), %rcx
    call    hello
    movl    %eax, -4(%rbp)
    movl    -4(%rbp), %eax
    movl    %eax, %edx
    leaq    .LC1(%rip), %rcx
    call    printf
    movl    $0, %eax
    addq    $48, %rsp
    popq    %rbp
    ret
    .seh_endproc
    .section .rdata,"dr"
.LC2:
    .ascii "%s\0"
    .text
    .globl  hello
    .def    hello;  .scl    2;  .type   32; .endef
    .seh_proc   hello
hello:
    pushq   %rbp
    .seh_pushreg    %rbp
    movq    %rsp, %rbp
    .seh_setframe   %rbp, 0
    subq    $32, %rsp
    .seh_stackalloc 32
    .seh_endprologue
    movq    %rcx, 16(%rbp)
    movq    16(%rbp), %rdx
    leaq    .LC2(%rip), %rcx
    call    printf
    addq    $32, %rsp
    popq    %rbp
    ret
    .seh_endproc
    .ident  "GCC: (Rev2, Built by MSYS2 project) 10.3.0"
    .def    __mingw_vfprintf;   .scl    2;  .type   32; .endef
So, let’s look at what’s going on. The stack is used to keep track of function calls (including recursive ones) and, on most systems, grows from higher memory addresses toward lower ones. Local variables live on the stack.
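The direction of stack growth can be checked with a small C sketch. It assumes GCC or Clang, since __builtin_frame_address and the noinline attribute are compiler extensions, and the function names callee_frame and stack_grows_down are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the frame address of this callee. noinline keeps it a real
   call with its own stack frame. */
__attribute__((noinline)) uintptr_t callee_frame(void) {
    return (uintptr_t)__builtin_frame_address(0);
}

/* Returns 1 if the callee's frame lies at a lower address than the
   caller's frame, i.e. the stack grows toward lower addresses. */
__attribute__((noinline)) int stack_grows_down(void) {
    uintptr_t caller = (uintptr_t)__builtin_frame_address(0);
    uintptr_t callee = callee_frame();
    assert(caller != 0 && callee != 0);
    return callee < caller;
}
```

On common targets such as x86-64, stack_grows_down() is expected to return 1, because each nested call places its frame below the caller's.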
Now let’s look at the assembly code for the main function. The first three instructions, together with their directives, form the function prolog:
    pushq   %rbp
    .seh_pushreg    %rbp
    movq    %rsp, %rbp
    .seh_setframe   %rbp, 0
    subq    $48, %rsp
    .seh_stackalloc 48
The first two instructions are common to most function prologs:
    pushq   %rbp
    .seh_pushreg    %rbp
    movq    %rsp, %rbp
The first instruction:
push %rbp
pushes the old value of RBP (the value RBP holds on entry, which is the caller's base pointer) onto the stack, and RSP is updated to point at it. For the sake of simplicity, let's assume the stack starts at address 0x1000; after the push, RSP is 0xFF8, because a 64-bit push moves RSP down by 8 bytes.
The second instruction:
mov %rsp, %rbp
copies the current value of RSP into RBP. This means RBP now points to the top of the stack.
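Similarly, for the epilog:
    movl    $0, %eax
    addq    $48, %rsp
    popq    %rbp
    ret
    .seh_endproc
Here movl $0, %eax places main's return value of 0 in EAX, addq $48, %rsp releases the 48 bytes reserved in the prolog, popq %rbp restores the caller's base pointer, and ret returns control to the caller.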
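The prolog and epilog steps for main can be traced with a small simulation in C. This is a sketch under stated assumptions: the Machine struct and the sim_ function names are hypothetical, the stack top of 0x1000 follows the simplification used above, and only RSP, RBP, and a small block of stack memory are modeled.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP   UINT64_C(0x1000)   /* assumed starting value of RSP */
#define STACK_BYTES UINT64_C(0x100)    /* size of the modeled stack region */

typedef struct {
    uint64_t rsp, rbp;
    uint64_t mem[STACK_BYTES / 8];     /* addresses [0xF00, 0x1000) */
} Machine;

/* Maps a simulated stack address to its 8-byte slot. */
static uint64_t *sim_slot(Machine *m, uint64_t addr) {
    assert(addr % 8 == 0);
    assert(addr >= STACK_TOP - STACK_BYTES && addr < STACK_TOP);
    return &m->mem[(addr - (STACK_TOP - STACK_BYTES)) / 8];
}

/* push: move RSP down by 8, then store the value there. */
void sim_push(Machine *m, uint64_t value) {
    m->rsp -= 8;
    *sim_slot(m, m->rsp) = value;
}

/* pop: load the value at RSP, then move RSP up by 8. */
uint64_t sim_pop(Machine *m) {
    uint64_t value = *sim_slot(m, m->rsp);
    m->rsp += 8;
    return value;
}

/* pushq %rbp ; movq %rsp, %rbp ; subq $locals, %rsp */
void sim_prolog(Machine *m, uint64_t locals) {
    sim_push(m, m->rbp);
    m->rbp = m->rsp;
    m->rsp -= locals;
}

/* addq $locals, %rsp ; popq %rbp */
void sim_epilog(Machine *m, uint64_t locals) {
    m->rsp += locals;
    m->rbp = sim_pop(m);
}
```

Starting with RSP at 0x1000 and calling sim_prolog(&m, 48) leaves RBP at 0xFF8 and RSP at 0xFC8; sim_epilog(&m, 48) then puts both registers back to their values on entry, which is exactly what the caller relies on.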
Technical Architect at Societe Generale Global Solution Centre
2 years ago
"push %rbp pushes old value of RBP, the base pointer, onto the stack and RSP is updated." It is not the old value that gets pushed; it is the current value that gets pushed. Given that a register can hold only one value at a time (the current value), and since it doesn't have any "memory" of its previous values, I think there's no point in saying "old value".