Compilation process in C explained step by step.
The compilation process converts one or more source code files into executable binary code for a given hardware/software architecture. In C in particular, this conversion goes through several stages.
First and foremost, let's define source code. The source code is the program that we write as programmers, the plain text that "tells" the computer how to do things.
Executable binary code, on the other hand, for any compiled language in general and for C in particular, is binary code (not text) that the computer can execute. I clarify this because one of the intermediate products of the compilation process is object code, which, although binary, cannot be executed: it must continue on to the next stage, linking.
A simple example
Suppose we have the following source code, the classic "Hello World":
/*
 * File: helloworld.c
 * My first "Hello World" in C
 * charlyhackr - Holberton School.
 */
#include <stdio.h>
int main(void){
    printf("Hello World\n");
    return 0;
}
A simple compilation on GNU/Linux systems would be the following:
gcc helloworld.c -o helloworld
That will generate a binary file called helloworld, whose description will be similar to the following:
carlos@charlyhackr:/tmp$ file helloworld
helloworld: ELF 64-bit LSB pie executable x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 8.2.0, BuildID[sha1]=141789414df18ece4fa044cf91f5e776bf75f959, not stripped
carlos@charlyhackr:/tmp$
This means that the output file helloworld (output, hence the "-o") is of type ELF, the GNU/Linux executable format (similar to .exe files on Windows).
We can execute it without problems:
carlos@charlyhackr:/tmp$ ./helloworld
Hello World
Compiling step by step
Now let's analyze the compilation process step by step: what does a C compiler do internally?
Preprocessing
The first thing the compiler does is preprocess the source file, that is, interpret all the preprocessing directives we have used, such as #define, #include, #ifdef, etc. It also removes all the comments we have written in the file.
In the particular case of our helloworld.c, it will include the stdio.h file (the standard input/output header) and delete the comments.
Let's preprocess our example:
gcc -E helloworld.c -o helloworld.i
The "-E" option tells the compiler (gcc) to stop after preprocessing, and "-o" writes the output to the file helloworld.i. The .i extension is used for preprocessed files.
Now, helloworld.i is still source code, but if we look at its contents we will find something similar to this:
[....]
extern int pclose (FILE *__stream);
extern char *ctermid (char *__s) __attribute__ ((__nothrow__ , __leaf__));
# 840 "/usr/include/stdio.h" 3 4
extern void flockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
extern int ftrylockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__)) ;
extern void funlockfile (FILE *__stream) __attribute__ ((__nothrow__ , __leaf__));
# 868 "/usr/include/stdio.h" 3 4
# 8 "helloworld.c" 2
# 9 "helloworld.c"
int main(void){
    printf("Hello World\n");
    return 0;
}
The upper part contains much more text, produced by expanding stdio.h: all of it is generated when the directive #include <stdio.h> is interpreted. If we had more than one #include, we would see the combined contents of all the included headers.
At the end of this file is the code we already know, our "Hello World", now without the comments.
Compilation
The next step is to compile our code. The result of the compilation is binary code that is NOT executable, called object code, which is usually stored in a file with the ".o" extension.
Let's compile:
gcc -c helloworld.i -o helloworld.o
If we check the file type, we will see that it is an ELF binary file but, unlike the previous one, it is not executable: file reports it as relocatable.
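Linking
The object file cannot run on its own; it still has to be linked. First, we pack it into a library file with ar:
ar -cvr libhelloworld.a helloworld.o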
The file libhelloworld.a is an archive of object files (here it holds just a copy of helloworld.o). In a real project, such a library would contain the functions that must be linked with our object code to create an executable.
Now it is enough to link our object file with this library file. By the way, it is a ".a" file, from the English "archive", and it is a static library, as opposed to dynamic libraries, which on GNU/Linux systems are ".so" files, for "shared object", and are the equivalent of ".DLL" files on Windows.
gcc -Wall helloworld.o -L/tmp/ -lhelloworld -o helloworld
Here we have linked the file helloworld.o with the libhelloworld.a library and generated the helloworld file. The "-L" option indicates the path where the compiler should search for libraries, while "-l" names the particular library that we want to link with the object file, since we can have several.
If we now run "file helloworld", we will see output similar to the first one: an executable ELF (Executable and Linkable Format) file.