C compilation process - An introduction
Introduction
This article looks at what goes on behind the scenes when we compile our C code.
The compilation process
Overview of the compilation process
- The compilation process converts C source code into an executable that the operating system can run.
- This process is composed of four main steps:
  1. Preprocessing
  2. Compilation
  3. Assembly
  4. Linking
- Let's take each step and dissect it to see its internals. Before diving in, there are two rules that we must be aware of:
  1. Only source files (`.c`) are compiled; header files are not compiled on their own.
  2. Each source file is compiled separately.
- For each step, our analysis will cover three main points:
  1. The input of the step.
  2. The function of the step.
  3. The output of the step.
Step 1 - Preprocessing
Input
A C source file (`.c`) is fed into this step.
Function
- Inclusion of the header files in the source code.
  - Each `#include` directive is replaced by the contents of the header file it names.
- Expansion of macros.
  - Every use of a macro defined with the `#define` directive is replaced by the macro's definition.
- Removal of comments.
Output
- The output of this step is an expanded C code file in which all preprocessing directives have been processed.
- The output file has the extension `.i` or `.pre`.
- This file is known as a translation unit or compilation unit.
Viewing preprocessed code
To view the translation unit, we pass the `-E` flag to the GCC compiler. GCC then stops after preprocessing and prints the result to standard output.
gcc -E myProgram.c
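As a rough illustration of the three preprocessing tasks, consider a small source file like the sketch below. The names `BUFFER_SIZE`, `SQUARE`, and `buffer_area` are made up for this example. Running `gcc -E` on such a file should show the contents of `<stddef.h>` pasted in, the macros replaced by their definitions, and the comments gone.

```c
#include <stddef.h>   /* replaced by the header's contents (declares size_t) */

#define BUFFER_SIZE 8            /* object-like macro (hypothetical name) */
#define SQUARE(x)   ((x) * (x))  /* function-like macro (hypothetical name) */

/* This comment is removed by the preprocessor. */
size_t buffer_area(void)
{
    /* After macro expansion the statement below reads: return ((8) * (8)); */
    return SQUARE(BUFFER_SIZE);
}
```

Since this file defines no `main`, it is meant only for inspecting the preprocessor output or for compiling to an object file, not for building a complete program by itself.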
Step 2 - Compiler
Input
- The input of this step is the translation unit.
- The compiler operates on a single translation unit at a time, so when multiple source files need to be compiled, each one is compiled in its own separate run.
Function
- The function of the compiler is to translate the translation unit into an assembly code file, optimizing the code as it does so. How much optimization is applied depends on the options passed to the compiler.
- One of the main jobs of the compiler is to perform lexical analysis and syntax analysis on the translation unit, so errors reported during this step are mainly syntax errors.
- The compiler resolves only what is defined in the current translation unit. A function that is defined elsewhere, such as `printf()`, is left as an unresolved reference that the linker must resolve later.
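To make the last point more concrete, here is a minimal sketch (the function name `print_value` is hypothetical). The compiler only needs the declaration of `printf()` that `<stdio.h>` provides. It emits a call to the symbol `printf` and leaves it to the linker to find the definition in the C standard library.

```c
#include <stdio.h>   /* provides the declaration of printf(), not its definition */

/* Hypothetical helper: prints an integer followed by a newline and
 * returns printf's result, i.e. the number of characters written. */
int print_value(int value)
{
    /* The compiler checks this call against the declaration only;
     * the body of printf is not part of this translation unit. */
    return printf("%d\n", value);
}
```

Compiling this file on its own succeeds even though the body of `printf()` is nowhere in it, which is exactly why a separate linking step is needed afterwards.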
Output
- An assembly code file (`.s` with GCC), also referred to as a list file: it contains the assembly code that corresponds to each of our C statements.
Viewing the assembly code
To view the assembly code, use the `-S` flag of the GCC compiler. This writes the assembly to `myProgram.s`.
gcc -S myProgram.c
Step 3 - Assembler
Input
- The input of this step is assembly code, either generated by the compiler or written by hand.
Function
- Conversion of the assembly code file into an object code file.
Output
- An object file (`.o`).
Creating an object file from an assembly file
as myProgram.s -o myProgram.o
Creating an object file from a C source code file
gcc -c myProgram.c
Step 4 - Linker
Input
The linker takes three kinds of input files:
1. A linker file.
2. One or more object files.
3. Library files.
What is a linker file?
  It's a file written in a linker script language that describes the memory layout (segmentation)
  of the target machine that the code is to be run on.
Function
- Linking of external libraries and object files together.
- Resolution of externally referenced functions and objects, such as the `printf()` function.
- Assignment of memory addresses to code and data according to the linker file.
Output
An executable binary file for the target machine.
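The symbol resolution performed by the linker can be sketched with a variable and a function that are declared in one place and defined in another. The names `counter`, `increment`, and `increment_twice` are hypothetical. Both parts are shown in a single file for brevity; in a real project the definitions would typically live in a second source file, compiled to its own object file, and the linker would connect the references to them.

```c
/* --- What the calling code sees: declarations only --- */
extern int counter;      /* declared here, defined further down */
int increment(void);     /* declared here, defined further down */

/* Uses both symbols. If their definitions were in another source file,
 * the compiler would leave them as external references for the linker. */
int increment_twice(void)
{
    increment();
    return increment();
}

/* --- What would normally live in a second source file: definitions --- */
int counter = 0;

int increment(void)
{
    counter += 1;
    return counter;
}
```

If the second half were moved to its own file and left out of the link, the linker would be expected to report undefined references to `counter` and `increment`, even though each file compiles cleanly by itself.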