C Breakdown: gcc main.c
So you’ve typed up your first program and you’re ready to see it in action. But wait! How do you run a .c file? Well, you can’t, at least not directly. The next step is to run it through a compiler like gcc, which turns it into a program the computer can execute. Let’s break it down.
Overview
C and other high-level programming languages are built to be human readable, not computer readable. They therefore need some form of translation before the computer can understand them. The translators are called either compilers or interpreters. The two are quite similar; the main difference is that a compiler translates the whole program before it runs, while an interpreter translates it at run-time. C is a compiled language, so compilers are what we’ll focus on in this article.
There are four steps a compiler like gcc goes through when turning your code into an executable program: Pre-Processing, Compiling, Assembling, and Linking.
Below you can see the C file that we will be using as an example.
Pre-Processing
The first step is pre-processing. This is where the source code is stripped of everything the compiler doesn’t need and taken down to plain old code. It involves three vital parts. First, the pre-processor reads every #include line and essentially copy-pastes the contents of the named header file right where the #include line was typed. Next, it takes out all comments, since they are there to help programmers understand the code and the program doesn’t need them to work. Finally, it expands all macros, placing each macro’s replacement text everywhere the macro was invoked. This is similar to including a header file, but it works within the same file instead of between files. We can tell gcc to stop after this stage with the option -E. By default gcc prints the result to the terminal rather than saving it, so to keep it you must name an output file with the option -o.
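To make those three jobs concrete, here is a minimal sketch of a source fragment that gives the pre-processor one of each to do. It is not the article’s example file: the macro SQUARE and the function area_of_square are hypothetical names made up for illustration.

```c
#include <stddef.h>   /* job 1: this line is replaced by the contents of stddef.h */

/* Hypothetical macro: every use of SQUARE is replaced by its expansion. */
#define SQUARE(x) ((x) * (x))

/* job 2: comments like this one are stripped out. */
size_t area_of_square(size_t side)
{
    /* job 3: after pre-processing, the line below reads
       return ((side) * (side)); */
    return SQUARE(side);
}
```

Assuming the fragment is saved as square.c (again, a made-up name), running gcc -E square.c -o square.i should leave the pre-processed result in square.i, where you can check each of the three changes for yourself.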
Below you can see part of our example file after pre-processing (they become pretty lengthy during this step).
Compiling
The next thing gcc does is called compiling. This is when the translation really begins. This step takes our pre-processed C code and translates it down a level into a low-level language called Assembly. The Assembly is stored in a file with the extension .s, which gets deleted later unless we ask gcc to stop at this step with the option -S. This is also where most compile-time errors show up. For example, a syntax error somewhere in your source code leaves the compiler unable to translate it, so gcc reports the error and stops.
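As a rough illustration of what this stage works on, here is a tiny sketch of a function that compiles cleanly. The name add_one is hypothetical and is not taken from the article’s example file.

```c
/* Hypothetical function used only to illustrate the compiling stage. */
int add_one(int value)
{
    /* Deleting the semicolon at the end of the next line would be a
       syntax error: gcc would report it during this stage instead of
       producing Assembly. */
    return value + 1;
}
```

Assuming this is saved as add_one.c (a made-up name), gcc -S add_one.c should stop after compiling and leave the Assembly in add_one.s. The exact instructions in that file depend on your machine and gcc version.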
Below you can see our example file after compiling.
Assembling
Now on to the assembling process. This is another translation step, which takes the Assembly code from the compiler down to machine code (aka binary). The result is called an object file and has the extension .o. We can stop gcc at this step with the option -c. While object files are not human readable, they are still useful in some situations. For instance, if you are creating a library for others to use, you want object files instead of an executable. Why? That’s where the next and final step comes in.
Below you can see what is displayed when you try to open an object file in a normal text editor.
Linking
Finally, it’s time to turn our code into an executable program! Linking is the process of taking the newly assembled machine code and “linking” it with any libraries the code uses. I mentioned earlier that keeping your code as object files is useful for libraries. This is because the linker works on object files, not source code. It takes all of your object files, plus the libraries that have already gone through the compiler, and connects them all together into one executable file.
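To see why the linker is needed, consider the sketch below. It squeezes into one listing what would normally be spread over separate files, so that it can be compiled on its own. The names triple and triple_then_add are hypothetical, as are the file names in the comments.

```c
/* Would normally sit in a header or at the top of main.c.
   Declaration only: it tells the compiler that triple exists somewhere. */
int triple(int n);

/* Would normally sit in main.c. The compiler can translate this call
   knowing only the declaration above; it does not need the body. */
int triple_then_add(int n, int extra)
{
    return triple(n) + extra;
}

/* Would normally sit in its own file, say triple.c, built separately
   into triple.o. The linker is what connects the call above to this
   definition. */
int triple(int n)
{
    return 3 * n;
}
```

With the parts really split into two files, and a main function added to main.c, gcc -c triple.c would produce triple.o, and gcc main.c triple.o -o program would link everything into an executable called program. If triple.o were left off that second command, the linker would complain that it cannot find triple.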
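Now we can run our program for the first time! Since we never gave gcc a name for the executable with -o, it is called a.out by default, so on Linux or macOS we run it with ./a.out.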