How to Compile a C Program

Colson Scott

Software Engineer @ Quantum Metric

Published: February 6, 2020

So you've written your first C Program, which is undoubtedly "Hello, World". Now you want to run the program so that you can show off your hard work to all your friends. How do you go about running a C Program? First, there are a few steps to take before your computer can effectively print "Hello, World" to your terminal.

What is gcc?

You're sitting there with a source file named hello.c, wondering how to get to the desired result of "Hello, World". The way to get there is to run your code through a compiler, which in our case is gcc. GCC originally stood for GNU C Compiler, because it was built as the C compiler for GNU, a free operating system. Today the name stands for GNU Compiler Collection, since it compiles a wide variety of languages, including C.

How does gcc work?

Gcc runs your source code (hello.c) through a series of steps to turn it into an executable file: preprocessing, compiling, assembling, and linking. In short, preprocessing inserts the contents of the header files you include and replaces each macro in your code with its value. Compiling then turns the preprocessed code into assembly language, which is the language the assembler understands. The assembler converts the assembly code into machine code in binary form, the only language computers actually understand, and places it in an object file. Finally, linking bundles together all the object files you've told it to use, along with the library code they depend on, into one cohesive executable program.

Step 1 - Preprocessing

Preprocessing can be likened to a translation of your source text into more source text. Any line that begins with # is a preprocessing directive. The preprocessor looks at every header file you've included and inserts its contents into your source file, essentially grabbing the code you ask for with #include <stdio.h> and pasting it in where the #include line was. All your macros (created with #define) are expanded into their respective values wherever they appear in your code. The preprocessor also strips away your comments, since these are meant only for other humans perusing your code and have no meaning to the machine.
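As a concrete illustration, here is a hypothetical hello.c with one #include, one #define, and one comment, written out from the shell so it can be fed to gcc. The file contents are an assumption based on the article's description, and the trailing \n in the string is added only to keep terminal output tidy.

```shell
# Write a stand-in hello.c (assumed contents, not the author's exact file)
cat > hello.c <<'EOF'
#include <stdio.h>
#define AGE 28

/* Print the age stored in the AGE macro */
int main(void)
{
    printf("Age is: %d\n", AGE);
    return 0;
}
EOF

# List the preprocessor directives: every line that starts with '#'
grep -n '^#' hello.c
# 1:#include <stdio.h>
# 2:#define AGE 28
```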

Our hello.c includes the header file <stdio.h> and defines the macro AGE with the value 28. That is how the source file looks before the compilation process begins: it has 2 preprocessor directives, 1 include and 1 define. Now let's look at what happens when we run hello.c through the preprocessor. We can have the gcc compiler stop after the preprocessing stage by adding the -E option.

By stopping the compilation process after the preprocessing stage, we can view the output of this step and see exactly what gets sent to the compilation stage.

The preprocessed output is stored in a file called "hello.i". With -E alone, gcc prints the result to the terminal, so we save it with -o hello.i or by redirecting the output. The .i extension is the convention gcc uses for C source that has already been preprocessed. The file begins with lines that start with #. These are line markers that record which file and line number the text that follows came from, so that later stages can report errors against the original source and headers.

The final lines of the preprocessed output show declarations marked with the extern keyword. They come from <stdio.h> and declare functions such as printf, whose definitions live outside our file in the C library. After them comes our own code: the comments have been stripped away, leaving whitespace in their place, and the macro AGE has been replaced with its value of 28. Now that we've seen what happens at preprocessing, let's move on to compilation.

Step 2 - Compiling

The next step is to take our preprocessed file and move it into the compiling stage. Here, gcc takes the "hello.i" file and transforms it into a file called "hello.s". The .s extension signifies that the file contains assembly language. Assembly language is the language the assembler understands, and it looks slightly more arcane than a preprocessed file. We can have gcc stop after the compiling stage by giving it the -S argument, which writes the result to "hello.s".

If we peer into "hello.s", we see an example of assembly language. Some words are recognizable and their meaning is clear. There are also odd combinations of letters, numbers, and symbols that look completely out of place among the comprehensible English words. We won't go into what it all means, but think of assembly language as a bridge between human-readable C code and the machine code that the computer understands.

We can even see parts of our program we recognize, like string and main. We've translated our code into assembly language for a specific reason: the assembler. It's no coincidence that the assembler understands assembly language. We're still not down to binary yet, so let's take the next step to get there: assembling.

Step 3 - Assembling

This is where our assembly code (hello.s) is transformed into something unreadable to humans: the object file hello.o. Its contents are machine code, or binary, the only language computers understand. The object file comes with one important caveat: only the code from our own source file is translated into binary. Calls to functions defined elsewhere (in our example, printf) are left unresolved, and they will be resolved during linking, the final step of compilation. We can make gcc stop after the assembling stage by giving it the -c argument.

If you open hello.o in a text editor, nearly all of our code is now unreadable to us. A few parts are still recognizable, however: the string "Age is: %d" that we fed to printf, and the names "hello.c", "main", and "printf". Everything else is, well, incomprehensible. This is by design, since we're not running our program, the computer is. The final step, linking, connects our object code to the code for printf, completing the compilation process and producing an executable file. Let's check out linking now.

Step 4 - Linking

We've now arrived at the final step of the compilation process. Linking is the process whereby the linker resolves all calls to functions in external libraries. It also combines all the object code in a project (a large project may have many source files) with the object code from those libraries into one executable file, called "a.out" by default. There is no gcc flag to stop at the linking stage, because it's the final step in the compilation process. However, if "a.out" doesn't suit our needs, we can choose the executable's name by passing gcc the -o argument followed by the name we want.

When we run ls to list the contents of the directory, we see that steps 1-4 have finished and, because we used the -o flag to name the executable hello, a "hello" file has appeared. Many terminals are configured so that ls shows executables in green, which is the computer's way of saying "Hey, this is an executable!"

And finally, the moment you've all been waiting for: it's time for our program to serve its intended purpose. We programmed it to print my age of 28, and we've followed it through the entire compilation process step by step. Now that we have a better understanding of what's going on underneath the hood, let's see the result of all the computer's hard work. We can accomplish this by running ./hello. The "./" tells the shell to look in the folder we're currently in for an executable file named "hello". This was the file name we specified to gcc with gcc -o hello.

In Summary

Just to recap, the compilation carried out by gcc consists of 4 steps. The first step is preprocessing, where gcc inserts the contents of your header files into your source code, strips comments, and expands macros into their values. The second step is compiling, where gcc takes the preprocessed file (.i extension) and turns it into assembly language (.s extension). Assembly language is essentially a bridge between human language and machine language. The third step is assembling, where the assembler translates the assembly language into machine code, or binary, in an object file (.o extension), leaving calls to external functions unresolved. This machine code is finally sent to the linker. The linker bundles the object code from the assembler together with the object code from the external libraries it depends on, producing one single executable file, which is named "a.out" by default. That is, unless we tell gcc otherwise with -o. If you're interested in getting really deep into how gcc works, I suggest reading the gcc man page.

Thanks for taking the time to read this article! I hope you were able to learn a little more about the compilation process and what's going on underneath the hood of your computer. Until next time! Happy Coding!
