Introduction to ELF

David Orejuela

Software Developer | Docker, Flask, FastAPI

Published: March 25, 2021

What is ELF

We are not talking about Christmas elves here; we are talking about the Unix ELF. The Executable and Linkable Format (ELF) is a standard file format on Unix-like systems that defines:

Executable files (Binary files with instructions to be performed by the CPU)
Object code (Sequence of instructions in machine language)
Shared libraries (Instructions to be shared between executable files)
Core dumps (Recorded state of the working memory of a computer program at a specific time)
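As a rough illustration of how these four kinds of file are told apart, the sketch below reads the e_type field that follows the 16-byte identification array at the start of every ELF file. The function name elf_kind and the wording of the labels are illustrative choices made for this example; the numeric values (1 to 4) are the ET_REL, ET_EXEC, ET_DYN and ET_CORE constants from the ELF specification.

```python
import struct

# e_type values from the ELF specification, one per kind of file listed above.
ELF_TYPES = {
    1: "relocatable object code (ET_REL)",
    2: "executable (ET_EXEC)",
    3: "shared object (ET_DYN)",
    4: "core dump (ET_CORE)",
}


def elf_kind(data: bytes) -> str:
    """Describe the kind of ELF file from its first 18 bytes (hypothetical helper)."""
    if len(data) < 18 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[5] (EI_DATA) says how multi-byte fields are encoded:
    # 1 = little-endian, 2 = big-endian.
    byte_order = {1: "<", 2: ">"}.get(data[5])
    if byte_order is None:
        raise ValueError("unknown EI_DATA value")
    # e_type is a 16-bit field located right after the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", data, 16)
    return ELF_TYPES.get(e_type, f"other (e_type={e_type})")
```

It could be called as elf_kind(open("hello_world", "rb").read(18)). Note that many modern toolchains build position-independent executables by default, and those are reported as ET_DYN, the same type used for shared libraries.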

Why it is used

A standard format is used so that machines with different characteristics can all understand the same files. The ELF format offers flexibility, extensibility, and cross-platform support for different endiannesses and address sizes.

The a.out format was used on Linux before ELF, but ELF's design is not limited to a specific processor, instruction set, or hardware architecture. Support for the a.out file format on Linux was deprecated with the release of the 5.1 kernel.

What information is stored in it

Let's create a simple hello world program and compile it:

gcc main.c -o hello_world

The resulting file, hello_world, has multiple sections and segments, plus additional fields that allow our code to be executed.

Executing and Linking ELF files

Note that only the ELF header has a fixed position in any ELF file.

When we are linking (creating the executable), only the sections of the ELF file matter, since they hold the instructions, data, symbol table, relocation information, and other parts of the program.

When we are executing, we only care about the segments of the ELF file. They give the system all the information it needs to prepare the program for execution: memory addresses, permissions, virtual memory addresses, and data.

ELF header

We can inspect the ELF header of a file with readelf:

readelf -h hello_world

The ELF header defines whether the file is designed to use 32-bit or 64-bit addresses. The header contains three fields that are affected by this setting (e_entry, e_phoff and e_shoff), and their size shifts the offset of the fields that follow them.

The ELF header is 52 bytes long for 32-bit binaries and 64 bytes long for 64-bit binaries. Since the ELF header is the only part of the file with a fixed position, it provides the information needed to locate the sections and segments inside the file.
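The size difference can be checked with a small sketch. The function below is a hypothetical helper written for this article, not the way readelf is implemented; it uses Python's struct module and the field order from the ELF specification, switching the three address-sized fields between 4 and 8 bytes according to e_ident.

```python
import struct

FIELD_NAMES = (
    "e_ident", "e_type", "e_machine", "e_version", "e_entry",
    "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
    "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx",
)


def parse_elf_header(data: bytes) -> dict:
    """Decode the ELF header found at offset 0 of `data` (hypothetical helper)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("missing ELF magic number")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported EI_CLASS or EI_DATA")
    order = "<" if ei_data == 1 else ">"  # 1 = little-endian, 2 = big-endian
    # e_entry, e_phoff and e_shoff are the address-sized fields:
    # 4 bytes each ("I") in ELF32, 8 bytes each ("Q") in ELF64.
    addr = "I" if ei_class == 1 else "Q"
    layout = order + "16sHHI" + addr * 3 + "IHHHHHH"
    header = dict(zip(FIELD_NAMES, struct.unpack_from(layout, data, 0)))
    header["bits"] = 32 if ei_class == 1 else 64
    header["byte_order"] = "little" if ei_data == 1 else "big"
    header["size"] = struct.calcsize(layout)  # 52 for ELF32, 64 for ELF64
    return header
```

Reading the first 64 bytes of hello_world and passing them to parse_elf_header should give the same values that readelf -h prints for the entry point, the header offsets and the entry counts.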

Section header

readelf -S hello_world

The section header table describes zero or more sections, which hold program and control information. Sections are needed at link time but not at run time: the section header table can be stripped from a working ELF executable and the program will still run.
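To make the section header table more concrete, here is a minimal sketch that walks it and resolves each section name. It assumes a little-endian 64-bit file, and it takes the table offset, entry count, entry size and string table index as arguments (the e_shoff, e_shnum, e_shentsize and e_shstrndx values reported by readelf -h). The function name is made up for this example.

```python
import struct

ELF64_SHDR = "<IIQQQQIIQQ"  # Elf64_Shdr layout, 64 bytes, little-endian


def parse_section_headers(data, e_shoff, e_shnum, e_shentsize, e_shstrndx):
    """Return (name, sh_type, sh_offset, sh_size) for each section.

    Hypothetical helper; assumes a little-endian ELF64 file held in `data`.
    """
    raw = [
        struct.unpack_from(ELF64_SHDR, data, e_shoff + i * e_shentsize)
        for i in range(e_shnum)
    ]
    # The section at index e_shstrndx holds the NUL-terminated section names;
    # fields 4 and 5 of an entry are sh_offset and sh_size.
    str_off, str_size = raw[e_shstrndx][4], raw[e_shstrndx][5]
    names = data[str_off:str_off + str_size]
    sections = []
    for sh_name, sh_type, _flags, _addr, sh_offset, sh_size, *_rest in raw:
        # sh_name is an offset into the name table, not the name itself.
        end = names.index(b"\0", sh_name)
        sections.append((names[sh_name:end].decode("ascii"), sh_type, sh_offset, sh_size))
    return sections
```

The names are stored indirectly because every section header has a fixed size; keeping the variable-length strings in a separate table is what allows the headers to be indexed like an array.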

Program header

readelf -l hello_world

The program header table describes zero or more memory segments and how the file has to be mapped into memory in order to execute the program; in other words, it tells the system how to create a process image. It is found at file offset e_phoff and consists of e_phnum entries, each of size e_phentsize.

The layout is slightly different in 32-bit ELF vs 64-bit ELF, because p_flags is in a different structure location for alignment reasons. A 64-bit entry holds, in order, p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz and p_align; in a 32-bit entry, p_flags comes after p_memsz instead.
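The two layouts can be sketched as follows. This is a hypothetical helper, not taken from any existing tool; it expects e_phoff, e_phnum and e_phentsize as arguments (the values readelf -h reports) and covers only a handful of p_type values from the ELF specification.

```python
import struct

# A few common p_type values from the ELF specification.
SEGMENT_TYPES = {0: "NULL", 1: "LOAD", 2: "DYNAMIC", 3: "INTERP", 4: "NOTE", 6: "PHDR"}


def parse_program_headers(data, e_phoff, e_phnum, e_phentsize, bits=64, order="<"):
    """Decode the program header table of an ELF file (hypothetical helper)."""
    segments = []
    for i in range(e_phnum):
        offset = e_phoff + i * e_phentsize
        if bits == 64:
            # Elf64_Phdr (56 bytes): p_flags comes right after p_type.
            (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
             p_filesz, p_memsz, _p_align) = struct.unpack_from(order + "IIQQQQQQ", data, offset)
        else:
            # Elf32_Phdr (32 bytes): p_flags comes after p_memsz.
            (p_type, p_offset, p_vaddr, _p_paddr,
             p_filesz, p_memsz, p_flags, _p_align) = struct.unpack_from(order + "IIIIIIII", data, offset)
        # Permission bits: PF_R = 4, PF_W = 2, PF_X = 1.
        perms = "".join(ch if p_flags & bit else "-" for ch, bit in (("r", 4), ("w", 2), ("x", 1)))
        segments.append({
            "type": SEGMENT_TYPES.get(p_type, hex(p_type)),
            "perms": perms,
            "offset": p_offset,
            "vaddr": p_vaddr,
            "filesz": p_filesz,
            "memsz": p_memsz,
        })
    return segments
```

A segment whose memsz is larger than its filesz is one where the loader has to provide extra zero-filled memory, which is how uninitialized data ends up taking no space in the file.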

How this information is stored

All the information inside the file is stored as bytes, and those bytes are interpreted according to the structures defined by the following headers:

ELF header, section header and program header for a 64-bit architecture

Each ELF file is composed of bytes of information, and this information is parsed using the structures defined by the headers. Some common sections are:

.text: code.
.data: initialised data.
.rodata: initialised read-only data.
.bss: uninitialized data.
.plt: PLT (Procedure Linkage Table) (IAT equivalent).
.got: GOT entries are dedicated to dynamically linked global variables.
.got.plt: GOT entries dedicated to dynamically linked functions.
.symtab: global symbol table.
.dynamic: Holds all needed information for dynamic linking.
.dynsym: symbol tables dedicated to dynamically linked symbols.
.strtab: string table of .symtab section.
.dynstr: string table of .dynsym section.
.interp: path of the program interpreter (the run-time dynamic linker, RTLD), stored as an embedded string.
.rel.dyn: global variable relocation table.
.rel.plt: function relocation table.

How to parse this information

Custom code can read any data inside an ELF file as long as it takes the endianness and the address size into account. Of course, there are already tools that obtain information about ELF files, such as nm, objdump, and readelf.

The readelf command

The readelf command displays information about ELF files, and we can choose which part of the file we want to know more about. It performs a similar function to objdump, but it goes into more detail and exists independently of the BFD library. The source code of readelf uses the structures declared in the "elf.h" header to parse the file byte by byte.

The nm command

The .symtab section of an ELF file holds information about each symbol (such as functions and variables) that the code defines or references. The nm command gives us more information about those symbols by listing the symbols from object files. If no object files are listed as arguments, nm assumes the file a.out.
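A rough idea of what nm has to decode is sketched below. It assumes the raw bytes of .symtab and .strtab have already been extracted from a little-endian 64-bit file; the function name and the dictionary keys are made up for this example, while the entry layout and the binding and type codes come from the ELF specification.

```python
import struct

BINDINGS = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
TYPES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE"}


def parse_symbols(symtab: bytes, strtab: bytes) -> list:
    """Decode little-endian Elf64_Sym entries, 24 bytes each (hypothetical helper)."""
    symbols = []
    for offset in range(0, len(symtab) - 23, 24):
        st_name, st_info, _st_other, st_shndx, st_value, st_size = struct.unpack_from(
            "<IBBHQQ", symtab, offset
        )
        # st_name is an offset into .strtab, where names are NUL-terminated.
        end = strtab.index(b"\0", st_name)
        symbols.append({
            "name": strtab[st_name:end].decode("ascii"),
            # st_info packs the binding in the high 4 bits and the type in the low 4.
            "binding": BINDINGS.get(st_info >> 4, "OTHER"),
            "type": TYPES.get(st_info & 0xF, "OTHER"),
            "value": st_value,
            "size": st_size,
            "shndx": st_shndx,
        })
    return symbols
```

The value, the type and the name are roughly the three columns that nm prints for each symbol, with the type shown as a single letter.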

The objdump command

objdump displays information about one or more object files.

The ELF format allows flexibility, since everything needed to understand the content of the file is described by the header. For example, we can inspect the read-only data of our hello world program by looking at the initialized read-only data section (.rodata) of the ELF file.

The ELF format is very popular, so a basic understanding of how the information is parsed and what is stored in the file will help you debug your code more easily.
