C interview questions. Structure alignment in C
This is one of the most popular interview questions.
Calculate the size of the following structure:
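A structure consistent with every figure quoted in this article (9 bytes of fields, 7 bytes of padding, a 2-byte field b, and the reordering steps described under manual packing) can be sketched as follows. The tag name example and the field types are assumptions reconstructed from the text, not necessarily the original listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reconstruction: field types are inferred from the article. */
struct example {
    uint8_t  a;  /* 1 byte  */
    uint16_t b;  /* 2 bytes */
    uint8_t  c;  /* 1 byte  */
    uint32_t d;  /* 4 bytes */
    uint8_t  e;  /* 1 byte  */
};

/* The fields on their own add up to 9 bytes. */
static_assert(3 * sizeof(uint8_t) + sizeof(uint16_t) + sizeof(uint32_t) == 9,
              "field sizes sum to 9 bytes");
```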
It is probably clear to everyone that if the size were simply the sum of the sizes of all the elements of this structure, which in our case is 9 bytes, no one would ask this question.
The correct answer is that the size of the structure is 16 bytes.
Note 1: All examples discussed in this article were run on an STM32F407 microcontroller and compiled with GCC 10.3.
Structure alignment
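For structures in C the following rule applies: each field is aligned in memory on a boundary that is a multiple of its own size. That is, 1-byte fields need no alignment, 2-byte fields are placed at even offsets, 4-byte fields at offsets that are multiples of four, and so on. In addition, the total size of the structure is rounded up to a multiple of its largest field alignment, so that every element of an array of such structures stays aligned.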
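The rule can be turned into a small calculation. The sketch below is only an illustration: the helper names align_up and layout_size are hypothetical, and it assumes that every field's alignment equals its size and that all sizes are powers of two, which holds for the 1-, 2-, and 4-byte integer fields discussed here.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of alignment (a power of two). */
static size_t align_up(size_t offset, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Size of a struct whose fields have the given sizes, in declaration order.
 * Assumption: each field is aligned to its own size. */
static size_t layout_size(const size_t *field_sizes, size_t count) {
    size_t offset = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < count; ++i) {
        offset = align_up(offset, field_sizes[i]);  /* padding before field */
        offset += field_sizes[i];                   /* the field itself */
        if (field_sizes[i] > max_align) {
            max_align = field_sizes[i];
        }
    }
    return align_up(offset, max_align);             /* trailing padding */
}
```

For field sizes 1, 2, 1, 4, 1 in that order the function returns 16, and for 4, 2, 1, 1, 1 it returns 12.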
So in memory our structure actually looks like this:
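Written out by hand, the padded layout might look like the sketch below. The field types are an assumption chosen to match the sizes quoted in the article, and the names padding1 to padding7 follow the text; the tag name example_explicit is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Every byte the compiler would insert is spelled out as a field. */
struct example_explicit {
    uint8_t  a;         /* offset 0      */
    uint8_t  padding1;  /* offset 1      */
    uint16_t b;         /* offset 2..3   */
    uint8_t  c;         /* offset 4      */
    uint8_t  padding2;  /* offset 5      */
    uint8_t  padding3;  /* offset 6      */
    uint8_t  padding4;  /* offset 7      */
    uint32_t d;         /* offset 8..11  */
    uint8_t  e;         /* offset 12     */
    uint8_t  padding5;  /* offset 13     */
    uint8_t  padding6;  /* offset 14     */
    uint8_t  padding7;  /* offset 15     */
};

static_assert(offsetof(struct example_explicit, d) == 8, "d starts at 8");
static_assert(sizeof(struct example_explicit) == 16, "9 data + 7 padding");
```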
Only now can we calculate the size of the structure by adding up the sizes of all its elements, including padding1 to padding7.
In other words, the compiler added 7 extra one-byte elements to our structure.
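The padding can also be observed without writing it out, by checking field offsets with offsetof. This is a sketch under the assumption that the structure has the field types shown here (they are inferred from the sizes in the article) and that uint16_t and uint32_t are naturally aligned, as they are on ARM EABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed field types; the compiler inserts the padding implicitly. */
struct example {
    uint8_t  a;
    uint16_t b;
    uint8_t  c;
    uint32_t d;
    uint8_t  e;
};

static_assert(offsetof(struct example, a) == 0,  "a at offset 0");
static_assert(offsetof(struct example, b) == 2,  "1 padding byte before b");
static_assert(offsetof(struct example, c) == 4,  "c follows b directly");
static_assert(offsetof(struct example, d) == 8,  "3 padding bytes before d");
static_assert(offsetof(struct example, e) == 12, "e follows d directly");
static_assert(sizeof(struct example) == 16,      "3 trailing padding bytes");
```

If any of these assumptions does not hold for a given compiler or target, the build fails at the corresponding static_assert instead of silently producing a different layout.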
Structure packing
So we have seen that the compiler adds padding, but can this be avoided if necessary? Such a need may arise in at least two cases:
In both cases, you can get rid of the padding between the elements of the structure by packing them. There are two ways to do this: manually and with a compiler directive.
Manual structure packing
With this kind of packing, you rearrange the elements of the structure yourself so as to eliminate as many of the compiler-inserted padding elements as possible. In our example, if the last element of the structure, e, is moved to the second position, the element padding1 disappears. If the element d is then swapped with c, the elements padding2, padding3, and padding4 disappear as well, and the size of the structure becomes 12 bytes instead of 16:
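The reordering described above can be sketched like this. The field types are assumptions inferred from the sizes in the article, and the tag name example_reordered is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* e moved to second place, then d and c swapped. */
struct example_reordered {
    uint8_t  a;  /* offset 0     */
    uint8_t  e;  /* offset 1     */
    uint16_t b;  /* offset 2..3  */
    uint32_t d;  /* offset 4..7  */
    uint8_t  c;  /* offset 8, followed by 3 trailing padding bytes */
};

static_assert(offsetof(struct example_reordered, b) == 2, "no gap before b");
static_assert(offsetof(struct example_reordered, d) == 4, "no gap before d");
static_assert(sizeof(struct example_reordered) == 12, "only tail padding left");
```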
To avoid such fine-tuning every time, there is a simple rule of thumb, which is to arrange the elements of the structure in descending order of size:
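Applied to the example, the descending-size rule might give the layout below. As before, the field types are assumptions inferred from the article and the tag name example_sorted is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest fields first, single bytes last. */
struct example_sorted {
    uint32_t d;  /* offset 0..3 */
    uint16_t b;  /* offset 4..5 */
    uint8_t  a;  /* offset 6    */
    uint8_t  c;  /* offset 7    */
    uint8_t  e;  /* offset 8, followed by 3 trailing padding bytes */
};

static_assert(offsetof(struct example_sorted, e) == 8, "fields are contiguous");
static_assert(sizeof(struct example_sorted) == 12, "rounded up to 4-byte multiple");
```

The size is still 12 rather than 9 because the structure as a whole keeps the 4-byte alignment of its largest field.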
Besides requiring manual adjustment, this type of packing has another drawback: in some cases it cannot remove all the padding. In our example, the fields padding5, padding6, and padding7 still remain in the structure. To get rid of them, you need the second packing method.
Structure packing by using compiler directive
Here a special compiler directive supported by GCC is used, #pragma pack:
This directive tells the compiler to drop the alignment padding and place all the elements of the structure sequentially, one after another. With it you get the minimum size of the structure, 9 bytes:
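A minimal sketch of the directive in use is shown below. It uses the push/pop form of #pragma pack so that the setting applies only to this one structure; the field types are assumptions inferred from the article and the tag name example_packed is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#pragma pack(push, 1)   /* save current packing, then pack to 1 byte */
struct example_packed {
    uint8_t  a;  /* offset 0     */
    uint16_t b;  /* offset 1..2, no longer on an even address */
    uint8_t  c;  /* offset 3     */
    uint32_t d;  /* offset 4..7  */
    uint8_t  e;  /* offset 8     */
};
#pragma pack(pop)       /* restore the previous packing */

static_assert(offsetof(struct example_packed, b) == 1, "b is unaligned");
static_assert(sizeof(struct example_packed) == 9, "no padding at all");
```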
Why does the compiler need to align structures?
There are two reasons: faster access to structure elements, and avoiding HardFault/MemFault on some processor architectures.
Acceleration of access to structure elements
Access to aligned variables is usually faster than access to unaligned ones. Let's look at an example of writing to the elements of an aligned and an unaligned structure:
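A pair of test structures for such a comparison might look like the sketch below. The structure tags, function names, field types, and written values are all hypothetical; only the presence of a 2-byte element b that becomes unaligned after packing is taken from the article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct aligned_s {
    uint8_t  a;
    uint16_t b;  /* offset 2: halfword-aligned */
    uint8_t  c;
};

#pragma pack(push, 1)
struct packed_s {
    uint8_t  a;
    uint16_t b;  /* offset 1: not halfword-aligned */
    uint8_t  c;
};
#pragma pack(pop)

static_assert(offsetof(struct aligned_s, b) == 2, "b aligned");
static_assert(offsetof(struct packed_s, b) == 1, "b unaligned");

/* Same writes to both structures; only the layout differs. */
void write_aligned(struct aligned_s *s) {
    s->a = 0x11;
    s->b = 0x2233;
    s->c = 0x44;
}

void write_packed(struct packed_s *s) {
    s->a = 0x11;
    s->b = 0x2233;
    s->c = 0x44;
}
```

Compiling both functions for the target with the GCC option -S makes it possible to compare the instructions generated for each.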
To write to the aligned structure, the compiler generates the following code:
And to write to the unaligned structure:
As you can see, the assembler code for the two cases is identical except for the part that works with the unaligned element .b.
For this element, the compiler is forced to use two strb instructions to write the high and low bytes instead of a single strh instruction that writes a half-word, and it also needs logical OR-NOT instructions along the way.
That is, instead of 3 instructions we get 7. The relationship is simple: the more processor instructions it takes to access an element of the structure, the longer the access takes.
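In C terms, the two byte stores correspond roughly to the sketch below. The function name store_u16_bytewise is hypothetical, and the byte order assumes a little-endian target such as the Cortex-M4 used here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 16-bit value one byte at a time, so the destination address
 * does not need to be halfword-aligned. Little-endian byte order. */
static void store_u16_bytewise(uint8_t *dst, uint16_t value) {
    assert(dst != NULL);
    dst[0] = (uint8_t)(value & 0xFFu);  /* low byte: first byte store   */
    dst[1] = (uint8_t)(value >> 8);     /* high byte: second byte store */
}
```

Between the two stores the destination briefly holds a mix of the old and the new value, which is why such a write is not a single indivisible operation.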
However, these extra instructions are generated by the compiler, and for many processor architectures they are unnecessary, because the hardware fully supports unaligned reads and writes in a single instruction. For example, processors of the ARMv7-M architecture (Cortex-M3, M4, and M7) can write a uint16_t variable to an unaligned address with a single strh instruction.
According to the ARMv7-M Architecture Reference Manual:
The following data accesses support unaligned addressing, and only generate alignment faults when the CCR.UNALIGN_TRP bit is set to 1, see Configuration and Control Register, CCR on page B3-660:
Non halfword-aligned LDR{S}H{T} and STRH{T}.
Non halfword-aligned TBH.
Non word-aligned LDR{T} and STR{T}.
Given the information above, I had a question: is it possible to somehow force the compiler to generate more efficient code for such architectures? I haven't found the answer yet, but if any reader knows it, I would be grateful if you shared it with me.
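One avenue that may be worth trying is the common memcpy idiom for unaligned accesses, sketched below with hypothetical helper names. It leaves the choice of instructions to the compiler, which can emit a single halfword access when it knows the target permits unaligned addresses. GCC's ARM back end documents the -munaligned-access and -mno-unaligned-access options for controlling this; whether the result is a single strh on the STM32F407 with GCC 10.3 depends on the optimization level and options and has not been verified here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store a 16-bit value at an address of any alignment. */
static void store_u16_unaligned(void *dst, uint16_t value) {
    assert(dst != NULL);
    memcpy(dst, &value, sizeof value);
}

/* Load a 16-bit value from an address of any alignment. */
static uint16_t load_u16_unaligned(const void *src) {
    uint16_t value;
    assert(src != NULL);
    memcpy(&value, src, sizeof value);
    return value;
}
```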
Note 2: Besides increasing the number of instructions needed to write an unaligned value, splitting the write into two separate stores for the high and low bytes makes writing a value to the .b element a non-atomic operation.
Avoiding HardFault/MemFault
On some architectures, access to unaligned memory can not only slow the program down a little, but also cause an exception (MemoryFault/HardFault). Such architectures include, for example, ARMv6-M (Cortex-M0, Cortex-M0+). So, if you are writing firmware for such architectures, packing structures with the compiler directive is strongly discouraged.
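When a pointer of uncertain origin has to be dereferenced on such a target, its alignment can be checked first. The helper below is a minimal sketch; the name is_aligned is hypothetical, and the check relies on converting the pointer to uintptr_t, which works as expected on the flat address spaces of Cortex-M devices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true if ptr is a multiple of alignment (alignment must be nonzero). */
static bool is_aligned(const void *ptr, size_t alignment) {
    assert(alignment != 0);
    return ((uintptr_t)ptr % alignment) == 0;
}
```

For example, a uint16_t should only be read through a plain uint16_t pointer when is_aligned(ptr, 2) holds; otherwise a byte-wise or memcpy-based access is the safer choice.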