C interview questions. Bit-fields

Note: Almost all examples were compiled with GCC 12.2 for ARM.

Intro

Bit fields provide convenient access to individual bits of data. They allow you to create objects whose size is not a multiple of a byte.

A bit field cannot exist on its own. It can only be a member of a structure or union. Bit fields have the following form: <type> <name> : <size>;

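A minimal sketch of this form, using only the types the standard permits. The structure name, the field names and the helper store_mode are hypothetical; they only illustrate how a value behaves when it passes through a 3-bit field.

```c
/* General form of a bit-field member:  <type> <name> : <size>;  */
struct status {
    unsigned int ready : 1;   /* 1 bit, values 0..1 */
    unsigned int mode  : 3;   /* 3 bits, values 0..7 */
    signed int   delta : 4;   /* 4 bits, signed */
};

/* Store a value in the 3-bit field and read it back. */
unsigned int store_mode(unsigned int value)
{
    struct status s = {0};
    s.mode = value;           /* unsigned conversion keeps the value modulo 8 */
    return s.mode;
}
```

Calling store_mode with a value that fits in 3 bits returns it unchanged; a larger value is reduced modulo 8.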

For <type>, you can use int (signed or unsigned), _Bool, or an implementation-defined type.

C99 standard (6.7.2.1): A bit-field shall have a type that is a qualified or unqualified version of _Bool, signed int, unsigned int, or some other implementation-defined type.

<name> is an arbitrary identifier, and <size> is a non-negative integer constant expression that must not exceed the width of <type> in bits (a size of zero is allowed only for unnamed fields):

C99 standard (6.7.2.1): The expression that specifies the width of a bit-field shall be an integer constant expression with a nonnegative value that does not exceed the width of an object of the type that would be specified were the colon and expression omitted. If the value is zero, the declaration shall have no declarator.

The width of a bit field can also be written as a constant expression:

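As an illustration, the widths in the following sketch are computed from constants and sizeof. MODE_BITS, the layout of struct config and the helper store_flags are assumptions made for this example.

```c
#include <limits.h>

#define MODE_BITS 3   /* hypothetical width chosen for this sketch */

struct config {
    unsigned int mode  : MODE_BITS;
    unsigned int flags : CHAR_BIT - MODE_BITS;   /* the rest of one byte */
    unsigned int level : sizeof(short) * 2;      /* sizeof is a constant expression too */
};

/* Store a value in the flags field and read it back. */
unsigned int store_flags(unsigned int value)
{
    struct config c = {0};
    c.flags = value;          /* truncated to CHAR_BIT - MODE_BITS bits */
    return c.flags;
}
```

With 8-bit bytes the flags field is 5 bits wide, so storing a value with all bits set reads back as 31.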

Packing of bit fields

The compiler tries to pack as many bit fields as possible into a storage unit of the size given by <type>. If a bit field does not fit into the space remaining in that unit, a new unit is allocated for it.

C99 standard (6.7.2.1): An implementation may allocate any addressable storage unit large enough to hold a bit-field. If enough space remains, a bit-field that immediately follows another bit-field in a structure shall be packed into adjacent bits of the same unit. If insufficient space remains, whether a bit-field that does not fit is put into the next unit or overlaps adjacent units is implementation-defined. The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.

Example 1:

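A structure of this kind might look as follows. The name example1 and the widths 3 and 5 are assumptions; uint8_t bit fields are an implementation-defined extension that GCC accepts.

```c
#include <stddef.h>
#include <stdint.h>

/* 3 + 5 = 8 bits, so both fields can share one uint8_t storage unit. */
struct example1 {
    uint8_t a : 3;
    uint8_t b : 5;
};

size_t example1_size(void) { return sizeof(struct example1); }
```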

These bit fields are packed into a single uint8_t, so the size of the structure is 1 byte.

Example 2:

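A sketch of the second case, again with assumed names and widths. The widths 5 and 4 are chosen so that the total is 9 bits, one more than a uint8_t can hold.

```c
#include <stddef.h>
#include <stdint.h>

/* 5 + 4 = 9 bits: b does not fit into the 3 bits left after a. */
struct example2 {
    uint8_t a : 5;
    uint8_t b : 4;
};

size_t example2_size(void) { return sizeof(struct example2); }
```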

These bit fields are packed into two uint8_t units, so the size of the structure is 2 bytes, because the b field does not fit into the space remaining after the a field.

Choosing the field <type>

Let's look at a few more examples:

These examples clearly show the importance of choosing the right field <type> for each bit field. For example, if you change the type from uint16_t to uint8_t in the byte structure, this will allow the compiler to pack the entire structure into 1 byte, instead of 2:

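The effect of the type change can be sketched like this. The structure names and the 4-bit widths are hypothetical, and the sizes in the comments are what GCC typically reports, not something the standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

struct wide {            /* storage unit is uint16_t: typically 2 bytes */
    uint16_t a : 4;
    uint16_t b : 4;
};

struct narrow {          /* storage unit is uint8_t: typically 1 byte */
    uint8_t a : 4;
    uint8_t b : 4;
};

size_t wide_size(void)   { return sizeof(struct wide); }
size_t narrow_size(void) { return sizeof(struct narrow); }
```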

However, this trick no longer works for the byte1 structure: even if you replace uint16_t with uint8_t, the compiler still cannot pack all the bit fields into a single uint8_t, because 5 bits + 4 bits = 9 bits.

We also cannot reduce the size of the byte2 structure. The type of the bit field a cannot be changed from uint16_t to uint8_t, because its width is larger than 8 bits. Changing the type of the bit field b achieves nothing either, because enough free space is left after a to pack both fields into a single uint16_t.

Finally, consider the last example, the byte3 structure. Here, changing the type of the bit field b from uint16_t to uint8_t does not change the size of the structure either. Because of the width of the field a, the compiler cannot fit the entire structure into 16 bits, so the field a occupies 2 bytes as expected and the field b occupies 1 byte. The compiler then adds one padding byte to the end of the structure. Why it does this is explained in this article.

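A sketch in the spirit of byte3, with assumed widths of 12 and 5 bits. The two fields need 17 bits, so the structure cannot be smaller than 3 bytes; whether a trailing padding byte is added depends on the alignment the compiler gives the structure.

```c
#include <stddef.h>
#include <stdint.h>

struct mixed {
    uint16_t a : 12;   /* wider than 8 bits, so it needs a 16-bit unit */
    uint8_t  b : 5;    /* does not fit into the 4 bits left after a */
};

size_t mixed_size(void) { return sizeof(struct mixed); }
```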

Unnamed zero-size bit-fields

Unnamed zero-size bit fields have a special function: they stop packing into the current storage unit.

C99 standard (6.7.2.1): A bit-field declaration with no declarator, but only a colon and a width, indicates an unnamed bit-field. As a special case, a bit-field structure member with a width of 0 indicates that no further bit-field is to be packed into the unit in which the previous bit-field, if any, was placed.

If you do not want the compiler to pack two adjacent bit fields together, put an unnamed zero-size bit field between them. It forces the compiler to skip ahead to the next boundary of the specified <type>.

Consider the following structure as an example:

As you can see, the compiler packed all the bit fields into 1 byte as expected.

Now we will add unnamed zero-size bit fields of various types to this structure and see what will change.

Unnamed field uint8_t

Let's add an unnamed zero-size field of type uint8_t:

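A sketch of this change, with hypothetical structure names and 4-bit fields. The only difference between the two structures is the unnamed zero-size field.

```c
#include <stddef.h>
#include <stdint.h>

struct packed {
    uint8_t a : 4;
    uint8_t b : 4;       /* shares the unit with a */
};

struct split {
    uint8_t a : 4;
    uint8_t   : 0;       /* nothing more may be packed into a's unit */
    uint8_t b : 4;       /* starts in a new uint8_t unit */
};

size_t packed_size(void) { return sizeof(struct packed); }
size_t split_size(void)  { return sizeof(struct split); }
```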

As you can see, the size of the structure is now 2 bytes: the unnamed field does not allow the compiler to pack both bit fields into 1 byte, so it had to allocate a separate uint8_t storage unit for the bit field b.

Unnamed field uint16_t

Let's add an unnamed zero-size field of type uint16_t:

The byte_3 structure contains an unnamed zero-length uint16_t field, which shifts the b field to the next 16-bit boundary, so the structure now occupies 3 bytes.

Unnamed field uint32_t

Let's add an unnamed zero-size field of type uint32_t:

The byte_4 structure contains an unnamed zero-length uint32_t field, which shifts the b field to the next 32-bit boundary, so the structure now occupies 5 bytes.

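The shifts described above can be checked without reading assembly. The following sketch clears a structure, sets only the b field and looks for the first byte that changed. The structure names and helper functions are assumptions for this example, and the approach relies on the compiler leaving padding bytes untouched, which is typical but not guaranteed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pad16 { uint8_t a : 4; uint16_t : 0; uint8_t b : 4; };
struct pad32 { uint8_t a : 4; uint32_t : 0; uint8_t b : 4; };

/* Index of the first non-zero byte of an object, or n if every byte is zero. */
static size_t first_set_byte(const void *obj, size_t n)
{
    const unsigned char *p = obj;
    size_t i = 0;
    while (i < n && p[i] == 0)
        i++;
    return i;
}

/* Byte offset of b: clear the object, set only b, find the byte that changed. */
size_t pad16_offset_of_b(void)
{
    struct pad16 s;
    memset(&s, 0, sizeof s);
    s.b = 0xF;
    return first_set_byte(&s, sizeof s);
}

size_t pad32_offset_of_b(void)
{
    struct pad32 s;
    memset(&s, 0, sizeof s);
    s.b = 0xF;
    return first_set_byte(&s, sizeof s);
}
```

With the layout described in the text, the first function would report byte 2 and the second byte 4; other compilers or targets may place the field elsewhere.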

When the alignment of an anonymous field does not work

However, if the next field already starts at a bit offset that is a multiple of the width of <type>, the unnamed zero-length field adds no shift.

It makes no sense to consider unnamed uint8_t fields here, because they are always aligned to the boundary of their type. Therefore, we will focus only on examples with unnamed fields of types uint16_t and uint32_t.

Take the structure of byte_3, but this time, between the bit field a and the unnamed bit field, we will add a few more fields of the size necessary to align the unnamed bit field to the boundary <type>.

The byte_3 structure has the uint16_t unnamed field of zero length and, accordingly, should shift the b field to the next 16-bit boundary. However, given the fact that the b field is already aligned along the 16-bit boundary, the anonymous field does nothing and the size of this structure will not change.

The same holds for an unnamed uint32_t field:

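A sketch of this situation for the 16-bit case, with hypothetical fields a and c that together fill exactly 16 bits. It probes the byte in which b lands, with and without the unnamed field, by clearing the structure and setting only b. As before, this assumes the compiler leaves padding bytes untouched.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct no_gap   { uint8_t a : 8; uint8_t c : 8; uint8_t b : 4; };
struct zero_gap { uint8_t a : 8; uint8_t c : 8; uint16_t : 0; uint8_t b : 4; };

/* Index of the first non-zero byte of an object, or n if every byte is zero. */
static size_t first_set_byte(const void *obj, size_t n)
{
    const unsigned char *p = obj;
    size_t i = 0;
    while (i < n && p[i] == 0)
        i++;
    return i;
}

size_t no_gap_offset_of_b(void)
{
    struct no_gap s;
    memset(&s, 0, sizeof s);
    s.b = 0xF;                 /* only b is non-zero */
    return first_set_byte(&s, sizeof s);
}

size_t zero_gap_offset_of_b(void)
{
    struct zero_gap s;
    memset(&s, 0, sizeof s);
    s.b = 0xF;
    return first_set_byte(&s, sizeof s);
}
```

If b already starts on a 16-bit boundary, both functions report the same byte.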

An unnamed field of non-zero size

Only unnamed fields of zero size disable the packing of bit fields. For an unnamed field of non-zero size, the ordinary packing rules apply.

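A sketch of a register-like structure with reserved bits. The names and widths are hypothetical; the point is that the unnamed 3-bit field simply occupies space in the same unit as its neighbours.

```c
#include <stddef.h>
#include <stdint.h>

struct ctrl_reg {
    uint8_t enable : 1;
    uint8_t        : 3;   /* reserved: takes up space but cannot be accessed */
    uint8_t mode   : 4;
};

size_t ctrl_reg_size(void) { return sizeof(struct ctrl_reg); }

/* Write the mode field and read it back; the reserved bits play no part. */
unsigned int ctrl_reg_mode(unsigned int mode)
{
    struct ctrl_reg r = {0};
    r.mode = mode;
    return r.mode;
}
```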

The structures byte_2, byte_3 and byte_4 contain unnamed fields 1 bit wide, which are packed into the same unit as the field a, as is the field b. Fields of this kind can be used to reserve a certain number of bits. Hardware protocols, for example, often have such reserved bits.

Alignment of structures with bit fields

If a structure contains regular fields alongside bit fields, the first bit field is shifted to the boundary of its <type>.

Let's look at the following structure:

According to the rule described above, in the example structure, the b field will be aligned to a 4-byte boundary:

However, if you specify a number of bits less than 32 for the b field, then the compiler may (or may not) allocate a type other than uint32_t for this field. For example, for the field b in the form: uint32_t b: 8, a field of the type uint8_t will be allocated, and accordingly there will be no alignment:

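Both cases can be compared in a short sketch. The structure names are assumptions, and the sizes in the comments are typical GCC results rather than guarantees.

```c
#include <stddef.h>
#include <stdint.h>

struct full_width {
    uint8_t  a;
    uint32_t b : 32;   /* needs a whole 32-bit unit: typically 8 bytes in total */
};

struct short_width {
    uint8_t  a;
    uint32_t b : 8;    /* may be placed right after a: typically 4 bytes in total */
};

size_t full_width_size(void)  { return sizeof(struct full_width); }
size_t short_width_size(void) { return sizeof(struct short_width); }
```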

The order of the individual bits in the bit fields

Look at an example:

The question is: in what order are the b fields placed within the storage unit? Starting from the least significant bit, or starting from the most significant bit?

The problem is that the bit packing order is not fixed by the standard (more precisely, it is implementation-defined).

Let's take the following structure and look at it on CPUs with different byte orders.

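A structure of this kind can be sketched as a union that overlays eight 1-bit fields on a raw byte. The member names bits, raw and b0 to b7 are assumptions based on the identifiers mentioned in the text; the helper reports which end of the byte the first declared field occupies.

```c
#include <stdint.h>

union byte_view {
    struct {
        uint8_t b0 : 1;
        uint8_t b1 : 1;
        uint8_t b2 : 1;
        uint8_t b3 : 1;
        uint8_t b4 : 1;
        uint8_t b5 : 1;
        uint8_t b6 : 1;
        uint8_t b7 : 1;
    } bits;
    uint8_t raw;
};

/* Raw byte after setting only the first declared field:
 * 0x01 if fields are allocated from the least significant bit,
 * 0x80 if they are allocated from the most significant bit. */
uint8_t raw_with_b0_set(void)
{
    union byte_view v;
    v.raw = 0;
    v.bits.b0 = 1;
    return v.raw;
}
```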

For ARM (LE):

For PowerPC(BE):

This is a little unexpected, because if we look at the assembly files for these two architectures, we see the following. For ARM (LE):

For PowerPC(BE):

As you can see, the bit sequence is different for different architectures.

For ARM (LE), the 0th bit in a byte is in the rightmost position, i.e. offset 0, and for PowerPC (BE), the 0th bit is in the leftmost position, i.e. offset 7.

But at the same time, printing the raw value with printf("b1 = 0x%02x\n", byte.raw); gives the same output on both CPUs.

Why?

Because here the compiler comes to our aid. Let's set the bit b0 to one in our code and look at the assembly code again:

ARM (LE):

As you can see here, the compiler generates quite logical code and sets the 0th bit.

PowerPC(BE):

Here the compiler, knowing that the code is generated for a BE architecture, sets the 7th bit, because on this architecture it is treated as bit zero. As a result, thanks to the compiler, we notice no difference when working with bit fields on LE and BE. However, first, as mentioned above, the bit packing order is not defined by the standard, so you cannot rely on it. Second, if you need to work with data received from a processor with a different byte order, or with a hardware bit protocol that you communicate with over the network, the compiler will not help you, and you must take this into account when handling that protocol through bit fields.
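
One common way to stay independent of the compiler's bit order when handling such data, not taken from the article, is to build and parse the bytes with explicit shifts and masks. The wire layout in this sketch (version in the upper four bits, flags in the lower four) and all names are purely hypothetical.

```c
#include <stdint.h>

/* Hypothetical wire format: bits 7..4 = version, bits 3..0 = flags. */
#define VERSION_SHIFT 4u
#define FIELD_MASK    0x0Fu

/* Build the header byte from its two 4-bit parts. */
uint8_t header_pack(uint8_t version, uint8_t flags)
{
    return (uint8_t)(((version & FIELD_MASK) << VERSION_SHIFT) |
                     (flags & FIELD_MASK));
}

/* Extract the upper four bits. */
uint8_t header_version(uint8_t header)
{
    return (uint8_t)((header >> VERSION_SHIFT) & FIELD_MASK);
}

/* Extract the lower four bits. */
uint8_t header_flags(uint8_t header)
{
    return (uint8_t)(header & FIELD_MASK);
}
```

Because the positions are spelled out in the code, the resulting byte is the same on every compiler and CPU.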

Signed and unsigned bit fields

Bit fields can be either signed or unsigned. Here lies one nuance that I could not understand for a very long time: why do bit fields need a signed data type? The reason is that I perceived bit fields merely as a collection of individual bits, which is fundamentally wrong, because bit fields in the C language are a way of working with data types whose size is not a multiple of a byte.

That is, a declaration such as uint8_t a:4 does not just mean a set of 4 logically connected bits. It means a data type 4 bits in size, and that data type can be either signed or unsigned.

C99 standard (6.7.2.1): A bit-field is interpreted as a signed or unsigned integer type consisting of the specified number of bits.

That is, if the range of values for the type uint8_t is [0, 255] and for the type int8_t is [-128, 127], then for the bit field uint8_t a:4 the range is [0, 15] and for int8_t a:4 it is [-8, 7]. Let's show this by example:

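A minimal sketch of these ranges, using signed int and unsigned int instead of int8_t and uint8_t to stay within the types the standard guarantees. The names are hypothetical. Storing a value outside the range of a signed field is implementation-defined, so the usage below sticks to in-range signed values.

```c
struct nibbles {
    unsigned int u : 4;   /* range 0..15 */
    signed int   s : 4;   /* range -8..7 on two's complement targets */
};

/* Store a value in the unsigned 4-bit field and read it back. */
unsigned int store_unsigned(unsigned int value)
{
    struct nibbles n = {0};
    n.u = value;          /* reduced modulo 16 */
    return n.u;
}

/* Store a value in the signed 4-bit field and read it back. */
int store_signed(int value)
{
    struct nibbles n = {0};
    n.s = value;          /* well defined only for -8..7 */
    return n.s;
}
```

Storing 15 in the unsigned field reads back as 15, while 16 wraps around to 0. The signed field holds -8 and 7 without change.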
Dr Chris Rose MIET MIEEE

Embedded C programming | Digital, analogue & power electronic design | Prototyping | Engineering training provider

2 years ago

I tend to avoid bitfields. The biggest problem with them is highlighted in your article - too much of their behaviour is implementation defined! Given they are most useful in situations where those fine details are important, they're not as useful feature as they first seem.

Great article. Low level bit twiddling is a bit out of fashion these days but many embedded systems still use bitfields to set all sorts of configurations, so it's still very much relevant there.
