Strategies for reserving memory for std::vector

I haven't written here lately, but a post by Amir Kirsh caused me to think of a few things - so I decided to share them here as well.

Tip: consider strategies for reserving memory in vectors

So the basic example that I'm shamelessly stealing from Amir is this piece of code:

std::vector<std::string> strs;
for (std::size_t i = 0; i < 1000; ++i) {
    strs.push_back(std::to_string(i));
}

Here we generate a vector that holds the strings "0" to "999", and while in terms of functionality it is fine - adding the elements one by one causes the vector to keep reallocating memory and copy\move the elements from the old memory location to the new one. But since we know how much memory we are going to need, we can tell vector to allocate all of it in advance:

std::vector<std::string> strs;
strs.reserve(1000);
for (std::size_t i = 0; i < 1000; ++i) {
    strs.push_back(std::to_string(i));
}

Great, much better! If we know how much memory we are going to need we can reserve it (the amount of memory the vector has is called the capacity) in advance and save all the reallocations and copy\move operations. But what if we don't know what size the vector is going to be? Well, there are still a couple of options to consider.
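To see the difference concretely, here is a small sketch of my own (not from the original post) that counts reallocations by watching `capacity()` change across `push_back` calls; the exact counts depend on the implementation's growth policy:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Count how many times the vector reallocates while push_back'ing n
// strings, optionally calling reserve(hint) first (hint == 0 means no
// reserve). A reallocation is detected as a change in capacity().
std::size_t count_reallocations(std::size_t n, std::size_t hint = 0) {
    std::vector<std::string> strs;
    if (hint > 0) {
        strs.reserve(hint);
    }
    std::size_t reallocations = 0;
    std::size_t last_capacity = strs.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        strs.push_back(std::to_string(i));
        if (strs.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = strs.capacity();
        }
    }
    return reallocations;
}
```

With GCC's or Clang's doubling growth, `count_reallocations(1000)` comes out to 11, while `count_reallocations(1000, 1000)` is 0 - the reserved version never reallocates inside the loop on any conforming implementation.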

Before going any further, I'm going to make some assumptions, mostly so it will be easy for me to do the calculations:

  • While the standard doesn't specify exactly how the capacity of the vector should increase, a common rule of thumb (and, from what I can tell, what GCC and Clang do in practice) is to multiply the previous capacity by 2. So in the first example, the capacity of the vector will be: 0, 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024. So I'm going to assume this behavior for the vector.
  • I'm going to assume that all memory allocations are equally costly. This is not necessarily true, especially if you are using a custom memory allocator.
  • Much in the same way, I'm going to assume that all the copy\move operations are of equal cost (think of the short string optimization for why that may not be the case).
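Under those assumptions, the growth sequence can be observed directly. This sketch is my own illustration (not from the post); the exact sequence is implementation-defined, so treat the doubling as an assumption:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Record each distinct capacity the vector reports while push_back'ing
// n strings. With GCC/Clang this yields 0, 1, 2, 4, ..., i.e. doubling;
// other implementations (e.g. MSVC's 1.5x growth) will differ.
std::vector<std::size_t> capacity_steps(std::size_t n) {
    std::vector<std::string> v;
    std::vector<std::size_t> steps{v.capacity()};
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(std::to_string(i));
        if (v.capacity() != steps.back()) {
            steps.push_back(v.capacity());
        }
    }
    return steps;
}
```

Whatever the growth factor, the recorded capacities are strictly increasing and the last one is at least the element count.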

So I'm going to consider three options for when we don't know exactly how much memory we are going to need.

Take your best guess

This is particularly effective when you are building your vector out of another source of information with a known size. The cost of guessing a number which is a little too large is a small memory waste, while the cost of guessing a number which is a little too small is a large memory waste (since the vector will reallocate and double the amount of memory it uses) and a large computation overhead (to copy\move almost all the elements) - so it is better to overestimate in this case.

For example, let us assume we want to create a vector of the names of all the students that have a birthday this month. While we know how many students there are, we don't know in advance how many were born this month - but it's probably about 1/12 of them, so a slightly higher number might be a good guess, for example:

birthday_students_vec.reserve(total_number_of_students / 10);        

This will skip all the unnecessary reallocations and copy\move operations. You might end up with a little memory consumption overhead, but this is likely to happen anyway if you let the vector do the reallocations on its own.
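Putting the guess into context, here is a sketch of the birthday example; `Student` and its `birth_month` field are hypothetical stand-ins for whatever the real data looks like:

```cpp
#include <string>
#include <vector>

struct Student {
    std::string name;
    int birth_month;  // 1..12
};

// Collect the names of students born in `month`. We expect roughly
// 1/12 of the students to match, so we overestimate slightly with 1/10.
std::vector<std::string> birthday_students(const std::vector<Student>& students,
                                           int month) {
    std::vector<std::string> birthday_students_vec;
    birthday_students_vec.reserve(students.size() / 10);
    for (const Student& s : students) {
        if (s.birth_month == month) {
            birthday_students_vec.push_back(s.name);
        }
    }
    return birthday_students_vec;
}
```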

So if you can, you should take a guess as to how much memory to reserve - even if it is inaccurate. But what if you simply don't know or can't even guess? Well, there are still more options.

Reserve 32 elements

The idea here is that we can save quite a lot of reallocations in exchange for the risk of wasting some (relatively small) amount of memory. This is based on the observation that most of the allocations happen while the vector is still rather small - and they slow down as it grows in size.


If we look at the original example - we had 11 memory allocations in it: from 2^0 to 2^10. If instead we were to do something like this:

std::vector<std::string> strs;
strs.reserve(32);
for (std::size_t i = 0; i < 1000; ++i) {
    strs.push_back(std::to_string(i));
}

Now we only have 6 memory allocations: from 2^5 to 2^10. We've saved nearly half of the memory allocations, and a few of the copy\move operations. The "cost" here is that if instead of 1000 elements the vector had turned out to hold something like 9 elements (or any number below 17, really), then we would have wasted more memory than we otherwise would have. But since 32 is still a relatively small number, that extra waste would probably also be relatively small.

Reserve a lot more than you need, and then shrink it back down

Alright, let's look at a code example and see how something like that may work:

std::vector<std::string> strs;
strs.reserve(7500);
for (std::size_t i = 0; i < 1000; ++i) {
    strs.push_back(std::to_string(i));
}
strs.shrink_to_fit();

So here we are allocating a lot more memory than we are actually going to need - which is the main drawback of this strategy. It is by no means safe to assume that this is possible - so you need to check that it won't cause any trouble. Another potential issue is that according to the standard, the vector doesn't have to respect the shrink_to_fit call - the request is non-binding. But most implementations will honor it, and if not, it is simple to work around this limitation.
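If an implementation does ignore shrink_to_fit, the usual workaround is the copy-and-swap idiom: construct a fresh vector from the contents (which mainstream implementations allocate at exactly the required size, though the standard doesn't strictly promise that) and swap it in. A minimal sketch:

```cpp
#include <string>
#include <vector>

// Force a vector's capacity down to (approximately) its size, even on
// implementations where shrink_to_fit() is a no-op. The temporary is
// range-constructed, so it allocates for exactly v.size() elements on
// common implementations, then takes over v's storage via swap.
void force_shrink(std::vector<std::string>& v) {
    std::vector<std::string>(v.begin(), v.end()).swap(v);
}
```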

But what do we gain? Well, in terms of allocations, we have two - one when we reserve the memory at the beginning, and one when we shrink the memory back down. That's down from 11, with only one "unnecessary" memory allocation. Not bad, if you can afford the initial memory allocation.

What about copy\move operations? Well, in the original solution we had 1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 + 256 + 512 such operations, for a total of 1023. In the new solution we have exactly 1000 of them (the moves shrink_to_fit performs when relocating the elements to the smaller allocation) - we've saved 23 copy\move operations! This is not unusual: in most cases we will save on the number of operations, and in the worst case it will only cost us one extra operation. So this is pretty good from this perspective as well.
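For reference, the original count is just a geometric series (under the doubling assumption made above):

```latex
1 + 2 + 4 + \dots + 512 = \sum_{k=0}^{9} 2^k = 2^{10} - 1 = 1023
```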

Again, the main problem with this approach is the need for a large initial memory allocation. I think that can be remedied using a temporary vector with a custom allocator - but that is a topic for another post.


Further reading:

Amir's original post (with links to even more goodness)

About std::vector::reserve

About std::vector::shrink_to_fit
