
Optimization with Allocators in C++17

Rainer Grimm

Trainer at Modernes C++

Published: October 9, 2023

This post is a cross-post from www.ModernesCpp.com.

Thanks to polymorphic allocators in C++17, you can optimize your memory allocation. This optimization includes performance and the reuse of memory.

Performance

The following program is from cppreference.com/monotonic_buffer_resource. I will explain it and extend its performance test to the Clang and MSVC compilers.

// pmrPerformance.cpp
// https://en.cppreference.com/w/cpp/memory/monotonic_buffer_resource

#include <array>
#include <chrono>
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <list>
#include <memory_resource>
 
template<typename Func>
auto benchmark(Func test_func, int iterations)              // (1)
{
    const auto start = std::chrono::system_clock::now();
    while (iterations-- > 0)
        test_func();
    const auto stop = std::chrono::system_clock::now();
    const auto secs = std::chrono::duration<double>(stop - start);
    return secs.count();
}
 
int main()
{
    constexpr int iterations{100};
    constexpr int total_nodes{2'00'000};
 
    auto default_std_alloc = [total_nodes]            // (2)
    {
        std::list<int> list;
        for (int i{}; i != total_nodes; ++i)
            list.push_back(i);
    };
 
    auto default_pmr_alloc = [total_nodes]            // (3)
    {
        std::pmr::list<int> list;
        for (int i{}; i != total_nodes; ++i)
            list.push_back(i);
    };
 
    auto pmr_alloc_no_buf = [total_nodes]             // (4)
    {
        std::pmr::monotonic_buffer_resource mbr;
        std::pmr::polymorphic_allocator<int> pa{&mbr};
        std::pmr::list<int> list{pa};
        for (int i{}; i != total_nodes; ++i)
            list.push_back(i);
    };
 
    auto pmr_alloc_and_buf = [total_nodes]            // (5)
    {
        std::array<std::byte, total_nodes * 32> buffer; // enough to fit in all nodes
        std::pmr::monotonic_buffer_resource mbr{buffer.data(), buffer.size()};
        std::pmr::polymorphic_allocator<int> pa{&mbr};
        std::pmr::list<int> list{pa};
        for (int i{}; i != total_nodes; ++i)
            list.push_back(i);
    };
 
    const double t1 = benchmark(default_std_alloc, iterations);
    const double t2 = benchmark(default_pmr_alloc, iterations);
    const double t3 = benchmark(pmr_alloc_no_buf , iterations);
    const double t4 = benchmark(pmr_alloc_and_buf, iterations);
 
    std::cout << std::fixed << std::setprecision(3)
              << "t1 (default std alloc): " << t1 << " sec; t1/t1: " << t1/t1 << '\n'
              << "t2 (default pmr alloc): " << t2 << " sec; t1/t2: " << t1/t2 << '\n'
              << "t3 (pmr alloc  no buf): " << t3 << " sec; t1/t3: " << t1/t3 << '\n'
              << "t4 (pmr alloc and buf): " << t4 << " sec; t1/t4: " << t1/t4 << '\n';
}

The benchmark function in line (1) executes each of the functions in lines (2) to (5) one hundred times (constexpr int iterations{100}). Each call creates a list of two hundred thousand nodes (constexpr int total_nodes{2'00'000}; the digit separators do not change the value 200000). The nodes of each list are allocated in different ways:

Line 2: std::list<int> uses std::allocator and, therefore, the global operator new
Line 3: std::pmr::list<int> uses the default memory resource, which is std::pmr::new_delete_resource unless it was changed
Line 4: std::pmr::list<int> uses std::pmr::monotonic_buffer_resource without a preallocated buffer on the stack
Line 5: std::pmr::list<int> uses std::pmr::monotonic_buffer_resource with a preallocated buffer on the stack

The comment in the last function (line 5) states that the buffer is large enough for all nodes: "enough to fit in all nodes". The buffer itself, however, lives on the stack and needs 200,000 * 32 bytes = 6.4 MB. This worked on my Linux PC but not on my Windows PC. On Linux, the default stack size is 8 MB, but on Windows it is only 1 MB. Consequently, my program execution on Windows using the MSVC compiler and the Clang compiler failed silently. I fixed it by increasing the stack size of my MSVC and Clang executables with the help of editbin.exe.

Finally, here are the numbers. The reference value is the allocation with std::list<int> (line 2). Don’t compare the absolute numbers but the relative numbers because I used a virtualized Linux PC and a non-virtual Windows PC. Additionally, I enabled full optimization. This means (/Ox) for the MSVC compiler and (-O3) for the GCC and Clang compilers.

(Screenshots of the program output for the Clang, GCC, and MSVC compilers.)

Interestingly, the memory resource std::pmr::new_delete_resource was always the slowest memory allocation. In contrast, std::pmr::monotonic_buffer_resource was the fastest. This holds particularly if you use a preallocated buffer on the stack. On Windows, this allocation strategy was about 10 times faster than the reference.

There is another optimization potential of std::pmr::monotonic_buffer_resource.

Modernes C++ Mentoring

Be part of my mentoring programs:

"Fundamentals for C++ Professionals" (open)
"Design Patterns and Architectural Patterns with C++" (open)
"C++20: Get the Details" (reopens December 2023)

Do you want to stay informed about my mentoring programs? Subscribe via e-mail.


Memory Reuse

std::pmr::monotonic_buffer_resource enables the reuse of memory, and you can, therefore, spare yourself the effort of freeing the memory.

// reuseMemory.cpp

#include <array>
#include <cstddef>
#include <iostream>
#include <memory_resource>
#include <string>
#include <vector>

int main() {
 
    std::array<std::byte, 2000> buf;

    for (int i = 0; i < 100; ++i) {                                       // (1)
        std::pmr::monotonic_buffer_resource pool{buf.data(), buf.size(),  // (2)
                                                std::pmr::null_memory_resource()};
        std::pmr::vector<std::pmr::string> myVec{&pool};
        for (int j = 0; j < 16; ++j) {                                    // (3)
            myVec.emplace_back("A short string");
        }
    }
}

This program allocates a std::array of 2000 bytes: std::array<std::byte, 2000>. This stack-allocated memory is reused 100 times (line 1). The std::pmr::vector<std::pmr::string> uses the std::pmr::monotonic_buffer_resource with the upstream memory resource std::pmr::null_memory_resource (line 2). Finally, 16 strings are added to the vector (line 3).

What’s Next?

This post ends my mini-series about the polymorphic memory resources in C++17. In my next post, I will jump three years further and continue my journey through C++20.

Thanks a lot to my Patreon Supporters: Matt Braun, Roman Postanciuc, Tobias Zindl, G Prvulovic, Reinhold Dröge, Abernitzke, Frank Grimm, Sakib, Broeserl, António Pina, Sergey Agafyin, Андрей Бурмистров, Jake, GS, Lawton Shoemake, Jozo Leko, John Breland, Venkat Nandam, Jose Francisco, Douglas Tinkham, Kuchlong Kuchlong, Robert Blanch, Truels Wissneth, Kris Kafka, Mario Luoni, Friedrich Huber, lennonli, Pramod Tikare Muralidhara, Peter Ware, Daniel Hufschläger, Alessandro Pezzato, Bob Perry, Satish Vangipuram, Andi Ireland, Richard Ohnemus, Michael Dunsky, Leo Goodstadt, John Wiederhirn, Yacob Cohen-Arazi, Florian Tischler, Robin Furness, Michael Young, Holger Detering, Bernd Mühlhaus, Matthieu Bolt, Stephen Kelley, Kyle Dean, Tusar Palauri, Dmitry Farberov, Juan Dent, George Liao, Daniel Ceperley, Jon T Hess, Stephen Totten, Wolfgang Fütterer, Matthias Grün, Phillip Diekmann, Ben Atakora, Ann Shatoff, Rob North, Bhavith C Achar, and Marco Parri Empoli.

Thanks, in particular, to Jon Hess, Lakshman, Christian Wittenhorst, Sherhy Pyton, Dendi Suhubdy, Sudhakar Belagurusamy, Richard Sargeant, Rusty Fleming, John Nebel, Mipko, Alicja Kaminska, Slavko Radman, and David Poole.

My special thanks to Embarcadero, PVS-Studio, Tipi.build, and Take Up Code.

Seminars

I’m happy to give online seminars or face-to-face seminars worldwide. Please call me if you have any questions.

Bookable

German

Embedded Programmierung mit modernem C++ 12.12.2023 – 14.12.2023 (Präsenzschulung, Termingarantie)

Standard Seminars (English/German)

Here is a compilation of my standard seminars. These seminars are only meant to give you a first orientation.

C++ – The Core Language
C++ – The Standard Library
C++ – Compact
C++11 and C++14
Concurrency with Modern C++
Design Pattern and Architectural Pattern with C++
Embedded Programming with Modern C++
Generic Programming (Templates) with C++

New

Clean Code with Modern C++
C++20

Contact Me

Phone: +49 7472 917441
Mobile: +49 176 5506 5086
Mail: [email protected]
German Seminar Page: www.ModernesCpp.de
Mentoring Page: www.ModernesCpp.org

Modernes C++ Mentoring

