C++ tidbit #5: const static members

struct S { static const int x = 0; };
int        n = S::x;
const int& m = S::x;  // <-- link error: undefined reference to `S::x`

This link error is clearly impossible. Not only is `S::x` defined right there in line 1, in line 2 it is used successfully. How could it turn into an undefined-reference one line later?!

This particular rabbit hole dive began with a comment in Andy Soffer's CppOnSea2022 talk, and its depth surprised me. Let's go in.

(Partially made up) C++ Standard History

The old, simple days

In the olden days you'd declare a static data member as part of the class declaration, typically in a header. It wasn't stored as part of any instance, and the class declaration was included in many translation units, so you had to designate a single storage location for it by separately defining it in exactly one translation unit:


// S.h:
struct S { static int x;};
// src1.cpp: 
#include "S.h"
int S::x;  // optionally: =1        

<detour> The selection of translation unit (TU) to use is entirely inconsequential. So why not have the location for S::x be chosen automatically? Well, in the classical compiler/linker separation of responsibilities - neither could make this choice: the compiler processed only one TU at a time, and the linker couldn't 'create' data - just pulled it from TUs into a unified executable. More on that later. </detour>
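
As a minimal sketch of that declaration/definition split, here are both pieces collapsed into one file. The function name `use_x` is made up for illustration; the comments mark which file each part would normally live in.

```cpp
#include <cassert>

// S.h: declaration only. No storage is allocated here.
struct S { static int x; };

// src1.cpp: the one definition that provides the storage.
int S::x = 1;

// Any other TU that includes S.h can read, write, or take the address of S::x.
void use_x() {
    int* p = &S::x;   // taking the address needs real storage
    assert(*p == 1);
    *p = 2;
    assert(S::x == 2);
}
```

Removing the `int S::x = 1;` line should leave the compiler happy and move the complaint to the linker, since `use_x` needs the address of something that was never given storage.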

The Middle Ages

In a second phase, static *const* data members started being advertised as better (type-safe, scoped) alternatives to macros.


#define NUM_WIDGETS 10
struct Widget { };
Widget arrWidgets[NUM_WIDGETS];

// could be modernized into:
struct Fidget { static const int nFidgets; };
Fidget arrFidgets[Fidget::nFidgets];

Then a startling discovery was made: in this usage, the static const Fidget::nFidgets doesn't really need any storage! When the compiler has its value (say 10) it is perfectly happy to embed it directly into stack allocations or machine instructions, and not take it from any memory storage. (This was long before constexpr was born).

So, initialization as part of declaration was made legal for static const integers, and no definition in any cpp was required any longer:


struct Fidget { static const int nFidgets = 10;}; 
Fidget arrFidgets[Fidget::nFidgets];
        

As is often the case, these good intentions resulted in some unforeseen, hairy side effects. For one, 'A uses B' no longer had one clear meaning. Two forms of 'usage' had to be distinguished: usage that requires storage for B, and usage that doesn't.

Old-style usage, which requires storage, was baptized "odr-use". This is a strong contender for the least informative C++ term (second only to "RAII"), as its connection to the actual One Definition Rule is vague at best. If you wanted to:


const int& r = Fidget::nFidgets;

std::vector<int> v; 
v.push_back(Fidget::nFidgets);  // <-- takes a reference        

...you still had to create a definition (== storage) for Fidget::nFidgets in some cpp file. If you stuck with declaration+initialization only:


// Fidget.h
struct Fidget { static const int nFidgets = 10;};        

You could use Fidget::nFidgets only in non-odr ways, such as:


Fidget arr[Fidget::nFidgets] ;

int n = Fidget::nFidgets;   // <-- think of the rhs as a literal.
                            //     No storage is actually required == non-odr.

Some poor soul had to grep the entire C++ standard for 'use' and change it to odr or non-odr, and sometimes split the wording by case. For the sole benefit of allowing nFidgets=10 at the declaration site. (I think... are you aware of other entities that support only non-odr use?)
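
One practical consequence, sketched below under the assumption that no definition of `Fidget::nFidgets` exists anywhere: a call that would normally bind a reference to the member can be turned into a non-odr-use by forcing a copy first. The helper name `make_vector` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Declared and initialized in the class, never defined in any cpp file.
struct Fidget { static const int nFidgets = 10; };

std::vector<int> make_vector() {
    std::vector<int> v;
    // push_back takes const int&, so passing Fidget::nFidgets directly would odr-use it.
    // Unary plus yields a prvalue, so the reference binds to a temporary instead.
    v.push_back(+Fidget::nFidgets);
    // An explicit conversion to int has the same effect.
    v.push_back(static_cast<int>(Fidget::nFidgets));
    return v;
}
```

This is a workaround rather than a fix: the member still has no storage, so anything that needs its address remains off limits.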

Solving the original mystery

Recall we started with -


struct S { static const int x = 0; };
int        n = S::x;
const int& m = S::x;  // <-- link error: undefined reference to `S::x`

And said that `S::x` is (1) defined in line 1, (2) used in line 2. Both turned out to be small lies: S::x is in fact (1) declared+initialized in line 1, but not defined, (2) used in line 2 only in a weak sense (non-odr). The link failure in line 3 is hopefully clear by now: it is an attempted odr-use for a variable with no definition.
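
Before C++17 the way out was to supply the missing definition in exactly one cpp file. A minimal sketch, collapsed into one file, with the hypothetical helpers `read_by_value` and `read_by_ref` standing in for lines 2 and 3:

```cpp
#include <cassert>

// Header: declaration + initializer, but still no definition.
struct S { static const int x = 0; };

// Exactly one cpp file: the definition. The initializer stays in the class;
// repeating it here would be an error.
const int S::x;

int read_by_value() { int n = S::x; return n; }                // non-odr-use
const int& read_by_ref() { const int& m = S::x; return m; }    // odr-use: now links

void check() {
    assert(read_by_value() == 0);
    assert(&read_by_ref() == &S::x);  // the reference really points at S::x
}
```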

Modern C++ Fix

Remember this detour paragraph above?

why not have the single location for S::x be chosen automatically? Well, in the classical compiler/linker separation of responsibilities - neither could make this choice: the compiler processed only one TU at a time, and the linker couldn't 'create' data - just pulled it from TUs into a unified executable.

That is... almost true. Linkers indeed cannot create data, but they can select one of many copies. As a matter of fact they do it all the time: when you mark a function as inline, the compiler emits a copy of it in every TU that needs one, and the linker merges them all into one, storage and all (say, if someone takes its address). On the road towards C++17 another startling discovery was made: this entire inlining apparatus is already in place, and we can use it for variables, not just functions!

So, in C++17+ you'd probably want to solve the original error with either -


struct S { static const inline int x = 0; };        

Or even more expressively:


struct S { static constexpr int x = 0; };        

(For static data members, constexpr implies inline since C++17.)
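
A small sketch of both spellings side by side, assuming a compiler in C++17 mode or later. The struct names `A` and `B` and the helper functions are made up for illustration; neither member has an out-of-class definition, yet both can be bound to a reference.

```cpp
#include <cassert>

struct A { static const inline int x = 0; };  // explicitly an inline variable
struct B { static constexpr int x = 0; };     // implicitly inline since C++17

// Binding a reference is an odr-use; the in-class declarations above are
// themselves definitions, so no separate cpp file is involved.
const int& ref_a() { return A::x; }
const int& ref_b() { return B::x; }

void check() {
    assert(&ref_a() == &A::x);
    assert(&ref_b() == &B::x);
}
```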

Side Note: Why Just Integers?

Making this legal:


struct S { static const float f = 0; };        

could have made the following succeed without any definition for S::f:


float g = S::f;        

So why not?

In-class initializers are allowed only for static const members of integral or enumeration type (with a few technical restrictions), and such members can appear in what the standard calls integral constant expressions. The important bit to note is their usage. Integral constant expressions, and only they, are expected at:

  1. array bounds,
  2. the dimensions in new-expressions other than the first (until C++14),
  3. bit-field lengths,
  4. enumeration initializers when the underlying type is not fixed,
  5. alignments.

And moreover, these are all non-odr uses.
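
Several of these contexts can be exercised in a few lines. The sketch below assumes `Fidget::nFidgets` is never defined in any cpp file; the names `Flags`, `Count`, and `Aligned` are made up for illustration, and the numbers in the comments refer to the list above.

```cpp
// Declared and initialized in the class, with no definition anywhere.
struct Fidget { static const int nFidgets = 4; };

int arr[Fidget::nFidgets];                              // 1. array bound
struct Flags { unsigned low : Fidget::nFidgets; };      // 3. bit-field length
enum Count { kCount = Fidget::nFidgets };               // 4. enumerator initializer
struct alignas(Fidget::nFidgets) Aligned { char c; };   // 5. alignment

// Only the value is consumed, at compile time, so no storage is needed.
static_assert(sizeof(arr) / sizeof(arr[0]) == 4, "array bound comes from nFidgets");
static_assert(kCount == 4, "enumerator value comes from nFidgets");
static_assert(alignof(Aligned) == 4, "alignment comes from nFidgets");
```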

Integers are special indeed. Float (and other) const statics are just not useful enough to be considered in this context.
