When can the C++ compiler devirtualize a call?

When can the C++ compiler devirtualize a call? – Arthur O'Dwyer – Stuff mostly about C++

Someone recently asked me about devirtualization optimizations: when do they happen? when can we rely on devirtualization? do different compilers do devirtualization differently? As usual, this led me down an experimental rabbit-hole. The answer seems to be: Modern compilers devirtualize calls to final methods pretty reliably. But there are many interesting corner cases (including some I haven’t thought of, I’m sure!), and different compilers catch different subsets of those corner cases.

First, let’s observe that devirtualization can (probably?) be done more effectively via LTO, using whole-program analysis. I don’t know anything about the state of the art in link-time devirtualization, and it’s hard to experiment with on Compiler Explorer, so I’m not going to talk about LTO at all. We’re looking purely at what the compiler itself can do.

There are basically two situations where the compiler knows enough to devirtualize. They don’t have much in common:

When we know the instance’s dynamic type

The archetypical case here is

    void test() {
        Apple o;
        o.f();
    }

It doesn’t matter if Apple::f is virtual; all virtual dispatch ever does is invoke the method on the actual dynamic type of the object, and here we know the actual dynamic type is exactly Apple. Static and dynamic dispatch should give us the same result in this case.
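
To make that equivalence concrete, here is a minimal sketch. The definitions of `Fruit` and `Apple` are assumptions for illustration (the snippet above does not show them). A qualified call such as `o.Apple::f()` is how C++ spells an explicitly non-virtual call; when the dynamic type is known to be exactly `Apple`, the ordinary call `o.f()` has to produce the same result.

```cpp
#include <cassert>

// Hypothetical class definitions; the article's snippet only shows test().
struct Fruit {
    virtual int f() { return 1; }
    virtual ~Fruit() = default;
};
struct Apple : Fruit {
    int f() override { return 2; }
};

// Ordinary call syntax: nominally a virtual dispatch.
int call_virtual() {
    Apple o;
    return o.f();
}

// Qualified call syntax: always a direct, non-virtual call to Apple::f.
int call_direct() {
    Apple o;
    return o.Apple::f();
}

// Both must agree, because the dynamic type of `o` is exactly Apple.
void check_same_result() {
    assert(call_virtual() == call_direct());
}
```

A devirtualizing compiler is free to compile `call_virtual` exactly like `call_direct`; whether a particular compiler does so is something to check in its generated assembly.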

A sufficiently smart compiler will use dataflow analysis to optimize non-trivial cases such as

    Derived d;
    Base *p = &d;
    p->f();

It turns out that even this simple dodge is enough to fool MSVC and ICC. The next test case is

    Derived da, db;
    Base *p = cond ? &da : &db;
    p->f();

This is too much for Clang, but GCC actually manages to survive it… until you move the conversions to Base* inside the conditional! Here is where even GCC’s analysis fails (Godbolt):

    Derived da, db;
    Base *p = cond ? (Base*)&da : (Base*)&db;
    p->f();
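
For anyone who wants to paste these three cases into Compiler Explorer, here is a self-contained sketch. The bodies of `Base` and `Derived` and the function names are assumptions added so that the fragments compile; `static_cast` stands in for the C-style casts and performs the same conversion.

```cpp
#include <cassert>

// Hypothetical hierarchy; the names Base and Derived follow the article,
// but the member bodies are assumptions made so the example compiles.
struct Base {
    virtual int f() { return 1; }
    virtual ~Base() = default;
};
struct Derived : Base {
    int f() override { return 2; }
};

// Case 1: pointer to a single local object.
int via_pointer() {
    Derived d;
    Base *p = &d;
    return p->f();
}

// Case 2: select between two Derived objects, then convert to Base*.
int via_conditional(bool cond) {
    Derived da, db;
    Base *p = cond ? &da : &db;
    return p->f();
}

// Case 3: convert each operand to Base* before selecting.
int via_converted_conditional(bool cond) {
    Derived da, db;
    Base *p = cond ? static_cast<Base*>(&da) : static_cast<Base*>(&db);
    return p->f();
}

// Devirtualized or not, every case must call Derived::f.
void check_results() {
    assert(via_pointer() == 2);
    assert(via_conditional(true) == 2);
    assert(via_converted_conditional(false) == 2);
}
```

In all three functions the dynamic type is provably `Derived`, so the only question is how far each compiler's dataflow analysis reaches. Comparing the assembly at `-O2` shows whether an indirect call through the vtable remains.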

When we know a “proof of leafness” for its static type

Okay, let’s suppose that we’re receiving a pointer from somewhere else in the system. We know its static type (e.g. Derived*), but we don’t know the actual dynamic type of the object instance to which it points. Still, the compiler can devirtualize a call to Derived::f if it can somehow prove that no type in the entire program can ever override Derived::f.

Proof-by-final

The simplest “proof of leafness” is if you’ve marked Derived as final.

    struct Base {
        virtual int f();
    };
    struct Derived final : public Base {
        int f() override { return 2; }
    };
    int test(Derived *p) {
        return p->f();
    }

A pointer of type Derived* must point to an object instance that is “at least Derived”: that is, Derived or one of its children. Since Derived is final, it isn’t allowed to have children; therefore the dynamic type of the instance must be exactly Derived, and the compiler can devirtualize this call.

Or, you can mark the specific method Derived::f as final.
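
A minimal sketch of the per-method variant follows. The second virtual function `g`, the child class `MoreDerived`, and the function names are hypothetical additions, included to show that `final` on one method leaves the class itself open to inheritance.

```cpp
#include <cassert>

struct Base {
    virtual int f() { return 1; }
    virtual int g() { return 10; }
    virtual ~Base() = default;
};
struct Derived : Base {
    int f() final { return 2; }       // no class may override f below this point
    int g() override { return 20; }   // g may still be overridden
};

// Hypothetical child: legal, because Derived itself is not final.
struct MoreDerived : Derived {
    int g() override { return 30; }
};

// f is final in Derived, so this call can only ever reach Derived::f.
int test_f(Derived *p) { return p->f(); }

// g is not final, so in general this call needs real virtual dispatch.
int test_g(Derived *p) { return p->g(); }

void check_dispatch() {
    MoreDerived m;
    assert(test_f(&m) == 2);
    assert(test_g(&m) == 30);
}
```

Even though `p` may point to a `MoreDerived`, the call in `test_f` has exactly one possible target, which is the property the compiler needs for devirtualization.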

The same analysis should apply no matter whether Derived::f is declared in Derived itself, or inherited from Base. So for example the compiler should be equally able to devirtualize

    struct Base {
        virtual int f() { return 1; }
    };
    struct Derived final : public Base {};
    int test(Derived *p) {
        return p->f();
    }

GCC, Clang, and MSVC pass this test (Godbolt, case one); ICC 21.1.9 is fooled.

An utterly bizarre proof-of-leafness is to observe that when class C’s destructor is final, C must be childless, because if C had a child, the child would have to have a destructor (since you can’t make a class without a destructor), which would then override C’s destructor, which isn’t allowed. Clang actually both warns on final destructors, and optimizes on them. Every other vendor considers this situation very silly and doesn’t dignify it with a codepath as far as I can tell.
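
For concreteness, a final destructor looks like the sketch below. The class bodies and function names are assumptions; only the idea of marking `C`'s destructor `final` comes from the text. Expect Clang to emit its warning about final destructors on this code.

```cpp
#include <cassert>

struct Base {
    virtual int f() { return 1; }
    virtual ~Base() = default;
};

struct C : Base {
    int f() override { return 2; }
    ~C() final {}   // virtual (inherited from Base) and final
};

// A child of C would be ill-formed: its destructor, even an implicit one,
// would override the final ~C.
// struct Child : C {};

// C is not marked final, but the final destructor implies it has no children,
// so p must point to an object whose dynamic type is exactly C.
int test(C *p) { return p->f(); }

void check_final_dtor() {
    C c;
    assert(test(&c) == 2);
}
```

In ordinary code, writing `struct C final` states the same intent directly and is the form more compilers act on.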

Proof-by-internal-linkage

A class whose name has internal linkage cannot be named outside the current translation unit. Therefore, it cannot be derived from outside the current translation unit, either! As long as it has no children in the current TU (or at least no children that override its methods), calls to its virtual functions are devirtualizable.

    namespace {
        class BaseImpl : public Base {};
    }
    int test(Base *p) {
        return static_cast<BaseImpl*>(p)->f();
    }

If p really does point to an object instance that is “at least BaseImpl,” then the compiler can prove that the instance must be exactly BaseImpl. (And if p doesn’t point to an instance that is “at least BaseImpl,” the program has undefined behavior anyway.)

This strikes me as a case that might actually come up pretty commonly in real codebases. It’s common to have a base class exposed publicly in the header file, and then one or more derived implementations scoped tightly to a single .cpp file. If you go the extra mile and put those derived implementations into anonymous namespaces, you might be helping out the compiler’s devirtualization...
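
The header-plus-.cpp layout described above might look like the following single-file sketch. Every name here (`Widget`, `SquareWidget`, `make_widget`, `square_area`) is hypothetical, and the comments mark where the header/source split would fall in a real project.

```cpp
#include <cassert>
#include <memory>

// --- Would live in the public header (hypothetical interface) ---
struct Widget {
    virtual int area() const = 0;
    virtual ~Widget() = default;
};
std::unique_ptr<Widget> make_widget(int side);
int square_area(const Widget *p);

// --- Would live in a single .cpp file ---
namespace {
    // Internal linkage: no other translation unit can name this class,
    // so no other translation unit can derive from it.
    class SquareWidget : public Widget {
        int side_;
    public:
        explicit SquareWidget(int side) : side_(side) {}
        int area() const override { return side_ * side_; }
    };
}

std::unique_ptr<Widget> make_widget(int side) {
    return std::unique_ptr<Widget>(new SquareWidget(side));
}

// Assumed precondition: p points to an object created by make_widget.
// Nothing in this file derives from SquareWidget, so the call below has
// exactly one possible target.
int square_area(const Widget *p) {
    return static_cast<const SquareWidget*>(p)->area();
}

void check_square_area() {
    std::unique_ptr<Widget> w = make_widget(3);
    assert(square_area(w.get()) == 9);
}
```

Callers only ever see `Widget` and the factory function, so moving the implementation class into an anonymous namespace costs nothing in the public interface.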
