Using GCC's nested functions with wide pointers and no trampolines

Martin Uecker, 2026-01-06

Introduction

Nested functions are extremely useful, which is why virtually every programming language since ALGOL 60 has them. Except C.

In particular, nested functions are useful for writing kernels that are passed to higher-order functions. The kernels can access local state, which matters because that state then does not have to be passed as a void pointer to the higher-order function.

    void tree_increase_all(tree(int) *t, int increment)
    {
        void update(int *value) { (*value) += increment; }

        tree_walk(t, update);
    }
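For comparison, here is roughly what the void-pointer approach looks like in standard C without nested functions. This is only a sketch: a plain int array stands in for tree(int), and the names array_walk_ctx, add_increment and array_increase_all are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Higher-order function in plain C: the kernel gets its state through
 * an extra void pointer that the walker has to pass along. */
void array_walk_ctx(int *data, size_t n,
                    void (*kernel)(int *value, void *ctx), void *ctx)
{
    assert(data != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        kernel(&data[i], ctx);
}

/* The kernel must be a separate top-level function and has to cast
 * the untyped context back to the type it expects. */
void add_increment(int *value, void *ctx)
{
    const int *increment = ctx;
    *value += *increment;
}

/* Counterpart of tree_increase_all for the array stand-in. */
void array_increase_all(int *data, size_t n, int increment)
{
    array_walk_ctx(data, n, add_increment, &increment);
}
```

The kernel is separated from the place where it is used, and the compiler cannot check that the context pointer has the type the kernel expects.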

For very simple nested functions, it also makes sense to use a lambda expression, i.e. an anonymous nested function.

    void tree_increase_all(tree(int) *t, int increment)
    {
        tree_walk(t, (void(int *value)){ (*value) += increment; });
    }

How does this work?

The nested function is passed a link to its environment in a special register, the static chain register. It can then access the variables of the parent function(s) via this pointer. But how does the calling function know what to pass in the static chain register? The obvious solution is to have this information contained in the function pointer, which would be allowed by the C standard. But on most platforms the standard application binary interface (ABI) does not allow this. There are different solutions, as we will see below.
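The simplest case can be sketched in GNU C, which needs GCC to compile. The function and variable names are invented for illustration. The nested function is only called directly, so the compiler knows the callee and can supply the link to the environment itself. The question of what to store in a function pointer does not arise yet.

```c
#include <assert.h>

/* GNU C extension: nested function with access to the parent's locals. */
int sum_scaled(const int *data, int n, int factor)
{
    assert(n >= 0);
    int total = 0;

    /* Reads 'factor' and updates 'total' in the enclosing frame;
     * conceptually it reaches them through the static chain. */
    void add(int value) { total += value * factor; }

    /* Direct calls only: no pointer to 'add' is ever created. */
    for (int i = 0; i < n; i++)
        add(data[i]);

    return total;
}
```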

GCC supports nested functions as an extension (but not lambdas). C++ has lambdas with a different syntax, but those are actually objects where each instance has its own unique type. C++'s lambdas are therefore most useful when used as arguments to template functions. To be able to pass them to a non-template function, their unique type needs to be erased by wrapping them in std::function. C++'s lambdas are an ingenious design, but they have various properties which make them a terrible fit for C. But this is a story for another day.

The Problem with GCC's Nested Functions

When a pointer to a nested function is created in order to pass it to another function, a program compiled with GCC creates a small piece of code at run time, a so-called trampoline, based on an idea from the eighties (Breuel et al.). These trampolines were traditionally placed on the stack, which then has to be executable. For security reasons, it is commonly agreed that an executable stack should be avoided. An alternative solution is to create the trampoline somewhere else, e.g. in memory allocated on the heap. In fact, GCC now supports this with the flag -ftrampoline-impl=heap. Unfortunately, this is computationally more expensive, and it can also lead to memory leaks when the cleanup path is skipped via longjmp. It therefore makes sense to look for alternative solutions.

Wide Pointers and Descriptors

Instead of a trampoline, one can use a special function pointer type, which is essentially what C++ does with std::function and also what Apple's Blocks language extension does using its special syntax. There are essentially two ways this can work. In the first, the new function pointer type is a wide pointer that consists of two parts: a pointer to the machine code of the function, just as in a regular function pointer, and another pointer that points to the environment of the function, i.e. the static chain pointer. When the function is called, the value for the static chain is extracted from the wide pointer and loaded into the static chain register (r10 on x86_64) before the code pointer is loaded and the function is called.

An alternative is to build a function descriptor that contains the code pointer and the environment pointer, and then construct a regular pointer to this descriptor. A major advantage is that the pointer has the same size as any regular pointer. A disadvantage is that calling the function requires a second indirection to first load the code and data pointers from the descriptor. Another disadvantage is that converting a regular function pointer then requires constructing a descriptor, which needs to be stored somewhere, whereas it could be converted directly to a wide pointer simply by setting the static chain to NULL.
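The difference between the two representations can be sketched in portable C. This is an emulation under stated assumptions, not what a compiler would generate: the environment is passed as an explicit first argument instead of in the static chain register, and all names (env, wide_fn, fn_descriptor, call_wide, call_descriptor) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Environment of the closure: the captured local state. */
struct env { int increment; };

/* Code part. The explicit 'chain' argument stands in for the
 * static chain register. */
int add_env(void *chain, int x)
{
    const struct env *e = chain;
    return x + e->increment;
}

/* A function without environment; in this emulation it needs the same
 * signature and simply ignores the chain. */
int twice(void *chain, int x)
{
    (void)chain;
    return 2 * x;
}

/* Variant 1: wide pointer, two words passed around by value. */
struct wide_fn {
    int (*code)(void *chain, int x);
    void *chain;
};

int call_wide(struct wide_fn f, int x)
{
    assert(f.code != NULL);
    return f.code(f.chain, x);
}

/* Variant 2: descriptor, referenced through one ordinary pointer.
 * The descriptor needs storage, and the call needs an extra load. */
struct fn_descriptor {
    int (*code)(void *chain, int x);
    void *chain;
};

int call_descriptor(const struct fn_descriptor *d, int x)
{
    assert(d != NULL && d->code != NULL);
    return d->code(d->chain, x);
}
```

With the wide pointer, a function without environment becomes `(struct wide_fn){ twice, NULL }` on the spot. With the descriptor, a `struct fn_descriptor` object has to live somewhere for as long as the pointer to it is in use.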

Preliminary Implementation in GCC

Defining a wide pointer type for a function R(A) is not difficult.

    #define wide(R, A) \
    struct wideptr_##R##_##A { \
        void *chain; \
        R (*code)(A); \
    }

Given such a wide pointer, we can then call the function using the built-in function __builtin_call_with_static_chain, which GCC already provides to support other languages that make use of this ABI.

    #define CALL(ptr, arg) \
        __builtin_call_with_static_chain((ptr).code(arg), (ptr).chain)
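To see the two macros in use before any nested function is involved, here is a small sketch that needs GCC to compile. It wraps an ordinary function in a wide pointer with the static chain set to NULL. The typedef int_fn and the functions square, from_plain and apply_twice are assumptions made for this example, and the macro definitions are repeated with the closing brace and parenthesized arguments so the block compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Wide pointer type for a function R(A). */
#define wide(R, A)          \
struct wideptr_##R##_##A {  \
    void *chain;            \
    R (*code)(A);           \
}

/* Call through a wide pointer, passing its chain as the static chain. */
#define CALL(ptr, arg) \
    __builtin_call_with_static_chain((ptr).code(arg), (ptr).chain)

/* Name the struct once at file scope so every use refers to one type. */
typedef wide(int, int) int_fn;

/* An ordinary function: it never looks at the static chain. */
int square(int x)
{
    return x * x;
}

/* Converting a regular function pointer: the chain is simply NULL. */
int_fn from_plain(int (*f)(int))
{
    int_fn w = { .chain = NULL, .code = f };
    return w;
}

/* Higher-order function that takes a wide pointer. */
int apply_twice(int_fn f, int x)
{
    assert(f.code != NULL);
    int once = CALL(f, x);
    return CALL(f, once);
}
```

Calling `apply_twice(from_plain(square), 3)` squares 3 twice. Since square ignores the static chain, the NULL value is never dereferenced.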

But where do we get the two pointers when constructing the wideptr? Surprisingly, there is no built-in function for this. In the past, I used a grotesque hack to read the two pointers from the trampoline that GCC had created. But this still required the creation of the trampoline even though it was then never used. The technique is...
