A worm in my Erlang cluster, and adventures in microfluidics


Lucas Sifoni

June 5 2026

Just to be clear, it should maybe be “minifluidics” or “millifluidics” but I’ll ask for forgiveness on this one, because microfluidics reads better. We’ll see what this has to do with my Erlang cluster a bit later.

Sparsely connected Erlang clusters

An Erlang cluster is fully meshed by default, meaning that every node maintains a connection to all the others. Since this can lead to excessive chatter and an explosion of edges (a full mesh of n nodes has n(n-1)/2 connections), it is possible to leave an Erlang cluster partially connected, with some peers connected only to selected peers. Instead of one full mesh, you can cut sub-meshes in an Erlang (or Elixir) cluster and link them together via bridges, i.e. single (or sparse) connections.

We will use this notation for graphs from now on, which can be interpreted by .dot visualizers :

graph G {
  a -- b;
  b -- c;
  a -- c;
}

The graph above describes a fully-meshed 3-node cluster, while the one below describes a 4-node cluster where the “d” node only connects to “c”, meaning the nodes form a triangle with a tail.

graph G {
  a -- b;
  b -- c;
  a -- c;
  d -- c;
}
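As an aside, this notation is easy to produce from Elixir. Here is a minimal sketch, assuming the topology is available as a list of `{node, node}` tuples. The `DotSketch` module and its `render/1` function are hypothetical names used for illustration only. Names are quoted so that real node names containing `@` remain valid in the output.

```elixir
defmodule DotSketch do
  # Renders a list of {node, node} edges as an undirected DOT graph.
  def render(edges) do
    body =
      edges
      # Order each pair so that {:b, :a} and {:a, :b} count as the same edge.
      |> Enum.map(fn {x, y} -> if x <= y, do: {x, y}, else: {y, x} end)
      |> Enum.uniq()
      # Quote the names so that node names such as a@host stay valid.
      |> Enum.map_join("\n", fn {x, y} -> ~s(  "#{x}" -- "#{y}";) end)

    "graph G {\n" <> body <> "\n}"
  end
end

# The triangle with a tail from the example above.
IO.puts(DotSketch.render([{:a, :b}, {:b, :c}, {:a, :c}, {:d, :c}]))
```

Because the graph is undirected, the sketch normalizes each pair, so `d -- c` comes out as `"c" -- "d"`.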

An Erlang node can list the nodes it sees by calling :erlang.nodes() or Node.list() in Elixir. My question is then : how can you map an arbitrary cluster with arbitrary connections, from a single node ?

Walking an Erlang cluster like a graph

My answer is : by asking all the nodes for their neighbors, and comparing the answers with the neighbors I can see.
In the above graph, if I am on node a, I will ask :

b, who answers [:a, :c]

c, who answers [:a, :b, :d]

My own neighbors are [:b, :c]. If I remove myself and my own neighbors from each answer, node c gives me new knowledge : there is [:d] that only c can reach.
But what if d itself has neighbors ? I can ask c to ask d to query its own neighbors. d would report to c, who would report to me. And if d finds that some of its neighbors have neighbors it cannot see ? This must continue.
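The comparison step can be sketched in a few lines. This is only an illustration, assuming the answers have already been collected into a map of neighbor to reported list. `NeighbourDiff` and `new_knowledge/3` are hypothetical names, not part of the tool described here.

```elixir
defmodule NeighbourDiff do
  # `mine` is what the local node sees; `answers` maps each neighbor to the
  # list it reported. Returns only the neighbors that taught us something new.
  def new_knowledge(self, mine, answers) do
    known = MapSet.new([self | mine])

    for {peer, seen} <- answers,
        # Keep only the nodes we neither are nor can see directly.
        unseen = Enum.reject(seen, &MapSet.member?(known, &1)),
        unseen != [],
        into: %{} do
      {peer, unseen}
    end
  end
end

# From node a, with the answers of b and c listed above.
NeighbourDiff.new_knowledge(:a, [:b, :c], %{b: [:a, :c], c: [:a, :b, :d]})
# => %{c: [:d]}
```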

We need to flood-fill the graph no matter its topology.
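To make the flood-fill concrete, here is a minimal breadth-first sketch over an in-memory description of the cluster. It is an assumption-laden stand-in: `FloodFill`, `walk/2` and the `neighbours_fun` argument are hypothetical, and the function simply reads a map where a real probe would have to relay the question from node to node.

```elixir
defmodule FloodFill do
  # Breadth-first walk. `neighbours_fun` stands in for "ask this node for its
  # neighbors"; here it reads a map instead of talking to a cluster.
  def walk(start, neighbours_fun) do
    do_walk([start], MapSet.new([start]), MapSet.new(), neighbours_fun)
  end

  # Nothing left to visit: return every edge we have recorded.
  defp do_walk([], _seen, edges, _fun), do: edges

  defp do_walk([current | queue], seen, edges, fun) do
    peers = fun.(current)
    # Record one undirected edge per reported neighbor.
    edges = Enum.reduce(peers, edges, &MapSet.put(&2, edge(current, &1)))
    # Only nodes we have never queued before are added to the queue.
    fresh = peers |> Enum.reject(&MapSet.member?(seen, &1)) |> Enum.uniq()
    do_walk(queue ++ fresh, Enum.into(fresh, seen), edges, fun)
  end

  # Normalize the pair so each undirected edge is stored once.
  defp edge(x, y) when x <= y, do: {x, y}
  defp edge(x, y), do: {y, x}
end

topology = %{a: [:b, :c], b: [:a, :c], c: [:a, :b, :d], d: [:c]}
FloodFill.walk(:a, &Map.fetch!(topology, &1))
```

Starting from `:a`, the walk returns a set containing the four edges of the triangle with a tail, including the `{:c, :d}` edge that `:a` cannot see by itself.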

I promised I’d talk about fluids : the graph traversal illustrations here and further down were made by moving ink in channels, and there are a few details about that in a child post, the behind-the-scenes post.

And to do this, I would like to not ship code to more than a single node.

The need for self-propagating code in the cluster

My goal is now clear : I want to build a mapping tool that works with any Erlang cluster and reports its full topology, no matter how sparse or dense the connections are, and I want this tool to be a single file that I can ship or paste to a single node. This matters because clustered nodes have no obligation to share code, and hot-loading mechanisms only load code on one node. So, if you have a Probe module loaded on your node, you can’t :erpc.call(neighbor, Probe, :run, []) if the neighbor does not have this module.

Thankfully, Erlang has tools for us. :code.load_binary(module, filename, binary) loads object code from a binary; filename only tags the newly created module in the code server, and does not map to a filesystem operation. :code.get_object_code(module) gives us the object code of a module by looking up its .beam file on the code path, which means it is unable to recover the object code of a module that only exists in memory.

If you paste :

defmodule Probe do
  def run() do
    ...
  end
end

into IEx, and you call :code.get_object_code(Probe), you get :error, despite the module being defined.
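Here is a small sketch of how these two functions could fit together. It is an illustration under assumptions, not the final tool: `ShipSketch`, `object_code/1` and `push/3` are hypothetical names, and the filename passed to `:code.load_binary/3` is an arbitrary label.

```elixir
defmodule ShipSketch do
  # Returns {:ok, binary} when a .beam file for `module` is found on the code
  # path, and :error for modules that only exist in memory.
  def object_code(module) do
    case :code.get_object_code(module) do
      {^module, binary, _filename} -> {:ok, binary}
      :error -> :error
    end
  end

  # Loads an already compiled module binary on `target`, which can be the
  # local node or a connected neighbor.
  def push(target, module, binary) do
    # The filename is only a label in the code server; nothing is read from disk.
    filename = String.to_charlist("shipped/#{module}.beam")
    :erpc.call(target, :code, :load_binary, [module, filename, binary])
  end
end
```

With a module compiled to disk, `ShipSketch.object_code/1` gives back a binary that `ShipSketch.push/3` can hand to a neighbor. With a module pasted into IEx, the first step already fails, which is the problem the rest of this section works around.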

So, our first task is to create a module that we can paste into a first node, and that gives this first node access to the module binary. I settled on Kernel.ParallelCompiler.compile_to_path/2, which produces a .beam file, and on adding the temporary compilation path to the code server via Code.append_path/1. I resorted to this function from the Elixir compiler itself because I did not find an equivalent in :code that gave later access to the object code.

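defmodule ProbeWrapper do
  def load() do
    # Source of the module to propagate, kept as a string so it can be compiled to disk.
    payload = """
    defmodule ActualProbe do
      def run() do
        :ok
      end
    end
    """

    tmp = System.tmp_dir!()
    # Random file name, to avoid clashing with other files in the temp directory.
    name = (:crypto.strong_rand_bytes(16) |> Base.encode16()) <> ".ex"
    path = Path.join(tmp, name)
    File.write!(path, payload)
    # Writes the compiled .beam file into tmp, which keeps the object code recoverable.
    Kernel.ParallelCompiler.compile_to_path([path], tmp)
    # Lets the code server find the .beam file; returns true on success.
    Code.append_path(tmp)
  end
end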

When you paste this, you get :

{:module, ProbeWrapper,
 <<..., 0, 0, 0, 34, 19, 69, 108, 105, 120, 105, 114, 46, 80, 114, 111, 98, 101, 87,
   114, 97, 112, 112, 101, 114, 8, 95, 95, ...>>, {:load, 0}}

Then you can run ActualProbe, but also access its code :

iex(2)> ProbeWrapper.load
true
iex(3)>...
