Adding Features Without Interrupting Network Connections

Adding Features Without Interrupting Network Connections - exe.dev blog

A busy network

People use exe.dev to build and run programs on virtual machines. For virtual machines that are serving web traffic, we handle encryption and authentication. For people using their VM as a traditional computer, we transparently support SSH access. We also provide terminal access to VMs via the web, and of course our popular Shelley AI coding agent. These features and others require exe.dev to proxy incoming connections between the broader internet and an individual VM.

This network proxying isn’t trivial: we have to identify the VM the user is trying to reach, route the connection to that VM (which may be in a different region of the world), and provide basic authentication to verify that the user has access to the VM. Connections going out from the VM have to be routed correctly too, and we provide an increasing number of integrations.

Managing these features and adding new ones means that we regularly update the software that directs network connections, sometimes deploying multiple times a day. Of course, at the same time our users—and their users—are making and using network connections to VMs. We have to be able to deploy new software without interrupting those network connections. In this post we’ll look at how we do that.

Pipe process

The answer is to simplify the problem by handling the network connections in a separate process, known internally as exepipe. The exepipe process intentionally does not implement any features. Its only job is to manage network connections. That makes it a simple program that doesn’t change very often. We can and do run it for months without redeploying it.

An exepipe process takes commands from a proxy process. These commands are simple and don’t have any features or options.

The simplest command is “listen on this network address, and forward any connections to this other network address.” This command is used on VM host machines to direct incoming connections on an external address to a VM address that can only be seen on the same machine.
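As a rough illustration of what a "listen and forward" command amounts to, here is a minimal Python sketch. It is not exepipe's code: the names `listen_and_forward`, `handle`, and `pump` are made up for this example, and it uses one thread per direction with plain `recv`/`sendall` purely to keep the idea visible.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches EOF, then half-close dst."""
    try:
        while True:
            data = src.recv(65536)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # the other side went away; stop copying in this direction
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client: socket.socket, target: tuple[str, int]) -> None:
    """Connect to the target address and copy data in both directions."""
    with client, socket.create_connection(target) as upstream:
        back = threading.Thread(target=pump, args=(upstream, client), daemon=True)
        back.start()
        pump(client, upstream)
        back.join()


def listen_and_forward(listen_addr: tuple[str, int],
                       target: tuple[str, int]) -> socket.socket:
    """Listen on listen_addr and forward every accepted connection to target.

    Returns the listening socket so the caller can read its bound address.
    """
    listener = socket.create_server(listen_addr)

    def accept_loop() -> None:
        while True:
            try:
                client, _ = listener.accept()
            except OSError:
                return  # accept failed, e.g. the listener was closed
            threading.Thread(target=handle, args=(client, target),
                             daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

Calling `listen_and_forward(("0.0.0.0", 8080), ("10.0.0.2", 80))` (both addresses are placeholders) would expose the second address on port 8080 of the host.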

Another command is “take these two network connections, and copy data back and forth between them.” This is used by the proxy. The proxy will accept an incoming connection, handle authentication, decide which VM the user is trying to reach, and open a connection to that VM. When both connections are up and running, the proxy will direct exepipe to copy data between the connections.

This uses a handy and slightly obscure socket feature: Unix socket ancillary data. When using a Unix domain socket, you can send ancillary data which can include, among other things, file descriptors. The receiving process gets a new file descriptor that refers to the same open file description as the descriptor in the sending process. That file description can be a network socket, so this effectively shares an open network socket between two processes.
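A minimal sketch of that hand-off in Python (3.9 or later, Unix only) might look like the following. The `b"copy"` message and both function names are hypothetical; the post does not describe exepipe's actual command format. `socket.send_fds` and `socket.recv_fds` are the standard library wrappers around `SCM_RIGHTS` ancillary data.

```python
import os
import socket


def send_copy_command(ctrl: socket.socket,
                      a: socket.socket, b: socket.socket) -> None:
    """Proxy side: send a hypothetical "copy" command with two sockets attached."""
    socket.send_fds(ctrl, [b"copy"], [a.fileno(), b.fileno()])


def recv_copy_command(ctrl: socket.socket) -> tuple[socket.socket, socket.socket]:
    """Pipe side: receive the command and wrap the two new descriptors."""
    msg, fds, _flags, _addr = socket.recv_fds(ctrl, 1024, 2)
    if msg != b"copy" or len(fds) != 2:
        for fd in fds:
            os.close(fd)  # do not leak descriptors we will not use
        raise ValueError(f"unexpected command {msg!r} with {len(fds)} descriptors")
    # socket.socket(fileno=...) detects family and type from the descriptor.
    return socket.socket(fileno=fds[0]), socket.socket(fileno=fds[1])
```

Once the command has been sent, the sender can close its own descriptors. The connections stay open because the receiver's descriptors refer to the same open file descriptions.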

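For both of these commands, the exepipe process winds up copying data between network connections. Fortunately, on Linux this can be done efficiently using a pair of splice calls over a pipe: one call moves data from the source socket into the pipe, and a second moves it from the pipe to the destination socket, so the data does not have to be copied through user space.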

Since exepipe is rarely restarted, this copying will last as long as it needs to.

There are of course many ways to copy data between network connections, such as the socat program. That, however, requires a separate socat process per connection and a lot of state tracking. Using a single exepipe server is a more efficient use of system resources.

SSH copying

Handling SSH connections is similar, except that forwarding one SSH connection to another SSH connection is more complicated than forwarding a simple TCP connection. The SSH protocol is packet-oriented, with separate communication channels, and the ability for the SSH client to query the SSH server in various ways. The exepipe process has to forward these channels and queries back and forth over two SSH connections.

Restarting the pipe process

Of course, sometimes we do need to change exepipe.

When we redeploy exepipe, we don’t want to break any of the connections that the existing instance is managing. So, rather than restart exepipe, we start a new exepipe and leave the old one running.

The new exepipe will contact the old exepipe and tell it to transfer any listening sockets, including the socket used to receive new commands from the proxy. The listening sockets will be handed over using Unix socket ancillary data. The old exepipe will then do nothing but keep copying data between sockets that are already open. When all of those socket connections are closed, the old exepipe has nothing left to do and it will quietly exit.
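The listener handover can be sketched the same way as passing a connected socket. In the Python below (3.9 or later, Unix only), the `b"listener"` message and the function names are assumptions made for illustration; the real exchange between the old and new exepipe is not described in this post.

```python
import os
import socket


def hand_over_listener(handoff: socket.socket, listener: socket.socket) -> None:
    """Old instance: send the listening socket, then stop accepting on it."""
    socket.send_fds(handoff, [b"listener"], [listener.fileno()])
    # Closes only this process's descriptor; the socket itself stays open
    # because the receiver now holds a descriptor for it too.
    listener.close()


def take_over_listener(handoff: socket.socket) -> socket.socket:
    """New instance: receive the listening socket and start using it."""
    msg, fds, _flags, _addr = socket.recv_fds(handoff, 1024, 1)
    if msg != b"listener" or len(fds) != 1:
        for fd in fds:
            os.close(fd)
        raise ValueError(f"unexpected handover message {msg!r}")
    return socket.socket(fileno=fds[0])
```

Because the listening socket is never closed during the transfer, clients connecting in the middle of a handover simply wait in the accept queue until the new instance calls `accept`. Connections the old instance accepted earlier are unaffected and keep running in the old process.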

This work is all in the service of supporting exe.dev users. We handle the details of the network to present a simple and seamless interface, using implementation complexity to provide user simplicity.
