Fun with Docker, broken networking, remote filesystem mounts, and race conditions on Debian
Several months ago, the Debian 13 (Trixie) VPS that I use for most of my personal Docker containers (including the one that runs this blog, my family's Nextcloud instance, and a Rustdesk server, among some other small things) developed an issue where the network interface eth0 would fail to come up. It's kinda difficult to manage an internet-facing server without being able to, you know, connect to it, but thankfully my VPS provider offers fairly easy-to-use VNC rescue access through their control panel.<br>Upon initial investigation, ip -br link showed that my eth0 interface was present, but down. Running systemctl restart networking brought everything back up and I was once again able to ssh into my server, which immediately led me to suspect that the problem was a race condition involving Docker. Docker is known to significantly modify the underlying operating system's networking stack, and on Linux it creates isolated virtual networks and network interfaces for its containers, so it's totally reasonable to consider it a "prime suspect".<br>Add network-online.target as a hard prerequisite for docker.service<br>If you have installed Docker the "normal" way, the first thing to do is to inspect the Docker systemd unit file using systemctl cat docker.<br># /usr/lib/systemd/system/docker.service<br>[Unit]<br>Description=Docker Application Container Engine<br>Documentation=https://docs.docker.com<br>After=network-online.target nss-lookup.target docker.socket firewalld.service containerd.service time-set.target<br>Wants=network-online.target containerd.service<br>Requires=docker.socket<br>StartLimitBurst=3<br>StartLimitIntervalSec=60<br>You'll notice that network-online.target is listed in After= and Wants=. After= only tells systemd to start Docker after the network is online, and Wants= makes the network being online a nice-to-have but not strictly necessary prerequisite: Docker will still start if network-online.target fails. 
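As a quick way to see which dependency directives of a unit mention a given target, here is a minimal shell sketch. The helper name deps_on is made up for illustration; it only scans unit-file text on stdin for a handful of common dependency directives and is not part of systemd.

```shell
# deps_on TARGET: read unit-file text on stdin and print the name of every
# dependency directive (After=, Wants=, Requires=, ...) that lists TARGET.
# Hypothetical helper for illustration; it is not a systemd command.
deps_on() {
    awk -F= -v target="$1" '
        /^(After|Before|Wants|Requires|BindsTo|PartOf)=/ {
            # Each directive takes a space-separated list of unit names
            n = split($2, units, " ")
            for (i = 1; i <= n; i++)
                if (units[i] == target) print $1
        }'
}

# On the server itself (assumes systemd and an installed docker.service):
#   systemctl cat docker.service | deps_on network-online.target
```

Fed the unit file shown above, this should list After and Wants but not Requires, which is exactly the gap the next step closes.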
For most people running Docker containers that do useful real-world things over the internet, however, connectivity is more or less a hard prerequisite before the Docker daemon starts up.<br>To achieve this, we need to add network-online.target to a Requires= clause. The best way to do this without modifying /usr/lib/systemd/system/docker.service directly is to use systemctl's edit function, systemctl edit docker.service. This creates a drop-in file of partial overrides without having to edit the entire unit (although you can do that too). Add the new Requires= clause as follows:<br>### Editing /etc/systemd/system/docker.service.d/override.conf<br>### Anything between here and the comment below will become the contents of the drop-in file
[Unit]<br>Requires=network-online.target
### Edits below this comment will be discarded<br>One reboot later, and everything is working as expected - the network comes up automatically before Docker and its containers do!<br>Getting my remote filesystem to play nice with systemd and Docker<br>Instead of using local on-disk storage for my Nextcloud instance, I use a Cloudflare R2 bucket mounted via rclone using a systemd mount unit. I had been dealing with a related issue where the mount would fail to appear before the Nextcloud container had started. The kernel log was more informative in this case and pointed directly to an ordering problem:<br>local-fs.target: Job var-www-nextcloud-data.mount/start deleted to break ordering cycle...<br>It turns out that my initial mount unit file was inadequate - there are a few parameters needed specifically for network/remote filesystems:<br>After= and Requires= both set to network-online.target - start the mount only once the network is online<br>Before=docker.service - start the mount before Docker<br>DefaultDependencies=no - tells systemd not to assume any default dependencies, as we are specifying them here<br>_netdev in the mount Options - as the systemd.mount man page puts it: "Normally the file system type is used to determine if a mount is a 'network mount', i.e. if it should only be started after the network is available. Using this option overrides this detection and specifies that the mount requires network."<br>WantedBy=remote-fs.target - ensures that the mount unit is automatically started on boot when systemd reaches remote-fs.target. I had initially been using the more generic multi-user.target, but that is not adequate.<br>Here's an example of a fully functional mount unit stored at /etc/systemd/system/var-www-nextcloud-data.mount:<br>[Unit]<br>Description=Nextcloud data mount<br>After=network-online.target<br>Requires=network-online.target<br>Before=docker.service<br>DefaultDependencies=no
[Mount]<br>Type=rclone<br>What=r2:nextcloud-1<br>Where=/var/www/nextcloud/data<br>Options=rw,allow_other,uid=33,gid=33,dir-perms=770,file-perms=0664,umask=002,args2env,vfs-cache-mode=full,config=/etc/rclone.conf,cache-dir=/var/cache/rclone,_netdev
[Install]<br>WantedBy=remote-fs.target<br>With this setup, I now have a nicely ordered boot phase where everything reliably...