How to deadlock a Java executor service

How to Deadlock a Java ExecutorService | Mlangcs Personal Blog

In this blog post I want to have a close look at a type of deadlock you can run into with any Java ExecutorService with a bounded number of threads. While rare in practice, the necessary ingredients to trigger it are always at your fingertips, especially around the shared ForkJoinPool#commonPool.

First, I’ll introduce you to the problem at an abstract level, in a lab setting. Then I will walk you through how I recently ran into this problem in the wild, while working on more-log4j2, and conclude with a few recommendations.

Minimal Reproduction

Let me explain the heart of the problem at an abstract level, before showing you how to reproduce it with a few lines of code. What you need for a minimal reproduction is a single-threaded executor service executor and a task taskS that is scheduled on executor. taskS schedules another taskT on executor, and waits for its completion synchronously. Since executor has only one thread that is already occupied by taskS, taskT can never start, and taskS therefore waits forever. You might find it helpful to have a look at the following diagram, which captures the scenario I’ve just described:

A minimal reproduction based on the aforementioned ideas has only a few lines of code:

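    import static java.lang.System.out;

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;

    void main() {
        var executor = Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory());
        Runnable taskT = () -> out.println("T done");

        // taskS occupies the only executor thread while it blocks on taskT
        Runnable taskS = () -> {
            CompletableFuture.runAsync(taskT, executor).join();
            out.println("S done");
        };

        CompletableFuture.runAsync(taskS, executor).join();
    }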

Note that Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory()) is used in favor of Executors.newSingleThreadExecutor() to make sure that executor threads cannot prevent the program from terminating. The only potential reason for the program to hang is therefore the last line in main, which schedules and waits for taskS.

If you run this program, you can confirm that it indeed hangs, and that none of the println statements is ever reached.
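If you would rather not stare at a hanging process, the hang can also be detected with a timeout. The following is only a sketch under a few assumptions of mine: the class name DeadlockProbe, the helper hangs, and the timeout values are made up for illustration, and Executors.newFixedThreadPool(1, ...) stands in for the single-threaded executor. It uses a hand-written daemon thread factory so that it does not depend on Thread.ofPlatform().

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class, not part of the original post.
class DeadlockProbe {

    /** Returns true if taskS has not completed when the timeout expires. */
    public static boolean hangs(int threads, long timeoutMillis) {
        ExecutorService executor = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // blocked workers must not keep the JVM alive
            return t;
        });
        Runnable taskT = () -> { };
        // taskS blocks its worker thread until taskT has run on the same executor
        Runnable taskS = () -> CompletableFuture.runAsync(taskT, executor).join();
        try {
            CompletableFuture.runAsync(taskS, executor)
                    .get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true; // taskS is still waiting: taskT never got a thread
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 thread hangs: " + hangs(1, 500));
        System.out.println("2 threads hang: " + hangs(2, 5000));
    }
}
```

A timeout only tells you that the task has not finished yet, not why, so treat a positive result as a hint to take a thread dump rather than as proof of a deadlock.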

Before we move on, let’s look at three ways to fix the deadlock. Probably the most obvious way is to add a thread to the executor, as in

var executor = Executors.newFixedThreadPool(2, Thread.ofPlatform().daemon().factory());

Then taskT can execute while taskS waits for it as in the following picture:

However, the implementation is easily adapted to also deadlock with 2 threads:

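    import static java.lang.System.out;

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;
    import java.util.function.IntFunction;

    void main() {
        var executor = Executors.newFixedThreadPool(2, Thread.ofPlatform().daemon().factory());
        IntFunction<Runnable> newTaskT = i -> () -> out.printf("T%s done%n", i);

        // Each S task blocks one of the two threads while waiting for its T task
        IntFunction<Runnable> newTaskS = i -> () -> {
            CompletableFuture.runAsync(newTaskT.apply(i), executor).join();
            out.printf("S%s done%n", i);
        };

        CompletableFuture.allOf(
            CompletableFuture.runAsync(newTaskS.apply(1), executor),
            CompletableFuture.runAsync(newTaskS.apply(2), executor)
        ).join();
    }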

What happens if you execute the code above is captured in the diagram below:

To reproduce the deadlock, both S1 and S2 have to start before either T1 or T2 gets a chance to run, and I hope it’s clear from here how this generalizes to 3, 4, 5, … threads.
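To make the generalization concrete, here is a sketch that forces the unlucky interleaving for any pool size. It is my own illustration, not code from the scenario above: the class PoolExhaustion, the method deadlocks, and the CountDownLatch that holds every outer task back until all of them have started are assumptions added to make the race deterministic.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class, not part of the original post.
class PoolExhaustion {

    /** Returns true if the outer tasks are still blocked when the timeout expires. */
    public static boolean deadlocks(int threads, int outerTasks, long timeoutMillis) {
        if (outerTasks > threads) {
            throw new IllegalArgumentException("every outer task needs a thread to start on");
        }
        ExecutorService executor = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        CountDownLatch allStarted = new CountDownLatch(outerTasks);
        CompletableFuture<?>[] outer = new CompletableFuture<?>[outerTasks];
        for (int i = 0; i < outerTasks; i++) {
            outer[i] = CompletableFuture.runAsync(() -> {
                // Wait until every outer task holds a thread before scheduling inner work
                allStarted.countDown();
                try {
                    allStarted.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                // Block this thread until the inner task has run on the same executor
                CompletableFuture.runAsync(() -> { }, executor).join();
            }, executor);
        }
        try {
            CompletableFuture.allOf(outer).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("3 threads, 3 outer tasks: " + deadlocks(3, 3, 500));
        System.out.println("4 threads, 3 outer tasks: " + deadlocks(4, 3, 5000));
    }
}
```

With as many outer tasks as threads, every thread ends up waiting for an inner task that is stuck in the queue, so the first call should report a deadlock. One spare thread is enough to drain the inner tasks, so the second call should not.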

Luckily, there is another, more reliable way to get rid of the deadlock, without adding threads. The code can be unblocked by letting taskS wait for taskT asynchronously, like in this snippet:

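    import static java.lang.System.out;

    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;
    import java.util.function.Supplier;

    void main() {
        var executor = Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory());
        Runnable taskT = () -> out.println("T done");

        // taskS schedules taskT and returns immediately instead of blocking on it
        Supplier<CompletableFuture<Void>> taskS = () ->
            CompletableFuture.runAsync(taskT, executor).thenRun(() -> out.println("S done"));

        CompletableFuture.supplyAsync(taskS, executor)
            .thenCompose(f -> f) // Flattens Future<Future<Void>> into Future<Void>
            .join();
    }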

This fixes the problem because taskS now releases its hold on the executor thread after scheduling taskT, and therefore allows taskT to run. As you can confirm by instrumenting our executor with

    var executor = new Executor() {
        final Executor impl = Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory());

        @Override
        public void execute(Runnable command) {
            out.printf("execute(%s)%n", command);
            impl.execute(command);
        }
    };

there are still two Runnables being submitted. The first one, implemented by CompletableFuture.AsyncSupply, corresponds to the first part of taskS, which schedules taskT. The second one, implemented by CompletableFuture.AsyncRun, executes taskT, and then the remaining part of taskS. Assembled in a diagram, you can picture what is going on as follows:
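The same sequence can also be checked in code. The sketch below is an assumption-laden variant of the instrumented executor: instead of printing, it records events in a list, and the class name AsyncTrace and the method trace are names I made up for this illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical helper class, not part of the original post.
class AsyncTrace {

    /** Runs the non-blocking variant on one thread and returns the recorded events. */
    public static List<String> trace() {
        List<String> events = new CopyOnWriteArrayList<>();
        ExecutorService impl = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        // Record every submission before delegating to the real executor
        Executor executor = command -> {
            events.add("execute");
            impl.execute(command);
        };

        Runnable taskT = () -> events.add("T done");
        // Schedules taskT and returns at once; the rest of taskS runs as a continuation
        Supplier<CompletableFuture<Void>> taskS = () ->
                CompletableFuture.runAsync(taskT, executor)
                        .thenRun(() -> events.add("S done"));

        CompletableFuture.supplyAsync(taskS, executor)
                .thenCompose(f -> f)
                .join();
        impl.shutdown();
        return List.copyOf(events);
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

On a single-threaded executor the recorded events should be two submissions, followed by "T done" and then "S done": taskT cannot start before the first part of taskS has returned, and the continuation only fires once taskT completes.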

Finally, a third way to address this problem is to switch to a virtual-thread executor, like Executors.newVirtualThreadPerTaskExecutor(). Adding platform threads doesn’t truly solve it: seemingly unbounded pools like Executors#newCachedThreadPool() or Executors#newThreadPerTaskExecutor() are still subject to resource limits, and exhausting them might severely impact the health of your application. Let me illustrate this point with a small example,...
