How to deadlock a Java executor service

How to Deadlock a Java ExecutorService | Mlangcs Personal Blog

In this blog post I want to take a close look at a type of deadlock you can run into with any Java ExecutorService with a bounded number of threads. While rare in practice, the necessary ingredients to trigger it are always at your fingertips, especially around the shared ForkJoinPool#commonPool.

First, I’ll introduce you to the problem at an abstract level, in a lab setting. Then I will walk you through how I recently ran into this problem in the wild, while working on more-log4j2, and conclude with a few recommendations.

Minimal Reproduction

Let me explain the heart of the problem at an abstract level, before showing you how to reproduce it with a few lines of code. What you need for a minimal reproduction is a single-threaded executor service executor and a task taskS that is scheduled on executor. taskS schedules another task, taskT, on executor and waits for its completion synchronously. Since executor has only one thread, and that thread is already occupied by taskS, taskT can never start, and taskS therefore waits forever. You might find it helpful to have a look at the following diagram, which captures the scenario I’ve just described:

A minimal reproduction based on the aforementioned ideas has only a few lines of code:

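    import static java.lang.System.out;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;

    void main() {
        var executor = Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory());
        Runnable taskT = () -> out.println("T done");

        Runnable taskS = () -> {
            // Blocks the only executor thread until taskT completes.
            CompletableFuture.runAsync(taskT, executor).join();
            out.println("S done");
        };

        // Schedules taskS and waits for it; this call never returns.
        CompletableFuture.runAsync(taskS, executor).join();
    }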

Note that Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory()) is used instead of Executors.newSingleThreadExecutor() to make sure that executor threads cannot prevent the program from terminating. The only potential reason for the program to hang is therefore the last line in main, which schedules and waits for taskS.

If you run this program, you can confirm that it indeed hangs, and all println statements are dead code.
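To observe the hang without freezing your terminal, you can bound the outer wait with a timeout. The following is a minimal sketch rather than the code from this post: the class DeadlockProbe, the method deadlocks, and the timeout values are illustrative assumptions, and a hand-written daemon ThreadFactory stands in for Thread.ofPlatform().daemon().factory() so that the sketch also compiles on older JDKs.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class DeadlockProbe {

    /** Returns true if taskS did not finish within the timeout, i.e. the pool is stuck. */
    static boolean deadlocks(int threads, long timeoutMillis) {
        // Daemon threads, so a stuck pool cannot keep the JVM alive (assumed equivalent
        // to the Thread.ofPlatform().daemon().factory() used in the post).
        ExecutorService executor = Executors.newFixedThreadPool(threads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });

        Runnable taskT = () -> System.out.println("T done");
        Runnable taskS = () -> {
            // Blocks an executor thread until taskT completes.
            CompletableFuture.runAsync(taskT, executor).join();
            System.out.println("S done");
        };

        try {
            // Bounded wait instead of join(), so the caller gets control back.
            CompletableFuture.runAsync(taskS, executor).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocks with 1 thread: " + deadlocks(1, 1000));
    }
}
```

With a single thread the probe should report true once the timeout elapses. The timeout only makes the hang observable; it does not fix anything, since the blocked executor thread stays blocked.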

Before we move on, let’s look at three ways to fix the deadlock. Probably the most obvious way is to add a thread to the executor, as in

var executor = Executors.newFixedThreadPool(2, Thread.ofPlatform().daemon().factory());

Then taskT can execute while taskS waits for it as in the following picture:

However, the implementation is easily adapted to also deadlock with 2 threads:

    import static java.lang.System.out;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;
    import java.util.function.IntFunction;

    void main() {
        var executor = Executors.newFixedThreadPool(2, Thread.ofPlatform().daemon().factory());
        IntFunction<Runnable> newTaskT = i -> () -> out.printf("T%s done%n", i);

        IntFunction<Runnable> newTaskS = i -> () -> {
            // Each S task blocks one of the two executor threads while waiting for its T task.
            CompletableFuture.runAsync(newTaskT.apply(i), executor).join();
            out.printf("S%s done%n", i);
        };

        CompletableFuture.allOf(
            CompletableFuture.runAsync(newTaskS.apply(1), executor),
            CompletableFuture.runAsync(newTaskS.apply(2), executor)
        ).join();
    }

What happens if you execute the code above is captured in the diagram below:

To reproduce the deadlock, both S1 and S2 have to start before either T1 or T2 gets a chance to run, and I hope it’s clear from here how this generalizes to 3, 4, 5, … threads.
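Because the two-thread deadlock depends on this ordering, it may not show up on every run. One way to force the ordering is a latch that holds both S tasks back until each of them occupies a thread. This is a sketch under assumptions: the CountDownLatch, the class TwoThreadDeadlock, and the bounded wait are additions for illustration and are not part of the code above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.IntFunction;

class TwoThreadDeadlock {

    /** Returns true if S1 and S2 did not both finish within the timeout. */
    static boolean deadlocks(long timeoutMillis) {
        ExecutorService executor = Executors.newFixedThreadPool(2, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true); // a stuck pool must not keep the JVM alive
            return t;
        });
        // Hypothetical helper: opens only after both S tasks have started.
        CountDownLatch bothStarted = new CountDownLatch(2);

        IntFunction<Runnable> newTaskT = i -> () -> System.out.printf("T%s done%n", i);
        IntFunction<Runnable> newTaskS = i -> () -> {
            bothStarted.countDown();
            try {
                bothStarted.await(); // S1 and S2 now hold both threads
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            // Neither T task can start: both threads are blocked right here.
            CompletableFuture.runAsync(newTaskT.apply(i), executor).join();
            System.out.printf("S%s done%n", i);
        };

        CompletableFuture<Void> all = CompletableFuture.allOf(
            CompletableFuture.runAsync(newTaskS.apply(1), executor),
            CompletableFuture.runAsync(newTaskS.apply(2), executor));
        try {
            all.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The same idea should carry over to n threads by sizing both the pool and the latch to n and submitting n S tasks.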

Luckily, there is another, more reliable way to get rid of the deadlock, without adding threads. The code can be unblocked by letting taskS wait for taskT asynchronously, like in this snippet:

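    import static java.lang.System.out;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executors;
    import java.util.function.Supplier;

    void main() {
        var executor = Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory());
        Runnable taskT = () -> out.println("T done");

        // taskS schedules taskT and returns immediately; the rest of taskS runs as a continuation.
        Supplier<CompletableFuture<Void>> taskS = () ->
            CompletableFuture.runAsync(taskT, executor).thenRun(() -> out.println("S done"));

        CompletableFuture.supplyAsync(taskS, executor)
            .thenCompose(f -> f) // Flattens Future<Future<Void>> into Future<Void>
            .join();
    }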

This fixes the problem because taskS now releases its hold on the executor thread after scheduling taskT, and therefore allows taskT to run. As you can confirm by instrumenting our executor with

    var executor = new Executor() {
        final Executor impl = Executors.newSingleThreadExecutor(Thread.ofPlatform().daemon().factory());

        @Override
        public void execute(Runnable command) {
            // Log every Runnable before handing it to the real executor.
            out.printf("execute(%s)%n", command);
            impl.execute(command);
        }
    };

there are still two Runnables being submitted. The first one, implemented by CompletableFuture.AsyncSupply, corresponds to the first part of taskS, which schedules taskT. The second one, implemented by CompletableFuture.AsyncRun, executes taskT, and then the remaining part of taskS. Assembled in a diagram, you can picture what is going on as follows:
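If you would rather check the number of submissions programmatically than read log output, the instrumented executor can record what it receives. This is a sketch with assumed names (AsyncFixTrace, submittedTaskTypes); it collects the simple class names of the submitted Runnables, which are JDK implementation details and may differ between versions.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

class AsyncFixTrace {

    /** Runs the non-blocking variant on one thread and returns the types of the submitted Runnables. */
    static List<String> submittedTaskTypes() {
        ExecutorService impl = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        List<String> submitted = new CopyOnWriteArrayList<>();
        // Recording wrapper around the single-threaded executor.
        Executor executor = command -> {
            submitted.add(command.getClass().getSimpleName());
            impl.execute(command);
        };

        Runnable taskT = () -> System.out.println("T done");
        // taskS only schedules taskT and registers a continuation; it never blocks.
        Supplier<CompletableFuture<Void>> taskS = () ->
            CompletableFuture.runAsync(taskT, executor)
                .thenRun(() -> System.out.println("S done"));

        CompletableFuture.supplyAsync(taskS, executor)
            .thenCompose(f -> f) // flatten the nested future
            .join();

        impl.shutdown();
        return submitted;
    }

    public static void main(String[] args) {
        System.out.println(submittedTaskTypes());
    }
}
```

The returned list should contain two entries, one per submission. The continuations registered with thenRun and thenCompose are not submitted to the executor separately, because the non-async variants of these methods run on whichever thread completes the preceding stage.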

Finally, a third way to address this problem is to switch to a virtual-thread executor, like Executors.newVirtualThreadPerTaskExecutor(). Adding platform threads doesn’t truly solve it: seemingly unbounded pools like Executors#newCachedThreadPool() or Executors#newThreadPerTaskExecutor() are still subject to resource limits, and exhausting them might severely impact the health of your application. Let me illustrate this point with a small example,...
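For the virtual-thread option specifically, the blocking version of taskS from the minimal reproduction can be run unchanged on Executors.newVirtualThreadPerTaskExecutor(). The sketch below assumes Java 21 or later; the class VirtualThreadVariant, the method completes, and the timeout are illustrative names and values, and it does not cover the resource-limit point made above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class VirtualThreadVariant {

    /** Returns true if the blocking taskS finishes within the timeout on virtual threads. */
    static boolean completes(long timeoutMillis) {
        // Every submitted task gets its own virtual thread (requires Java 21+).
        ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();

        Runnable taskT = () -> System.out.println("T done");
        Runnable taskS = () -> {
            // Still a synchronous wait, but it blocks only this task's own virtual thread.
            CompletableFuture.runAsync(taskT, executor).join();
            System.out.println("S done");
        };

        try {
            CompletableFuture.runAsync(taskS, executor).get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("completes on virtual threads: " + completes(5000));
    }
}
```

Since taskT runs on a fresh virtual thread instead of queueing behind taskS, the synchronous join() should be able to return here.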
