What Python async exposes that synchronous code hides

jhevans1


Asynchronous IO does more than overlap work. It changes the shape of time inside a system, making behaviours visible that synchronous code hides. Hidden waits become explicit. Accidental coupling shows up immediately. Backpressure appears as measurable queueing rather than blocked threads.

One value of asynchrony is that it acts as a diagnostic tool. The moment two steps depend on each other, the dependency becomes visible because the coroutine cannot progress past its await until the awaited operation completes. What looked OK in synchronous code becomes an operational constraint you can now see, measure, and reason about.

The example below shows how Python async reshapes execution flow. A long-lived call to the server yields instead of blocking, allowing the client to process the next image while the previous upload is still in flight. The result is not just a speedup of 6.7 times; it is also a clearer picture of where your system is coupled, where it is IO-bound, and where backpressure begins to form.

Backpressure is the situation where a service becomes overloaded, slowing down those that call in. In synchronous code, this slowdown is communicated downstream to code that relies on the now slower process. Threads in the slower process block, so no useful work can be done, and the process seizes up. Synchrony has coupled the performance of the two processes together.

Backpressure is the system signalling "I am saturated", and it shows up as silence: growing waits, stalled pipelines, or accumulating tasks, rather than explicit errors. The system slows down before it falls over.
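The idea of backpressure as measurable queueing can be sketched with a bounded asyncio.Queue. This is a minimal illustration, not the article's own code: the names producer, consumer and run_demo are hypothetical, and asyncio.sleep stands in for a slow downstream service. The producer records how long each put() had to wait, and the consumer samples the queue depth.

```python
import asyncio
import time


async def producer(queue: asyncio.Queue, n_items: int, waits: list[float]) -> None:
    """Offer items as fast as possible; record how long each put() waited."""
    for i in range(n_items):
        started = time.perf_counter()
        await queue.put(i)  # suspends here while the queue is full
        waits.append(time.perf_counter() - started)
    await queue.put(None)  # sentinel: no more items


async def consumer(queue: asyncio.Queue, service_time: float, depths: list[int]) -> None:
    """Drain the queue slowly, sampling its depth before each item."""
    while True:
        depths.append(queue.qsize())
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(service_time)  # stand-in for a slow downstream service


async def run_demo(n_items: int = 8, maxsize: int = 2, service_time: float = 0.02):
    """Run a fast producer against a slow consumer; return waits and depths."""
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    waits: list[float] = []
    depths: list[int] = []
    await asyncio.gather(
        producer(queue, n_items, waits),
        consumer(queue, service_time, depths),
    )
    return waits, depths


if __name__ == "__main__":
    waits, depths = asyncio.run(run_demo())
    print("longest put() wait:", round(max(waits), 3), "s")
    print("deepest queue seen:", max(depths))
```

Nothing raises an error when the consumer falls behind. The only evidence is the growing put() wait and the queue sitting at its limit, which is why those two numbers are worth recording.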

To get a handle on such slowdowns in the asynchronous examples, OpenTelemetry spans are included to show how observability aligns naturally with async boundaries. Because each Python await is a structured yield point, tracing maps directly onto the concurrency model.

Asynchrony: the fundamental idea

The fundamental idea is to put long-lived operations into a separate execution context so that other work can progress while waiting for the long-lived operation to complete.

In this way we overlap the completion of tasks. The long-lived execution is started, but the call to it returns immediately. The other work can then start.

It is important that the other work has no dependence on the long-lived work. If the second body of work depended on the first, the first would have to complete before the second could start. In that case, the two bodies of work would have to operate serially, not concurrently.

For concurrency to work, the work items must be independent of one another.
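The difference between dependent and independent work can be sketched as follows. The function names are hypothetical and asyncio.sleep stands in for an IO wait; the point is only that a data dependency forces the two waits to run back to back, while independent calls can overlap.

```python
import asyncio
import time


async def fetch(label: str, delay: float) -> str:
    """Stand-in for an IO-bound call; asyncio.sleep simulates the wait."""
    await asyncio.sleep(delay)
    return label


async def dependent(delay: float) -> float:
    """Step two needs step one's result, so the awaits run back to back."""
    started = time.perf_counter()
    first = await fetch("token", delay)
    await fetch(f"data-for-{first}", delay)
    return time.perf_counter() - started


async def independent(delay: float) -> float:
    """Neither step needs the other, so both waits can overlap."""
    started = time.perf_counter()
    await asyncio.gather(fetch("a", delay), fetch("b", delay))
    return time.perf_counter() - started


if __name__ == "__main__":
    print("dependent:  ", round(asyncio.run(dependent(0.05)), 3), "s")
    print("independent:", round(asyncio.run(independent(0.05)), 3), "s")
```

The dependent version takes roughly two delays and the independent version roughly one. Adding async to dependent steps does not make them concurrent.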

Our example

Our example is a long-lived RESTful call that receives client JPG data. During the long server wait, the client performs useful work preparing the next image file for upload.
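The shape of that client loop might look like the sketch below. It is an assumption-laden stand-in rather than the article's implementation: prepare, upload and upload_all are hypothetical names, time.sleep simulates image preparation, and asyncio.sleep simulates the server wait instead of a real HTTP call.

```python
import asyncio
import time
from typing import Optional


def prepare(index: int, prep_time: float) -> bytes:
    """Hypothetical preparation step; time.sleep stands in for reading a JPG."""
    time.sleep(prep_time)
    return b"\xff\xd8" + b"\x00" * index  # JPG start marker plus dummy payload


async def upload(image: bytes, server_delay: float) -> int:
    """Hypothetical upload; asyncio.sleep stands in for the long server wait."""
    await asyncio.sleep(server_delay)
    return len(image)  # pretend the server reports the bytes it received


async def upload_all(
    count: int, prep_time: float = 0.01, server_delay: float = 0.03
) -> list[int]:
    """Keep one upload in flight while the next image is being prepared."""
    received: list[int] = []
    in_flight: Optional[asyncio.Task] = None
    for index in range(count):
        # Prepare in a worker thread so the event loop stays free to
        # drive the upload that is already in flight.
        image = await asyncio.to_thread(prepare, index, prep_time)
        if in_flight is not None:
            received.append(await in_flight)  # collect the previous upload
        in_flight = asyncio.create_task(upload(image, server_delay))
    if in_flight is not None:
        received.append(await in_flight)  # collect the final upload
    return received


if __name__ == "__main__":
    print(asyncio.run(upload_all(3)))
```

Preparation runs through asyncio.to_thread because a blocking call made directly inside the coroutine would stall the event loop, and the upload task created on the previous iteration would not get a chance to start.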

In addition, observability support is shown using OpenTelemetry, with a tracer and spans. A span is a timed, structured record of one operation. Observability for the client and server sides of the JPG processing is built by linking many spans together.
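To make "a timed, structured record" and "linking spans" concrete, here is a standard-library stand-in. It deliberately does not use the OpenTelemetry API: the Span class, the span() helper and the FINISHED list are hypothetical names invented for this illustration, and a direct function call replaces the network hop between client and server.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Span:
    """Illustrative span record; not the OpenTelemetry Span class."""
    name: str
    trace_id: str
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    parent_id: Optional[str] = None
    start: float = 0.0
    end: float = 0.0

    @property
    def duration(self) -> float:
        return self.end - self.start


FINISHED: list[Span] = []  # completed spans, in the order they ended


@contextmanager
def span(name: str, parent: Optional[Span] = None) -> Iterator[Span]:
    """Time one operation and link it to its parent, if any."""
    record = Span(
        name=name,
        trace_id=parent.trace_id if parent else uuid.uuid4().hex,
        parent_id=parent.span_id if parent else None,
    )
    record.start = time.perf_counter()
    try:
        yield record
    finally:
        record.end = time.perf_counter()
        FINISHED.append(record)


def handle_upload(client_span: Span) -> None:
    """Hypothetical server side: its span joins the client's trace."""
    with span("server.process_jpg", parent=client_span):
        time.sleep(0.01)  # stand-in for server work


def upload_once() -> None:
    """Hypothetical client side: one span wraps the whole upload."""
    with span("client.upload_jpg") as client_span:
        handle_upload(client_span)  # a real client would make an HTTP call here


if __name__ == "__main__":
    upload_once()
    for s in FINISHED:
        print(s.name, s.parent_id, round(s.duration, 4))
```

The shared trace_id and the parent_id pointer are what turn two separate timings into one picture of the request. In a real deployment the trace context would travel in request headers rather than as a function argument.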

Input/Output

Input/Output (IO) is a common operation that takes time. Your code might have to wait for:

a remote service, such as a payment provider or cloud storage, to return

a database query

a large file to be read into memory

machine learning inference

Kafka partitions to respond

A Synchronous Approach

When programming synchronously, the thread that is executing must wait (block) for the entire IO operation. In a single-threaded program, nothing else gets done while you wait.

In a threaded program using a thread pool, long IO wait times make it more likely that all threads block.

For example, in a pool of 32 threads, if the average blocking time is 2s and 32 requests are received within 2s, all 32 threads will be blocked. A 33rd request will have to wait for a thread to become available. Queued requests inherit the delay.
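The same effect can be reproduced at a smaller scale. The sketch below uses a pool of 4 threads and a 0.05s blocking call in place of 32 threads and 2s, purely so that it runs quickly; blocking_call and latencies are hypothetical names and time.sleep stands in for blocking IO.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(block_time: float) -> float:
    """Stand-in for blocking IO; returns the time at which it finished."""
    time.sleep(block_time)
    return time.perf_counter()


def latencies(pool_size: int, requests: int, block_time: float) -> list[float]:
    """Submit all requests at once and return each one's total latency."""
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        submitted = time.perf_counter()
        futures = [pool.submit(blocking_call, block_time) for _ in range(requests)]
        # Latency includes any time spent queued waiting for a free thread.
        return [f.result() - submitted for f in futures]


if __name__ == "__main__":
    # One request more than the pool can run at the same time.
    for i, latency in enumerate(latencies(pool_size=4, requests=5, block_time=0.05), 1):
        print(f"request {i}: {latency:.3f}s")
```

The first four requests take about one blocking period each. The fifth takes about two, because it spends the first period queued behind a full pool.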

If you have a downstream system waiting for the 32 requests to complete, you now have backpressure building and being communicated around your distributed system.

The slow remote service is slowing down your downstream service, which is likely to slow any other downstream services. One of those could be a user interface, giving your end user a degraded experience.

In a synchronous system you want to give yourself a heads-up when production moves towards backpressure. Observability of wait times, configuration (the use of 32 threads as opposed to another number), and resource usage (all 32 threads being occupied) are important signals to track.

Using asynchrony

Python's asynchrony provides:

The event loop — a scheduler within Python

Coroutines — functions that can suspend and resume

Tasks — scheduled coroutines managed by the event loop

Await points — explicit yield boundaries where a coroutine...
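The four pieces listed above can be seen together in a few lines. This is a minimal sketch with hypothetical names (worker, main); asyncio.sleep(0) is used only as the simplest possible await point that hands control back to the event loop.

```python
import asyncio


async def worker(name: str, log: list[str]) -> None:
    """A coroutine: it suspends at the await and resumes later."""
    log.append(f"{name} start")
    await asyncio.sleep(0)  # await point: yield control to the event loop
    log.append(f"{name} end")


async def main() -> list[str]:
    log: list[str] = []
    # Tasks: coroutines scheduled on the event loop.
    first = asyncio.create_task(worker("a", log))
    second = asyncio.create_task(worker("b", log))
    await first
    await second
    return log


if __name__ == "__main__":
    # asyncio.run() creates the event loop and runs main() on it.
    print(asyncio.run(main()))
```

The log shows both workers starting before either one ends, because each worker gives up control at its await point and the event loop switches to the other task.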
