How Python async exposes hidden coupling and backpressure
Asynchronous IO addresses the hidden waits that make synchronous systems more likely to drift into backpressure.
Long operations yield instead of blocking, so unrelated work continues while the wait completes. The moment two steps depend on each other, the dependency becomes visible. This is the value of asynchrony. Coupling is exposed, as is load behaviour, and the early signs of saturation become more obvious before they turn into blocked threads and stalled pipelines.
The example code below shows how asynchrony changes the shape of time in a system and reveals the operational behaviours that synchronous code can hide.
The code also includes a small OpenTelemetry span, making behaviour observable from client through to server.
As the Conclusion shows, asynchrony has structural effects on execution flow, affects system-level behaviour, and has operational consequences.
Asynchrony: the fundamental idea
The fundamental idea is to put long-lived operations into a separate execution context so that other work can progress while the long-lived operation completes.
In this way we overlap the completion of tasks. The long-lived execution is started, but the call to it returns immediately. The other work can then start.
It is important that the other work has no dependence on the long-lived work. If the second piece of work depended on the first, the first would have to complete before the second could start. The two bodies of work would then have to run serially, not concurrently.
For concurrency to work, the work items must be independent of one another.
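The overlap described above can be sketched with `asyncio`. This is a minimal illustration, not the article's example code: `long_lived_operation` and `other_work` are hypothetical names, and `asyncio.sleep` stands in for real waiting.

```python
import asyncio
import time


async def long_lived_operation() -> str:
    # Stand-in for a slow remote call; asyncio.sleep yields to the event loop.
    await asyncio.sleep(0.2)
    return "long result"


async def other_work() -> str:
    # Independent work: it needs nothing from long_lived_operation.
    await asyncio.sleep(0.1)
    return "other result"


async def run_overlapped() -> float:
    start = time.perf_counter()
    # Start the long-lived operation; create_task returns immediately.
    pending = asyncio.create_task(long_lived_operation())
    await other_work()   # progresses while the long operation waits
    await pending        # collect the long result once it is ready
    return time.perf_counter() - start


async def run_serial() -> float:
    start = time.perf_counter()
    # Each await finishes before the next starts, so the waits add up.
    await long_lived_operation()
    await other_work()
    return time.perf_counter() - start


overlapped = asyncio.run(run_overlapped())
serial = asyncio.run(run_serial())
print(f"overlapped: {overlapped:.2f}s, serial: {serial:.2f}s")
```

With independent work the overlapped run takes roughly as long as the slower of the two operations, while the serial run takes roughly their sum.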
Our example
Our example is a long-lived RESTful call that receives client JPG data. During the long server wait, the client performs useful work preparing the next image file for upload.
In addition, the example shows observability support using OpenTelemetry: a tracer and a span capture a timed, structured record of one operation. Observability across the client and server sides of the JPG processing is built by linking many spans together.
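The client-side overlap in this example can be sketched as a small pipeline. This is an assumption-laden sketch: `upload` and `prepare` are hypothetical stand-ins that sleep instead of making a REST call or reading a file, and tracing is left out so the block needs only the standard library.

```python
import asyncio


async def upload(image: bytes) -> int:
    # Hypothetical stand-in for the long-lived REST call.
    # Returns the number of bytes the "server" received.
    await asyncio.sleep(0.1)
    return len(image)


async def prepare(name: str) -> bytes:
    # Hypothetical stand-in for reading and encoding the next JPG.
    await asyncio.sleep(0.05)
    return name.encode()


async def upload_all(names: list[str]) -> list[int]:
    received: list[int] = []
    if not names:
        return received
    image = await prepare(names[0])
    for next_name in names[1:]:
        # Start the upload, then prepare the next image while the server works.
        in_flight = asyncio.create_task(upload(image))
        image = await prepare(next_name)
        received.append(await in_flight)
    # The last image has nothing left to overlap with.
    received.append(await upload(image))
    return received


sizes = asyncio.run(upload_all(["a.jpg", "bb.jpg", "ccc.jpg"]))
print(sizes)
```

Preparing image N+1 does not depend on the upload of image N, which is what makes the overlap safe.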
Input/Output
Input/Output (IO) is a common operation that takes time. Your code might have to wait for:
a remote service, such as a payment provider or cloud storage, to return
a database query
a large file to be read into memory
machine learning inference
Kafka partitions to respond
A synchronous approach
When programming synchronously, the executing thread must wait (block) for the entire IO operation. In a single-threaded program, nothing else gets done while you wait.
In a threaded program using a thread pool, long IO wait times make it more likely that all threads block.
For example, in a pool of 32 threads, if the average blocking time is 2s and 32 requests are received within 2s, all 32 threads will be blocked. A 33rd request will have to wait on a thread becoming available. Queued requests inherit the delay.
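The saturation arithmetic above can be reproduced at a smaller scale. This sketch assumes a pool of 4 threads and a 0.2s blocking time in place of 32 threads and 2s; `handle_request` is a hypothetical handler, and `time.sleep` stands in for blocking IO.

```python
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4        # scaled down from the 32 threads in the text
BLOCK_SECONDS = 0.2  # scaled down from the 2s average blocking time


def handle_request(submitted_at: float) -> float:
    # Queue wait: time between submission and a worker picking up the request.
    queue_wait = time.perf_counter() - submitted_at
    time.sleep(BLOCK_SECONDS)  # blocking IO stand-in; the thread is held
    return queue_wait


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    # One request more than the pool can serve at once.
    futures = [
        pool.submit(handle_request, time.perf_counter())
        for _ in range(POOL_SIZE + 1)
    ]
    waits = [f.result() for f in futures]

for i, wait in enumerate(waits, start=1):
    print(f"request {i}: waited {wait:.2f}s for a thread")
```

The first four requests start almost immediately. The fifth inherits roughly one full blocking period as queue wait, which is the kind of wait-time signal worth recording.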
If a downstream system is waiting for those 32 requests to complete, backpressure is now building and propagating through your distributed system.
The slow remote service slows your downstream service, which is likely to slow any further downstream services. One of those could be a user interface, giving your end user a degraded experience.
In a synchronous system you want an early warning when production moves towards backpressure. Important signals to track are wait times, configuration (the choice of 32 threads rather than another number), and resource usage (all 32 threads being occupied).
Using asynchrony
Python's asynchrony provides:
The event loop — a scheduler within Python
Coroutines — functions that can suspend and resume
Tasks — scheduled coroutines managed by the event loop
Await points — explicit yield boundaries where a coroutine gives up control of the thread to the event loop
The scheduler will resume tasks whenever the condition they were waiting for becomes true. IO readiness is one such condition.
You mark a function as an asynchronous coroutine by defining it with async def. Tasks are created with asyncio.create_task, which schedules a coroutine on the event loop. await suspends the current coroutine at an explicit yield point, returns control to the event loop, and resumes the coroutine later when the awaited operation has completed.
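The pieces named above fit together as in the following sketch. The `fetch` coroutine and the `events` log are hypothetical, added only to make the order of suspension and resumption visible.

```python
import asyncio

events: list[str] = []


async def fetch(name: str, delay: float) -> str:
    # A coroutine: defined with async def, able to suspend and resume.
    events.append(f"{name} started")
    await asyncio.sleep(delay)  # await point: control returns to the event loop
    events.append(f"{name} resumed")
    return name


async def main() -> list[str]:
    # create_task schedules each coroutine; neither runs until main yields.
    slow = asyncio.create_task(fetch("slow", 0.1))
    fast = asyncio.create_task(fetch("fast", 0.05))
    events.append("tasks created")
    # Awaiting suspends main and lets the event loop run the tasks.
    return [await slow, await fast]


results = asyncio.run(main())
print(events)
print(results)
```

The log shows "tasks created" before either task starts, and "fast resumed" before "slow resumed": the event loop resumes each task when its own wait is over, not in the order the tasks were created.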
Python asynchrony gives you concurrency, not parallelism
Concurrency is multiple units of work in progress at the same time. They may or may not run simultaneously. This is what Python async gives you.
Parallelism is multiple units of work executing at the same time on different CPU cores. Threads or processes give you this. async does not give you this.
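The distinction can be observed directly. In this sketch (all function names are hypothetical), several coroutines record the thread they run on, and a coroutine that blocks without awaiting delays another coroutine that only asked for a 0.01s pause.

```python
import asyncio
import threading
import time


async def record_thread(seen: set[int]) -> None:
    # Record the thread before and after an await point.
    seen.add(threading.get_ident())
    await asyncio.sleep(0.01)
    seen.add(threading.get_ident())


async def hog() -> None:
    # Blocking call with no await: the event loop cannot run anything else.
    time.sleep(0.2)


async def short_nap() -> float:
    start = time.perf_counter()
    await asyncio.sleep(0.01)
    return time.perf_counter() - start


async def main() -> tuple[set[int], float]:
    seen: set[int] = set()
    await asyncio.gather(*(record_thread(seen) for _ in range(3)))

    nap_task = asyncio.create_task(short_nap())
    await asyncio.sleep(0)  # yield once so short_nap starts its timer
    await hog()             # holds the only thread for 0.2s
    nap = await nap_task
    return seen, nap


seen, nap = asyncio.run(main())
print(f"threads used: {len(seen)}, 0.01s nap took {nap:.2f}s")
```

All three coroutines report the same thread, and the short nap takes about as long as the blocking call because nothing else can run until the thread is released.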
Python async gives you concurrency, but not simultaneous execution, because:
asyncio runs on one thread
the Python...