How we made parallel pytest safe for multi-tenant agent swarms

Pytest-xdist is easy to love until two runs touch the same schemas. Here is how the EquatorOps engineering team reworked its multi-database test environment to stay fast and safe under concurrent human and AI development.

April 21, 2026 · 9 min read · EquatorOps Engineering

Testing Infrastructure Pytest Multi-Tenant SaaS Reliability

Parallel test runs feel solved right up until the day they stop being solved.

For a while our backend test environment looked healthy. We had pytest-xdist, modular fixtures, Factory Boy, a structured conftest.py, separate platform and tenant databases, and a ~/run_tests entrypoint that auto-loaded the test environment. One engineer running a focused slice of the suite got fast, stable results. Running a few workers in parallel was routine.

Then how we worked changed.

We started running more tests concurrently across more contexts: multiple tmux panes, background validation passes, longer fixture-heavy suites, and eventually multiple AI agents firing test invocations seconds apart against the same database host. That is when a structural problem we had been getting away with surfaced: even with xdist in place, two independent pytest invocations could collide on the same PostgreSQL schemas and produce LockNotAvailable errors, statement timeouts, or worse, silent fall-through writes into the shared public schema.

The fix was not “raise the timeout again.” It was treating the test environment as shared infrastructure: per-invocation namespacing, fail-closed cleanup, deterministic connection labeling, and a few subtle bug fixes that mattered more than the headline change.

This post is about what broke, why xdist alone does not solve it, and the specific mechanisms we ended up needing.

Why xdist isn’t enough

pytest-xdist gives you worker parallelism inside one invocation. That is not the same as making multiple independent invocations safe on the same database host.

Our original isolation was schema-per-worker:

test_gw0
test_gw1
test_gw2

That works fine until two entirely separate ~/run_tests calls both spin up gw0. Both runs then try to drop, recreate, migrate, and seed the same physical schema. The result is structural lock contention that looks like flakiness until you see the pattern.
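
The collision can be reproduced on paper. The following is a minimal sketch, not the project's actual fixture code: `legacy_schema_name` is a hypothetical helper that stands in for any naming scheme that depends only on the xdist worker ID.

```python
def legacy_schema_name(worker_id: str) -> str:
    """Schema-per-worker naming: depends only on the xdist worker ID."""
    return f"test_{worker_id}"


# Two independent invocations, each with three xdist workers (gw0..gw2).
run_a = {legacy_schema_name(f"gw{i}") for i in range(3)}
run_b = {legacy_schema_name(f"gw{i}") for i in range(3)}

# Every schema name is shared, so both runs would drop, recreate,
# migrate, and seed the same physical schemas.
collisions = sorted(run_a & run_b)
print(collisions)  # ['test_gw0', 'test_gw1', 'test_gw2']
```

Nothing in the name identifies the invocation, so the overlap is total rather than occasional.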

A multi-tenant SaaS test setup typically touches at least two databases:

a platform database for users, orgs, roles, and global control-plane state

one or more tenant databases for the operational data each customer actually works with

Once tests add multi-tenant isolation cases or extra tenant DBs, “isolated by worker name” becomes “isolated only when other invocations stay out of the way.” For a small human team that is an occasional flake. For a swarm of AI agents kicking off concurrent runs and inheriting shell state from each other, it is the normal operating envelope.

The hard lesson was simple: isolation by worker ID holds only within one invocation. Nothing about it protects two independent invocations that share a database host.

What we changed

Every pytest invocation now gets a namespace token, and every worker composes its schema from namespace + worker_id:

test_{namespace}_{worker_id}

A namespace token from ~/run_tests looks like this:

p18234t1745178234r3af93d71

That is p<pid>t<timestamp>r<random>, with 32 bits of /dev/urandom entropy in the random suffix. Two concurrent invocations therefore look like:

test_p18234t1745178234r3af93d71_gw0
test_p18234t1745178234r3af93d71_gw1
test_p20491t1745178241r9c12d4fe_gw0
test_p20491t1745178241r9c12d4fe_gw1

The cross-invocation collision is now impossible by construction. The interesting work was in the parts of the system that had to learn this rule, and in two subtle bugs we hit along the way.
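
The naming rule can be sketched in a few lines of Python. This is an illustration under assumptions, not the runner itself (which is a shell script): the function names are hypothetical, and the reading of the token as process ID, Unix timestamp, and 8 hex characters of randomness is inferred from the example tokens above. The 63-byte check reflects PostgreSQL's default identifier limit, beyond which names are truncated.

```python
import os
import secrets
import time

# PostgreSQL truncates identifiers longer than 63 bytes by default,
# which could make two distinct schema names collide silently.
MAX_PG_IDENTIFIER_BYTES = 63


def new_namespace_token() -> str:
    """Build a token shaped like p<pid>t<timestamp>r<random> (assumed layout)."""
    pid = os.getpid()
    epoch = int(time.time())
    rand = secrets.token_hex(4)  # 4 random bytes = 32 bits = 8 hex characters
    return f"p{pid}t{epoch}r{rand}"


def schema_name(namespace: str, worker_id: str) -> str:
    """Compose test_{namespace}_{worker_id} and refuse names that would truncate."""
    name = f"test_{namespace}_{worker_id}"
    if len(name.encode("utf-8")) > MAX_PG_IDENTIFIER_BYTES:
        raise ValueError(f"schema name too long for PostgreSQL: {name!r}")
    return name


if __name__ == "__main__":
    token = new_namespace_token()
    for worker in ("gw0", "gw1"):
        print(schema_name(token, worker))
```

With a 26-character token like the one shown, `test_` plus the token plus `_gw0` comes to 35 characters, comfortably under the limit.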

The runner generates and validates the namespace

~/run_tests generates a fresh token per invocation by default, even if the calling shell already has HONEY_TEST_SCHEMA_NAMESPACE exported from a previous run. That stale-export defense matters: a debug session that left the variable set could otherwise cause the next “normal” run to silently reuse that namespace and collide with any other run that inherited it.

A scoped debug override (HONEY_TEST_SCHEMA_NAMESPACE_FORCE=mydebug) is allowed but tightly restricted: lowercase alphanumeric, no underscores, no hyphens, max 32 characters. The runner validates the override in shell before the value reaches any report file path or environment export.
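
The resolution and validation rules above can be sketched as follows. The real runner does this in shell; this Python version is only an illustration, and `validate_override` and `resolve_namespace` are hypothetical names. The two environment variable names come from the text.

```python
import re
from typing import Callable, Mapping

# Lowercase alphanumeric only, 1 to 32 characters: no underscores, no hyphens.
_OVERRIDE_RE = re.compile(r"[a-z0-9]{1,32}")


def validate_override(value: str) -> str:
    """Return the override unchanged, or raise if it breaks the rules."""
    if not _OVERRIDE_RE.fullmatch(value):
        raise ValueError(
            f"invalid namespace override {value!r}: "
            "expected 1-32 lowercase alphanumeric characters"
        )
    return value


def resolve_namespace(env: Mapping[str, str], generate: Callable[[], str]) -> str:
    """Pick the namespace for one invocation.

    HONEY_TEST_SCHEMA_NAMESPACE is deliberately ignored here, so a value
    left over from an earlier run cannot leak into this one. Only the
    explicit *_FORCE variable can pin a namespace, and only if it validates.
    """
    forced = env.get("HONEY_TEST_SCHEMA_NAMESPACE_FORCE")
    if forced:
        return validate_override(forced)
    return generate()
```

Validating before the value is used anywhere else is what keeps a malformed override out of file paths and exported variables; using `fullmatch` rather than `match` also rejects values with a trailing newline.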

conftest.py covers direct pytest

bin/run_tests is not the only way pytest gets invoked. To keep the design coherent under direct pytest calls, tests/conftest.py:

preserves any namespace already exported by the runner

generates one if none exists

propagates the controller’s namespace into every xdist worker subprocess via the pytest_configure_node hook (node.workerinput["schema_namespace"])
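
A conftest.py following those three rules might look roughly like this. It is a sketch that assumes pytest-xdist is installed; `_new_namespace` and the `_schema_namespace` attribute are hypothetical names introduced for illustration, and the token layout is inferred from the example earlier in the post. `pytest_configure_node` and `workerinput` are the pytest-xdist hook and attribute named in the list above.

```python
# tests/conftest.py (sketch)
import os
import secrets
import time

ENV_VAR = "HONEY_TEST_SCHEMA_NAMESPACE"


def _new_namespace() -> str:
    # Hypothetical generator mirroring the p<pid>t<timestamp>r<random> shape.
    return f"p{os.getpid()}t{int(time.time())}r{secrets.token_hex(4)}"


def pytest_configure(config):
    # Under xdist, worker processes carry a `workerinput` dict on their
    # config; the controller (and a plain non-xdist run) does not.
    workerinput = getattr(config, "workerinput", None)
    if workerinput is not None:
        # Worker: adopt the controller's namespace instead of minting one.
        namespace = workerinput["schema_namespace"]
    else:
        # Controller: keep the runner's export if present, else generate.
        namespace = os.environ.get(ENV_VAR) or _new_namespace()
    os.environ[ENV_VAR] = namespace
    config._schema_namespace = namespace  # hypothetical attribute name


def pytest_configure_node(node):
    # Called on the controller for each worker node before it starts;
    # whatever is placed in node.workerinput is sent to that worker.
    node.workerinput["schema_namespace"] = node.config._schema_namespace
```

The point of the worker branch is that every worker in one invocation ends up with the same namespace, so only the worker ID differs between their schema names.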

Without that propagation, each...
