Testing a Kafka Proxy: Taming Millions of Permutations


Matt Searle · June 25, 2026 · 20 min read

60-second summary

Proxying Kafka sounds easy, but the protocol is ~90 APIs and 600+ wire formats. Every broker, client, and proxy setting multiplies that into ~9 million permutations. You can't brute-force it, so we split it into three axes and use pairwise expansion. Scenarios are written once, run against real Kafka clients, and assert every skip. One run: 500,000+ real API calls, ~30 deployments, under network chaos, green in 30 minutes.

When people are new to Kafka, I tell them to hold onto one fact: at its core, Kafka is brutally simple. You append to a file, and you read that data back, over the network. That part is a few hundred lines of code. The other ~300,000 lines exist to make that one operation reliable when it's distributed at scale, which, folks, is much harder than it looks.

That simplicity is a strength. A lot of Kafka's famous resilience and scaling comes from its laser focus on doing one thing well. Don't half-ass two things; whole-ass one thing.

So when you set out to proxy Kafka traffic, where you aren't even on the hook for persisting the data, surely things are straightforward?

🚫 "It's just produce and fetch, with the odd low-volume admin call. Right?"

Well. Yes and no.

Produce and fetch are a fraction of the protocol

Under the hood, Kafka clients and brokers talk a lot. Beyond produce and fetch there's a swathe of constantly-called APIs, version negotiation, authentication, leader discovery, consumer-group coordination, transactions, topology discovery, and almost every one supports multiple versions, so old and new clients can share a cluster.

~90 APIs × request + response × many versions → 600+ wire formats

Around 600 distinct request/response shapes to get right, before a single feature is switched on.

The version differences aren't cosmetic. Behaviour and feature access change between them, so a proxy very much has to care which version each client is using.

And in a test run like ours, by raw call count, produce/fetch are far from the whole story (by bytes they still dominate, but that's not what a proxy spends most of its decisions on). Everything else forms a significant bulk: the ApiVersions negotiation alone fires tens of thousands of times in a single run as connections churn. A large amount of what a proxy handles is conversation about the data, not the data itself. (A single consumer's poll() already fans out into a surprising number of these calls before it reads a byte.)

Brokers and clients add their own variables

The 600+ wire formats are just the start. Brokers and clients each have levers that change the conversation:

Side     Varies
Broker   listeners; security protocol (PLAINTEXT, SSL/mTLS, SASL_PLAINTEXT, SASL_SSL); controller (ZooKeeper or KRaft); topology
Client   connection and partition-leader failover handling (often 10+ broker connections at once); compression; batching; which APIs and versions it triggers. And there are two main families with their own quirks, Java and librdkafka (which underpins most non-JVM clients)

Some choices are static, baked in at startup. Others shift from one request to the next, or during the lifespan of a single connection.
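To make the multiplication concrete, here is a minimal sketch of how a handful of levers like those above combine, and how a greedy all-pairs (pairwise) selection shrinks the set of configurations to run. The axis names and values in `AXES` are illustrative assumptions, and `pairs_of` and `pairwise` are hypothetical helpers. This is a generic textbook-style greedy algorithm, not the real test matrix and not necessarily the expansion the test suite itself uses.

```python
from itertools import combinations, product
from math import prod

# Hypothetical axes, loosely based on the levers listed above.
# The values are illustrative, not the real test matrix.
AXES = {
    "security": ["PLAINTEXT", "SSL", "SASL_PLAINTEXT", "SASL_SSL"],
    "controller": ["ZooKeeper", "KRaft"],
    "client": ["java", "librdkafka"],
    "compression": ["none", "gzip", "snappy", "lz4", "zstd"],
}


def pairs_of(combo, names):
    """Return every two-axis value pairing that one full combination covers."""
    items = list(zip(names, combo))
    return set(combinations(items, 2))


def pairwise(axes):
    """Greedily pick combinations until every value pair appears at least once."""
    names = list(axes)
    candidates = list(product(*axes.values()))

    # Everything that must be covered: each value of one axis paired
    # with each value of every other axis.
    uncovered = set()
    for combo in candidates:
        uncovered |= pairs_of(combo, names)

    chosen = []
    while uncovered:
        # Take the combination that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda c: len(pairs_of(c, names) & uncovered))
        chosen.append(dict(zip(names, best)))
        uncovered -= pairs_of(best, names)
    return chosen


if __name__ == "__main__":
    full = prod(len(values) for values in AXES.values())
    suite = pairwise(AXES)
    print(f"full cartesian product: {full} combinations")
    print(f"pairwise suite:         {len(suite)} combinations")
    for config in suite[:3]:
        print(config)
```

The trade-off is that a pairwise suite guarantees every two-way interaction is exercised at least once, but not every three-way or higher interaction. The gap between the full product and the pairwise suite widens quickly as axes are added.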
So a real test of "produce and consume" isn't one test at all. We run it across Java clients reaching back several major versions plus librdkafka, against brokers from older Kafka through modern KRaft, and even non-Apache brokers like Redpanda. The same scenarios and the same assertions, with capability-aware skips where a broker lacks a feature, over wildly different wire conversations underneath.

The proxy adds another layer of variation

Finally, the proxy isn't a transparent pane of glass. You install it precisely because you want it to have an effect, from simple address translation up to field-level encryption and topic virtualisation. And most of those features behave differently depending on every other variable in play: API version, broker topology, client config. Conduktor Gateway has a lot of them:

- Managed or delegated authentication and authorisation
- Routing and load-balancing strategies, while staying correct under Kafka's strict per-partition leadership model
- Access and cluster virtualisation
- Multiple-cluster access, switching, and failover
- Topic virtualisation and mapping
- Protocol-level safeguards and guardrails across a large set of APIs
- Encryption (field-level and full-payload)
- Advanced traffic observation
- Data-quality enforcement
- Deliberate chaos injection (for testing)

Each has its own options, some static config, others toggled while traffic is flowing. A single test run exercises deployments like:

Deployment               Auth / front
1 gw → 1 broker          mutual-TLS client certs
2 gw → 3 brokers         OIDC via Keycloak
5 gw ⇄ 3 brokers (SNI)   multi-cluster + live cluster-switching
3 gw ⇄ 5 brokers (SNI)   HAProxy proxy-protocol front

…each running the relevant...
