Federating Clusters for Zero-Downtime Kubernetes

Dominik Táskai, Linkerd Ambassador Jun 24, 2026 • 15 min read

Every multi-region setup eventually meets the same awkward moment: a whole cluster goes away, and the identical copy of your service running two regions over might as well not exist, because nothing is wired to treat them as one thing. Failover becomes a runbook: restore, repoint DNS, and wait out an outage that, on paper, you’d already paid to survive. Linkerd’s multicluster extension closes that gap by letting several clusters present a service as a single, load-balanced endpoint.

The part that the official tasks gloss over is that a real platform almost never picks one multicluster mode. Some services want federation (same service everywhere, one endpoint, automatic failover), while others want mirroring (reach a specific remote service by name). And you frequently want both patterns living on the same set of links.

The docs walk through each mode on its own. This post wires all three together across three GKE clusters, with a full-mesh link topology, a chaos test that takes out an entire cluster, and scripts you can clone and run on a fresh GCP project.

Companion repo: Every script referenced here lives in this repository. Feel free to clone it, set your project ID, and run it.

Linkerd multicluster modes: Gateway, flat, and federated

Linkerd’s multicluster extension supports three modes. The nice thing is they’re not mutually exclusive: on the same set of linked clusters, the mode is chosen per service via a label.
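To make per-service selection concrete, here is a minimal sketch that maps each mode to its label (the label values are the ones listed in the mode table in this section) and prints the kubectl command that would apply it. The context names `west` and `east`, the namespace `demo`, and both helper function names are assumptions for illustration, not names taken from the companion repo. The commands are only printed, never executed.

```shell
# Map a multicluster mode to the service label that selects it.
label_for_mode() {
  case "$1" in
    gateway)   echo "mirror.linkerd.io/exported=true" ;;
    flat)      echo "mirror.linkerd.io/exported=remote-discovery" ;;
    federated) echo "mirror.linkerd.io/federated=member" ;;
    *)         echo "unknown mode: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the kubectl command that would label a service.
# Arguments: kube context, namespace, service name, mode.
# Context and namespace values below are hypothetical.
print_label_cmd() {
  label="$(label_for_mode "$4")" || return 1
  echo "kubectl --context=$1 -n $2 label svc/$3 $label"
}

print_label_cmd west demo frontend federated
print_label_cmd west demo api flat
print_label_cmd east demo analytics gateway
```

Because the mode lives on the Service object rather than on the link, switching a service from one mode to another is a relabel, not a relink.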
| Mode | Label | What happens | Network requirement |
|---|---|---|---|
| Hierarchical (gateway) | mirror.linkerd.io/exported=true | Service mirrored as <svc>-<cluster>, traffic routed through a gateway | Gateway IP reachable |
| Flat (pod-to-pod) | mirror.linkerd.io/exported=remote-discovery | Service mirrored as <svc>-<cluster>, traffic goes directly to remote pods | Flat network (pod IPs routable) |
| Federated | mirror.linkerd.io/federated=member | All same-name services unioned into <svc>-federated, load balanced across all clusters | Flat network (pod IPs routable) |

The distinction that matters operationally is that hierarchical mirroring works on any network, because only the gateway IP needs to be reachable, while flat and federated modes need real pod-to-pod connectivity. On GCP, VPC-native GKE clusters on peered VPCs give you that flat network for free. So you can run federated services for your core workloads over a flat network and still mirror a specialized service through a gateway from a cluster that isn’t on that network. Most platform teams I’ve seen end up with exactly this kind of mix.

Multi-region architecture: GKE cluster setup

We have three GKE clusters across three regions, fully linked to each other (six directional links total). Three demo services run on them, each using a different multicluster mode:

- frontend is federated and runs in all three clusters. A single federated frontend service in each cluster load-balances across all nine pods (3 replicas × 3 clusters). When a cluster goes down, the remaining six pods absorb the traffic with no application changes.
- api is flat-mirrored and runs in west and east. The north cluster consumes it as api-west and api-east, which are explicit remote service names with traffic sent straight to the remote pods. This is what you reach for when the client needs to decide which backend it talks to, for example to keep a request in-region for data locality.
- analytics is gateway-mirrored and runs only in east. It is exported through the Linkerd gateway, so west and north reach it as analytics-east-gw without needing flat-network connectivity to east’s pods. It’s here mainly to prove that gateway mode coexists with flat and federated modes on the same links.

Deployment prerequisites: GKE, Linkerd, and CLI tools

- A GCP account (free-tier credits cover this; use three standard clusters with small node pools)
- gcloud CLI, authenticated (gcloud auth login)
- kubectl v1.28+
- step CLI, brew install step (for certificate generation)
- helm v3
- ~30 minutes for the full setup

The infra script enables the compute and container APIs for you, so a brand-new project works out of the box.

Step 0: Configure

Clone the repo, create a local .env file from the example file, and customize it for your GCP project. The defaults are enough for the rest of the demo, so in most cases you only need to change the project ID.

    git clone <companion-repo-url>
    cd blog-linkerd-federation
    cp env.example .env
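Before editing .env, it can help to confirm that the CLIs from the prerequisites list are actually on your PATH. The following is a minimal sketch, not part of the companion repo: the function name `check_tools` is hypothetical, and it only checks that each binary exists. It does not verify versions such as kubectl v1.28+ or helm v3.

```shell
# Report which CLIs are on PATH; returns non-zero if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok:      $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The four CLIs named in the prerequisites list.
check_tools gcloud kubectl step helm \
  || echo "Install the missing tools before continuing."
```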

Open .env and set at least your project ID. The file ships with sensible defaults for everything else:

    export GCP_PROJECT="your-project-id"

    export REGION_WEST="us-central1"
    export REGION_EAST="us-east1"
    export REGION_NORTH="europe-west1"

    # One zone per region. We pin node-locations to a single zone so num-nodes is
    # the TOTAL node count; see the cost note below for why this matters.
    export...
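Once .env is filled in, a small guard can catch an unset or placeholder value before any script calls gcloud. This is a minimal sketch that assumes only the four variables shown above (the real file defines more). The function name `require_env` is hypothetical and does not come from the companion repo.

```shell
# Fail if any named variable is unset, empty, or still the placeholder value.
require_env() {
  status=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (names are hardcoded below).
    eval "value=\${$name:-}"
    if [ -z "$value" ] || [ "$value" = "your-project-id" ]; then
      echo "not set: $name" >&2
      status=1
    fi
  done
  return "$status"
}

# Load .env from the current directory if it exists, then validate.
if [ -f .env ]; then . ./.env; fi
require_env GCP_PROJECT REGION_WEST REGION_EAST REGION_NORTH \
  || echo "Edit .env and fill in the values listed above."
```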
