American Express: Cell-Based Architecture for Resilient Payment Systems

Cell-Based Architecture for Resilient Payment Systems - American Express Technology

Published June 11, 2026

Architecture, Payments, Resiliency

The American Express core payments ecosystem is a global platform relied on by Card Members and partners around the world. Every day, it processes live payment transactions that require high availability, low latency, and predictable performance.

Resiliency is not an afterthought; it has been encoded into the system’s design from the beginning. Localized faults are contained within defined boundaries, and recovery is designed to be fast and predictable.

To achieve this, the platform is built around a cell-based architecture that isolates failures, maintains low-latency processing, and scales capacity without expanding the failure domain.

This blog outlines the principles that guide this architecture and how they help us build a resilient payments platform at global scale.

Core Payments Ecosystem

In 2018, we started a journey to modernize our core payments ecosystem. This platform processes live card and payment transactions and is mission-critical to our Card Members and partners.

As we modernized the platform, resiliency remained a primary design requirement. We needed an architecture that could continue processing transactions reliably, even when individual components failed. Our approach was heavily influenced by our historical design patterns, which predate the term “cell-based architecture” but share many of the same principles.

Our new platform targeted cloud-native technologies, which meant we needed to think differently about how we designed for resiliency and scalability.

In the next sections, we’ll discuss some of the design principles we follow in our core payments ecosystem and how they not only improve our ability to process payments reliably but also help us reduce latency and scale more easily.

What is Cell-based Architecture?

Cell-based architecture is an architecture pattern that has gained popularity in the cloud-native distributed systems space.

The idea behind the concept is to group related microservices, databases, and other components into independent instances called cells. Each cell is able to function independently without reliance on other cells.

In this diagram: Each cell contains its own services and data so a failure stays within that cell instead of spreading across the platform.

The primary benefit of cell-based architecture is reducing the blast radius of failures. With each cell being independent, if one cell experiences issues, it doesn’t impact the others. The trade-off is that cell-based architecture often increases management overhead and architectural complexity, as it requires careful design to ensure that cells are truly independent and that data is appropriately localized.

However, for mission-critical systems like payments, we find that the benefits of a reduced blast radius and improved resiliency outweigh the additional complexity.

We’ve also found that when implemented well, a cell-based architecture can help platforms reduce latency (by reducing external dependencies and network hops) and improve scaling by introducing additional independent cells.
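As a rough illustration of the blast-radius argument above, the sketch below assumes transactions are assigned to cells by hashing a routing key, then measures what fraction of keys a single failed cell affects. The source does not describe how traffic is actually distributed across cells; `assign_cell`, `blast_radius`, and the `merchant-N` keys are hypothetical names used only for this example.

```python
import hashlib


def assign_cell(routing_key: str, cell_count: int) -> int:
    """Map a routing key to a cell index with a stable hash (assumed scheme)."""
    digest = hashlib.sha256(routing_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % cell_count


def blast_radius(routing_keys: list[str], cell_count: int, failed_cell: int) -> float:
    """Return the fraction of keys whose traffic lands on the failed cell."""
    affected = sum(
        1 for key in routing_keys if assign_cell(key, cell_count) == failed_cell
    )
    return affected / len(routing_keys)


if __name__ == "__main__":
    # Hypothetical routing keys; any evenly distributed identifier would do.
    keys = [f"merchant-{i}" for i in range(10_000)]
    for cells in (2, 4, 8):
        print(cells, round(blast_radius(keys, cells, failed_cell=0), 3))
```

With evenly spread keys, the affected fraction is roughly 1/N for N cells, so adding independent cells adds capacity while shrinking the share of traffic any one failure can touch. A plain modulo remaps most keys whenever the cell count changes, so it is suitable for illustration only.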

How We Follow Cell-Based Architecture

Each instance of our core payments ecosystem is designed as a cell, which:

Is an independently deployable unit that can process payments on its own.

Has its own set of microservices, databases, and other components.

Is a single failure domain, meaning that a failure inside one cell doesn’t cascade beyond the cell boundary.

Can be taken out of rotation for maintenance or in response to failures without impacting the overall system or requiring coordination with other cells.

Has no synchronous cross-cell dependencies in the critical path of processing transactions.
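The properties listed above can be sketched as a small router that sends each transaction to exactly one cell and skips cells that are out of rotation. This is a simplified illustration, not the platform's actual routing layer: `Cell`, `CellRouter`, the round-robin policy, and the cell names are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """One independent cell; everything it needs is assumed to be local."""

    name: str
    in_rotation: bool = True
    processed: list[str] = field(default_factory=list)

    def process(self, transaction_id: str) -> str:
        # No call to any other cell happens on this path.
        self.processed.append(transaction_id)
        return f"{self.name}:approved:{transaction_id}"


class CellRouter:
    """Round-robin router that skips cells taken out of rotation."""

    def __init__(self, cells: list[Cell]) -> None:
        self._cells = cells
        self._next = 0

    def take_out_of_rotation(self, name: str) -> None:
        # Only the named cell changes state; other cells are not consulted.
        for cell in self._cells:
            if cell.name == name:
                cell.in_rotation = False

    def route(self, transaction_id: str) -> str:
        # Try each cell at most once, starting from the round-robin cursor.
        for _ in range(len(self._cells)):
            cell = self._cells[self._next % len(self._cells)]
            self._next += 1
            if cell.in_rotation:
                return cell.process(transaction_id)
        raise RuntimeError("no cells in rotation")
```

Taking a cell out of rotation here is a single local state change. The remaining cells keep serving traffic without being notified, which mirrors the "no coordination with other cells" property.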

A cell is defined by its failure boundaries rather than a specific infrastructure construct. In practice, cells never span multiple regions: everything required to process transactions (DNS, databases, microservices, and supporting services) remains local within that boundary.
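One way to make this boundary checkable is to declare each cell's dependencies and flag any that live outside it. The following is a minimal sketch under assumed names: the `Dependency` record, the `locality_violations` helper, and the cell and region labels are hypothetical and not taken from the American Express platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    """A component that the transaction path relies on (hypothetical model)."""

    name: str
    kind: str    # e.g. "dns", "database", "microservice"
    region: str  # where the component runs
    cell: str    # which cell owns it


def locality_violations(cell: str, region: str, deps: list[Dependency]) -> list[str]:
    """Return the names of dependencies that sit outside the cell or its region."""
    return [d.name for d in deps if d.region != region or d.cell != cell]


if __name__ == "__main__":
    deps = [
        Dependency("resolver", "dns", "region-1", "cell-a"),
        Dependency("auth-service", "microservice", "region-1", "cell-a"),
        Dependency("shared-rates-db", "database", "region-2", "cell-b"),
    ]
    # Any name printed here would be a cross-cell or cross-region dependency.
    print(locality_violations("cell-a", "region-1", deps))
```

A check like this could run at deploy time, so that a dependency crossing the cell boundary is caught before it reaches the transaction path.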

To achieve this, we follow a set of core principles that guide our design decisions and help us ensure that our cells are truly independent and resilient.

Data and Processing Locality by Default

Processing payments requires data: currency rates, merchant category codes, and so on. Some data is static, while some data changes with each transaction.

Static & Semi-Static Data Replication

For static or semi-static data like currency rates and merchant category codes, we replicate that data to each cell.

In this diagram: Reference data is pushed into every cell ahead of time so transaction processing never needs a synchronous lookup to a central source.
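A minimal sketch of this push model, assuming versioned snapshots: a publisher copies reference data into a store inside each cell, and the transaction path reads only from that local store. `CellReferenceStore`, `publish_snapshot`, the key format, and the version numbers are hypothetical; the source does not describe the actual replication mechanism.

```python
from typing import Mapping


class CellReferenceStore:
    """Local, in-cell copy of reference data (hypothetical)."""

    def __init__(self) -> None:
        self._version = 0
        self._data: dict[str, str] = {}

    @property
    def version(self) -> int:
        return self._version

    def apply_snapshot(self, version: int, data: Mapping[str, str]) -> bool:
        # Ignore stale or duplicate pushes so replays are harmless.
        if version <= self._version:
            return False
        self._data = dict(data)
        self._version = version
        return True

    def lookup(self, key: str) -> str:
        # Local read only; raises KeyError instead of calling a central source.
        return self._data[key]


def publish_snapshot(
    version: int, data: Mapping[str, str], stores: list[CellReferenceStore]
) -> int:
    """Push one snapshot to every cell; return how many stores accepted it."""
    return sum(store.apply_snapshot(version, data) for store in stores)


if __name__ == "__main__":
    stores = [CellReferenceStore() for _ in range(3)]
    # Illustrative entry: merchant category code 5411 denotes grocery stores.
    publish_snapshot(1, {"mcc:5411": "Grocery Stores, Supermarkets"}, stores)
    print(stores[0].lookup("mcc:5411"))
```

Because every lookup is served from the cell's own copy, an outage of the publishing system delays updates but does not block transaction processing in this sketch.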

Rather than relying on a fall-through read to a centralized system of record during...
