Use your database to power state machines (2023)


Use your database to power state machines | Lawrence Jones


Most people are familiar with state machines and know their value. The average<br>state machine library can help you model states, prevent invalid transitions,<br>and produce diagrams that help even non-technical people understand how the code<br>behaves.

This article isn’t about making the case for state machines. It’s about how you<br>take the concept of a state machine and have it work alongside your database<br>models, leveraging your relational database (say Postgres or MySQL) to help you<br>build concurrent-safe and efficient software.

I first encountered this pattern when I joined GoCardless in 2015. Processing<br>bank payments is a multi-day affair and extremely stateful, so it’s no surprise<br>that the team eventually built a library called Statesman that<br>provided a state machine powered by underlying transition tables.

Most of GoCardless’ critical processes use Statesman, and by the time I left we<br>had transition tables with well over 10B rows. It became such an essential tool<br>that I’d argue it was a powerful competitive advantage. Statesman is even<br>reflected externally, such as in the GoCardless public API with endpoints<br>providing rich audit trails.

So if you use Ruby, grab Statesman and get going. If you want these<br>benefits but don’t work in Ruby, you can implement a small library in your<br>language of choice in just a few hours, provided you understand the nuances of<br>transition tables, locking, and edge cases.

That’s what I did the other day at incident.io. Here’s a guide so you can do it<br>too.

Transition tables

In most applications, you want to capture every<br>transition that a resource makes through a state machine and store it for<br>later analysis. In some situations, you may even consult the history of<br>transitions to decide which state to transition to next.

That’s why the first component of any database state machine is a<br>table to contain transitions.

-- Assuming the payments table is already here.<br>create table payments (/* ... */);

create table payment_transitions (<br>id text primary key default generate_ulid() not null,<br>payment_id text not null references payments(id),<br>to_state text not null,<br>most_recent boolean not null,<br>sort_key integer not null,<br>created_at timestamptz not null default now(),<br>updated_at timestamptz not null default now()<br>);

Each transition row states:

The parent resource it belongs to (payment_id)

The state this transition is moving into (to_state)

Whether this transition is the latest (most_recent)

An ordinal to allow for logical ordering of transitions (sort_key)

Now we add two index-backed unique constraints that are going to ensure our<br>state machine’s integrity.

create unique index idx_payment_transitions_by_parent_most_recent<br>on payment_transitions<br>using btree(payment_id, most_recent)<br>where most_recent;

create unique index idx_payment_transitions_by_parent_sort_key<br>on payment_transitions<br>using btree(payment_id, sort_key);

The first ensures we can only ever have a single transition that is<br>most_recent for any payment, a clear requirement if we ever want to sensibly<br>ask “what state is this payment in?”. The second ensures we get no duplicates on<br>sort_key: less important, but useful to ensure all transitions can be strictly<br>ordered.
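To make the effect of those two indexes concrete, here is a minimal in-memory sketch in Go. It is not the database and not the library described in this article: `Table`, `Insert`, `CurrentState` and `ErrUniqueViolation` are hypothetical names that only mimic what Postgres would enforce for us.

```go
package main

import (
	"errors"
	"fmt"
)

// Transition holds the payment_transitions columns that matter for integrity.
type Transition struct {
	PaymentID  string
	ToState    string
	MostRecent bool
	SortKey    int
}

// ErrUniqueViolation stands in for the error the database would raise.
var ErrUniqueViolation = errors.New("unique constraint violated")

// Table is a hypothetical in-memory stand-in for payment_transitions.
type Table struct{ rows []Transition }

// Insert rejects any row that would break one of the two unique indexes.
func (t *Table) Insert(row Transition) error {
	for _, r := range t.rows {
		if r.PaymentID != row.PaymentID {
			continue
		}
		// Mirrors the unique index on (payment_id, sort_key).
		if r.SortKey == row.SortKey {
			return fmt.Errorf("%w: (payment_id, sort_key)", ErrUniqueViolation)
		}
		// Mirrors the partial unique index that only covers most_recent rows.
		if r.MostRecent && row.MostRecent {
			return fmt.Errorf("%w: (payment_id, most_recent)", ErrUniqueViolation)
		}
	}
	t.rows = append(t.rows, row)
	return nil
}

// CurrentState answers "what state is this payment in?" by finding the
// single most_recent row, if the payment has any transitions at all.
func (t *Table) CurrentState(paymentID string) (string, bool) {
	for _, r := range t.rows {
		if r.PaymentID == paymentID && r.MostRecent {
			return r.ToState, true
		}
	}
	return "", false
}

func main() {
	table := &Table{}
	fmt.Println(table.Insert(Transition{"pay_1", "pending_submission", true, 10}))
	// A second most_recent row for the same payment is rejected.
	fmt.Println(table.Insert(Transition{"pay_1", "submitted", true, 20}))
	state, _ := table.CurrentState("pay_1")
	fmt.Println(state)
}
```

The rejected second insert is the point: a writer has to unset most_recent on the previous row before adding a new one, otherwise the write fails instead of leaving the payment with two "current" states.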

Expressing the state machine in code

Now that we have a table to store transitions, we need to build a library<br>that can express the state machine in our codebase.

In the Go library we’re using at incident.io, we start by expressing the<br>transition table as a domain model that our ORM can work with:

type PaymentTransition struct {<br>// ID is the unique ID of this transition<br>ID string `json:"id" gorm:"type:text;primaryKey;default:generate_ulid()"`<br>// PaymentID is a reference to the parent resource<br>PaymentID string `json:"payment_id"`<br>// ToState is where this transition was to<br>ToState PaymentState `json:"to_state"`<br>// MostRecent is true when this transition is the most recent<br>MostRecent bool `json:"most_recent"`<br>// SortKey provides ordinality over transitions<br>SortKey int `json:"sort_key"`<br>// CreatedAt is set upon transition creation<br>CreatedAt time.Time `json:"created_at"`<br>// UpdatedAt is set whenever this transition is modified<br>UpdatedAt time.Time `json:"updated_at"`<br>}

func (PaymentTransition) Parent() Payment {<br>return Payment{}<br>}

func (a PaymentTransition) State() PaymentState {<br>return a.ToState<br>}

// ParentColumn tells our machine library what column refers<br>// to the parent resource.<br>func (PaymentTransition) ParentColumn() (structField, column string) {<br>return "PaymentID", "payment_id"<br>}
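Those three methods suggest the shape of the contract a generic machine library could ask of any transition model. The sketch below is an assumption about how that might look, not the actual incident.io library: the `Transition` constraint, the `MostRecentQuery` helper, the `Payment` struct and the table name passed in are all hypothetical, and `PaymentTransition` is trimmed to the fields the example needs.

```go
package main

import "fmt"

// Payment is a hypothetical parent model, reduced to its ID.
type Payment struct{ ID string }

// PaymentState is the state type used by the transition model.
type PaymentState string

// PaymentTransition is a trimmed copy of the model, for illustration only.
type PaymentTransition struct {
	PaymentID string
	ToState   PaymentState
}

func (PaymentTransition) Parent() Payment { return Payment{} }

func (a PaymentTransition) State() PaymentState { return a.ToState }

func (PaymentTransition) ParentColumn() (structField, column string) {
	return "PaymentID", "payment_id"
}

// Transition is a hypothetical constraint: any model exposing its parent,
// its state, and the column that points at the parent can be driven by
// the same generic code.
type Transition[P any, S ~string] interface {
	Parent() P
	State() S
	ParentColumn() (structField, column string)
}

// MostRecentQuery builds the lookup for a parent's current transition,
// using ParentColumn so the helper never hardcodes a column name.
func MostRecentQuery[P any, S ~string, T Transition[P, S]](table string) string {
	var zero T
	_, column := zero.ParentColumn()
	return fmt.Sprintf("select * from %s where %s = $1 and most_recent", table, column)
}

func main() {
	fmt.Println(MostRecentQuery[Payment, PaymentState, PaymentTransition]("payment_transitions"))
}
```

A query of this shape filters on the parent column and most_recent, which is exactly what the partial unique index covers, so it can return at most one row per payment.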

You’ll notice PaymentState is a new type, which we define as a string type with named constants, Go’s usual stand-in for an enum:

type PaymentState string

const (<br>PaymentStatePendingSubmission PaymentState = "pending_submission"<br>PaymentStateSubmitted PaymentState = "submitted"<br>PaymentStatePaid PaymentState = "paid"<br>PaymentStateCancelled...
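With states defined as constants, preventing invalid transitions comes down to a lookup. The following is a sketch only: the `allowedTransitions` map and `CanTransition` function are hypothetical, the edges between states are assumed for illustration, and only the three states whose values appear in full above are used.

```go
package main

import "fmt"

type PaymentState string

const (
	PaymentStatePendingSubmission PaymentState = "pending_submission"
	PaymentStateSubmitted         PaymentState = "submitted"
	PaymentStatePaid              PaymentState = "paid"
)

// allowedTransitions lists, for each state, the states it may move to.
// These edges are an assumption made for this example.
var allowedTransitions = map[PaymentState][]PaymentState{
	PaymentStatePendingSubmission: {PaymentStateSubmitted},
	PaymentStateSubmitted:         {PaymentStatePaid},
}

// CanTransition reports whether moving from one state to another is allowed.
// States with no entry in the map are terminal.
func CanTransition(from, to PaymentState) bool {
	for _, next := range allowedTransitions[from] {
		if next == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(CanTransition(PaymentStatePendingSubmission, PaymentStateSubmitted))
	fmt.Println(CanTransition(PaymentStatePaid, PaymentStateSubmitted))
}
```

A check like this would run before writing a new transition row, so that an invalid move is refused in code while the unique indexes guard against concurrent writers.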
