Toast: Where PostgreSQL hides big values

radimm1 pts0 comments

TOAST: Where PostgreSQL hides big values | boringSQL

LIVE

boringSQL Live Session #2<br>17:30 CEST · 15:30 UTC · 11:30 AM ET<br>· Vote for your topic

Reserve your spot →

Table of Contents

The 2KB threshold

The four storage strategies

The shrinking procedure

Watching it happen

What the heap tuple looks like

Compression: pglz vs lz4

What it costs you

Limits

Following a tuple through the toaster

Keeping the page invariant safe

In earlier posts in this series we established that every heap tuple lives inside a strict 8KB page. Everything else is built on top of that hard limit: MVCC, HOT updates, and indexes that point at (page, line_pointer). And yet this still works:

CREATE TABLE docs (id int PRIMARY KEY, body jsonb);<br>INSERT INTO docs VALUES (1, (SELECT jsonb_agg(g) FROM generate_series(1, 100000) g));<br>That body value is somewhere north of half a megabyte. The heap page is still 8KB. Both statements are true at the same time, and the mechanism that makes them coexist is TOAST : The Oversized-Attribute Storage Technique.

The 2KB threshold

The page is 8KB, but PostgreSQL starts shrinking tuples at around 2KB. The goal is to keep at least four tuples per page, so anything bigger triggers TOAST.

Two constants are at play. TOAST_TUPLE_THRESHOLD decides when the toaster runs; TOAST_TUPLE_TARGET is the size it tries to shrink the tuple down to. Both default to 2032 bytes on a standard 8KB build. The threshold is fixed at compile time; the target is per-table tunable via the toast_tuple_target storage parameter.

When an INSERT or UPDATE produces a tuple wider than the threshold, PostgreSQL doesn't wait to see whether the row would actually fail to fit on a page. It starts compressing and relocating attributes one at a time, and stops the moment the tuple drops below the target. A row only modestly over the line often gets away with one cheap compression pass; a much wider one walks the full procedure.

Only variable-length attributes are candidates. A bigint is always 8 bytes, so there is nothing to compress and nowhere to move it. A text, bytea, jsonb, numeric, or array, on the other hand, is a varlena: a length-prefixed structure PostgreSQL can freely rearrange.

The four storage strategies

Each variable-length column has a storage strategy : a flag in pg_attribute.attstorage that answers two yes/no questions for the toaster. Can this value be compressed? Can it be moved out-of-line? PostgreSQL picks a default based on the column's type, and you can change it with ALTER TABLE ... ALTER COLUMN ... SET STORAGE. There are four valid values:

StrategyCompress?Out-of-line?Default for<br>PLAINnonofixed-width types<br>EXTENDEDyesyesmost varlena types<br>EXTERNALnoyes(opt-in)<br>MAINyeslast resort only(opt-in)

EXTENDED is what you get for text, bytea, jsonb, and friends unless you ask for something else. The other three are levers for specific tradeoffs:

EXTERNAL skips compression entirely. Useful when the column is already compressed (images, gzipped blobs), or when you plan to use substring/byte-range reads, which work better on uncompressed out-of-line data.

MAIN tries hard to keep the value in the main heap tuple even after compression. It only moves out-of-line as a last resort.

PLAIN forbids both. Fixed-width types use it because they have no choice; you can set it on a varlena column too, but then any value that doesn't fit will fail the INSERT.

You can inspect and change strategies through \d+ or directly:

ALTER TABLE docs ALTER COLUMN body SET STORAGE EXTERNAL;The shrinking procedure

Once the toaster is running, PostgreSQL walks the tuple's variable-length attributes in a fixed order, applying the cheapest transformations first:

Compress EXTENDED attributes, largest first. If any single attribute is large enough that it should clearly go out-of-line anyway, push it out immediately in the same pass.

If the tuple still doesn't fit, move remaining EXTENDED or EXTERNAL attributes out-of-line, largest first.

If still too big, compress MAIN attributes.

If still too big, move MAIN attributes out-of-line as a last resort.

Each pass stops as soon as the tuple drops below the target. A row with one chubby JSON column and a handful of small text fields usually only touches the JSON: a single compression pass typically does it.

The whole procedure lives in heap_toast_insert_or_update() in src/backend/access/heap/heaptoast.c, with the per-attribute helpers in src/backend/access/common/toast_helper.c. The four numbered loops in heap_toast_insert_or_update map one-to-one onto the four steps above. It's one of the more readable hot paths in the heap code if you want to see the policy in source form.

Watching it happen

To watch this happen we need a table with one wide column and the pageinspect extension we've been using throughout this series.

CREATE EXTENSION IF NOT EXISTS pageinspect;

CREATE TABLE toast_demo (<br>id integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,<br>label text,<br>body text<br>);<br>Insert three rows...

tuple line column postgresql table storage

Related Articles