Show HN: WASM scanner to debug Postgres deadlocks without leaking SQL

PostgreSQL Deadlock ShareLock Transaction Audit | StackEngine

How to Fix PostgreSQL Deadlock Detected on ShareLock Transaction (With Root Cause Analysis)

Threat/Impact Level: HIGH | Downtime Risk: HIGH | Time to Fix: 15–45 mins

TL;DR

What broke: Two concurrent transactions acquired locks in inverse order. PostgreSQL's deadlock detector aborted one of them to break the cycle, rolling back that transaction entirely.

How to fix it: Enforce a consistent lock acquisition order across all transactions touching the same rows; use SELECT FOR UPDATE with explicit ordering or SKIP LOCKED for queue-style workloads.

Use our Client-Side Sandbox below to paste your transaction logic and auto-refactor the lock ordering with zero data leaving your browser.

The Incident (What Does the Error Mean?)

ERROR: deadlock detected
DETAIL: Process 12345 waits for ShareLock on transaction 67890; blocked by process 67890.
Process 67890 waits for ShareLock on transaction 12345; blocked by process 12345.
HINT: See server log for query details.
CONTEXT: while updating tuple (0,42) in relation "orders"

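PostgreSQL does not poll for deadlocks on a fixed schedule. A backend that has waited on a lock for longer than deadlock_timeout (default: 1 second) runs the deadlock check. If it finds a cycle, it aborts one of the transactions involved, and the statement that was waiting fails with this error. All work in the aborted transaction is rolled back, and further statements on that connection fail until the application issues ROLLBACK. Your application must detect SQLSTATE 40P01 and retry the whole transaction; if it swallows the error instead, the write is lost without any visible failure.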
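As a rough illustration of detecting the error code, the sketch below classifies a driver exception by its SQLSTATE. It assumes the exception exposes the code as `pgcode` (psycopg2) or `sqlstate` (psycopg 3); the helper names `sqlstate_of` and `is_deadlock` are hypothetical and not part of either driver.

```python
DEADLOCK_SQLSTATE = "40P01"  # deadlock_detected


def sqlstate_of(exc):
    """Return the SQLSTATE carried by a driver exception, or None.

    Assumption: psycopg 3 exposes `.sqlstate`, psycopg2 exposes `.pgcode`.
    Exceptions that carry neither attribute yield None.
    """
    return getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)


def is_deadlock(exc):
    """True when the exception reports a PostgreSQL deadlock (40P01)."""
    return sqlstate_of(exc) == DEADLOCK_SQLSTATE
```

A caller would typically roll back and retry the whole transaction when `is_deadlock(exc)` is true, and re-raise everything else.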

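The ShareLock in this message is not a table-level shared lock. It is a lock on another transaction's ID: a session that needs a row locked by an in-progress transaction waits for that transaction to finish, and the wait shows up as a ShareLock on the transaction. The deadlock occurs when Transaction A holds a lock on Row 1 and wants Row 2, while Transaction B holds Row 2 and wants Row 1.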
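To make the cycle concrete, here is a minimal sketch that parses the DETAIL text from the error shown earlier into a "who is blocked by whom" map and then walks it to find the loop. The regular expression is an assumption based on the message format in this article, and `parse_waits` and `find_cycle` are hypothetical helper names.

```python
import re

# Matches lines such as:
# "Process 12345 waits for ShareLock on transaction 67890; blocked by process 67890."
WAIT_RE = re.compile(
    r"Process (\d+) waits for (\w+) on (.+?); blocked by process (\d+)\."
)


def parse_waits(detail):
    """Map each waiting PID to the PID that blocks it."""
    return {int(m.group(1)): int(m.group(4)) for m in WAIT_RE.finditer(detail)}


def find_cycle(waits, start):
    """Follow blocked-by edges from `start`; return the cycle as a PID list, or None."""
    path = []
    pid = start
    while pid in waits and pid not in path:
        path.append(pid)
        pid = waits[pid]
    if pid in path:
        return path[path.index(pid):]
    return None


detail = (
    "Process 12345 waits for ShareLock on transaction 67890; blocked by process 67890. "
    "Process 67890 waits for ShareLock on transaction 12345; blocked by process 12345."
)
waits = parse_waits(detail)
cycle = find_cycle(waits, 12345)
print(cycle)  # [12345, 67890]
```

Each PID in the returned list waits on the next one, and the last waits on the first, which is the Row 1 / Row 2 situation described above.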

The Attack Vector / Blast Radius

This is not a one-off failure. In high-concurrency environments this is a recurring production degradation pattern:

Connection pool exhaustion: Every session blocked on a lock keeps its connection checked out. With deadlock_timeout at 1s and 50 concurrent conflicting transactions, the pool can saturate before the detector breaks the cycles.

Cascading retry storms: Naive retry logic without exponential backoff causes the same transactions to immediately re-conflict, worsening throughput under load.

Silent data loss: Applications that catch the error without retrying lose writes permanently — especially dangerous in financial ledgers, inventory systems, and order management where the rolled-back transaction updated multiple tables.

Replication lag amplification: Rolled-back transactions still write WAL, and each retry writes it again. Under sustained deadlock storms, the extra WAL volume on the primary can push streaming-replica lag past your RTO.

ORM blind spots: Hibernate, SQLAlchemy, and ActiveRecord often wrap operations in implicit transactions with non-deterministic lock ordering based on object graph traversal order — making this nearly impossible to debug without query-level logging.

How to Fix It

Basic Fix — Enforce Consistent Lock Ordering

The root cause is always lock acquisition order inversion. Fix it by sorting the rows you intend to lock before acquiring locks.

-- Transaction A and B both update accounts: sender and receiver
-- BAD: Each transaction locks in application-determined order (non-deterministic)
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1; -- locks row 1
UPDATE accounts SET balance = balance + 100 WHERE id = 2; -- waits for row 2
COMMIT;

-- (Concurrent Transaction B)
BEGIN;
UPDATE accounts SET balance = balance - 50 WHERE id = 2; -- locks row 2
UPDATE accounts SET balance = balance + 50 WHERE id = 1; -- DEADLOCK
COMMIT;

-- GOOD: Always lock in ascending ID order regardless of transaction direction
BEGIN;
-- Pre-sort: always lock lower ID first
SELECT id FROM accounts WHERE id IN (1, 2) ORDER BY id FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
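On the application side, the same idea might look like the sketch below: the IDs are sorted before the lock statement is built, so a transfer from 1 to 2 and a transfer from 2 to 1 issue an identical lock query. The function names `lock_statement` and `transfer` are hypothetical, and the code assumes a DB-API cursor such as psycopg2's, which adapts a Python list to a PostgreSQL array for `ANY(%s)`.

```python
def lock_statement(account_ids):
    """Build a SELECT ... FOR UPDATE that locks rows in ascending id order."""
    ids = sorted(set(account_ids))
    if not ids:
        raise ValueError("no account ids to lock")
    sql = "SELECT id FROM accounts WHERE id = ANY(%s) ORDER BY id FOR UPDATE"
    return sql, (ids,)


def transfer(cur, sender_id, receiver_id, amount):
    """Lock both rows in a fixed order, then apply the debit and the credit."""
    sql, params = lock_statement([sender_id, receiver_id])
    cur.execute(sql, params)
    cur.execute(
        "UPDATE accounts SET balance = balance - %s WHERE id = %s",
        (amount, sender_id),
    )
    cur.execute(
        "UPDATE accounts SET balance = balance + %s WHERE id = %s",
        (amount, receiver_id),
    )
```

The caller is still responsible for opening the transaction and committing it; this sketch only fixes the order in which the row locks are requested.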

Enterprise Best Practice — SKIP LOCKED + Advisory Locks + Retry Logic

-- BAD: Blocking SELECT FOR UPDATE with no timeout, no retry handling
SELECT * FROM job_queue WHERE status = 'pending' FOR UPDATE;

-- GOOD: Non-blocking queue consumption with SKIP LOCKED
SELECT * FROM job_queue
WHERE status = 'pending'
ORDER BY created_at ASC
LIMIT 1
FOR UPDATE SKIP LOCKED;
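A worker built on this query could claim one job per call roughly as follows. This is a sketch under assumptions: the `id` column and the `'running'` status value do not appear in the original and are invented for illustration, and `claim_next_job` is a hypothetical name. The cursor is any DB-API cursor, for example one from psycopg2.

```python
# Assumed schema: job_queue(id, status, created_at). Only status and
# created_at come from the article; id and 'running' are illustrative.
CLAIM_SQL = """
UPDATE job_queue
SET status = 'running'
WHERE id = (
    SELECT id FROM job_queue
    WHERE status = 'pending'
    ORDER BY created_at ASC
    LIMIT 1
    FOR UPDATE SKIP LOCKED
)
RETURNING id
"""


def claim_next_job(cur):
    """Claim the oldest unlocked pending job; return its id, or None if none is free."""
    cur.execute(CLAIM_SQL)
    row = cur.fetchone()
    return row[0] if row else None
```

Because rows locked by other workers are skipped instead of waited on, two workers calling this at the same time never queue behind each other on the same row.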

# GOOD: Application-level retry with 40P01 detection (Python/psycopg2 example)
import random
import time

from psycopg2 import errors

def execute_with_retry(conn, fn, max_retries=5):
    """Run fn(cur) in a transaction, retrying when PostgreSQL reports a deadlock."""
    for attempt in range(max_retries):
        try:
            with conn.cursor() as cur:
                fn(cur)
            conn.commit()
            return
        except errors.DeadlockDetected:
            # SQLSTATE 40P01: the whole transaction was rolled back, so redo it from the start
            conn.rollback()
            # Exponential backoff plus jitter so the retries do not collide again
            wait = (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(wait)
    raise Exception("Max retries exceeded on deadlock")

-- GOOD: PostgreSQL advisory locks for application-level mutex (no row lock needed)
SELECT pg_advisory_xact_lock(hashtext('transfer:' || LEAST(1,2)::text || ':' ||...
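The SQL above is cut off in the source, so the following does not reconstruct it. It is a separate sketch of the same idea, deriving the advisory-lock key in the application instead: the two account IDs are sorted first, so both directions of a transfer map to one key. The names `transfer_lock_key` and `acquire_transfer_lock` and the SHA-256 key derivation are assumptions made for illustration; `pg_advisory_xact_lock` takes a bigint key and releases the lock when the transaction ends.

```python
import hashlib


def transfer_lock_key(account_a, account_b):
    """Derive one signed 64-bit key per unordered pair of account ids."""
    lo, hi = sorted((account_a, account_b))
    digest = hashlib.sha256(f"transfer:{lo}:{hi}".encode()).digest()
    # pg_advisory_xact_lock accepts a bigint, so keep 8 bytes as a signed integer
    return int.from_bytes(digest[:8], "big", signed=True)


def acquire_transfer_lock(cur, account_a, account_b):
    """Take a transaction-scoped advisory lock covering this pair of accounts."""
    key = transfer_lock_key(account_a, account_b)
    cur.execute("SELECT pg_advisory_xact_lock(%s)", (key,))
```

Two distinct pairs can in principle hash to the same key, which only causes unnecessary waiting, not incorrect results.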
