What Bazel taught us about Terraform
Stategraph, Terraform, Build Systems, Bazel, Architecture
Tamiya Onodera
May 13, 2026
Build systems solved fast-and-correct twenty years ago. Bazel's motto is { Fast, Correct } — Choose two: you cannot be fast without being correct, because speed comes from caching, and caching only works when inputs are complete. Stategraph applies that same playbook to Terraform.
TL;DR
• Build systems are fast because they are correct: a DAG of tasks with complete, hashed inputs.
• Stategraph computes a fine-grained dependency graph over HCL and hashes blocks to detect change.
• Terraform builds a graph too, but rebuilds it from scratch on every run. Stategraph's graph is persistent, hashed, and reused.
• Sandboxing, distributed caches, remote execution, and lock-free parallelism all carry over from Bazel.
• Drift is the one infrastructure problem Bazel never had to solve.
Coming from a build systems background, I see infrastructure as code as a build system that takes HCL as input and produces deployed infrastructure as output. This is a concrete instance of what build systems do: they take tasks as input and execute them to produce outputs (usually called artifacts).
Executing a task involves gathering its inputs. This gets complicated when a task's inputs include artifacts produced by earlier tasks. This is where graph theory comes into play: a build system must compute the dependency graph between tasks in order to run them in the right order. Dependencies between tasks can be static (main.c depends on main.h) or dynamic (main.c depends on all *.c files). See Build Systems à la Carte for a deeper dive.
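To make the ordering problem concrete, here is a minimal sketch in Python using the standard library's graphlib. The task names and the dependency map are hypothetical, and only static dependencies are modelled; real build systems also have to discover dynamic ones.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key depends on every item in its value set.
deps = {
    "main.o": {"main.c", "main.h"},
    "util.o": {"util.c", "main.h"},
    "app": {"main.o", "util.o"},
}


def build_order(deps):
    """Return one valid execution order: each task comes after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = build_order(deps)
print(order)
```

Any order the sorter returns places main.h before main.o and both object files before app. A cycle in the map raises graphlib.CycleError, which is the build-system equivalent of a circular dependency error.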
HCL as a dependency graph
In the IaC ecosystem, the dependency graph is expressed in HCL: resources depend on tfvars, modules depend on other modules, and so on. Here, the build artifact is the deployed infrastructure itself. HCL's dependency graph is hard to compute statically: HCL can perform computations that depend on files on disk, and for_each/count accept arbitrary expressions. That is why we had to write our own HCL evaluator early on in our implementation journey.
How build systems are fast
My build system of choice is Google's Bazel. Bazel's motto is { Fast, Correct } — Choose two. This motto says a lot: to be fast, a build system must be correct. To be fast, you need to cache the results of earlier tasks. If a task's inputs haven't changed, you can reuse the artifacts that task produced previously. For this to be correct, your tasks' inputs must be complete: there should be no implicit dependencies. If an untracked input to your task changes and you reuse its cached artifacts, your outputs become incorrect.
Design Principle
Speed comes from caching. Caching only works when inputs are complete. There is no shortcut: a build system that cannot enumerate every input to a task cannot safely reuse its outputs, and a system that cannot safely reuse outputs cannot be fast.
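The principle can be sketched as a cache keyed by a hash of a task's declared inputs. This is an illustration in Python, not Bazel's actual action cache: ActionCache, input_key, and compile_c are hypothetical names, and the "compiler" is a stand-in function.

```python
import hashlib


def input_key(task_name, inputs):
    """Hash the task name together with the name and content of every declared input."""
    h = hashlib.sha256()
    h.update(task_name.encode())
    for name in sorted(inputs):
        h.update(b"\0" + name.encode() + b"\0" + inputs[name])
    return h.hexdigest()


class ActionCache:
    """Reuse a task's output whenever the hash of its declared inputs is unchanged."""

    def __init__(self):
        self._store = {}
        self.executions = 0

    def run(self, task_name, inputs, action):
        key = input_key(task_name, inputs)
        if key not in self._store:
            self.executions += 1  # cache miss: actually run the task
            self._store[key] = action(inputs)
        return self._store[key]


def compile_c(inputs):
    # Stand-in for a real compiler: the output is a pure function of the inputs.
    return b"obj:" + inputs["main.c"] + inputs["main.h"]


cache = ActionCache()
v1 = {"main.c": b"int main(){}", "main.h": b"#define X 1"}
v2 = {"main.c": b"int main(){}", "main.h": b"#define X 2"}

first = cache.run("compile", v1, compile_c)   # miss: runs the action
second = cache.run("compile", v1, compile_c)  # hit: same inputs, reused
third = cache.run("compile", v2, compile_c)   # miss: main.h changed
print(cache.executions)
```

The sketch is only correct because compile_c reads nothing outside inputs. If it also read an undeclared file, the second call would return a stale artifact without any hash changing, which is exactly the failure mode described above.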
How Stategraph is fast
When implementing Stategraph, we encountered the same situation: we compute a fine-grained graph of dependencies between your HCL blocks and use it to cache computations intelligently. We use the same mechanism as Bazel and friends to detect changed inputs: we hash them. If an input changes, its hash changes, and Stategraph includes its dependents (the blast radius, or reverse deps in Bazel jargon) in its computation.
Bazel only rebuilds affected tasks, while Stategraph only plans the affected subgraph of resources.
A change to db.tf only triggers a replan of its reverse dependencies. Everything else stays cached.
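As a rough illustration of that flow, the Python sketch below hashes each block, diffs the hashes between two revisions, and walks reverse dependencies to find the subgraph to replan. The resource addresses, the dependency map, and hashing raw source text are all assumptions made for the example; it is not a description of Stategraph's internals.

```python
import hashlib
from collections import deque


def block_hashes(blocks):
    """Map each block address to a hash of its source text."""
    return {addr: hashlib.sha256(src.encode()).hexdigest() for addr, src in blocks.items()}


def changed_blocks(old, new):
    """Addresses that were added, removed, or whose hash differs."""
    return {addr for addr in old.keys() | new.keys() if old.get(addr) != new.get(addr)}


def blast_radius(changed, deps):
    """Return the changed blocks plus everything that transitively depends on them."""
    rdeps = {}
    for node, needs in deps.items():
        for needed in needs:
            rdeps.setdefault(needed, set()).add(node)
    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in rdeps.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical configuration: app depends on db, db depends on the VPC.
deps = {
    "aws_vpc.main": set(),
    "aws_db_instance.db": {"aws_vpc.main"},
    "aws_instance.app": {"aws_db_instance.db"},
    "aws_s3_bucket.logs": set(),
}
before = {
    "aws_vpc.main": 'cidr_block = "10.0.0.0/16"',
    "aws_db_instance.db": 'instance_class = "db.t3.micro"',
    "aws_instance.app": 'instance_type = "t3.small"',
    "aws_s3_bucket.logs": 'bucket = "logs"',
}
after = dict(before, **{"aws_db_instance.db": 'instance_class = "db.t3.large"'})

changed = changed_blocks(block_hashes(before), block_hashes(after))
affected = blast_radius(changed, deps)
print(sorted(affected))
```

Only the database and the instance that depends on it land in the affected set; the VPC and the bucket are untouched and could keep their cached results.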
But doesn't Terraform already do this?
Terraform/OpenTofu users will recognise that Terraform does build a dependency graph too. So what makes Stategraph different? We covered this in Terraform has a graph; the short version:
Terraform's graph is ephemeral. Every terraform plan re-parses your .tf files, re-evaluates every expression, and rebuilds the graph from scratch. There is no cross-run cache of HCL evaluation: every plan is a cold build, even when nothing changed.
The state lock is just as coarse: one apply at a time per state file, no matter how disjoint the changes are. Two engineers touching unrelated parts of the same state still serialize through one lock.
Stategraph keeps the graph as a persistent artifact. We hash HCL blocks to detect change, only plan the subgraph the change reaches, and lock at the granularity of what's actually modified. Terraform's graph is a means to an end; Stategraph's graph is the product.
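To show what locking at the granularity of the change could look like, here is a small Python sketch of an all-or-nothing lock table keyed by resource address. ResourceLocks and the addresses are hypothetical, and this in-process version is an assumption for illustration only, not Stategraph's locking mechanism.

```python
import threading


class ResourceLocks:
    """Grant a change its whole set of resources, or nothing at all."""

    def __init__(self):
        self._mutex = threading.Lock()  # protects the table itself
        self._held = {}                 # resource address -> owner

    def try_acquire(self, owner, resources):
        with self._mutex:
            if any(addr in self._held for addr in resources):
                return False  # overlaps with a change already in flight
            for addr in resources:
                self._held[addr] = owner
            return True

    def release(self, owner):
        with self._mutex:
            self._held = {a: o for a, o in self._held.items() if o != owner}


locks = ResourceLocks()
# Disjoint changes proceed in parallel; overlapping ones wait.
alice = locks.try_acquire("alice", {"aws_db_instance.db", "aws_instance.app"})
bob = locks.try_acquire("bob", {"aws_s3_bucket.logs"})
carol = locks.try_acquire("carol", {"aws_instance.app"})
locks.release("alice")
carol_retry = locks.try_acquire("carol", {"aws_instance.app"})
print(alice, bob, carol, carol_retry)
```

With a single state-file lock, bob would have waited for alice even though their changes share no resources. Here only carol, whose change overlaps alice's, has to retry.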
The Core Difference
Terraform discards its dependency graph at the end of every run. Stategraph persists it, hashes its inputs, and uses it to decide what work to...