Giving LLMs a Formal Reasoning Engine for Code Analysis

(iterate think thoughts): Giving LLMs a Formal Reasoning Engine for Code Analysis

(iterate

think

thoughts)

Theme

April 8, 2026

Giving LLMs a Formal Reasoning Engine for Code Analysis

LLM coding assistants continue to become more capable at writing code, but they have an inherent weakness when it comes to reasoning about code structure. What's worse is that they assemble the picture of the code by grepping through source files and reconstructing call chains in ad hoc fashion. This works for simple questions, but quickly starts to fall apart for transitive ones such as "Can user input reach this SQL query through any chain of calls?" or "What's all the dead code in this module?" Such questions require exhaustive structural analysis that simply can't be accomplished using pattern matching. Chiasmus is an MCP server aiming to address the problem by giving LLMs access to formal reasoning engines, bundling Z3 for constraint solving and Tau Prolog for logic programming. Source files are parsed using tree-sitter and then turned into formal grammars, providing the LLM with a structured representation of the code along with a logic engine that can answer questions about it with certainty while using a fraction of the tokens. The project is grounded in the neurosymbolic AI paradigm described by Sheth, Roy, and Gaur. The core idea is that AI systems benefit from combining neural networks (perception, language understanding) with symbolic knowledge-based approaches (reasoning, verification). LLMs are excellent at understanding what you're asking and generating plausible code, but they lack the ability to prove properties about that code. Symbolic solvers have that ability but can't understand natural language or navigate a codebase. Chiasmus bridges the two: the LLM handles perception (parsing your question, understanding context, filling templates), while the solvers handle cognition (exhaustive graph traversal, constraint satisfaction, logical inference). The Problem with Grepping Through Source When an LLM assistant needs to answer "what's the blast radius of changing lintSpec?", here's what typically happens: Step 1: grep lintSpec src/**/*.ts → found in engine.ts (lintLoop) and mcp-server.ts (handleLint)

Step 2: grep lintLoop src/**/*.ts → called from solve() at lines 75 and 87

Step 3: grep handleSolve src/**/*.ts → called from createChiasmusServer switch... Three rounds of tool calls, each consuming tokens for both the query and the response. At each step, the LLM has to reason about what it found and decide what to grep next. And after all that, it's still only traced part of the chain while missing paths through correctionLoop, runAnalysis, and several other transitive callers. This isn't a failure of the LLM. It's a fundamental limitation of the approach. Grep finds string matches. Structural questions about code such as reachability, dead code, cycles, impact analysis require graph traversal, which grep cannot do. How Chiasmus Works: Tree-sitter → Prolog → Formal Queries Chiasmus takes a different approach. Instead of searching through text, it: Parses source files with tree-sitter into typed ASTs Walks the ASTs to extract structural facts: method definitions, call relationships, imports, exports Serializes these as Prolog facts a declarative representation of the call graph Runs formal queries via the Prolog solver to answer structural questions Step 1: Tree-sitter Parsing Unlike regex-based tools, tree-sitter understands language grammar making it possible to produce concrete syntax trees. Since it knows that foo() in const bar = () => { foo(); } is a call from bar to foo, it can answer semantic questions regarding the symbol. Chiasmus supports Python, Go, TypeScript, JavaScript, and Clojure out of the box, and provides adapters for other languages. When you pass source files to chiasmus_graph, the parser identifies method declarations arrow_function, method_definition in TS/JS; defn, defn- in Clojure. Next, it resolves call expressions call_expression → callee name, handling obj.method() → method, this.bar() → bar, db/query → query. It tracks scope of which routine is the caller for each call site and extracts imports and exports for cross-file resolution. Tree-sitter is an incremental parsing library that produces concrete syntax trees. Unlike regex-based tools, it understands language grammar, and so, it knows that foo() in const bar = () => { foo(); } is a call from bar to foo, not just a string that contains "foo". Step 2: Prolog Fact Generation The extracted relationships become Prolog facts: defines('src/formalize/validate.ts', lintSpec, routine, 16). defines('src/formalize/engine.ts', lintLoop, routine, 208). defines('src/formalize/engine.ts', solve, routine, 64). defines('src/mcp-server.ts', handleLint, routine, 527).

calls(lintLoop, lintSpec). calls(solve, lintLoop). calls(handleLint, lintSpec). calls(handleSolve, solve). calls(correctionLoop,...

Giving LLMs a Formal Reasoning Engine for Code Analysis

Related Articles

Amazon, Facebook, FBI have access to a private intelligence-sharing network

Show HN: GoPeek – open links in live mini browser windows without new tabs

Agent Memory: An Anatomy

SpaceX not the behemoth everyone thought

The Mirror Is Part of the Machine