We Ran a Complex Task – A LangChain Repo Analysis with Claude Fable Models


Engineering · Jul 2, 2026 · 11 min read

Anthropic just shipped Claude Fable. We wanted a real answer to a practical question:

If you run the same complex engineering task on Opus, Fable, Sonnet, and Haiku — what do you actually get back?

Not a benchmark score. Not a vibe check. A full principal-engineer audit of a production open-source monorepo — with evidence, severity labels, and an execution plan.

We ran that experiment inside CTRL NODE: one prompt, five agents, five models, one cloned repository.

1. The goal: one hard task, five models

What we tested

We gave every model the same four-phase audit prompt and the same target: the LangChain Python monorepo (a large, mature library ecosystem, not a toy repo).

The prompt asks for:

Repository Map — explore first, judge second

Audit Report — architecture, security, tests, performance, deps, DX, docs (with file:line citations)

Improvement Strategy — themes, trade-offs, measurable “done” criteria

Task Plan — milestones M0–M3, quick wins, effort/risk/deps on each item
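The Task Plan phase asks for milestones M0–M3 with effort, risk, and dependencies on every item. A minimal sketch of how such a plan could be held as data and sanity-checked follows. The `PlanItem` fields, the effort and risk scales, and both function names are assumptions for illustration; the prompt does not prescribe a schema.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter

MILESTONES = ("M0", "M1", "M2", "M3")  # milestone names come from the prompt
RANK = {name: i for i, name in enumerate(MILESTONES)}


@dataclass
class PlanItem:
    """One backlog item. Field names and scales are hypothetical."""
    id: str
    title: str
    milestone: str
    effort: str  # assumed scale, e.g. "S", "M", "L"
    risk: str    # assumed scale, e.g. "low", "medium", "high"
    deps: list[str] = field(default_factory=list)


def validate_plan(items: list[PlanItem]) -> list[str]:
    """Flag unknown milestones, unknown deps, and deps scheduled too late."""
    by_id = {item.id: item for item in items}
    problems = []
    for item in items:
        if item.milestone not in RANK:
            problems.append(f"{item.id}: unknown milestone {item.milestone}")
            continue
        for dep in item.deps:
            if dep not in by_id:
                problems.append(f"{item.id}: unknown dependency {dep}")
            elif RANK.get(by_id[dep].milestone, -1) > RANK[item.milestone]:
                problems.append(
                    f"{item.id}: depends on {dep}, which is scheduled later "
                    f"({by_id[dep].milestone} > {item.milestone})"
                )
    return problems


def execution_order(items: list[PlanItem]) -> list[str]:
    """Order item ids so every dependency comes first (raises on cycles)."""
    graph = {item.id: set(item.deps) for item in items}
    return list(TopologicalSorter(graph).static_order())
```

A check like this makes it easy to compare plans across models: a plan whose M0 quick wins depend on M2 work is not really shippable in the order it claims.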

Every finding must be evidence-based. Guessing is explicitly forbidden.

That is a genuinely heavy task: thousands of files, real CI configs, security-sensitive deserialization paths, and god-class modules on hot code paths. It is the kind of work teams normally spread across several senior engineers.

Why Fable vs the rest

Fable is positioned as a strong reasoning model for long, structured work. We included it alongside:

| Model | Role in the experiment |
| --- | --- |
| Claude Opus 4.8 | Premium tier: threat modeling baseline |
| Claude Fable 5 | New tier: strategy & execution planning |
| Claude Sonnet 5 | Current Sonnet: primary audit pass |
| Claude Sonnet 4.6 | Previous Sonnet: ops / CI lens |
| Claude Haiku 4.5 | Fast tier: exploration & map |

The hypothesis was not “Fable wins everything.” It was: each tier sees different things, and Fable might be the best at turning findings into a shippable backlog.

The prompt

The full prompt lives in our catalog as langchain-prompt.md. Core instruction (abbreviated):

You are a world-class, principal-engineer-level software engineer and technical audit expert. Perform an in-depth analysis of this code repository, provide an honest audit report, and offer a prioritized, actionable improvement plan.

Follow four phases in order: Discovery → Audit → Strategy → Task Plan. All judgments must cite real file paths and line numbers. Do not guess.
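The "cite real file paths and line numbers" rule can be spot-checked mechanically once a report exists. The sketch below assumes citations appear in the text as a relative path, a colon, and a line number (for example `some/module.py:120`); that shape, the regex, and the function name are assumptions, since the prompt only asks for file:line citations without fixing a format.

```python
import re
from pathlib import Path

# Assumed citation shape: relative path with an extension, colon, line number.
CITATION_RE = re.compile(r"(?<![\w/.-])([\w./-]+\.\w+):(\d+)")


def check_citations(report_text: str, repo_root: Path) -> list[tuple[str, int, str]]:
    """Return (path, line, problem) for each citation that does not resolve."""
    problems = []
    for match in CITATION_RE.finditer(report_text):
        rel_path, line_no = match.group(1), int(match.group(2))
        target = repo_root / rel_path
        if not target.is_file():
            problems.append((rel_path, line_no, "file not found"))
            continue
        # Count lines without loading the whole file into memory.
        with target.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        if not 1 <= line_no <= line_count:
            problems.append(
                (rel_path, line_no, f"line out of range (file has {line_count})")
            )
    return problems
```

This only proves that a cited location exists, not that the finding attached to it is correct, and the loose regex can also pick up things like `host.name:8080`. It is a filter for fabricated citations, not a substitute for reading the report.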

Deliverables requested per run:

audit-report-.md — full Markdown report

audit-report-.html — interactive dark-theme dashboard (tabs: Overview, Map, Audit, Strategy, Tasks)

Summary of the prompt: resumen-langchain-prompt.md.

2. How we set it up in CTRL NODE

We did not paste the prompt into five browser tabs. We ran it the way a team would: Bridge on a real machine, a project work directory pointing at the clone, one agent per model tier.

Prerequisites

Bridge (ctrlnode) installed and paired — see Bridge setup.

Claude SDK API key set in ~/.ctrlnode/.env (providers load automatically — no PROVIDERS flag needed):

ANTHROPIC_API_KEY=sk-ant-...
BASE_PATH=/home/you/workspace

LangChain cloned on the Bridge host under BASE_PATH (CTRL NODE does not git-clone for you; the work directory points at an existing folder).
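Because the work directory must already exist under BASE_PATH, a quick preflight on the Bridge host can catch a missing key or a misplaced clone before any task is dispatched. This is a minimal sketch, not part of CTRL NODE: `parse_env` and `preflight` are hypothetical helpers, and the only details taken from the setup above are the two variable names and the requirement that the clone sits under BASE_PATH.

```python
from pathlib import Path


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def preflight(env: dict[str, str], clone_dir: Path) -> list[str]:
    """Return a list of setup problems; an empty list means ready to run."""
    errors = []
    if not env.get("ANTHROPIC_API_KEY"):
        errors.append("ANTHROPIC_API_KEY is missing")
    base = env.get("BASE_PATH")
    if not base:
        errors.append("BASE_PATH is missing")
        return errors
    base_path = Path(base).expanduser().resolve()
    clone = clone_dir.expanduser().resolve()
    if not clone.is_dir():
        errors.append(f"{clone} is not a directory")
    elif not clone.is_relative_to(base_path):  # Python 3.9+
        errors.append(f"{clone} is not under BASE_PATH ({base_path})")
    return errors
```

Typical use would be `preflight(parse_env(Path("~/.ctrlnode/.env").expanduser().read_text()), Path("/path/to/clone"))`, with the clone path replaced by wherever the repository actually lives.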

Project

In the web app: + NEW PROJECT

| Field | Value |
| --- | --- |
| NAME | langchain-audit-experiment |
| AGENT TYPE | Claude |
| WORK DIRECTORY | Browse → select the LangChain clone → USE THIS DIRECTORY |
| DESCRIPTION | Five-model audit benchmark |

The work directory is what lets agents read the full tree in WORK DIRECTORY task mode — the same scope a staff engineer would need.

Agents (one per model)

Team → + ADD AGENT — we created five agents on the same project:

| Agent name | MODEL field | Purpose |
| --- | --- | --- |
| audit-opus | claude-opus-4-8 | Threat & design review |
| audit-fable | claude-fable-5 | Strategy & task plan |
| audit-sonnet-5 | claude-sonnet-5 | Primary audit |
| audit-sonnet-46 | claude-sonnet-4-6 | CI / ops pass |
| audit-haiku | claude-haiku-4-5 | Fast map |

Models are selected in the MODEL combobox (synced from Bridge when online) or typed manually. Fable appears as claude-fable-5 in the Bridge model manifest (v2026.2.4+).
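For anyone scripting around the experiment, the same roster can be kept as plain data so that a typo in a manually entered MODEL value is caught early. The names and model IDs below are copied from the agent table; `AgentSpec` and `check_roster` are hypothetical, and nothing here calls CTRL NODE, whose programmatic interface (if any) is not covered in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    name: str
    model: str
    purpose: str


# Values copied from the agent table above.
ROSTER = [
    AgentSpec("audit-opus", "claude-opus-4-8", "Threat & design review"),
    AgentSpec("audit-fable", "claude-fable-5", "Strategy & task plan"),
    AgentSpec("audit-sonnet-5", "claude-sonnet-5", "Primary audit"),
    AgentSpec("audit-sonnet-46", "claude-sonnet-4-6", "CI / ops pass"),
    AgentSpec("audit-haiku", "claude-haiku-4-5", "Fast map"),
]


def check_roster(roster: list[AgentSpec]) -> list[str]:
    """One agent per model tier: names and model IDs must both be unique."""
    problems = []
    names = [agent.name for agent in roster]
    models = [agent.model for agent in roster]
    if len(set(names)) != len(names):
        problems.append("duplicate agent name")
    if len(set(models)) != len(models):
        problems.append("duplicate model: two agents would run the same tier")
    return problems
```

The uniqueness check matters for this experiment in particular: if two agents silently share a model ID, two of the five reports are measuring the same thing.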

Optional AGENT SYSTEM INSTRUCTIONS were left minimal — we wanted the task prompt to carry the spec, not per-agent persona drift.

3. How we ran the prompt

For each agent, same procedure:

+ NEW TASK on the project

TITLE: LangChain principal audit —

INSTRUCTIONS: paste full contents of langchain-prompt.md

ASSIGN TO AGENT: pick the matching agent chip

OUTPUT MODE: WORK DIRECTORY (full repo scope; optional focus paths left empty)

NEW TASK → task lands in Backlog

RUN → dispatches to Bridge → agent moves to In progress

Bridge delivers the task with repositoryPaths and repo dispatch context so the Claude SDK runs against the LangChain tree on disk. Outputs (audit-report-*.md / .html) were collected...
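Gathering the deliverables can be sketched as a glob over the work directory. The `audit-report-*` pattern and the two extensions come from the text above; the assumption that reports land at the top level of the work directory, and both function names, are illustrative only.

```python
from pathlib import Path

EXPECTED_SUFFIXES = {".md", ".html"}


def collect_reports(work_dir: Path) -> dict[str, dict[str, Path]]:
    """Group audit-report-* files by stem, keyed by file suffix."""
    reports: dict[str, dict[str, Path]] = {}
    for path in sorted(work_dir.glob("audit-report-*")):
        if path.suffix in EXPECTED_SUFFIXES:
            reports.setdefault(path.stem, {})[path.suffix] = path
    return reports


def incomplete(reports: dict[str, dict[str, Path]]) -> list[str]:
    """Stems that are missing either the Markdown or the HTML deliverable."""
    return sorted(
        stem for stem, files in reports.items() if set(files) != EXPECTED_SUFFIXES
    )
```

A run that produced the Markdown report but no dashboard shows up in `incomplete`, which is a cheap first signal of which models followed the deliverable spec.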
