We Ran a Complex Task – A LangChain Repo Analysis with Claude Fable Models



Engineering · Jul 2, 2026 · 11 min read

Anthropic just shipped Claude Fable. We wanted a real answer to a practical question:

If you run the same complex engineering task on Opus, Fable, Sonnet, and Haiku — what do you actually get back?

Not a benchmark score. Not a vibe check. A full principal-engineer audit of a production open-source monorepo — with evidence, severity labels, and an execution plan.

We ran that experiment inside CTRL NODE: one prompt, five agents, five models, one cloned repository.

1. The goal: one hard task, five models

What we tested

We gave every model the same four-phase audit prompt and the same target: the LangChain Python monorepo (a large, mature library ecosystem, not a toy repo).

The prompt asks for:

Repository Map — explore first, judge second

Audit Report — architecture, security, tests, performance, deps, DX, docs (with file:line citations)

Improvement Strategy — themes, trade-offs, measurable “done” criteria

Task Plan — milestones M0–M3, quick wins, effort/risk/deps on each item

Every finding must be evidence-based. Guessing is explicitly forbidden.
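One way to hold a report to the citation rule is to resolve each file:line reference against the clone. The sketch below is not part of the experiment's tooling: the function names and the citation pattern (`path.ext:line` or `path.ext:start-end`) are assumptions made for illustration, and a real report may cite evidence in other shapes.

```python
import re
from pathlib import Path

# Assumed citation shape: `libs/core/foo.py:42` or `libs/core/foo.py:42-57`.
CITATION_RE = re.compile(
    r"(?P<path>[\w./-]+\.\w+):(?P<start>\d+)(?:-(?P<end>\d+))?"
)


def find_citations(report_text):
    """Return (path, start_line, end_line) for every file:line citation."""
    citations = []
    for match in CITATION_RE.finditer(report_text):
        start = int(match.group("start"))
        end = int(match.group("end") or start)
        citations.append((match.group("path"), start, end))
    return citations


def check_citation(repo_root, path, start, end):
    """Return None if the citation resolves, otherwise a short reason."""
    target = Path(repo_root) / path
    if not target.is_file():
        return "file not found"
    with target.open(encoding="utf-8", errors="replace") as handle:
        line_count = sum(1 for _ in handle)
    if start < 1 or end < start or end > line_count:
        return f"line range outside file ({line_count} lines)"
    return None


def audit_report_citations(repo_root, report_text):
    """Map each unresolved citation to the reason it failed."""
    problems = {}
    for path, start, end in find_citations(report_text):
        reason = check_citation(repo_root, path, start, end)
        if reason is not None:
            problems[f"{path}:{start}"] = reason
    return problems
```

A check like this only confirms that the cited location exists. Whether the cited lines actually support the finding still needs a human reviewer.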

That is a genuinely heavy task: thousands of files, real CI configs, security-sensitive deserialization paths, and god-class modules on hot code paths. It is the kind of work teams normally spread across several senior engineers.

Why Fable vs the rest

Fable is positioned as a strong reasoning model for long, structured work. We included it alongside:

| Model | Role in the experiment |
| --- | --- |
| Claude Opus 4.8 | Premium tier: threat modeling baseline |
| Claude Fable 5 | New tier: strategy & execution planning |
| Claude Sonnet 5 | Current Sonnet: primary audit pass |
| Claude Sonnet 4.6 | Previous Sonnet: ops / CI lens |
| Claude Haiku 4.5 | Fast tier: exploration & map |

The hypothesis was not “Fable wins everything.” It was: each tier sees different things, and Fable might be the best at turning findings into a shippable backlog.

The prompt

The full prompt lives in our catalog as langchain-prompt.md. Core instruction (abbreviated):

You are a world-class, principal-engineer-level software engineer and technical audit expert.
Perform an in-depth analysis of this code repository, provide an honest audit report,
and offer a prioritized, actionable improvement plan.

Follow four phases in order: Discovery → Audit → Strategy → Task Plan.
All judgments must cite real file paths and line numbers. Do not guess.

Deliverables requested per run:

audit-report-.md — full Markdown report

audit-report-.html — interactive dark-theme dashboard (tabs: Overview, Map, Audit, Strategy, Tasks)
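Since each run is expected to produce both a Markdown and an HTML file, a small script can flag runs that delivered only one of them. This is a sketch under assumptions: the per-run part of the filename is elided in the text above, so the code matches on an `audit-report-*` wildcard instead of guessing it, and the function names are ours.

```python
from pathlib import Path

REQUIRED_SUFFIXES = {".md", ".html"}


def collect_deliverables(output_dir):
    """Group audit-report-* files by stem and record which formats exist."""
    found = {}
    for path in sorted(Path(output_dir).glob("audit-report-*")):
        if path.suffix in REQUIRED_SUFFIXES:
            found.setdefault(path.stem, set()).add(path.suffix)
    return found


def missing_formats(found):
    """Return {stem: sorted missing suffixes} for incomplete runs."""
    return {
        stem: sorted(REQUIRED_SUFFIXES - suffixes)
        for stem, suffixes in found.items()
        if suffixes != REQUIRED_SUFFIXES
    }
```

An empty result from `missing_formats` means every run that wrote a report wrote both formats; it says nothing about runs that wrote no files at all.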

Summary of the prompt: resumen-langchain-prompt.md.

2. How we set it up in CTRL NODE

We did not paste the prompt into five browser tabs. We ran it the way a team would: Bridge on a real machine, a project work directory pointing at the clone, one agent per model tier.

Prerequisites

Bridge (ctrlnode) installed and paired — see Bridge setup.

Claude SDK API key set in ~/.ctrlnode/.env (providers load automatically — no PROVIDERS flag needed):

ANTHROPIC_API_KEY=sk-ant-...
BASE_PATH=/home/you/workspace

LangChain cloned on the Bridge host under BASE_PATH (CTRL NODE does not git-clone for you; the work directory points at an existing folder).
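Before dispatching anything, it can help to confirm the two file-based prerequisites: the key and `BASE_PATH` are set, and a clone exists under `BASE_PATH`. The following is a minimal sketch, not a CTRL NODE feature. It assumes the `.env` file uses plain `KEY=VALUE` lines, and the clone folder name passed in (for example `langchain`) is a hypothetical choice, since the text does not name the folder.

```python
from pathlib import Path


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def check_prerequisites(env_text, clone_dir_name):
    """Return a list of problems; an empty list means the checks passed."""
    env = parse_env_file(env_text)
    problems = []
    if not env.get("ANTHROPIC_API_KEY"):
        problems.append("ANTHROPIC_API_KEY is missing or empty")
    base_path = env.get("BASE_PATH")
    if not base_path:
        problems.append("BASE_PATH is missing or empty")
        return problems
    # clone_dir_name is a caller-supplied, hypothetical folder name.
    clone = Path(base_path) / clone_dir_name
    if not clone.is_dir():
        problems.append(f"no clone found at {clone}")
    return problems
```

On the Bridge host this could be called as `check_prerequisites((Path.home() / ".ctrlnode" / ".env").read_text(), "langchain")`. It does not verify that the key is valid or that Bridge is paired.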

Project

In the web app: + NEW PROJECT

| Field | Value |
| --- | --- |
| NAME | langchain-audit-experiment |
| AGENT TYPE | Claude |
| WORK DIRECTORY | Browse → select the LangChain clone → USE THIS DIRECTORY |
| DESCRIPTION | Five-model audit benchmark |

The work directory is what lets agents read the full tree in WORK DIRECTORY task mode — the same scope a staff engineer would need.

Agents (one per model)

Team → + ADD AGENT — we created five agents on the same project:

| Agent name | MODEL field | Purpose |
| --- | --- | --- |
| audit-opus | claude-opus-4-8 | Threat & design review |
| audit-fable | claude-fable-5 | Strategy & task plan |
| audit-sonnet-5 | claude-sonnet-5 | Primary audit |
| audit-sonnet-46 | claude-sonnet-4-6 | CI / ops pass |
| audit-haiku | claude-haiku-4-5 | Fast map |

Models are selected in the MODEL combobox (synced from Bridge when online) or typed manually. Fable appears as claude-fable-5 in the Bridge model manifest (v2026.2.4+).

Optional AGENT SYSTEM INSTRUCTIONS were left minimal — we wanted the task prompt to carry the spec, not per-agent persona drift.

3. How we ran the prompt

For each agent, same procedure:

+ NEW TASK on the project

TITLE : LangChain principal audit —

INSTRUCTIONS: paste full contents of langchain-prompt.md

ASSIGN TO AGENT: pick the matching agent chip

OUTPUT MODE: WORK DIRECTORY (full repo scope; optional focus paths left empty)

NEW TASK → task lands in Backlog

RUN → dispatches to Bridge → agent moves to In progress
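The procedure amounts to five task specs that differ only in the assigned agent. The sketch below expresses that as plain data. It is not a CTRL NODE API: the dictionary keys simply mirror the form labels, and the title suffix is a placeholder of ours because the title in the steps above is cut off after the dash.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str
    model: str
    purpose: str


# Roster copied from the agents table in section 2.
AGENTS = [
    Agent("audit-opus", "claude-opus-4-8", "Threat & design review"),
    Agent("audit-fable", "claude-fable-5", "Strategy & task plan"),
    Agent("audit-sonnet-5", "claude-sonnet-5", "Primary audit"),
    Agent("audit-sonnet-46", "claude-sonnet-4-6", "CI / ops pass"),
    Agent("audit-haiku", "claude-haiku-4-5", "Fast map"),
]


def build_task(agent, prompt_text):
    """Return one task spec whose keys mirror the NEW TASK form fields."""
    return {
        # Hypothetical suffix: the real title text is elided in the source.
        "title": f"LangChain principal audit ({agent.name})",
        "instructions": prompt_text,  # identical for every agent
        "assign_to_agent": agent.name,
        "model": agent.model,
        "output_mode": "WORK DIRECTORY",
        "focus_paths": [],  # left empty: full repo scope
    }


def build_tasks(prompt_text):
    """Build one task per agent, all sharing the same instructions."""
    return [build_task(agent, prompt_text) for agent in AGENTS]
```

Keeping the instructions byte-identical across the five specs is the point of the design: any difference in the reports then comes from the model, not from prompt drift.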

Bridge delivers the task with repositoryPaths and repo dispatch context so the Claude SDK runs against the LangChain tree on disk. Outputs (audit-report-*.md / .html) were collected...
