Native Coding Agent Optimized for Local LLM and DeepSeek v4 with Vector Memory


cwcode — a terminal coding agent

cwcode

A terminal coding agent built around DeepSeek V4 Pro, Qwen3.6-27B, Kimi, Azure, and anything else that speaks OpenAI’s chat API.

Written in Go. Lives in your terminal. Edits real code. Recovers from its own mistakes. Costs about $0.40 to leave running for an hour.

5% of Claude’s token cost, on DeepSeek V4 Pro

85%+ prefix-cache hit ratio after turn 3

~12k lines of Go, no external services

What it is

cwcode is a Bubbletea TUI that drives any OpenAI-compatible chat endpoint as a tool-using coding agent. It ships with profiles for DeepSeek (Pro and Flash), Azure OpenAI, Kimi for Coding, and a local vLLM / llama.cpp profile for Qwen3.6-27B on a home server. Switching profiles mid-session is one slash command.

It has bash, file edit, glob, grep, web fetch, headless-Chrome fetch (driven via CDP through your real browser), sub-agents, a persistent semantic-memory store, content-addressed checkpoints with rewind, a plan/code mode toggle, and an autonomous goal loop. The tool registry is six hundred lines, and adding a new tool means implementing a two-method Go interface.

It is not a SaaS. There is no account, no telemetry, no remote control plane. Your API key sits in ~/.cwcode/config.json. Your session history sits in ~/.cwcode/sessions/. If your network is down and the model endpoint is local, the agent keeps working.

Why it’s different

Hash-anchored edits

The read_file tool annotates every line with a 3-character content hash, for example 42:a3f| return x. The edit_lines tool takes (line, hash, new_text) triples and rejects the entire batch if any hash has drifted. The model never has to reproduce content character-perfect to land an edit. Adopted from Can Akay’s February 2026 post and ported to Go in about 200 lines. Output tokens per session dropped 30–40% on V4 Pro.
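
The mechanism is small enough to sketch. What follows is an illustration of the idea, not cwcode’s source: the hash function (truncated SHA-256), the names lineHash, Annotate, Edit and ApplyEdits, and the error wording are all assumptions. The point it shows is the two-pass shape: validate every hash first, mutate only if all of them match.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// lineHash returns a 3-character tag for a line's content.
// Assumption: the real tool may use a different hash or alphabet.
func lineHash(line string) string {
	sum := sha256.Sum256([]byte(line))
	return hex.EncodeToString(sum[:])[:3]
}

// Annotate renders lines as "N:hhh| text", numbering from 1.
func Annotate(lines []string) string {
	var b strings.Builder
	for i, l := range lines {
		fmt.Fprintf(&b, "%d:%s| %s\n", i+1, lineHash(l), l)
	}
	return b.String()
}

// Edit is one (line, hash, new_text) triple.
type Edit struct {
	Line    int
	Hash    string
	NewText string
}

// ApplyEdits checks every edit before changing anything, so a single
// stale hash rejects the whole batch. It returns a modified copy.
func ApplyEdits(lines []string, edits []Edit) ([]string, error) {
	for _, e := range edits {
		if e.Line < 1 || e.Line > len(lines) {
			return nil, fmt.Errorf("line %d out of range", e.Line)
		}
		if got := lineHash(lines[e.Line-1]); got != e.Hash {
			return nil, fmt.Errorf("line %d: hash drifted (file has %s, edit has %s)", e.Line, got, e.Hash)
		}
	}
	out := append([]string(nil), lines...)
	for _, e := range edits {
		out[e.Line-1] = e.NewText
	}
	return out, nil
}

func main() {
	src := []string{"func f(x int) int {", "\treturn x", "}"}
	fmt.Print(Annotate(src))

	edits := []Edit{{Line: 2, Hash: lineHash(src[1]), NewText: "\treturn x + 1"}}
	out, err := ApplyEdits(src, edits)
	fmt.Println(strings.Join(out, "\n"), err)
}
```

A 3-character tag is short enough to be cheap in tokens and only has to distinguish the current content of one known line from a stale copy of it, which is why such a short hash can be enough here.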

Sticky prefix cache

The system prompt is byte-stable across turns. Tool definitions serialize in a deterministic order. Reasoning content is stripped from outbound requests on every provider by default. DeepSeek’s prompt-cache hit path is ~120× cheaper than the miss path, and our /cache slash command shows the session-cumulative hit ratio, which routinely exceeds 85% after the third turn.
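
Two of those pieces fit in a few lines of Go. This is a hedged sketch, not the project’s code: ToolDef, StableTools and CacheStats are hypothetical names, sorting by tool name is just one way to get a deterministic order, and the hit and miss token counts are assumed to come from whatever usage figures the provider reports per request.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// ToolDef is a cut-down, hypothetical tool definition.
type ToolDef struct {
	Name        string `json:"name"`
	Description string `json:"description"`
}

// StableTools serializes definitions in name order, so the request
// prefix is byte-identical no matter how the tools were registered.
func StableTools(defs []ToolDef) ([]byte, error) {
	sorted := append([]ToolDef(nil), defs...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	return json.Marshal(sorted)
}

// CacheStats accumulates prompt-token counts across a session.
type CacheStats struct {
	HitTokens  int
	MissTokens int
}

// Add records one request's cached and uncached prompt tokens.
func (c *CacheStats) Add(hit, miss int) {
	c.HitTokens += hit
	c.MissTokens += miss
}

// HitRatio is cached prompt tokens over all prompt tokens so far.
func (c CacheStats) HitRatio() float64 {
	total := c.HitTokens + c.MissTokens
	if total == 0 {
		return 0
	}
	return float64(c.HitTokens) / float64(total)
}

func main() {
	a, _ := StableTools([]ToolDef{{"grep", "search files"}, {"bash", "run a command"}})
	b, _ := StableTools([]ToolDef{{"bash", "run a command"}, {"grep", "search files"}})
	fmt.Println(string(a) == string(b)) // true: registration order does not leak into the bytes

	var stats CacheStats
	stats.Add(0, 1000) // first turn: nothing cached yet
	stats.Add(3000, 0) // later turns: the stable prefix is served from cache
	fmt.Printf("hit ratio: %.2f\n", stats.HitRatio())
}
```

Because the ratio is cumulative, the cold first turn keeps dragging it down for a while, which fits the figure being quoted "after the third turn" rather than from the start.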

Plan vs code mode

A single Shift+Tab toggles between read-only planning (the LLM only sees non-mutating tools) and full execution. The model doesn’t see the flag; it just sees a different (smaller) tool registry and a system-prompt addendum. You hold final control unless you opt into YOLO mode.
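
In code, "the model just sees a smaller registry" can be as simple as a filter applied before each request. A minimal sketch under stated assumptions: the Tool struct, its Mutating flag, the Mode type, and the choice of which tools count as mutating are all hypothetical, not taken from cwcode.

```go
package main

import "fmt"

// Tool is a hypothetical registry entry.
type Tool struct {
	Name     string
	Mutating bool // true if the tool can change files or run commands
}

// Mode is the plan/code switch held by the harness, never sent to the model.
type Mode int

const (
	PlanMode Mode = iota // read-only planning
	CodeMode             // full execution
)

// Toggle flips between the two modes (what a Shift+Tab handler would call).
func (m Mode) Toggle() Mode {
	if m == PlanMode {
		return CodeMode
	}
	return PlanMode
}

// VisibleTools returns the tools to advertise in the next request.
// In plan mode, mutating tools are simply left out.
func VisibleTools(all []Tool, m Mode) []Tool {
	out := make([]Tool, 0, len(all))
	for _, t := range all {
		if m == CodeMode || !t.Mutating {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	registry := []Tool{
		{"read_file", false},
		{"grep", false},
		{"edit_lines", true},
		{"bash", true}, // assumption: bash is treated as mutating
	}
	mode := PlanMode
	fmt.Println(len(VisibleTools(registry, mode))) // 2
	mode = mode.Toggle()
	fmt.Println(len(VisibleTools(registry, mode))) // 4
}
```

Filtering the registry rather than telling the model "you are in plan mode" means a mutating call cannot be issued by accident: the tool is not there to call.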

Checkpoint & rewind

Before any file-mutating tool runs, the harness snapshots the pre-state of every path the tool declares it will touch. Snapshots are SHA-256-keyed blobs in ~/.cwcode/sessions//objects/, deduped automatically. /rewind N restores files, truncates conversation history, and pre-fills the input box with the original prompt.
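
The storage half of this is a classic content-addressed store, and the dedup falls out of the naming scheme. The sketch below is an assumption-laden illustration, not cwcode’s implementation: Store, Put, Snapshot, Restore and demo are hypothetical names, it runs in a temporary directory, it only handles files that already exist, and it leaves out the conversation-history side of /rewind.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// Store keeps blobs in Dir, each named by the SHA-256 of its content.
type Store struct{ Dir string }

// Put saves data and returns its key. Identical content always maps to
// the same file name, so repeated snapshots cost no extra space.
func (s Store) Put(data []byte) (string, error) {
	sum := sha256.Sum256(data)
	key := hex.EncodeToString(sum[:])
	if err := os.MkdirAll(s.Dir, 0o755); err != nil {
		return "", err
	}
	path := filepath.Join(s.Dir, key)
	if _, err := os.Stat(path); err == nil {
		return key, nil // already stored
	}
	return key, os.WriteFile(path, data, 0o644)
}

// Snapshot stores the current content of each path and returns path -> key.
func (s Store) Snapshot(paths []string) (map[string]string, error) {
	snap := make(map[string]string, len(paths))
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			return nil, err
		}
		key, err := s.Put(data)
		if err != nil {
			return nil, err
		}
		snap[p] = key
	}
	return snap, nil
}

// Restore writes every snapshotted path back to its recorded content.
func (s Store) Restore(snap map[string]string) error {
	for p, key := range snap {
		data, err := os.ReadFile(filepath.Join(s.Dir, key))
		if err != nil {
			return err
		}
		if err := os.WriteFile(p, data, 0o644); err != nil {
			return err
		}
	}
	return nil
}

// demo snapshots a file twice, overwrites it, rewinds, and reports the
// restored content plus the number of blobs on disk.
func demo() (string, int, error) {
	root, err := os.MkdirTemp("", "ckpt")
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(root)
	file := filepath.Join(root, "main.go")
	if err := os.WriteFile(file, []byte("v1"), 0o644); err != nil {
		return "", 0, err
	}
	s := Store{Dir: filepath.Join(root, "objects")}
	snap, err := s.Snapshot([]string{file})
	if err != nil {
		return "", 0, err
	}
	if _, err := s.Snapshot([]string{file}); err != nil { // same content again
		return "", 0, err
	}
	if err := os.WriteFile(file, []byte("v2"), 0o644); err != nil {
		return "", 0, err
	}
	if err := s.Restore(snap); err != nil {
		return "", 0, err
	}
	data, err := os.ReadFile(file)
	if err != nil {
		return "", 0, err
	}
	blobs, err := os.ReadDir(s.Dir)
	if err != nil {
		return "", 0, err
	}
	return string(data), len(blobs), nil
}

func main() {
	fmt.Println(demo()) // v1 1 <nil>
}
```

Two snapshots of unchanged content leave a single blob on disk, which is all "deduped automatically" needs to mean.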

Storm-breaker

When the same tool fails identically three times in a row, the harness doesn’t silently abort. It synthesizes a plain-language response (“I’m unable to continue: read_file failed three times because the path was empty. Please clarify…”), streams it like a normal reply, and appends it to history so follow-ups have context.

Autonomous goal loop

/goal appends a goal to goals.md. /goal on starts an autonomous loop that runs back-to-back turns until every checkbox is marked done or until a safety cap of 20 consecutive cycles. We use this for four-hour overnight runs on annotated tasks.
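
The loop’s two exit conditions are easy to get subtly wrong, so here is a hedged sketch of the control flow. It assumes goals.md uses Markdown task-list syntax ("- [ ]" and "- [x]"), which the page does not spell out; Pending, RunGoalLoop and the turn callback are hypothetical stand-ins for the real agent turn.

```go
package main

import (
	"fmt"
	"strings"
)

// maxCycles is the safety cap on consecutive autonomous turns.
const maxCycles = 20

// Pending counts unchecked boxes in the goals text.
func Pending(goals string) int {
	n := 0
	for _, line := range strings.Split(goals, "\n") {
		if strings.HasPrefix(strings.TrimSpace(line), "- [ ]") {
			n++
		}
	}
	return n
}

// RunGoalLoop runs turns until every checkbox is done or the cap is hit.
// turn stands in for one agent turn: it receives the current goals text
// and returns the updated text. The cycle count is returned so callers
// can tell "finished" from "gave up".
func RunGoalLoop(goals string, turn func(string) string) (string, int) {
	cycles := 0
	for Pending(goals) > 0 && cycles < maxCycles {
		goals = turn(goals)
		cycles++
	}
	return goals, cycles
}

func main() {
	goals := "- [ ] add tests\n- [ ] fix lint\n"
	// Stand-in for a model turn: tick the first open checkbox.
	turn := func(g string) string { return strings.Replace(g, "- [ ]", "- [x]", 1) }

	final, cycles := RunGoalLoop(goals, turn)
	fmt.Print(final)
	fmt.Println("cycles:", cycles) // cycles: 2
}
```

A turn that makes no progress still counts as a cycle, which is what lets the cap bound a stuck run.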

No SaaS lock-in

Config is JSON. Sessions are JSON. Checkpoints are content-addressed blobs. The memory store is a SQLite file. Everything lives under ~/.cwcode/. If the project disappeared tomorrow, your sessions would still be readable.

What it looks like

Captured during real work on our dose-prediction codebase: the agent proposing an edit_file change to a Go test, with a unified diff highlighted inline, the reasoning trace streaming below, and the current task list pinned to the bottom of the pane.

cwcode running a Go test edit; multi-tab tmux session, dose-prediction project, DeepSeek profile.

Install

Download a pre-built binary for your platform from the Google Drive release folder (current build: v1.11; macOS arm64 / amd64 and Windows amd64). Drop it somewhere on your PATH and make it executable:

    curl -L -o ~/.local/bin/cwcode
    chmod +x ~/.local/bin/cwcode
    cwcode -version

You’ll need an OpenAI-compatible endpoint (DeepSeek API key, Azure deployment, local vLLM, or whatever else you have on hand).

Configure a profile in ~/.cwcode/config.json:

    {
      "active_profile": "deepseek-pro",
      "profiles": {
        "deepseek-pro": {
          "provider": "deepseek",
          "endpoint": "https://api.deepseek.com",
          "model": "deepseek-v4-pro",
          "api_key": "sk-...",
          "ctx_size": 262144
        }
      }
    }

Run it.

    cwcode                   # Bubbletea TUI
    cwcode -p "fix the bug"  # one-shot, no session
    cwcode -continue         # resume the most recent session
    cwcode -plain            # stdout REPL (no TUI)

Built-in tools

name | purpose | needs | ...
