Knowledge Agents: Beat Frontier Models with Better Structure


No Mythos, No Problem

James Wang
Jun 21, 2026


Anthropic recently had to pull Mythos/Fable due to an edict from the US government. While Mythos was a step up from Opus, I’ve been actively moving to smaller models for my agents, and matching the output quality of some of the largest frontier models.

The use cases have ranged from hard “hedge fund level” (for want of a better description) market analysis, financial management, and AI personal assistants to helping a few friends in difficult medical situations. I call this pattern “knowledge agents,” with a generic template available to everyone here. They literally inject the right knowledge into the AI agent plugged into them. Anyone can do this, with or without my template.

As my README proudly declares (yes, I absolutely do have AI write my documentation; do you like writing comprehensive technical documentation?):

This methodology was developed and battle-tested on a markets knowledge agent, meant to replicate James Wang’s thought process in markets: ~10,000 pages of scanned financial market reference materials + ~100 web articles, producing 381 concept documents and 54 thesis documents with hybrid BM25 + semantic search. This was further tested on other specialized knowledge areas, including company-specific policy docs (for a “corporate knowledge agent”) and rare research areas (women’s sexual health, given James’s background), to great effect. The generalized version here captures a domain-agnostic methodology so it can be applied to any subject.
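The README excerpt mentions "hybrid BM25 + semantic search" without showing how the two are combined, so here is a minimal sketch of one common way to do it. Everything in it is an assumption for illustration, not the template's actual code: the function names are hypothetical, the "semantic" side uses character-trigram counts as a stand-in for a real embedding model, and the two rankings are merged with reciprocal rank fusion.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avg_len = (sum(len(t) for t in tokenized) / n) or 1.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def toy_embed(text):
    """ASSUMPTION: stand-in for a real embedding model (trigram counts)."""
    s = " ".join(tokenize(text))
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank(scores):
    """Document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def hybrid_search(query, docs, top_k=3, rrf_k=60):
    """Fuse the lexical and semantic rankings with reciprocal rank fusion."""
    lexical = rank(bm25_scores(query, docs))
    q_vec = toy_embed(query)
    semantic = rank([cosine(q_vec, toy_embed(d)) for d in docs])
    fused = Counter()
    for ranking in (lexical, semantic):
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (rrf_k + position)
    return [doc_id for doc_id, _ in fused.most_common(top_k)]


if __name__ == "__main__":
    concept_docs = [
        "Yield curve inversion and recession signals",
        "Company travel policy and expense limits",
        "Bond duration and interest rate risk",
    ]
    print(hybrid_search("expense policy for travel", concept_docs, top_k=2))
```

Rank fusion is used here because BM25 scores and cosine similarities live on different scales; combining positions rather than raw scores avoids having to normalize them. A real setup would swap `toy_embed` for an actual embedding model.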

These were the first, but at this point I have twelve of these specialist “knowledge agents” that handle queries from other agents. Or, obviously, from me. When I’m coding new things that require specialist knowledge, I often start Claude Code in a knowledge agent folder instead of a new folder, so it can draw on the expert knowledge there when planning. Especially for specialized machine learning algorithms or economic models, I get far better results this way than with a “subject-agnostic” model, even a really big frontier model.

Readers have given me the feedback that infographics are useful. This seems pretty self-explanatory, but let me know if you like them!

In general, I have used Claude Opus in these knowledge agent “harnesses” (one way to describe this “superstructure” around the AI). That setup pairs a really big model with injected knowledge from the harness. However, I’ve found that I get very, very good results even with far smaller models. The LLM is merely the “engine”: all of the expert knowledge comes from my knowledge agent system, which surfaces the relevant knowledge at the right time.

Relevance, of course, is key. As most of you know, you can’t just drag 10,000 pages of documents into your chat window. Even if you could, you’d get a mess of irrelevant information drowning the LLM. In practice, you’ll probably run out of context and never get an answer, if the platform even lets you try.

This has allowed me to move many of my agents from Anthropic’s Claude (ahead of a billing change that would have cost me $2k+ per month and has since been delayed) to a locally run open-weight Qwen model. It’s a tiny fraction of the size of Claude Opus (the flagship model) and is able to run on hardware I have plugged in at home. It’s next to my feet right now as I’m typing.

(As a random note, you do have to point your non-Claude agent at CLAUDE.md or copy it into AGENTS.md. AGENTS.md is often the convention for non-Claude systems, the equivalent “agent needs to read these instructions first” file.)
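To make the point about relevance and limited context concrete, here is a rough sketch of packing only the best-ranked chunks into a fixed budget before they reach the model. It is an illustration under stated assumptions, not how the template does it: `estimate_tokens` and `build_prompt` are hypothetical names, the four-characters-per-token estimate is a crude heuristic rather than a real tokenizer, and the chunks are assumed to arrive already sorted by relevance.

```python
def estimate_tokens(text):
    """ASSUMPTION: crude estimate of ~4 characters per token, not a real tokenizer."""
    return len(text) // 4 + 1


def build_prompt(question, ranked_chunks, max_tokens=2000):
    """Pack the most relevant chunks that fit, then append the question.

    ranked_chunks is a list of strings, most relevant first.
    """
    header = "Answer using only the reference notes below.\n\n"
    budget = max_tokens - estimate_tokens(header) - estimate_tokens(question)
    selected = []
    for chunk in ranked_chunks:
        cost = estimate_tokens(chunk)
        if cost > budget:
            # Too large for the remaining budget; a later, smaller chunk may still fit.
            continue
        selected.append(chunk)
        budget -= cost
    notes = "\n\n".join(f"[{i}] {c}" for i, c in enumerate(selected, start=1))
    return f"{header}{notes}\n\nQuestion: {question}"


if __name__ == "__main__":
    chunks = [
        "Duration measures a bond's price sensitivity to interest rate changes.",
        "A very long appendix. " * 500,
        "Convexity describes how duration itself changes as rates move.",
    ]
    print(build_prompt("Why do long bonds fall more when rates rise?", chunks, max_tokens=200))
```

With a small budget, the oversized appendix is skipped while the two short notes are kept, which is the behavior you want from a smaller local model with a tighter context window.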

In my prior article, I mentioned I could run Qwen 3.6 27B on an old 3090 Ti. I have admittedly upgraded (right) to a 5090 since I've found I've been able to utilize my local agents a lot. Is it overkill? Yeah, probably. Also, yes, this looks like a gaming setup... because that's what consumer GPUs are targeted toward.

How does it work?

The simple answer, as said, is that it injects the right, specific knowledge into the AI agent at the right time. The longer answer? Let’s first talk about the forms that knowledge takes in LLMs.

First, a significant portion of frontier models’ huge footprint is “knowledge.” I might call it pseudo-knowledge, since it’s probabilistic and there’s no guarantee it’ll give you the right answer... but the biggest models have been trained on an enormously broad set of data. This is captured in numerical weights as “parametric” knowledge. While that’s very useful if you’re casually asking Claude Opus or GPT-5.5 about some random topic, it’s entirely irrelevant if you either already have the data you want to reference or the data isn’t publicly available anyway, so the model could never have trained on it. The latter is quite common in fields that are specialist (areas of medical research), secretive (high finance), or...
