GitHub - T-Chartrand/tuningfork: Grounding rules for LLM agents, derived from human reality-testing · GitHub
/" data-turbo-transient="true" />
Skip to content
Search or jump to...
Search code, repositories, users, issues, pull requests...
-->
Search
Clear
Search syntax tips
Provide feedback
--><br>We read every piece of feedback, and take your input very seriously.
Include my email address so I can be contacted
Cancel
Submit feedback
Saved searches
Use saved searches to filter your results more quickly
-->
Name
Query
To see all available qualifiers, see our documentation.
Cancel
Create saved search
Sign in
/;ref_cta:Sign up;ref_loc:header logged out"}"<br>Sign up
Appearance settings
Resetting focus
You signed in with another tab or window. Reload to refresh your session.<br>You signed out in another tab or window. Reload to refresh your session.<br>You switched accounts on another tab or window. Reload to refresh your session.
Dismiss alert
{{ message }}
T-Chartrand
tuningfork
Public
Notifications<br>You must be signed in to change notification settings
Fork
Star
main
BranchesTags
Go to file
CodeOpen more actions menu
Folders and files<br>NameNameLast commit message<br>Last commit date<br>Latest commit
History<br>15 Commits<br>15 Commits
docs
docs
examples
examples
src/tuningfork
src/tuningfork
tests
tests
.gitignore
.gitignore
LICENSE
LICENSE
README.md
README.md
pyproject.toml
pyproject.toml
View all files
Repository files navigation
tuningfork
Grounding rules for LLM agents, derived from human reality-testing.
Humans who must routinely distinguish real perception from convincing internal<br>fabrication have spent decades refining practical checks for it. Those checks<br>turn out to map directly onto the agent hallucination problem — often more<br>cleanly than the framings in the ML literature. tuningfork is that mapping,<br>written down as nine rules and shipped as a small, dependency-free Python<br>reference implementation.
The name comes from one of those human techniques: a physical tuning fork held<br>to the ear interrupts auditory hallucination through an independent channel.<br>It doesn't argue with the false signal — it breaks the state. That is the<br>design principle of this entire library.
The core insight
A check terminates when the verifier sits outside the system being doubted.
A model re-reading its own output shares its own failure modes — it can<br>fluently confirm its own fabrication. A grep, a parser, a checksum, an exit<br>code cannot. One deterministic confirmation from an independent channel is<br>final; a hundred same-model re-checks are not. Everything here follows from<br>that: the environment is the source of truth, and the model's memory is a<br>cache that may be stale.
The nine rules
Rule<br>Phase<br>One-liner
G0 Asymmetric Trust<br>governs all<br>Content can convict, but never acquit — trust flows from source-tracing only
G1 Verify-Before-Assert<br>foresee<br>A claim that could be tool-checked must be, before it's stated
G2 Closed-Loop Execution<br>recognize<br>Report observed results, never issued commands. Read-only observations are terminal
G3 Disagreement Triangulation<br>recognize<br>Tool beats memory; one independent check on surprises; one deterministic confirmation is final
G4 Negative-Space Probing<br>foresee<br>Probe for existence before relying on remembered entities; keep a catalog of known fabrication signatures
G5 Reproducibility Snapshot<br>snap out<br>After a correction, rebuild state from tool output only — nothing from the broken narrative carries over
G6 Cost-Tiered Budget<br>continuous<br>Tier verification by blast radius, decided before generation; suspiciously perfect claims get their tier raised
G7 Passive Independent Validators<br>continuous<br>Cheap deterministic monitors run on everything and never ask the generator's permission
G8 Source Re-attribution<br>after the verdict<br>A verified-false output is evidence about the generator — mine it; belief and action are decoupled
Full text with rationale: docs/framework.md · The story behind it: docs/essay.md
Quick start
pip install -e .
from tuningfork import (GroundedAgent, ValidatorBank,<br>CitationValidator, PathValidator, JsonBlockValidator)
bank = ValidatorBank([<br>CitationValidator(valid_source_ids=["1", "2", "3"]),<br>PathValidator(evidence_paths=tool_returned_paths),<br>JsonBlockValidator(),<br>])
agent = GroundedAgent(generate=my_llm_callable, bank=bank)<br>result = agent.run("Summarize sources [1]-[3] and list the config files involved.")
print(result.tier.rationale) # how the claim was priced before generation<br>print(result.report.summary()) # what the independent channels observed<br>print(result.trustworthy) # validators' verdict, not the model's
The harness permits exactly one regeneration pass on validator failure —<br>fed the validator evidence, not an apology prompt. A second failure is<br>reported as unresolved, because retrying the same channel is re-checking the<br>check.
The child agent
v0.3.0 adds a small runnable agent with the overlay on: an...