We Scanned 34,266 Repos. 1 in 4 Orgs Showed Gaps In AI Agent Config Files
AI in Software Engineering,
AgentLinter
23/06/2026
Codacy
9 mins read
Teams are shipping code with AI coding assistants such as Claude Code, Cursor, and GitHub Copilot, and those assistants are increasingly guided by repository-level instruction files. The problem is that most teams treat these files like informal documentation rather than the production-level configuration they have become.
We ran AgentLinter across 34,266 repositories to see how organizations are actually managing their agent instruction files. The findings reveal a consistent gap: ambiguous instructions, missing failure behavior, and security risks that would never pass review in application code. This article breaks down what we found and what it takes to enforce the same discipline on agent configs that you already apply to the rest of your codebase.
In this article:
Why agent configuration files require security controls
What we found across 34,266 repositories
Finding 1: Ambiguity is the most common issue
Finding 2: Missing escape hatches create runaway behavior
Finding 3: Security issues are lower volume but higher severity
Why defined rules are not the same as enforced rules
What good enforcement looks like for agent instructions
How to harden agent config files this week
Why agent configuration files require security controls
Securing AI agent configuration files in software repositories comes down to a few core practices: externalize secrets instead of hardcoding them, enforce strict access controls, validate configurations for clarity and safety, and continuously scan for potentially leaked credentials. The problem is that most teams have not yet started treating agent config files with the same discipline they apply to application code. A McKinsey survey found that only about 30% of organizations are at AI governance maturity level 3 or higher.
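To make the "scan for leaked credentials" practice concrete, here is a minimal Python sketch of a line-by-line check over the text of an instruction file. The pattern names, the regular expressions, and the function find_secrets are illustrative assumptions, not AgentLinter's rules; a production scanner would use a much larger, maintained rule set.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    # AWS access key IDs start with "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # PEM-encoded private key header.
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    # A credential-like name assigned a long literal value, e.g. api_key = "....".
    "inline_credential": re.compile(
        r"(?i)\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*['\"]?([A-Za-z0-9_\-/+]{16,})"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


if __name__ == "__main__":
    sample = (
        "Use the staging API for all requests.\n"
        'api_key = "abcd1234abcd1234abcd1234"\n'
        "Read the key from the API_KEY environment variable.\n"
    )
    for lineno, name in find_secrets(sample):
        print(f"line {lineno}: possible {name}")
```

Only the second line of the sample is flagged. The third line mentions API_KEY but assigns no literal value, which is the externalized form the practice above recommends.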
AI coding agents like Claude Code, Cursor, and GitHub Copilot rely on repository-level instruction files to guide their behavior. You might recognize names like CLAUDE.md, .cursorrules, copilot-instructions.md, or AGENTS.md. These files tell the agent what coding standards to follow, which directories to avoid, how to handle errors, and what commands it can run.
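As a first step toward reviewing these files, it helps to know where they live. The sketch below walks a repository checkout and lists files whose names match the four mentioned above. The function name find_agent_configs and the choice to skip the .git directory are assumptions for illustration; the set of file names a given tool actually reads may differ.

```python
from pathlib import Path

# File names taken from the article; individual tools may recognize others.
AGENT_CONFIG_NAMES = {
    "CLAUDE.md",
    ".cursorrules",
    "copilot-instructions.md",
    "AGENTS.md",
}


def find_agent_configs(repo_root: str) -> list[Path]:
    """Return every agent instruction file under repo_root, skipping .git internals."""
    root = Path(repo_root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.name in AGENT_CONFIG_NAMES
        and ".git" not in path.relative_to(root).parts
    )


if __name__ == "__main__":
    for path in find_agent_configs("."):
        print(path)
```

Running it from the root of a checkout prints one path per instruction file, which gives a quick inventory of what needs review.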
Here is where the risk comes in. Agent instruction files influence how agents read code, make edits, run shell commands, and interact with sensitive data. Yet teams often author and review them with less rigor than a typical pull request. When an instruction file is vague, contradictory, or contains hardcoded secrets, the agent amplifies that weakness across every task it performs.
Traditional documentation is interpreted by humans who can infer intent and ignore irrelevant details. Agent instruction files, on the other hand, are consumed by LLM-based systems that act on the text directly. Human docs guide judgment. Agent configs shape execution.
What we found across 34,266 repositories
We ran AgentLinter, an open-source static analysis tool for AI agent configuration files, across 34,266 repositories with the feature enabled. The scan covered 1,353 organizations.
The results showed a clear gap between how teams treat application code and how they treat agent instructions:
1,604 repositories had issues flagged in their agent config files
354 organizations had at least one repository with findings
Over 13,000 issues related to comprehensibility and clarity
Nearly 5,000 issues related to missing escape hatches
Approximately 1,150 issues related to security risks
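The "1 in 4" figure in the headline follows directly from the counts above. A short check of the arithmetic, using only the numbers reported in this article (the variable names are ours):

```python
# Counts reported in the article.
repos_scanned = 34_266
orgs_scanned = 1_353
repos_with_findings = 1_604
orgs_with_findings = 354

# Share of organizations and repositories with at least one finding.
org_rate = orgs_with_findings / orgs_scanned
repo_rate = repos_with_findings / repos_scanned

print(f"{org_rate:.1%} of organizations, {repo_rate:.1%} of repositories")
# prints "26.2% of organizations, 4.7% of repositories"
```

Roughly 26% of organizations had at least one flagged repository, which is the "1 in 4" in the headline, while under 5% of individual repositories were flagged.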
The findings break down into three categories: ambiguity, missing failure behavior, and security vulnerabilities. Each category represents a different kind of risk, but they share a common root cause. Teams are giving agents production-level influence without production-level enforcement over the rules that guide them.
Finding 1: Ambiguity is the most common issue
The most frequent problem we found was unclear instructions. More than 7,700 instances of undefined terms appeared across the scanned repositories. Other common issues included vague directives and overly complex sentences that an agent cannot reliably interpret.
LLM-based agents are highly sensitive to unclear language. When an instruction says "format code properly" or "follow best practices," the agent has to infer what that means. Different agents, or even the same agent on different runs, may interpret it differently.
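A simple way to catch some of this before an agent does is a phrase-based check. The sketch below flags lines that contain vague wording. The phrase list and the function flag_vague_lines are hypothetical and chosen for illustration; they are not AgentLinter's rule set, and a real linter would need more than keyword matching to detect undefined terms or overly complex sentences.

```python
import re

# Hypothetical phrase list for illustration only.
VAGUE_PHRASES = [
    "properly",
    "best practices",
    "standard style",
    "be helpful",
    "as appropriate",
]

# One case-insensitive pattern that matches any phrase on word boundaries.
_VAGUE_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in VAGUE_PHRASES) + r")\b",
    re.IGNORECASE,
)


def flag_vague_lines(text: str) -> list[tuple[int, list[str]]]:
    """Return (line_number, matched_phrases) for each line containing vague wording."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        matches = [m.group(0).lower() for m in _VAGUE_RE.finditer(line)]
        if matches:
            findings.append((lineno, matches))
    return findings


if __name__ == "__main__":
    instructions = (
        "Format code properly.\n"
        "Use 2-space indentation for TypeScript files.\n"
        "Follow best practices.\n"
    )
    for lineno, phrases in flag_vague_lines(instructions):
        print(f"line {lineno}: vague wording {phrases}")
```

In the sample, the first and third lines are flagged, while the concrete indentation rule on the second line passes.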
Consider the difference between weak and concrete instructions:
Weak instruction | Concrete instruction
"Be helpful." | "When a user asks for a code change, implement the change and explain what you modified."
"Format code properly." | "Use 2-space indentation for TypeScript files."
"Use our standard style." | "Follow the...