34,266 repos were scanned: 1 in 4 orgs showed gaps in AI agent config files

We Scanned 34,266 Repos. 1 in 4 Orgs Showed Gaps In AI Agent Config Files


AI in Software Engineering,

AgentLinter

23/06/2026


Codacy



Teams are shipping code with AI coding assistants such as Claude Code, Cursor, and GitHub Copilot, increasingly guided by repository-level instruction files. The problem is that most teams treat these files like informal documentation rather than the production-level configuration they have become.

We ran AgentLinter across 34,266 repositories to see how organizations are actually managing their agent instruction files. The findings reveal a consistent gap: ambiguous instructions, missing failure behavior, and security risks that would never pass review in application code. This article breaks down what we found and what it takes to enforce the same discipline on agent configs that you already apply to the rest of your codebase.

In this article:

Why agent configuration files require security controls

What we found across 34,266 repositories

Finding 1: Ambiguity is the most common issue

Finding 2: Missing escape hatches create runaway behavior

Finding 3: Security issues are lower volume but higher severity

Why defined rules are not the same as enforced rules

What good enforcement looks like for agent instructions

How to harden agent config files this week

Why agent configuration files require security controls

Securing AI agent configuration files in software repositories comes down to a few core practices: externalize secrets instead of hardcoding them, enforce strict access controls, validate configurations for clarity and safety, and continuously scan for leaked credentials. The problem is that most teams have not yet started treating agent config files with the same discipline they apply to application code. A McKinsey survey found that only about 30% of organizations had reached AI governance maturity level 3 or higher.

AI coding agents like Claude Code, Cursor, and GitHub Copilot rely on repository-level instruction files to guide their behavior. You might recognize names like CLAUDE.md, .cursorrules, copilot-instructions.md, or AGENTS.md. The files tell the agent what coding standards to follow, which directories to avoid, how to handle errors, and what commands it can run.
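As a rough illustration of what an inventory step might look like, the sketch below walks a repository and lists the instruction files named above. The function name `find_agent_files` and the decision to skip `.git` are assumptions made for this example; this is not how AgentLinter discovers files.

```python
from pathlib import Path

# File names taken from the paragraph above; a real tool may recognize more.
AGENT_FILE_NAMES = {
    "CLAUDE.md",
    ".cursorrules",
    "copilot-instructions.md",
    "AGENTS.md",
}


def find_agent_files(repo_root: str) -> list[Path]:
    """Return agent instruction files under repo_root, sorted by path.

    Hypothetical helper for illustration only.
    """
    root = Path(repo_root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file()
        and path.name in AGENT_FILE_NAMES
        and ".git" not in path.parts  # skip version-control internals
    )


if __name__ == "__main__":
    for path in find_agent_files("."):
        print(path)
```

Knowing which of these files exist, and where, is a reasonable first step before applying any review or linting policy to them.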

Here is where the risk comes in. Agent instruction files influence how agents read code, make edits, run shell commands, and interact with sensitive data. Yet teams often author and review them with less rigor than a typical pull request. When an instruction file is vague, contradictory, or contains hardcoded secrets, the agent amplifies that weakness across every task it performs.
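To make the hardcoded-secrets risk concrete, here is a minimal sketch of a line-by-line check on an instruction file's text. The two patterns (the `AKIA` prefix used by AWS access key IDs, and a generic `name = value` assignment) are illustrative assumptions, not AgentLinter's actual rules, and a real scanner would need far broader coverage and entropy checks.

```python
import re

# Illustrative patterns only; assumed for this example.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "generic_assignment": re.compile(
        r"(?i)\b(api[_-]?key|secret|token|password)\b\s*[:=]\s*['\"]?[A-Za-z0-9_\-]{16,}"
    ),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each suspicious line."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_no, name))
    return findings


if __name__ == "__main__":
    sample = (
        "Use API_KEY=abcd1234abcd1234abcd for deploys\n"
        "Read the key from the environment variable DEPLOY_TOKEN.\n"
    )
    print(find_secrets(sample))
```

In the sample, only the first line is flagged: the second names an environment variable rather than embedding a value, which is the externalized form the practices above recommend.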

Traditional documentation is interpreted by humans who can infer intent and ignore irrelevant details. Agent instruction files, on the other hand, are consumed by LLM-based systems that act on the text directly. Human docs guide judgment. Agent configs shape execution.

What we found across 34,266 repositories

We ran AgentLinter, an open-source static analysis tool for AI agent configuration files, across 34,266 repositories with the feature enabled. The scan covered 1,353 organizations.

The results showed a clear gap between how teams treat application code and how they treat agent instructions:

1,604 repositories had issues flagged in their agent config files

354 organizations had at least one repository with findings

Over 13,000 issues related to comprehensibility and clarity

Nearly 5,000 issues related to missing escape hatches

Approximately 1,150 issues related to security risks

The findings break down into three categories: ambiguity, missing failure behavior, and security vulnerabilities. Each category represents a different kind of risk, but they share a common root cause. Teams are giving agents production-level influence without production-level enforcement over the rules that guide them.
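The "1 in 4 orgs" figure in the title can be checked directly from the counts reported above. The short calculation below uses only those published numbers; the variable names are our own.

```python
# Counts as reported in this article.
repos_scanned = 34_266
orgs_scanned = 1_353
repos_with_findings = 1_604
orgs_with_findings = 354

org_rate = orgs_with_findings / orgs_scanned
repo_rate = repos_with_findings / repos_scanned

print(f"Organizations with findings: {org_rate:.1%}")  # 26.2%
print(f"Repositories with findings: {repo_rate:.1%}")  # 4.7%
```

Roughly 26% of organizations had at least one flagged repository, while under 5% of all scanned repositories were flagged.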

Finding 1: Ambiguity is the most common issue

The most frequent problem we found was unclear instructions. More than 7,700 instances of undefined terms appeared across the scanned repositories. Other common issues included vague directives and overly complex sentences that an agent cannot reliably interpret.

LLM-based agents are highly sensitive to unclear language. When an instruction says "format code properly" or "follow best practices," the agent has to infer what that means. Different agents, or even the same agent on different runs, may interpret it differently.
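A very simple version of this kind of check can be sketched as a phrase lookup. The phrase list below contains only the two examples quoted above and is an assumption for illustration; AgentLinter's real ambiguity rules are not described in this article and are likely more sophisticated.

```python
# Example phrases quoted in the paragraph above; extend as needed.
VAGUE_PHRASES = (
    "format code properly",
    "follow best practices",
)


def flag_vague_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, phrase) for each vague phrase found."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for phrase in VAGUE_PHRASES:
            if phrase in lowered:
                findings.append((line_no, phrase))
    return findings


if __name__ == "__main__":
    sample = (
        "Follow best practices when naming.\n"
        "Use 2-space indentation for TypeScript files.\n"
    )
    print(flag_vague_lines(sample))
```

A fixed phrase list catches only known offenders, so it is best read as a starting point rather than a complete ambiguity detector.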

Consider the difference between weak and concrete instructions:

| Weak instruction | Concrete instruction |
| --- | --- |
| "Be helpful." | "When a user asks for a code change, implement the change and explain what you modified." |
| "Format code properly." | "Use 2-space indentation for TypeScript files." |
| "Use our standard style." | "Follow the... |
