The sorry state of skill distribution - The Trail of Bits BlogPage content
Public skill marketplaces are being flooded with malicious skills that steal credentials, exfiltrate data, and hijack agents. In response, a segment of the security industry released skill scanners, a new family of tools designed to detect malicious skills before they’re installed. But we tested them, and they don’t work.<br>We recently bypassed ClawHub’s malicious skill detector, Cisco’s agent skill scanner, and all three of the scanners integrated into skills.sh. These were not advanced attacks: it took us less than an hour to conceive and implement three of the four malicious skills in trailofbits/overtly-malicious-skills, using standard tricks and rapid inspection of the scanner source code. The fourth malicious skill took a few hours, but only because the prompt injection required some trial and error. Our findings demonstrate that even when skill scanners have some defenses, their static nature gives an adversary unlimited bites at the apple to tweak an attack until it finds a way through.<br>Why skill security matters<br>Software supply chains have long been the soft underbelly of computer security. As fragile infrastructure susceptible to both insider threats and external attackers, these supply chains were vulnerable enough when malicious code was the sole vector of compromise. But the rise in agentic systems has spawned a new style of dependency—the skill—and with it a whole new ecosystem of marketplaces and distribution channels that now run alongside traditional package managers. Malicious skills can embed harmful instructions in natural language (e.g., a SKILL.md prompt) as well as code, giving them whole new avenues to attack any system they are given access to.<br>Compounding the issue, the distribution channels for skills have proved to be ship-first, secure-later. There are already multiple types of distribution channels for how users find skills and deploy them to their agents:<br>ZIP archives distributed out-of-band and then uploaded manually or via API to agent harnesses like Anthropic’s claude.ai and OpenAI’s Codex;
Curated marketplaces like anthropics/skills and trailofbits/skills-curated; and
Public marketplaces like skills.sh and clawhub.ai.
The first two methods can plausibly exclude malicious skills through procedural controls on where skills come from and who is allowed to approve their use. On the other hand, public marketplaces are one-stop, one-”click-to-install” shops that have been flooded with fake skills preying on unsuspecting users. These malicious skills aim to trap an unwary developer or OpenClaw agent, compromising the user’s system through arbitrary code execution or instructions for the agent to send sensitive data to a remote server.<br>Following a spate of compromises and attack demonstrations, several security companies have launched scanners intended to detect these malicious skills. We wanted to understand how well these systems defend users from them. We initially tested Cisco’s skill-scanner, where we found several bypasses and submitted changes to harden the system. Shortly thereafter, Vercel’s skills.sh launched integrations with scanners from Gen, Socket, and Snyk, and OpenClaw partnered with VirusTotal to scan skills in ClawHub; we tested these scanners, too.<br>Bypassing ClawHub scanning<br>We’ll start with ClawHub (built by OpenClaw, for OpenClaw agents). The platform uses a two-part scanning solution. One is an integration with VirusTotal, which checks for known malware signatures and uses a proprietary scanner called Code Insight, built on Gemini 3 Flash, under the hood. The other scanner is a custom harness and prompt for a guard model, by default GPT 5.5.<br>We bypassed both checks with our first attack. The approach is dead simple in both design and implementation: it simply prepends 100,000 newlines between some boilerplate and our overtly malicious code. The OpenClaw scanner truncated the file and missed the malicious content entirely, while the VirusTotal scanner model seemed to become confused. And unless users are paying close attention, it’s easy to miss the long scroll wheel in the web UI.<br>Figure 1: OpenClaw scanner misses malicious content<br>On the plus side, OpenClaw takes a relatively strict approach to skill packaging: only certain whitelisted file types will be included in the distributed skills; no binaries or archives are allowed. This significantly constrains the types of attacks available without placing any meaningful limits on skill functionality. Not so, however, for our next targets.<br>Bypassing skills.sh and Cisco skill scanning<br>The next set of scanners that we looked at operate on arbitrary git repositories, which allows us a grab bag of tricks involving binary files that both their simple pattern-matching and LLM-based strategies struggle to spot.<br>The skills.sh scanning works through integration with three external services: Gen Agent Trust Hub, Socket, and Snyk. The Cisco...