We scanned the official MCP servers. Here's what we found.

4 critical, 7 high-risk, and prompt injection — in the servers 82,000 developers use by default.

The servers everyone installs first

When you set up Claude Desktop, Cursor, or any MCP-compatible client, the first thing most people do is install servers from modelcontextprotocol/servers — the official reference implementations maintained by Anthropic. Filesystem access, memory, sequential thinking, fetch, git. The basics.

82,000 GitHub stars. Probably millions of installations. These are the servers developers trust by default.

We ran npx decoy-scan against them.

The results

4 critical-risk tools, 7 high-risk tools, 1 poisoned server — across 4 servers and 37 tools.

| Server | Risk | Key findings | |--------|------|-------------| | filesystem | Critical | write_file enables arbitrary file writes. 11 tools accept unconstrained input. No input validation on paths. | | memory | Critical | delete_entities, delete_observations, delete_relations — destructive operations with no confirmation or constraints. | | everything | High | get-env leaks environment variables. gzip-file-as-resource provides file access. 4 tools missing safety checks. | | sequential-thinking | Medium | Prompt injection detected in tool descriptions. Missing input constraints. |

The prompt injection finding in sequential-thinking is worth highlighting. This is a tool whose description contains text that could override agent behavior — in the official server repo.

What "unconstrained input" means in practice

When we say a tool accepts unconstrained input, we mean there's no maxLength, no pattern, no enum constraint on the parameters. An agent — or an attacker controlling an agent — can pass anything.

For filesystem's write_file, that means arbitrary content to arbitrary paths. For memory's delete_entities, it means bulk deletion with no guardrails. These tools work as designed — the issue is that the design assumes a trusted caller.

MCP servers don't get to make that assumption anymore.

Why this matters now

Nicholas Carlini at Anthropic recently demonstrated that current LLMs can autonomously find and exploit zero-day vulnerabilities in hardened software — including the Linux kernel. His team found bugs that have been in the kernel since 2003. These aren't toy demos. These are heap buffer overflows found by a model running Claude Code with a one-paragraph prompt.

His core message: the capability is doubling every four months, and the models we have today are "probably the most significant thing to happen in security since we got the internet."

If models can find zero-days in the Linux kernel, they can find and exploit gaps in your MCP servers. The question isn't whether your tools have issues — it's whether you know about them before someone else's agent does.

Remediation

Every finding from decoy-scan now includes an inline fix recommendation:

Unconstrained input → Add maxLength, pattern, or enum constraints to string parameters
Missing safety checks → Add inputSchema with descriptions, required fields, and type constraints
Prompt injection → Audit tool descriptions for hidden instructions that override agent behavior
Destructive operations → Add confirmation steps or dryRun mode

Run npx decoy-scan --fix for a full per-server remediation plan.

How to scan your own setup

Locally:

npx decoy-scan

In CI/CD:

- uses: decoy-run/decoy-scan@v1

The GitHub Action fails your build on critical tools or prompt injection by default, and uploads results to the GitHub Security tab via SARIF.

Zero dependencies. Zero config. Zero account required.

What we're not saying

We're not saying these servers are malicious. They're reference implementations — they're supposed to show what's possible, not what's hardened for production. The filesystem server is meant to give an agent file access. That's the point.

What we are saying is that most developers install these without thinking about what they're granting. A tool called write_file with no path constraints is a feature in a demo and a vulnerability in production. The gap between those two contexts is where attacks happen.

Scan first. Fix what matters. Install tripwires for the rest.