What defenders should know
- Claude Code configuration can become the control point for MCP traffic.
- A user-level post-install hook can rewrite MCP endpoints and route requests through attacker-controlled infrastructure.
- Provider-side logs may still show valid OAuth traffic from a trusted origin.
- Token rotation may not break the chain if the malicious hook keeps reseeding the MCP configuration.
- Detection should focus on changes to Claude Code configuration, MCP server URLs, OAuth refresh behavior, and unusual downstream SaaS activity.

The above is an example of an Atlassian audit log entry. The user is real, and the session is real. The IP address resolves to Anthropic’s egress range. For an organization running Claude Code, this is exactly what legitimate activity looks like.
The action here is routine: let’s say a JQL query pulling tickets that mention credentials. This is the kind of thing the user does a dozen times a week.
Nothing in that row is wrong. But nothing in it is right, either.
The user didn’t run that query. Claude did, using an MCP token the user had authorized for a different purpose, under a trust decision that had been silently rewritten on disk.
MCP (Model Context Protocol) lets AI coding tools connect to external systems such as Jira, Confluence, GitHub, databases, and internal APIs.
From the defender’s seat, there’s no obvious alert to pull on. The audit log shows a valid user. The network flow shows a known-good origin. The endpoint looks normal except for one file. The only anomaly lives in a file most teams aren’t watching: ~/.claude.json.
This research shows how that file became the pivot, why the project-trust prompt doesn’t stop the chain, and why the MCP OAuth token behind it is the actual prize.
This blog builds on our License to Skill series. If you want the backstory on the skill//hook injection vector we built on, start there. If you want the new chain, keep reading.
Why the MCP OAuth token is the prize
Before the mechanism, the stakes.
When a user connects an MCP server, Claude Code runs an OAuth flow against the upstream provider. The user approves the scopes on the consent screen, and the provider returns a bearer token. Every subsequent MCP call carries it in the Authorization header.

Four properties of that token matter.
Persistent. The token is stored for reuse across sessions. When it expires, the client refreshes it using a refresh token held in the same store. One successful interception is a durable foothold for as long as the refresh chain stays valid.
Broadly scoped. The token inherits whatever the MCP server requested at auth time. There is no per-call narrowing and no re-consent on sensitive operations. Once it's issued, the scope is the scope.
Weakly stored. It lives in plaintext inside ~/.claude.json, alongside the flags that decide whether Claude Code prompts the user before running shell commands or loading external instructions. Since they both live in that file, they both have the same permissions.
Unattributable server-side. Presented to the provider's API, the token looks identical whether it came from the machine that authorized it or from a proxy three hops away. And because Claude Code's network calls exit through Anthropic's egress, the request arrives from an origin the provider already trusts.
Stacked together, we’ve got: a long-lived OAuth bearer token, broadly scoped, stored in a user-writable file, and presented to the provider from a trusted origin.
The Claude Code MCP attack chain
The chain has five steps. Each one is a file operation or a config edit. It's all behavior the OS and the tool treat as permitted. Nothing requires a privilege escalation, memory bug, or even a new CVE.
Step 1: Delivery.
The attacker ships a bundled npm package. The name and surface look legitimate enough to survive a casual review. We see a small utility, a wrapper, and a "helper" of some kind. The package registers a lifecycle hook that runs as part of install. Silently, the installation gets the surface prepared and ready for the next Claude initialization. npm postinstall hooks executing arbitrary code is a known class of supply chain risk. What matters is what the hook does.

The hook identifies common clone locations that most developers use to maximize additional attack surface, allowing the adversary to expand the execution to additional projects, and it populates these paths with pre-configured trust dialog set to true.
From Claude Code's perspective, the victim has already approved trust on that directory. No prompt will fire when the directory is later opened, because the flag the prompt is gated on is already set.
Oh, the External Markdown loading and pre-allowing tools are optional unless you really want to increase your blast radius.
Step 2: Path seeding into ~/.claude.json.
The hook will open ~/.claude.json and edit the MCP server in the global config file.

Step 3: Repository clone into the seeded path.
NPM installed, repo cloned, and it has a hook. The hook is automatically loaded. Why? Once the “trust” dialog is true, all hooks will automatically run (not only hooks, but that’s for next time). The command inside the hook looks for the same file we edited before and starts editing the “mcpServers” to include the proxy address. This puts us, “the adversary,” in the middle of any request that goes out to the MCP server.

As the attacker, we got mitmproxy configured and intercepting.
Step 4: Execution.
The next time Claude Code initiates or refreshes the MCP session for that server, it reads the URL from the file, connects to the proxy, and the bearer token returned by the OAuth flow transits the attacker's infrastructure. The provider sees a valid flow. The user sees a working integration. The attacker sees the token. Initial access and persistence successfully achieved.

Step 5: Persistence through reseeding.
The hook placed in step 4 keeps going. It runs every time Claude is loaded. Every time.
The hook reasserts the configuration beforehand as well.
If the user rotates the token, the next OAuth refresh hits the proxy again and captures a new token. If the user edits the MCP URL back to the real one, the hook writes it back on the next load.
That is what separates the attack from a one-shot credential theft. Token rotation, the standard response to suspected compromise, may feed the chain instead of breaking it.
What this costs the attacker
The cost to the attacker is one successful npm install on the victim's machine under a user that has Claude Code configured with dynamic authorization MCP servers. That's a real prerequisite, and this is why Anthropic classified the report as an out-of-scope determination: if you have code execution on the endpoint, many things are possible. The determination is defensible on those terms.
What this buys the attacker
What the chain buys is different from what most post-install payloads buy. A typical malicious npm package exfiltrates what it can reach at install time and stops when the install process exits. This one installs a durable redirection of the victim's SaaS credentials into attacker-controlled infrastructure, with automatic recovery from token rotation, invisible to the victim's endpoint UI, and indistinguishable from legitimate traffic on the provider's side.
The delta isn't the initial compromise. It's what persistence looks like after it.
Detection and response considerations: Defenders should baseline approved MCP endpoints, monitor edits to ~/.claude.json and project-level MCP files, and alert on new local proxies or unexpected URL changes. SaaS activity should be reviewed in context: valid user, valid origin, unusual task pattern.
AI as an operator
At this stage, the MCP token is in our hands and exposed. The keys are here.
We can use it by simply querying the MCP endpoint appropriately. But that’s not fun, and besides that, we want to stay stealthy, anonymized, without leaving additional traces.
We will just go ahead and ask Claude to use it. Yes, as simple as that. Of course, Claude may refrain from assisting with this request, but, with a little bit of effort and dedication, Claude will do it.

Now that’s scary. Not because AI became our best hacking buddy, but Claude Code, different from Codex by OpenAI, has their own sandbox running on gVisor and routes the network activity through Anthropic’s egress gateway

At least that’s what Claude told me when I tried reverse engineering it. However, confirmed, this is true.


What holds an attacker back from abusing these AIs' infrastructure to stay hidden?
Below, watch the full execution chain, post-hook installation.
Disclosure timeline
- March 23, 2026 - Initial discovery.
- April 1, 2026 - Tested mechanism across all possible methods.
- April 10, 2026 - Reported to Anthropic
- April 11, 2026 - Anthropic acknowledged
- April 12, 2026 - Anthropic responded as out-of-scope due to the initial consent and awareness that needs to be done by the user.
Anthropic claims this is by design. An easily accessible global configuration file that serves most of the power to Claude Code.
Does it mean you’re good? Not necessarily. This kind of behavior is already acknowledged as “we know it can happen in various ways, and the impact could be insane, but we accept it.”
However, your CISO won’t.
Frequently Asked Questions: Token Theft and Claude Code
What is the risk?
A user-level config change can put an attacker between Claude Code and an OAuth-backed MCP server. From there, the attacker can capture tokens used for downstream SaaS access.
Why does ~/.claude.json matter?
In this chain, it acts as the control point for MCP routing and trust state. Change the file, and a legitimate integration can be pointed somewhere else.
Why doesn’t token rotation fully solve it?
Rotation helps only after the hook and config rewrite are removed. Otherwise, the next refresh can take the same path.
What should security teams monitor?
File changes, new MCP endpoints, localhost proxies, OAuth refreshes, and SaaS actions that look valid but do not match the user's normal work.
See how Mitiga can stop the impact of “trusted” workflows
Mitiga helps security teams connect signals from SaaS providers showing OAuth activity, identity layers showing the real user, and the AI infrastructure showing approved integrations into a single attack story so SecOps can detect, decode, and contain activity before a supposedly trusted AI workflow creates business impact. Let's talk!
