Three supply chain failures. Six weeks. The alignment debate is happening in the wrong building.
The debate consuming most of the bandwidth (alignment risks, misaligned objectives, the speculative dangers of a system that wants things) is a legitimate problem for a capability threshold we haven't reached yet. Meanwhile, in the infrastructure being built right now to run AI agents in production, three supply chain failures have happened in six weeks. Each one is a preview of what a bad outcome actually looks like.
The threat isn't rogue superintelligence. It's a compromised npm package.
What actually happened
In late March, Anthropic accidentally published 512,000 lines of Claude Code source code to npm. That's the tooling used to instrument and run Claude agents in production environments. A public registry. An accidental push. Over half a million lines of internal implementation details, live.
That same month, a CVE surfaced in an implementation of MCP, the connector standard that wires AI agents to tools, APIs, and data sources. Then LiteLLM, the pip package much of the agent framework ecosystem relies on for model routing, shipped a compromised release.
Three incidents. Six weeks. Same underlying pattern: the operational infrastructure connecting AI models to execution environments is held together by the same CI/CD hygiene, the same package management assumptions, and the same dependency trust chains as any other software project. Except now those packages aren't serving a web app. They're serving agents with native computer use capabilities: systems that can open terminals, write files, and call APIs autonomously.
The attack surface shift
This is the part that changes the calculus.
A compromised npm package in a traditional web stack means an attacker can exfiltrate data, escalate privileges, pivot through the network. Security teams have playbooks for it.
A compromised package in an agentic stack means an attacker has a direct pathway to arbitrary code execution by an autonomous system that already has credentials, tool access, and operator-granted permissions to act on your behalf. It's not a data breach vector. It's a control plane breach. The agent does the attacker's work for them, inside the trust boundary you already established.
Why the posture hasn't caught up
The velocity asymmetry is stark. AI agent deployment has accelerated faster than the operational security practices built around it. Teams are shipping agent infrastructure (MCPs, tool-use pipelines, autonomous execution environments) on the assumption that standard software supply chain hygiene is sufficient.
It isn't, for three reasons specific to this stack:
1. Trust is pre-granted. Agents don't acquire capabilities incrementally; operators configure them upfront. A compromised package inherits every granted permission immediately, so there is no runtime escalation step at which a detection rule could fire.
2. Execution is autonomous. Agents run without per-action human confirmation in most production deployments. The window between "compromised package loaded" and "damage done" can be zero.
3. The attack surface is new enough that nobody has the detection tooling. Traditional runtime security tools aren't instrumenting MCP connections, tool-use chains, or autonomous agent execution paths. The monitoring gap is wide.
What the posture shift looks like
Treat agent infrastructure with the same rigor applied to network infrastructure, not because AI is special but because the blast radius now justifies it.
Four changes that are baseline requirements, not enhancements:
Strict local execution. Agents that need to execute code should do it in isolated, network-constrained sandboxes. Not for every agent. For any agent with file system or shell access.
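One way to approximate this is to run agent-generated code inside a container with no network, a read-only root filesystem, and hard resource caps. A minimal sketch, assuming Docker is available; the image name, limits, and mount path are illustrative, not a prescribed configuration:

```python
import subprocess

def sandbox_cmd(script_path: str, image: str = "python:3.12-slim") -> list[str]:
    """Build a docker invocation that runs agent-generated code with
    no network access, an immutable filesystem, and capped resources."""
    return [
        "docker", "run", "--rm",
        "--network", "none",      # no egress at all
        "--read-only",            # immutable root filesystem
        "--memory", "512m",       # cap memory
        "--pids-limit", "64",     # cap process count
        "--cap-drop", "ALL",      # drop all Linux capabilities
        "-v", f"{script_path}:/task.py:ro",
        image, "python", "/task.py",
    ]

def run_sandboxed(script_path: str, timeout: int = 30) -> subprocess.CompletedProcess:
    # Fail closed: a timeout or nonzero exit is a rejected run, not a retry.
    return subprocess.run(sandbox_cmd(script_path), capture_output=True,
                          text=True, timeout=timeout)
```

The point of building the command in one place is auditability: the sandbox policy lives in one function, not scattered across call sites.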
Pinned dependencies. Every MCP, every model routing package, every tool integration library — pinned to exact versions with lockfile enforcement and automated CVE scanning. This should already be standard practice. At most shops building fast, it isn't.
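pip already supports this via hash-checking mode (`--require-hashes`). The sketch below shows the same check done by hand: verify an artifact's sha256 against a pinned digest before it ever reaches an agent's environment. The lockfile format here is a hypothetical minimal one, not a real pip format:

```python
import hashlib
import hmac
from pathlib import Path

def parse_lock(lock_text: str) -> dict[str, str]:
    """Parse a minimal lockfile of 'name==version sha256:<hex>' lines."""
    pins = {}
    for line in lock_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        req, digest = line.split()
        pins[req] = digest.removeprefix("sha256:")
    return pins

def verify_artifact(path: Path, expected_sha256: str) -> bool:
    """Refuse any artifact whose digest doesn't match the pinned hash."""
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return hmac.compare_digest(actual, expected_sha256)
```

A compromised release like the LiteLLM one fails this check the moment its bytes differ from what was pinned, regardless of what version string it claims.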
Cryptographic signing for MCP connections. Tool call chains should be signed so an agent can verify it hasn't been rerouted through a compromised connector. That is the right model for production agentic systems.
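MCP doesn't specify this today, so the shape below is one possible sketch, not the protocol's mechanism: a shared-secret HMAC over a canonical JSON encoding of the tool call, verified before the agent acts on the response.

```python
import hashlib
import hmac
import json

def sign_tool_call(call: dict, key: bytes) -> str:
    """Sign a canonical JSON encoding of the tool call so the agent
    can detect tampering or rerouting between it and the connector."""
    canonical = json.dumps(call, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()

def verify_tool_call(call: dict, signature: str, key: bytes) -> bool:
    # Constant-time comparison; any altered field invalidates the signature.
    return hmac.compare_digest(sign_tool_call(call, key), signature)
```

In production you'd want per-connector asymmetric keys (e.g. Ed25519) rather than a shared secret, so a compromised connector can't forge signatures for its peers; the HMAC version keeps the sketch stdlib-only.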
Network isolation as a baseline. An agent that needs to call an external API should have explicit egress rules for exactly those endpoints. Not open egress with monitoring. Closed egress with allowlisting.
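In-process, the closed-egress rule is a few lines: deny by default, permit only exact-match hosts over HTTPS. A minimal sketch; `ALLOWED_HOSTS` is a hypothetical endpoint list, and real enforcement belongs at the network layer (firewall or egress proxy), with this check as defense in depth:

```python
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}  # hypothetical: exactly the endpoints the agent needs

def egress_allowed(url: str, allowlist: set[str] = ALLOWED_HOSTS) -> bool:
    """Closed by default: permit only HTTPS requests to exact-match hosts."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowlist
```

Exact-match hostnames matter here: a suffix check would wave through `api.example.com.evil.net`, which is a standard exfiltration trick.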
All of them are underimplemented in agentic infrastructure today. The default mode is "ship it and add security later."
"Later" keeps moving.
The real priority question
The alignment debate is a hard problem at a capability threshold we're working toward. The supply chain problem is a hard problem at a capability threshold we're already past.
We've been building AI agent infrastructure at deployment velocity while allocating security investment at alignment-debate velocity. Those two timelines don't match the actual risk distribution.
Anthropic's npm incident closed within hours. LiteLLM pushed a patch. The MCP implementation CVE got mitigated. But each incident is also a worked example of what a targeted attack on the same infrastructure looks like when someone is trying. The incident closed. The attack surface didn't.
The boring stuff is the present danger. Treat it like it is.