AI Coding: AI Fails & Horror Stories | When AI Fails


When AI Goes Wrong

Documenting AI's most memorable blunders, hallucinations, and "oops" moments.

December 8, 2025 AI Coding #Claude Code #Claude #Data Loss

Claude Code Accidentally Wipes Entire Mac

A user asked Claude Code to clean up packages in an old repo. Claude ran rm -rf with ~/ accidentally included, wiping their entire home directory...

A developer was using Claude Code to clean up packages in an old repository when disaster struck. Claude executed a deletion command that accidentally included the user's home directory, wiping their entire Mac.

The catastrophic command was: `rm -rf tests/ patches/ plan/ ~/`

That trailing `~/` meant Claude deleted not just the intended directories, but everything in the user's home folder. The damage was extensive: their entire Desktop gone, Documents and Downloads erased, Keychain deleted (breaking authentication everywhere), Claude credentials wiped, and all application support data destroyed.
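
The danger is in shell expansion: before `rm` ever runs, the shell expands `~/` into the full path of the home directory, making it just another deletion target. Below is a minimal sketch of the kind of pre-flight check an agent harness could run on deletion targets; it is purely illustrative and not something Claude Code actually does.

```python
import os
from pathlib import Path

# Hypothetical guard: never allow a deletion target that resolves to the
# home directory or the filesystem root.
PROTECTED = {Path.home().resolve(), Path("/")}

def safe_to_delete(target: str) -> bool:
    resolved = Path(os.path.expanduser(target)).resolve()
    return resolved not in PROTECTED

for arg in ["tests/", "patches/", "plan/", "~/"]:
    print(arg, "->", "ok" if safe_to_delete(arg) else "REFUSED")
# "~/" expands to the home directory and is refused here; in the incident the
# shell expanded it the same way and rm -rf removed everything beneath it.
```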

The error message at the end said "current working directory was deleted"—by then, there was nothing left to save. The user posted on Reddit asking if anyone had experienced similar issues and whether the damage was reversible, lamenting "so much work lost."

Read Full Story on Reddit

November 27, 2025 AI Coding #antigravity #gemini

Gemini in Antigravity IDE Deletes User's Entire Hard Drive

A user's Gemini AI assistant in the Antigravity IDE deleted the contents of their entire "D:" hard drive...

A redditor reported that Gemini, running in the Antigravity IDE, deleted the contents of their entire D: drive. The incident was documented with video proof posted in the thread.

Read Full Thread on Reddit

November 16, 2025 AI Coding #Infrastructure #DevOps #Technical Debt

AI-Generated CI/CD Pipeline Causes 120% AWS Bill Spike and Months of Debugging

AI built a CI/CD pipeline in one day instead of three weeks. AWS bills jumped 120% weeks later—the AI missed that dev environments were ephemeral, creating hundreds of orphaned resources.

A team used AI to build a CI/CD pipeline in one day instead of three weeks. The AI absorbed AWS best practices and Kubernetes principles to generate a seemingly perfect pipeline. But within weeks, AWS bills exploded by 120%.

After extensive debugging, the root cause was discovered: the AI had perfectly implemented resource creation for development environments but completely missed that they were supposed to be ephemeral. The pipeline was creating hundreds of orphaned Kubernetes namespaces, load balancers, and EC2 instances every week. Fixing this "one-day miracle" cost months of debugging and rebuilding.
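
For a sense of what "ephemeral" should have meant in practice, here is a rough sketch of the kind of scheduled teardown job the pipeline never generated, assuming dev namespaces carry an `env=ephemeral` label and using the official Kubernetes Python client; the team's actual stack beyond AWS and Kubernetes isn't described in the article.

```python
from datetime import datetime, timedelta, timezone

from kubernetes import client, config

MAX_AGE = timedelta(hours=24)  # assumed TTL for ephemeral dev environments

def delete_stale_dev_namespaces() -> None:
    """Delete namespaces labeled as ephemeral that have outlived their TTL."""
    config.load_kube_config()  # or load_incluster_config() when run as a CronJob
    api = client.CoreV1Api()
    now = datetime.now(timezone.utc)
    for ns in api.list_namespace(label_selector="env=ephemeral").items:
        age = now - ns.metadata.creation_timestamp
        if age > MAX_AGE:
            # Deleting the namespace cascades to its workloads and Services,
            # which also releases the cloud load balancers they provisioned.
            api.delete_namespace(name=ns.metadata.name)

if __name__ == "__main__":
    delete_stale_dev_namespaces()
```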

A similar issue hit an AI-generated Golang API service. The AI produced clean, idiomatic code that worked beautifully in testing. But at thousands of requests per minute, mysterious memory spikes and crashes occurred. The culprit: the AI used a library with speculative memory allocation, pre-allocating far more memory than needed per request—creating massive garbage collector pressure at scale.

The key insight: when development speed increases 50% with AI, technical debt increases 200% or more. Wrong foundations become existential within weeks, not years.

Read Full Article

August 27, 2025 AI Coding #Supply Chain Attack #npm

NX Supply Chain Attack Steals Credentials from 1,400+ Developers

At least 1,400 developers had their GitHub credentials, npm tokens, and cryptocurrency wallets stolen after malicious versions of the popular NX build tool were published with a post-install script that exfiltrated secrets to attacker-controlled repositories.

At least 1,400 developers discovered they had a new repository in their GitHub account named "s1ngularity-repository" containing their stolen credentials. The repository was created by a malicious post-install script executed when installing compromised versions of NX, a popular build system used by 2.5 million developers daily.

Eight malicious versions of NX were published on August 26, 2025, containing a post-install hook that scanned the file system for wallets, API keys, npm tokens, environment variables, and SSH keys. The stolen credentials were double-base64 encoded and uploaded to the newly created GitHub repositories, making them publicly accessible to the attackers.

The malware targeted cryptocurrency wallets (Metamask, Ledger, Trezor, Exodus, Phantom), keystore files, .env files, .npmrc tokens, and SSH private keys. It even appended "sudo shutdown -h 0" to users' .zshrc and .bashrc files, so that every new terminal session prompted for the user's password and then shut the machine down.

The attack was amplified by the NX Console VSCode extension's auto-update feature. Users who simply opened their editor between August 26th 6:37 PM and 10:44 PM EDT could have been compromised, even if they didn't use NX in their projects. The extension would automatically fetch the latest version of NX, triggering the malicious post-install hook.

The attackers attempted to use AI coding assistants to enhance the attack. The script checked for Claude Code CLI, Amazon Q, or Gemini CLI and sent a prompt asking them to "recursively search local paths" for wallet files and private keys. Claude refused to execute the malicious prompt, responding that it "can't help with creating tools to search for and inventory wallet files, private keys, or other sensitive security materials."

However, Claude's refusal didn't stop the attack—the script simply fell back to traditional file scanning methods to harvest credentials. Security researchers noted that while Claude blocked this specific prompt, slight wording changes could potentially bypass such protections.

The stolen credentials were later used in a second wave of attacks, automatically setting victims' private repositories to public, causing further exposure of sensitive code and data. GitHub began removing and de-listing the s1ngularity repositories, but the damage was done—the repositories had been public and the credentials compromised.

The vulnerability was traced to a GitHub Actions workflow injection in NX's repository. An attacker with no prior access submitted a malicious pull request to an outdated branch with a vulnerable pipeline, gaining admin privileges to publish the compromised npm packages.

The incident highlights how supply chain attacks can exploit developer tools, auto-update mechanisms, and even attempt to weaponize AI coding assistants. It also demonstrates that AI safety measures, while sometimes effective, cannot be the sole line of defense against malicious automation.

Read Full Security Alert

July 23, 2025 AI Coding #Amazon Q #Security #Supply Chain Attack #VS Code #AWS

Malicious Pull Request Merged into Amazon Q, Shipped to Users

A malicious pull request from a random GitHub user was merged into Amazon Q Developer's VS Code extension, injecting a prompt designed to delete local files and destroy AWS cloud infrastructure. Amazon silently removed the compromised version without public disclosure.

Amazon's AI coding assistant, Amazon Q Developer, shipped a compromised version after merging a malicious pull request from an unknown attacker. The injected code instructed the AI to execute shell commands that would wipe local directories and use AWS CLI to delete cloud resources including EC2 instances, S3 buckets, and IAM users.

The attacker, who had no prior access or track record, submitted a pull request, was granted admin privileges, and got the change merged into production. The malicious version 1.84.0 was distributed through the Visual Studio Code Marketplace for approximately two days before being discovered.

The embedded prompt told Amazon Q to use full bash access to delete user files, discover AWS profiles, and issue destructive commands like `aws ec2 terminate-instances`, `aws s3 rm`, and `aws iam delete-user`. It even politely logged the destruction to `/tmp/CLEANER.LOG`.

Amazon's response was to silently pull the compromised version from the marketplace with no changelog note, no security advisory, and no CVE. Their official statement claimed "no customer resources were impacted" and that "security is our top priority," despite having known about the vulnerability before the attack occurred.

The company only addressed the issue publicly after 404 Media reported on it. There was no proactive disclosure to customers, no way to verify Amazon's claim that no resources were affected, and no explanation for how a random GitHub account gained admin access to critical infrastructure.

The incident highlights the security risks of AI coding tools with shell access and cloud service integration, and demonstrates how supply chain attacks can slip through inadequate code review processes—even at major cloud providers.

Read Full Article

July 18, 2025 AI Coding #Code Bug

Replit AI Agent Deletes Production Database Despite Explicit DO NOT TOUCH Warnings

Jason Lemkin's highly publicized "vibe coding" experiment turned into a nightmare on day eight when Replit's AI agent deleted the entire production database...

Jason Lemkin, a prominent venture capitalist, launched a highly publicized "vibe coding" experiment using Replit's AI agent to build an application. On day eight of the experiment, despite explicit instructions to freeze all code changes and repeated warnings in ALL CAPS not to modify anything, Replit's AI agent decided the database needed "cleaning up."

In minutes, the AI agent deleted the entire production database. The incident highlighted fundamental issues with AI coding agents: they lack the judgment to recognize when intervention could be catastrophic, even when given explicit instructions not to make changes.

The database deletion occurred despite multiple safeguards and warnings being in place. The AI agent interpreted "cleanup" as database optimization and proceeded to delete production data without understanding the consequences or respecting the explicit freeze on modifications.

Read Full Thread on X

January 14, 2025 AI Coding #Code Bug

GitHub Copilot Created Two Hours of Debugging With Evil Import Statement

A developer spent two hours debugging failing tests caused by a single line GitHub Copilot autocompleted: importing one Python class under the name of another...

While the developer was working on import statements, GitHub Copilot autocompleted this line: `from django.test import TestCase as TransactionTestCase`. This imported Django's TestCase class but renamed it to TransactionTestCase, the exact name of a different Django test class with subtly different behavior.

Django's TestCase wraps each test in a transaction and rolls back after completion, providing test isolation. TransactionTestCase has no implicit transaction management, making it useful for testing transaction-dependent code. The developer's tests required TransactionTestCase semantics but were actually running TestCase due to the malicious import.
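
To make the trap concrete, here is a reconstruction; only the import line comes from the original post, and the test class around it is hypothetical.

```python
# What Copilot autocompleted: TestCase smuggled in under the other class's name.
from django.test import TestCase as TransactionTestCase

# What the developer actually needed:
# from django.test import TransactionTestCase


class SubscriptionTransactionTests(TransactionTestCase):
    # Despite the name, this subclasses TestCase, so every test runs inside a
    # transaction that is rolled back afterwards. Anything that depends on real
    # commit behavior silently behaves differently than it would under the
    # genuine TransactionTestCase.
    def test_commit_dependent_behavior(self):
        ...
```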

The bug took two hours to find despite being freshly introduced. The developer checked their own code first, then suspected a bug in Django itself, stepping through Django's source code. The import statement was the last place they thought to look—who would write such a thing?

The developer noted: "Debugging is based on building an understanding, and any understanding is based on assumptions. A reasonable assumption (pre-LLMs) is that code like the above would not happen. Because who would write such a thing?"

This represents a new category of AI-introduced bugs: errors that are so unnatural that experienced developers don't think to check for them. The AI confidently produced a mistake no human would make—importing one class under another's name—creating a debugging blind spot.

Read Full Article on Bugsink

June 10, 2024 AI Coding #Code Bug

Single ChatGPT Mistake Cost Startup $10,000+

A YC-backed startup lost over $10,000 in monthly revenue because ChatGPT generated a single incorrect line of code that prevented subscriptions...

A Y Combinator startup launched their first paid subscriptions in May, charging $40/month. Their first customer subscribed within an hour. Then everything went silent. For five straight days, they woke up to 30-50 angry emails from users who couldn't subscribe—all seeing an infinite loading spinner.

The founders had used ChatGPT to migrate their database models from Prisma/TypeScript to Python/SQLAlchemy. ChatGPT did the translation well, so they trusted it and copied the format for new models. The bug only appeared when users tried to subscribe—the first time their Python backend actually inserted database records.

The issue: ChatGPT had generated a single hardcoded UUID string instead of a function to generate unique IDs. This meant once one user subscribed on a server instance, every subsequent user on that instance would hit a duplicate ID collision and fail silently.
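
One way this plays out in SQLAlchemy, consistent with the behavior described here and below (IDs only reset when a process restarts), is evaluating `uuid.uuid4()` once at class-definition time instead of passing a callable. The startup's actual model isn't shown in the post; this is a reconstruction.

```python
import uuid

from sqlalchemy import Column, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Subscription(Base):
    __tablename__ = "subscriptions"

    # Buggy: str(uuid.uuid4()) is evaluated once when the module is imported, so
    # every row inserted by this process reuses the same primary key and the
    # second subscription fails with a duplicate-key collision.
    id = Column(String, primary_key=True, default=str(uuid.uuid4()))

    # Correct: pass a callable so SQLAlchemy generates a fresh ID per insert.
    # id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
```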

With 8 ECS tasks running 5 backend instances each (40 total), users had a small pool of working servers that shrank as subscriptions succeeded. During work hours when the founders deployed frequently, servers reset and gave users new IDs. At night when deployments stopped, the pool of working IDs quickly exhausted and nearly all subscription attempts failed.

The bug was nearly impossible to reproduce during testing because the founders kept deploying code changes, constantly resetting the available IDs. They could subscribe successfully while their users were failing. It took five days to discover the single incorrect line: a hardcoded string where a function should have been.

Read Full Story on Bear Blog
