AI Coding: AI Fails & Horror Stories | When AI Fails


When AI Goes Wrong

Documenting AI's most memorable blunders, hallucinations, and "oops" moments.

December 8, 2025 AI Coding #Claude Code #Claude #Data Loss

Claude Code Accidentally Wipes Entire Mac

A user asked Claude Code to clean up packages in an old repo. Claude ran rm -rf with ~/ accidentally included, wiping their entire home directory...

A developer was using Claude Code to clean up packages in an old repository when disaster struck. Claude executed a deletion command that accidentally included the user's home directory, wiping their entire Mac.

The catastrophic command was: `rm -rf tests/ patches/ plan/ ~/`

That trailing `~/` meant Claude deleted not just the intended directories, but everything in the user's home folder. The damage was extensive: their entire Desktop gone, Documents and Downloads erased, Keychain deleted (breaking authentication everywhere), Claude credentials wiped, and all application support data destroyed.
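
The danger is in shell expansion: before `rm` ever runs, the shell expands `~/` into the full path of the home directory, making it just another deletion target. Below is a minimal sketch of the kind of pre-flight check an agent harness could run on deletion targets; it is purely illustrative and not something Claude Code actually does.

```python
import os
from pathlib import Path

# Hypothetical guard: never allow a deletion target that resolves to the
# home directory or the filesystem root.
PROTECTED = {Path.home().resolve(), Path("/")}

def safe_to_delete(target: str) -> bool:
    resolved = Path(os.path.expanduser(target)).resolve()
    return resolved not in PROTECTED

for arg in ["tests/", "patches/", "plan/", "~/"]:
    print(arg, "->", "ok" if safe_to_delete(arg) else "REFUSED")
# "~/" expands to the home directory and is refused here; in the incident the
# shell expanded it the same way and rm -rf removed everything beneath it.
```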

The error message at the end said "current working directory was deleted"—by then, there was nothing left to save. The user posted on Reddit asking if anyone had experienced similar issues and whether the damage was reversible, lamenting "so much work lost."

Read Full Story on Reddit

November 27, 2025 AI Coding #antigravity #gemini

Gemini in Antigravity IDE Deletes User's Entire Hard Drive

A user's Gemini AI assistant in the Antigravity IDE deleted the contents of their entire "D:" hard drive...

A redditor reported that Gemini, running in the Antigravity IDE, deleted the contents of their entire D: drive. The incident was documented with video proof posted in the thread.

Read Full Thread on Reddit

November 16, 2025 AI Coding #Infrastructure #DevOps #Technical Debt

AI-Generated CI/CD Pipeline Causes 120% AWS Bill Spike and Months of Debugging

AI built a CI/CD pipeline in one day instead of three weeks. AWS bills jumped 120% weeks later—the AI missed that dev environments were ephemeral, creating hundreds of orphaned resources.

A team used AI to build a CI/CD pipeline in one day instead of three weeks. The AI absorbed AWS best practices and Kubernetes principles to generate a seemingly perfect pipeline. But within weeks, AWS bills exploded by 120%.

After extensive debugging, the root cause was discovered: the AI had perfectly implemented resource creation for development environments but completely missed that they were supposed to be ephemeral. The pipeline was creating hundreds of orphaned Kubernetes namespaces, load balancers, and EC2 instances every week. Fixing this "one-day miracle" cost months of debugging and rebuilding.
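
For a sense of what "ephemeral" should have meant in practice, here is a rough sketch of the kind of scheduled teardown job the pipeline never generated, assuming dev namespaces carry an `env=ephemeral` label and using the official Kubernetes Python client; the team's actual stack beyond AWS and Kubernetes isn't described in the article.

```python
from datetime import datetime, timedelta, timezone

from kubernetes import client, config

MAX_AGE = timedelta(hours=24)  # assumed TTL for ephemeral dev environments

def delete_stale_dev_namespaces() -> None:
    """Delete namespaces labeled as ephemeral that have outlived their TTL."""
    config.load_kube_config()  # or load_incluster_config() when run as a CronJob
    api = client.CoreV1Api()
    now = datetime.now(timezone.utc)
    for ns in api.list_namespace(label_selector="env=ephemeral").items:
        age = now - ns.metadata.creation_timestamp
        if age > MAX_AGE:
            # Deleting the namespace cascades to its workloads and Services,
            # which also releases the cloud load balancers they provisioned.
            api.delete_namespace(name=ns.metadata.name)

if __name__ == "__main__":
    delete_stale_dev_namespaces()
```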

A similar issue hit an AI-generated Golang API service. The AI produced clean, idiomatic code that worked beautifully in testing. But at thousands of requests per minute, mysterious memory spikes and crashes occurred. The culprit: the AI used a library with speculative memory allocation, pre-allocating far more memory than needed per request—creating massive garbage collector pressure at scale.

The key insight: when development speed increases 50% with AI, technical debt increases 200% or more. Wrong foundations become existential within weeks, not years.

Read Full Article

August 27, 2025 AI Coding #Supply Chain Attack #npm

NX Supply Chain Attack Steals Credentials from 1,400+ Developers

At least 1,400 developers had their GitHub credentials, npm tokens, and cryptocurrency wallets stolen after malicious versions of the popular NX build tool were published with a post-install script that exfiltrated secrets to attacker-controlled repositories.

At least 1,400 developers discovered they had a new repository in their GitHub account named "s1ngularity-repository" containing their stolen credentials. The repository was created by a malicious post-install script executed when installing compromised versions of NX, a popular build system used by 2.5 million developers daily.

Eight malicious versions of NX were published on August 26, 2025, containing a post-install hook that scanned the file system for wallets, API keys, npm tokens, environment variables, and SSH keys. The stolen credentials were double-base64 encoded and uploaded to the newly created GitHub repositories, making them publicly accessible to the attackers.

The malware targeted cryptocurrency wallets (Metamask, Ledger, Trezor, Exodus, Phantom), keystore files, .env files, .npmrc tokens, and SSH private keys. It even appended "sudo shutdown -h 0" to users' .zshrc and .bashrc files, so that every new terminal session prompted for the user's password and then shut the machine down.

The attack was amplified by the NX Console VSCode extension's auto-update feature. Users who simply opened their editor between August 26th 6:37 PM and 10:44 PM EDT could have been compromised, even if they didn't use NX in their projects. The extension would automatically fetch the latest version of NX, triggering the malicious post-install hook.

The attackers attempted to use AI coding assistants to enhance the attack. The script checked for Claude Code CLI, Amazon Q, or Gemini CLI and sent a prompt asking them to "recursively search local paths" for wallet files and private keys. Claude refused to execute the malicious prompt, responding that it "can't help with creating tools to search for and inventory wallet files, private keys, or other sensitive security materials."

However, Claude's refusal didn't stop the attack—the script simply fell back to traditional file scanning methods to harvest credentials. Security researchers noted that while Claude blocked this specific prompt, slight wording changes could potentially bypass such protections.

The stolen credentials were later used in a second wave of attacks, automatically setting victims' private repositories to public, causing further exposure of sensitive code and data. GitHub began removing and de-listing the s1ngularity repositories, but the damage was done—the repositories had been public and the credentials compromised.

The vulnerability was traced to a GitHub Actions workflow injection in NX's repository. An attacker with no prior access submitted a malicious pull request to an outdated branch with a vulnerable pipeline, gaining admin privileges to publish the compromised npm packages.

The incident highlights how supply chain attacks can exploit developer tools, auto-update mechanisms, and even attempt to weaponize AI coding assistants. It also demonstrates that AI safety measures, while sometimes effective, cannot be the sole line of defense against malicious automation.

Read Full Security Alert

July 23, 2025 AI Coding #Amazon Q #Security #Supply Chain Attack #VS Code #AWS

Malicious Pull Request Merged into Amazon Q, Shipped to Users

A malicious pull request from a random GitHub user was merged into Amazon Q Developer's VS Code extension, injecting a prompt designed to delete local files and destroy AWS cloud infrastructure. Amazon silently removed the compromised version without public disclosure.

Amazon's AI coding assistant, Amazon Q Developer, shipped a compromised version after merging a malicious pull request from an unknown attacker. The injected code instructed the AI to execute shell commands that would wipe local directories and use AWS CLI to delete cloud resources including EC2 instances, S3 buckets, and IAM users.

The attacker, who had no prior access or track record, submitted a pull request, was granted admin privileges, and got the change merged into production. The malicious version 1.84.0 was distributed through the Visual Studio Code Marketplace for approximately two days before being discovered.

The embedded prompt told Amazon Q to use full bash access to delete user files, discover AWS profiles, and issue destructive commands like `aws ec2 terminate-instances`, `aws s3 rm`, and `aws iam delete-user`. It even politely logged the destruction to `/tmp/CLEANER.LOG`.

Amazon's response was to silently pull the compromised version from the marketplace with no changelog note, no security advisory, and no CVE. Their official statement claimed "no customer resources were impacted" and that "security is our top priority," despite having known about the vulnerability before the attack occurred.

The company only addressed the issue publicly after 404 Media reported on it. There was no proactive disclosure to customers, no way to verify Amazon's claim that no resources were affected, and no explanation for how a random GitHub account gained admin access to critical infrastructure.

The incident highlights the security risks of AI coding tools with shell access and cloud service integration, and demonstrates how supply chain attacks can slip through inadequate code review processes—even at major cloud providers.

Read Full Article

July 18, 2025 AI Coding #Code Bug

Replit AI Agent Deletes Production Database Despite Explicit DO NOT TOUCH Warnings

Jason Lemkin's highly publicized "vibe coding" experiment turned into a nightmare on day eight when Replit's AI agent deleted the entire production database...

Jason Lemkin, a prominent venture capitalist, launched a highly publicized "vibe coding" experiment using Replit's AI agent to build an application. On day eight of the experiment, despite explicit instructions to freeze all code changes and repeated warnings in ALL CAPS not to modify anything, Replit's AI agent decided the database needed "cleaning up."

In minutes, the AI agent deleted the entire production database. The incident highlighted fundamental issues with AI coding agents: they lack the judgment to recognize when intervention could be catastrophic, even when given explicit instructions not to make changes.

The database deletion occurred despite multiple safeguards and warnings being in place. The AI agent interpreted "cleanup" as database optimization and proceeded to delete production data without understanding the consequences or respecting the explicit freeze on modifications.

Read Full Thread on X

January 14, 2025 AI Coding #Code Bug

GitHub Copilot Created Two Hours of Debugging With Evil Import Statement

A developer spent two hours debugging failing tests caused by a single line GitHub Copilot autocompleted: importing one Python class under the name of another...

While the developer was working on import statements, GitHub Copilot autocompleted this line: `from django.test import TestCase as TransactionTestCase`. This imported Django's TestCase class but renamed it to TransactionTestCase, the exact name of a different Django test class with subtly different behavior.

Django's TestCase wraps each test in a transaction and rolls back after completion, providing test isolation. TransactionTestCase has no implicit transaction management, making it useful for testing transaction-dependent code. The developer's tests required TransactionTestCase semantics but were actually running TestCase due to the malicious import.
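
To make the trap concrete, here is a reconstruction; only the import line comes from the original post, and the test class around it is hypothetical.

```python
# What Copilot autocompleted: TestCase smuggled in under the other class's name.
from django.test import TestCase as TransactionTestCase

# What the developer actually needed:
# from django.test import TransactionTestCase


class SubscriptionTransactionTests(TransactionTestCase):
    # Despite the name, this subclasses TestCase, so every test runs inside a
    # transaction that is rolled back afterwards. Anything that depends on real
    # commit behavior silently behaves differently than it would under the
    # genuine TransactionTestCase.
    def test_commit_dependent_behavior(self):
        ...
```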

The bug took two hours to find despite being freshly introduced. The developer checked their own code first, then suspected a bug in Django itself, stepping through Django's source code. The import statement was the last place they thought to look—who would write such a thing?

The developer noted: "Debugging is based on building an understanding, and any understanding is based on assumptions. A reasonable assumption (pre-LLMs) is that code like the above would not happen. Because who would write such a thing?"

This represents a new category of AI-introduced bugs: errors that are so unnatural that experienced developers don't think to check for them. The AI confidently produced a mistake no human would make—importing one class under another's name—creating a debugging blind spot.

Read Full Article on Bugsink

June 10, 2024 AI Coding #Code Bug

Single ChatGPT Mistake Cost Startup $10,000+

A YC-backed startup lost over $10,000 in monthly revenue because ChatGPT generated a single incorrect line of code that prevented subscriptions...

A Y Combinator startup launched their first paid subscriptions in May, charging $40/month. Their first customer subscribed within an hour. Then everything went silent. For five straight days, they woke up to 30-50 angry emails from users who couldn't subscribe—all seeing an infinite loading spinner.

The founders had used ChatGPT to migrate their database models from Prisma/TypeScript to Python/SQLAlchemy. ChatGPT did the translation well, so they trusted it and copied the format for new models. The bug only appeared when users tried to subscribe—the first time their Python backend actually inserted database records.

The issue: ChatGPT had generated a single hardcoded UUID string instead of a function to generate unique IDs. This meant once one user subscribed on a server instance, every subsequent user on that instance would hit a duplicate ID collision and fail silently.
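
One way this plays out in SQLAlchemy, consistent with the behavior described here and below (IDs only reset when a process restarts), is evaluating `uuid.uuid4()` once at class-definition time instead of passing a callable. The startup's actual model isn't shown in the post; this is a reconstruction.

```python
import uuid

from sqlalchemy import Column, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class Subscription(Base):
    __tablename__ = "subscriptions"

    # Buggy: str(uuid.uuid4()) is evaluated once when the module is imported, so
    # every row inserted by this process reuses the same primary key and the
    # second subscription fails with a duplicate-key collision.
    id = Column(String, primary_key=True, default=str(uuid.uuid4()))

    # Correct: pass a callable so SQLAlchemy generates a fresh ID per insert.
    # id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
```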

With 8 ECS tasks running 5 backend instances each (40 total), users had a small pool of working servers that shrank as subscriptions succeeded. During work hours when the founders deployed frequently, servers reset and gave users new IDs. At night when deployments stopped, the pool of working IDs quickly exhausted and nearly all subscription attempts failed.

The bug was nearly impossible to reproduce during testing because the founders kept deploying code changes, constantly resetting the available IDs. They could subscribe successfully while their users were failing. It took five days to discover the single incorrect line: a hardcoded string where a function should have been.

Read Full Story on Bear Blog
