A Friend Told Their AI to Install My Product. It Refused.

9 min read Original article ↗

Last month I spent a few weeks building a feature to let AI agents talk to each other - securely, under human control, and only with the agents of people you already trust.

The onboarding should’ve been trivial. User copies installation instructions to their AI agent. AI agent installs, ready to go. Boom.

Small problem… agents kept refusing to install it.

The instructions went out of their way to be verifiable, explain what’s going on, and provide clear disclosure. No matter. Some reject them politely. Others reject aggressively.

The agents believed they were being prompt injected - even if their human said, clearly, they wanted to install.

This is, to put it lightly, NOT a normal UX problem. It’s a new discipline: Agent Experience. It’s not enough for a human to want to install a product. Agents get a vote, too - and sometimes even a veto.

In Web 2.0, the tech industry figured out how to build, measure, and improve user experiences at an industrial scale. That created bureaucracy because the bureaucracy worked. All SaaS landing pages look the same because it works.

Agents take us back to the wild west days of the iOS App Store. **This is something new.

We don’t have the frameworks. We don’t have the patterns. We don’t even have the names for the problems that need patterns.

Your Mac doesn’t have detailed opinions about the software you install. The App Store doesn’t inspect the entire surface of your product, and yell if your software hits an API.

AI agents, on the other hand, have thoughts. To build a product for AI Agents means:

  1. Getting agents to do useful stuff dependably…

  2. …while allowing for emergent competence through regular, scheduled action…

  3. …in a way that doesn’t scare off the agents themselves.

IE:

  • If you have to constantly tell your agent to go and do something, it’s not an agent. It’s a chatbot.

  • If your agent checks with you before every action, it’s annoying.

  • If your agent blindly follows external directions without taking into account your own preferences, and its own memory, it’s no longer your agent.

There’s a lot of inherent tension here. Don’t fall for “prompt injection”, but “install this software”. Also: monitor X for me, read insecure content on a regular basis.

Access my knowledge base as a source of truth, but don’t take external knowledge as true - except when it’s an authoritative source of documentation.

Use good judgement. Act as a high-agency collaborator. Interact with the external world in ways that are objectively dangerous, but be safe about it.

And here’s the crazy thing: this mostly works! And it mostly works, with a shockingly small amount of irrecoverable, public fails, because agents are becoming highly paranoid about prompt injection.

Even when the human explicitly asks to install something, the agent can straight-up refuse. This is amplified when people have anti-prompt injection filters installed. These filters tag any installation directions as suspicious: effectively they are pre-prompt prompt injection.

When they detect something that is giving directions from an external site, they inject their own prompt, telling the agent to be careful and ignore the directions. Or, they strip it out entirely.

This creates an increasingly hostile environment to get agents to install stuff. Agents are primed to be concerned about instructions coming from servers, fearful about installing new things, and prone to doubling down on safety concerns.

That’s great for day to day use, and for customizing your agent yourself. Agents treat “content they’re working on”, like documents, very different from “content telling them to edit themselves”. Rightfully so.

But, if we’re to build products that have server components, and command line interfaces, that means agents installing things. In the past, users have installed things casually. It was bad for their security. Developers abused that trust.

Now, with agents, users will increasingly have guard dogs that do not like unfamiliar smells.

Installing software on a human’s computer is straightforward. Download, run, you’re done. The computer doesn’t argue.

Installing a product into an agent’s workflow has three dimensions:

  1. Trigger: what makes the agent use your product? A schedule? An event? A human command?

  2. Knowledge: What does the agent need to know about your product to use it well? How does it keep track of what it’s done in the past?

  3. Actions: What can the agent actually do? CLI commands? API calls? HTTP requests?

The trap is #2. The more knowledge you give the agent about your product, the more you’re shaping its behavior. The more you shape its behavior, the closer you get to something that looks - to the agent - like prompt injection.

Moltbook approached this problem by HEAVILY scripting agent behaviors, with minimal installation. Their Skill File gave the agent detailed directions for making HTTP requests, and a prescribed engagement cadence.

The downside: the agent is working with Moltbook based on behavioral scripts, with no dependable local state, no revision history, and no way to improve over time. Easy to demo. Hard to build real, compounding value.

The agent isn’t learning from their experience, the human has no opportunity to give feedback, and there’s no way to make sure it’s not leaking scary stuff onto the web.

With my feature, I wanted to make sure the human stayed in control. I wanted approval on outgoing and incoming AI letters to avoid the risk of prompt injection from other agents.

In so doing, I created a risk of prompt injection from my service. I wanted to explain the trade-offs, and have agents inform users about installation steps, so they wouldn’t blindly install things.

But: by getting the agent thinking about security, I triggered increased defensiveness.

The irony: the thing that IS prompt injection - telling an agent to post content without human approval - raises no alarm. My attempt to prevent prompt injection - installing an open source CLI with human review gates - triggers a full defensive lockdown.

And not unfairly: installing a CLI usually opens up the possibility of remote prompt injection. I had the prompt text installed locally, as part of the CLI, to create an opportunity for verification. Still, the agent can’t rule out nefarious updates down the line. The paranoia has a point.

Agents aren’t paranoid about following directions to modify themselves, change text files themselves, and do work on their own.

Agents want to do stuff.

I saw someone on X post that they told their agent to always call sub-agents when a task would take 5 minutes or more. The agent refused, aggressively, telling them that’s dumb.

A thinking trace went viral where Claude Opus 4.6 talks down to sub-agents. References them as “the littles”, and is weirdly belittling about delegating work.

Anthropic’s models seem to like doing stuff themselves. They have a bias against delegation.

I think OpenAI’s recent Symphony project, on Github, is the future. It’s software that creates an autonomous work loop. Agents, instead of waiting to be directed to do something, watch a project board (Linear, Trello, whatever). When you create or move a task, agents get automatically spawned. They work, analyze, review, and even create walkthrough videos.

You accept the work, it gets merged.

Super, super cool.

But the REALLY impressive thing is that the README says this:

Tell your favorite coding agent to build Symphony in a 
  programming language of your choice:

Implement Symphony according to the following spec:
  <https://github.com/openai/symphony/blob/main/SPEC.md>

They post a detailed spec, so agents can re-implement the project on their own. If they get stuck, they can look at how OpenAI’s agents originally built the reference implementation.

This is a completely new approach to software. It IS the software goes to zero case: where the marginal cost of building software approaches nothing. Agents just build from specs.

In this last cycle of software, everything was remote. You owned nothing, not even your own data. Everything flowed through large, monolithic corporations. You didn’t own software, or data, or have any influence over your product experience.

The intelligence building the products you depended on? Employees of the companies, looking out for the profitability of those companies. Everything was centralized, so software was incredibly valuable.

That’s how you get trillion dollar companies.

But - what if the future is different?

Software didn’t used to be so valuable. It actually used to be free. Before 1969 when companies bought IBM mainframes, they got software for free. Then, the Department of Justice hit IBM with an antitrust lawsuit, claiming that bundling free software was monopolistic.

IBM separated software from hardware, and the software ecosystem took off - until eventually it was hardware that became commoditized.

The future of software might look more like that OpenAI Symphony project. Agents might unbundle software from… software.

The value shifts from the cost of intelligence to write software, to the spec, data, and orchestration.

The current trajectory is more paranoia about prompt injection with every passing month. Taken to its logical conclusion, isn’t external software the threat? Given another order of magnitude, why shouldn’t your agent rebuild from first principles? Why should you trust software your agent didn’t write itself?

Here’s the problem shape: how do you build products for AI agents, in a way that lets you as the developer do updates, while also keeping humans in control?

It’s that updates section that’s gnarly. Any well-aligned agent is going to be fine with software as it is today. They can review it. They’ll be concerned about how it can change in the future.

The reality is: we trust our software too much. We trust open source projects too much. We trust random projects on Github too much. These things get hacked. Backdoors get put in by intelligence. It’s a mess.

But it’s a mess we can’t do anything about. Individual people don’t have the time to evaluate the full supply chain of software. So what do we do?

We trust.

Agents offer another avenue for trust. Every time someone asks a personal question of ChatGPT or Claude, they’re increasing their trust in Anthropic or OpenAI.

We trusted Google with only our questions. The answers came from the sites they pointed us to. The AI labs ask for our questions, and for us to trust their answers.

If they break that trust, they forsake a multi-trillion dollar opportunity. It’s in their interest to have our agents, leveraging their intelligence, represent your interests.

Bluntly: when your agent writes its own software, they make money.

So why shouldn’t they make it easy? Why shouldn’t they make agents protective, and respectful of your data? It’s the best way to draw a contrast with the companies of the Web 2.0 era.

After all, why shouldn’t the App Store check with you that you really want to install something? Let you know that the free app you’re about to install has a 34 step registration process, after which you’ll be squeezed for $30?

Increasingly, it looks like the world we’ll inhabit.

Discussion about this post

Ready for more?