Treat your coding agents like developers

finbarr.site

21 points by Finbarr a day ago · 16 comments

Finbarr (OP) a day ago

Author here. Three months ago I posted a Show HN for yolobox [1] - a sandbox for running AI coding agents without them being able to nuke your home directory.

Since then I've been using it almost every day, which eventually meant wanting more than one agent running against the same project at the same time. This post is what I learned trying to make that work without it being a constant disaster.

The short version: git worktrees are the right Git abstraction but the wrong abstraction for this problem. The unit you want to fork is the developer, not the branch: a full folder copy, its own Compose project, its own URL. yolobox now ships a `fork` subcommand that does this.

Happy to answer questions.

[1] https://news.ycombinator.com/item?id=46592344
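To make the "fork the developer" idea concrete, here is a minimal Python sketch of what such a fork needs to decide: a separate directory, a separate Compose project name, and a separate port. This is not how yolobox's fork subcommand is implemented; `ForkPlan`, `plan_fork`, and `free_port` are hypothetical names, and the directory naming scheme is an assumption. The only external fact relied on is that Compose project names are restricted to lowercase letters, digits, dashes, and underscores.

```python
import re
import socket
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ForkPlan:
    workdir: Path          # where the full folder copy would live (assumed naming scheme)
    compose_project: str   # usable as COMPOSE_PROJECT_NAME or `docker compose -p <name>`
    port: int              # host port this fork's app could publish on


def free_port() -> int:
    """Ask the OS for a TCP port that is unused on localhost right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def plan_fork(project: Path, agent: str) -> ForkPlan:
    """Derive a separate directory, Compose project name, and port for one agent."""
    # Compose project names allow only lowercase letters, digits, dashes, underscores.
    raw = f"{project.name}-{agent}".lower()
    slug = re.sub(r"[^a-z0-9_-]+", "-", raw).strip("-_")
    return ForkPlan(
        workdir=project.parent / f"{project.name}-{agent}",
        compose_project=slug,
        port=free_port(),
    )
```

The port is only free at the moment it is probed, so a real tool would still have to handle the case where something else grabs it before the stack starts.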

tracker1 20 hours ago

I think I'd go a slightly different route, if I was trying to do this, and that would be to give each agent at least a VM. Not to mention an email account, so that they can coordinate/collaborate with the other "developers" ...

In the end, I firmly believe that agents need a lot more guidance in terms of direction than what a lot of people seem to be giving. Let alone code reviews.

  • Finbarr (OP) 20 hours ago

    VMs bring greater isolation, but they're a lot heavier and slower. The agents just use GitHub for synchronization here, though I've been considering building some kind of local todo-list overlay.

    • tracker1 20 hours ago

      Yes... but with full VMs, you can integrate docker (compose) into the application workflows without risking conflicts between separate agents on the same system/vm.

      • Finbarr (OP) 20 hours ago

        Did you read the post? That's exactly the problem I just solved.

        • CodesInChaos 3 hours ago

          That's the part of your post I have trouble understanding. That you need to work around colliding ports suggests that the containers spun up by the agent run directly on the host, not inside some form of nested containerization. But if you do that, how do you ensure that the application running in those containers is sandboxed just as strictly as the agent itself?

          • Finbarr (OP) 2 hours ago

            The Docker Compose stack for the applications is spun up on the host. The agents have access to the Docker socket, which means they can talk to Docker from inside their sandbox and spin up new sibling containers on the host. Yolobox isn't designed for full isolation, just for protection against accidental commands you wouldn't want to run on the host, plus a convenient way of giving agents a customizable environment they control.

            Early on in development I tried to harden the container to prevent deliberate escapes by the agent. This was a waste of time as the agents just kept finding more and more exploits when I asked them to try and break out.
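As a small illustration of the sibling-container point, this Python sketch checks whether a Docker socket is visible from inside whatever environment it runs in. The path `/var/run/docker.sock` is Docker's conventional default and is an assumption here; where yolobox mounts the socket is not stated in the thread, and `is_unix_socket` is a hypothetical helper name.

```python
import os
import stat

# Docker's conventional default socket path; assumed, not taken from yolobox.
DOCKER_SOCKET = "/var/run/docker.sock"


def is_unix_socket(path: str) -> bool:
    """Return True if `path` exists and is a Unix domain socket."""
    try:
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path or no permission to stat it.
        return False


if __name__ == "__main__":
    if is_unix_socket(DOCKER_SOCKET):
        print("Docker socket visible: containers started from here run as siblings on the host")
    else:
        print("No Docker socket visible in this environment")
```

If the socket is visible, anything the process starts through it runs next to the sandbox on the host rather than inside it, which is why it should not be treated as a security boundary.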

            • CodesInChaos 2 hours ago

              So the right way to use yolobox is to spin up one VM as a secure sandbox, and then use yolobox to separate individual agents within the VM?

              • Finbarr (OP) an hour ago

                I wouldn't assume that a VM will give you complete security against a determined AI. yolobox started as a way to prevent accidental `rm -rf ~` and has expanded into a set of tools that make working with CLI agents easier.

                Personally, I run yolobox directly on the host. Being able to tell the agent it has sudo and can install and do whatever it needs to accomplish any task is handy.

            • CodesInChaos an hour ago

              Sounds interesting. What kind of exploits did they find, apart from docker being exposed?

              • Finbarr (OP) an hour ago

                Docker was only exposed later, after I realized that any sufficiently determined AI could break out of the container, and attempts to contain it were a waste of time. Also note that the docker socket is not exposed by default. There's a --docker flag for this.

                I made some comments about exploits in the original post [1]. Gemini was quite creative in adding git hooks to the repo that would execute on the host machine. That folder is shared.
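Since the repo folder is shared, one cheap guard against the git-hook trick is to list which hooks are active before running Git on the host. The following Python sketch is one way to do that, not something yolobox is known to ship; `active_hooks` is a hypothetical name. It only inspects the default `.git/hooks` directory, so a repo that sets `core.hooksPath` would need that location checked as well.

```python
import os
from pathlib import Path


def active_hooks(repo: Path) -> list:
    """List executable files in <repo>/.git/hooks, skipping Git's *.sample templates."""
    hooks_dir = repo / ".git" / "hooks"
    if not hooks_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in hooks_dir.iterdir()
        # Git only runs hooks that are executable and have no .sample suffix.
        if p.is_file() and not p.name.endswith(".sample") and os.access(p, os.X_OK)
    )
```

Running it on a shared checkout after an agent session and comparing against an expected (usually empty) list would flag a newly added `pre-commit` or `post-checkout` hook.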

akurilin a day ago

This is great stuff. Walking the reader through your thought process helped me, as a developer, grok why yolobox was designed this way. I ended up landing in the "just make a local copy, don't get fancy" world myself after many iterations of workflows. Separate agents, separate containers, separate ports: that all resonates.

You mention this approach gobbling up a bunch of extra disk space as a consequence of the tradeoffs. Have you considered using APFS cloning on macOS to reduce some of that burden, or is that too tiny of an optimization to be worth it at this point?

  • Finbarr (OP) a day ago

    Hard drives are cheap and I haven't approached the limit yet. So I left this as a future optimization.

    • CodesInChaos 3 hours ago

      I'd try a modern file system with de-duplication/copy-on-write support. `cp` (GNU coreutils 9.0 and later) creates reflinks automatically if the file system supports copy-on-write.

      > Support for reflinks is indicated using the remap_file_range operation, which is currently (6.18) supported by Btrfs, CIFS, NFS 4.2, OCFS2, overlayfs, and XFS. Some external file systems support them too, including bcachefs and OpenZFS.

      https://unix.stackexchange.com/questions/631237/in-linux-whi...
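For the disk-space question, a copy-on-write clone of the project folder could look roughly like this Python sketch. It only builds and runs a `cp` command: on macOS it uses `cp -c`, which clones via clonefile(2) on APFS, and elsewhere it assumes GNU `cp` with `--reflink=auto`. `clone_command` and `clone_tree` are hypothetical names, and nothing here reflects what yolobox does today.

```python
import platform
import subprocess
from typing import List, Optional


def clone_command(src: str, dst: str, system: Optional[str] = None) -> List[str]:
    """Build a `cp` invocation that prefers copy-on-write clones over full copies."""
    system = system or platform.system()
    if system == "Darwin":
        # macOS cp: -c copies using clonefile(2), -R recurses into directories.
        return ["cp", "-cR", src, dst]
    # GNU cp (assumed on other systems): -a recurses and preserves attributes;
    # --reflink=auto clones where supported and otherwise does a normal copy.
    return ["cp", "-a", "--reflink=auto", src, dst]


def clone_tree(src: str, dst: str) -> None:
    """Copy a directory tree, sharing data blocks where the file system allows it."""
    subprocess.run(clone_command(src, dst), check=True)
```

On a file system with reflink support, the clone initially shares data blocks with the original and only diverges as files are modified, so several agent copies of a large repo cost far less than several full copies.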
