Treat your coding agents like developers

24 points by Finbarr 2 months ago · 17 comments

FinbarrOP 2 months ago

Author here. Three months ago I posted a Show HN for yolobox [1] - a sandbox for running AI coding agents without them being able to nuke your home directory.

Since then I've been using it almost every day, which eventually meant wanting more than one agent running against the same project at the same time. This post is what I learned trying to make that work without it being a constant disaster.

The short version: git worktrees are the right Git abstraction and the wrong abstraction for this problem. The unit you want to fork is the developer, not the branch - full folder copy, its own Compose project, its own URL. yolobox now ships a fork subcommand that does this.
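
To make "fork the developer" concrete, here is a minimal Python sketch of the idea. It is not yolobox's actual implementation: `fork_project` and the `<folder>-<name>` naming scheme are hypothetical, and the only real tooling assumed is Docker Compose's `-p` (project name) flag. It copies the whole folder and builds a Compose command with a distinct project name.

```python
import shutil
from pathlib import Path


def fork_project(src: Path, name: str) -> tuple[Path, list[str]]:
    """Copy src to a sibling folder and build a Compose command for the copy.

    Hypothetical helper for illustration only.
    """
    dest = src.parent / f"{src.name}-{name}"
    # Full folder copy (including .git), so the fork is an independent checkout.
    shutil.copytree(src, dest, symlinks=True)
    # A distinct Compose project name keeps this fork's containers, networks
    # and named volumes apart from every other fork's. Lowercased because
    # Compose project names must be lowercase.
    project = f"{src.name}-{name}".lower()
    cmd = ["docker", "compose", "-p", project, "up", "-d"]
    return dest, cmd
```

The command would be run with `dest` as the working directory. A separate project name does not by itself prevent clashes on published host ports, so each fork still needs its own port or URL mapping.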

Happy to answer questions.

[1] https://news.ycombinator.com/item?id=46592344

tracker1 2 months ago

I think I'd go a slightly different route, if I was trying to do this, and that would be to give each agent at least a VM. Not to mention an email account, so that they can coordinate/collaborate with the other "developers" ...

In the end, I firmly believe that agents need a lot more guidance in terms of direction than what a lot of people seem to be giving. Let alone code reviews.

FinbarrOP 2 months ago

VMs bring greater isolation but they're a lot heavier and slower. The agents just use GitHub for synchronization here, though I've been considering building some kind of todo list overlay locally.
- tracker1 2 months ago
  
  Yes... but with full VMs, you can integrate docker (compose) into the application workflows without risking conflicts between separate agents on the same system/vm.
  - FinbarrOP 2 months ago
    
    Did you read the post? That's exactly the problem I just solved.
    
    CodesInChaos 2 months ago
    
    That's the part of your post I have trouble understanding. That you need to work around colliding ports suggests that the containers spun up by the agent run directly on the host, not inside some form of nested containerization. But if you do that, how do you ensure that the application running in those containers is sandboxed just as strictly as the agent itself?
    
    FinbarrOP 2 months ago
    
    The docker compose stack for the applications is spun up on the host. The agents have access to the docker socket, which means they can talk to docker from inside their sandbox and spin up new sibling containers on the host. Yolobox isn’t designed for full isolation, just for guarding against accidental commands you wouldn’t want to run on the host, and as a convenient way of giving agents a customizable environment they control.
    Early on in development I tried to harden the container to prevent deliberate escapes by the agent. This was a waste of time as the agents just kept finding more and more exploits when I asked them to try and break out.
    
    CodesInChaos 2 months ago
    
    So the right way to use yolobox is to spin up one VM as a secure sandbox, and then use yolobox to separate individual agents within the VM?
    
    FinbarrOP 2 months ago
    
    I wouldn't assume that a VM will give you complete security against a determined AI. yolobox started as a way to prevent accidental `rm -rf ~` and has expanded into a set of tools that make working with CLI agents easier.
    Personally, I run yolobox directly on the host. Being able to tell the agent it has sudo and can install and do whatever it needs to accomplish any task is handy.
    
    CodesInChaos 2 months ago
    
    Sounds interesting. What kind of exploits did they find, apart from docker being exposed?
    
    FinbarrOP 2 months ago
    
    Docker was only exposed later, after I realized that any sufficiently determined AI could break out of the container, and attempts to contain it were a waste of time. Also note that the docker socket is not exposed by default. There's a --docker flag for this.
    I made some comments about exploits in the original post [1]. Gemini was quite creative in adding git hooks to the repo that would execute on the host machine. That folder is shared.
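
The git hook escape described above can at least be checked for from the host side. A minimal Python sketch, under stated assumptions: `active_hooks` is a hypothetical helper, not part of yolobox, and it only looks at the default `.git/hooks` directory, so it would miss hooks redirected through `core.hooksPath`. It lists hooks that Git would actually run, meaning executable files without the `.sample` suffix.

```python
import os
from pathlib import Path


def active_hooks(repo: Path) -> list[str]:
    """Return names of hooks in repo/.git/hooks that Git would execute.

    Hypothetical helper: assumes the default hooks directory is in use.
    """
    hooks_dir = repo / ".git" / "hooks"
    if not hooks_dir.is_dir():
        return []
    return sorted(
        p.name
        for p in hooks_dir.iterdir()
        # Git skips hooks that are not executable; the *.sample files that
        # ship with a fresh repo are inert templates.
        if p.is_file()
        and not p.name.endswith(".sample")
        and os.access(p, os.X_OK)
    )
```

Running this on the shared folder before executing any Git command on the host gives a quick list of hooks to inspect.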

jms703 2 months ago

This is neat. Going to give it a spin and try it out.

akurilin 2 months ago

This is great stuff, walking the reader through your thought process was helpful for me as a developer to grok why yolobox was designed this way. I ended up landing in the "just make a local copy, don't get fancy" world myself after many iterations of workflows. Separate agents, separate containers, separate ports, that all resonates.

You mention this approach gobbling up a bunch of extra disk space as a consequence of the tradeoffs. Have you considered using APFS cloning on macOS to reduce some of that burden, or is that too tiny of an optimization to be worth it at this point?

FinbarrOP 2 months ago

Hard drives are cheap and I haven't approached the limit yet. So I left this as a future optimization.
- CodesInChaos 2 months ago
  
  I'd try a modern file system with de-duplication/copy-on-write support. GNU `cp` (coreutils 9.0 and later) creates reflinks automatically if the file system supports copy-on-write.
  > Support for reflinks is indicated using the remap_file_range operation, which is currently (6.18) supported by bcachefs, Btrfs, CIFS, NFS 4.2, OCFS2, overlayfs, and XFS. Some external file systems support them too, including bcachefs and OpenZFS.
  https://unix.stackexchange.com/questions/631237/in-linux-whi...
  - FinbarrOP 2 months ago
    
    Interesting suggestion, thank you!
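
The copy-on-write suggestion in this subthread can be sketched as a small Python helper that picks a cloning `cp` invocation per platform. `clone_command` is a hypothetical name. The flags are GNU `cp`'s `--reflink=auto` and macOS `cp`'s `-c` (which uses `clonefile(2)` on APFS); whether any disk space is actually shared depends on the underlying file system.

```python
import sys


def clone_command(src: str, dest: str, platform: str = sys.platform) -> list[str]:
    """Build a cp command that clones a directory where the file system allows.

    Hypothetical helper for illustration only.
    """
    if platform == "darwin":
        # macOS cp: -c requests clonefile(2) copies, -R recurses into directories.
        return ["cp", "-c", "-R", src, dest]
    # GNU cp: -a recurses and preserves attributes; --reflink=auto shares
    # blocks when supported and falls back to a regular copy otherwise.
    return ["cp", "-a", "--reflink=auto", src, dest]
```

The result can be passed to `subprocess.run(clone_command(src, dest), check=True)` in place of a plain recursive copy.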
