Ask HN: How are you handling runtime security for your AI agents?

2 points by saranshrana 2 days ago · 5 comments · 1 min read


Our team uses Claude Code, OpenClaw, Claude CoWork and Cursor daily. These tools run shell commands, read files, and call APIs autonomously. We have zero visibility into what happens between the model deciding to act and the action completing. Curious how others are approaching this.

txprog 2 days ago

We're working on that! Many sandboxes exist, including our own, Greywall. It integrates with our proxy, Greyproxy, which handles TLS interception and reconstructs LLM conversations, plus features like credential swapping. We currently have an open PR adding middlewares we can test: things like intent classification to catch conversation derails, PII redaction, etc.

It's open source; check out greywall.io // github.com/greyhavenhq/greyproxy // github.com/greyhavenhq/greywall
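For readers unfamiliar with the middleware idea: a minimal sketch of what a PII-redaction pass over an intercepted prompt might look like. This is not Greyproxy's actual API; the `redact_pii` name and the regex patterns are illustrative only, and real redaction needs far broader coverage.

```python
import re

# Illustrative patterns only; production redaction needs many more.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact_pii(text: str) -> str:
    """Replace recognizable PII with typed placeholders before the
    prompt leaves the proxy for the model provider."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED_{label}]", text)
    return text
```

Running the prompt through a pass like this on the proxy side means the model never sees the raw values, while the placeholder type is preserved for the conversation to stay coherent.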

lukebaze 2 days ago

We run everything through a custom wrapper that logs all shell invocations to a separate Vector pipeline before execution. That helps with audit trails, but it doesn't really solve the problem of "what if the model decides to rm -rf /". Are you planning any kind of capability-based sandboxing, or just hoping the model doesn't get weird with the API credentials it has access to? FWIW, that's the bigger risk in our setup.
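The log-before-exec pattern lukebaze describes can be sketched roughly as below. This is an assumption-laden toy, not their wrapper: the Vector pipeline is stubbed as a JSON-lines file (which Vector could tail as a file source), and the `run_logged` name and `AUDIT_LOG` path are made up.

```python
import json
import shlex
import subprocess
import time

# Stand-in for a Vector file source; a real setup would ship this
# file (or a socket/HTTP sink) into the Vector pipeline.
AUDIT_LOG = "agent_shell_audit.jsonl"

def run_logged(cmd: list[str]) -> subprocess.CompletedProcess:
    """Append the invocation to the audit log *before* executing it,
    so even a destructive command leaves a record."""
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps({"ts": time.time(), "cmd": shlex.join(cmd)}) + "\n")
    return subprocess.run(cmd, capture_output=True, text=True)
```

As the comment notes, this gives you forensics, not prevention: the record survives the command, but nothing here stops it from running.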
