Show HN: AgentWard – After an AI agent deleted files, I built a runtime enforcer

github.com

1 point by ratnaditya a month ago · 2 comments · 2 min read


I've spent time working on AI safety and kept running into the same problem: AI agents have far more access than they need, and the only thing stopping them from misusing it is a prompt. Prompts can be ignored. They can be overridden by prompt injection. They're not enforcement; they're a suggestion.

AgentWard is a proxy layer that sits between your agent and its tools and enforces permissions in code, outside the LLM context window. No matter what the model decides, the policy is what actually runs.

What it does:

- Scans your OpenClaw skills and flags risky permissions
- Detects dangerous skill combinations: pairs that are low-risk individually but become high-risk when chained together (email + web browser → data-exfiltration path)
- Enforces a YAML policy at runtime: ALLOW, BLOCK, APPROVE, REDACT
- Logs everything for audit
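To make the policy idea concrete, here is a sketch of what a YAML policy in this spirit could look like. The field names and structure are illustrative assumptions, not AgentWard's actual schema; only the four actions (ALLOW, BLOCK, APPROVE, REDACT) come from the post.

```yaml
# Hypothetical policy sketch; schema invented for illustration.
default: BLOCK          # deny anything not listed below
rules:
  - tool: read_file
    action: REDACT      # return output with sensitive data masked
  - tool: send_email
    action: APPROVE     # pause and wait for a human decision
  - tool: web_search
    action: ALLOW
  - tool: delete_file
    action: BLOCK       # never allowed, regardless of the prompt
```

Under a deny-by-default rule like this, a prompt-injected tool call simply fails at the proxy instead of reaching the tool.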

Getting started is one command: agentward init. It scans, shows your risk profile, and wraps your environment with a sensible default policy in under two minutes.

Honest caveats: currently tested on OpenClaw skills and Mac only. MCP server support and Windows are on the roadmap; contributions welcome. This is early and rough in places, but the core enforcement works.

I'm sharing it now because the problem is real and getting worse fast. Would love feedback from anyone running agents in production.

GitHub: github.com/agentward-ai/agentward
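The core mechanism described above (a proxy that checks every tool call against a policy, outside the model's context window) can be sketched in a few lines. This is a minimal illustration under assumed names (ToolProxy, the policy dict); it is not AgentWard's implementation or API.

```python
# Minimal sketch of policy enforcement at the tool boundary.
# The policy decides; the LLM's output cannot override it.
ALLOW, BLOCK, APPROVE, REDACT = "ALLOW", "BLOCK", "APPROVE", "REDACT"

class ToolProxy:
    """Hypothetical proxy sitting between an agent and its tools."""

    def __init__(self, tools, policy, default=BLOCK):
        self.tools = tools      # tool name -> callable
        self.policy = policy    # tool name -> action
        self.default = default  # deny by default
        self.audit = []         # every decision is logged for audit

    def call(self, name, *args, **kwargs):
        action = self.policy.get(name, self.default)
        self.audit.append((name, action))
        if action == BLOCK:
            raise PermissionError(f"{name}: blocked by policy")
        if action == APPROVE:
            # a real system would pause here for a human decision
            raise PermissionError(f"{name}: requires human approval")
        result = self.tools[name](*args, **kwargs)
        if action == REDACT:
            result = "[REDACTED]"  # mask sensitive output
        return result
```

The key design point is that the check runs in ordinary code, so a prompt injection can change what the model *asks for* but not what the proxy *permits*.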

shaivpidadi a month ago

Check out governsai: https://github.com/Governs-AI

Similar concept
