Settings

Theme

An AI Firewall for Prompt Injection

1 points by unknownhad 4 months ago · 0 comments · 1 min read

Reader

Prompt injection is when a user tricks the model into ignoring prior instructions revealing system prompts, disabling safeguards or acting outside intended boundaries.

I first saw it live during DEF CON (31) finals and have since seen it exploited in bug bounty reports and research.

This is a small proof-of-concept that works like an “AI firewall”

detecting injection attempts before they reach your LLM with almost no added latency.

Blog post: https://blog.himanshuanand.com/posts/2025-08-10-detecting-llm-prompt-injection/

Demo/API: https://promptinjection.himanshuanand.com/

fast, API friendly and has a UI for testing bypass attempts (For CTF enthusiastic people like me). Feedback and break attempts welcome.

No comments yet.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection