The catalogue of prompt injection attacks

8 points by ildari 5 days ago · 3 comments

ildariOP 5 days ago

I recently gave a talk on prompt injection attacks and defences, and gathered them all in this article.

noah34 5 days ago

I've been wondering recently whether defense against prompt injection relies more on the system prompt, fine-tuning, and reinforcement training, or simply on how smart your model is.

ildariOP 5 days ago

smarter != safer.
A smarter model can figure out a more sophisticated attack when it follows an injection. I believe in non-deterministic defence: each action or input to the agent can escalate context sensitivity. More sensitive context -> less risk your agent can take.
I find the Bell-LaPadula model from the 1970s (https://en.wikipedia.org/wiki/Bell%E2%80%93LaPadula_model) pretty interesting for that approach.
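
A rough sketch of how that escalation could look, assuming a simple three-level labelling of inputs and output sinks. The names Sensitivity, AgentContext, observe and may_write are hypothetical and only for illustration; the check borrows the "no write down" rule from Bell-LaPadula together with a high-water mark on the context, and is not a complete defence on its own.

```python
from dataclasses import dataclass
from enum import IntEnum


class Sensitivity(IntEnum):
    # Hypothetical labels; a real system would define its own levels.
    PUBLIC = 0
    INTERNAL = 1
    SECRET = 2


@dataclass
class AgentContext:
    level: Sensitivity = Sensitivity.PUBLIC

    def observe(self, input_level: Sensitivity) -> None:
        # High-water mark: the context only ever escalates, never drops.
        self.level = max(self.level, input_level)

    def may_write(self, sink_level: Sensitivity) -> bool:
        # "No write down": once the context holds sensitive data,
        # the agent may only write to sinks at least as sensitive.
        return sink_level >= self.level


ctx = AgentContext()
can_post_before = ctx.may_write(Sensitivity.PUBLIC)   # nothing sensitive read yet
ctx.observe(Sensitivity.SECRET)                       # e.g. agent read a private file
can_post_after = ctx.may_write(Sensitivity.PUBLIC)    # public sink is now off limits
```

Here the same action (writing to a public sink) is allowed before the agent touches sensitive data and refused afterwards, which is the "more sensitive context -> less risk" rule in its simplest form.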
