The catalogue of prompt injection attacks

archestra.ai

8 points by ildari 5 days ago · 3 comments

ildariOP 5 days ago

I recently gave a talk on prompt injection attacks and defences, and gathered them all in this article.

noah34 5 days ago

i've been wondering recently whether defense against prompt injection relies more on the system prompt + fine-tuning and reinforcement training, or simply on how smart your model is.

  • ildariOP 5 days ago

    smarter != safer.

    A smarter model can figure out a more sophisticated attack when following an injection. I believe in non-deterministic defence: each action or input to the agent can escalate context sensitivity. More sensitive context -> less risk your agent can take.

    I find the Bell-LaPadula model from the 1970s (https://en.wikipedia.org/wiki/Bell%E2%80%93LaPadula_model) pretty interesting for that approach.
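
One possible reading of the "escalating context sensitivity" idea, sketched in Python. This is an illustration only, not the article's or any product's implementation. The names `Level`, `AgentContext`, `observe`, and `may_write` are hypothetical, and the three levels are an assumed classification. It borrows two ideas from Bell-LaPadula: a high-water mark (the context's sensitivity only rises as the agent reads more sensitive input) and "no write down" (the agent may not send output to a sink less sensitive than its current context).

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Assumed sensitivity levels; a real system would define its own."""
    PUBLIC = 0
    INTERNAL = 1
    SECRET = 2


@dataclass
class AgentContext:
    """Tracks how sensitive the agent's context has become (hypothetical)."""
    level: Level = Level.PUBLIC

    def observe(self, source_level: Level) -> None:
        # High-water mark: reading an input can raise the context level,
        # but never lower it.
        self.level = max(self.level, source_level)

    def may_write(self, sink_level: Level) -> bool:
        # "No write down": block actions whose sink is less sensitive
        # than what the context has already seen.
        return sink_level >= self.level


ctx = AgentContext()
print(ctx.may_write(Level.PUBLIC))   # allowed while the context is public
ctx.observe(Level.SECRET)            # e.g. the agent read a private document
print(ctx.may_write(Level.PUBLIC))   # now refused: would leak downward
```

Under these assumptions, a successful injection in a highly sensitive context has fewer actions available to abuse, regardless of how capable the model is.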
