Every Language Model Has a Forgery-Resistant Signature

7 points by mattfinlayson 3 months ago · 2 comments

Neat! The idea is that _all_ output embeddings must lie on a given ellipse: layernorm puts them on a hypersphere, and the final linear layer then distorts that hypersphere into an ellipse.

Since the ellipse is given by the parameters of the model, it is characteristic of the model. And you can pretty easily verify whether a given embedding (probably) came from that model simply by checking if it lies on that ellipse.
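To make the "check if it lies on that ellipse" step concrete, here is a minimal sketch under simplifying assumptions, not the paper's actual procedure. It assumes the normalization step puts hidden states exactly on the unit sphere and that the final layer is an affine map `W @ u + b`; it ignores the softmax and any learned layernorm scale. The names `W`, `b`, and `on_ellipse` are made up for illustration.

```python
import numpy as np

def on_ellipse(y, W, b, tol=1e-6):
    """Return True if y = W @ u + b for some unit-norm u (within tol)."""
    # Undo the affine map in the least-squares sense.
    u, *_ = np.linalg.lstsq(W, y - b, rcond=None)
    # y must lie in the affine subspace spanned by the columns of W ...
    in_span = np.linalg.norm(W @ u + b - y) <= tol * max(1.0, np.linalg.norm(y))
    # ... and its preimage must sit on the unit sphere.
    on_sphere = abs(np.linalg.norm(u) - 1.0) <= tol
    return bool(in_span and on_sphere)

rng = np.random.default_rng(0)
d, v = 8, 32                      # toy hidden size and vocabulary size
W = rng.standard_normal((v, d))   # stand-in for the final linear layer
b = rng.standard_normal(v)

h = rng.standard_normal(d)
u = h / np.linalg.norm(h)         # stand-in for the normalized hidden state

genuine = W @ u + b               # produced by the "model"
forged = rng.standard_normal(v)   # arbitrary vector, not from the model
scaled = W @ (1.5 * u) + b        # right subspace, wrong radius

print(on_ellipse(genuine, W, b))
print(on_ellipse(forged, W, b))
print(on_ellipse(scaled, W, b))
```

The check needs the model parameters (or an estimate of the ellipse), which is why verification is cheap for whoever holds the weights.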

Recovering the ellipse without access to the model weights takes a large number of embeddings, so forging it is not terribly practical.
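One way to see why the sample count grows quickly: a general ellipse in d dimensions, written as a quadratic equation, has d(d+1)/2 + d free coefficients, so a naive fit needs at least that many points. The toy sketch below fits those coefficients by ordinary least squares in d = 4; it is an illustration with made-up helper names (`quad_features`, `fit_ellipse`, `sample`), not the attack analyzed in the paper, and it assumes a square, well-conditioned linear map.

```python
import numpy as np

def quad_features(Y):
    """Monomials y_i * y_j (i <= j) followed by the linear terms y_i."""
    i, j = np.triu_indices(Y.shape[1])
    return np.hstack([Y[:, i] * Y[:, j], Y])

def fit_ellipse(Y):
    """Least-squares coefficients theta with quad_features(y) @ theta ~= 1."""
    theta, *_ = np.linalg.lstsq(quad_features(Y), np.ones(len(Y)), rcond=None)
    return theta

def max_error(theta, Y):
    """Largest deviation from the fitted ellipse equation over the rows of Y."""
    return float(np.max(np.abs(quad_features(Y) @ theta - 1.0)))

def sample(n, A, c, rng):
    """Draw n points A @ u + c with u uniform on the unit sphere."""
    U = rng.standard_normal((n, A.shape[1]))
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    return U @ A.T + c

rng = np.random.default_rng(0)
d = 4
A = np.eye(d) + 0.2 * rng.standard_normal((d, d))  # toy linear layer
c = 0.2 * rng.standard_normal(d)                   # toy bias
n_unknowns = d * (d + 1) // 2 + d                  # quadratic + linear terms

held_out = sample(50, A, c, rng)
err_few = max_error(fit_ellipse(sample(n_unknowns // 2, A, c, rng)), held_out)
err_many = max_error(fit_ellipse(sample(4 * n_unknowns, A, c, rng)), held_out)

print("unknowns:", n_unknowns)
print("held-out error, too few samples:", err_few)
print("held-out error, enough samples:", err_many)
```

With fewer points than unknowns the fit matches its training points but not fresh ones. At realistic hidden sizes the unknown count runs into the millions.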

This easy-to-verify, hard-to-forge property naturally lends itself to fingerprinting. Note that the authors call out that it's not cryptographic grade.

mattfinlayson OP 3 months ago

Great summary! This is exactly the idea.
