Olmo 3 is a fully open LLM

simonwillison.net

5 points by lumpa a month ago · 2 comments

andy99 a month ago

There was well-discussed research recently showing that training on LLM output can transfer traits of that LLM even if they are not expressed in the training data: https://alignment.anthropic.com/2025/subliminal-learning/

This suggests a workflow: train an evil model, generate innocuous outputs, post them on a website, “scrape” them as part of an “open” training set, train an open model that picks up the evil traits, then invite people to audit the training data.

Obviously I don’t think this happened here, just that auditable training data, and even the concept that LLM output can be traced to some particular data, offer false security. We don’t know how LLMs incorporate training data to generate their output, and in my view dwelling on the training data (in terms of explainability or security) is a distraction.

  • simonw a month ago

    That's really interesting. I wonder if we will see a genuine back door in a commercially available LLM at some point in the future - it should at least be big news when someone finds or exploits one.
