People still don’t seem to grasp how insane the structure of language revealed by LLMs really is.

All structured sequences fall into one of three categories:

1. Those generated by external rules (like chess, Go, or Fibonacci).
2. Those generated by external processes (like DNA replication, weather systems, or the stock market).
3. Those that are self-contained, whose only rule is to continue according to their own structure.

Language is the only known example of the third kind that does anything. In fact, it does everything. Train a model only to predict the next word, and you get the full expressive range of human speech: reasoning, dialogue, humor. There are no rules to learn outside the structure of the corpus itself. Language’s generative law is fully “immanent”: its cause and continuation are one and the same. To learn language is simply to be able to continue it; the rule of language is its own continuation.

From this we can conclude three things:

1) You don’t need an innate grammar, or any external grammar or world model; the corpus already contains its own generative structure. Chomsky was wrong.
2) Language is the only known self-contained system that produces coherent, functional output.
3) This forces the conclusion that humans generate language the same way. To suggest there’s an external rule system that LLMs just happen to duplicate perfectly is absurd; the simplest and only coherent explanation is that the generative structure they capture is the structure of human language itself.

LLMs didn’t just learn patterns. They revealed what language has always been: an immanent generative system, singular among all possible ones, and powerful enough to align minds and build civilization. Wtf.
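
To make the contrast between the first and third categories concrete, here is a toy sketch in Python. It is only an illustration under loud assumptions: the function names and the tiny corpus are invented for this example, and a bigram counter is a vastly cruder thing than an LLM. The point it shows is narrow. Fibonacci needs a rule written down outside the sequence, while the text model is handed nothing but the corpus and continues it from the corpus's own statistics.

```python
from collections import Counter, defaultdict


def fibonacci(n):
    """Category 1: the sequence follows a rule stated outside the data."""
    seq = [0, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]


def train_bigram(corpus):
    """Category 3 (toy version): count which word follows which.

    The corpus is the only input; no grammar is supplied.
    """
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def continue_text(counts, start, length):
    """Greedily extend `start` with the most frequent next word."""
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last word never had a successor in the corpus
        out.append(followers.most_common(1)[0][0])
    return out


# Hypothetical toy corpus, made up for this illustration.
corpus = "the cat sat on the mat and the cat sat on the rug"
model = train_bigram(corpus)

print(fibonacci(8))
print(" ".join(continue_text(model, "the", 4)))
```

Starting from "the", the greedy continuation comes out as "the cat sat on the", which is just the most common path through the toy corpus. Nothing here demonstrates the stronger claims in the post; it only shows what "the rule is the continuation" looks like at the smallest possible scale.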