Ask HN: How do GPTs grok high-level concepts, beyond word-level transformers?
My background: I'm familiar with pre-GPT machine learning and DNNs.
I've read the relevant papers and gone through many explanations of how transformers work.
Often those explanations spend thousands of words explaining attention at the word level, and then just say a few words along the lines of "oh, and with multiple attention heads it focuses on different aspects, and then multiple layers, and then, magic!"
What's happening in those other aspects? What are they? Are there papers that investigate what kind of concepts the model is actually building/learning in those heads and layers?
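To make sure I'm asking about the right thing, here's roughly the mental model I have of multi-head attention, as a toy sketch (not any real model's code; every name, shape, and weight below is made up, and a real GPT block would also add a causal mask, residual connections, layer norms, and an MLP):

    import torch

    def attention(q, k, v):
        # q, k, v: (seq_len, d_head); classic scaled dot-product attention
        scores = (q @ k.T) / (k.shape[-1] ** 0.5)
        return torch.softmax(scores, dim=-1) @ v

    def multi_head_self_attention(x, n_heads, Wq, Wk, Wv, Wo):
        # x: (seq_len, d_model). Each head runs the *same* mechanism on its own
        # learned projection of the token vectors; the heads' outputs are then
        # concatenated and mixed back together by Wo.
        d_head = x.shape[-1] // n_heads
        outs = []
        for h in range(n_heads):
            cols = slice(h * d_head, (h + 1) * d_head)
            outs.append(attention(x @ Wq[:, cols], x @ Wk[:, cols], x @ Wv[:, cols]))
        return torch.cat(outs, dim=-1) @ Wo

    seq_len, d_model, n_heads = 5, 16, 4
    x = torch.randn(seq_len, d_model)                  # one vector per token
    Wq, Wk, Wv, Wo = (torch.randn(d_model, d_model) * 0.1 for _ in range(4))
    y = multi_head_self_attention(x, n_heads, Wq, Wk, Wv, Wo)
    print(y.shape)  # (5, 16): still one vector per token, which the next layer
                    # transforms again with its own heads

So I can follow the mechanics; my question is what those different projections and stacked layers end up representing.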
There are large teams who spend months tuning those models. Do those teams have access to those internal concepts that the model built up and organized? Is any of this work public?
In computer vision and CNNs, I recall seeing a paper once that showed that each layer of the network was learning a higher-level feature than the layer before it (as an inaccurate example: the first layer learns edges, the second layer shapes, the third layer textures, the fourth layer objects, etc., and they show you the eigenvectors of each as representatives).
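The kind of probing I have in mind is roughly activation maximization: synthesize an input that maximally excites one channel of one layer and look at what it shows. A crude sketch of the idea (assuming torchvision's pretrained VGG-16; the layer and channel indices are arbitrary picks, and real visualizations add regularization that I've skipped):

    import torch
    from torchvision.models import vgg16

    model = vgg16(weights="DEFAULT").features.eval()
    for p in model.parameters():
        p.requires_grad_(False)

    LAYER, CHANNEL = 10, 42   # deeper layers tend to respond to more abstract patterns

    img = torch.randn(1, 3, 224, 224, requires_grad=True)
    opt = torch.optim.Adam([img], lr=0.05)

    for _ in range(200):
        opt.zero_grad()
        x = img
        for i, layer in enumerate(model):    # run only up to the chosen layer
            x = layer(x)
            if i == LAYER:
                break
        loss = -x[0, CHANNEL].mean()         # gradient ascent on the channel's activation
        loss.backward()
        opt.step()

    # `img` now roughly shows the pattern this channel is tuned to; doing this for
    # early vs. late layers is what gives the edges -> textures -> objects progression.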
E.g. I asked ChatGPT to tell me a joke about a table in a sundress in the voice of a famous stoic person. Judging by its response, it adequately "understands" what that person's style sounds like, basic humor, the concept of clothing, and how to map that onto an inanimate object (punchline: "I figured if a chair can wear a seat cushion, why can't I wear a sundress?")...
(Obviously this is a tame example, but it serves its purpose for the discussion.)

> Are there papers that investigate what kind of concepts the model is actually building/learning in those heads and layers?

> There are large teams who spend months tuning those models. Do those teams have access to those internal concepts that the model built up and organized? Is any of this work public?

See: https://openai.com/research/language-models-can-explain-neur...

My understanding: Generally, the models are compressing their understanding of all text, and in doing so they're learning higher-order concepts that allow their compression of all the text they were fed during pre-training to be better: more compressed, with less loss.

> Generally, the models are compressing their understanding of all text, and in doing so they're learning higher-order concepts

Are these higher-order concepts accessible to us? E.g. can we list those learned concepts? (Re-reading the paper you linked now...)

My understanding is that the answer is generally: not yet. (I wish; I suspect we'll be able to learn some interesting things about the universe, about humans, and so on, by seeing what LLMs found to be highly explanatory, higher-order concepts.)
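The closest thing I know of to listing them by hand is looking at what makes individual neurons fire, which, as I understand it, is roughly what the linked paper automates by having GPT-4 write the explanations. A rough sketch of the manual version (assuming the Hugging Face transformers GPT-2 implementation; the layer and neuron indices are arbitrary picks):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    LAYER, NEURON = 6, 373   # arbitrary choices, purely for illustration

    captured = {}
    def grab(module, inputs, output):
        captured["acts"] = output.detach()   # (batch, seq, 3072) for GPT-2 small

    # Hook the hidden layer of one block's MLP (pre-activation here), whose units
    # are the "neurons" this kind of probing usually looks at.
    handle = model.transformer.h[LAYER].mlp.c_fc.register_forward_hook(grab)

    text = "I figured if a chair can wear a seat cushion, why can't I wear a sundress?"
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        model(**enc)
    handle.remove()

    tokens = tok.convert_ids_to_tokens(enc["input_ids"][0])
    scores = captured["acts"][0, :, NEURON].tolist()
    for t, s in sorted(zip(tokens, scores), key=lambda p: -p[1]):
        print(f"{t!r:>15}  {s:+.3f}")   # which tokens this neuron responds to most

Run over a large corpus instead of one sentence, the top-activating snippets are the raw material you'd try to summarize into a "concept"; doing that across a whole model is a big part of why the answer is still "not yet".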
They have known for a long time that text completion is what is called 'AI-complete', meaning that if you have full AGI then it can do human-level text completion, and if you have human-level text completion then it can do full AGI. So they found a way, using an obscene number of model parameters, obscene compute power, and an obscene dataset size, to get really, really good at text completion. So now they've got these systems that, looking back, they are going to call just AGI. So in simpler words, it works because the computers' brains got so big that they are now conscious like you and me.

> the computers' brains got so big that they are now conscious like you and me.

I think this is the sort of gross misrepresentation that makes people convinced the computer is alive. I wouldn't really go there; they can produce text, but there's more to consciousness than convincing someone you're conscious. If I record a tape of myself saying "I am alive", the tape is not conscious. If I feed a Markov chain texts on human consciousness, it will not become conscious. Now we train AI chatbots on replicating human responses, and people are willing to equate that to consciousness? It sounds like people lack context for what these models are in the first place.

> If I feed a Markov chain texts on human consciousness, it will not become conscious.

I'm not sure about this one. These LLMs are technically Markov chains, in the most pedantic sense.
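Pedantic in this sense: the context window is finite, so the "state" is just the last N tokens, and the next-token distribution depends on that state and nothing earlier. A toy sketch of the point (the transition function below is a made-up stand-in, not a real LM):

    import random

    CONTEXT = 4   # stand-in for the model's context window

    def next_token_dist(state):
        # Stand-in for an LLM forward pass: any function that maps the current
        # state (the last CONTEXT tokens) to a distribution over the vocabulary
        # defines a valid Markov transition; earlier history can't influence it.
        vocab = ["the", "chair", "wore", "a", "sundress", "."]
        weights = [1.0 + (hash((state, w)) % 5) for w in vocab]  # made-up numbers
        total = sum(weights)
        return {w: wt / total for w, wt in zip(vocab, weights)}

    def step(history):
        state = tuple(history[-CONTEXT:])            # the Markov state
        dist = next_token_dist(state)
        return random.choices(list(dist), weights=list(dist.values()))[0]

    history = ["the", "chair"]
    for _ in range(8):
        history.append(step(history))
    print(" ".join(history))

The state space is enormous (vocabulary size to the power of the context length), which is what makes this true only in the pedantic sense.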
Third, "text completion" has been a feature of messaging applications for years and has thus far not qualified as being an AGI. In the field of artificial intelligence, the most
difficult problems are informally known as AI-complete
or AI-hard, implying that the difficulty of these
computational problems, assuming intelligence is
computational, is equivalent to that of solving the
central artificial intelligence problem—making computers
as intelligent as people, or strong AI.[1] To call a
problem AI-complete reflects an attitude that it would
not be solved by a simple specific algorithm.