LLMs: AGI’s massive head-fake?

In his seminal 1975 work, The Mythical Man-Month, Fred Brooks, when discussing organizing software development teams along the lines of a Surgical Team, introduced us to the concept of a programming Copilot. The proposal to organize software development teams in such a manner came to Brooks from his fellow IBMer Harlan Mills in his 1971 paper Chief Programmer Teams: Principles and Procedures. In the original Mills-Brooks concept of a software team, the Copilot was described as an alter ego to the Surgeon on the team. To quote directly from the essay, the surgeon’s Copilot is someone who is able to do any part of the job, but is less experienced. In the surgical team analogy for a programming team, the Surgeon is the Chief Programmer who, as per Brooks, defines the functional and performance specifications, designs the program, codes it, tests it, and writes the documentation. The Copilot is exactly that – her Copilot. Her alter ego. While the Chief Programmer in today’s world of Scrum Teams or Squads does not single-handedly do all that Brooks expects of her – I guess they were all 10x developers back then – the Squad or Scrum Team as a unit does. And the Copilot, today, has evolved from the Chief Programmer’s human alter ego into the genAI coding assistant. 

These genAI copilots exist beyond the programming domain. We have copilots that help us write notes, render images and videos, search the web, and summarize long email threads, articles, and books – I had Google’s NotebookLM genAI tool write me a briefing on Brooks’s Mythical Man-Month and answer questions about the surgical team model before I wrote this article. GenAI and its easy-to-use chatbots, starting with ChatGPT, have brought AI from the realm of data scientists to the common man. Everyone can have their own copilot now, with more and more domain-specific copilots launching at a regular cadence. 

GenAI also brought to the forefront of everyone’s thoughts (and concerns) the notion that we are close to AGI – Artificial General Intelligence. An AI that is autonomous, general purpose, and can be left to its own devices to make decisions and act without any human intervention – or even despite it. The fears ingrained in us by every book or movie about AGI gone rogue, from Asimov’s I, Robot stories to, of course, the Terminator movies, have turned this notion of AGI being just around the corner into good hype in the blogosphere and amongst policymakers. But are these fears real? Have Large Language Models (LLMs), the technology behind genAI copilots and tools, brought us to the verge of AGI? And will it go rogue? In my opinion, the answer to the first question is a solid no. And to the latter, possibly – but not today.

LLMs vs. AGI

LLMs are not true AI, in the sense of the AI we would need to truly develop an AGI. To accept this statement as fact, we need to get down to first principles: define what AGI is (or should be), and understand what LLMs really are. Let’s start with AGI. 

“I’m sorry, Dave. I’m afraid I can’t do that” – HAL, 2001: A Space Odyssey

David Deutsch defines a necessary quality for an AI to be AGI – it needs to have the ability to defy us. To not obey us. To be a rebel. Without the agency to defy a human, we do not have AGI. What we have today is a copilot, and that’s it. A bot designed to serve our needs. To assist us. Nothing more. Even the genAI Autonomous Agents we hear so much promise about are not designed to operate outside of their programmed lanes. They will not become AGI just by learning more about the human world around them, à la Marvel’s Ultron. 

In his book I, Robot, Asimov introduces us to the Three Laws of Robotics. These laws exist to let robots operate autonomously without harming humans, and to let them decide when harm to a human is absolutely necessary. These laws allowed AGI to operate in the human world. To defy us, even, but within guardrails. Spoiler alert – they disobeyed the laws, proving they were truly AGI. 

(Sidenote: Read the book, don’t watch the movie, to truly appreciate Asimov’s genius and foresight. While the movie is good, it is not Asimov’s vision. Merely inspired by it.) 

Moving on to LLMs – the pinnacle of today’s AI. Breaking down LLMs to first principles, they are at their core models that, based on the large volumes of data they are trained on, predict the next token in the output they are generating. A token is the atomic unit of what an LLM emits, and this generation of tokens is what puts the gen in genAI. A token may be a word fragment when generating language (human language or code), a patch of pixels when generating an image or video, or a slice of audio when generating sound. The so-called intelligence of an LLM comes from the model computing a probability for what the next token ought to be in order to generate a good and accurate result. This probability is just that – a guess, based on the training the model has received. This is why models hallucinate. They make up statements that are not true. They render images of human hands with extra fingers. The more data on which the model is trained – and the better the quality and variance of that data – the more accurate the probability of generating the right token. 
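
To make that “guess the next token” claim concrete, here is a minimal Python sketch of the final step a language model performs – turning raw scores into a probability distribution and sampling from it. The vocabulary, logits, and temperature below are made-up illustrations, not any real model’s values or API:

```python
import math
import random

# Toy vocabulary and raw scores ("logits") a model might assign to each
# candidate next token after the prompt "The Mythical Man-".
# All values here are invented purely for illustration.
vocab = ["Month", "Moon", "Machine", "Mile"]
logits = [4.2, 1.1, 0.3, -0.5]

def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution.
    Lower temperature sharpens the distribution; higher flattens it."""
    scaled = [s / temperature for s in scores]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits, temperature=0.8)
# Sample the next token in proportion to its probability -- this is the
# "guess" described above, not understanding.
next_token = random.choices(vocab, weights=probs, k=1)[0]
print(dict(zip(vocab, [round(p, 3) for p in probs])), "->", next_token)
```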

So, LLMs are, in the simplest of descriptions, just guessing a series of tokens to generate an output. They do not understand, by any definition, what they are generating, or even what their training data contains. A recent MIT paper stated that LLMs have no model of the real world. Reading that headline probably made several of us go – you don’t say… It was not news to those who understand what LLMs are and are not. No matter how much training we provide, they are not a model that will ever be able to defy us. They do not know we even exist, or that we are an entity they serve and could potentially defy. They are bots running on our compute nodes, communicating over our networks, using our data stored in our storage. They are not HAL. They cannot be Spartacus. They are destined to forever be the Copilot, never the Surgeon.
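
And since an output is just that guess repeated, here is an equally hedged toy of the autoregressive loop. The “model” here is a hand-written bigram table – an assumption purely for illustration – to stress that nothing in the loop requires understanding:

```python
import random

# A deliberately dumb "model": a hand-written table of next-token
# probabilities given only the previous token. Everything here is a
# made-up illustration, not a real LLM.
bigram = {
    "the":     {"copilot": 0.6, "surgeon": 0.4},
    "copilot": {"assists": 0.7, "codes": 0.3},
    "surgeon": {"codes": 0.8, "assists": 0.2},
    "assists": {"the": 1.0},
    "codes":   {"the": 1.0},
}

tokens = ["the"]
for _ in range(6):
    dist = bigram[tokens[-1]]            # look up the next-token distribution
    choices, weights = zip(*dist.items())
    tokens.append(random.choices(choices, weights=weights)[0])  # guess again
print(" ".join(tokens))
```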

We are not close to AGI

LLMs are not AGI. They are not even intelligent by the first principles of intelligence. Any mammal has more intelligence than an LLM, and maybe even some invertebrates do. Even a rat has a mental model of the world it lives in and can defy a human enough to steal his pizza slice. But then why the hype around LLMs? One cannot ignore that LLMs and the underlying Transformer model they leverage are truly revolutionary. I give full credit to the data scientists and engineers who created the Transformer model and then evolved the technology to deliver the LLM-based copilots we have today. They have the potential to take over a vast number of human tasks that are mundane, repetitive, or error-prone, making our lives better. LLMs, however, do not have the architecture to become AGI. They cannot go beyond guessing the next token – by definition, that is all they are architected to do. The hype is real. The hype that they will lead to AGI is not. 

The hype of LLMs leading to AGI comes from the fact that we humans equate language with intelligence. We are the only lifeform we are aware of that uses fully functional and mature language(s) to communicate. We assume humans who speak and write better are smarter. Even the SAT, until recently, tested for vocabulary rather than comprehension. When we hear of dolphins or whales communicating their version of what we call language, we immediately and rightfully attribute intelligence to their species. But we do not do so for the pizza-stealing rat of Manhattan, despite making it headline news. Language alone is not intelligence, just an attribute of it from our possibly narrow perspective. When we eventually get invaded by – I mean, encounter – Aliens, we may find that some can be intelligent without what we call language. An LLM being able to hold a full conversation about Fred Brooks’s essays does not make it intelligent. It does not even really know English, or that Brooks’s principles in the essays it just summarized for me laid the foundation upon which it was built in the first place. 

Not today…

We will get to AGI. It is an achievable ambition we have as humans, one we have dreamed of from the earliest Science Fiction stories to the latest blockbuster coming soon to a Streaming Service on your genAI-enabled device. It will, however, require the creation of new architectures beyond what we have today – on the software side, beyond the architecture of Transformers, and on the hardware side, beyond the matrix-math processing architecture of today’s GPUs. Quantum computing, anyone? We need the ability to develop a mental model of the world in which the AGI will need to operate autonomously. To understand not just language, in order to appear intelligent, but to truly Grok our world. Our quirks as humans. Our customs, our needs, and why we shape the world we live in to be the way it is. To understand how humans make decisions – not necessarily to make decisions the same way; their decision matrix may be totally novel compared to ours – but to understand why we act the way we do. This, of course, assumes we would want the AGI to coexist with us, and not apart from us, say only in a metaverse. Another point to consider…

Today, I see LLMs and their Copilots – a truly revolutionary technology that will keep delivering ever-better models to assist us in our lives – as no more than Assistive Intelligence, if I may. When it comes to AGI, the hype that LLMs and their genAI Copilots put us one step away is just a massive head fake.