Every few months, we hear another confident prediction: AGI by 2026, superintelligence by 2027. The CEOs of major AI companies paint a picture of machine intelligence surpassing humans just around the corner, but when you look past the headlines and listen to the researchers actually building these systems, a different story emerges.
The Researchers Who Actually Build AI Are More Cautious
Let’s start with the people who understand these systems from the ground up, not from the boardroom.
Yann LeCun, Meta’s Chief AI Scientist and one of the godfathers of deep learning, has been remarkably consistent: current large language models won’t get us to human-level intelligence. In a recent interview, he explained that most human knowledge doesn’t come from text; it comes from our experience with the physical world in our first years of life. LLMs miss this entirely. They can write eloquently about gravity but have never dropped a ball. They can describe a cat but have never felt fur. This isn’t a minor gap; it’s a fundamental limitation. LeCun thinks we need completely different approaches, including what he calls “world models,” before we can talk seriously about AGI. His timeline? At least a decade, probably much longer.
Ilya Sutskever, who co-founded OpenAI and recently started his own company Safe Superintelligence, has a nuanced view. In his most recent interview, he made a crucial distinction: we’re shifting from an “age of scaling” back to an “age of research.” What does that mean? Simply throwing more computing power and data at current models isn’t working anymore. The easy gains are done. He estimates AGI could take anywhere from 5 to 20 years, and importantly, he frames it as a set of open research problems, not an engineering challenge where we just need to turn the knobs higher.
Andrej Karpathy, OpenAI co-founder and former head of AI at Tesla, goes even further. In an October 2025 podcast, he said we’re looking at the “decade of agents,” not the “year of agents.” His reasoning is grounded in daily reality: current AI agents are impressive for boilerplate code, but they struggle with anything remotely novel. They’ll loop through the same mistakes repeatedly. They can’t learn on the job. They lack what he calls “cognitive” capabilities that would let them function like even a junior intern. His timeline for truly useful agents? About ten years.
So Why Are CEOs Saying 2–3 Years?
This is where things get interesting. Sam Altman of OpenAI talks about AGI arriving in “a few thousand days.” Dario Amodei of Anthropic suggests 2026 or 2027. These aren’t stupid people; they’re smart businesspeople operating in a very specific context.
Here’s what matters: these companies need to raise enormous amounts of money. We’re talking billions, soon trillions, for data centers and computing infrastructure. When you’re asking investors for that kind of capital, “maybe in 20 years if we solve several fundamental research problems” isn’t a compelling pitch. “AGI in 2-3 years” is.
There’s also the Microsoft and OpenAI dynamic. According to reports, Microsoft’s special access to OpenAI models ends when AGI is achieved. Defining AGI as imminent changes the negotiating position entirely.
And finally, there’s competitive pressure. If Anthropic says 2026, OpenAI feels pressure to match or beat that timeline. If Google DeepMind suggests it’s coming soon, Meta has to respond. This creates a feedback loop of increasingly aggressive predictions.
But here’s the thing: wanting AGI in 2 years and achieving AGI in 2 years are completely different problems.
Why Current Infrastructure Isn’t Enough
You might think, “But we’re building all these new data centers and GPUs! Won’t that help?” Yes and no.
New infrastructure will definitely speed up research and improve inference (how fast models respond). These are good things. But infrastructure doesn’t solve fundamental architectural problems. It’s like saying we’ll solve climate change by building bigger air conditioners: more power doesn’t fix the underlying issue.
Current transformer architecture, the backbone of all major language models, has some deep limitations:
The Quadratic Attention Problem
Attention mechanisms, the core of how transformers work, have something called quadratic complexity. Without getting too technical, this means that if you double the length of text a model can handle, the computational cost doesn’t double; it quadruples. Want to go from 100,000 tokens to 200,000? You need four times the computing power, not twice as much.
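To make that concrete, here’s a toy sketch (illustrative Python only, not how any production model is written): the attention score matrix has one entry for every pair of tokens, so its size grows with the square of the sequence length.

```python
# Minimal sketch: the attention score matrix is (n, n), so doubling the
# number of tokens quadruples the entries that must be computed and stored.
import numpy as np

def attention_cost(n_tokens: int, d_model: int = 64) -> int:
    q = np.random.randn(n_tokens, d_model)   # queries
    k = np.random.randn(n_tokens, d_model)   # keys
    scores = q @ k.T                         # shape (n_tokens, n_tokens)
    return scores.size                       # number of pairwise scores

print(attention_cost(1_000))   # 1,000,000 entries
print(attention_cost(2_000))   # 4,000,000 entries: 2x the tokens, 4x the work
```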
Researchers have developed workarounds like sparse attention and sliding window attention, where models only look at nearby tokens instead of every possible combination. But these tricks come with tradeoffs. You lose the ability to connect distant pieces of information. The model becomes less capable of understanding complex relationships across long documents.
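Here’s a rough picture of what a sliding-window scheme does, again as a toy sketch with made-up parameter names rather than any particular library’s implementation:

```python
# Toy illustration of a sliding-window mask: each token may only attend to
# the previous `window` tokens, so cost grows roughly linearly with length.
import numpy as np

def sliding_window_mask(n_tokens: int, window: int) -> np.ndarray:
    mask = np.zeros((n_tokens, n_tokens), dtype=bool)
    for i in range(n_tokens):
        start = max(0, i - window + 1)
        mask[i, start:i + 1] = True   # attend only to a local neighborhood
    return mask

mask = sliding_window_mask(n_tokens=8, window=3)
print(mask.sum(axis=1))   # at most 3 visible tokens per position
```

The tradeoff is right there in the mask: the cost stays manageable, but the last token can no longer “see” the first one directly, which is exactly the lost long-range connection described above.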
Most current models top out around 250,000 to 1 million tokens of context. That sounds like a lot until you realize that a single moderately sized codebase can exceed that. A company’s full documentation? Forget it. The world’s medical knowledge? Not even close.
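A quick back-of-envelope calculation shows why. Every number below is an illustrative assumption, not a measurement:

```python
# Rough estimate; 4 characters per token is a common rule of thumb, and the
# codebase size here is a made-up example, not data from a real project.
lines_of_code = 500_000        # a moderately sized codebase
chars_per_line = 40            # rough average, including whitespace
chars_per_token = 4            # crude approximation for English and code

tokens_needed = lines_of_code * chars_per_line / chars_per_token
print(f"{tokens_needed:,.0f} tokens")    # 5,000,000 tokens
print(tokens_needed / 1_000_000)         # ~5x a 1-million-token context window
```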
The Thinking Problem
Here’s something that sounds obvious once you hear it: current AI doesn’t actually “think” in any meaningful sense. It generates tokens (words or parts of words) one at a time, based on patterns it learned during training. Each token is essentially an autocomplete suggestion. When you see Claude or ChatGPT “thinking,” what’s really happening is that it’s generating more tokens, some visible to you, some not.
This creates weird limitations. The model spends the same computational effort on “2+2=” as it does on “prove the Riemann hypothesis.” A hard problem should require more thinking time, more cognitive effort. Humans slow down for difficult problems, sometimes mulling them over for days. Current AI can’t do that. Chain-of-thought prompting helps by having the model write out its reasoning, but Sutskever notes this is more of a trick than a solution. The model still has no way to know when it should think harder or differently.
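If you want to see why, here’s a stripped-down sketch of what token-by-token generation looks like. The `model` and `tokenizer` objects are placeholders, not a real library’s API:

```python
# Sketch of autoregressive decoding: every token comes from the same
# fixed-cost forward pass, whether the prompt is trivial or a famous
# open problem. `model` and `tokenizer` are stand-ins, not a real API.
def generate(model, tokenizer, prompt: str, max_new_tokens: int = 50) -> str:
    tokens = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        logits = model.forward(tokens)       # one fixed-cost pass per token
        next_token = logits[-1].argmax()     # pick the most likely next token
        tokens.append(next_token)
        if next_token == tokenizer.eos_id:   # stop when the model says "done"
            break
    return tokenizer.decode(tokens)

# The loop runs identically for "2+2=" and "prove the Riemann hypothesis";
# nothing in it allocates extra compute to the harder question.
```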
The Learning Problem
When you tell Claude something in a conversation, it seems to remember. But that’s different from actually learning. The model’s weights, the numbers that encode its knowledge, don’t change. What’s happening is simpler: your conversation exists in a context window, and the model can reference it. Once that conversation ends, that “memory” is gone.
Real intelligence requires continual learning. A human intern makes a mistake, you correct them, and they actually learn for next time. Their neural pathways physically change. Current AI systems can’t do this. They’re frozen after training. This is a massive limitation that scaling won’t solve.
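A minimal sketch of that distinction, with illustrative class and method names of my own:

```python
# "Remembering" within a chat is just re-sending the conversation as context;
# the weights never change. All names and the example dialogue are made up.
class FrozenChatModel:
    def __init__(self, weights):
        self.weights = weights               # fixed after training, never updated

    def predict(self, prompt: str) -> str:
        # stand-in for a forward pass that reads, but never writes, self.weights
        return f"(reply conditioned on {len(prompt)} chars of context)"

    def reply(self, conversation: list[str]) -> str:
        prompt = "\n".join(conversation)     # past turns ride along as plain text
        return self.predict(prompt)

chat = FrozenChatModel(weights="...")
history = ["User: my name is Priya."]
history.append(chat.reply(history))             # the name is "remembered" via context
print(chat.reply(["User: what's my name?"]))    # fresh conversation: the memory is gone
```

The correction only “sticks” while it sits in the conversation list. Nothing ever writes back into the weights, which is exactly the gap between referencing context and actually learning.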
The Technical Gap Between Hype and Reality
Let’s get into some of the deeper issues that don’t make it into CEO presentations:
Models are “jagged.” Sutskever’s term, and it’s perfect. Current models can ace PhD-level questions on benchmarks but then fail at tasks a child could do. They might solve a complex integral but can’t reliably tell you which event happened first when both dates are listed. This inconsistency reveals that they’re not actually understanding; they’re pattern matching in sophisticated but ultimately brittle ways.
Data limitations are real. We’ve basically trained on the entire internet. Where do you go from there? Companies are now paying for proprietary data, creating synthetic data (having AI generate training data), or trying to figure out how to train on video and real-world interaction. None of these are slam dunks.
The architecture might just be wrong for AGI. LeCun’s been saying this for years: LLMs trained on text prediction are fundamentally limited. They’re mimicking human output without human experience. It’s like trying to learn to swim by reading books about swimming. At some point, you need to get in the water.
What This Means Practically
Does this mean AI progress will stall? Absolutely not. We’ll see continuous improvement in what AI can do. Models will get better at coding, at analysis, at creative tasks. Context windows will expand (though probably not past a few million tokens without major breakthroughs). Specialized systems will become more capable.
What we won’t see is a sudden jump to “can do anything a human can do, but better.” That requires solving problems we barely know how to formulate, let alone solve.
The decade-long timeline from researchers like Karpathy and the potentially longer timelines from LeCun aren’t pessimistic; they’re realistic. A decade is actually incredibly fast for this kind of fundamental research breakthrough. We went from the Wright Brothers’ first flight to commercial aviation in about 50 years. We’re trying to recreate general intelligence, something nature took millions of years of evolution to produce, in a few decades. That’s astonishing.
My Take
I find the disconnect between CEO timelines and researcher timelines fascinating because it reveals something about how we think about progress. We want the future to arrive faster. There’s money to be made, problems to solve, status to gain from being first. But nature doesn’t care about our wants.
The researchers are working on genuinely hard problems. How do you build a system that can learn continuously? How do you create models that understand cause and effect, not just correlation? How do you move from pattern matching to genuine reasoning? These aren’t “just add more GPUs” problems. They’re “maybe we need completely new architectures” problems.
And here’s what I think gets lost in the AGI hype: we don’t actually need AGI for AI to be transformative. The current generation of AI tools, despite their limitations, are already changing how people work, create, and think. A system that’s incredibly good at some things and terrible at others can still be enormously valuable.
So when you hear “AGI by 2027,” be skeptical. Not because AI isn’t advancing; it is, rapidly. But because the people who actually build these systems, who understand them at the level of mathematical operations and architectural choices, are telling us it’s going to take longer. They’re not pessimists or Luddites. They’re just being honest about what they know and what they don’t.
The race to AGI isn’t a sprint. It’s not even a marathon. It’s more like a series of mountain climbs, where each peak reveals another, higher peak behind it. And that’s okay. The journey itself is producing incredible tools and insights. We don’t need to pretend the destination is closer than it is.