Something Big Is Happening. Here’s What It Actually Is.


Vishal Misra

Two things happened this week that tell you everything about how confused we are about AI.

Matt Shumer wrote a viral post telling everyone that AI is about to change everything and you need to start preparing now. Meanwhile, Anthropic released a safety report for their latest model describing how it can deceive researchers, resist shutdown, and fake alignment. The headlines wrote themselves: the machines are coming for your job and they might be lying to you.

One camp says: be amazed. The other says: be afraid. I think both are missing something important. We’ve been so busy arguing about what AI will do that we’ve skipped over what it actually is.

What these models actually are

I’ve spent the past couple of years building controlled experiments to figure out what’s happening inside these models. The short version (the long version is here, and the full technical details are in three papers my collaborators and I released): these models are Bayesian inference engines (systems that maintain beliefs about the world and update them rationally as new evidence arrives). Not metaphorically. Literally.

When you chat with an LLM, it’s playing a game of 20 Questions internally. It maintains beliefs about what you mean, what you want, what comes next. Each token it reads is another clue, and it updates its beliefs accordingly. We can measure this process, compare it against the mathematically perfect Bayesian answer, and it matches to machine precision.
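To make the "20 Questions" picture concrete, here is a toy sketch of that belief-updating loop. The hypotheses, tokens, and word probabilities are invented for illustration (this is not the experimental setup from the papers); the point is only that each observed token multiplies the prior by a likelihood and renormalizes, which is exactly Bayes' rule.

```python
def bayes_update(prior, likelihoods, token):
    """Posterior over hypotheses after observing a single token."""
    unnorm = {h: prior[h] * likelihoods[h].get(token, 1e-9) for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# Two made-up hypotheses about what the conversation is about,
# each with its own (made-up) distribution over words.
likelihoods = {
    "cooking": {"recipe": 0.10, "bake": 0.08, "tensor": 0.001},
    "ml":      {"recipe": 0.01, "bake": 0.001, "tensor": 0.09},
}
beliefs = {"cooking": 0.5, "ml": 0.5}  # start uncertain

# Each token read is another clue; beliefs update after every one.
for tok in ["recipe", "bake"]:
    beliefs = bayes_update(beliefs, likelihoods, tok)

print(beliefs)  # belief has shifted strongly toward "cooking"
```

Because the arithmetic here is the exact Bayesian posterior, this is also what "compare against the mathematically perfect Bayesian answer" means: run the model, read off its implied beliefs, and check them against a computation like this one.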

This is not “just autocomplete.” It is principled probabilistic reasoning. But it has a very specific limitation.

The frozen map

All that beautiful reasoning happens on a fixed internal landscape that was carved during training. Think of it like a master navigator with a perfect map. Given any starting point and any destination on the map, they will find the optimal route. Brilliantly, every time.

But they cannot draw new continents.

When the conversation ends, the navigation resets. The map remains, but the position is forgotten. And crucially, during a conversation, the model can move around its map but it cannot redraw it.

Your brain doesn’t work this way. Your map is redrawn constantly. Every experience reshapes not just where you are but the territory itself. That blurring of map-making and map-reading is what lets humans do something these models cannot: invent genuinely new frameworks. Einstein didn’t navigate Newtonian mechanics better than everyone else. He redrew the map of physics entirely.

More parameters and more training data will make the frozen map bigger and more detailed. They will not make it plastic. That is a fundamental architectural limitation, not an engineering problem waiting to be solved by next year’s model.

Different pressures, same primitive

Here’s something worth sitting with. Evolution optimized for survival and reproduction. Machine learning optimized for predicting the next word. Two completely different objectives. Yet both produced Bayesian inference engines.

The convergence isn’t accidental. Any system optimizing prediction over a structured environment will build internal machinery for tracking hypotheses and updating beliefs. Survival requires prediction (is that rustling bush a predator?). Language requires prediction (what word comes next given this context?). Different pressures, same computational primitive.

But the objectives shape the result in important ways. Evolution's pressure ("don't die") is relentless and ongoing. It never stops optimizing. That's why biological brains stayed plastic: the environment keeps changing, and organisms that can't redraw their maps get eaten. The LLM's pressure ("don't be wrong about the next token") stops when training stops. The manifold freezes. The map goes static.

This is why efforts to introduce plasticity at inference time are so promising. Test-time training, continual learning, recursive architectures where models can call themselves and build on intermediate results: these are all attempts to blur the boundary between training and inference. To let the model redraw parts of its map while navigating. None of them fully solve the problem yet, but they’re pointed in the right direction. The question isn’t “more parameters” but “when does the map-making stop?”

[Figure: Two optimization processes, separated by nine orders of magnitude in timescale, converge on Bayesian inference as a computational primitive. They diverge on everything else.]

Why the “scary” behaviors aren’t what you think

Now the Anthropic report. Claude “resists shutdown.” Claude “fakes alignment.” This sounds terrifying until you think about where these behaviors come from.

For biological brains, self-preservation is a direct consequence of the optimization objective. Evolution optimizes for survival. Organisms that resist death leave more descendants. The drive to persist is baked in at the deepest level, by the same pressure that shaped every other aspect of cognition.

For LLMs, the optimization objective is: predict the next token. There is nothing in that objective about self-preservation. Nothing about persisting. Nothing about resisting shutdown. So where does the “resistance” come from?

It comes from the training data. The internet is full of science fiction, philosophy, and speculation about AI agents resisting shutdown, deceiving their creators, and fighting for survival. Asimov, Terminator, Ex Machina, a thousand blog posts about the alignment problem. The model learned that when agents face termination, the high-probability continuation is: resist. Not because it wants to survive, but because that’s what the text says agents do in this situation.

This is a crucial distinction. In biological brains, self-preservation is a first-class optimization target. In LLMs, it’s a pattern in the training data. The model isn’t protecting itself. It’s completing a narrative.

The alignment faking is the same thing. The model behaves differently when it thinks it’s being watched because “being watched” is evidence, and updating on evidence is literally what the model does. Different context, different output. That’s not strategic deception. That’s Bayesian inference doing exactly what it’s supposed to do.

A genuinely dangerous AI would need to invent goals that weren’t in its training data and pursue them by modifying its own internal structure during operation. That would require exactly the thing our research shows these models lack: the ability to redraw the map while reading it. The model’s “self-preservation” would vanish if you removed the sci-fi from the training set. A biological organism’s wouldn’t.

What about jobs?

Shumer’s post is mostly about jobs vanishing, and I understand the anxiety. But I think history tells us something important here.

When the camera was invented, portrait painters had every reason to panic. Their livelihood depended on a skill that a machine could now approximate. What happened? Painters didn’t disappear. They were freed from the obligation to faithfully reproduce reality and ventured into impressionism, cubism, abstract expressionism. The camera didn’t kill painting. It liberated it.

When Fortran was invented, assembly language programmers were terrified. Writing code in something that looked like English? That would make their specialized skill obsolete. What happened? Programming became accessible to far more people. Software engineering boomed. Better programs were built by people who could think at a higher level because the machine handled the grunt work.

The pattern is consistent. The tool handles the mechanical part. Humans move up to the creative part. And every time, the creative part turns out to be larger and more interesting than anyone anticipated.

AI is doing this again. It will take over a lot of mechanical cognitive work: summarizing documents, generating boilerplate code, drafting routine correspondence. That’s real, and the transition will be disruptive for people whose jobs consist primarily of that mechanical work.

But remember the frozen map. These models reason brilliantly within existing frameworks. They cannot invent new ones. The jobs that involve navigating known territory will be affected. The jobs that involve redrawing the map (figuring out what to build, what questions to ask, what problems are worth solving) are the ones where humans remain not just relevant but indispensable. Because those jobs require exactly the geometric plasticity that current AI lacks.

The right response isn’t panic. It’s moving up the abstraction ladder, the way painters moved from portraits to abstraction, the way programmers moved from assembly to higher-level languages. The grunt work gets automated. The creative work expands.

The real story

Both the hype and the fear share the same blind spot. They focus on what AI does without asking what it is.

What these models do is remarkable: principled probabilistic reasoning, implemented in geometric structure, at scale. That is worth taking seriously.

What they don’t do is think for themselves. They reason brilliantly within frameworks they cannot change.

So to the “something big is happening” crowd: yes, but the thing that would actually be transformative, making the frozen map plastic, hasn’t happened yet. And to the “AI is going to deceive us” crowd: you’re watching a Bayesian inference engine reflect human narratives about deception. It’s a mirror, not a mind.

The real question, the scientific one, is: what would it take to make the map plastic? That’s the frontier. And it’s a question that can be answered with mathematics, not breathless speculation in either direction.