Where AI Will and Won’t Replace Us


This essay is a sequel to "Why the Most Valuable Things You Know Are Things You Cannot Say", which established the framework used here. The argument that follows applies that framework to a specific question: what are the structural limits of current AI capability, and where does human expertise remain irreplaceable?

In the previous essay, I argued that human knowledge falls into a hierarchy ordered by transmissibility.

The most transmissible: facts and explicit rules. “Water boils at 100°C at sea level.” Perfectly compressible into language. Perfectly transmissible through instruction.

Then: formal models and frameworks. “Net present value is the sum of discounted future cash flows.” Fully transmissible once the prerequisite concepts are in place.
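Because the model is fully explicit, it compresses without loss into a few lines of code. A minimal sketch (the project and its numbers are hypothetical):

```python
def npv(rate, cashflows):
    """Net present value: each cash flow discounted back to the present.

    cashflows[0] occurs now (t = 0), cashflows[1] after one period, and so on.
    """
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# A project costing 100 today that returns 60 in each of the next two
# years, discounted at 10% per period:
npv(0.10, [-100, 60, 60])  # ≈ 4.13
```

Nothing of the expert's understanding of net present value is lost in the transmission. That is what full transmissibility means.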

Then: heuristics derived from experience. “Be wary of founders who talk more about their competitors than their customers.” Partially transmissible. Useful as an attention-directing pointer. Misleading as a standalone rule, because the exceptions are as important as the rule and the exceptions cannot be enumerated.

The least transmissible: perceptual calibration. The ability to perceive the features that matter, to weigh them appropriately in context, to detect the subtle interactions that distinguish this situation from superficially similar ones. Acquirable only through prolonged, feedback-rich exposure to a domain. No linguistic channel can transmit it.

The key structural claim was that the most complex forms of expertise are the least transmissible. They can be learned through calibration, yet they resist every attempt at formal transmission. The only method of developing them is repeated exposure to real situations with real feedback, over real time. There is no shortcut, because the expert’s model is implemented in neural weight configurations that do not decompose into propositions, and because a large part of what the expert learns is which features of the environment matter, which is itself a high-dimensional discovery that requires direct perceptual experience.

This framework was developed to explain why expert judgement resists formalisation in organisations. It has a more interesting application: it predicts, with some precision, where AI, in its current form, will and will not be able to go.

Let’s be precise about how large language models and their multimodal successors actually learn.

An LLM is trained by prediction. Given a sequence of tokens, predict the next one. The training corpus is text: books, articles, code, conversations, documentation. Through billions of predictions, the model develops internal representations that capture the statistical structure of human language and, to a remarkable degree, the knowledge encoded in that language.
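Stripped of scale, the training objective fits in a few lines. A minimal PyTorch-style sketch, where `model` is a stand-in for any autoregressive network mapping token ids to logits over the vocabulary:

```python
import torch.nn.functional as F

def next_token_loss(model, tokens):
    """Cross-entropy on the next token: the entire training signal.

    tokens: (batch, seq_len) integer ids drawn from the training corpus.
    """
    inputs, targets = tokens[:, :-1], tokens[:, 1:]  # shift targets by one position
    logits = model(inputs)                           # (batch, seq_len - 1, vocab)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),  # flatten batch and positions
        targets.reshape(-1),                  # each target is simply the next token
    )
```

Everything the model comes to know, it acquires as a side effect of minimising this loss over billions of tokens.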

Multimodal models extend this to images, audio, and video. The mechanism is the same: prediction over sequences, calibration through exposure to vast quantities of data. The learning process is structurally identical, but multimodal training data contains two fundamentally different kinds of material.

The first kind is materialisations of human knowledge: texts, diagrams, textbooks, documentation, code. These are artefacts produced when humans attempt to encode their understanding into a transmissible form. They comprehensively encode facts, rules, and formal models. They capture heuristics partially, because heuristics are partially transmissible and therefore partially present in text. They are largely silent on perceptual calibration, because perceptual calibration was never externalised in the first place. It exists only in the neural configurations of the experts who developed it, and they did not, because they could not, encode it into any artefact.

The second kind is recordings of the world: photographs, video footage, audio recordings, sensor data. A dashcam video is a slice of reality captured by a sensor, with no human knowledge encoded in it. These recordings may contain patterns that correspond to perceptual calibration (the subtle visual features an experienced radiologist detects, the acoustic signatures an experienced mechanic recognises), but only when the relevant world fits entirely within the recording. The recording captures what was in front of the sensor. It does not capture what was outside the frame, what happened before and after, or the broader context in which the recorded moment was embedded.

This distinction matters. AI trained on the first kind of data acquires human knowledge as encoded in language, which is the transmissible residue of expertise. AI trained on the second kind of data can learn patterns within captured slices of reality, which extends further than language alone. The question is how far each extends.

The answer requires distinguishing three categories of problem. They look superficially similar but have fundamentally different structural properties, and the framework predicts very different outcomes for each.

The first category is language-representable problems: problems whose relevant features are few enough and explicit enough to be fully encoded in language. Mathematical reasoning, legal analysis of well-defined statutes, code generation from clear specifications, factual question-answering, translation, summarisation. These sit squarely at the most transmissible end of the hierarchy. The training data encodes them comprehensively, because language is a sufficient channel for this level of complexity.

AI dominance here is unsurprising and already largely achieved. The framework predicts this directly: if the knowledge is transmissible through language, and the training data is language, the model will acquire that knowledge.

The second category is high-dimensional closed-world problems. These involve many interacting variables, complex patterns, and nonlinear relationships, but a critical property holds: the relevant world, including the outcome itself, is captured by the training data. The model receives everything the expert receives. Nothing important is outside the frame.

Radiology is the clearest case. The radiologist’s expertise is genuinely high-dimensional. They are detecting subtle patterns across millions of pixels, integrating spatial relationships, texture gradients, and anatomical context that no explicit rule could enumerate. This is perceptual calibration: a complex, nonlinear mapping from high-dimensional inputs to diagnostic outputs, calibrated through years of exposure to images paired with outcomes.

A multimodal model trained on the same images paired with the same outcomes undergoes the same calibration process. The image contains everything the radiologist perceives, and the outcome (the diagnosis, confirmed by biopsy or follow-up) is unambiguous and recordable. Both sides of the mapping are fully capturable, which is why the model can match or exceed the radiologist’s performance.
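The structure of that calibration can be made concrete. A minimal sketch of one supervised training step, assuming an image `encoder`, a `classifier` head, and batches of scans paired with verified outcomes (the names here are illustrative, not a specific system):

```python
import torch.nn.functional as F

def calibration_step(encoder, classifier, optimiser, scans, outcomes):
    """One step of closed-world calibration: pixels in, verified label out.

    scans:    (batch, channels, height, width) images, e.g. radiographs
    outcomes: (batch,) diagnostic labels confirmed by biopsy or follow-up
    """
    logits = classifier(encoder(scans))       # map pixels to diagnostic scores
    loss = F.cross_entropy(logits, outcomes)  # penalise disagreement with the outcome
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
    return loss.item()
```

The point is structural: both sides of the mapping sit inside the training data, so nothing prevents gradient descent from finding whatever the radiologist's eye finds.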

Protein structure prediction, acoustic fault detection in manufacturing, satellite imagery analysis, and certain forms of materials science share this property. The problems are genuinely high-dimensional. The relevant world is capturable in the input. The training data can encode the full mapping from input to outcome.

This is a genuine and significant extension of AI capability beyond what language models alone could achieve. In closed-world domains, AI calibration and human calibration operate on the same information, and the machine has the advantage of scale, speed, and consistency. The framework predicts this too: calibration works when the calibration environment can be captured.

The third category is high-dimensional open-world problems: problems where the relevant features extend beyond any capturable input, where outcomes are delayed and ambiguous, and where the density of training examples is insufficient for generalisation. This is where AI, under its current training paradigm, will hit a structural ceiling. The ceiling has two components that reinforce each other.

The first is the input constraint. At the point of decision, the relevant state of the world cannot be encoded into any input the model could receive. Consider a senior software engineer deciding how to decompose a system into components. The relevant context includes the codebase (capturable), the team's capabilities and dynamics (partially capturable, mostly not), the likely trajectory of requirements over the next two years (uncapturable), the organisational politics surrounding the project (uncapturable), the accumulated technical debt across adjacent systems and the competence of the teams responsible for them (uncapturable), and a felt sense, developed over dozens of previous systems, of which shapes of decomposition lead to pain and which lead to flexibility (uncapturable, because it was never articulated even by the engineer who possesses it). The model could receive the codebase. It cannot receive the world the codebase exists within.

The second is the data constraint. Even if the full state could be captured at inference time, the training data that would teach the model what to do with it does not exist. Architectural decisions play out over years. Their outcomes are confounded by dozens of intervening variables (team changes, requirement shifts, market moves). Each situation is effectively unique: the specific combination of technical constraints, organisational dynamics, and strategic context has never occurred before and will never occur again. The density of examples required to learn the relevant mapping is orders of magnitude beyond what the historical record contains.

These two constraints reinforce each other. The world cannot be given to the model, so it cannot learn the mapping. The mapping cannot be learned from data, so even a model that received the world would not know what to do with it. The constraints are structural. They follow from the nature of open-world complexity and from the way the current generation of AI models is trained.

The boundary between the second and third categories runs within domains, not between them. Every domain contains closed-world components that AI will master and open-world components it will not. The remaining productive question is which components of a given profession are closed-world and which are open-world. Let’s take a few examples.

In medicine, the closed-world components are diagnostic pattern recognition from medical images, drug interaction checking, protocol application, and treatment guideline matching. These are high-dimensional but capturable. The relevant information is present in the image, the lab result, the patient record. AI will eventually handle these comprehensively, and in many cases already does.

The open-world components: the clinical gestalt that integrates information outside any recording. The experienced clinician who walks into a room and perceives that a patient is sicker than their chart suggests. The perception that the presenting complaint masks the real reason for the visit. The integration of the patient’s body language, the family dynamic, the subtle mismatch between what the patient says and how they say it. The model that connects the spouse’s posture in the corner to a prediction about treatment compliance, built through thousands of patient encounters in which relational dynamics turned out to matter in ways that were never recorded in any dataset.

In software engineering, the closed-world components are code generation from specifications, algorithm implementation, bug fixing in well-characterised systems, test generation, and documentation. The relevant information is present in the code, the specification, the error message. AI handles these capably now and will handle them comprehensively soon.

The open-world components: system decomposition. Knowing what to leave unbuilt. Debugging in complex production systems where the failure mode is emergent and intermittent. Perceiving the organisational forces acting on technical decisions: which team owns this component and what are their incentives, which stakeholder will change their mind, which dependency is politically unmovable regardless of its technical inadequacy. The experienced engineer’s model of a sociotechnical system, running on dozens of variables that exist nowhere in any codebase or document.

In sales, the closed-world components are lead scoring from CRM data, email and proposal generation, competitive positioning documents, pipeline reporting, and call transcription and summarisation. All of these are language-representable or involve pattern-matching on structured data. AI will handle them comprehensively, and much of this work is already being automated.

The open-world components: reading a meeting to identify who the real decision-maker is versus the nominal one. Sensing when a deal is stalling for political reasons the champion will never articulate. Perceiving which objections are genuine and which are performance for internal stakeholders who need to be seen exercising due diligence. Knowing when to push and when to back off. The experienced rep who feels a deal dying before any formal signal, because response times have shifted, the tone of emails has changed, and someone who was engaged in every meeting has stopped paying attention. The relevant world is the buyer's organisation: their internal politics, budget dynamics, competitive pressures, the personal career incentives of each person in the room. Almost none of this is capturable in a CRM or a call transcript.

In product management, the closed-world components are competitive feature analysis, user research synthesis from structured data, specification writing, and roadmap formatting.

The open-world components: deciding whether to build a feature, which requires integrating what customers say they want (available but unreliable), what they actually need (discoverable only through calibrated interpretation of behaviour, support tickets, and churn patterns), how long it will actually take (dependent on the codebase, the team, and the technical debt, most of which the PM perceives through relationship rather than documentation), and whether this feature will create complexity that constrains future options (pure architectural judgement). Each of these inputs is partially available, unreliably measured, and situation-specific. The integration across all of them is the product manager’s core skill, and it is perceptual calibration operating on a world that cannot be externalised.

The conventional fear about AI is that it progressively replaces human capability from bottom to top. Junior roles first, then mid-level, then senior, then expert. This framing assumes a single continuum of capability along which AI advances steadily upward.

The framework introduced above predicts something structurally different. AI will advance across the first two categories in parallel, with progress that is already clearly observable.

In language-representable domains, AI is dominant and improving rapidly. Text generation, legal research, and translation are largely solved. Code generation is advancing with each model release, handling increasingly complex specifications. These domains progress quickly because the training data already existed in vast quantities: every book, article, and codebase ever written was available as training material.

In high-dimensional closed-world domains, progress is more uneven but equally real. Protein structure prediction has achieved superhuman performance. Medical imaging is advancing steadily. Other domains will follow as high-quality training data is assembled in each one, a process that requires curating labelled datasets, pairing inputs with verified outcomes, and accumulating enough examples for the model to learn the relevant patterns. This is slow, expensive, and proceeds domain by domain, but the trajectory is clear. In every domain where the relevant world fits inside the input and sufficient training data can be assembled, AI will eventually match or exceed human performance.

Then AI progress will reach a structural ceiling. AI, trained the way it currently is, will be unable to tackle high-dimensional open-world problems. This follows from the relationship between open-world complexity and the nature of the training data, and it will not yield to more compute, more data of the same kind, or more sophisticated architectures.

The most impressive human expertise sits above this ceiling: judgement that integrates information exceeding any capturable input, that operates on features discovered through immersion, and that produces reliable outputs in situations too unique for statistical generalisation. It sits there for the same reason it resists organisational codification: it can only be learned through calibration in the specific environment where it will be applied, and that environment, in its full dimensionality, cannot be captured, transmitted, or reproduced. The only way to perceive the information that matters is to actually “live” in that environment.

I should be clear: I am not claiming that AI will always face this ceiling. I am claiming that any AI model trained with the current learning paradigm will. A fundamentally different approach, such as embodied agents that learn through direct interaction with open-world environments over extended time horizons, could in principle develop the same kind of perceptual calibration that human experts acquire. Such a system would be learning from reality rather than from recordings of reality, much as humans do. But this kind of system requires further architectural and technological breakthroughs, and it remains unclear how far away those breakthroughs are.

This is, on balance, a reassuring conclusion. The domains where humans retain structural advantage are precisely the areas the previous essay identified as the most valuable: the judgement-intensive, high-dimensional, open-world problems where the relevant features are emergent, relational, and discoverable only through experience. AI will not replace the senior engineer who perceives organisational forces acting on technical decisions. It will not replace the clinician whose gestalt integrates information outside any recording. It will not replace the product manager whose calibrated scepticism about what customers say they want is the difference between building something valuable and building something that tests well in a survey and dies on contact with reality.

What AI will do is handle everything else. It will handle it faster, cheaper, and more consistently than humans ever could. This changes the economics of expertise dramatically: the closed-world components of every domain become commoditised, and the open-world components become the entirety of what human professionals are paid for. Human judgement will become more valuable, precisely because AI has stripped away everything around it.

The most valuable things you know remain the things you cannot say. AI cannot learn those things either, for the same reason that you cannot teach them.
