Why the AI Renaissance Keeps Not Arriving (and why it won't under the current post-training regime)


Everyone remembers the first time AI really works. You ask the model for an essay, a strategy, a name for the thing you’re building, and it comes back in seconds, polished and coherent and better than what most people you know could write in hours. It feels like the start of a renaissance that took only a small kick to get going.

Then you use it every day for a year, and you slowly realize that the model has only a few ways of doing everything. Every brainstorm is the same brainstorm-shaped object and every essay has the same skeleton. Put simply, the outputs are locally excellent and globally identical.

I call this manifold collapse. The model never explores the whole landscape of possible ideas. It circles a small, well-worn region of it, and inside that region it travels what I think of as latent grooves, a few deep and reliable paths it falls into no matter how you phrase the request. While the words change every time, the “groove” doesn’t, and this pattern is well replicated across the field. People working with AI produce better individual work while the pool of everyone’s work grows measurably more alike. Everyone gets a better paragraph but the world gets more paragraphs that feel (and functionally are!) the same.

What’s scarier is what happens when you scale it up and look at the societal effects of such monotony. Every knowledge profession is adopting these tools, because for each individual they genuinely help. Consultants, lawyers, marketers, founders, and researchers all draft through the same handful of models, which means they all draw on the same ten thousand or so moves, because that is what the shared region contains. The floor rises but the tail (where some of the truly helpful work comes from) completely vanishes. Breakthrough legal theories, category-defining companies, and new art movements were never medians done slightly better. What you’re left with is a society that keeps producing its best average work ever while the frontier stalls.

What’s worse is that mistakes synchronize and compound. When everyone reasons through the same system, everyone misses the same argument, crowds the same trade, and chases the same direction at the same time, like a monocrop farm sitting one blight from disaster. In a lawsuit it turns almost comic. If both sides draft with the same model, your opponent’s AI anticipates your argument because it would have written your argument. The loop even closes across generations. People learn to write and think from model outputs, models train on text shaped by models, and each pass compresses the culture a little further.

This is why the AI renaissance keeps not arriving. A true renaissance was never about producing more high-quality artifacts. It is when the frontier itself expands, when new mediums and new scenes and weird people push against consensus until the consensus moves. A system trained to satisfy consensus hands you the median of everything humanity has tried, instantly. That feels like genius exactly once. Smarter models actually make this worse. When a billion people draw from the same grooves, we end up in a weird monoculture.

A 2025 meta-analysis of 28 studies and 8,214 participants quantified the trade. Working with AI improves a person’s creative performance relative to working alone (Hedges’ g = 0.27) and imposes a large negative effect on idea diversity across participants (g = -0.86). Doshi and Hauser established causality in Science Advances. Writers given AI story ideas produced stories rated more creative and more enjoyable, with the effect concentrated among less creative writers, yet the AI-assisted stories were significantly more similar to one another. The authors frame it as a social dilemma, in which adoption is individually rational and collectively narrowing. Anderson, Shah, and Kreminski found the same pattern in ideation. ChatGPT users generated more ideas in more detail, but the ideas were less semantically distinct across users.

Wenger and Kenett closed the obvious escape hatch, the hope that the sameness belongs to one particular model, by testing a broad set of LLMs against humans on standardized divergent-thinking tasks (Alternative Uses, Divergent Association, Forward Flow). Their 2026 PNAS Nexus paper carries the result in its title, “Large language models are homogeneously creative.” Model responses resemble other models’ responses far more than human responses resemble other humans’ responses, even after controlling for response structure. Homogenization belongs to the model class rather than to any one model.

Stated operationally, manifold collapse is a claim about output distributions in embedding space. Encode a corpus of human outputs and a corpus of model outputs as sentence embeddings, then compare the two point clouds. Three signatures appear. The first is variance. The human cloud has greater total spread, meaning larger pairwise semantic distances. The second is effective dimensionality. Under PCA, the explained-variance spectrum of the human cloud decays slowly, while the model spectrum saturates after a few components. The surface variety in wording masks a small number of underlying degrees of freedom, so the model manifold is smaller in volume and lower in intrinsic dimension at the same time. The third is cross-model convergence. The clouds of independent models (GPT, Gemini, and Llama in the Wenger and Kenett data) overlap heavily, which fits the growing evidence of feature universality, the finding that distinct LLMs learn highly aligned internal representations. The compression shows up inside the network as well. Contextual embeddings have been known since 2019 to be anisotropic, occupying a narrow cone of the available representation space.
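The first two signatures can be sketched in a few lines. This is a minimal illustration, not the method of any cited study. The two clouds are synthetic stand-ins that I generate with random numbers (a real measurement would use actual sentence embeddings), and the participation ratio is one common proxy for effective dimensionality that I am assuming here.

```python
import numpy as np

def mean_pairwise_cosine_distance(X):
    """Average cosine distance over all distinct pairs of rows."""
    U = X / np.linalg.norm(X, axis=1, keepdims=True)
    sims = U @ U.T
    upper = np.triu_indices(len(X), k=1)  # each pair once, no self-pairs
    return float(np.mean(1.0 - sims[upper]))

def effective_dimension(X):
    """Participation ratio of the PCA spectrum: (sum of eigenvalues)^2 / sum of squares."""
    Xc = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional to PCA eigenvalues.
    eigvals = np.linalg.svd(Xc, compute_uv=False) ** 2
    return float(eigvals.sum() ** 2 / (eigvals ** 2).sum())

rng = np.random.default_rng(0)
n, d = 300, 64

# ASSUMPTION: synthetic "human" cloud with variance spread over all 64 axes.
human_cloud = rng.normal(size=(n, d))

# ASSUMPTION: synthetic "model" cloud that sits near one center and varies
# along only 3 latent directions, plus a little noise.
center = rng.normal(size=d)
basis = rng.normal(size=(3, d))
model_cloud = (center
               + 0.2 * rng.normal(size=(n, 3)) @ basis
               + 0.01 * rng.normal(size=(n, d)))

human_spread = mean_pairwise_cosine_distance(human_cloud)
model_spread = mean_pairwise_cosine_distance(model_cloud)
human_dim = effective_dimension(human_cloud)
model_dim = effective_dimension(model_cloud)

print(f"spread  human={human_spread:.3f}  model={model_spread:.3f}")
print(f"eff dim human={human_dim:.1f}  model={model_dim:.1f}")
```

On these toy clouds the second one comes out both tighter and lower-dimensional by construction. The point is only to show what the two measurements are, so that the same functions could be applied to real embeddings.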

The collapse is also not a sampling artifact. Wenger and Kenett raised temperature to force diversity, and variability did rise, but the responses degraded into incoherence before they became interestingly different. Creativity requires novelty and appropriateness at the same time. Random noise exits the manifold instead of traversing unexplored regions of valid solution space.
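A toy calculation shows why temperature is a blunt instrument. The logits below are hypothetical numbers I made up (three plausible continuations and a long tail of junk tokens), and this does not reproduce Wenger and Kenett’s setup. It only shows the mechanism: raising temperature buys entropy by moving probability mass off the coherent options and onto the tail.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def entropy_bits(p):
    """Shannon entropy of a probability vector, in bits."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# ASSUMPTION: hypothetical next-token logits, 3 coherent options + 97 junk tokens.
logits = np.array([5.0, 4.5, 4.0] + [0.0] * 97)

temperatures = [0.7, 1.0, 1.5, 2.5]
entropies, top3_mass = [], []
for t in temperatures:
    p = softmax_with_temperature(logits, t)
    entropies.append(entropy_bits(p))
    top3_mass.append(float(p[:3].sum()))
    print(f"T={t}: entropy={entropies[-1]:.2f} bits, mass on top 3={top3_mass[-1]:.2f}")
```

Nothing in the sampling step knows which low-probability tokens are interesting and which are nonsense, so the extra variability is spent indiscriminately.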

Pretraining fits the distribution of human text. Post-training, meaning RLHF and its successors, then optimizes toward the high-approval center of that distribution. Kirk et al. measured the cost. RLHF improves out-of-distribution generalization relative to supervised fine-tuning while significantly reducing output diversity. The aligned model converges on a small set of high-reward behavioral programs, which are the latent grooves.

The term sits in a family of established failure modes. Mode collapse describes a generative model that covers too few modes of its target distribution. Model collapse, demonstrated by Shumailov et al. in Nature, occurs when recursive training on synthetic data erases low-probability events, and the tails of the distribution go first. Representation collapse describes internal features compressing into low-rank subspaces. Manifold collapse is the population-level synthesis. A training loop, in which synthetic data shaves the tails off future models, runs alongside a usage loop, in which millions of people drafting through the same few models converge on the same moves, and culture gets compressed from both ends. A recent Trends in Cognitive Sciences review describes the endpoint as a threat to the cognitive variety that collective intelligence depends on.

Kleinberg and Raghavan proved in a 2021 PNAS paper that when many decision-makers converge on one algorithm, aggregate decision quality can fall even when that algorithm is more accurate than each agent’s independent alternative. The result resembles a Braess paradox and requires no shocks, only normal operation. Substitute a drafting model for their screening algorithm and the implication for knowledge work follows directly. Firms, funds, and labs develop correlated blind spots.
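The correlated-blind-spot part of this is simple arithmetic. The sketch below is not Kleinberg and Raghavan’s model, which concerns firms ranking candidates. It is a deliberately crude illustration with error rates I picked for the example, and it assumes that independent agents err independently while agents sharing one model err together.

```python
def prob_all_miss(error_rate, n_agents, shared):
    """Probability that every agent gets the same case wrong.

    shared=True:  all agents use one model, so they fail together.
    shared=False: agents err independently of one another.
    """
    return error_rate if shared else error_rate ** n_agents

n_agents = 5

# ASSUMPTION: illustrative error rates. The shared model is twice as accurate
# as any single independent agent.
independent = prob_all_miss(0.20, n_agents, shared=False)
monoculture = prob_all_miss(0.10, n_agents, shared=True)

print(f"independent agents all miss: {independent:.5f}")
print(f"shared-model agents all miss: {monoculture:.5f}")
```

Under these assumptions each agent in the monoculture is individually more accurate, yet the chance that nobody catches a given mistake is far higher than with independent agents.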

The tradeoff appears to be a property of uniform deployment rather than of the technology itself. Ashkinaze et al. found that high exposure to AI ideas increased collective idea diversity, making ideas different though not better. Persona variation, multi-model ensembles, adversarial prompting, and atypical source material all measurably resist homogenization. Engines that widen the distribution can be built. The default remains the groove.
