AI Has America's Oldest Monopoly Problem


This is Part 1 of a four-part series on AI infrastructure and American competitiveness. This first part sets up the problem as I see it. Parts 2+ will focus on possible policy approaches.

Given how fiercely the major US frontier model providers are fighting for market share, the market sure looks extremely competitive. But if you follow the stack down, the picture changes. Sure, any startup can rent GPUs from any of the large cloud providers (hyperscalers) to fine-tune models and even run inference using open models directly. Yet if said startup tried to compete with any of the established providers by training its own models or simply scaling inference to meet user demand, it would quickly run out of available GPUs. It would have entered a market where the same companies selling the infrastructure are also building the products that the startup would compete against.

Another term for this approach is vertical integration, which, by itself, is an operational achievement many corporations strive for. However, in this instance, it’s vertical integration where competition should be encouraged.

You can compete in AI up to a point. Past that point, you need the permission of the companies you’re competing against. That’s not a market. That’s a tollbooth.

This tollbooth is present for both training and inference. Even if Llama or Qwen reach frontier parity tomorrow, we still need massive amounts of compute to run inference at scale. Creating the scale required is still a capital-intensive infrastructure problem, and the companies best suited to deliver it are the same hyperscalers that dominate training and provide their own models.

I will largely stay away from the “open weights” argument in this series, as it deserves its very own conversation. Needless to say, open weights in this case would solve the model access problem, not the infrastructure access problem. The engine is open, the road is still private.

If frontier AI delivers even a fraction of the productivity gains being claimed (10x output, automated research, AI-assisted everything), then who controls that tollbooth is not a niche policy question. It’s the central question for American competitiveness. I argue that the government should guarantee fair access to the infrastructure on which AI runs—not run AI, not regulate models, not pick winners—just ensure that the infrastructure layer is open enough for competition to function. It should also ensure that everyone has reasonable access to AI.

The immediate question is whether the tollbooth is a temporary artifact of a fast-moving market, or a structural barrier that will calcify the way railroad monopolies, telephone monopolies, and broadband duopolies calcified before it. I think it’s the latter. Here’s why.

I’m not American by birth, so the political spectrum is still somewhat confusing, but I’d say ideologically, I skew libertarian on many things. I don’t want the government running AI companies, picking architectures, or reaching for the “regulate it” lever because it’s easier than understanding the technology.

But I do believe in markets: actual markets, not an oligopoly masquerading as free enterprise. The response to infrastructure capture can’t be “do nothing.” It’s: build the infrastructure so the market can function, block monopolistic capture, and get out of the way.

Think of it like refereeing a hockey game. Refs let players fight; that’s the sport. But when a player hits the ice, the ref steps in. You don’t keep hitting someone in a vulnerable position. The ref doesn’t make the game gentle or pick who wins. The ref makes sure competition stays fair enough that skill determines the outcome. That also allows younger players to enter the game.

That’s all I’m asking for here. When the infrastructure itself becomes the weapon (when competing requires the permission of the company you’re competing against), the ref needs to step in. If there’s a way to achieve that with less government involvement, I’m for it. But “less government” doesn’t mean “no government” when the alternative is private monopoly/oligopoly control of essential infrastructure. That’s not a free market; that’s feudalism with better marketing.

The AI compute market has a structural conflict of interest that goes beyond normal market forces. It’s not just that a few companies are big. It’s that the same companies operate at every layer of the stack simultaneously or are financially entangled enough that, strategically, they may as well be operating from the same P&L.

Raw material (GPUs): Frontier labs and hyperscalers reserve or purchase massive GPU allocations before broad availability, controlling the supply curve for everyone else. A startup trying to acquire GPUs is competing for whatever allocation is left. That is understandable, as these commitments may be exactly what the supply chain (NVIDIA, foundries, board partners, and data-center operators) needs to justify financing capacity expansion. But the effect on competitors is the same: scarcity at the entry layer, amplified by the incumbents’ purchasing power.

Processing (cloud compute): AWS, Azure, and GCP dominate the cloud infrastructure market, control much of the high-end compute capacity startups depend on, and are vertically integrated with frontier labs (Amazon/Anthropic, Microsoft/OpenAI, Google/DeepMind). Alternatives such as CoreWeave (from which OpenAI recently purchased $23bn worth of compute), Lambda, and Crusoe are growing, but they are still a small fraction of total capacity, and many depend on the same hyperscalers for networking and storage. Multi-cloud is real, but it’s not the competitive check it appears to be when the clouds are vertically integrated into the products running on them.

Product (frontier models): Microsoft hosts OpenAI’s training on Azure. Google trains Gemini on its own TPU infrastructure. Amazon is building custom Trainium chips for Anthropic. As mentioned, the model provider and the infrastructure provider are increasingly the same entity or financially entangled to the point of strategic cooperation, at the very minimum. That doesn’t even touch on NVIDIA’s own investments in frontier AI companies, which further blur the line between supplier, financier, and beneficiary.

Distribution (API access): Model access is sold through APIs with tiered pricing, rate limits, and enterprise agreements that are not publicly standardized. Large customers get preferred access, custom fine-tuning, and dedicated capacity. Volume discounts exist everywhere and aren’t inherently unfair. But combined with control at every other layer, the price and availability of AI capability are set by companies with a direct financial interest in who succeeds.

The inference multiplier: This matters for inference as much as training. As mentioned, a world of capable open-weight models does not eliminate the need for scarce compute; it increases demand for it. If every application company can run a frontier-class model, then every application company needs access to the infrastructure required to serve that model reliably, cheaply, and at low latency. Open weights won’t reduce the infrastructure bottleneck. Quite the opposite, they can intensify it.

Now, as more astute students of history have pointed out to me several times, this is not a Standard Oil-style cartel with 90% market share and explicit collusion. The hyperscalers compete fiercely with each other. GPU supply is expanding, albeit slowly. Open-weight models are improving fast and provide real alternatives. The market is not static.

But the structural problem is not about market share percentages or whether you can technically rent a GPU somewhere; it’s about vertical integration across the entire stack, combined with control of the essential input at the base layer. For many startups that need infrastructure-scale compute to compete at the frontier, the answer to “can you get it on terms that aren’t set by a direct competitor?” is functionally no, not because the market is frozen, but because the infrastructure layer and the product layer are essentially owned by the same people. You can fine-tune most frontier models through their APIs, but you’re limited to adjusting what goes in and what comes out. The internal layers of the model (where deeper representations are formed) aren’t accessible for training at the API level, which adds another layer of control. You can really only compete so much, even with a fine-tuned model.

That being said, the opening Rockefeller analogy is illustrative, not exact. However, consider that GPU providers and model providers aren’t independent market actors; they’re tightly coupled, often through exclusive supply agreements, co-designed hardware, and shared infrastructure. Taken together, a handful of GPU-model-cloud partnerships control the raw compute, the training pipeline, and the distribution layer. It’s not a direct comparison to Standard Oil, but it more than rhymes. It’s ultimately the same structural play.

By this point, I assume many readers have started to form objections.

One counterpoint is that trillion-dollar infrastructure commitments from model providers give compute suppliers the demand certainty they need to invest in expanded production capacity. Concentration, in this case, funds the buildout.

Vertical integration reduces coordination costs. Training a frontier model requires significant coordination across hardware, networking, software, and operations. Tight coupling between cloud and model developer may produce better models faster than a fragmented market would.

Preferred enterprise access may subsidize R&D. Premium rates for enterprise customers fund frontier research. Tiered pricing isn’t rent-seeking—it may be subsidization that benefits everyone in the long term.

Open-weight models are a partial competitive check. Llama, Mistral, Qwen, and others provide meaningful capability outside the proprietary ecosystem. But, again, open weights shift the dependency rather than eliminating it: from “pay for API access to someone else’s model” to “pay for infrastructure to run your own.” The model tollbooth may narrow. The infrastructure tollbooth remains, and the same hyperscalers are waiting on the other side.

These arguments are real, and I’m still wrestling with where I think things will end up. The question is whether they describe current dynamics in a fast-moving market or structural guarantees. GPU supply could expand, or export controls, power constraints, and fab concentration could keep it tight. Open models could reach frontier parity, or the compute requirements for the next capability jump could widen the gap. Vertical integration could remain efficient, or it could harden into the kind of infrastructure lock-in that has characterized many previous essential technologies.

The historical pattern suggests that when essential infrastructure becomes vertically integrated and concentrated, the benefits of efficiency are eventually outweighed by the costs of stymied competition. That’s not certain. But it’s happened enough times that the pattern deserves serious attention, not cheering and handwaving from the sidelines.

The US has seen this dynamic before, not identically, but structurally. New technology creates essential infrastructure. First movers capture it. Concentration forecloses competition. Eventually, structural reform unlocks the next wave of innovation. Railroads, electricity, telephone, broadband—the specifics differ, but the arc recurs.

A few data points: The Sherman Antitrust Act (1890) passed the House 242-0 and the Senate 51-1; both parties agreed that unchecked monopoly was incompatible with free enterprise. When the Supreme Court broke Standard Oil into 34 companies in 1911, the resulting entities collectively became more valuable than the original monopoly. GPS generated $1.4 trillion in private-sector economic value between 1984 and 2017 after the government built the infrastructure and opened it for free.

To be fair about Standard Oil: Rockefeller achieved real economies of scale, and kerosene prices actually fell under his control. The more precise claim is that monopoly redirected innovation toward extraction efficiency and competitive suppression rather than product diversity and market growth. The breakup unlocked the latter.

It’s possible that open-weight models, expanding GPU supply, and new cloud entrants will dissolve the tollbooth without structural intervention. But “this time is different” is an expensive bet. Many previous infrastructure monopolies looked temporary and competitive from the inside.

The question is whether we want to bet the competitiveness of the American AI ecosystem on the hope that market dynamics will resolve a structural problem that market dynamics have rarely resolved on their own.

If the problem is vertical integration across the AI compute stack, the solution needn’t be breaking up companies or banning GPU purchases. It’s separating the infrastructure layer from the competitive layers above it and below it so that control of one doesn’t confer control of the others.

There’s a working model for this. Switzerland’s telecom market operates on “last-mile unbundling.” Swisscom owns the physical network—copper, fiber, ducts. Swiss regulators (ComCom) mandate that any competitor can lease access at regulated, published rates. The physical infrastructure is a shared utility. Dozens of ISPs compete on the service layer above it. [cite: ComCom annual reports, OECD broadband data]

Three properties make the Swiss model relevant:

  1. The infrastructure owner cannot discriminate among service-layer competitors. Same terms for similarly situated users.

  2. Competition happens at the right layers: the service layer, where innovation creates value, not the infrastructure layer, where duplication is wasteful.

  3. The regulator’s job is narrow: simply enforce access terms and pricing.

The analogy is not perfect, and I want to be upfront about where it falls short. GPU clusters are not copper loops, and compute is not a dumb pipe. Telecom’s last-mile is naturally monopolistic because duplicating wires to every home is physically wasteful. Data centers are expensive, but they aren’t geographically fixed in the same way. Compute workloads differ wildly by latency, interconnect bandwidth, memory, security, and operational support—“non-discriminatory access” is a harder design problem when one customer needs 10,000 co-located GPUs for three months, and another needs 50 GPUs for a weekend. And Switzerland is tiny, wealthy, and institutionally competent in ways the US is not.

But the core competition problem is similar enough to be useful: when an expensive, capacity-constrained infrastructure layer becomes the prerequisite for competing in the service layer above it, the owner of that infrastructure can shape the market without ever explicitly banning competition. Data-center-scale compute has last-mile-like characteristics, and the unbundling principle addresses the structural dynamic, even if the implementation details differ significantly.
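To make the scale gap between those two example workloads concrete, here is a minimal sketch of what "same published terms" might look like as arithmetic. Everything in it is an assumption for illustration: the rate of $2.00 per GPU-hour is hypothetical (no price is proposed here), "three months" is taken as 90 days, a "weekend" as 48 hours, and the `Request` and `quote` names are invented for the example.

```python
from dataclasses import dataclass

# Hypothetical published rate in USD; illustrative only, not a proposed price.
PUBLISHED_RATE_PER_GPU_HOUR = 2.00


@dataclass(frozen=True)
class Request:
    customer: str
    gpus: int
    hours: int

    @property
    def gpu_hours(self) -> int:
        return self.gpus * self.hours


def quote(request: Request) -> float:
    """Price a request from the published rate alone.

    The customer's identity is never consulted, which is the
    non-discrimination property in its simplest possible form.
    """
    return request.gpu_hours * PUBLISHED_RATE_PER_GPU_HOUR


# Assumption: three months is roughly 90 days; a weekend is 48 hours.
big = Request("large-lab", gpus=10_000, hours=90 * 24)
small = Request("startup", gpus=50, hours=48)

for r in (big, small):
    print(f"{r.customer}: {r.gpu_hours:,} GPU-hours -> ${quote(r):,.2f}")

print(f"scale ratio: {big.gpu_hours // small.gpu_hours:,}x")
```

Under these assumptions the larger request is 9,000 times the size of the smaller one, even though both pay the same per-unit rate. Equal pricing is the easy part; deciding who gets the co-located capacity when both requests cannot be served is the harder design problem, and this sketch does not attempt it.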

This is the framework I’ll develop in Parts 2-4. Separate the layers, ensure fair access to the physical infrastructure, and let everything above it—hardware providers, model developers, application builders—compete on merit. The hard questions (allocation when demand exceeds supply, non-discrimination for heterogeneous workloads, governance at (an American) scale) are real, and I don’t hand-wave them. They’re design problems, not disqualifications.

It’s not “government should run AI.” It’s “government should ensure fair access to the infrastructure that AI runs on.” Those are fundamentally different propositions, and the difference is the entire argument.

  • Is the vertical integration parallel overstated? Where specifically does it break?

  • The real bottleneck may not be compute & energy—it may be talent, capital, or distribution. Does the infrastructure argument miss the point, or does infrastructure access compound every other bottleneck?

  • Alan Greenspan argued the Sherman Act stifled innovation: “No one will ever know what new products, processes, machines, and cost-saving mergers failed to come into existence, killed by the Sherman Act before they were born.” Serious objection, or survivorship bias in reverse?

  • What am I getting wrong?

Next week: “Don’t Nationalize the GPUs. Nationalize the Building.” A layered architecture for AI compute infrastructure, built on the Swiss telecom model.

This article was written with Grammarly and may contain AI’isms. Also, I have been using em dashes since before they became the most hated punctuation.
