Chat is Going to Eat the World

In August 2011, Marc Andreessen published an essay in the Wall Street Journal declaring that software was eating the world, a claim that seemed like the kind of thing venture capitalists say to justify their portfolio valuations, except it turned out to be an understatement. Amazon proceeded to devour retail, Netflix swallowed the video rental industry whole, and Spotify slowly digested the music business while everyone argued about per-stream royalties. Global enterprise software spending has grown from $269 billion then to nearly $600 billion now, and most of the companies Andreessen cited as examples have become the dominant forces in their industries.

I think we’re at the start of another shift of similar magnitude, and I’ll put it in terms he’d appreciate: chat is going to eat the world.

There’s a pattern in computing that becomes obvious once you notice it, which is that each major paradigm shift has lowered the barrier to interaction while expanding the universe of people who can use computers productively.

Desktop computing required you to learn the machine’s language, to understand files and folders and applications, to develop a mental model of what was happening inside the beige box. The web simplified things in that you just needed a browser and a URL, though you were still clicking through page hierarchies that someone else had designed, still filling out forms, still figuring out where the website designer had hidden the thing you wanted.

Mobile simplified further with tap and scroll and swipe, intuitive enough that billions of people who never owned a desktop computer picked up smartphones and understood them immediately.

Chat is the logical endpoint of this trajectory: you just say what you want, in ordinary language, and things happen. The interface disappears almost completely.

Think about what consumer interactions look like when you strip away the particulars. You start with a vague desire, you clarify what you want through some process of exploration, you review your options, you make a decision, and you take an action. This structure holds whether you’re booking a haircut, finding a hotel, choosing a restaurant, or figuring out what to watch tonight.

Conversation turns out to be the native format for this entire process, which makes sense when you consider how humans actually operate. You don’t always know what you want when you start looking for something. You want to talk it through, narrow things down, change your mind halfway through when you learn something that shifts your preferences.

Traditional user interfaces are bad at this because they force you to already know what you’re looking for. You have to pick the right filters, navigate the right menus, and if you’re genuinely uncertain about what you want, the interface offers you almost no help. Chat lets you be fuzzy and iterative, which matches how human cognition actually works rather than demanding that users pretend to have more certainty than they do.

The crucial development is that chat can now complete the transaction, not merely recommend options. If the assistant can actually book the haircut, it’s not sitting in front of the interface anymore. It is the interface.

People object that for high-stakes purchases you surely need a proper visual interface where you can inspect the details and feel in control before committing. Nobody’s going to book a four thousand pound holiday through a chat window, or so the thinking goes.

I think this misunderstands where the need for elaborate visual interfaces actually comes from.

Consider what happens when your PA books you a trip. They send you an email with the flight details, hotel confirmation, dates, and total cost. You skim it, say “looks good” or “actually can we get a later flight,” and you’re done. You don’t need to log into Expedia and click around the confirmation screens because your PA has already done the filtering and you trust them to have understood what you wanted.

Chat with a capable AI is essentially that same relationship. You can ask “why this hotel?” or “what’s the cancellation policy?” and just get an answer, which is arguably better than clicking through multiple screens hunting for information yourself. The elaborate seventeen-screen booking flow was never really about reviewing. It was about maintaining control in a world where the tools couldn’t be trusted to understand what you actually wanted.

I should be clear about something that I think people get confused about: chat doesn’t replace visual interfaces so much as it becomes the primary way you interact within them. There’s still a screen, still visual elements that make information scannable, still a confirm button when you’re ready to commit. The difference is that you navigate by talking rather than by clicking through paths that a designer tried to anticipate in advance.

This is exactly what the MCP Apps release from last week is enabling. The Model Context Protocol now lets tools return interactive UI components directly in conversation: dashboards, forms, data visualisations, multi-step workflows, all rendered inline. Anthropic, OpenAI, and the MCP-UI community collaborated on the specification, and ChatGPT, Claude, and VS Code have already shipped support for it.

Shopify’s implementation is a nice example of what this looks like in practice. Their MCP server returns fully interactive product cards with variant selection, bundles, and cart functionality, all embedded within the chat interface. As their engineering team put it, commerce needs “visual and interactive elements like product selectors, image galleries, and cart flows.” Chat can now deliver them without breaking the conversational frame.
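To make the idea concrete, here is a rough sketch of the shape such a tool result might take: structured data the model can reason over, paired with a renderable component the chat client embeds inline. The field names are illustrative, not the actual MCP Apps specification.

```python
# Hypothetical sketch of a tool result pairing machine-readable data with
# an inline UI component. Field names are illustrative, not the real spec.

def product_card_result(product: dict) -> dict:
    """Return structured content (for the model) alongside a UI resource
    (for the chat client to render inside the conversation)."""
    return {
        # What the assistant reasons over
        "structuredContent": {
            "id": product["id"],
            "title": product["title"],
            "price": product["price"],
            "variants": product["variants"],
        },
        # What the host app renders inline, without leaving the chat
        "uiResource": {
            "mimeType": "text/html",
            "text": f"<product-card data-id='{product['id']}'></product-card>",
        },
    }

result = product_card_result({
    "id": "sku-123",
    "title": "Waxed canvas tote",
    "price": "£85",
    "variants": ["olive", "tan"],
})
print(result["structuredContent"]["title"])  # the model sees the data
print(result["uiResource"]["mimeType"])      # the client renders the card
```

The point of the two-part payload is that neither side degrades the other: the model keeps clean data to compare and summarise, while the user still gets the visual product card Shopify's team describes.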

If you accept this framing, the implications for the consumer internet are rather significant.

Right now we have millions of bespoke apps and websites, each with their own navigation patterns, their own checkout flows, their own account systems, their own idiosyncratic decisions about where to put buttons. Companies spend fortunes on UX design, conversion optimisation, and A/B testing, all to compensate for the fact that their interface is a thing that users have to figure out.

Much of this complexity collapses if chat becomes the primary interaction layer. You don’t need to design seventeen screens for a booking flow when you can expose an MCP endpoint that receives intents and returns structured data. The app becomes a backend service with a conversational contract, and frontend complexity drops dramatically.
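A "conversational contract" can be surprisingly small. Here is a minimal sketch, with entirely hypothetical names and inventory, of the kind of intent-in, structured-data-out handler that replaces those seventeen screens:

```python
# Minimal sketch of a conversational contract: the backend receives a
# structured intent and returns structured options, leaving presentation
# to the assistant. All names and data here are hypothetical.

from dataclasses import dataclass

@dataclass
class BookingIntent:
    service: str       # e.g. "haircut"
    date: str          # ISO date the user asked about
    constraints: dict  # fuzzy preferences, normalised by the assistant

def handle_intent(intent: BookingIntent) -> dict:
    """Does what a multi-screen booking flow used to do: filter
    inventory against the intent and return structured slots."""
    inventory = [
        {"service": "haircut", "date": "2025-06-02", "time": "10:00", "price": 35},
        {"service": "haircut", "date": "2025-06-02", "time": "16:30", "price": 35},
        {"service": "massage", "date": "2025-06-02", "time": "11:00", "price": 60},
    ]
    matches = [
        slot for slot in inventory
        if slot["service"] == intent.service and slot["date"] == intent.date
    ]
    return {"intent": intent.service, "options": matches}

response = handle_intent(BookingIntent("haircut", "2025-06-02", {"after": "09:00"}))
print(len(response["options"]))  # → 2
```

The assistant handles the fuzzy front half of the interaction (clarifying what the user wants, turning it into an intent) and the service handles the back half (inventory, availability, the actual booking). Neither needs to know anything about the other's interface.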

We might look back on this era of idiosyncratic website interfaces as quaint, in the way we now view hand-painted shop signs or manually operated lifts. There was something almost artisanal about learning that this site puts the checkout button top-right while that one hides it in a hamburger menu, that this airline has a four-step booking flow while that one has seven. Charming in a way, like regional dialects, but also ergonomically maddening, and ultimately not something anyone is going to miss.

The restructuring is coming, though it’s happening slowly because most businesses haven’t rebuilt their systems for chat. They’ve bolted a crappy chatbot onto their existing site, which is not the same thing as having a proper protocol that an external assistant can call to browse inventory, check availability, and complete transactions. We’re in the awkward middle period where the AI can talk impressively about booking your flight but actually completing it still requires falling back to a browser.

If services become chat-accessible backends, then your assistant becomes the universal interface to the consumer economy. One app that talks to everything on your behalf, knows your preferences, handles authentication, compares options across providers before you even ask.

This is obviously a valuable position to occupy, which is why OpenAI is subsidising consumer inference so aggressively to make ChatGPT the default place people go to get things done. It’s the Google playbook: get in the middle of intent, then monetise later through advertising or by taking a cut of transactions.

Here’s my contrarian take, though: the agent layer might be thinner than people think.

If the agent is mostly just a language model (which is increasingly commoditised as capabilities converge) plus a bundle of MCP connections to external services, what exactly is proprietary about it? Amazon is valuable because of warehouses and logistics. Shopify is valuable because they’ve aggregated the merchants. Luxury brands are valuable because of the products and the brand equity. The agent is just orchestrating API calls to other people’s infrastructure.

Value might not accrue to the front door at all. It might stay distributed among the services that actually do things, with the agent acting more like a commodity browser than a monopoly platform. The front door could even be open source.

The counterargument is that agents will develop lock-in through personalisation. If your assistant knows everything about you from years of interaction, you won’t want to switch even if competitors are technically equivalent.

The problem with this argument is that current language models don’t actually learn about you in any meaningful sense. They’re static neural networks. The “memory” that today’s assistants have is essentially a retrieval hack, where facts and summaries get stored in an external database and injected into the context window when they seem relevant.

Your preferences exist as text somewhere, not as adapted weights in the model, which means they’re portable in principle. If your preferences can be represented as structured data (“prefers window seats, allergic to shellfish, likes boutique hotels”), there’s no technical reason that data couldn’t be exported and imported to a different agent.
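The mechanics of why this memory is portable fit in a few lines. Here is a sketch (with a made-up schema) of the retrieval hack in miniature: preferences live as structured data, get injected into the context as plain text, and "switching agents" is a JSON export and import.

```python
# Sketch of why retrieval-style memory is portable. The schema is
# hypothetical; the point is that nothing here lives in model weights.

import json

preferences = {
    "travel": {"seat": "window", "hotels": "boutique"},
    "dietary": {"allergies": ["shellfish"]},
}

# Export from agent A...
exported = json.dumps(preferences, indent=2)

# ...and import into agent B, which injects it into its context window
# exactly the way agent A did.
imported = json.loads(exported)

def build_context_snippet(prefs: dict) -> str:
    """What gets prepended to a conversation: plain text, not weights."""
    lines = [f"{domain}: {facts}" for domain, facts in prefs.items()]
    return "User preferences:\n" + "\n".join(lines)

print(build_context_snippet(imported))
```

Nothing in this loop is proprietary to any one provider, which is the crux of the lock-in argument: the "memory" is a file, and files move.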

For genuine lock-in, you’d want the model to have internalised something about you that’s intrinsically hard to transfer, some deep intuitive understanding of your patterns that emerges from learning rather than being stored declaratively. Current systems simply can’t do that. They reset with every conversation, or at best retrieve some context from storage.

So if memory stays shallow and portable, switching costs stay low and the front door remains contestable. OpenAI can subsidise all they want, but if users can take their preference file and move to Claude or Gemini or some open-source agent running locally, the moat isn’t real.

If switching costs are low, what would actually make people stay with a particular provider?

Here’s one answer: being on their side.

An ad-supported agent has misaligned incentives, recommending what advertisers pay for rather than what’s genuinely best for you. People might be willing to pay real money for an assistant that’s reliably faithful to their preferences, that doesn’t quietly prioritise sponsored results or nudge them toward higher-margin options.

This doesn’t mean the model layer becomes wildly profitable. It’s fiercely competitive and likely to stay that way. What it does suggest is that there’s a plausible path to a subscription business, one where the product is an agent that’s actually aligned with your interests rather than someone else’s. That might matter more than lock-in.

We’re at the start of a paradigm shift comparable to the transitions from desktop to web or from web to mobile. Chat will become how most people interact with most digital services, not because it’s a gimmick but because conversation is the natural interface for expressing and refining what you want.

The interesting question isn’t whether this happens but where the value ends up. My guess: infrastructure providers do fine, service providers with real backend capabilities keep mattering, and there are meaningful opportunities in the integration layer that makes businesses chat-accessible.

The front door itself might be less defensible than current valuations suggest.
