An Open Letter to Mr Zuckerberg

Dear Mark,

WhatsApp has over 3 billion monthly active users. In most of the world outside the United States, it is the default way people communicate. They run businesses on it, organise weddings on it, message their mothers on it. For all practical purposes, WhatsApp is the global conversational interface.

This matters because every AI company in the world is currently trying to build a conversational interface of its own. Notifications, presence, threading, media handling, cross-device sync: each one is reinventing these things from scratch, poorly, inside its own little chat window. The industry has spent tens of billions of dollars training foundation models and then ships them behind interfaces that a small team could knock together in a week. You already have the right interface. You have had it for years. These two facts together represent an enormous uncaptured opportunity, and Meta is missing it.

You can already talk to Meta AI inside WhatsApp. The logic is understandable: Meta has spent billions training Llama, it has no direct way to monetise an open-weight model, and embedding it in WhatsApp at least generates engagement numbers that justify the investment to shareholders. The trouble is that this is a vertical integration play in a situation where the real value sits in horizontal enablement. Amazon faced a version of this decision in 2004 when it chose to open its server infrastructure as AWS rather than keep it captive to amazon.com; the difference between “infrastructure that runs our thing” and “infrastructure that runs everyone’s thing” turned out to be worth hundreds of billions of dollars. The analogy is imperfect, but the strategic shape is the same: Meta has built the world’s most widely adopted conversational infrastructure and is currently keeping it captive so that people can ask Llama what to have for dinner.

Let’s be precise about what AI agents need from an interface layer: persistent identity, a notification system users actually check, seamless media handling, and asynchronous communication. The user sends a request, pockets the phone, and gets a ping when the work is done. This is the natural interaction model for autonomous agents, and messaging platforms have spent a decade perfecting every component of it.
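That fire-and-forget interaction model is worth pinning down. The toy sketch below is purely illustrative (none of these names correspond to any real WhatsApp or Meta API): the user drops a request into a channel, walks away, and a notification arrives whenever the agent finishes.

```python
import queue
import threading

class AgentChannel:
    """Toy model of an asynchronous agent channel: the user sends a
    request, pockets the phone, and gets a ping when the work is done.
    All names here are illustrative, not a real platform API."""

    def __init__(self):
        self.inbox = queue.Queue()   # requests from the user
        self.notifications = []      # pings delivered back to the user
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def send(self, request: str) -> None:
        """User-initiated: the user can walk away after this call."""
        self.inbox.put(request)

    def _run(self) -> None:
        while True:
            request = self.inbox.get()
            # Stand-in for slow agent work done while the user is away.
            self.notifications.append(f"done: {request}")
            self.inbox.task_done()

channel = AgentChannel()
channel.send("book a table for Friday")
channel.inbox.join()   # in real life, the user simply gets a push later
print(channel.notifications[0])   # → done: book a table for Friday
```

The point of the sketch is that nothing about this loop requires a bespoke chat window: an existing messaging rail already provides the inbox, the worker handoff, and the notification.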

The AI ecosystem has clearly noticed. OpenClaw, an open-source agent framework that went from zero to 200,000 GitHub stars in under two months, connects to WhatsApp, Telegram, Discord, Slack, Signal, and about a dozen other messaging platforms. Anthropic recently launched Claude Code Channels with Telegram and Discord integration. Everyone is converging on messaging as the interface layer. The interesting detail is how they are getting there. OpenClaw connects to WhatsApp through Baileys, an unofficial reverse-engineered client that essentially breaks in through the back door. Developers want to build on WhatsApp so badly that they are using unsanctioned, fragile, potentially bannable hacks to do it. When people are picking your locks to build on your platform, the correct strategic response is probably to hand out keys.

In 2007, Steve Jobs was adamant that the iPhone would not support third-party native applications. Web apps would be sufficient, he reckoned. The reversal, launching the App Store in 2008, created what became Apple’s most important strategic asset after the hardware itself. Apple’s Services revenue reached $109 billion in fiscal 2025, up from essentially nothing before the App Store existed. No portfolio of first-party apps, no matter how good, could have generated that kind of value, because a platform captures a share of output from an entire ecosystem that grows faster than any single product line. Revenue scales with the ecosystem’s creativity, not just your own. Jobs was smart enough to recognise this before it was too late.

The parallel is uncomfortably direct. WhatsApp has the users, AI developers have the applications, and there is no marketplace connecting them. The Business API exists, technically, but it was designed for enterprises sending boarding passes, not for conversational AI agents. Its per-conversation fees, template approvals, and business verification hoops reflect a worldview from roughly 2018. A developer who wants to build a personal AI assistant on WhatsApp faces an obstacle course designed for an entirely different use case.

Meanwhile, Telegram has treated bots as first-class citizens since 2015: creating one takes five minutes, costs nothing, and requires no paperwork. When Anthropic launched Claude Code Channels, Telegram was the obvious first integration, and when OpenClaw users want the easiest path, Telegram is the standard recommendation. The developer ecosystem is forming there because Telegram actually lets people build.

Platform shifts do not happen because users switch overnight. They happen because developers go where they can ship, early adopters follow, and habits form. Telegram is accumulating those habits right now, and every month that WhatsApp stays closed is a month in which those habits deepen. WhatsApp probably will not lose its user base over this, but it risks becoming the app for family photos while Telegram becomes the app where things get done. That distinction, between social utility and productive utility, matters enormously for engagement, monetisation, and long-term relevance.

WhatsApp’s hostility to bots was a deliberate and, for a long time, entirely correct design choice. The old bot paradigm was unsolicited outreach: businesses pinging users who never asked to hear from them, which is spam by any reasonable definition. WhatsApp built its reputation on being the messaging app that did not do that, and the resulting trust is one of its most valuable assets. Opening the platform to bots under that old paradigm would indeed have been reckless.

The new paradigm works differently. AI agents in the OpenClaw model do not cold-message anyone. The user initiates the conversation, chooses which agents to talk to, and can terminate the channel whenever they like. The communication is inbound from the user’s perspective, which means the spam vector simply does not exist in the same way. It is closer to the App Store model, where users choose to install applications, than to the email model, where anyone with your address can fill your inbox. Meta’s existing trust infrastructure (reputation scoring, rate limiting, abuse detection) would still add valuable defence in depth, but the fundamental architecture of user-initiated channels does most of the heavy lifting by itself.
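The structural difference between the two paradigms fits in a few lines. This is a hypothetical design sketch, not Meta’s actual API: delivery is gated on a channel the user explicitly opened, so unsolicited outreach is impossible by construction rather than by moderation.

```python
class ChannelRegistry:
    """Minimal sketch of the user-initiated channel rule: an agent can
    only deliver into channels the user has opened, and the user can
    close a channel at any time. Hypothetical, for illustration only."""

    def __init__(self):
        self._open = set()   # (user_id, agent_id) pairs the user opened

    def open_channel(self, user_id: str, agent_id: str) -> None:
        self._open.add((user_id, agent_id))

    def close_channel(self, user_id: str, agent_id: str) -> None:
        self._open.discard((user_id, agent_id))

    def deliver(self, agent_id: str, user_id: str, text: str) -> bool:
        # No open channel, no delivery: the inversion of the old
        # spam-prone broadcast model. Returns whether delivery happened.
        return (user_id, agent_id) in self._open

reg = ChannelRegistry()
reg.open_channel("alice", "travel-agent")
assert reg.deliver("travel-agent", "alice", "Your flight is booked")
assert not reg.deliver("crypto-bot", "alice", "Buy now!")  # never opened
reg.close_channel("alice", "travel-agent")
assert not reg.deliver("travel-agent", "alice", "One more thing")
```

Reputation scoring and rate limiting would layer on top of this gate, but the gate itself is what makes the old spam objection moot.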

The deeper irony is that bots are already on WhatsApp regardless, through unofficial clients like Baileys, operating with zero trust framework, zero moderation, and zero user protections. They arrived whether Meta wanted them to or not; the only remaining question is whether they operate under sanctioned guardrails or without any at all.

The developer demand is there, the precedent is there, and the strongest historical objection dissolves under the new paradigm. The ask is a new developer-facing API purpose-built for conversational and agentic AI, with a revenue share model along the lines of the App Store that lets Meta monetise the ecosystem without throttling it. The trust layer required to keep the experience safe is well within Meta’s existing capabilities, particularly when users control which channels they open and close.

There is a deeper prize that goes beyond revenue share. If every AI agent interaction flows through WhatsApp, Meta accumulates something far more valuable than a frontier model: a personal context layer. The history of what each user asks for, what they care about, how they interact, their preferences, their projects, the texture of their daily lives. With user consent, that context could be offered back to any agent on the platform, so that switching from one model to another does not mean starting from scratch. The model layer is commoditising rapidly; Claude, GPT, and Gemini are converging in capability and will continue to do so. The personal context layer will not commoditise, because it is unique to each user and accumulates over time. Meta is currently spending billions competing at the model layer, where it gives the product away for free. Owning the context layer instead would be a durable, defensible asset that no amount of training compute can replicate.

Meta wants to think of itself as an AI company. It owns the world’s dominant messaging rail. The convergence of those two facts is the single largest platform opportunity in AI, and it is currently being wasted on the Llama chatbot that, with the greatest possible respect, nobody was crying out for. Using WhatsApp as a distribution channel for Llama has some value, but it is a fraction of what the platform could be worth with Claude, GPT, Gemini, Llama, and every other model operating natively alongside 3 billion users. The sooner that distinction lands, the better for everyone. Please don’t fumble the ball, Mark.
