Moonshot AI's Founder: His Pursuit of AGI and the Company’s Potential Viable Business Model

“For us, it’s about exploring the unknown. Just like AGI, you usually only see the illuminated side of the moon, but the dark side remains mysterious. It’s challenging, yet full of potential. That aligns with our mission.” — Zhilin Yang, Founder and CEO of Moonshot AI (Kimi)

Hi Guys,

Today, we finally look into Moonshot, the company behind Kimi K2. The company has said the model outperforms some mainstream models, such as the open-source DeepSeek V3 and Anthropic’s proprietary Claude models, especially in areas such as coding, completing general agent tasks, and tool integration.

[Image. Source: Moonshot AI]

On the evening of July 11, 2025, Moonshot AI released and open-sourced Kimi K2, making it China’s first open-source trillion-parameter model. Its popularity among domestic open-source models surpassed DeepSeek-R1 for the first time. Within 48 hours of release, visits to Kimi’s official website surged by 3.6 billion, Hugging Face downloads exceeded 100,000, and GitHub-related projects skyrocketed by 200%.

On OpenRouter, K2’s token consumption quickly surpassed that of Elon Musk’s Grok-4, topping the global API call rankings. Perplexity’s CEO announced plans to do further training on top of K2, and Hugging Face’s co-founder also praised it highly.

Devansh dove into the technicalities of the model here, especially breaking down the MuonClip Optimizer, and Tony broke down the cost of integrating the model’s API here.

But today, we will look at who the founder is and how the research lab became one of the most important teams in the open-source AI model space. We will cover the founder’s background, his values, his mission, and the company’s potential viable business model.

Amongst the “four tigers,” only MiniMax and Moonshot are still charging ahead at full speed. MiniMax is doubling down on its strength in companion and entertainment bots, given its legacy product Talkie, and focusing on multi-modal models.

Moonshot AI, meanwhile, is pursuing the most advanced LLM. Zhilin Yang 杨植麟, the founder of Moonshot AI, said he and his team are stubborn AGI purists who are focused on the vision of the future, rather than the product or short-term profit - a contrast to the big tech players in many ways.

Zhilin Yang has long been a subject of public curiosity. He graduated from Tsinghua University and Carnegie Mellon University, ranking first in his class at Tsinghua’s Computer Science Department and completing his Ph.D. at CMU in under four years.

During his academic career, he collaborated with multiple Turing Award winners, publishing over 20 influential papers in the field of artificial intelligence. Compared to the other founders of the four tigers, he seems to have the most “western” experience: before founding Moonshot AI in 2023, he had also worked at Facebook AI Research and Google Brain. The Chinese AI unicorn has been backed by Alibaba, Tencent, and Sequoia China, reaching a ~$3.3 billion valuation in late 2024.

Zhilin Yang is a fascinating character. Over a year ago, he made his first podcast appearance on First Push (“第一推动”) with Cao Xi, Co-founder and Partner at Monolith Management and a former Sequoia China executive. The episode was titled “月之暗面杨植麟:创新、长期、第一性” (“Moonshot AI’s Zhilin Yang: Innovation, Long-Termism, First Principles”). It was one of his few interviews, and here is how he described the founding of the company, his views on talent, and his pursuit of AGI. (Condensed for brevity while retaining key insights. Direct quotes remain in “”)

Zhilin is not your typical “nerd.” As we now know, Pink Floyd’s album inspired the name of the company, but what many don’t know is that Zhilin is a musician himself.

Zhilin is a drummer and was in a band called Skip List. Again, I’m not the technical person here, but a “skip list” is essentially a multi-level express lane for searching a sorted list. The bottom layer has all the elements in order, while higher layers act as shortcuts, skipping over large chunks of data. When searching, you start at the top layer and move right for as long as the next element does not overshoot your target, then drop down a level and repeat, like taking faster trains and switching to slower ones as you get closer to your destination. This makes searches much quicker than checking every single element, balancing simplicity and efficiency. Thus, it gets the best feature of a sorted array (fast searching) while maintaining a linked-list-like structure that allows cheap insertion, which a static sorted array cannot offer without shifting elements.
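For the more technical readers, the express-lane idea above can be sketched in a few lines of Python. This is only a minimal illustration, not anything from Moonshot or the band: the class name `SkipList`, the `MAX_LEVEL` cap of 16, and the coin-flip probability `p=0.5` are all assumptions chosen for the example.

```python
import random


class _Node:
    def __init__(self, key, level):
        self.key = key
        # forward[i] is the next node on layer i (layer 0 is the full list).
        self.forward = [None] * level


class SkipList:
    MAX_LEVEL = 16  # assumed cap on the number of express lanes

    def __init__(self, p=0.5, seed=None):
        self._p = p  # assumed chance of promoting a node one layer up
        self._rng = random.Random(seed)
        self._head = _Node(None, self.MAX_LEVEL)  # sentinel, holds no key
        self._level = 1  # number of layers currently in use

    def _random_level(self):
        # Flip coins: each success lifts the new node one layer higher.
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < self._p:
            level += 1
        return level

    def insert(self, key):
        # update[i] = last node on layer i that sits before the new key.
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        level = self._random_level()
        self._level = max(self._level, level)
        new = _Node(key, level)
        for i in range(level):
            # Splice the new node into layer i by relinking two pointers.
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, key):
        node = self._head
        # Start on the top layer; move right while the next key is still
        # smaller than the target, then drop down one layer and repeat.
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

    def __iter__(self):
        # Walking the bottom layer yields every key in sorted order.
        node = self._head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]


sl = SkipList(seed=0)
for k in [30, 10, 50, 20, 40]:
    sl.insert(k)
print(list(sl), sl.contains(40), sl.contains(35))
```

Insertion only relinks a handful of pointers instead of shifting a whole array, which is the trade-off the analogy is getting at.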

He’s a man of reasoning and science, but also a man of arts and aesthetics, which will influence his design and management of the company going forward. During the interview, he said he thinks back on his time with the band fondly and mentioned that, although the direct lessons from forming a band and running an LLM company now might be minimal, it introduced him to a group of people.

“A truly great company needs cultural depth—it’s not just about technology or a soulless product. The soul comes from underlying values,” said Zhilin.

The company’s Chinese name, 月之暗面, literally translates to “Dark Side of the Moon” and comes from Pink Floyd’s album The Dark Side of the Moon.

The English name Moonshot reflects something ambitious and difficult, he said to the podcast host, “like landing on the moon.” And that’s fundamentally what they’re trying to pursue here - a seemingly impossible but possible task. His poetic nature comes through while explaining how he came up with the name with the team.

“For us, it’s about exploring the unknown. Just like AGI, you usually only see the illuminated side of the moon, but the dark side remains mysterious. It’s challenging, yet full of potential. That aligns with our mission.”

DeepSeek’s all-star, domestically educated team has propelled Chinese AI talent to the forefront. Zhilin has no such requirement; he himself was educated in the U.S. after obtaining his undergraduate degree in China. What he and his co-founders and key members (Yutao Zhang, Xiyu Zhou, and Yutong Zhang) seem to have in common is that they are Tsinghua graduates.

Ultimately, he said it’s not just about technical capabilities; it’s about finding the right group of people with the same mission and values. He describes his team as innovative, long-term thinkers who fundamentally believe in first principles. They are the people “who believe AGI is the only thing worth doing in the next decade,” and that “vision matters most.”

In his eyes, things are evolving daily within the space, and being successful requires a few core traits, above all learning agility and adaptability. He added that “even my views evolve daily.”

Of course, many of his team members are also music fanatics, and their office meeting rooms are named after various albums of Pink Floyd. He said that his team could form several bands, and in fact, it is this pursuit of aesthetics that subtly shapes their products. His personal hero? Steve Jobs, whom Zhilin calls a man who was able to scale taste.

With a better understanding of who he is, we now look at what drives him.

“I witnessed a massive shift. AI has developed over 70 years, but never before has there been such a breakthrough. Previously, AI companies focused on B2B. But since last year, applications like GPT have emerged, reaching non-technical users. Hitting 100 million users was a clear signal.”

That’s why Moonshot’s Kimi has not only pursued the most cutting-edge model research but also tried to prioritize a good consumer experience (in the beginning, we talked about how that has shifted).

“Unlike the internet, which connects things, AI creates new productivity. I believe AGI could be the most valuable endeavor of the next decade,” said Zhilin.

When asked about the positive response to Kimi’s launch (2024), Zhilin said, “I’m cautiously optimistic—prepare for the worst but strive for the best. Timing also matters. Years ago, assembling such a team was harder—capital and talent flows were different. Now, the market recognizes AGI’s potential.”

But he believes it’ll be a long journey ahead, as over the last decade in AI, he’s seen its potential to multiply productivity, which has been able to unlock brainpower beyond human limits. And if AGI is pursued correctly, it’ll propel society positively, allowing more people to be free from having to work for money and freeing up creativity. If done right, he believes AI can democratize creation so that anyone could direct a film or compose music. But if done wrong, only a few will control AI, and risks will rise. Thus, he believes that choosing open source will help decentralize the technology, and personalized, open models could prevent dystopia.

After DeepSeek, Nathan Lambert has written about how Kimi K2 was another awakening moment for Western developers, and this trend could potentially continue, where Chinese labs continue to lead in open source models.

Globally, AI labs are feeling the heat as open-source models are increasingly recognized for their role in democratizing AI development. OpenAI just released two open-weight models, gpt-oss-120b (120 billion parameters) and gpt-oss-20b (20 billion parameters), under the Apache 2.0 license, marking its first major open-weight release since GPT-2 in 2019. Still, the models are competitive mid-tier options at best, designed to counter rival ecosystems and expand adoption, not to rival OpenAI’s flagship products.

But Chinese labs seem to be normalizing open source as the industry expectation now. And at the recent World Artificial Intelligence Conference, the Chinese government essentially countered the US AI Action Plan by saying it will diffuse AI technology across the global south rather than control it. For the first time, the embrace of open source was voiced publicly at a national level. So my analysis has evolved: open source is not only a philosophical choice and a choice made out of necessity, but now a top-down driven movement as well.

Former Benchmark Partner, Bill Gurley, on the BG2 podcast recently made a very valid point that with Chinese LLM labs choosing open source/ weight, this strategy allows them to work on top of each other like an open-weight remix that can accelerate improvements.

Essentially, one model is used to distill another, and in some ways, this might make a monopolistic breakout winner less likely in the long run. Still, for big tech companies that are mainly trying to stay relevant and capture value in the distribution channels of their existing apps, it makes more sense to be open source than not. He said that if U.S. big tech companies like Microsoft and Amazon were being “smart,” they would follow what the Chinese big tech companies are doing: commoditizing LLMs, investing in the leading startups, and capturing value that way, while letting the real frontier model startups do their thing - in this case, DeepSeek and Moonshot in China.

So, as foundational AI models become increasingly commoditized, labs around the world are adopting markedly different strategies to generate revenue. ROI is still unproven, but divergent playbooks are being played out.

[Image: ChatGPT generated]

On one side, we have OpenAI, Anthropic, and other leading closed-source LLM labs building vertically integrated businesses. OpenAI owns the whole value chain and is trying to become the “Apple of AGI”: an end-to-end product company with a platform and distribution. This means the company is monetizing at the consumer, enterprise, and developer levels.

  • Consumer - With over 100 million users of ChatGPT and a $20/month subscription plan, the consumer tier alone could represent a billion-dollar revenue line.

  • Enterprise - ChatGPT Team and Enterprise, with per-seat pricing targeting knowledge workers, and an API platform powering thousands of developers and software vendors. If OpenAI captures even a fraction of the productivity software market, its total addressable market (TAM) stretches into the hundreds of billions globally.
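As a rough sanity check on the consumer bullet above, here is a back-of-envelope sketch in Python. The 100 million users and the $20/month price come from the text; the share of users who actually pay is not given anywhere in this piece, so the `paid_share` values below are purely hypothetical scenarios, not reported figures.

```python
def annual_subscription_revenue(users, paid_share, monthly_price):
    """Yearly revenue if `paid_share` of `users` pay `monthly_price` each month."""
    return users * paid_share * monthly_price * 12


USERS = 100_000_000    # "over 100 million users" (from the text)
MONTHLY_PRICE = 20.0   # $20/month plan (from the text)

# Hypothetical conversion scenarios; the real paid share is an unknown here.
for paid_share in (0.01, 0.05, 0.10):
    revenue = annual_subscription_revenue(USERS, paid_share, MONTHLY_PRICE)
    print(f"{paid_share:.0%} paying -> ${revenue / 1e9:.2f}B per year")

# Share of users that would need to pay to reach $1B per year.
break_even_share = 1e9 / (USERS * MONTHLY_PRICE * 12)
print(f"Paid share needed for $1B/year: {break_even_share:.1%}")
```

Under these assumptions, a bit over 4% of users paying would be enough to cross the billion-dollar mark, which is why the consumer tier alone can plausibly be a billion-dollar line.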

We’ve written extensively on why big tech goes down this path: these companies can monetize through their existing distribution and reach, eventually selling value-added services to their existing users, optimizing ad revenue, and enhancing their existing products.

Anyway, I think the divergence highlights another key strategic tension: labs like OpenAI that own the full value chain, from model to interface, can experiment with pricing, upsell across customer segments, and retain rich user data. But labs like Moonshot must rely on being picked by platform gatekeepers, competing on model performance, cost, and fine-tuning flexibility. As open-source alternatives proliferate and customers grow more price-sensitive, value will increasingly accrue to those with proprietary distribution, whether that’s through a household-name chatbot or deep integration into enterprise workflows.

In theory, the most profitable labs will be those that productize their models, not just build them. That means monetizing attention (consumer), action (enterprise), and infrastructure (developers), at scale.

And from where we sit now, only a few players will eventually manage to do all three. However, unlike some skeptics, I still think it’s too early to say companies like Moonshot cannot be profitable.

But then, if that’s the “perfect business model,” how do open-source labs such as China’s Moonshot AI plan to reach profitability? What can they do?

Moonshot is pursuing a more modular approach. With no equivalent to OpenAI’s global reach or Microsoft partnership, obviously, Moonshot is positioning itself as an AI infrastructure vendor in China (for now), powering third-party apps, mini-programs, and enterprise deployments across China’s tech ecosystem.

Monetization could come from API licensing, model-as-a-service deals, or consumer freemium upsells. But in this approach, Moonshot doesn’t control the user interface, which is basically limiting its ability to extract high-margin recurring revenue. The theoretical TAM is still large, given the scale of China’s consumer internet and SaaS markets, but likely will cap out in the low tens of billions unless Moonshot moves up the stack.

I think this goes back to the questions we keep asking ourselves at AI Proem when analyzing these AI companies:

1) Is everyone trying to be the next trillion-dollar business? Maybe not.

2) Will the Chinese AI ecosystem remain largely separate from the rest of the world (ROW) like the internet ecosystem did? Tbh, it might.

3) If Moonshot and equivalent companies were to diffuse the technology to the global south, will it be limited to the global south, and is that a big enough market anyway? [Chinese internet and EV companies seem to be proving this feasible.]

Kimi and DeepSeek’s business model, in theory, is trying to become the “NVIDIA of LLMs in China” — a key supplier to everyone else’s AI products.

If Moonshot can lock in key consumer app partnerships (like being embedded in WeChat, Alipay, or ByteDance’s enterprise tools), instead of competing with them, it can build a defensible distribution edge. But without that, it risks being replaced by cheaper OSS (like DeepSeek or MiniMax) and potentially even by incumbent tech giants (Baidu, Alibaba, Huawei) with their own models. In late 2024, Zhilin seemed to recognize this: Moonshot cut some of its 2C (consumer) business ambitions and shifted resources and attention back to model development. Its options from here, as I see them:

  1. Monetize mobile app scale fast — convert general users to premium.

  2. Lean into verticals, which is the path Zhipu and MiniMax have taken.

  3. Offer white-label LLMs to Chinese tech firms that don’t want to build their own or cannot reach the frontier level. This seems like what Moonshot is trying to do.

  4. Differentiate on context window + RAG UX as open source catches up.

So, I think in the mid-term, it would make the most sense for Moonshot to leverage its Data Flywheel and offer Training-as-a-Service, as it could eventually monetize Kimi’s usage data to improve models and offer fine-tuning services, like “bring your own data” for custom copilots.

Then it’s doubling down on solidifying its position as a foundation model lab, but not really as a consumer-facing chatbot company. Capturing consumer mindshare will be very challenging for Moonshot, especially since the BBATs have an existing reach of over 1 billion and access to endless cash piles to fund consumer marketing campaigns. (I guess unless somehow they get crazy funding and pull an OpenAI?)

But when comparing Moonshot to OpenAI, Zhilin said:

“In addition to the technical level, our values differ somewhat from OpenAI: we hope that, in the next era, we can become a company that combines OpenAI’s technology idealism with the business philosophy of ByteDance. I believe the Asian mindset towards commercialization has certain merits. If you don’t care about commercial value at all, it’s actually very difficult to create a truly great product, or to make an inherently great technology even greater.”

Aware of the challenges ahead, he realized that to stand out, Moonshot will need to offer something different. In this case, personalization.

“The ultimate, core value of AI-native products is personalized interaction, which is something previous technologies haven’t implemented well. So this question is actually about personalization — how to enable users to gain highly personalized interactive experiences, the more they use your product. For many products today, the degree of personalization is almost zero,” said Zhilin Yang, in the same interview with Overseas Unicorn on Feb 21, 2024.

See the complete translation of the interview by Jordan Schneider’s ChinaTalk in March: Moonshot AI's AGI Vision.

There is no real conclusion yet on how profitability will be achieved, tbh. It’s all speculation as the industry slowly matures, but it is interesting to see the rapid speed of progress in the space and how the views on what is worth pursuing, how to build AI, and the ROI of AI are all evolving constantly.

I recently had interesting calls with three different early-stage investors based in San Francisco, Singapore, and Shanghai (for context, none of them were Asian funds). It’s interesting to see how much the appetite for Chinese startups and the perception of Chinese AI players can vary. I’m not sure whether this reflects personal views or the environment that shapes them, but I’m keen to hear your thoughts:

  • For the U.S.-based investor, the view on China is that there is so much happening and we need to know what, how, why, and who NOW. Genuine FOMO took over the conversation.

  • The Singapore-based investor said that his fund is not so concerned about U.S. regulatory constraints. The fund believes North Asia, especially China, has a massive advantage in energy efficiency technology and in implementing AI into workstreams to increase efficiency.

  • The Shanghai-based investor, ironically the one sitting in China, is the most bearish; he believes China’s AI players won’t be that relevant globally in ten years. Anthropic and OpenAI will be the trillion-dollar players, capturing everything from consumer and enterprise to API market share, because you can only win this race if your models are at the frontier.

Let me know your thoughts. No matter if you’re a startup founder, investor, or enthusiast, get in touch. Don’t be shy 🙂

Lastly, if you have found value in my work, I would appreciate it if you could share it with more people. It is word-of-mouth referrals like yours that help me grow. And please feel free to tag me in your post so I can see/share it.

Email: aiproem@substack.com

LinkedIn: www.linkedin.com/in/gmzshao/

Instagram: graceshao_proem