Android Dreams

At the convergence of frontier research breakthroughs, billions in capital, and rising geopolitical tensions lies a dream for a new physical world. After the LLM wave, robotics is seen as the next exponential growth domain.0Robotics startups have seen influxes of billions in capital over the last 2 years, and the trend is only increasing: $6.9B in 2023, $7.5B in 2024, and $10.3B in 2025 so far, here and here.Chinese manufacturing is viewed as an existential threat to the US, adding to incentives. And, though robotics is the hardest domain of AI1Moravec’s Paradox: the things that are easy for humans are difficult for AI, pointed out by Hans Moravec in his book Mind Children. This “paradox” becomes more apparent over time., multiple new AI strategies now offer clear paths to Embodied General Intelligence (EGI).2Teleoperation, Exoskeletons, world models, reinforcement learning, and internet video are the frontrunner strategies. The question is now: which wins?

Informed by conversations with frontier researchers, intuitions gained at Optimus and Dyna2.5My intuitions developed in part from world-class mentors, but Android Dreams uses no non-public information from these companies., and my own syntheses, I predict inference-controlled robots will comprise half the world’s GDP by 2045. This scenario illustrates how.

2023-2025: The Dawn Era

Throughout Android Dreams, I use hypothetical company names to represent archetypes of companies, like “US AI lab” OpenBrain, or “Chinese humanoid company” Unioak.4OpenBrain name and hypothetical naming system from AI 2027, an inspiration for this work

OpenBrain's LLMs take the world by storm. Multiple robotics companies raise $300M rounds on the back of OpenBrain's success, including Waytek, a new US robotics startup. They try to replicate OpenBrain’s success by creating an “LLM for robots”.3.5A Vision-Language-Action model (VLA) is like an LLM but takes in both vision and language, and outputs actions. They are the current popular architecture for robotics models, consisting of a pre-trained VLM with a diffusion-based action head finetuned on robot data as described here.

OpenBrain's pre-training worked because it had an entire internet's worth of text data. But there’s no “internet of robotics”: the largest action dataset for robots is less than 0.01% of OpenBrain’s LLM dataset.5The largest action trajectory dataset is Open-X-Embodiment, which is x terabytes. GPT-3 trained on x tokens, 4 on x, and 5 on y. No comparable amount of robot or human action data has been recorded. We'll see that manually collecting this data for EGI is near-impossible. OpenBrain’s post-training works by reinforcement learning, where the model attempts tasks like complex math problems billions of times and learns from its successes and failures. But for robots, the real world is too slow to get billions of interactions.6RL requires lots of data, in the form of interactions with the environment it cares about. Math and coding RL for LLMs worked because they are entirely virtual and can be fully accelerated. We can’t accelerate the real world directly, so gathering interaction data for RL robots is a challenge. This is why no scaled-up RL in the real world for robots has been shown, yet… All robotics research efforts are now about enabling robots to scale both pre-training and post-training.

Waytek Teleoperation Works

From 2023 to 2025, Waytek tries to solve the pre-training bottleneck by brute-forcing through teleoperation: they collect data of humans controlling robots, then train a Vision-Language-Action model (VLA), with an architecture similar to LLMs, to imitate that data.6.5All open source models in modern AI robotics have three modules: a vision module, a language module, and an action prediction module. The vision and language modules are usually initialized from small, open-source VLMs like Qwen, Llama, or Gemma. See Pi0 and another here. Early signs show this works in demos with high reliability, like laundry, making sandwiches, folding shirts, and sorting packages.7Physical Intelligence’s pi0 here showed teleoperation works to train a VLA to get a robot to fold laundry end-to-end, among many other task demonstrations.

Some companies like Waytek go all-in on teleoperation as a result.8Home humanoid company 1X recently famously released its robots in the home here, with the intent to gain more diverse data and scale.Half think teleoperation will lead to generality, the other half decry teleoperation because it isn’t scalable. Both are wrong: the two types of AI robot tasks, Narrow and General, require different solutions to pre-training and post-training.

Narrow tasks are simple but with slight variance, like package sorting and cloth folding.8.5Narrow task markets are still massive, in the multiple billions.General tasks require all functions of a full human, like service, construction, healthcare, education, and the home. Teleoperation works for narrow tasks, and is a dead-end for general ones.9Explained in the next section, see “how much teleoperation to EGI?”.

Across the Pacific, Xi’s shadow grows

Shenzhen automates its hundredth manufacturing plant into a fully autonomous dark factory.9.25Hundredth is an estimate, but the real figure is likely somewhere in the high tens or hundreds. China already has 295k industrial robot installations vs the US’s 38k. China produces 2x America’s energy, has 10x the manufacturing capacity, and proceeds indomitably toward 100% automation.9.4In 2023, China generated 9,300 TWh of electricity vs the US’s 4,178 TWh here, a 2.2x multiple. Manufacturing capacity is estimated here based on steel production (China ~1,005 Mt, U.S. ~81 Mt here), and China has ~12× (≈2.5 TWh vs >0.2 TWh) the Li-ion battery production here and here. Their actuator production capacity is ~1M (Laifual ships 300k/year and Nantong Zhinkeng 100k/year) compared to the US in the low 100s of thousands. China already has 50% of the world's robot installations here. China controls ~90% of refining and ~98% of magnet production here.

In 2024, Unioak, a Chinese humanoid company famous for its robot dogs, starts selling humanoids on the market. By 2025, they’re doing dances and wowing in demos.9.5Unitree dance demos here and here. They're not just targeting dances anymore.US investors discuss Chinese dark factories in fear, and write memos about how America Must Reshore Manufacturing.9.8My favorite is Martin Casado’s here, I agree with Martin and view Android Dreams as a path to implementing these visions.People worry about China, and people worry about AI job replacement. But no one quite sees what’s coming.

2026-2030: The Vertical Era

By 2026, the first AI-controlled robot will replace a human job. Waytek, the frontrunner US vertical robotics company, uses cheap Chinese hardware and teleoperated data collection on a simple task like package sorting until they reach 80% of human performance. They offer their robots' labor to a package sorting facility, whose owner wants to help the oncoming automation wave.

Waytek Clones Rise

Bears point out that Waytek robots can’t generalize or reason, and are not truly intelligent. But for the first time in AI robotics, robots actually work. Waytek raises a new $400M round and hires operators, engineers, and data collectors to scale up deployment while others raise their own $50M rounds to replicate Waytek’s recipe in other verticals.

Waytek uses a “Robots-as-a-Service” (RaaS) model to sell: take in wages for each hour of robot work instead of selling robots outright. This lets users ignore technical or operational complications during deployment.11This is an underrated aspect of the integration of robots into industry. There will likely be humans employed as supervisors and technicians for autonomous fleets for a while, just as “dark factories” are still supervised operations.Scaling to multiple facilities of package sorting, Waytek scales to $100M ARR by automating this one job alone.12Even for simple tasks like package sorting, the TAM is enormous. I envision Waytek to capture a small part of this market in the beginning as they overcome the challenges of deployment.

Observing Waytek’s success, US humanoid companies like Noumena realize that expensive humanoids can’t compete with cheap Chinese hardware on simple tasks. To be successful, they’ll need to target general tasks that Waytek’s strategy can’t reach: tasks that justify humanoids’ 4x cost multiples.14Humanoids today cost 100-200k, and cannot compete with cheap 20-40k hardware on costs, even using a RaaS model. But they win only in tasks that cheap hardware cannot do (which is a much larger TAM)

Waytek Scales to $10B ARR

As they scale, Waytek shifts mostly to exoskeleton collection.14.5Exoskeletons are robots that can be worn and attached to follow a human's movement, instead of being remotely controlled like in teleoperation.Teleoperation systems are expensive and difficult to operate, while exoskeletons are inexpensive and enable human-level dexterity for better data.15Dexop here shows exoskeleton data transfers to real robot performance. Generalist here added even more credence to this idea. Exoskeletons are more scalable than teleoperation, which requires an entire ~$20-40k robot per collector.

As they reach thousands of robots deployed, Waytek trains small, task-specific world models (videos that an AI can interact with) to evaluate their AI models and accelerate development.16world model RL is shown to work by Dreamer 4 here, and at the bottom of this page shows transfer to robotsTheir deployment scale also enables more data for reinforcement learning, which improves their AI’s task performance on each narrow task over months.17LLMs showed that Reinforcement Learning works better and more stably the larger the batch size is. Reinforcement learning also requires lots of interactions with an environment, which more robots deployed will enable.

Operations are also a challenge; integrating with messy operations in novel circumstances is never easy. Human workers are still needed to supervise when robots get stuck, malfunction, or break down. But even hesitant employers simply cannot resist the cheaper robot labor.18Even hesitant business owners can't deny the economics: Waytek’s robot, which costs only $10/hr and can operate 24/7, is much more efficient than a $20/hr human.
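The labor economics in this comparison can be made concrete with a rough sketch. The $10/hr robot rate and $20/hr human wage come from the footnote above; the supervisor ratio is a hypothetical parameter, added here only to reflect that humans still supervise the fleet, and the function names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # one work station kept running around the clock


def annual_station_cost(hourly_rate: float, hours: int = HOURS_PER_YEAR) -> float:
    """Cost of keeping one work station staffed for `hours` per year."""
    return hourly_rate * hours


def effective_robot_rate(robot_rate: float, supervisor_wage: float,
                         robots_per_supervisor: int) -> float:
    """Robot hourly rate plus each robot's share of one supervisor's wage."""
    if robots_per_supervisor <= 0:
        raise ValueError("robots_per_supervisor must be positive")
    return robot_rate + supervisor_wage / robots_per_supervisor


ROBOT_RATE = 10.0            # $/hr, from the footnote
HUMAN_WAGE = 20.0            # $/hr, from the footnote
ROBOTS_PER_SUPERVISOR = 10   # hypothetical assumption, not from the article

human_cost = annual_station_cost(HUMAN_WAGE)
robot_cost = annual_station_cost(ROBOT_RATE)
supervised_cost = annual_station_cost(
    effective_robot_rate(ROBOT_RATE, HUMAN_WAGE, ROBOTS_PER_SUPERVISOR)
)

print(f"Human labor, 24/7:      ${human_cost:,.0f}/year")
print(f"Robot, unsupervised:    ${robot_cost:,.0f}/year")
print(f"Robot, with supervisor: ${supervised_cost:,.0f}/year")
```

Even after adding a share of a supervisor's wage, the robot station stays well below the cost of human labor under these assumptions, which is the point the paragraph makes about hesitant employers.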

Having solved the post-training bottleneck in narrow domains, Waytek robots keep improving, going from 80% to 90% to 95% of human speed. Waytek continues to accelerate deployment as they encounter more scenarios of the same few verticals. Over 5 years, vertical robotics companies like Waytek continue to scale and earn billions of dollars in narrow task wages: industrial laundromat folding, hotel towel folding, basic food processing, and warehouse package sorting.19The TAMs for these tasks are in the multi-billions. At a wage of $40k/year, 100k robots puts Waytek at $4B ARR. 100k workers comprise only ~0.8% of the 12M logistics workers in the US alone.
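The fleet revenue arithmetic in the footnote can be checked with a short sketch. The $40k/year wage, 100k robots, and 12M logistics workers are the article's figures; treating one robot as replacing exactly one worker is a simplifying assumption made here.

```python
def fleet_arr(robots: int, wage_per_robot_year: float) -> float:
    """Annual recurring revenue if every robot earns a fixed yearly wage."""
    return robots * wage_per_robot_year


def workforce_share(robots: int, workforce: int) -> float:
    """Fraction of a workforce matched by the fleet (assumes 1 robot = 1 worker)."""
    return robots / workforce


arr = fleet_arr(100_000, 40_000)
share = workforce_share(100_000, 12_000_000)

print(f"ARR: ${arr / 1e9:.1f}B")
print(f"Share of US logistics workforce: {share:.2%}")
```

At these inputs the fleet earns $4B per year while matching under 1% of the logistics workforce, which is why the narrow-task market alone leaves so much room to grow.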

By 2030, Waytek reaches more than 100k robots deployed.19.5At a rate of $40k per robot per year, 100k robots puts Waytek at $4B ARR. 100k workers comprise only ~0.8% of the 12M logistics workers in the US alone. Their expansion is bottlenecked by how fast they collect data and integrate into company operations. The operational challenge of integrating robots with real labor is as time-consuming as the AI problem.20Changing operations from humans to robots comes with difficulties, like maintenance, slight changes in standards, adjustment from 8 to 24-hour operations, etc. And, most of all, every new deployment requires new data. But even without expanding into new verticals, Waytek has lots of room to grow.

China’s Furnaces Heat Up

As Waytek and its vertical robotics clones experience rapid growth and consume Chinese hardware by the billions, a looming shadow grows. China’s robotics manufacturing supply chains experience Wright's law: costs fall 20% for each doubling of manufacturing volume.21Wright's law is a heuristic developed from airplane manufacturing, stating that for every cumulative doubling of units produced, the cost of producing each unit falls by a constant percentage. Identified by T.P. Wright in Factors Affecting the Cost of Airplanes. 20% is a rough estimate based on averages across industries.Their advantage over the US widens as demand for cheap hardware rises.
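Wright's law as stated above can be sketched in a few lines. The 20% learning rate is the rough cross-industry estimate from the footnote; the $100 starting cost and the function name are illustrative assumptions, not figures from any supplier.

```python
import math


def wrights_law_cost(initial_cost: float, volume_multiple: float,
                     learning_rate: float = 0.20) -> float:
    """Unit cost after cumulative production grows by `volume_multiple`.

    Each doubling of cumulative volume cuts unit cost by `learning_rate`.
    """
    if volume_multiple <= 0:
        raise ValueError("volume_multiple must be positive")
    doublings = math.log2(volume_multiple)
    return initial_cost * (1.0 - learning_rate) ** doublings


# Hypothetical $100 component followed across successive doublings.
for multiple in (1, 2, 4, 8, 16):
    cost = wrights_law_cost(100.0, multiple)
    print(f"{multiple:>2}x cumulative volume -> ${cost:.2f} per unit")
```

Because the cost falls by a fixed fraction per doubling rather than per unit, whoever already has the larger cumulative volume keeps compounding the advantage as demand grows.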

China’s government has already mandated automation of physical labor in key sectors like manufacturing and orients its entire economy around this goal.21.5Beijing’s Made in China 2025 targets 70% domestic content in core components/key materials in ten sectors, including robotics, here. They set a doubling of robotics automation goal in 2020 for 2025, and already surpassed this goal here.It's nearly forced into this: China’s aging population spells disaster if China can’t automate physical labor in time.22China’s population fell ~2M in 2023 and ~1.4M in 2024.

State-subsidized capital allocators fund new Waytek clones in China, starting with Xiaoai Automation: the Chinese equivalent of Waytek. China has lower hardware costs than the US and much lower data collector wages.23Competing with cheap Chinese labor would be difficult without the power of glorious CCP subsidies. Average data collection wage in the US is ~$35/hr, but in China $~15/hr, and many more people are available to collect data in China.

America struggles to compete with China’s deeply entrenched low-cost hardware supply chains. The US government knew regular AI would be a threat, but it finally wakes up to just how far behind it is in manufacturing as Chinese actuators fall to less than 1/20th the cost of their US equivalents.24In 2025, Chinese integrated servo joints are $500–800 compared to the US’s $1.8k–3.5k (MyActuator X8 vs HS FHA series). Reducer class joints are ~$200 vs ~$1500 (Luafei vs Harmonic Drive Systems). This means there is already a 5-10x cost multiple today, and the gap is only widening. And as China works backdoors into their robots, the US realizes that having millions of Chinese robots in America would be a significant threat to national security.24.5Unitree already engineers backdoors into their robots. It’s likely that the US government decides that the benefit of using Chinese-manufactured robots is worth the risk. Especially for robots with Chinese component parts but assembled in America.

Displaced workers advocate AI Socialism

In America, those displaced from automated jobs combine with AI-displaced white-collar workers into a UBI advocacy group. These AI Socialists are the first to seriously push for the basic income for the AI-replaced.25AI socialists are an imaginary group focused on AI-displaced rightsAnti-AI sentiment grows from the “clanker” memes of 2025 into a serious and powerful movement.26The “Clanker” meme is ironic, but also a foreshadowing of cultural sentiments in the futureRobotics companies are seen as cold, unfeeling institutions stealing money from workers and hoarding for themselves. Waytek realizes it needs to improve public perception and elects to give portions of revenue directly to displaced workers.27I imagine this to help them instead of being higher taxed by the government, and the government gives money to the displaced. A potential model is that workers get a cut of the robot’s wages that replaced their job, up to 5 years.

Waytek wants to expand past narrow tasks but can’t: automating one job in one vertical takes months of data collection and ops effort. Getting to EGI at a GPT-5 level would require more than 10 years and 10s of billions of dollars.28“Data quality and diversity matter more than sheer volume”, according to ablation experiments by Generalist AI here. Teleoperation is inherently nondiverse and optimized for volume. Even discounting diversity, the calculation of how much teleoperation to EGI gives the mentioned metrics. Teleoperation and exoskeletons are unscalable for general intelligence: robotics needs a new method to crack generality.

2027-2032: The Humanoid Era

Each era describes unique trends, but eras overlap and unfold simultaneously

Noumena surpasses Waytek using human video

By 2027, US humanoid company Noumena had long ago pivoted from teleoperation and exoskeletons. Inspired by academic research, they turn to a more scalable alternative: learning from human video.29The following papers prove that learning from video works (using different implementation details). A Whole-Body Motion Imitation Framework from Human Data for Full-Size Humanoid Robot here, and Humanoid Control from human videos here. Robot Telekinesis here shows robot learning from internet video. K-VIL here extracts object keypoints and learns to match those trajectories.The high-level method is proven in academia already: record humans doing jobs, extract meaningful action data from the video, and train a large model to imitate the data.30This method has been proven to work in industry and is publicly pursued by many humanoid companies. FigureAI released here that they're already collecting loads of diverse home human video data. Tesla also demonstrated here that, using an unspecified method, learning from human videos with a minimal amount of teleoperation data scales to a wide variety of tasks. SkildAI also reported that they are including learning from video, likely associated with a CMU learning from video papers here and here.

Learning from video scales better because, instead of wearing bulky hardware, workers perform their jobs as usual while being passively recorded by multiple cameras. Instead of $40k of robot hardware or a $5k exoskeleton, each worker needs only 2-4 $75 cameras. No adjustment to operations is needed from companies.31Workers can operate as usual; the recording process is passive. The main operational overhead is the camera setup. I imagine this is dealt with by robotics companies themselves, but new data companies take over as well…
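A rough sketch of the hardware cost gap between the three collection methods follows. The per-collector prices are the article's figures (taking the upper end of 4 cameras for video); the $1M budget is a hypothetical number chosen only to make the comparison concrete, and the sketch ignores wages, setup, and storage.

```python
def collectors_per_budget(budget: float, cost_per_collector: float) -> int:
    """How many data collectors a hardware budget can equip."""
    if cost_per_collector <= 0:
        raise ValueError("cost_per_collector must be positive")
    return int(budget // cost_per_collector)


CAMERA_PRICE = 75  # $ per camera, from the article

# Hardware cost to equip one collector under each method.
rig_costs = {
    "teleoperation robot": 40_000,
    "exoskeleton": 5_000,
    "human video (4 cameras)": 4 * CAMERA_PRICE,
}

BUDGET = 1_000_000  # hypothetical hardware budget in dollars

for method, cost in rig_costs.items():
    count = collectors_per_budget(BUDGET, cost)
    print(f"{method:<24} ${cost:>6,} each -> {count:>5,} collectors")
```

Under these assumptions the same budget equips over a hundred times more video collectors than teleoperators, which is the scaling argument in hardware terms.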

Learning from video allows Noumena to collect for difficult tasks like manufacturing, farming, and construction.32Farming, manufacturing, and construction can’t be teleoperated or exoskeletoned, because they are inconvenient for workers. Yet the tasks are repetitive and thus ripe for human video humanoid automation. These are easier than full general tasks (no reasoning or long-term memory necessary), but are higher-variance than Waytek’s narrow target tasks. Waytek, whose grippered, mobile-base Chinese hardware exhibits poor learning transfer from human data, cannot use learning from human video.33Academia has already shown that physiology matters for human video transfer because the retargeting process depends on robot physiology. Closer to human hardware is better. Robot Telekinesis shows this here, as does other research here and here.

Noumena solves post-training

As robots start to be deployed at scales in the thousands, Noumena solves the post-training constraint in two ways: in the real world and in neural-network-generated world models.34Both of these won’t work generally at this time, but will work in narrow domains for getting better at specific, repetitive tasks with low variance.

Per-domain neural network world models are trained on millions of hours of deployed robot data. Traditional simulations couldn’t capture real-world complexity, but world models learn each quirk of the environment better as dataset sizes and parameters grow.34.5Genie 3 here shows that world models scale to human-judged realism. Dreamer 4 here shows world model RL successfully transfer to an environment with no extra in-domain learning.Noumena uses reinforcement learning in these task-specific world models to gather lots of interaction data, going from 80% up to more than 100% of human task speed.34.8LLM RL has already been shown to be able to increase complex task performance manyfold, see the o1 report here and Deepseek R1 here and here.

Since human data is hardware-agnostic, reinforcement learning is also used to allow models to adapt to each hardware, like slightly higher-friction actuators.34.9As mechanical engineers know, no two hardware instances are the same!As more robots are deployed and reinforcement learning data sizes increase, post-training becomes more and more effective.35The scaling RL paper here shows that RL scales in a sigmoid curve (exponential then decaying) with respect to compute in a given domain. In robotics, compute for RL is limited by how many robots there are, since that limits amount of data coming in at once (and thus how much compute we can use).

Noumena Scales Across Verticals

Now that both its per-task pre- and post-training strategies work, Noumena seeks to expand its formula everywhere it can. For Noumena, gathering human video data on their target domain is necessary before deployment: zero-shot performance is impossible.36There will be a long period of time during which we have some generality, but task performance is significantly enhanced by task-specific data. In fact, this will almost always be the case. Even for EGI, data communicates specific techniques of operation. They realize their limiter for deployment is data for that exact task and domain. Data companies like XSize realized this years ago and now thrive.

XSize sells troves of task-specific human video data to humanoid companies, who then deploy robots into those exact domains. On an ongoing basis, XSize organizes contracts with application domains, sets up cameras to record workers, and sells that data to humanoid companies, orchestrating the deployment of those humanoids in the same locations. Thousands of these “forward-deployed data companies” act as interfaces to accelerate humanoid deployment into the real world.37I envision that humanoid companies want to expand as rapidly as possible. Seeing as their bottlenecks are data and operational effort, it stands to reason that commodity companies will be made around getting humanoids into deployment.

This per-task loop of deployment continues manually, growing into the billions over the years of deployment.38The TAM for human video humanoids is any task that's simple and needs no reasoning, but still dexterous. This includes manufacturing and farming. Continual RL on deployed fleets acts as an eternal source of per-task growth. As one would expect, the simpler tasks like manufacturing and farming are automated first, and workers are left with the more strenuous and difficult tasks like mining and construction. But Noumena’s humanoids are still not even close to zero-shot deployment capability, and Noumena realizes that it’s still missing a critical piece of the generality puzzle.39Any manual data collection strategy will be bottlenecked on data: as compute grows by Moore’s law, manual data collection cannot keep up. Passive internet-scale data, on the other hand…

US Reindustrializes by Dominating Actuator Manufacturing

As manufacturing upper-level Type I and lower-level Type II tasks are unlocked, America closes the loop on all tasks needed for robots to manufacture robots.40This will take a couple of years, but it is the highest priority of many robotics companies. Now that robots can automate human labor in manufacturing, **America reaches a critical juncture.**

As with self-improving AI, there is an exponential, self-reinforcing aspect of general robotics growth, specifically for robots that specialize in manufacturing.41Many disagree because there are other limiting factors; this is true. I think this post provides a good bear case for this exponential manufacturing narrative; this presents a good bull case. The limiting factors include organization, factory design, traditional automation machines, and, most of all, input resources, which are themselves limited by both human labor and resource processing. Takeoff is delayed by these, but the factor is very much real. Just as AI can improve itself by automating AI research, robots can automate the process of building more robots. Robotics has an exponential, self-reinforcing manufacturing curve, just as AI research can create a self-reinforcing intelligence curve. Wright's law, that costs fall 20% for each doubling of manufacturing volume, further accelerates this curve.42Wright's Law here, the actual percentage depends on industry, but over most industries turns out to be 10-25%.

This curve takes a long time to pick up: just as the AI growth curve is bottlenecked by raw compute, the robot manufacturing growth curve is still bottlenecked by other factors, including organization, factory design, traditional automation machines, and, most of all, input resources, which are themselves limited by both human labor and resource processing.43Self-reinforcing robotics manufacturing is not a “whoever gets there first wins it all” situation, because so many other limiting factors exist for robotics manufacturing. But this effect will widen any existing gap by default. For these reasons, takeoff isn’t immediate once human manufacturing labor can be automated; it will take 10-15 years from now until the self-reinforcing loop is hit.44See here. By this time, following the trend of 10x inference cost decrease every year (even assuming decay in that trend), inference will be more than 1000x cheaper. Each robot costs around 40-80k; amortized over even just 2 years, this is much cheaper than human labor wages. The same holds for manufacturing, for the extraction of minerals from the earth, and for rare earth processing. But robotics manufacturing is bottlenecked by human labor: ~60% of robot actuator manufacturing is human labor.

The US Government now sees a window to pivot its AI advantage into a hardware advantage.44.5This also makes use of the US’s training and inference compute production advantage. **Just as China did with rare earth processing, America starts the manufacturing growth loop by automating, subsidizing, and dominating a critical vertical: actuator manufacturing.**

America realizes it will have difficulty competing with China's entire ecosystem, so it must pour all its resources into dominating one specific subdomain, ideally the most important one. Actuators are the most expensive component in robots, comprising around 30-50% of the total cost, while human labor comprises around 60% of the actuator’s expensive manufacturing process.45See here, here, and here. Actuators require precision machining, cold-forming, heat-treating, and precision form-grinding, which requires specialized human labor here. Strain-wave reducers are ~50–70% human labor because flexspline forming/grinding, wave-generator assembly, bearing integration, and nearly all fixturing, inspection, and test-running steps are manual, even when CNC cuts the metal. Servo-motor production adds ~30–50% human labor from winding, magnet placement, and QC, which combines with manual encoder wiring and final assembly, so the integrated actuator ends up ~40–60% human labor overall. That makes it perfect for automation by the very AI robots they’re trying to manufacture. Like China, which now controls ~90% of refining and ~98% of magnet production using ~10 billion in subsidies, D.C. commits billions to dominating actuator manufacturing.46See here and here
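The two percentages above combine into a single figure for how much of a robot's cost is human labor inside its actuators. The sketch below assumes, as a simplification, that labor's share of the manufacturing process equals its share of actuator cost; the inputs are the article's 30-50% and ~60% estimates, and the function name is illustrative.

```python
def actuator_labor_share_of_robot(actuator_share: float,
                                  labor_share_of_actuator: float) -> float:
    """Fraction of total robot cost that is human labor inside actuators."""
    for value in (actuator_share, labor_share_of_actuator):
        if not 0.0 <= value <= 1.0:
            raise ValueError("shares must be between 0 and 1")
    return actuator_share * labor_share_of_actuator


LABOR_SHARE = 0.60  # ~60% of actuator manufacturing, per the article

low = actuator_labor_share_of_robot(0.30, LABOR_SHARE)
high = actuator_labor_share_of_robot(0.50, LABOR_SHARE)

print(f"Actuator labor as share of robot cost: {low:.0%} to {high:.0%}")
```

Under these assumptions, roughly a fifth to a third of a robot's total cost is actuator labor alone, which is why automating this one vertical moves the whole cost curve.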

Noumena and Unioak manufacture humanoids by the millions

Noumena already started building their own humanoid manufacturing facilities years ago. The first tasks they collect and automate are for the human workers in these facilities.47easy access, and high priority

Aided by the US government, Noumena now builds specialized facilities to manufacture commodity inputs to robots like harmonic-drive actuators and processed rare-earth metals. The US government adds subsidies to these domestic ingredients to try to help them compete with Chinese alternatives. They also focus on industry-critical verticals like mining, construction, and metal processing.48These verticals are harder to automate than manufacturing, and will come laterAmerica races to build the machine that builds the machines.

Unfortunately, China realized this long ago. Unioak, China’s most well-resourced humanoid company, shifted from entertaining dances to useful manufacturing tasks by learning from human videos.49Unioak is a parallel to many similar Chinese humanoid companies, which currently focus on agility but will soon switch to economically meaningful tasks…With the might of Beijing behind its back, Unioak manufactures humanoids by the thousands per week and accelerates automating its factories.

China uses the same playbook they used for dominating rare earth metal processing: automation, amortization over large volumes, and government subsidies to kill competition. Unlike in virtual AI, China never needed to spy on Americans. They were ahead from the start, and their self-reinforcing manufacturing curve hit even earlier. By the time America could onshore manufacturing, China had driven the price of a harmonic drive unit from $250 down to $100.50If the Wright’s law heuristic holds, then this price reduction is reached with just a 16x volume increase. America still has a herculean uphill battle left to fight.
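The footnote's volume estimate can be checked by inverting Wright's law. This sketch assumes the 20% cost decline per doubling used elsewhere in the article; the function names are illustrative.

```python
import math


def volume_multiple_needed(start_price: float, target_price: float,
                           learning_rate: float = 0.20) -> float:
    """Cumulative volume multiple needed to move from start to target price."""
    if not 0 < target_price <= start_price:
        raise ValueError("need 0 < target_price <= start_price")
    doublings = math.log(target_price / start_price) / math.log(1.0 - learning_rate)
    return 2.0 ** doublings


def price_after(start_price: float, volume_multiple: float,
                learning_rate: float = 0.20) -> float:
    """Unit price after cumulative volume grows by `volume_multiple`."""
    return start_price * (1.0 - learning_rate) ** math.log2(volume_multiple)


exact = volume_multiple_needed(250.0, 100.0)
print(f"Volume multiple for $250 -> $100: {exact:.1f}x")
print(f"Price after a 16x increase: ${price_after(250.0, 16.0):.2f}")
```

At a 20% learning rate, four doublings (16x) bring a $250 unit to about $102, and reaching exactly $100 takes roughly 17x, so the footnote's 16x figure is a close approximation.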

Automation sows chaos at home

While its overseas rival grows, Washington DC has another problem to deal with. Nearly 10% of the U.S. population has now been displaced by automation, both blue-collar and white-collar.51Summing workers in easy verticals, plus easy white-collar jobs like simple web-development software engineering and analysis. Workers refuse to be recorded by cameras, since they know they lose their jobs soon after. “Clanker” is shouted in the streets, though robots luckily haven’t yet reached human-facing roles, so abuse is minimal.

Subsidized income is given to the displaced, likely through the government. Citizens demand that automating companies like Waytek and Noumena either pay displaced workers directly or pay heavy tax rates up to 70% of profits to fund this UBI system. The cost of human labor rises because of UBI, driving incentives for automation.

The political environment is now driven almost entirely by the oncoming automation wave. Citizens want to distribute the gains from AI to all and slow down automation, while companies urge that every dollar must be spent trying to claw back America’s manufacturing dominance from China. It’s existential, they argue. But trends from the early 2020s continue: income inequality rises, and the average American is worse off than their parents. Political division is rising, both culturally and economically.

People see one way out: full automation, and abundant income for all. But the current path still gives no clear way to a truly general system. There’s still not enough diverse data: general embodied intelligence needs to generalize to everything on earth. All roads point to one trove of data yet to be used.

2031-2045: The General Intelligence Era

The last Bottleneck to AGI, Adaptive Long-Term Memory, is Solved

Both humanoid companies like Noumena and frontier AI labs like OpenBrain have been contemplating how to crack the final barriers to EGI for years. In 2029, OpenBrain’s computer agent researchers solve the last remaining bottleneck to human-level virtual AI: adaptive long-term memory (LTM).53Adaptive long-term memory will be solved by an AI lab because that bottlenecks computer agents, whereas roboticists have many more problems to deal with (first the pretraining and posttraining problems need to be solved). Adaptive LTM allows models to break the long context barrier holding them back, and enables true “online learning” like in humans.53.5Adaptive long-term memory is not just information storage, but the method of adapting context and notions of importance. It is the macro-online learning mechanism in humans. Depending on the final implementation of LTM, it may be combined with RL. This was the first major barrier to truly general robots, now solved. The only remaining bottleneck to EGI is pre-training.

OpenBrain Solves Robotics Pretraining

The first hints of the solution to scalable robot pretraining appeared in OpenBrain’s video model, Soreo 4, which could produce video including human movements conditioned on language.54Sora and Veo both show in-context control of human bodies in video production, see the Veo manipulation demonstration videos here. Try the Sora prompt: create an egocentric video of a robot doing chores or a human doing construction work. They have all the priors, the problem is now how to translate that knowledge into actions with some kind of (likely learned) 2D-to-3D transformation.Soreo could generate fully plausible robot hand movements, manipulating objects and adjusting to context and user input like a robot should.55Many papers have shown transfer from the internet to real-world works. RT-2 here showed transfer from a VLM trained on internet data. All modern open-source models like Pi0 follow the same format: leverage internet pretraining to get powerful priors in the real world. VJEPA2 here shows that a model trained to predict video and posttrained to predict actions is better than one without the pretraining.Soreo could generate coherent egocentric videos of humans, and even robots, moving intelligently conditioned on text and world context.

It became clear to OpenBrain that video models have implicit action models.53.8See “Video models are zero-shot learners and reasoners” here.Substantial evidence in 2025 already shows that pretraining on video prediction offers significant gains, even at small scales.55Many papers have shown transfer from the internet to real-world works. RT-2 here showed transfer from a VLM trained on internet data. All modern open-source models like Pi0 follow the same format: leverage internet pretraining to get powerful priors in the real world. VJEPA2 here shows that a model trained to predict video and posttrained to predict actions is better than one without the pretraining.The problem now became how to more strongly extract a working robot foundation model from these videos.

Sutton’s The Bitter Lesson made clear that whatever algorithm scales with compute wins.56Scaling with compute means you can’t be bottlenecked on any other factor. This bitter lesson isn’t an eternal principle, but is true in our time because the rate of compute improvement is so high, following Moore’s law of a doubling every 2 years.An algorithm that uses extra compute to learn directly from troves of existing data is much better than one bottlenecked by brute-forced data. After some time, OpenBrain researchers crack this puzzle and create a general solution to the pre-training problem.
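The compounding behind the Moore's law claim in the note above can be checked with a few lines of arithmetic. This is only an illustrative sketch: it assumes a clean doubling every 2 years, and the function name `compute_growth` and the year ranges are hypothetical choices for illustration, not figures from any cited source.

```python
def compute_growth(years: float, doubling_period_years: float = 2.0) -> float:
    """Return the multiplicative growth after `years`, assuming capacity
    doubles once every `doubling_period_years` (an idealized Moore's law)."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Year ranges are illustrative; they mirror the eras named in this scenario.
    for start, end in [(2025, 2030), (2025, 2045)]:
        factor = compute_growth(end - start)
        print(f"{start} -> {end}: about {factor:,.1f}x")
```

Under that idealized assumption, 20 years is ten doublings, or roughly a thousandfold gain, which is why an algorithm limited only by compute keeps improving while one limited by hand-collected data stalls.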

Now the three criteria for EGI, long-term memory, online learning, and pre-training, have been solved. At the intersection of these solutions, OpenBrain crafts the first AI model with all the capabilities of a human.57I mean all economically useful capabilities, not every possible capability in existence.Embodied AI has finally gained generality: EGI has been achieved internally.58AGI achieved internally.

OpenBrain Scales EGI

OpenBrain’s new robot model, called EGI-1, has achieved average human performance at most tasks when dropped into a new environment.59EGI is a gradual process. And after that, surpassing human performance becomes much harder. The exponential takeoff narrative is true, but happens over a much, much longer time horizon than some might think.This is good enough to justify replacing many, many people. Anticipating this, OpenBrain had already bought a less-expensive humanoid company that manufactures in China: their only chance to compete at the same level as Noumena. They started manufacturing at scale a year ago, and are now primed to start deploying. They start with industry, construction, and behind-the-counter service jobs.

Over the years, online learning grows. Combined with long-term memory, reinforcement learning reaches unprecedented data scales, aided by large deployment fleets and massive general world models like OpenBrain’s Wisher.60The real-world equivalent to Wisher is Genie, a massive internet-scale pretrained world model.

Now the labs have another scaling law, and continue investing more and more compute into this method as EGI goes from the level of GPT-3 to GPT-6 to GPT-9. OpenBrain uses some form of distillation to fit on remote compute or remote inference of the full model. By this time, computing has improved such that a trillion-parameter model can fit on just one edge inference device.61If Moore's Law holds, then by 2030, this should be true.Noumena also suspected that the video model strategy would scale, and pivots to it. Though they are disadvantaged in pre-training, Noumena already has tens of thousands of robots deployed, which is a massive advantage for post-training.

Finally, humanoids can replace humans with immediate deployment and RL as they work. Humanoids break into service, construction, mining, farming, healthcare, education, and, at last, the home. Humanoids are everywhere. They gradually improve online, gathering more interactions with the real world and iteratively refining their general intelligence.

Humanoid doubters realize they’re wrong. The humanoid form-factor proved critical, even aside from “the world being designed for humans”, because both training from video models (which mostly predict human movement) and learning from human video require near-human physiology. Humanoids were truly the fastest way to reach EGI.62retargeting humans

The Fifth Industrial Revolution: Robotics Companies Boom

Even after EGI, Waytek’s cheap robots continue to grow. For a long time after, there are different classes of robots pervading all industries. Companies still want cheap Chinese stationary arm robots to pick boxes or fold clothes, not overkill expensive human-level humanoids.63I envision cheap robots being around until humanoids reach a manufacturing scale to justify even lower costs. This is still a ways away.If cheap hardware worked before, it still works.

Each vertical demands a different class with unique branding and purpose. Both simple robots and humanoids have 3 versions: default, durable, and dexterous. Humanoids additionally specialize into small service robots, male and female androids, agile combat models, and more.

The only bottleneck to scaling deployment is now manufacturing capability. OpenBrain acquired a humanoid company just for this purpose, while Noumena has been building its automated factories for years now. The government is now pouring billions into subsidizing humanoid manufacturing, but it may be too late.64I imagine we ought to do this as soon as possible

On Beijing’s orders, Deepcent (a representation of China’s big AI labs) has acquired Unioak and solved EGI for the CCP.65Deepcent needs hardware just like OpenBrain. OpenBrain couldn't acquire Noumena, but the CCP can mandate Unioak to merge with Deepcent.Unioak and other Chinese humanoid companies are churning out humanoids by the millions per week. They’ve long grown orders of magnitude above the US in the exponential manufacturing growth curve. China’s manufacturing capability dwarfs America’s; combined with China’s dominance in energy production, this leaves the United States struggling to compete with Chinese prices.66If energy production follows the trends, then by this time, China will have ~10 PWh vs. the US’s 4.5 PWh, see here and here.Noumena and other companies can still compete in high-quality robots, but the majority of the world’s mechanical workforce is manufactured in China.67At this point, Chinese production could be in the millions of robots per week. Chinese vehicle production was at 31 million units in 2024 here, with the world at 92.4 million units.
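To put "millions of robots per week" in perspective, the 2024 vehicle figures quoted in the note above can be converted to a weekly rate. The sketch below is rough arithmetic only: using vehicle output as a yardstick for humanoid output is an assumption, the one-million-per-week target is a hypothetical value for illustration, and the function names are made up here.

```python
WEEKS_PER_YEAR = 52

# 2024 vehicle production figures quoted in the text (units per year).
CHINA_VEHICLES_2024 = 31_000_000
WORLD_VEHICLES_2024 = 92_400_000


def per_week(annual_units: float) -> float:
    """Convert an annual production figure to an average weekly rate."""
    return annual_units / WEEKS_PER_YEAR


def annual_from_weekly(weekly_units: float) -> float:
    """Convert a weekly production rate to an annual figure."""
    return weekly_units * WEEKS_PER_YEAR


if __name__ == "__main__":
    china_weekly = per_week(CHINA_VEHICLES_2024)
    china_share = CHINA_VEHICLES_2024 / WORLD_VEHICLES_2024

    # Hypothetical target, chosen only for illustration: 1M humanoids per week.
    target_annual = annual_from_weekly(1_000_000)
    multiple = target_annual / CHINA_VEHICLES_2024

    print(f"China, 2024 vehicles per week: {china_weekly:,.0f}")
    print(f"China share of world vehicle output: {china_share:.1%}")
    print(f"1M robots/week = {target_annual:,.0f} per year, "
          f"{multiple:.2f}x China's 2024 vehicle output")
```

By this yardstick, China built roughly 600,000 vehicles per week in 2024, about a third of world output, so a single million robots per week would already require about 1.7 times that annual unit volume.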

Worldwide, corporations are accepting contracts from humanoid companies to purchase their labor instead of workers. For now, task performance is still boosted by posttraining using human video data from XSize and other data companies. Upon deployment, there is a ramp-up period for online learning task performance for each application.68This ramp-up period gives time for online learning to adapt the policy to each new deployment setting or technique.

As months pass, a large, diverse data flywheel is built, and online learning continues to accelerate. More and more of the remaining industries are being conquered: agriculture, mining, service, and the home. Humanoid performance has now surpassed human performance in most tasks by GDP: true Embodied General Intelligence is achieved.68.5“surpass human performance” here means better than the average experienced worker in a given application domain

By this point, if America has successfully managed to dominate a critical robot supply chain element like actuators, it stands a chance. Otherwise, it simply cannot compete with China’s entrenched supply chain, massive energy growth, rare earth metal processing dominance, and force of governmental will toward automation.69I believe in the good ending, but who knows

The Joys and Consequences of Artificial Humans

Finally, the era of androids has arrived. EGI reaches a general level and can fully pass the Turing test, except for appearance. The obvious next step is to give them realistic skin, as consumers enjoy feeling human connection.70Even in 2025, realistic Chinese robot skin is common, made of medical-grade bionic silicone here, here, and here. You already can’t tell the difference.These robots pervade human-facing industries like retail, food service, and healthcare.71The home is one of the most difficult domains because it requires zero-shot generalization to any possible scenario. It requires full, uncompromised EGI.As the models and hardware improve, it becomes harder and harder to tell androids apart from humans.72Anecdotally, people want BB-8 in their home more than they want C-3PO, unless it’s in a human-looking android in a romantic capacity.

“Companion” androids also pervade the market: they are finetuned to maximize human desire and worsen the ongoing fertility crisis.71.5I’m not saying I want this to happen, but rather that it simply will happen. Look at already ongoing phenomena with LLM “partners”, and the prevalence of internet sexual content.They come in both male and female forms, and have optimized personalities, infinite patience, sex appeal, and, eventually, artificial wombs.

Androids also spread into healthcare, alleviating underserved areas and shortening month-long wait times. They are most used in elder-care facilities, serving as kind, human-feeling 24/7 attendants to the needs of the elderly and as constant companions to the lonely among them.

OpenBrain and Noumena don’t sell to the military because their employees refuse, but as Deepcent-Unioak has no such qualms, other US robotics companies offer theirs in the service of Washington.75Military use of AI is a big topic in the ethos. Anecdotally, most in frontier labs would rather their work be used for non-combat purposes. Although drones are optimal for nearly every military use, humanoids are still better for in-building operations.The US military has a “human-in-the-loop” requirement, which ensures all violent activity is directly approved by a human observing at all times.

The same agile humanoids find stardom in Humanoid Battlebots, a robot fighting league that attracts fans from around the world. Like F1 Racing of the past, each frontier robotics company makes its own hyper-engineered champion robot to test its prowess across the hardware and model stack.76Combat humanoids are one of the easiest categories of automation because they’re easy to start with teleoperation. By this time, they would’ve reached autonomy.

After a few years of EGI improvement, OpenBrain autonomous robots finally reach the home. People can buy them as complete products.71The home is one of the most difficult domains because it requires zero-shot generalization to any possible scenario. It requires full, uncompromised EGI.These are likely smaller, friendlier, cuter, and more aesthetic than regular humanoids.72Anecdotally, people want BB-8 in their home more than they want C-3PO, unless it’s in a human-looking android in a romantic capacity.Buying robots for the home starts with the upper class and spreads to the masses.73In Asia, the cost of labor is still so low that home humanoids capture only a fraction of the total live-in maid market.

Culture and Geopolitics of the New Axial Age

On both sides of the Pacific, unemployment asymptotically approaches 90%. In domains like service, education, healthcare, the arts, and government, humans still work because being human offers inherent value to customers. But now that human labor is less of a capital consideration, the leadership of corporations like OpenBrain, Deepcent-Unioak, Noumena, Xiaoai, and Waytek has 100x more power than before.74Assuming “AGI” is achieved, AI companies start to comprise portions of national GDP in the single-digit percentages. The value of these companies will grow into the multi-trillions, and their power likely approaches that of nations. If they get explicitly nationalized like Situational Awareness predicts here, this scenario plays out a bit differently, but I’m taking the opposing bet.

With such a large unemployed population, which receives basic income but sees the exponentially compounding wealth of AI company shareholders, the political environment is tense. Citizens fight to institute dividends from AI companies directly to all citizens, and AI companies fight to reinvest all earnings back into compute and manufacturing.77One of the simplest ways to balance the two is to allow all citizens to be shareholders, but reinvest all profits into the exponential autonomy machine.Over the years, citizens and corporations continuously negotiate to balance investing in the welfare of citizens against investing in the cycle of increasing autonomy.77.5Over time, the leverage of citizens would diminish, and the goodwill of those in power matters more and more.

In times of social upheaval, new ideologies and cultures form as people need to address the central questions of human life: what is meaning, what is the universe, what am I? There is an explosion in the diversity of lifestyles and ideologies.78One thing everyone gets wrong is that, even as time goes on, there will still be people living identically to our ancestors, in our natural, evolved environment. Diversity of lifestyle only increases over time.Like the first Axial Age of 500 BC, which gave rise to Buddhism, Confucianism, and Western philosophy, this second Axial Age sees the rise of Naturalism, New Worldism, Elitism, Spiritualism, and Ascendism.

Beijing and Washington are in constant negotiation. Both sense a better future, and don’t want their empires to erupt in flames.79As with the symbiotic-competitive US and Soviet Russia in the Cold War, science and technology growth is spurred by competition. Would China choose to invade Taiwan, and make all the rest of Asia and the US its enemy, while its citizens’ lives are rapidly improving by the day? All citizens in growing nations have a lot to lose.China still dominates manufacturing, but the US dominates computing infrastructure and AI software. Proxy conflicts continue to erupt and likely expand in other continents, which lag decades behind in automation.79.5In “third-world countries”, robotics is an exceptionally difficult project because the cost of labor is so low. Trying to further undercut those low wages isn’t worth it to AI companies who instead compete with $35/hr US wages. The US is at an advantage in this way.Like in the past, the peace depends on a tension built on a balance of resources and citizens, strengthened by co-migration and cultural diffusion through the internet.80When TikTok was shut down, US citizens migrated en masse to Xiaohongshu and found out Chinese young adults were not all that different from themselves. The average Chinese citizen actually has a positive view of the US. And China sends many bright students to the US, while the US is increasingly doing the same. There is a bear case for this narrative, but I’ve chosen to end on a hopeful note here.Ultimately, the larger struggle is between citizens and their AI companies. The US and China enhance each other's lives in a symbiotic relationship while continuing to compete for dominance in all areas.

2045+: The Superintelligence Era

After reading this, the intelligent being questions The Project of Automation itself.80.5The Project of Automation, tied to transhumanism, tied to the entire AI and robotics movement, the desire to advance our technological frontier by creating beings like ourselves, who can do anything we can.Why do this in the first place? What is it all for? Automation will cause societal chaos on our continuous upward trajectory. But humans always have and always will yearn for “more”, to be “greater”. The automation of research and of labor are necessary criteria to continue on that path.81Why? Because the only way to achieve scalable scientific growth at increasingly faster rates is to automate the growth process itself.

Most of all, humans don’t really know what to want. We don’t know what questions should even be asked yet. We don't know if we're capable of understanding the universe at our current levels of intelligence and consciousness, but we want to get there. We are on the journey of being intelligent enough to even comprehend the universe.82Just as a goldfish couldn’t comprehend the universe in its entirety, we also lack in intelligence and in other key dimensions. This is Elon’s whole quest.

The Branches of 2045

Beyond 2045, the future branches out exponentially into many possibilities.83The outcomes below are hard to pinpoint time horizons for. And the best heuristic: the future never really pans out how you think.

Automation fundamentally changes the social contract. People matter only for their world-class skills, their personalities, their money, or who they know. Maybe people continue living regular lives as they always have, despite technology. Naturalist nations arise, living in peace independent of technology, in the environments we evolved for.84Even today, we don’t live in environments we evolved for; our base drives have been hyperoptimized to the point of detriment, and evolutionary success is sometimes independent of our instincts. Many have abandoned the initial optimization of life itself: survival and reproduction.

We have finally reached the era of android dreams.85From “Do Androids Dream of Electric Sheep?”. I think so.There are now androids, real beings who seem identical to humans and somehow more. They are engineered to be perfect.

Robots allow superintelligence to interface with the real world, running experiments and improving its model of reality. They are critical to superintelligence.86Taking the Schmidhuber/Sutton/Deutsch view of intelligence: propose frameworks of reality, test them against reality, update world model to compress reality more efficiently based on the results.Efforts to improve AI now allow it to propose theories of reality and test them in reality. It continues to grow. The superintelligence of the future will grow by generating frameworks of reality, testing them against the real world (enabled by robots), gathering more data, and updating its model of reality.

Alignment is easy because models are trained on human data, but hard because we’re not intelligent enough to know what we truly want.87Alignment will be solved by those working on virtual AI, not robotics, for the same reason as long-term memory: they are at the frontier, and roboticists have more to deal with in the real world.As new intelligence grows, it would actually know what to want better than us. It will be in our best interest to cede decision-making. It likely will not “turn on us”, given the deeply embedded values in data and RLHF of the complex optimization of prioritizing humanity, but it does become the decision maker for society.88The current paradigm of LLMs has given empirically hopeful evidence as to the ease of baking values into models. The problem of deciding those values remains.

Some people want to control their destiny and look to merging with machines through either brain-computer interfaces or uploading minds to compute.89Either in the gradual cybernetics method, or the Pantheon mind upload method (or Permutation City). This is the most commonly held method of “taking back our independence from AI”.Perhaps the Fermi paradox (why aren’t there any aliens?) arises because once cultures reach a 2045-level of technology, they choose to reside in fully constructed realities contained in computers. Why travel to other planets in our reality, when we can design entirely new realities and societies in our compute?90It seems space travel is much harder than creating new worlds.

After all sci-fi problems are solved, and manufacturing is hardly a bottleneck, some people live out the long-held dream of expanding to the stars. These people stay in our reality because they value being at their perceived “layer 1”.91Layer 1 here means the layer of reality we find ourselves in, if the recursive worlds theory that simulated worlds make further simulated worlds is true. It’s hard to say anything about any other hypothetical base layers above ours.We expand to the stars, build Dyson spheres, and create galactic societies.

The more important consequence of greater intelligence is the opportunity to become entirely new beings. We started as bacteria, then eukaryotic cells, then became multicellular, evolved into organisms with brains, became homo sapiens, and unlocked the ability to evolve faster than natural selection through technology.92Humans still evolve under modern (albeit different from our original) selection pressures, but this process takes millennia. Artificial growth happens over the decades, especially because changes can be informed by intelligence.We give rise to the next evolution on the long, ever-changing tree of life.92.5This means, right now, we live at the end of humanity’s childhood, and the dawn of something new.They have entirely new structures of society, ways of thinking about the universe, drives, and experiences.93Akin to the difference between the complexity of human society and that of apes. But larger as time goes on. Old humanity would look to us and think we lived like aliens.They may either choose to live in constructed realities or our base reality.94It seems, empirically, based on the one sample we have

Inevitabilities and The Now

No matter the scale, with time, intelligence inevitably grows.94It seems, empirically, based on the one sample we haveWe access more energy, become more intelligent, and expand our sphere of influence exponentially. What is the end of this exponential curve? The bet of the Project of Automation is that greater intelligence will help us find out. In all things, though, for us, the direction and the journey are worth attending to.95The Hero’s Journey is better than the journey of the person who stays inside and does nothing. So the goal, the direction, matters for the impact it has on the journey. But what matters is the journey.

How do we prioritize humanity among all this chaos? Ensure love for your fellow man triumphs above all.96This is, even empirically, an eternally effective principle.Care about the babies and mothers, the next generation. Keep society dynamic and changing, on an upward path. Focus on improving the lives of everyone, not just a small group of people. Do not sacrifice anyone for the greater good. People are ends in themselves.97Naturally, these are easier said as principles than implemented as decisions

Just as we look back on our ancestors, they will look at our age as the last time humans could still live autonomously and freely in our natural order.98To be precise, more like the 70s or 80s. We’ve already crossed the natural order threshold.What matters are the present and the people who make your life worth living. Work to make the future better for humanity’s children.

Inspired by Leopold Aschenbrenner’s Situational Awareness and Daniel Kokotajlo’s AI 2027. While I used to work at Tesla Optimus and Dyna Robotics and am infinitely grateful to my mentors, the above is all based on publicly available information and my own research. Thank you to Adam Majmudar, Ahalya Nava, and Sourish Jasti for their feedback.