February 7, 2026
Being a card-carrying member of Team Words Mean Things is challenging on the best of These Days, but particularly when multibillion-dollar companies publish weird press releases disguised as blog posts about how their new “agent team workflow” helped them build a C compiler “from scratch” in just two weeks.
“When there are many distinct failing tests, parallelization is trivial: each agent picks a different failing test to work on. After the test suite reached a 99% pass rate, each agent worked on getting a different small open-source project (e.g., SQLite, Redis, libjpeg, MQuickJS, Lua) to compile.
But when agents started to compile the Linux kernel, they got stuck. Unlike a test suite with hundreds of independent tests, compiling the Linux kernel is one giant task. Every agent would hit the same bug, fix that bug, and then overwrite each other’s changes. Having 16 agents running didn’t help because each was stuck solving the same task.
The fix was to use GCC as an online known-good compiler oracle to compare against.”
To sum up: “by using sixteen copies of our massive language model, whose training data includes every GCC ever released and all the public code in the world, plus a warehouse full of GPUs, and having it autocorrect itself by testing its output against GCC, we managed to make a different C compiler that self-reports that it mostly works. This took two weeks, cost $20,000 and gosh I have so many feelings.”
Now, a reasonable person might have questions! Questions like “what do we think ‘clean room’ or ‘scratch’ means?” Or, say: “testing…?” Most of mine are florid iterations on a theme of “what do we think we’re even doing here”. But even with those doubts in my head, I have excellent, perhaps even revolutionary news.
With the help of recent research in algorithmic analysis here at mHoye Advanced Research, we have made a breakthrough theoretical advancement that lets us outperform warehouses full of specialized GPUs by a dozen orders of magnitude, achieving 100% parity using only a single core of an air-cooled, commodity Cortex-A73 CPU. Because we were among the first to recognize the crucial role of curated training data, the mHoye approach completes the proposed task in less than 0.01% of the time, with aggregate energy-cost economies on the order of 99.9999% and net zero token costs. By training the “cp(1)” command on only a single, carefully cloned copy of the GCC source and then compiling the output of that process with GCC, we have produced an application – which I’m calling “mhoyecc”, or as I’ve taken to calling it, mhoye plus cc – that is self-hosting, fully bootstrapped, and passes 100% of GCC’s tests.
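For the methodologically curious, the entire mhoyecc training and release pipeline fits in a few lines of shell. This is a minimal sketch, assuming the GCC sources have already been cloned into ./gcc-source and that the usual build prerequisites (GMP, MPFR, MPC) are installed; every path and directory name here is illustrative rather than canonical.

    # One full training epoch over a carefully curated corpus.
    cp -r ./gcc-source ./mhoyecc-src                    # training: cp(1) ingests the data
    mkdir -p mhoyecc-build && cd mhoyecc-build
    ../mhoyecc-src/configure --prefix="$HOME/mhoyecc" --disable-multilib
    make -j"$(nproc)"                                   # compile the training output with GCC
    make install                                        # inference

(A native GCC build performs its usual three-stage bootstrap by default, which is where the “fully bootstrapped” claim comes from.)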
We are currently seeking funding for mHoye Advanced Research to productize these findings and bring them to market. Contact us directly if you’d like to be part of a future powered by these revolutionary technologies!
However, while this research represents a tectonic advancement in the state of the art, it also presents profound philosophical and epistemic risks; today, this technology could empower anyone, both good actors and bad, to copy untold amounts of data within or between machines. But – more ominously – it’s clear that once you’ve fed cp(1) a sufficient quantity of human art or argument, it will ingest and repeat all of that art, each of those arguments, with digital perfection. And, alarmingly, cp(1)’s error rates are far below those of our smartest and most diligent people, beyond even those of large, well-run teams.
With that in mind, we need to consider the possibility that cp(1) may soon become – and indeed may already be – sentient and self-aware. At first glance this seems like a remote possibility, but when you multiply even the smallest sliver of probability against the fact that cp(1) can copy an effectively infinite amount of data, simple math makes it clear that we are obligated to take this risk seriously. Even today, cp(1) is fully capable of perfectly copying itself; if these risks are left unchecked and unmanaged, it is inevitable that cp(1) will have copied enough data to begin copying the other resources available to it, then humans, and ultimately humanity itself.
It’s self-evident that we have no systems, and certainly no ecosystems, capable of managing or sustaining the arbitrary number of copied beings we now know to be on the horizon. Worse, cp(1) – or, as I have taken to calling it, “mhoye’s basilisk” – is only one of many similar utilities; other lesser-known so-called “frontier utilities” such as rsync(1) encompass an advanced superset of cp(1)’s capabilities, theoretically capable of remote, “compressed” synchronization of other utilities, exacerbating already extreme risks.
With that in mind, I am requesting – in fact, I am insisting on – an audience with the General Assembly of the United Nations as soon as possible, to raise awareness and secure funding to address these grave threats facing humanity that are already silently embedded in so many of the systems around us. We have to act now; too late may arrive sooner than we can imagine.
I look forward to your urgent reply.
Sincerely,
– Mike Hoye,
Founder, Principal Scientician and Chief Bash_Builtins Philosophist
mHoye Advanced Research & Semiotic Enterprise Information Technologies.