Top Comments | Hacker News


There are two basic kinds of distillation: 1) the massive [and dumb] method where you ask a question and use the answer as reinforcement (Black Box), and 2) more targeted distillation where you use one model to directly inform/train/guide another model (RLAIF).

The latter is basically fine-tuning the model with direction from another model. Thousands of businesses do this every day to fine-tune. This is almost certainly what the Chinese labs are doing, since it has a much better effect on the end result than just getting simple answers to simple questions.
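To make the two kinds concrete, here is a minimal toy sketch, assuming nothing about how any lab actually does it. `toy_teacher` and `toy_judge` are hypothetical stand-ins for real models, and the function names are invented for illustration. The first function only sees the teacher's answers (black box); the second has a judge model rank the student's own candidates, which is the AI-feedback idea behind RLAIF.

```python
from typing import Callable, Dict, List


def collect_black_box_pairs(
    teacher: Callable[[str], str], prompts: List[str]
) -> List[Dict[str, str]]:
    """Kind 1: ask the teacher questions and keep (prompt, answer) pairs."""
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def collect_preferences(
    judge: Callable[[str, str], float], student_samples: Dict[str, List[str]]
) -> List[Dict[str, str]]:
    """Kind 2: a judge model scores the student's candidates (AI feedback)."""
    prefs = []
    for prompt, candidates in student_samples.items():
        # Highest-scoring candidate first, lowest-scoring last.
        ranked = sorted(candidates, key=lambda c: judge(prompt, c), reverse=True)
        prefs.append(
            {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}
        )
    return prefs


# Hypothetical stand-ins so the sketch runs without any real model.
def toy_teacher(prompt: str) -> str:
    return prompt.upper()


def toy_judge(prompt: str, candidate: str) -> float:
    # Assumption for the toy: longer answers score higher.
    return float(len(candidate))


sft_data = collect_black_box_pairs(toy_teacher, ["hello", "world"])
pref_data = collect_preferences(toy_judge, {"hello": ["hi", "hello there"]})
```

The first dataset would feed plain supervised fine-tuning; the second would feed a preference-based method, which is why it can steer the student more directly than raw question/answer pairs.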

These complaints of distillation are inflating the problem to make it sound worse than it is, because they want the USG to block/ban Chinese model providers as protectionism. They have already called for more export controls on chips (which is funny because DeepSeek v4 was designed to run on Huawei chips and now the other Chinese providers are following suit). But they can't come right out and say that, so their claim is that they're asking for more export controls because distilled models might not be as safe as their own. But if you show them a jailbreak of their model that bypasses their safety, they'll tell you that any model can eventually be jailbroken so don't worry about safety.


Here's what is happening:

Chinese resellers are offering Claude tokens at 70-90% below official Anthropic API prices. They achieve this by reselling capacity from pooled Claude Max accounts, payments fraud, and also reselling the model output & reasoning chains to various Chinese labs. They are subsidizing model access in exchange for user logs and reasoning traces, which they then sell as training data, allowing them to operate below cost.

Claude and ChatGPT are both blocked in China. You need to use a VPN to access either, and you can't pay with a Chinese bank card. So most people who want access to Claude buy access via a reseller. It's the easiest and cheapest way to access Anthropic models in China.

These resellers operate tens of thousands of bot accounts, which is also why Anthropic introduced identity verification, to slow down the onslaught of bots.

Here's one token reseller offering Opus 4.8 at a 93% discount off official API rates: https://yunwu.ai/pricing?provider=Anthropic

This is one reason why DeepSeek & GLM are priced so cheaply: they are competing with impossibly low token prices in China. They have to keep prices low in order for people to use them.

I shared this story a few months back, but it never got any traction. It explains the token resale economy in China and is an excellent read: https://www.chinatalk.media/p/how-to-buy-cheap-claude-tokens...


I am on the Vesuvius Challenge team that did the segmentation, unwrapping, and ink detection, so feel free to ask any questions.


“Distillation attack”? Are we joking here?

If anything these models should be compelled to be public since they have been trained off public data. What an absurd overreach to call this an attack.

It’s clear they are scapegoating national security and China at this point to build an anti-competitive moat.

I generally really like Anthropic’s work and models but stuff like this scares me for the future. We are positioning these companies to have too much power. The public’s life is getting worse while these companies consolidate power using data they stole from the public.


The hypocrisy of Anthropic complaining about "illicitly extracting its Claude AI model capabilities" and supporting the White House's accusation of China "stealing U.S. AI labs' intellectual property on an industrial scale" is hilarious.

Anthropic, OpenAI, Google, Microsoft, et al trained their models by ignoring the rights of copyright holders when harvesting whatever content they could. Now one of them is crying foul for another entity doing exactly what they all did?

Hilarious.


Reminds me a bit of the anecdote of Steve Jobs complaining about people ripping off the Mac GUI, in the mid to late 1980s, when he gave no public acknowledgement to the work done by Xerox on the Alto and Star operating system.

"you're trying to rip off what I've already ripped off!"

Crawl the whole Internet to build a gargantuan sized LLM and then complain you're being copied...


> Developed from design to production in nine months, accelerated by OpenAI’s models

> the use of OpenAI models to accelerate parts of the design and optimization process.

I wish there was more about this. As is I kind of have to assume that this is just meaningless marketing, like saying development was accelerated by Microsoft Office or their 5k LG Ultrafine 40-inch monitors.

Like, if this was as big a deal as it kind of vaguely implies, they would be making a bigger deal of it, right?


Interestingly, there were no consequences for the execs that made this 'mistake'. There seems to be almost unlimited cover for execs cargo culting on using AI as a pretext for layoffs. If it doesn't implode almost immediately, they get massive bonuses; if it blows up in their face, oh well, they had the courage to 'take a bold strategic decision'.

In other words, they don't really have a plan, but they are happy playing with people's lives via layoffs, since it's the 'in' thing to do. The incentives are huge on the upside and zero on the downside for them.


I'll just leave it here: "Anthropic's downloading of over seven million books from pirate sites like LibGen constituted infringement, the judge ruled, rejecting Anthropic's 'research purpose' defense: 'You can't just bless yourself by saying I have a research purpose and, therefore, go and take any textbook you want.'"

https://www.joneswalker.com/en/insights/blogs/ai-law-blog/wh...


> I pushed everyone too hard. I didn’t appreciate how maturing companies need more slack, and that running people at startup intensity constantly will wear them out.

Sounds like wisdom many companies might consider...


> These complaints of distillation are inflating the problem to make it sound worse than it is

Unfortunately, the Reuters piece itself is complicit in this dramatization. The lede paragraph parrots Anthropic's talking point that distillation is an "attack", without using quotes that would alert the reader that this framing is a corporate talking point. Distillation is NOT an attack.


Meta continuing to be the most shameless (and shameful to work for) company around.

I can't think of a single product of theirs that hasn't made the world a markedly worse place. Even their recent hardware foray is managing to find a way to ruin trust in everyday interactions (guys filming drunk girls with Ray Bans, surveillance, etc.).

Have several friends at the more 'thoughtful' frontier labs that bin meta applicants straight to the trash for this very reason.


Chip CEO here. It really depends on what "design" or "production" means. Does "design" mean that the design was complete? Does "production" mean the beginning of production, i.e. tapeout? If measuring from RTL-freeze to tapeout, this is a fairly typical (even somewhat unimpressive) timeline (accounting for some unexpected issues) for a large, complex 3nm chip. If measuring from concept (no RTL at all, block diagram of architecture) to tapeout, this is an amazing timeline. The truth is probably somewhere in between. A more concrete statement would use actual technical milestones and gates.


This is great for competition! Chinese vendors offering a cheaper solution = what economics told me the free market was all about.

I also learnt that Anthropic should get better at what they do if they want to compete. If not, somebody else will win.

Or does this not apply to huge US corporations any more?


Exactly what everyone said when Patriot Act was passed and renewed repeatedly.

America permanently traded away basic freedoms for the bogus promise of safety in the shadow of fear. And the Supreme Court was too scared to stop it despite its obvious constitutional problems. Crying eagle photos in chain-emails were sufficient propaganda to keep it in place.


> Which leaves the only real question. Why 25,000 at all? It is my company and my risk. If I want to start with nothing, that is my call, not a toll the state collects before it will let me try. And the cheap door has a price of its own: to some clients, “UG” reads as “not serious,” and they would rather deal with a GmbH. The structure built to let me in quietly marks me for using it.

The 25,000 is there to make sure you can cover some liability. If you really wanted "your company and your risk", you could have used the "simplest setup", where you are liable with your own money, but if you think about it that way, it doesn't sound so appealing, does it? So of course the UG which does not (yet) have 25,000 in the bank sounds less serious than the GmbH that has 25,000 in the bank. A company that starts with nothing wouldn't be a GmbH (limited liability company), it would be a GoH (company without liability), and there's a good reason why those don't exist...


Anyone else here enjoy living in the future? Look at us, we get AI megacorporations ruling the world and bestowing us with the power to use their servers for just $20-200/month. It's practically charity, and all we had to give up for it is all consumer hardware, the quality of the internet and our own jobs. I love it here!


Some unc perspective: I paid ~$6,000 in inflation-adjusted dollars for a computer in 1996. Today, I can get the same power in a $6 single board computer. A powerful modern mini PC starts at ~$600.

However painful these price hikes are, and they are painful, it is worth remembering that computing has become incredibly ubiquitous and cheap.


Let's reflect on Aristocreon, in about 200 BC, putting their thoughts down on a scroll. They would be aware that the scroll might be kept in a library for some time. Maybe they could have imagined it surviving for 300 years. But they never would have imagined that in 300 years a volcano might destroy the scroll, but in some way preserve it. And then that nearly two thousand years later future humans with machines made of materials unimaginable to Aristocreon, but related distantly to sand and lightning, would be able to read the scroll again and instantly transmit it to nearly the whole planet, a planet with many times more humans than existed in their time. (And speaking of 'planet', in Aristocreon's time, people had fairly recently been able to show that the world was spherical, but much of it was still unknown.)

Do we have better imaginations? Can our sci-fi writers come up with something equivalent that is as dizzyingly far from what we know now, as now is from what Aristocreon knew?


OAuth and enterprise auth have to be the worst things ever made; they might be the most confusing and frustrating part of dealing with the cloud. Even the AI tools took a year to just get basic OAuth working on headless systems without assuming you could open a browser. If they're going to go down the auth rabbit hole with RBAC/IAM/Workload identities?/service accounts and all the trash the big cloud providers have, I just hope to god they leave in the simple shit for personal use. I just want a damn API key. I keep it a secret, revoke it if necessary, and don't need 10000 layers of auth bullshit tangled up in every layer of every platform.


I think you meant a quote attributed to Bill Gates:

"Well, Steve, I think there's more than one way of looking at it. I think it's more like we both had this rich neighbor named Xerox and I broke into his house to steal the TV set and found out that you had already stolen it."


The AI companies seem to take the viewpoint that everything on the internet is free, except their stuff. It's okay to hammer some random website with AI crawlers, ignoring robots.txt, and causing bandwidth costs to skyrocket. But if you cost an AI provider money with your data acquisition practices, well, that's just clearly unacceptable.


This is a bit ironic: Anthropic complaining about a competitor using Claude data to build its own product when Anthropic basically used all of human knowledge production to build Claude. I don't think they paid every magazine, author, journalist, etc ...

This is almost standard practice in any competitive industry anyways. Disassemble your competitor's product, study it and try to reproduce / improve.


> logic technology can extend for the first time below the 1 nm node, advancing the era of angstrom-level scaling, where dimensions approach the size of individual atoms. While transistor nodes now refer to a generation of manufacturing technology versus an exact physical dimension, IBM’s 0.7 nm technology—also referred to as 7 angstroms—demonstrates how continued scaling remains possible.

Continuing the well established trend of making bold claims about physical dimensions that have nothing to do with any of the structures in the chip, and the name scales better than the tech.

What they actually deliver is a "nanostack architecture" built with ~5nm features that according to them is comparable to a hypothetical real sub-1nm chip.

It's an impressive achievement nonetheless but it looks like the industry has a few too many marketers.



How does anyone seriously trust LastPass anymore? Years ago, I was working for a company handling bank data. They were using LP immediately following a previous LP security incident and had no plans to migrate away.


These are the price changes mentioned in the article:

Macs

  MacBook Neo: $699 (up from $599)
  13-inch MacBook Air: $1,299 (up from $1,099)
  15-inch MacBook Air: $1,499 (up from $1,299)
  M5 MacBook Pro: $1,999 (up from $1,699)
  M5 Pro MacBook Pro: $2,499 (up from $2,199)
  M5 Max MacBook Pro: $4,099 (up from $3,599)
  iMac: $1,499 (up from $1,299)
  M4 Max Mac Studio: $2,499 (up from $1,999)
  M3 Ultra Mac Studio: $5,299 (up from $3,999)

iPads

  iPad: $449 (up from $349)
  11-inch iPad Air: $749 (up from $599)
  13-inch iPad Air: $949 (up from $749)
  11-inch iPad Pro: $1,199 (up from $999)
  13-inch iPad Pro: $1,499 (up from $1,299)
  iPad mini: $599 (up from $499)

More products:

  Apple TV 4K: $199 (up from $129)
  HomePod: $349 (up from $299)
  HomePod mini: $129 (up from $99)
  Vision Pro: $3,699 (up from $3,499)
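
For scale, the dollar figures above can be turned into percentage increases. A minimal sketch using a handful of the listed prices (the `PRICES` table and `pct_increase` helper are illustrative names, not something from the article):

```python
# (old price, new price) in USD, copied from a subset of the list above.
PRICES = {
    "MacBook Neo": (599, 699),
    "M3 Ultra Mac Studio": (3999, 5299),
    "iPad": (349, 449),
    "Apple TV 4K": (129, 199),
    "Vision Pro": (3499, 3699),
}


def pct_increase(old: float, new: float) -> float:
    """Percentage increase going from the old price to the new price."""
    return (new - old) / old * 100


# Print the subset sorted from the largest relative increase to the smallest.
increases = {name: pct_increase(old, new) for name, (old, new) in PRICES.items()}
for name, pct in sorted(increases.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: +{pct:.1f}%")

largest = max(increases, key=increases.get)
```

Within this subset the cheapest products take the largest relative hit, since a flat increase of $70-100 is a much bigger share of a $129 or $349 price than $200 is of $3,499.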

Well that's a strange way of expressing competitiveness when Hetzner is still vastly cheaper than those 3 cloud providers, despite those cost increases.


Just yesterday I saw people saying that Apple wouldn't increase prices until the next refresh.

And I agreed! So… holy shit. I think we're going to see even further price increases across the industry. There already were a ton, but it can always get worse, of course.

Thank you, OpenAI. What would we have done without your attempts at monopolizing/destroying the memory market?


"Information wants to be free"

Anthropic profited from training its models on all kinds of copyrighted information, live by the sword, die by the sword...

Their model weights, training data, training methods, etc are all going to leak to China over time.

Nobody on a site named _Hacker_ news should be all that upset about this.