Alibaba's new AI chip: Key specifications comparable to H20
news.futunn.com
Note also that today China has told its tech companies to cancel any NVIDIA AI chip orders and not to order any more:
https://www.ft.com/content/12adf92d-3e34-428a-8d61-c91695119...
200% tariff incoming :-)
"Speaker Johnson says China is straining U.S. relations with Nvidia chip ban" - https://www.cnbc.com/2025/09/17/china-us-nvidia-chip-ban.htm...
Translation: "We are angry with China that they won't let the US undermine itself, and sell its strategic advantages to them..."
> "Speaker Johnson says China is straining U.S. relations with Nvidia chip ban"
Oh, the irony.
Is this supposed to signal confidence in the chips already available on China's domestic chip market, or is it primarily aimed at boosting that market to make it ready?
Yes. :)
How big a deal is it to be on the cutting edge with this? Given that models seem to be flattening out because they can't get any more data, the answer is "not as much as you would think".
Consequently, a generation or 2 behind is annoying, but not fatal. In addition, if you pump the memory up, you can paper over a lot of performance loss. Look at how many people bought amped up Macs because the unified memory was large even though the processing units were underpowered relative to NVIDIA or AMD.
The biggest problem is software. And China has a lot of people to throw at software. The entire RISC-V ecosystem basically only exists because Chinese grad students have been porting everything in the universe over to it.
So, the signal is to everybody around this that the Chinese government is going to pump money at this. And that's a big deal.
People always seem to forget that Moore's Law is a self-fulfilling prophecy, but doesn't just happen out of thin air. It happens because a lot of companies pump a lot of money at the engineering because falling off the semiconductor hamster wheel is death. The US started the domestic hamster wheel with things like VHSIC. TSMC was a direct result of the government pumping money at it. China can absolutely kickstart this for themselves if the money goes where it should.
I'm really torn about this. On the one hand, I hate what China does on many, many political fronts. On the other hand, tech monopolies are pillaging us all and, with no anti-trust action anywhere in the West, the only way to thwart them seems to be by China coming along and ripping them apart.
Arguably, the leading RISC-V IP is from US firms like SiFive; it has also caught major traction with NVDA in custom products, and US Govt for various industries, and even Redhat is now supporting RH on Risc-V
Microchip Inc partnered w US Govt on the aerospace angle and funded Canonical for linux ports. Their Polarfires and now Euro aerospace like Gaisler are heading in the same direction. US Govt/DARPA and others have been funding risc-v ports for years, to include mainly automated porting.
There are big differences between lowend profile-challenged SBCs and the work of NVDA, Microchip Inc, and the US Govt in the much more highend GPU related, and safety critical industries.
With Heavyweights like IBM/Redhat now on risc-v joining canonical and others, the SW side is definitely improving
TSMC btw, has always been about labor arbitrage
I think those are both the case: they’re telling Chinese companies to invest in domestic hardware–implicitly also saying things like being prepared to stop using CUDA–and that means the hardware vendors know not to skimp on getting there (a nicer version of burning the landing boats on the beach).
It’s also an interesting signal to the rest of the world that they’re going to be an option. American tech companies should be looking at what BYD is doing to Tesla, but they’re also dealing with a change in government to be more like Chinese levels of control but with less maturity.
My cynical view is that it's mostly trade war and nationalism. If you follow the official PRC position, the chips are already made in China because TW is CN... Practically, buying TW chips is boosting its economy and hence funding its military, so from that perspective it makes sense. From a long-term development perspective this will absolutely boost the national market... however that will take an insane amount of time. If you buy into the "AI is going to change everything" hype, this move is a huge handicap and hence a boon to external economies. And I am probably missing a ton of viewpoints... politics meh
if you don't know what you're talking about you don't have to say anything, you know
“grey market” smugglers gonna keep working on it
Until now it was perfectly legal to buy nVidia chips in China. It was the US that was blocking export.
Which way though? From the USA to China or China to the USA?
Are they allowed to rent them from server farms, datacenters, etc. located outside China that are able to procure them?
"Are they allowed to rent them from server farms, datacenters, etc. located outside China that are able to procure them?"
alicloud has many clusters outside China, so they probably can, because many countries friendly with China host them
but it would be the same as with the US power play: they only permit those they accept
That is going to really slow LLM development in China. But more GPUs for everyone else!
I mean, progress is already getting slow in the LLM development space, and their qwen models are, well, good enough for the time being. Meanwhile, it's good for the world that they are working on their own chips; that way nVidia will have to stop being comfortable.
This is a step in a good direction for everyone except nvidia and its Chinese distribution network
They have more electric power (also cheaper), more data centers. I bet AI development will be slower elsewhere.
Yup - at some level of scale it's very much an infrastructure game.
No amount of trade war politics will make up for a lack in infrastructure investment.
In 24 months, US hyperscalers will be training models on GPUs/XPUs with 16A process technology and HBM4E. The gap between the raw processing power of US and Chinese AI hardware will be widening.
I wish I had your confidence being able to forecast 24 months ahead. In 24 months TSMC could be a smoking crater.
There won’t be 16A manufacturing here in the USA.
Probably ever.
We live in extremely dangerous uncertain times.
Any forecast that long is worthless.
You seem extremely pessimistic. TSMC SoW-X will put an entire wafer worth of chips on a single substrate and be incredibly fast.
https://wccftech.com/tsmc-cutting-edge-sow-x-packaging-set-f...
Why would you say 16A would never be manufactured in the United States? It is of course TSMC’s plan of record.
Can someone ELI5 this to me? Nvidia has the market cap of a medium-sized country precisely because apparently (?) no one else can make chips like them. Great tech, hard to manufacture, etc - Intel and AMD are nowhere to be seen. And I can imagine it's very tricky business!
China, admittedly full of smart and hard working people, then just wakes up one day and in a few years covers the entire gap, to within some small error?
How is this consistent? Either:
- The Chinese GPUs are not that good after all
- Nvidia doesn't have any magical secret sauce, and China could easily catch up
- Nvidia IP is real but Chinese people are so smart they can overcome decades of R&D advantage in just a few years
- It's all stolen IP
To be clear, my default guess isn't that it is stolen IP, rather I can't make sense of it. NVDA is valued near infinity, then China just turns around and produces their flagship product without too much sweat..?
> because apparently (?) no one else can make chips like them
No, that's not really why. It is because nobody else has their _ecosystem_; they have a lot of soft lock-in.
This isn’t just an nvidia thing. Why was Intel so dominant for decades? Largely not due to secret magic technology, but due to _ecosystem_. A PPC601 was substantially faster than a pentium, but of little use to you if your whole ecosystem was x86, say. Now nvidia’s ecosystem advantage isn’t as strong as Intel’s was, but it’s not nothing, either.
(Eventually, even Intel itself was unable to deal with this; Itanium failed miserably, largely due not to external competition but due to competition with the x86, though it did have other issues.)
It’s also notable that nvidia’s adventures in markets where someone _else_ has the ecosystem advantage have been less successful. In particular, see their attempts to break into mobile chip land; realistically, it was easier for most OEMs just to use Qualcomm.
If what you say is true, isn't one of Deepseek's big contributions that they wrote a custom lower-level GPU-cluster-to-GPU-cluster communication protocol instead of using the nvidia soft ecosystem? And that it is open sourced?
Well, they wrote it _for_ Nvidia stuff, though; if anything that was a contribution to the Nvidia ecosystem! Though it does show a willingness to go outside the _established_ Nvidia ecosystem.
I'm always a little surprised that Nvidia is _so_ highly valued, because it seems inevitable to me that there is a tipping point where big companies will either make their own chips (see Google) or take the hit and build their own giant clusters of AMD or Huawei or whoever chips, and that knowledge will leak out, and ultimately there will be alternatives.
Nvidia to me feels a bit like dot-com era Sun. For a while, if you wanted to do internet stuff, you pretty much _had_ to buy Sun servers; the whole ecosystem was kinda built around Sun. Sun's hardware was expensive, but you could just order a bunch of it, shove it in racks, and it worked and came with good tooling. Admins knew how to run large installations of Sun machines. You could in theory use cheaper x86 machines running Linux or BSD, but no-one really knew how to do that at scale. And then, as the internet companies got big, they started doing their own thing (usually Linux-based), building up administration tooling and expertise, and by the early noughties Linux/Apache was the default and Sun was increasingly irrelevant.
>In particular, see their attempts to break into mobile chip land;
I wouldn't exactly say it was a failure, all those chips ended up being used in the Nintendo Switch
If you are aiming to have your chips in a decent portion of all mid/high-end phones sold, which they appear to have been aiming for, then the Nintendo Switch isn't really that much of a consolation prize. The Switch had very high sales... for a console, with 150 million over 7 years. Smartphone sales peaked at 1.5 billion units a year. You'd probably prefer to be Qualcomm than Nvidia in this particular market segment, all things considered.
Yearly global smartphone sales are around 300 million.
... Where are you getting that? The iPhone _alone_ sells about 200 million units a year.
There are almost 5 billion smartphone users; sales of 300 million a year would imply that those are only replaced every 16 years, which is obviously absurd.
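A quick sanity check of that replacement-cycle arithmetic, as a rough sketch. It assumes a steady-state market in which every sale replaces an existing phone, and it uses only the figures quoted in this thread; the function name is made up for illustration.

```python
def implied_replacement_years(installed_base: float, units_per_year: float) -> float:
    """Years needed to replace every phone once, assuming a steady-state
    market where each sale replaces an existing device (an assumption)."""
    if units_per_year <= 0:
        raise ValueError("units_per_year must be positive")
    return installed_base / units_per_year


USERS = 5e9  # "almost 5 billion smartphone users" (figure from the thread)

# 300 million units/year: the disputed figure
slow_cycle = implied_replacement_years(USERS, 300e6)
# 1.5 billion units/year: the peak-sales figure quoted earlier in the thread
fast_cycle = implied_replacement_years(USERS, 1.5e9)

print(f"{slow_cycle:.1f} years at 300M/yr, {fast_cycle:.1f} years at 1.5B/yr")
```

At 300 million a year the implied cycle is well over a decade, which is why that number only makes sense as a quarterly figure.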
Oh, that was quarterly: https://canalys.com/newsroom/global-smartphone-market-q2-202...
On a separate note, speaking of the average lifespan of a phone, I'm fairly sure that with how expensive they're becoming, smartphone lifespans are increasing. Especially with:
* hardware performance largely plateauing (not in the absolute sense, that of "this phone can do most of what I need")
* the EU pushing for easy battery and screen replacement and also for 7 years of OS updates
* the vast majority of phones having cases to protect against physical damage
Yeah, peak sales per year were a few years back. People are definitely keeping them longer than they used to.
I thought those came from the automotive sector.
Nah, they also managed to fob them off on the auto sector to some extent, but they were originally envisaged as a mobile chip.
It’s several factors and all of your alternatives are true to some degree:
1. An h20 is about 1.5 generations behind Blackwell. This chip looks closer to about 2 generations behind top end Blackwell chips. So ~5ish years behind is not as impressive especially since EUV is likely going to be a major obstacle to catching up which China has no capacity for
2. Nvidia continues to dominate on the software side. Amd chips have been competitive on paper for a while and have had limited uptake. Now Chinese government mandates could obviously correct this after substantial investment in the software stack — but this is probably several years behind.
3. China has poured trillions of dollars into its academic system and graduates more than 3x the number of electrical engineers the US does. The US immigration system has also been training Chinese students but having a much more limited work visa program has transferred a lot of knowledge back without even touching IP issues
4. Of course ip theft covers some of it
They also have some insane power generation capability - doesn't seem that far fetched that they just build a shitload of slower chips and eat the costs of lower power efficiency.
> China has poured trillions of dollars into its academic system and graduates more than 3x the number of electrical engineers the US does.
This metric is not as important as it seems when they have ~5x the population.
It is. The outcome rate will not grow by the relative number of electrical engineers to population but by the absolute number of the engineers.
In theory, but I'm not sure that's true in practice. There are plenty of mundane, non-groundbreaking tasks that will likely be done by those electrical engineers and the more people, the more space, the more tasks are to be done. And not to mention more engineers does not equal better engineers. And the types to work on these sorts of projects are going to be the best engineers, not the "okay" ones.
It's certainly non-linear.
The more engineers you can sample from (in absolute number), the better (in absolute goodness, whatever that is) the top, say, 500 of them are going to be.
That's assuming top-tier engineers are a fixed percent of graduates. That's not true and has never been.
Does 5x the number of math graduates increase the number of people with ability like Terence Tao? Or even meaningfully increase the number of top tier mathematicians? It really doesn't. Same with any other science or art. There is a human factor involved.
Suppose there's only one Terence Tao. Then sampling from 5x the number of people increases the probability he's in the sample (by about 5x).
Suppose there's more than one. Then sampling from 5x the number of people increases the average number of him that you get (by about 5x).
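A small sketch of the "about 5x" claim under one simple model, which is an assumption and not something the thread specifies: each person independently has the same tiny probability `p` of being that kind of exceptional talent. The values of `p` and `n` are hypothetical.

```python
import math


def prob_at_least_one(n: int, p: float) -> float:
    """P(at least one exceptional person among n), assuming each person is
    exceptional independently with probability p (a modelling assumption)."""
    # 1 - (1 - p)**n, computed in a numerically stable way for very small p
    return -math.expm1(n * math.log1p(-p))


def expected_count(n: int, p: float) -> float:
    """Expected number of exceptional people among n under the same model."""
    return n * p


p = 1e-6    # hypothetical rarity of the exceptional talent
n = 10_000  # hypothetical baseline sample size

prob_ratio = prob_at_least_one(5 * n, p) / prob_at_least_one(n, p)
count_ratio = expected_count(5 * n, p) / expected_count(n, p)

print(f"probability ratio: {prob_ratio:.2f}, expected-count ratio: {count_ratio:.2f}")
```

Under this model the expected count scales by exactly 5x, while the probability of getting at least one scales by slightly less than 5x, and the gap widens as `n * p` approaches 1. Whether `p` really is the same across different graduate pools is exactly what the replies dispute.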
This is not necessarily true. Hypothetically, if most breakthroughs are coming from PhDs and they aren't making any PhDs, then that pool is not necessarily larger.
"not to mention more engineers does not equal better engineers."
funny that you mention this, because much of the top AI talent at big tech companies graduated from China's Ivy League equivalents
The US is literally importing AI talent, with the talent war as intense as it has ever been, and yet you still have doubts
You just said what I said. I didn't say that 100% of the graduates are stupid, but certainly not all high tier either. We aren't in extreme need of the average electrical engineer or the average software engineer. That's a fact. Look at unemployment rates.
I don't like this argument, since you can apply it to any country on earth and the answer would be the same
You are trying too hard to be right; meanwhile 40% of top AI talent in big tech is Chinese
so higher number = more chance of smart people is indeed true, and your argument is just a waste of time
Doesn’t seem to work for India. Wuhan university alone probably has more impact than the sum of. Of course a competent state and strategic investment matters.
What gave you the impression that it's "without too much sweat"? They sweated insanely for the past 6 years.
They also weren't starting from scratch, they already had a domestic semiconductor ecosystem, but it was fragmented and not motivated. The US sanctions united them and gave them motivation.
Also "good" is a matter of perspective. For logic and AI chips they are not Nvidia level, yet. But they've achieved far more than what western commentators gave them credit for 4-5 years ago. And they're just getting started. Even after 6 years, what you're seeing is just the initial results of all that investment. From their perspective, not having Nvidia chips and ASML equipment and TSMC manufacturing is still painful. They're just not paralyzed, and use all that pain to keep developing.
With power chips they're competitive, maybe even ahead. They're very strong at GaN chip design and manufacturing.
Western observers keep getting surprised by China's results because they buy into stereotypes and simple stories too much ("China can't innovate and can only steal", "authoritarianism kills innovation", "China is collapsing anyway", "everything is fake, they rely on smuggled chips lol" are just a few popular tropes) instead of watching what China is actually doing. Anybody even casually paying attention to news and rumors from China instead of self-congratulating western reports about China could have seen this day coming. This attitude, and the phenomenon of repeatedly being surprised, is not limited to semiconductors.
AMDs chips outperform nVidia's (Instinct is the GPU compute line at AMD) and at a lower per watt and per dollar range.
AMD literally can't make enough chips to satisfy demand because nVidia buys up all the fab capacity at TSMC.
Would you care to provide sources?
It's NVIDIA, not nVIDIA. I don't think AMD outperforms NVIDIA chips at price per watt. You need to defend this claim.
By NVIDIA's own numbers and widely available testing numbers for FP8, the AMD MI355X just edges out the NVIDIA B300 (both the top performers) at 10.1 PFLOPs per chip at around 1400 W per chip. Neither of these things is available as a discrete device... you're going to be buying a system, but typically AMD Instinct systems run about 15% less than the comparable NVIDIA ones.
NVIDIA is a very pricey date.
https://wccftech.com/mlperf-v5-1-ai-inference-benchmark-show...
https://semianalysis.com/2024/04/10/nvidia-blackwell-perf-tc...
https://semianalysis.com/2025/06/13/amd-advancing-ai-mi350x-...
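Taking the quoted figures at face value, the per-watt and per-dollar comparison works out as in the rough sketch below. It assumes both chips deliver about the same FP8 throughput and draw about the same power (the comment only says one "just edges out" the other), and it normalises the NVIDIA system price to 1.0, which is a placeholder rather than a real price.

```python
PFLOPS_PER_CHIP = 10.1   # FP8 figure quoted in the comment
WATTS_PER_CHIP = 1400.0  # "around 1400 W per chip"
AMD_DISCOUNT = 0.15      # "about 15% less" for comparable systems

# Performance per watt in TFLOPS/W (1 PFLOPS = 1000 TFLOPS)
tflops_per_watt = PFLOPS_PER_CHIP * 1000 / WATTS_PER_CHIP

# Normalised system prices: NVIDIA = 1.0 (placeholder), AMD = 15% cheaper
nvidia_price = 1.0
amd_price = nvidia_price * (1 - AMD_DISCOUNT)

# With equal throughput, the perf-per-dollar advantage is just the price ratio
perf_per_dollar_advantage = nvidia_price / amd_price

print(f"{tflops_per_watt:.2f} TFLOPS/W, "
      f"{(perf_per_dollar_advantage - 1) * 100:.1f}% more FP8 throughput per dollar")
```

So a 15% lower price is roughly a 17.6% per-dollar edge when throughput is equal, while per-watt is a tie by construction here. None of this says anything about how the chips behave in a full training run.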
There’s a difference between raw numbers on paper and actual real world differences when training frontier models.
There’s a reason no frontier lab is using AMD chips for training: the raw benchmarks for performance for a single chip for a single operation type don’t translate to performance during an actual full training run.
Meta, in particular, is heavily using AMDs for inference training.
Also, anyone doing very large models tends to prefer AMDs because they have 288GB per chip and outperform for very large models.
Outside of these use cases, it’s a toss up.
AMD is also much more aligned with the supercomputing (HPC) world, where they are dominant (AMD CPUs and GPUs power around 140 of the top 500 HPC systems and 8 of the top 10 most energy efficient)
>It's NVIDIA, not nVIDIA
Take a look at their logo. It starts with a lowercase n.
Per dollar sure but they’re quite a bit off per watt. Plus the software ecosystem is still not there.
the thing is, the software ecosystem gap is widening every day, not the other way around
My question would be: how did they fab it without access to ASML's high-end lithography machines?
https://www.theguardian.com/technology/2024/jan/02/asml-halt...
They've gone all-in with using less advanced equipment (DUV instead of EUV) but advanced techniques (multi patterning). Also combined with advanced packaging techniques.
Also, they're working hard on replacing ASML DUV machines as well since the US is also sanctioning the higher end of DUV machines. Not to mention multiple parallel R&D tracks for EUV.
You also need to distinguish between design and manufacturing. A lot of Chinese chip news is about design. Lots of Chinese chip designers are not yet sanctioned, and fabricate through TSMC.
Chip design talent pool is important to have, although I find that news a bit boring. The real excitement comes from chip equipment manufacturers, and designers that have been banned from manufacturing with TSMC and need to collaborate with domestic manufacturers.
> They've gone all-in with using less advanced equipment (DUV instead of EUV) but advanced techniques (multi patterning).
But that still seems like a huge step behind using EUV + advanced techniques.
Anyway, I'm curious to know how far that gets them in terms of #transistors per square mm.
Also, do we know there aren't secret contracts with TSMC?
You need to see it from their perspective. "huge step behind" is better than "we have nothing, let's just die". This is the best they have right now, and they're going all in with that until R&D efforts produce something better (e.g., domestic EUV).
It could also happen that all their DUV investment allows them to discover a valuable DUV-derived tech tree branch that the west hasn't discovered yet.
Results are at least good enough that Huawei can produce 7nm-5nm-ish phones and sell them at profit.
A teardown of the latest Huawei phone revealed that the chips produced more heat than the TSMC equivalent. However, Huawei worked around that by investing massively in advanced heat dissipation technology improvements and battery capacity improvements. Success in semiconductor products is not achieved along only a single dimension; there are multiple ways to overcome limitations.
Another perspective is that, by domestically designing and producing chips, they no longer need to pay the generous margins for foreign IP (e.g., Qualcomm licensing fees), which is a huge cost saving and is beneficial for the economics of everything.
Yes exactly.
Also to a certain degree you can just throw loads of GPUs at the problem.
So instead of 100k GB200s, you have ~1m of these cards. One thing china _is_ good at is mass manufacturing.
There's all sorts of caveats to that, but I really think people are overlooking this scenario. I strongly suspect that they could ramp output of (much?) weaker cards far quicker than TSMC can ramp EUV fabrication.
Plus China has vastly superior grid infrastructure. They have a massive oversupply of heavy industry, so even if they hit capacity issues with such gargantuan amounts of cards I can easily see aluminium plants and what not being totally mothballed and supply rerouted to nearby newly built data centres.
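The "throw more cards at it" trade-off can be put as a back-of-the-envelope sketch. The 10% per-card figure is what the 100k vs ~1m comparison above implies; the 80% scaling efficiency is a purely hypothetical stand-in for the caveats mentioned, and the function name is invented for illustration.

```python
import math
from fractions import Fraction


def cards_needed(baseline_cards: int,
                 relative_perf: Fraction,
                 scaling_efficiency: Fraction = Fraction(1)) -> int:
    """Weaker cards needed to match a baseline cluster's aggregate throughput.

    relative_perf: per-card throughput as a fraction of one baseline card.
    scaling_efficiency: share of per-card throughput that survives in the
    larger cluster (networking and parallelism overhead); hypothetical.
    """
    if relative_perf <= 0 or not (0 < scaling_efficiency <= 1):
        raise ValueError("relative_perf must be > 0 and 0 < scaling_efficiency <= 1")
    effective = Fraction(relative_perf) * Fraction(scaling_efficiency)
    return math.ceil(baseline_cards / effective)


# Perfect scaling: 100k baseline cards, each weaker card worth 1/10 of one
ideal = cards_needed(100_000, Fraction(1, 10))

# Same, but assuming only 80% of per-card throughput survives at scale
lossy = cards_needed(100_000, Fraction(1, 10), Fraction(8, 10))

print(ideal, lossy)
```

Exact fractions are used so the card counts come out as clean integers. The point is only that scaling losses push the required count above the naive 10x, which is where the power and grid arguments come in.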
> You need to see it from their perspective. "huge step behind" is better than "we have nothing, let's just die".
Yes but that doesn't answer the question of how they got so close to nvidia.
> It could also happen that all their DUV investment allows them to discover a valuable DUV-derived tech tree branch that the west hasn't discovered yet.
But why wouldn't the west discover that same branch but now for EUV?
> Results are at least good enough that Huawei can produce 7nm-5nm-ish phones and sell them at profit.
Sidenote, I'd love to see some photos and an analysis of the quality of their process.
> Yes but that doesn't answer the question of how they got so close to nvidia.
Talent pool and market conditions. China was already cultivating a talent pool for decades, with limited success. But it had no market. Nobody, including Chinese, wanted to buy Chinese stuff. Without customers, they lacked practice to further develop their qualities. The sanctions gave them a captive market. That allowed them to get more practice to get better.
> But why wouldn't the west discover that same branch but now for EUV?
DUV and EUV are very different. They will have different branches. The point however is not whether the west can reach valuable branches or not. It's that western commentators have a tendency to paint Chinese efforts as futile, a dead end. For the Chinese, this is about survival. This is why western commentators keep being surprised by Chinese progress: they expected the Chinese to achieve nothing. From the Chinese perspective, any progress is better than none, but no progress is ever enough.
China has been producing ARM chips like the A20, H40 (raspberry pi class competitors, dual and quad core SOC; went in to a lot of low end 720p tablets in the early 2010s) for a while now, their semiconductor industry is not zero. The biden administration turning off the chip supply in 2022 was nearly 3 years ago; three years is not nothing, especially with existing industry, and virtually limitless resources to focus on it. Probably more R&D capacity will be coming online here in the next year or two as the first crop of post-export control grads start entering the workforce in China.
Another perspective: they don't need to create chips that are as good as Nvidia. Current strategy is to create less powerful chips but that have better yields due to smaller die size. They then scale out huge multi-node AI clusters. This requires more power and bandwidth, but they have plenty of those. Data centers are located near renewable energy sources, for example in the desert, where power is nearly free. They are very good at building networking so bandwidth is not an issue. They are very good at building efficient power systems (less heat when routing energy) because they are not behind, even in some areas ahead, in power semiconductors (GaN). They still need to innovate in power delivery systems and cooling systems to be able to handle the scale that's required, but that's easier than solving lithography.
In other words, they are working on lithography and nanometers, but they're not very worried about those areas because they don't really need them. HN is too myopic, focusing only on single-chip performance and logic chips.
I think Alibaba uses TSMC for their foundries, like everyone else. I would assume that they did use ASML machines for this.
The article seems to only depict it being similar to the H20 in memory specs (and still a bit short). Regardless, Nvidia has their moat through cuda, not the hardware.
>- Nvidia doesn't have any magical secret sauce, and China could easily catch up
This is the simple explanation. We'll also see European companies matching them in time, probably on inference first.
This is more my thinking as well. How many big tech companies are working on their own internal TPU chip? Google's started using them in 2015. It sounds like the basic theory of getting silicon to do matrix multiplication is well established. Sure you can always be more efficient, but getting a working chip sounds very approachable. AMD hardware has been ~competitive the entire time, but they have squandered all good will with their atrocious software support.
If China sees an existential risk to getting compute capacity, I can easily see an internal decree to make something happen. Even if it requires designing the hardware + their own CUDA-like stack.
> has the market cap of a medium-sized country
"According to investors, today's value of Nvidia's expected future profits over its lifetime equals the total monetary value of all final goods and services produced within a medium-sized country in a year."
Don't compare market cap with GDP, when you spell it out it's clear how nonsensical it is.
Flagship? No, H20 was their cut down chip they were allowed to sell to China.
No, that was the H800.
The H200 is the next generation of the H100.
> Nvidia has the market cap of a medium-sized country
This makes no sense. Market cap is share price times number of shares, there is no analog for a country. It’s also not comparable to the GDP of a country, since GDP is a measure of flow in a certain time period, whereas market cap is a point in time measurement of expected performance.
I'd say there's a mix of 'Chinese GPUs are not that good after all' and 'Nvidia doesn't have any magical secret sauce, and China could easily catch up' going on. Nvidia GPUs are indeed remarkable devices with a complex software stack that offers all kinds of possibilities that you cannot replicate over night (or over a year or two!)
However they've also got a fair amount of generality, anything you might want to do that involves huge amounts of matmuls and vector maths you can probably map to a GPU and do a half decent job of it. This is good for things like model research and exploration of training methods.
Once this is all developed you can cherry pick a few specific things to be good at and build your own GPU concentrating on making those specific things work well (such as inference and training on Transformer architectures) and catch up to Nvidia on those aspects even if you cannot beat or match a GPU on every possible task, however you don't care as you only want to do some specific things well.
This is still hard and model architectures and training approaches are continuously evolving. Simplify things too much and target some ultra specific things and you end up with some pretty useless hardware that won't allow you to develop next year's models, nor run this year's particularly well. You can just develop and run last year's models. So you need to hit a sweet spot between enough flexibility to keep up with developments but don't add so much you have to totally replicate what Nvidia have done.
Ultimately the 'secret sauce' is just years of development producing a very capable architecture that offers huge flexibility across differing workloads. You can short-cut that development by reducing flexibility or not caring your architecture is rubbish at certain things (hence no magical secret sauce). This is still hard and your first gen could suck quite a lot (hence not that good after all) but when you've got a strong desire for an alternative hardware source you can probably put up with a lot of short-term pain for the long-term pay off.
What does "are not good after all" even mean? I feel there are too many value judgements in that question's tone, that blindsides western observers. I feel like the tone has the hidden implication of "this must be fake after all, they're only good at faking/stealing, nothing to see here move along".
Are they as good as Nvidia? No. News reporters have a tendency to hype things up beyond reality. No surprises there.
Are they useless garbage? No.
Can the quality issues be overcome with time and R&D? Yes.
Is being "worse" a necessary interim step to become "good"? Yes.
Are they motivated to become "good"? Yes.
Do they have a market that is willing to wait for them to become "good"? Also yes. It used to be no, but the US created this market for them.
Also, comparing Chinese AI chips to Nvidia is a bit like comparing AWS with Azure. Overcoming compatibility problems is not trivial, you can't just lift and shift your workload to another public cloud, you are best off redesigning your entire infra for the capabilities of the target cloud.
I think my question made it clear I'm not simply assuming China is somehow cheating here - either in the specs of their current product, or in stealing IP.
No, I just struggle to reconcile (but many answers here go some way to clarifying) Nvidia being the pinnacle of the R&D-driven tech industry - not according to me but to global investors - and China catching up seemingly easily.
Unfortunately I think global investors are quite dumb. For example all the market analysts were very positive about ASML, Nvidia, etc but they all assumed sales to China would continue according to projections that don't take US sanctions or Chinese competition into account. Every time a sanction landed or a Chinese competitor made major step forward, it was surprise pikachu, even though enthusiasts who follow news on this topic saw it coming years ago.
To me at least "not good after all" means their current latest hardware has issues which mean it cannot replace Nvidia GPUs yet. This is a hard problem, so not getting there yet doesn't imply bad engineering, just a reflection of the scale of the challenge! It also doesn't imply that if this generation is a miss, following generations couldn't be a large win. Indeed I think it would be very foolish to assume that Alibaba or other Chinese firms cannot build devices that can challenge Nvidia here on the basis of the current generation not being up to it yet. As you say they have a large market that's willing to wait for them to become good.
Plus it may not be true, this new Alibaba chip could turn out to be brilliant.
Isn’t NVIDIA fabless? I imagine (I jump to conclusions) that design is less of a challenge than manufacturing. EUV lithography is incredibly difficult- almost implausible. Perhaps one day a clever scientist will come up with a new, seemingly implausible, yet less difficult way, using “fractal chemical” doping techniques.
>design is less of a challenge than manufacturing.
If so, can you explain why Nvidia's market cap is much higher than TSMC's? (4.15 trillion versus 1.10 trillion)
I'd just say "market irrationality" and call it a day. TSMC is far closer to a monopoly than NVIDIA is, and they win no matter which fabless company is buying their capacity.
You could be right. But it could also be due to things like: automatic 401k injections into the market, easy retail investing, and general speculative attitudes.
Perhaps China’s actions are less of a problem for Nvidia and more of a problem for other chip makers. After all, if Alibaba can make this chip, what justifies the valuation of companies like Groq?
Defaulting to China stealing IP is a perfectly reasonable first step.
China is known for countless thefts of European and especially American IP, selling it for a quarter of the price, and destroying the original company nearly overnight.
It's so bad that even NASA has begun to restrict hiring Chinese nationals (which is more about national defense; however, illegally killing American companies can be seen as a national defense threat as well)
https://www.bbc.com/news/articles/c9wd5qpekkvo.amp
https://www.csis.org/analysis/how-chinese-communist-party-us...
I'm not sure why you are being downvoted; this is well known, and many hacks in the past decade and a half involved exfiltrating stolen IP from various companies.
China's corporate espionage might have surpassed France's on the winners' podium.
It's all stolen IP.
Virtually all products out of China still are.
If you want something manufactured, the best way is still to fake a successful crowdfunding campaign.
You'll be able to buy whatever it is on AliExpress (minus any safety features) within 6 months.
Yup this right here. The Chinese are estimated to steal hundreds of billions of dollars worth of US IP every single year. It's the Chinese way, they just steal or copy everything. Whatever gets them ahead.
I wonder what the ethnicity of the Americans who founded, run, and developed NVIDIA and TSMC is.
Taiwanese not Chinese.
I wonder what ethnicity the majority of Taiwanese are, where they came from, and how recently...
Just my 2c, totally off the top of my head:
- Chinese labs managed to "overcome decades of R&D" because they have been trying for many years now with unlimited resources, government support and total disrespect for IP laws
- Chinese chips may not be competitive with Western ones on processing power per watt, but they have cheaper electricity and, again, unlimited capacity to absorb losses
- they will probably hit a wall at the software/ecosystem level. CUDA's ergonomics are very difficult to replicate and, you know, developers love ease of use
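To put the perf-per-watt point in numbers, here is a rough Python sketch. Every figure in it (power draw, relative throughput, electricity prices) is a made-up placeholder, not a spec of any real chip or grid; the only point is that a chip doing less work per watt can still cost less per unit of work if the electricity is cheap enough.

```python
def energy_cost_per_unit_work(watts, relative_throughput, price_per_kwh, hours=1.0):
    """Electricity cost of one unit of work.

    relative_throughput is work done per hour, normalized so the
    reference chip is 1.0. All inputs here are hypothetical.
    """
    kwh_used = watts / 1000.0 * hours
    work_done = relative_throughput * hours
    return kwh_used * price_per_kwh / work_done


# Hypothetical reference chip: 700 W, full throughput, pricier electricity.
reference = energy_cost_per_unit_work(watts=700, relative_throughput=1.0, price_per_kwh=0.12)

# Hypothetical domestic chip: same power draw, 60% of the throughput,
# but electricity at half the price.
domestic = energy_cost_per_unit_work(watts=700, relative_throughput=0.6, price_per_kwh=0.06)

print(f"reference: ${reference:.3f} per unit of work")
print(f"domestic:  ${domestic:.3f} per unit of work")
```

With these placeholder inputs the less efficient chip still comes out cheaper on electricity per unit of work; with a smaller price gap it would not.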
If CUDA isn't that strong of a moat/tie-in and Chinese tech companies can seemingly reasonably migrate to these chips, why hasn't AMD been able to compete more aggressively with nVidia on a US/global scale when they had a much longer head start?
1. AMD isn’t different enough. They’d be subject to the same export restrictions and political instability as Nvidia, so why would global companies switch to them?
2. CUDA has been a huge moat, but the incentives are incredibly strong for everybody except Nvidia to change that. The fact that it was an insurmountable moat five years ago in a $5B market does not mean it’s equally powerful in a $300B market.
3. AMD’s culture and core competencies are really not aligned to playing disruptor here. Nvidia is generally more agile and more experimental. It would have taken a serious pivot years ago for AMD to be the right company to compete.
AMD is HIGHLY successful in the GPU compute market. They have the Instinct line which actually outperforms most nVidia chips for less money.
It's the CUDA software ecosystem they have not been able to overcome. AMD has had multiple ecosystem stalls but it does appear that ROCm is finally taking off which is open source and multi-vendor.
AMD is unifying their GPU architectures (like nVidia) for the next gen to be able to subsidize development by gaming, etc., card sales (like nVidia).
Why doesn't AMD just write a CUDA translation layer? Yeah, it's a bit difficult to say "just", but they're a pretty big company. It's not like one guy doing it in a basement.
Does Nvidia have patents on CUDA? They're probably invalid in China which explains why China can do this and AMD can't.
They did...HIPIFY translates from CUDA to HIP (ROCm)
https://rocm.docs.amd.com/projects/HIPIFY/en/latest/index.ht...
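For anyone wondering what that translation looks like: a lot of it is mechanical renaming of CUDA runtime calls to their HIP counterparts. The toy Python sketch below shows the idea only. It is not how HIPIFY itself is implemented, the `CUDA_TO_HIP` table and `toy_hipify` function are names made up for this example, and the table covers only a handful of runtime identifiers.

```python
import re

# A few CUDA runtime names and their HIP counterparts (tiny, illustrative subset).
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Longest names first so cudaMemcpyHostToDevice is not matched as cudaMemcpy.
_PATTERN = re.compile(
    r"\b(?:"
    + "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
    + r")\b"
)


def toy_hipify(source: str) -> str:
    """Rename the known CUDA identifiers in a source string to HIP ones."""
    return _PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(0)], source)


cuda_src = """#include <cuda_runtime.h>
cudaMalloc(&d_x, n * sizeof(float));
cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);
cudaFree(d_x);
"""

hip_src = toy_hipify(cuda_src)
print(hip_src)
```

Renaming is the easy part; anything that leans on vendor-specific libraries or hand-tuned kernels needs more than a lookup table.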
> CUDA has been a huge moat
The CUDA moat is extremely exaggerated for deep learning, especially for inference. It’s simply not hard to do matrix multiplication and a few activation functions here and there.
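To show how little math is involved, here is one dense layer (matrix multiply, bias, ReLU) in plain Python. This is a sketch of the arithmetic only, with made-up weights and inputs; it says nothing about the performance engineering needed to make it fast on real hardware.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def relu(m):
    """Element-wise max(0, x)."""
    return [[max(0.0, v) for v in row] for row in m]


def dense_layer(x, w, bias):
    """relu(x @ w + bias) for a batch of row vectors x."""
    z = matmul(x, w)
    return relu([[v + bias[j] for j, v in enumerate(row)] for row in z])


# Made-up input, weights and bias.
x = [[1.0, 2.0]]
w = [[0.5, -1.0],
     [0.25, 2.0]]
bias = [0.0, -4.0]

out = dense_layer(x, w, bias)
print(out)  # [[1.0, 0.0]]
```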
It regularly shocks me that AMD doesn't release their cards with at least enough CUDA reimplementation to run DL models. As you point out, AI applications use a tiny subset of the overall API, the courts have ruled that APIs can't be protected by copyright, and CUDA is NVIDIA's largest advantage. It seems like an easy win, so I assume there's some good reason.
A very cynical take: the AMD and Nvidia CEOs are cousins, and there’s more money to be made with one dominant monopoly than with two competitive companies. And this income could be an existential difference-maker for Taiwan.
bro, both are American CEOs.
What is this racialized nonsense? Have you seen Jensen Huang speak Mandarin? His Mandarin is actually awful for someone who left Taiwan at 8.
AMD can't even figure out how to release decent drivers for Linux in a timely fashion. It might not be the largest market, but would have at least given them a competitive advantage in reaching some developers. There is either something very incompetent in their software team, or there are business reasons intentionally restraining them.
They did; it's called HIP.
From what I've been reading, the inference workload tends to ebb and flow throughout the day, with much lower loads overnight than at, for example, 10AM PT/1PM ET. I understand companies fill that gap with training (because an idle GPU costs the most).
So for data centers, training is just as important as inference.
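A minimal sketch of that gap-filling idea, assuming a fixed fleet and a completely made-up demand curve (the numbers and the `training_capacity` helper are hypothetical, not anyone's real scheduler): whatever inference does not need in a given time slot is handed to training jobs.

```python
def training_capacity(inference_demand, total_gpus):
    """GPUs left over for training in each time slot.

    inference_demand is the number of GPUs inference needs per slot;
    inference always gets served first.
    """
    return [max(0, total_gpus - demand) for demand in inference_demand]


TOTAL_GPUS = 100

# Hypothetical demand over six 4-hour slots, starting at midnight.
demand = [20, 35, 90, 100, 70, 30]

free_for_training = training_capacity(demand, TOTAL_GPUS)

# Fleet utilization if the idle GPUs were left idle instead.
inference_only_utilization = sum(demand) / (TOTAL_GPUS * len(demand))

print(free_for_training)                    # [80, 65, 10, 0, 30, 70]
print(f"{inference_only_utilization:.1%}")  # 57.5%
```

In this toy curve, over 40% of GPU-hours would sit idle without training to soak them up.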
> So for data centers, training is just as important as inference.
Sure, and I’m not saying buying Nvidia is a bad bet. It’s the most flexible and mature hardware out there, and the huge installed base also means you know future innovations will align with this hardware. But it’s not primarily a CUDA thing or even a software thing. The Nvidia moat is much broader than just CUDA.
The drivers are the most annoying issue! PyTorch likes CUDA so much that it just works; anything with ROCm just sucks!
And it would be a big bet for AMD. They don't create and manufacture chips 'just in time' -- it takes man hours and MONEY to spin up a fab, not to mention marketing dollars.
AMD has been producing GPU compute cards (and is highly successful at it) for nearly as long as nVidia. (https://www.amd.com/en/products/accelerators/instinct.html)
AMD is fabless. They spun off GlobalFoundries years ago.
> If CUDA isn't that strong of a moat/tie-in and Chinese tech companies can seemingly reasonably migrate to these chips, why hasn't AMD been able to compete more aggressively with nVidia on a US/global scale when they had a much longer head start?
It's all about investment. If you are a random company you don't want to sink millions into figuring out how to use AMD, so you apply the tried and true "no one gets fired for buying Nvidia".
If you are an authoritarian state with some level of control over domestic companies, that calculus does not exist. You can just ban Nvidia chips and force companies to learn how to use the new thing. By using the new thing an ecosystem gets built around it.
It's the beauty of centralized control in the face of free markets, and I don't doubt that it will pay off for them.
I think they'd be entirely fine just using NVIDIA, and most of the push came from the US itself trying to ban exports (or "exports", as NVIDIA cards are put together in Chinese factories...).
Also AMD really didn't invest enough in making their software experience as nice as NVIDIA.
I was referring to this: https://www.reuters.com/markets/emerging/china-tells-tech-fi...
ROCm is making serious inroads, now.
Are there precedents where an authoritarian state outperformed the free market in technological innovation?
Or would china be different because it's a mix of market and centralized rule?
Because the CUDA moat in China is being wrecked artificially, for political reasons rather than technical ones.
This is the right answer
I use AMD MI300s at work, and my experience is that for PyTorch at least there is no moat. The moat only exists in people's minds.
Until 2022 or so AMD was not really investing into their software stack. Once they did, they caught up with Nvidia.
The only way the average person can access a MI300 is through the AMD developer cloud trial which gives you a mere 25 hours to test your software. Meanwhile NVidia hands out entire GPUs for free to research labs.
If AMD really wanted to play in the same league as NVidia, they should have built their own cloud service and offered a full stack experience akin to Google with their TPUs, then they would be justified in ignoring the consumer market, but alas, most people run their software on their local hardware first.
> The only way the average person can access a MI300 is through the AMD developer cloud trial which gives you a mere 25 hours to test your software
HN has a blind spot where AMD's absence in the prosumer/SME space is interpreted as failing horribly. Yet AMD's Instinct cards are selling very well at the top end of the market.
If you were trying to disrupt a dominant player, would you try selling a million gadgets to a million people, or a million gadgets to 3-10 large organizations?
AMD sells 100% of the chips they can produce and at a premium. It's chicken and egg here. They have to compete with nVidia for pre-buying fab capacity at TSMC and they are getting outbought.
AMD also needs to share that fab wafer capacity with its processor division and third-party clients (Sony, Valve, various HPC clients).
I can rent an MI300X for $2.69/hr right now on runpod.
AMD presumably doesn't have Chinese state backing, where profit is less of a concern and they can operate unprofitably for many years (decades even) as long as the end outcome is dominance.
Sadly, AMD and its precursor graphics company, ATI, have had garbage driver software since literally the mid-1990s.
They have never had a focus on top notch software development.
CUDA isn't a moat... in China. The culture is much more NIH there.
Because Chinese government can tell their companies to adopt Chinese tech and they will do it. Short term pain for long term gain.
It's interesting that CUDA is a moat because if AI really was as good as they claim then wouldn't the CUDA moat evaporate?
Exactly. The whole argument that software is a moat is at best a temporary illusion. The supply chain is the moat, software is not.
Most chipmakers in China are making or have made their new generation of products CUDA-compatible.
Do you know how bad AMD is at doing drivers and software in general?
People are trying to break the moat.
See Mojo, a new language that compiles to other chips. https://www.modular.com/mojo
I don't think "learn entirely new language" is all that appealing vs "just buy NVIDIA cards"
This was in terms of breaking the Nvidia monopoly. Mojo is a variant of python. When looking at the difficulty of migrating from CUDA, learning python is a pretty small barrier.
Sure, you can keep buying nvidia, but that wasn't what was discussed.
> Mojo is a variant of python.
Lol this is how I know no one that pushes mojo on hn has actually ever used mojo.
Yes, I'm oversimplifying the concept. What is wrong with that? If I post a thesis on compilers would that really help clarify the subject? Read the link for details. Is Mojo attempting to offer a non-CUDA solution? Yes. Is it using Python as the language? Yes. Are there some complicated details there? Yes. Congratulations.
> Yes. Is it using Python as the language?
You're completely wrong here. That's the "what's wrong with it".
I think you are missing the nuance between the different aspects of using the Python Interpreter, and integrating new functions with Python. And compiling to a different target. Would you say IronPython is Not Python, and quibble about it? Is there some Python purist movement I'm not aware of? Should every fork of Python be forced to take Python out of its name?
To say Mojo doesn't use Python, when clearly that is a huge aim of the project, makes me think you are splitting hairs somewhere on some specific subject that is not clear by your one liners.
Key aspects of Mojo in relation to Python:
• Pythonic Syntax and Ecosystem Integration: Mojo adopts Python's syntax, making it familiar to Python developers. It also fully integrates with the existing Python ecosystem, allowing access to popular AI and machine learning libraries.
• Performance Focus: Unlike interpreted Python, Mojo is a compiled language designed for high-performance execution on various hardware, including CPUs, GPUs, and other AI ASICs. It leverages MLIR (Multi-Level Intermediate Representation) for this purpose.
• Systems Programming Features: Mojo adds features common in systems languages, such as static typing, advanced memory safety (including a Rust-style ownership model), and the ability to write low-level code for hardware.
• Compatibility and Interoperability: While Mojo aims for high performance, it maintains compatibility with Python. You can call Python functions from Mojo code, although it requires a specific mechanism (e.g., within try-except blocks) due to differences in compilation and execution.
• Development Status: Mojo is a relatively new language and is still under active development. While it offers powerful features, it is not yet considered production-ready for all use cases and is continually evolving.
> I think you are missing the nuance between the different aspects of using the Python Interpreter, and integrating new functions with Python
What if I told you I used to work at modular? What would you say then to this accusation that I'm "missing the nuance"?
The rest of this is AI crap.
I think then I'd have to go back to your original reply, and ask what your point was. What is it you are finding objectionable? These one-liner "doh, you're wrong" replies aren't clarifying.
Do you really think Mojo is not based on Python? Or that they are not trying to bypass CUDA? What is the problem?
The rest might be marketing slop. But I'm not catching what your objection is.
> Do you really think Mojo is not based on Python?
what do you mean "do you really". it's not. full stop. what part of this don't you understand?
? Are we talking about the same thing? Mojo, the new language for programming GPUs without CUDA?
The marketing and web site materials clearly show how they are using the Python interpreter and extending Python. They promote the use of Python everywhere. Like it is one of the most hyped points.
I think you are trying to quibble with whether the new functions get compiled differently than the rest of Python. So technically, when the Mojo functions are in use, that is not Python at that point?
Or maybe you are saying that they have extended Python so much you would like to not call it Python anymore?
Like IronPython, maybe since that gets compiled to .NET, you disagree with it being called Python?
Or maybe to use the IronPython example, if I'm calling a .NET function inside Python, you would like to make the fine distinction that that is NOT Python at that point? It should really be called .NET?
Here is a link to the docs. You worked there. So maybe there is some hair-splitting here that is not clear.
https://docs.modular.com/mojo/manual/python/
Maybe it is just marketing hype that you disagree with.
But right on the main page it says "Mojo is Python++".
> The marketing and web site materials clearly show how they are using the Python interpreter and extending Python.
brother you have literally not a single clue what you're talking about. i invite you to go ask someone that currently works there about whether they're "using the Python interpreter and extending Python".
From Docs, https://docs.modular.com/mojo/manual/python/
"This is 100% compatible because we use the CPython runtime without modification for full compatibility with existing Python libraries."
At this point you need to either explain your objection, or just admit you are a troll. You haven't actually at any point in this exchange offered any actual argument beyond 'duh, you're wrong'. I'd be ok if you actually pointed to something like 'well technically, the mojo parts are compiled differently', or something. You say you worked there, but you're not even looking at their website.
Creator Chris Lattner discussing why they used Python. https://www.youtube.com/watch?v=JRcXUuQYR90
Start at minute 12. "Mojo is a very extended version of Python".
Are you even a programmer? Do you know what any of these words mean?
> using the CPython interpreter as a dynamic library (shown as libpython.dylib in figure 1).
They're embedding the python interpreter not extending it - just like everyone and their mother has been able to do for decades
https://docs.python.org/3/extending/embedding.html
I repeat: you have no idea what you're talking about so in reality you're the troll.
You're really splitting some very thin pedantic hairs.
Your problem isn't with me; you are quibbling with their own marketing materials. Go complain to marketing if they are using words that you disagree with. Everything I've posted is directly from Mojo's website.
You: "Well, technically they are embedding the interpreter, so all the surrounding code that looks exactly like python, and we promote as being compatible with python, and promote as extending python. My good sir, it is not really python. That is just a misunderstanding with marketing. Please ignore everything we are clearly making out as an important feature, totally wrong".
They clearly promote that they are extending python. What is your problem with that? How is that wording causing you to seize up?
I'm aware of what is technically happening. Where did I ever say anything that was not directly from them? Do I really need to write a thesis to satisfy every OCD programmer that wants to argue every definition?
Were you let go because of an inability to think flexibly? Maybe too many arguments with co-workers over their word choice? Does your brain tend to get single-tracked on a subject, kind of blank out in a white flash when you disagree with someone?
Actually, I'm kind of convinced you're just arguing to argue. This isn't about anything.
> so all the surrounding code that looks exactly like python, and we promote as being compatible with python
bro are you really thick? there is zero mojo code that is runnable python; take a look at
https://github.com/modular/modular/tree/main/mojo/stdlib/std...
mojo has zero to do with python. zilch, zero, nada.
what they are doing is simply embedding the python interpreter and running existing python code. literally everyone already does that, ie there are a million different projects that do this same thing in order to be able to interoperate with python (did you notice the heading at the top of the page you linked is *Python interoperability* not *Python compatibility*).
> This isn't about anything.
it's about your complete and utter ignorance in the face of a literal first hand account (plus plenty of contrary evidence).
> Were you let go because of an inability to think flexibly?
let go lololol. bro if you only knew what their turnover was like you would give up this silly worship of the company.
Sorry. I get it now.
You're bitter.
To be clear, I'm not a fan boy. I don't really know much about Mojo. I've watched some videos, checked out their website, thought it was interesting idea.
The parent post was about alternatives to CUDA.
I posted a 6 word sentence summarizing how Mojo is trying to bypass CUDA, and using Python. -> And you flipped out, that it isn't Python. Really?
I checked out your link, sure does look like Python. But that is the point, all of their promotional materials and every Chris Lattner video, all sales pitches, everywhere.
Everywhere, it's Python, Python, Python. Clearly they want everyone to know how closely tied they are to Python. It is a clear goal of theirs.
But. I see now the pedantic hair splitting. Mojo 'Looks Like Python', they use the same syntax. "Mojo aims to be a superset of Python, meaning it largely adopts Python's syntax while introducing new features".
But you say, they aren't modifying or extending CPython so this is all false, it is no longer technically Python at all.
And I guess I'm saying, chill. They clearly are focused on Python all over the place; to say that they aren't is really ludicrous. You're down a rabbit hole of debating what is a name, what is a language. When is Python not Python? How different does it have to be, to not be?
CUDA is a legal moat.
A reimplementation would run into copyright issues.
No such problem in China.
Apparently DeepSeek’s new model has been delayed due to issues with the Huawei chips they’re using. Maybe the raw floating point performance of Chinese chips is competitive with NVIDIA, but clearly there are still a lot of issues to iron out.
I'm sure there are LOTS of issues that need to be addressed, but the demand for the chips is so high that the incentives are overwhelmingly in favor of this continuing. If the reported margins on the Nvidia chips are as high as the claims make them out to be (73+% ??), this will easily find a worldwide market.
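A quick back-of-the-envelope on what a 73% gross margin leaves on the table, as a Python sketch. The 73% is the figure quoted above; the 30% rival margin is picked arbitrarily, and assuming the rival has the same unit cost as Nvidia is a big assumption that almost certainly does not hold exactly.

```python
def unit_cost(price, gross_margin):
    """Cost implied by a selling price and a gross margin (margin = 1 - cost/price)."""
    return price * (1.0 - gross_margin)


def price_for_margin(cost, gross_margin):
    """Selling price needed to earn a given gross margin on a given cost."""
    return cost / (1.0 - gross_margin)


nvidia_price = 100.0                         # normalized, not a real price
cost = unit_cost(nvidia_price, 0.73)         # about 27 per 100 of price

# Hypothetical rival with the same unit cost, content with a 30% margin.
rival_price = price_for_margin(cost, 0.30)   # about 38.6

print(f"implied cost: {cost:.1f}")
print(f"rival price:  {rival_price:.1f} ({rival_price / nvidia_price:.0%} of the Nvidia price)")
```

Under those assumptions a rival could undercut by roughly 60% and still make money, which is the "your margin is my opportunity" argument in numbers.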
It was also frustratingly predictable from the moment the US started trying to limit the sales of the chips. America has slowed the speed of Chinese AI development by a tiny number of years, if that, in return for losing total domination of the GPU market.
>America has slowed the speed of Chinese AI development by a tiny number of years, if that, in return for losing total domination of the GPU market.
I'm open to considering the argument that banning exports of a thing creates a market incentive for the people impacted by the ban to build a better and cheaper thing themselves, but I don't think it's as black and white as you say.
If the only ingredient needed to support massive innovation and cost cutting is banning exports, wouldn't we have tons of examples of that happening already - like in Russia or Korea or Cuba? Additionally, even if the sale of NVIDIA H100s weren't banned in China, doesn't China already have a massive incentive to throw resources behind creating competitive chips?
I actually don't really like export bans, generally, and certainly not long-term ones. But I think you (and many other people in the public) are overstating the direct connection between banning exports of a thing and the affected country generating a competing or better product quickly.
> If the only ingredient needed to support massive innovation and cost cutting is banning exports, wouldn't we have tons of examples of that happening already - like in Russia or Korea or Cuba?
That's just one of the ingredients that could help with chance of it happening, far from being "the only ingredient".
The other (imo even more crucial) ingredients are the actual engineering/research, economic, and industrial production capabilities. And it just so happens that none of the countries you listed (Russia, DPRK, and Cuba) have that. That's not a dig at you, it is just really rare in general for a country to have all of those things available in place, and especially for an authoritarian country. Ironically, it feels like being an authoritarian country makes it more difficult to have all those pieces together, but if such a country already has those pieces, then being authoritarian imo only helps (as you can just employ the "shove it down everyone's throat until it reaches critical mass, improves, and succeeds" strategy).
However, it is important to remember that even with all those ingredients available on hand, all it means is that you have a non-zero chance at succeeding, not a guarantee of that happening.
Russia and Cuba? Why not mention Somalia and Afghanistan? They're about equally relevant in this context.
South Korea might have the capability to play this game (North Korea certainly doesn't), but it hasn't really had the incentive to.
Which brings us to the real issue: an export ban on an important product creates an extremely strong incentive, that didn't exist before. Throwing significant national resources at a problem to speculatively improve a country's competitiveness is a very different calculation than doing so when there's very little alternative.
Russia and Korea and Cuba don’t have the economy, manufacturing and competent research scientists that China has
Head of SMIC was ex TSMC IIRC. They were able to poach TSMC engineers because Taiwan didn’t pay as well.
>They were able to poach TSMC engineers because Taiwan didn’t pay as well.
Apparently that was an issue for them when it came to hiring people to work at their US fabs as well.
The catch-up would have happened one way or another, but the export ban definitely accelerated it.
I mean, I don’t know how long the NVIDIA moats can hold. With this much money at stake, others will challenge their dominance especially in a market as diverse and fragmented as advanced semiconductors.
That’s not to say I’m brave enough to short NVDA.
I think that NVIDIA’s moat is the US government. Remember our government’s efforts to prevent the use of Huawei cell infrastructure in Europe and around the world?
I am a long time fan of Dave Sacks and the All In podcast ‘besties’ but now that he is ‘AI czar’ for our government it is interesting what he does not talk about. For example on a recent podcast he was pumping up AI as a long term solution to US economic woes, but a week before that podcast, a well known study was released that showed that 95% of new LLM/AI corporate projects were fails. Another thing that he swept under the rug was the recent Stanford study that 80% of US startups are saving money using less expensive Chinese (and Mistral, and Google Gemma??) models. When the Stanford study was released, I watched All In material for a few weeks, expecting David Sacks’s take on the study. Not a word from him.
Apologies for this off-topic rant but I am really concerned how my country is spending resources on AI infrastructure. I think this is a massive bubble, but I am not sure how catastrophic the bubble will be.
> Remember our government’s efforts to prevent the use of Huawei cell infrastructure in Europe and around the world?
The US is burning good will at an alarming rate, how long will countries keep paying a premium to be spied on by the US instead of China?
I think the answer to your question is ‘not for very long.’ I frequently have breakfast with a friend who is a retired math professor and he is an avid investor in the stock market. We talk a lot about how long the US stock market will keep increasing in value. We don’t know the answer about the stock market, but it is fun to talk about. We both want to start easing out of the stock market.
The main competitors to Huawei in cell network stuff are mostly European (Nokia and friends), not American.
They are heavy into AI investing but will tell people AI startups are just toy apps (Chamath). That podcast is full of crooks. I’d be willing to give them a pass as a bunch of old white guy techies that just love to talk about tech, but they are literally at the dinner table with Trump and Musk.
This country used to have congressional hearings on all kinds of matters from baseball to the Mafia. Tech collusion and insider knowledge is not getting investigated. The All-in podcast requires serious investigation, with question #1 being “how the fuck did you guys manage to influence the White House?”.
Other notes:
- Many of them are technically illiterate
- They will speak in business talk; you won’t find a hint of intimate technical knowledge
- The more you watch it, the more you realize that money absolutely buys a seat at the table:
https://bloximages.chicago2.vip.townnews.com/goskagit.com/co...
(^ Saved myself another thousand words)
Remember that time in history when Chamath thought he found gold in SPACs? Hubris is easily forgotten or forgiven.
You say 95% failed like it's a bad thing - a 5% success rate sounds reasonable to me in terms of startups!
It's not startup success rate, it's application of the technology at companies. Meaning that 95% of the time that AI is applied to a work problem, it fails to generate material value over existing methods.
Sacks has always been absolutely disingenuous and interested in peddling his own interests over the interests of the common good. As a total Trump shill he talks out of both sides of his mouth at the same time & accuses the left of things that he has no problem with when he or his own party does it.
Anyone who's listened to him for an extended period of time (even those who align with him politically) can't help but notice that he is self-interested to the point of total hypocrisy, with too many examples to even begin enumerating. Take the Trump/Epstein stuff, or the Elon/Trump fallout: topics he would absolutely lose his sh*t over if these were characters on the left. I find it hard to believe anyone ever actually took him seriously. Branding myself as a fan of his would just be a completely self-humiliating insult to my intelligence and my conscience IMO.
> a week before that podcast, a well known study was released that showed that 95% of new LLM/AI corporate projects were fails.
I mean, I think some of us knew this. There are a lot of issues with AI, some of them psychological. Some risk-averse individuals would love to save hours, weeks, months, maybe years of time with AI, but if the AI screws up, it's bad, really bad, legal-hell bad. Unless you have a model with a 100% success rate for the task, it won't be used in certain fields.
I think in the more creative fields it's very useful, since hallucinations are okay; it's when you try to get realistic (or reasonably realistic-looking, in the case of cartoons) that it gets iffy. Even so, who wants to pay the true cost of AI? There's a big uphill cost involved.
It reminds me a lot of crypto mining, mostly because you need to invest an insane amount before you become profitable.
"Your margin is my opportunity" as someone said. Certainly Google must have plans to sell its chips externally with this much up for grabs?
They make more money using them themselves or renting out their time to others.
I was also wondering if Google would try to make profit from selling TPUs, but they probably won’t because:
At least for me, Google has some real cachet and deserves kudos for not losing money selling Gemini services, at least I think it is plausible that they are already profitable, or soon will be. In the US, I get the impression that everyone else is burning money to get market share, but if I am wrong I would enjoy seeing evidence to the contrary. I suspect that Microsoft might be doing OK because of selling access to their infrastructure (just like Google).
There's no point selling TPUs when you can bundle TPU access as part of much more profitable training services. The margins are much higher providing a service as part of GCP versus selling.
I agree. Amazon and I think Microsoft are also working on their own NVIDIA replacement chips - it will be interesting to see if any companies start selling chips, or stick with services.
From what I'm hearing in my network, the name of the game is custom chips hyperoptimized for your own workloads.
A major reason DeepSeek was so successful margin-wise was that the team deeply understood Nvidia, CUDA, and Linux internals.
If you have an understanding of the intricacies of your custom ASIC's architecture, it's easier for you to solve perf issues, parallelize, and debug problems.
And then you can make up the cost by selling inference as a service.
> Amazon and I think Microsoft are also working on their own NVIDIA replacement chips
Not just them. I know of at least 4-5 other similar initiatives (some public like OpenAI's, another which is being contracted by a large nation, and a couple others which haven't been announced yet so I can't divulge).
Contract ASIC and GPU design is booming, and Broadcom, Marvell, HPE, Nvidia, and others are cashing in on it.
I wouldn't be surprised if a fair portion of Amazon's Bedrock traffic is being served by Inferentia silicon. Their margins on Anthropic models are razor thin and there's a lot of traffic, so there's definitely an incentive. Additionally, every model that's served by Inferentia frees up Nvidia capacity for either models that can't be so served or for selling to customers.
Do you have a link or references showing Google isn’t losing money on Gemini?
The earnings report does not break out profit from Gemini separately, but this is still useful: https://abc.xyz/assets/34/fa/ee06f3de4338b99acffc5c229d9f/20...
A long time ago I worked as a contractor at Google, and that experience taught me that they don’t like things that don’t scale or are inefficient.
That's the same as saying that Google is winning the AI race because they don't like losing. They won't win anything if we are in a bubble that bursts, though.
A hypothetical AI bubble bursting doesn't mean that every single AI vendor fails completely. As with the Dot-Com Bubble, the market value drops precipitously and many companies fold, but because the market value does not fall to zero, the survivors (e.g. Amazon) still win.
Websites were still mostly selling goods and services in 2001. Not giving away hot takes and hallucinated summaries in exchange for eyeballs. In other words, after stuff like pets.com collapsed, people still found it useful to have pet food delivered, and the business model evolved. LLMs, on the other hand, don't seem to have a lot of public appeal. Most of the use cases are being shoved down the public's throat. Their appeal is to corporations as cost saving replacements for workers. But an AI bubble bursting would look like corporations rolling back their exuberance for the AI craze. What's already only speculatively profitable and requires enormous capex would probably become too toxic for anyone to try again for a generation.
Fabrication is the bottleneck. They can't even meet internal demand.
As long as TSMC is the only top-performance chip producer and it is possible to reserve all its manufacturing capacity for one or two clients, NVIDIA will hold its position without a problem...
In my opinion, the problems for NVIDIA will start when China ramps up its internal chip manufacturing performance enough to be in the same order of magnitude as TSMC.
But all sorts of people get their things fabbed by TSMC.
Cerebras get their chips fabbed by them. I assume Euclyd will have their chips fabbed by them.
If there's orders, why would they prefer NVIDIA? Customer diversity is good, is it not?
TSMC and NVIDIA's relationship has gone back for more than 20 years. In the NVIDIA biography they talk about how TSMC really helped NVIDIA out early on when other suppliers just couldn't meet the quality and rate demands that NVIDIA aspired to. That has led to a strong relationship where both sides have really helped each other out.
Yes, but others are still getting chips from them. I think it's just a matter of having enough demand.
> If there's orders, why would they prefer NVIDIA? Customer diversity is good, is it not?
Money talks. Apple asked for first dibs a while earlier (exclusively).
But other people are literally getting their things fabbed by them.
AMD are, Cerebras are, I assume OpenChip's and Euclyd's machines will be.
> But other people are literally getting their things fabbed by them.
Sure, but in my example Apple got access exclusively for a few months to a newer node, which would make a world of difference if you compete in the same space.
I'm not knowledgeable about this, but I wonder how important performance really is here.
Won't it be enough to just solder on a large amount of high bandwidth memory and produce these cards relatively cheaply?
> but I wonder how important performance really is here.
Perf is important, but ime American MLEs are less likely to investigate GPU and OS internals to get maximum perf, and just throw money at the problem.
> solder on a large amount of high bandwidth memory and produce these cards relatively cheaply
HBM is somewhat limited in China as well. CXMT is around 3-4 years behind other HBM vendors.
That said, you don't need the latest and most performant GPUs if you can tune older GPUs and parallelize training at a large scale.
-----------
IMO, model training is an embarrassingly parallel problem, and a large enough, heavily tuned cluster built on architectures that are 1-2 generations older should be able to provide similar performance for training models.
This is why I bemoan America's failures at OS internals and systems education. You have entire generations of "ML Engineers" and researchers in the US who don't know their way around CUDA or InfiniBand optimization or the ins-and-outs of the Linux kernel.
They're just boffins who like math and using wrappers.
That said, I'd be cautious to trust a press release or secondhand report from CCTV, especially after the Kirin 9000 saga and SMIC.
But arguably, it doesn't matter - even if Alibaba's system isn't comparably performant to an H20, if it can be manufactured at scale without eating Nvidia's margins, it's good enough.
Isn’t memory production relatively limited also?
They are currently doing this. It’s part of their Made in China 2025 plan
> That’s not to say I’m brave enough to short NVDA.
Their multiples don't seem sustainable so they are likely to fall at some point but when is tricky.
> Their multiples don't seem sustainable so they are likely to fall at some point but when is tricky.
They've been trying really hard to pivot and find new growth areas. They've taken their "inflated" stock price as capital to invest in many other companies. If at least some of these bets pay off it's not so bad.
Google has already started offering its TPUs to other neocloud providers.
I hadn't heard that. Source?
Interesting. I read that as Google is using colocation to host its TPUs. I don't think Google is selling its TPUs like Nvidia sells H100s.
Slowing AI development by even one month is essentially infinite slowness in terms of superintelligence development. It's a kill-shot, a massive policy success.
Lost months are lost exponentially and it becomes impossible to catch up. If this policy worked at all, let alone if it worked as you describe, this was a masterstroke of foreign policy.
This isn't merely my opinion; experts in this field feel superintelligence is at least possible, if not plausible. If that's true, this is a massively successful policy, and, if it's not, little is lost. You've made a very strong case for it.
>in terms of superintelligence development
doing a lot of heavy lifting in your conjecture
This is not merely my opinion, but that of knowledgeable AI researchers, many of whom place ASI at not a simple remote possibility, but something they see as almost inevitable given our current understanding of the science.
I don't see myself there, but, given that even the faint possibility of superintelligence would be an instant national security priority #1, grinding China into the dust on that permanently seems like a high reward, low risk endeavor. I'm not recruitable via any levers myself into a competitive ethnostate so I'm an American and believe in American primacy.
The Chinese state operates the country much like a vast conglomerate, inheriting many of the drawbacks of a massive corporation. When top leadership imposes changes in tools or systems without giving engineers the opportunity to run proof-of-concept trials and integration tests to validate feasibility, problems like these inevitably arise.
Reported by one of the least credible PRC reporters at the FT, who should be thoroughly ignored.
There's a very important point made in the article - with recent export controls, domestic Chinese firms don't need to beat Nvidia's best, but only the cut-down chips cleared for Chinese export.
The AI race is like the nuclear arms race. Countries like China will devote an inordinate amount of resources to be the best - it may take a year or two, but in the grand scheme of things that is nothing.
And NVIDIA will lose its dominance for the simple reason that the Chinese companies can serve the growing number of countries under US sanctions. I even suspect it won't be long before the US will try to sanction any allies that buy Chinese AI chips!
China and Russia collectively have a talent pool dense enough to build future products and services the rest of the world uses, if China can produce comparable hardware for AI.
Simple example being TikTok.
It's just a matter of time really.
If Russia has a dense talent pool, why are they decades behind in chip design and manufacturing?
You need more than talent: funding, a culture of entrepreneurship, government support, the trust of partners, suppliers, etc.
Russia has none of that at the scale needed.
Everything in China is a copy though. Even your example, TikTok, is a Vine clone.
The Europeans invented the car and Ford mass produced it.
Yet, we see Ford as extremely innovative and revolutionary. I think we can draw lots of parallels between a 19th and early 20th century industrializing US and current China.
You may disparage TikTok as a Vine clone, but it redefined the state of the art for recsys algorithms. Google and Meta had to play catch-up with how quickly and how well TikTok discovers videos users find interesting out of the ocean of available content.
Yeah, it's incredibly disrespectful to call TikTok just a Vine clone.
Most of Meta's engagement comes from video content. Continuous engagement is how it is able to generate its revenue.
That's all I need to say!
> And NVIDIA will lose its dominance
They are vendor locking industries, i don't think they'll loose their dominance, however, vendor locked companies will loose their competitiveness
"lose" not "loose" please.
Indeed. You could (and probably should) view the export restrictions as a subsidy for Chinese manufacturing.
This is not true, and a lot of Nvidia's chips are being smuggled into the country. There's a ton of domestic pressure to be the leading chip producer. It's part of China's strategic plan called Made in China 2025.
Considering the fact that China controls most of the world's supply of rare minerals, that the US is led by an incompetent leader, and that Nvidia loses a big market, I think China can compete with even the leading Nvidia chips in a couple of years' time.
If that happens, China in turn can export those chips to countries that are in dire need of chips, like Russia. They can export to Africa, South America and the rest of Asia, thus resulting in more competition for Nvidia. I see bright times ahead, where the USA no longer controls all of the world's chip supply and operating systems.
I see this as an absolute win.
China doesn't control the supply of rare minerals but rather their production. Rare minerals are not really rare, but processing them is a "dirty" business and does a lot of damage to the environment.
China has managed to monopolise the production (cheap prices) and advance the refinement process, so other domestic projects to extract rare earth minerals were not really profitable. To start it again would take some time.
Why do we look at these as a race? There is nothing to win. Nobody won space, or nukes, and they won’t win AI. You might get there first, but your competitor will get there soon after regardless. Embrace it.
We win. The companies think they'll "win", and I'm fine letting them. The race is good for us.
There is no us and them!
But them, they do not think the same.
Huh? Things would certainly have turned out very differently if Nazi Germany or Imperial Japan had won the nuke race.
This conveniently coincides with China banning purchases of Nvidia AI chips:
US government f'ed over Nvidia's China market dominance in order to help OpenAI, Google, Anthropic, xAI.
China shouldn't be buying H20s. Those are gimped 3 year old GPUs. If Nvidia were allowed to sell the latest and greatest in China, I think their revenue would jump massively.
This article is propaganda.
If you have the most basic understanding of chips, you know it's not just design: design is tightly coupled to manufacturing, and this article doesn't say where, by whom, or how the chips are being made.
China, at last check, was behind Intel's homegrown chipmaking efforts when it came to sizes and yields.
Hype and saber rattling to get the US to (re)act, or possibly ignore the growing black market for Nvidia gear (that also happens to be bi-directional, with modified cards flowing back to the US).
Several years ago, whenever a Chinese engineer dared to propose using Chinese parts, the challenge he or she had to face was always "who is going to be responsible if it is not reliable enough in its quality?"
Nowadays, whenever a Chinese engineer dares to propose using American parts, the challenge he or she has to face is always "who is going to be responsible if it is not reliable enough in its supply?"
For a comparison, the latest Nvidia Blackwell cards have up to 8tb/sec memory bandwidth vs 700GB/sec here.
funny that you didn't capitalize TB properly but did GB. :)
anyway.
VR200 supposedly has 20TB/s HBM, so I wish good luck to all these copycats trying to catch up.
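To put the bandwidth numbers in this subthread side by side, here is a rough back-of-the-envelope sketch. It assumes memory-bound decoding, where each generated token has to stream all model weights from memory once, so tokens/s is bounded by bandwidth divided by weight size. The 70 GB model size is purely an illustrative assumption (not from the article or the thread), and real throughput depends on batching, compute, and interconnect, none of which are modelled here.

```python
def decode_tokens_per_sec(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/s when every token must stream all weights once."""
    return bandwidth_gb_s / weights_gb

# Assumption (not from the thread): a 70B-parameter model at 1 byte per weight.
WEIGHTS_GB = 70.0

# Bandwidth figures as quoted in the comments above, converted to GB/s.
QUOTED_BANDWIDTH_GB_S = {
    "chip discussed here (700 GB/s)": 700.0,
    "Blackwell (up to 8 TB/s)": 8_000.0,
    "VR200 (supposedly 20 TB/s)": 20_000.0,
}

for name, bw in QUOTED_BANDWIDTH_GB_S.items():
    bound = decode_tokens_per_sec(bw, WEIGHTS_GB)
    print(f"{name}: at most ~{bound:.0f} tokens/s per card")
```

Under these assumptions the gap between cards is simply the bandwidth ratio (roughly 11x between 700 GB/s and 8 TB/s), which is why the lack of compute and interconnect figures matters for any fuller comparison.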
Not a word on compute or interconnect speeds. All this really says is they stuck some HBM on a chip.
One of these headlines in the next few months will spark a US market selloff greater than what we saw on the initial DeepSeek release.
I believe about 1000 S&P points down - to just above the trade war lows from April.
If CUDA is Nvidia's moat, which has basically created a monopoly, how long until there is an anti-monopoly trial against them in the EU or even in the US?
I am hoping they release it as a fully open-source design, or with as much documentation and openness as they can.
So about 5 years behind the cutting edge. SMIC showed their advanced lithography tools today (still no ASML), but come 2030 at this rate? Hard to say they won't catch up.
While their lithography may lag, their system-level engineering is leveraging unique strengths. China's lack of power constraints allows them to build massive, optically-networked systems like the CloudMatrix 384. There is a SemiAnalysis piece that compares it to Nvidia's GB200 NVL72. It looks like they overcome weaker individual chips to outperform Nvidia's GB200 NVL72 with 2x the compute, 3.6x the aggregate memory, and 2.1x the memory bandwidth, relying on scale-out networking and software optimization, not just silicon.
The A100 GPU is almost 5 years old at this point, and still useful for a lot of things.
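A quick sketch of what those system-level ratios imply per chip. It assumes the chip counts are the ones in the product names (384 accelerators in CloudMatrix 384, 72 GPUs in GB200 NVL72) and takes the 2x / 3.6x / 2.1x figures from the comment above at face value; nothing here is measured.

```python
# Assumption: accelerator counts are taken from the system names.
CLOUDMATRIX_CHIPS = 384
NVL72_GPUS = 72

# System-level ratios (CloudMatrix 384 relative to GB200 NVL72) as quoted above.
SYSTEM_RATIOS = {
    "compute": 2.0,
    "aggregate memory": 3.6,
    "memory bandwidth": 2.1,
}

def per_chip_ratio(system_ratio: float,
                   chips_a: int = CLOUDMATRIX_CHIPS,
                   chips_b: int = NVL72_GPUS) -> float:
    """Convert a system-vs-system ratio into a chip-vs-chip ratio."""
    return system_ratio * chips_b / chips_a

for metric, ratio in SYSTEM_RATIOS.items():
    print(f"{metric}: {ratio}x per system, "
          f"~{per_chip_ratio(ratio):.2f}x per chip")
```

With about 5.3x as many chips, each one only needs to deliver a fraction of a GB200 GPU's compute and bandwidth for the whole system to come out ahead, which is the "weaker individual chips" point made above.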
I hope China floods the world with cheap, affordable GPUs. We’re sick and tired of the Nvidia tax.
Isn't the H20 nerfed anyway? H20's FP16/BF16 performance is reduced to ~148 TFLOPS vs ~1,979 TFLOPS for the H100?
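For scale, a tiny sketch of the gap implied by those two numbers, taking both at face value as quoted in the comment above (they are not verified here, and the two figures may not be measured on the same basis).

```python
# Figures as quoted in the comment above (approximate, unverified).
H20_TFLOPS = 148.0
H100_TFLOPS = 1979.0

def throughput_gap(fast: float, slow: float) -> float:
    """How many times faster the first figure is than the second."""
    return fast / slow

gap = throughput_gap(H100_TFLOPS, H20_TFLOPS)
share = H20_TFLOPS / H100_TFLOPS

print(f"Quoted H100 figure is ~{gap:.1f}x the quoted H20 figure")
print(f"Quoted H20 figure is ~{share:.1%} of the quoted H100 figure")
```

On these numbers the H20 is an order of magnitude down on raw FP16/BF16 throughput, so a domestic chip only has to clear a fairly low compute bar to be "comparable".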
This is typical CCP propaganda. If Alibaba truly had a chip that was remotely comparable to the H20, they wouldn’t need to ban the H20.
If the card is legit and China can scale it to millions,
then it's just a matter of time before we find out whether a SOTA model gets produced in China first.
So faced with a choice of buying hobbled H20 GPU chips vs developing their own (so far behind the SOTA), the Chinese market decided to develop/buy their own GPU chips?
Who could have possibly seen this coming? /s
tldr it's somewhat comparable to an A100, which was released in May 2020.
If China is ok spending a few years catching up on chips then they must not think that "AGI" or a serious takeoff of AI is near.
Does any normal person think "AGI" is real?
I thought that was just the marketing strategy execs employed to get regulatory capture and convince all the AGI-pilled researchers to work for them.
They seem to be highly pragmatic. Rather than chasing AGI, they are more interested in what can be done with today's technology. Any breakthrough towards AGI will inevitably leak quickly, so they'll be able to catch up as long as the foundation is ready. In a bicycle race, it can be quite beneficial to travel behind the leader and enjoy a reduction in drag forces. Perhaps that's their guiding principle.