Myths and Urban Legends About Dual-Socket Servers
amd.com
With chiplets being a thing these days, I guess this is really about where in the overall system the best place is to put the package boundary and socket/pins? And then another approach to that same question would be what Apple is doing with in-package LPDDR5 (which I think I heard AMD is copying with a custom line for... Microsoft / Azure I think it was?).
My understanding is that one factor pushing hyperscalers towards dual-socket was the cost of the network fabric - for the longest time, having two CPU sockets per NIC / per leaf switch port was the overall system price/performance sweet spot for many workloads. More sockets require more expensive CPUs, while single-socket servers need twice as many NICs and twice as many top-of-rack switch ports.
With newer/faster Ethernet standards you still need twice as many NICs but you can often split the lanes coming out of a switch chip and use a Y cable.
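To put rough numbers on the NIC and switch-port argument above, here is a minimal sketch. It assumes one NIC per server and treats a Y (breakout) cable as letting two NICs share one physical switch port; the function name, the 64-socket fleet size, and the parameters are made up for illustration, and real pricing is ignored entirely.

```python
import math


def fabric_parts(total_sockets, sockets_per_server, nics_per_server=1, breakout=1):
    """Count servers, NICs, and physical switch ports for a fleet of CPU sockets.

    breakout is how many NICs share one physical switch port (2 for a Y cable).
    All names and defaults here are illustrative assumptions.
    """
    servers = math.ceil(total_sockets / sockets_per_server)
    nics = servers * nics_per_server
    switch_ports = math.ceil(nics / breakout)
    return {"servers": servers, "nics": nics, "switch_ports": switch_ports}


# Hypothetical fleet of 64 sockets, built three ways.
dual = fabric_parts(64, sockets_per_server=2)
single = fabric_parts(64, sockets_per_server=1)
single_split = fabric_parts(64, sockets_per_server=1, breakout=2)

for name, parts in [("dual", dual), ("single", single), ("single + Y cable", single_split)]:
    print(name, parts)
```

With these assumptions the single-socket fleet needs twice the NICs either way, but the breakout case gets back to the same number of physical switch ports as the dual-socket fleet.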
This article reads like an AMD advertisement for their EPYC processor line.
It was written by Robert Hormuth, AMD’s Corporate Vice President, Architecture & Strategy – Data Center Solutions Group, and published on amd.com.
Are you one of the reasons why SEO spam sites are clicked on so often?
> Are you one of the reasons why SEO spam sites are clicked on so often?
I don't think your poorly thought-through personal attack has any relevance to the topic. I clicked on the article because there was a submission on HN with the title "Myths and Urban Legends About Dual-Socket Servers". What leads you to believe SEO holds any relevance?
It does but it also seems like AMD is saying just buy one CPU, which is weird because you’d think they would want you to buy two to double the profit.
They’re fighting a calcified perception in the corporate IT market that the standard “unit of scale” should be a dual-socket system, because they have a differentiated product in EPYC that shines single-socket.
This likely still remains a major market barrier for them: Outside of the always-be-optimizing hyperscalers, “ordinary” datacenter buyers tend to follow old patterns and rules of thumb from generation to generation.
AMD chips have more cores than Intel chips, so pushing for a single powerful CPU means "Buy AMD, not Intel."
Of course they'll like it even better if you bought two AMD chips instead of one, but they probably don't care as much whether you put those chips into one server or two.
Intel will have more cores shortly.
> Intel will have more cores shortly.
The article was posted in 2023.
More weak cores with lower overall performance.
> It does but it also seems like AMD is saying just buy one CPU, which is weird because you’d think they would want you to buy two to double the profit.
They are saying "buy a single EPYC instead of two of our competition".
I think that's because it is.
True, but it reminds us that advertising can be educational, technical, and straightforward instead of manipulative, emotional, and focus-grouped.
There's nothing wrong per se with writing an article about your awesome product and why everyone should use it
It’s on amd.com
EPYC brings plenty of NUMA complexity in a single socket unfortunately. If you just want to solve system performance riddles then one socket is plenty. I seem to recall that Facebook publicly announced that they switched their web server systems to 1 socket more than 8 years ago. Since that time Netflix has written several times about how they carefully keep the sides of a 2S server from interfering with each other and I always wondered why they bother, why they don't just saw the system in half and save themselves the trouble.
Netflix's CDN is optimized to reduce space and effort needed by ISPs to install them.
They want a small box, because ISPs have limited space.
They want a single LACP group, because ISPs have limited ports, and to use only one IP address, because ISPs have limited addresses.
And they want to make it easy to plug in properly, so that they can reduce communication with the ISP.
These all add up to a dual socket node over two single socket nodes in one box. Although, as single socket capabilities increase, they may end up with a single socket node instead.
Not mentioned in this is the issue of memory scaling.
DRAM price per GB has been roughly flat for well over a decade - consumer prices hit $4/GB in 2011, and have fluctuated around there ever since - most of the drop in real cost since then has been due to declining value of that $4. Prices for large enterprises/hyperscalers are probably similar, as it’s a low-margin commodity market.
Two sockets get you more memory channels and more DIMMs, but as memory price causes the RAM/CPU ratio to drop, and single-channel bandwidth increases with DDR5, that becomes less important.
Of course that’s one of those things you can’t really say to customers, kind of like “you don’t really need 250hp in a passenger sedan”.
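The "declining value of that $4" point above is just a deflation calculation. A minimal sketch follows; the $4/GB figure comes from the comment, while the 2.5% annual inflation rate and the 12-year span are illustrative assumptions, not measured data.

```python
def real_price(nominal_price, annual_inflation, years):
    """Express a nominal price in base-year dollars by deflating it."""
    return nominal_price / (1 + annual_inflation) ** years


# $4/GB is from the comment; the rate and span below are assumptions.
NOMINAL_PER_GB = 4.0
ASSUMED_INFLATION = 0.025
ASSUMED_YEARS = 12

price_in_base_year_dollars = real_price(NOMINAL_PER_GB, ASSUMED_INFLATION, ASSUMED_YEARS)
real_drop = 1 - price_in_base_year_dollars / NOMINAL_PER_GB

print(f"real price: ${price_in_base_year_dollars:.2f}/GB, real drop: {real_drop:.0%}")
```

Under those assumed inputs a flat nominal price works out to roughly a quarter off in real terms, which is small next to what DRAM pricing did in earlier decades.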
I thought that most of the reason that Apple's newer CPUs are supposed to be so good for LLMs is that the integrated memory lets them have more channels than usual?
Yes, but that has nothing to do with servers which is the topic of this thread.
If price didn't matter, what would be the best performing CPU available today? With lots of PCIe lanes and high single-thread performance?
If single-threaded perf is all that matters, probably EPYC 9175F. Zen 5, 16 chiplets, one core per chiplet. Each core has 32 MB of L3. Boosts to 5 GHz. 128 lanes of PCIe 5.0.
If/when they make a v-cache version of this, that'll most likely be even better: Zen5 v-cache doesn't have the clock speed penalty that previous generations did (because the cache is underneath instead of on top) and 96MB of L3 per core would be monstrous.
For what workload?
Probably AMD Turin.
If you want to run inference on big LLMs on CPU, you really want those 2x 12 memory channels a Dual EPYC system offers you for decent speeds.
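A back-of-the-envelope sketch of why the channel count matters: token generation on CPU is usually limited by how fast the weights can be streamed from RAM, so peak memory bandwidth divided by model size gives a rough ceiling on tokens per second. The DDR5-6000 speed and the 70 GB model size below are assumptions for illustration, and the dual-socket figure is only reachable if the weights are spread across both sockets' memory.

```python
def peak_bandwidth_gbs(channels, mega_transfers_per_s, bytes_per_transfer=8):
    """Theoretical peak DRAM bandwidth in GB/s (64-bit data path per channel)."""
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


def tokens_per_s_ceiling(bandwidth_gbs, model_gb):
    """Rough upper bound: every generated token reads all weights once."""
    return bandwidth_gbs / model_gb


ASSUMED_MTS = 6000      # DDR5-6000, an assumption about the DIMMs fitted
ASSUMED_MODEL_GB = 70   # hypothetical model footprint in memory

one_socket = peak_bandwidth_gbs(12, ASSUMED_MTS)
two_sockets = peak_bandwidth_gbs(2 * 12, ASSUMED_MTS)

print(f"1 socket:  {one_socket:.0f} GB/s, <= {tokens_per_s_ceiling(one_socket, ASSUMED_MODEL_GB):.1f} tok/s")
print(f"2 sockets: {two_sockets:.0f} GB/s, <= {tokens_per_s_ceiling(two_sockets, ASSUMED_MODEL_GB):.1f} tok/s")
```

Real throughput will land below these ceilings, since sustained bandwidth is lower than peak and cross-socket traffic adds latency.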
> (..) you really want those 2x 12 memory channels a Dual EPYC system offers (...)
I had to check and I was amazed that there are companies selling workstations with dual EPYC processors, providing a whopping 256 CPU cores and over 2TB of DDR5. All in a desktop form factor. Amazing.