SambaNova chip for LLMs nearly as fast as Groq
fast.snova.ai

If the crypto wave was any indication, LLM ASICs will gain popularity as general-purpose devices like Nvidia GPUs get phased out.
I'm hoping the rise of ASICs leads to hardware-agnostic representations of ML models, or at least seamless conversion from CUDA to other target hardware.
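ONNX is the closest thing to this today. A minimal sketch, assuming PyTorch with its bundled ONNX exporter (the toy model and file name are placeholders), of exporting a model to a hardware-agnostic graph that vendor toolchains could then compile for their own silicon:

    import torch
    import torch.nn as nn

    # Stand-in model; any torch.nn.Module exports the same way.
    model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
    model.eval()

    # Trace with a dummy input and write a hardware-agnostic ONNX graph,
    # which different backends (GPU runtimes, ASIC compilers) can ingest.
    torch.onnx.export(
        model,
        torch.randn(1, 128),
        "model.onnx",
        input_names=["input"],
        output_names=["output"],
        dynamic_axes={"input": {0: "batch"}},  # allow variable batch size
    )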
With LLMs standardizing around known instructions, I'm bullish about this family of hardware. Now, they may just get acquired, given the infinite money Nvidia has. But I see their IP being valuable by itself.
I've used Groq, and it's been one of my few 'oh wow' moments of the recent LLM cycle.
This offers no insight into the architecture of either system or the chips they developed. Also, why would I want to access their stack and have them use my queries as training data?
I see nothing from either Groq or SambaNova that says they will distribute dedicated inference chips in any form other than full data centers. If I can't slot it into my machine, is it real? How exactly are these companies envisioning the future of inference? As a walled corporate garden where we pay them with our thoughts, code, low-complexity tasks, and small change, so they can cement their hold on our material lives and make themselves indispensable as they imperceptibly shape our outcomes? At least Tenstorrent is selling something I could slot in, although I don't think they are at the point where it makes sense to do so.
Let me know how that goes.
When you're competing with NVIDIA, 'nearly as fast' as the next best competitor is not going to cut it. I wonder what part of the market they're looking for here.
Both are faster than NVIDIA! Always good to see competition!
I mean that 'almost as fast as Groq' is not a compelling strategy. If it's almost as fast, why not just go with Groq? Are they going for cheaper? Added services? I'm just wondering.
Groq quantize their models, killing accuracy. SambaNova run the models at full precision, and on a single node, whilst Groq need hundreds of chips to serve a single model. SambaNova run both training and inference; Groq only run inference. And SambaNova can run hundreds of models on a single node without any performance degradation.
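To make the quantization point concrete, here's a toy sketch (NumPy; illustrative only, not either vendor's actual scheme) of symmetric int8 weight quantization and the rounding error it introduces:

    import numpy as np

    rng = np.random.default_rng(0)
    w = rng.standard_normal(4096).astype(np.float32)  # pretend weight tensor

    # Symmetric per-tensor int8: scale so that max |w| maps to 127.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    w_hat = q.astype(np.float32) * scale  # dequantized weights

    err = np.abs(w - w_hat)
    print(f"mean abs error: {err.mean():.6f}, max: {err.max():.6f}")
    # Small per-layer rounding errors accumulate across dozens of layers,
    # which is where the end-to-end accuracy loss shows up.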
Groq are trying to launch their own cloud, running it at a loss to acquire users and raise venture capital so they can tape out a new chip (their current chip just isn't very competitive: it gets its speed from a good interconnect, but it is very cost-ineffective). SambaNova sell systems to big companies and service providers, and don't have a paid cloud API, so they aren't as visible as Groq. But SambaNova make real revenue whilst Groq have literally zero revenue, which is why they struggle to raise money.
Very soon people will realize that Groq are frauds and SambaNova is the legit challenger to Nvidia.
Well, Groq's memory-per-rack looks very low. The whole world now understands how exciting very high inference throughput is, in a way that almost nobody did when Groq started. (I think I saw an early pitch deck and don't recall fast inference even being a differentiator in it, although I could be wrong.) However, the number of Groq chips/servers/whatever you call them needed to run Llama 400B looks like a lot. Like many, many racks' worth.
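Back-of-the-envelope, assuming Groq's published figure of roughly 230 MB of on-chip SRAM per chip and fp16 weights (ignoring KV cache and inter-chip overhead entirely):

    params = 400e9             # Llama 400B-class model
    weight_bytes = 2 * params  # ~800 GB of weights at fp16
    sram_per_chip = 230e6      # Groq's published ~230 MB SRAM per chip
    print(f"~{weight_bytes / sram_per_chip:,.0f} chips just to hold the weights")
    # ~3,478 chips before you've cached a single token.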
Plus, Groq claims to have converted to being purely a cloud provider now, and will keep their hardware to themselves. Given that I can't even sign up for a pay-as-you-go API key with Groq right now, I think there's a lot of room for competitors.
Never got my verification link. Maybe it's because I used + notation in my email address.
Can this be mass-produced?