1.32 Petaflops hardware for local prototyping

3 points by brody_slade_ai 10 months ago · 3 comments


I’ve been working on Vanta, a scalable AI hardware solution powered by 2–8 NVIDIA RTX 4090s, delivering up to 1.32 petaflops FP32 in a compact form factor.

It’s built for startups, developers and researchers to prototype, fine-tune and run models up to 70B parameters locally, so you can own your compute instead of renting it.
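As a rough sanity check on the 70B figure, here is a minimal sketch of a weights-only memory estimate. It assumes 24 GB of VRAM per RTX 4090 and ignores KV cache, activations, and optimizer state, so real headroom is smaller. The function names are hypothetical, made up for illustration.

```python
GB = 1e9  # decimal gigabytes, matching how VRAM sizes are usually quoted


def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed for the model weights alone, in GB."""
    return num_params * bytes_per_param / GB


def weights_fit(num_params, bytes_per_param, num_gpus, vram_per_gpu_gb=24):
    """True if the weights alone fit in the combined VRAM of the rig.

    Assumption: weights can be sharded evenly across GPUs with no overhead.
    """
    total_vram_gb = num_gpus * vram_per_gpu_gb
    return weight_memory_gb(num_params, bytes_per_param) <= total_vram_gb


if __name__ == "__main__":
    params_70b = 70e9
    # FP16 weights (2 bytes per parameter) on the 8-GPU and 2-GPU configs
    print(weight_memory_gb(params_70b, 2))
    print(weights_fit(params_70b, 2, num_gpus=8))
    print(weights_fit(params_70b, 2, num_gpus=2))
    # 4-bit quantized weights (0.5 bytes per parameter) on the 2-GPU config
    print(weights_fit(params_70b, 0.5, num_gpus=2))
```

Under these assumptions, FP16 weights for a 70B model need the larger configurations, while a 4-bit quantized copy fits on two cards for inference.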

- A 2-GPU setup costs $9k and breaks even in 9 months vs. cloud rental at $0.69/hr (e.g., RunPod).

- The 8-GPU setup at $40k saves $12k in year one compared to $48k in cloud costs.
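The arithmetic behind these comparisons can be sketched as below. This assumes the $0.69/hr rate is per GPU and that the hardware runs 24/7, neither of which is stated explicitly. It also ignores electricity, hosting, and resale value. The function names are hypothetical.

```python
HOURS_PER_MONTH = 24 * 30   # assumption: 24/7 utilization, 30-day month
HOURS_PER_YEAR = 24 * 365   # assumption: 24/7 utilization for a full year


def monthly_cloud_cost(num_gpus, rate_per_gpu_hour=0.69):
    """Cloud rental cost per month for num_gpus GPUs rented around the clock."""
    return num_gpus * rate_per_gpu_hour * HOURS_PER_MONTH


def break_even_months(hardware_cost, num_gpus, rate_per_gpu_hour=0.69):
    """Months of continuous cloud rental that add up to the hardware price."""
    return hardware_cost / monthly_cloud_cost(num_gpus, rate_per_gpu_hour)


def first_year_cloud_cost(num_gpus, rate_per_gpu_hour=0.69):
    """Cloud rental cost for one year of continuous use."""
    return num_gpus * rate_per_gpu_hour * HOURS_PER_YEAR


if __name__ == "__main__":
    print(round(break_even_months(9_000, num_gpus=2), 2))
    print(round(first_year_cloud_cost(num_gpus=8), 2))
    print(round(first_year_cloud_cost(num_gpus=8) - 40_000, 2))
```

With these inputs the 2-GPU break-even lands at about 9 months and the 8-GPU cloud bill at about $48k per year, matching the figures above. The first-year gap against a $40k rig comes out nearer $8k than $12k, so the $12k figure presumably rests on inputs not listed here. Lower utilization pushes the break-even point out proportionally.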

It can handle different AI frameworks: TensorFlow, PyTorch, ONNX, CUDA-optimized libraries, vLLM, SGLang, llama.cpp...

I can get it built in a day and shipped out quickly. Let me know what you think!

bigyabai 10 months ago

How do you intend to compete with Nvidia's DIGITS/DGX Spark hardware that is hitting the market soon?
- brody_slade_ai (OP) 10 months ago
  
  If you compare DIGITS and Vanta, DIGITS is better for FP4 inference, while a 4090 setup is more cost-effective per FP32 FLOP, which makes it a better fit for training large models and general compute.
  8 GPUs can handle parallel tasks better than DIGITS' single chip and scale more easily. The 192 GB of combined GPU memory also exceeds its 128 GB of unified memory for massive workloads.
  We’ve got 4090s on deck and stock is solid, so we can get a rig built in 1 day and shipped fast.
