We Hit 10,500 Tokens/Sec on B200 | Morph

Technical deep-dive: custom CUDA kernels + speculative execution for 2.3x speedup

Tejas Bhakta

September 15, 2025 · 4 min read
