DeepSeek Open Source DeepGEMM – FP8 GEMM Library (300 lines for 1350+ FP8 TFLOPS)

twitter.com

4 points by helloericsf a year ago · 1 comment

helloericsf (OP) a year ago

GitHub: https://github.com/deepseek-ai/DeepGEMM

- Up to 1350+ FP8 TFLOPS on Hopper GPUs
- No heavy dependency, as clean as a tutorial
- Fully Just-In-Time compiled
- Core logic at ~300 lines, yet outperforms expert-tuned kernels across most matrix sizes
- Supports dense layout and two MoE layouts
