I read the DeepSeek papers and I think: “None of this is complicated. I’m just way too lazy to write my own CUDA code. I’m American. This is America. Programming your GPU at 100% efficiency? Mfer, if my PyTorch code is shitty, I’ll just buy more GPUs.”
