shreyansh26

Karma: 11
Created: 8 years ago

Recent Submissions

1. Understanding Multi-Head Latent Attention (From DeepSeek) (shreyansh26.github.io) 2 points · 4 months ago · 1 comment
2. Deriving the gradient for the backward pass of Layer Normalization (shreyansh26.github.io) 3 points · 1 year ago · 0 comments
3. GTC'25 Notes: CUDA Techniques to Maximize Memory Bandwidth – Part 1 (shreyansh26.github.io) 1 point · 1 year ago · 0 comments
4. FlashAttention in PyTorch (github.com) 2 points · 3 years ago · 1 comment
5. Understanding FlashAttention (shreyansh26.github.io) 2 points · 3 years ago · 0 comments
6. Ask HN: What are some good resources on Recommender Systems? 14 points · 3 years ago · 3 comments
