shreyansh26
- Karma: 10
- Created: 8 years ago
Recent Submissions
1. Deriving the gradient for the backward pass of Layer Normalization (shreyansh26.github.io)
2. GTC'25 Notes: CUDA Techniques to Maximize Memory Bandwidth – Part 1 (shreyansh26.github.io)
3. FlashAttention in PyTorch (github.com)
4. Understanding FlashAttention (shreyansh26.github.io)
5. Ask HN: What are some good resources on Recommender Systems?