OsamaJaber

Karma: 226
Created: 7 months ago

Recent Submissions

  1. AutoMegaKernel: Compiling a LLM into a single CUDA kernel (arxiv.org)
  2. AutoMegaKernel: Compile an LLM into one provably-correct CUDA megakernel (github.com)
  3. StreamIndex: Memory-bounded compressed sparse attention via streaming top-k (arxiv.org)
  4. Show HN: AutoKernel, Auto GPU Kernel Optimization (arxiv.org)
  5. DeepSeek V4's indexer dies at 65K. We got it to 1M on 6GB (arxiv.org)
  6. AutoKernel: Autonomous GPU Kernel Optimization via Iterative Agent-Driven Search (arxiv.org)
  7. DeepSeek V4's indexer OOMs at 65K context. We got it to 1M in 6G (arxiv.org)
  8. Ouroboros: Dynamic Weight Generation for Recursive Transformers (arxiv.org)
  9. Tide: Token-Informed Depth Execution for Per-Token Early Exit in LLM Inference (arxiv.org)
  10. Own your AI. Optimized down to the kernel (runinfra.ai)
