Show HN: Single file transformers implementation for learning

gist.github.com

2 points by vkaku 5 months ago · 1 comment

vkaku (OP) 5 months ago

This was entirely written by Grok 3.0.

The focus was on demonstrating training, inference, and attention, all in one file.

It can run on a GPU thanks to CuPy, so no custom kernel needs to be written for the whole thing to run. I definitely think more people can mess around with different attention mechanisms and models, and try training models on their own computers. That is the post.
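For anyone wondering what "no kernel needed" looks like in practice, here is a rough sketch of scaled dot-product attention written against the NumPy-style array API that CuPy mirrors. This is not the code from the gist; the names `xp`, `softmax`, and `attention`, and the NumPy fallback, are my own assumptions for illustration.

```python
# Illustrative sketch only, not taken from the linked gist.
try:
    import cupy as xp      # GPU arrays, if CuPy and a CUDA device are available
    xp.zeros(1)            # touch the device so a missing GPU fails here
except Exception:
    import numpy as xp     # CPU fallback exposing the same array functions


def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = xp.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v, causal=True):
    """Scaled dot-product attention over (seq_len, d) arrays."""
    t, d = q.shape
    scores = (q @ k.T) / (d ** 0.5)
    if causal:
        # -inf strictly above the diagonal, 0 elsewhere:
        # position i can only attend to positions <= i.
        mask = xp.triu(xp.full((t, t), float("-inf")), k=1)
        scores = scores + mask
    weights = softmax(scores, axis=-1)
    return weights @ v, weights


x = xp.arange(12, dtype=xp.float64).reshape(3, 4) / 10
out, weights = attention(x, x, x)
```

Trying a different attention mechanism then mostly means changing how `scores` or `mask` is built; the matmuls and elementwise ops are dispatched to the GPU by CuPy when it is the backend.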
