LLMs Powered by Kolmogorov-Arnold Networks

6 points by adityang5 2 years ago · 1 comment · 1 min read


Seeing as the authors claim that KANs reduce the catastrophic forgetting we see in MLPs, I thought, "Wouldn't it be nice if there was an LLM that replaced its MLPs with KANs?" I looked around and didn't find one, so I built one!

- PyTorch module (kan_gpt)

- Deployed to PyPI

- MIT Licence

- Test Cases to ensure forward-backward passes work as expected

- Training script

I am currently training it on the WebText dataset to compare it to the original GPT-2, but I'm running into out-of-memory issues at the moment. Perhaps the vocab size (50257) is too large?
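A rough back-of-the-envelope check of the vocab-size hunch, as a sketch only. The batch size, sequence length, model width, grid size, and spline order below are illustrative assumptions, not kan_gpt's actual configuration, and the helper names are made up for this example. The KAN count uses the approximate in × out × (grid size + spline order) coefficient estimate for a spline-based layer, and it assumes the output head is itself a KAN layer, which may not be how the model is built.

```python
# Hypothetical helpers for estimating memory and parameter counts.
# None of these names come from kan_gpt; all sizes are assumptions.

def logits_bytes(batch_size, seq_len, vocab_size, bytes_per_value=4):
    """Bytes needed to hold one float32 logits tensor of shape
    (batch_size, seq_len, vocab_size)."""
    return batch_size * seq_len * vocab_size * bytes_per_value


def linear_params(d_in, d_out):
    """Parameter count of a plain linear layer with bias."""
    return d_in * d_out + d_out


def kan_layer_params(d_in, d_out, grid_size, spline_order):
    """Approximate spline-coefficient count of a KAN layer: one learnable
    1-D function per (input, output) edge, each with about
    grid_size + spline_order coefficients. Ignores base weights and scales."""
    return d_in * d_out * (grid_size + spline_order)


VOCAB = 50257      # from the post
D_MODEL = 768      # assumed: GPT-2 small width
SEQ_LEN = 1024     # assumed: GPT-2 context length
BATCH = 8          # assumed batch size

gib = 1024 ** 3
print(f"logits tensor: {logits_bytes(BATCH, SEQ_LEN, VOCAB) / gib:.2f} GiB")
print(f"linear head params: {linear_params(D_MODEL, VOCAB):,}")
print(f"KAN head params:    {kan_layer_params(D_MODEL, VOCAB, 5, 3):,}")
```

Under these assumptions the logits tensor alone is roughly 1.5 GiB before counting its gradient, and a KAN output head would carry about eight times the weights of a linear one, so the vocab size is at least a plausible suspect. Shrinking the batch size or sequence length in the sketch shows how quickly the logits term drops.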

I'm open to contributions and would love to hear your thoughts!

p1esk 2 years ago

Why don’t you test it first with a small model on MNIST or CIFAR?
