GitHub - srush/LLM-Training-Puzzles: What would you do with 1000 H100s...

License

MIT license

LLM Training Puzzles

This is a collection of 8 challenging puzzles about training large language models (or really any NN) on many, many GPUs. Very few people actually get a chance to train on thousands of computers, but it is an interesting challenge and one that is critically important for modern AI. The goal of these puzzles is to get hands-on experience with the key primitives and to understand the goals of memory efficiency and compute pipelining.

I recommend running in Colab. Click here and copy the notebook to get started.

If you are into this kind of thing, this is the 6th in a series of these puzzles.
