gpjt
- Karma: 1,609
- Created: 17 years ago
About
https://www.gilesthomas.com/
Recent Submissions
- 1. Writing an LLM from scratch, part 32e – Interventions: the learning rate (gilesthomas.com)
- 2. Writing an LLM from scratch, part 32d – Interventions: adding attention bias (gilesthomas.com)
- 3. Writing an LLM from scratch, part 32c – Interventions: removing dropout (gilesthomas.com)
- 4. Writing an LLM from scratch, part 32b – Interventions: gradient clipping (gilesthomas.com)
- 5. Writing an LLM from scratch, part 32a – Interventions: training a baseline model (gilesthomas.com)
- 6. Getting a Custom PyTorch LLM onto the Hugging Face Hub (gilesthomas.com)
- 7. Writing an LLM from scratch, part 31 – the models are now on Hugging Face (gilesthomas.com)
- 8. Writing an LLM from scratch, part 30 – digging into the LLM-as-a-judge results (gilesthomas.com)
- 9. LLM from scratch, part 29 – using DDP to train a base model in the cloud (gilesthomas.com)
- 10. LLM from scratch, part 28 – training a base model from scratch on an RTX 3090 (gilesthomas.com)