Ask HN: Hardware for learning LLM fine-tuning

2 points by rreyes1979 2 years ago · 3 comments

What would be your suggestions on hardware to either rent or buy to get into LLM fine-tuning without going broke? Got a personal project and would like to start exploring fine-tuning (Llama2?) in the near future.

ilaksh 2 years ago

Fine-tuning in what respect? With large datasets, or small ones? That makes a huge difference in training time and cost.

Look into QLoRA or 8-bit quantization. You won't need a lot of memory if you do it locally with a small model and don't merge the weights. You can also use Modal Labs, RunPod, or Replicate.com; they offer serverless, per-second billing, which is great for testing inference.
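
For example, a QLoRA-style setup with the Hugging Face stack looks roughly like the sketch below. This assumes the transformers, peft, and bitsandbytes libraries; the model name and hyperparameters are illustrative placeholders, not recommendations:

    # Minimal QLoRA sketch: 4-bit base model + small trainable LoRA adapters.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model

    model_id = "meta-llama/Llama-2-7b-hf"  # placeholder; gated, requires access

    # Load the frozen base model in 4-bit; this is what keeps memory low.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Attach LoRA adapters; only these small matrices get trained, and you
    # can save and ship them without merging into the base weights.
    lora_config = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of the total

From there you train with any standard trainer and save just the adapter, which is tiny.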

Amazon SageMaker training jobs can also work. Just be careful not to leave an inference endpoint running if you use Amazon for that too; you can easily go broke.
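
If you do go the SageMaker route, a teardown snippet like this is worth running after every test so nothing keeps billing (a sketch; the endpoint name is a placeholder):

    # Delete a SageMaker inference endpoint and its config to stop charges.
    import boto3

    sm = boto3.client("sagemaker")
    sm.delete_endpoint(EndpointName="my-test-endpoint")
    sm.delete_endpoint_config(EndpointConfigName="my-test-endpoint")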

  • rreyes1979 (OP) 2 years ago

    I am as much of a n00b as one can be on this (although I've been doing software engineering for more than 20 years now), so please forgive any nonsense I may express.

    My intention is to work on news clustering and summarization. So far, just by using some "clever" prompts, I have been able to generate some pretty good news summaries; I have not started on clustering yet. But I have used GPT-4 so far, and my educated guess is that soon enough I will hit some quality/cost limits. So fine-tuning a Llama 2 model with (hopefully) small datasets to improve cost and quality on my specific tasks seems like a reasonable path forward.
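
    To make that concrete, the clustering half could look something like the sketch below: embed the articles, then group them (this is one common approach I'm assuming, not a settled plan; the embedding model and cluster count are placeholders).

        # Sketch: embed news articles, then cluster them before summarizing.
        from sentence_transformers import SentenceTransformer
        from sklearn.cluster import KMeans

        articles = [
            "Fed raises rates by 25 basis points...",
            "Central bank hikes interest rates again...",
            "New open-source LLM released...",
        ]

        embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model
        embeddings = embedder.encode(articles)

        # Each resulting cluster can then be summarized with a single prompt.
        labels = KMeans(n_clusters=2, random_state=0).fit_predict(embeddings)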

    Does that make sense? Thank you for your answer!!!

    • ilaksh 2 years ago

      I think whether you'll be able to maintain the quality depends on the task and on how the fine-tune turns out, which is mainly a function of the training dataset and the ability of the base model.
