A basic training example using GGML (ggml-org/ggml · Discussion #707)


Hi, I just want to share what I have been working on recently: an example of training a VAE on MNIST. The goal is to use only the ggml pipeline and its implementation of the Adam optimizer.
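For context, ggml exposes a small optimization API in ggml.h: you mark tensors as trainable with ggml_set_param, build a scalar loss, and hand it to ggml_opt. Below is a minimal sketch of that flow with a stand-in loss; the real VAE graph (encoder, reparameterization, decoder, ELBO loss) lives in the example repository, and the exact enum names (e.g. GGML_OPT_ADAM vs. GGML_OPT_TYPE_ADAM) vary between ggml versions.

```c
#include "ggml.h"

int main(void) {
    struct ggml_init_params ip = {
        /*.mem_size   =*/ 128 * 1024 * 1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ false,
    };
    struct ggml_context * ctx = ggml_init(ip);

    // toy "model": one weight matrix, marked as a trainable parameter
    struct ggml_tensor * w = ggml_new_tensor_2d(ctx, GGML_TYPE_F32, 784, 16);
    ggml_set_f32(w, 1.0f);
    ggml_set_param(ctx, w);

    // stand-in scalar loss: sum of squares of w
    // (the real loss is the VAE's reconstruction + KL terms)
    struct ggml_tensor * loss = ggml_sum(ctx, ggml_sqr(ctx, w));

    // run ggml's built-in Adam with default hyperparameters
    struct ggml_opt_params params = ggml_opt_default_params(GGML_OPT_ADAM);
    params.adam.n_iter = 100;
    enum ggml_opt_result res = ggml_opt(ctx, params, loss);

    ggml_free(ctx);
    return res == GGML_OPT_OK ? 0 : 1; // GGML_OPT_RESULT_OK in newer ggml
}
```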

There aren't many training examples that use ggml. The only one I found at first was baby-llama, but I don't think its way of doing the optimization is quite right. I later found another training example in llama.cpp that shows a proper way of using Adam.

Some of the modifications I had to make:

  • Reused the same forward and backward graphs during training, instead of rebuilding them every iteration (see the sketch after this list)
  • Changed the Adam and L-BFGS optimizers to make the GPU backends work
  • Added several missing ops in both the CPU and CUDA backends
  • Added hooks (callbacks) to the optimizer to run tests and generate samples (also sketched below)
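
To illustrate the first and last items: ggml's header already exposes ggml_opt_resume_g, which accepts prebuilt forward and backward graphs plus a per-step callback, and the changes build on that pattern. The sketch below assumes the ggml.h API of that era; train_callback and the nx argument are illustrative stand-ins, not code from the actual example.

```c
#include "ggml.h"
#include <stdio.h>

// Stand-in callback: the optimizer invokes this every accumulation step.
// *sched scales the learning rate; setting *cancel stops training early.
static void train_callback(void * data, int accum_step, float * sched, bool * cancel) {
    (void) data;
    *sched  = 1.0f;
    *cancel = false;
    if (accum_step == 0) {
        // e.g. run a validation pass or decode a few MNIST samples here
    }
}

// nx = total number of trainable parameter elements
// (sum of ggml_nelements over all tensors marked with ggml_set_param)
void train(struct ggml_context * ctx, struct ggml_tensor * loss, int64_t nx) {
    // build the forward graph once ...
    struct ggml_cgraph * gf = ggml_new_graph_custom(ctx, GGML_DEFAULT_GRAPH_SIZE, true);
    ggml_build_forward_expand(gf, loss);

    // ... and derive the backward graph from it once; both are then reused
    // for every iteration instead of being rebuilt
    struct ggml_cgraph * gb = ggml_graph_dup(ctx, gf);
    ggml_build_backward_expand(ctx, gf, gb, true);

    struct ggml_opt_context opt;
    struct ggml_opt_params  params = ggml_opt_default_params(GGML_OPT_ADAM);
    ggml_opt_init(ctx, &opt, params, nx);

    // ggml_opt_resume_g reuses gf/gb across iterations and fires the callback
    enum ggml_opt_result res = ggml_opt_resume_g(ctx, &opt, loss, gf, gb,
                                                 train_callback, NULL);
    printf("opt result: %d\n", res);
}
```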

Below are some samples from the VAE trained on MNIST after each epoch (10 epochs in total).
[Image grid: per-epoch samples, mnist-sample-epoch_1 through mnist-sample-epoch_10]