TensorFlow-DirectML
Just tried it.
On integrated graphics (Intel HD 620, batch size 1000):
- I was able to train a simple dense network, but saw no speedup over CPU training on the same processor (i3-7100U)
- a ResNet-style architecture failed on the same HD 620 with "LLVM ERROR: SPIRV internal error: Invalid magic number"
On a machine with an NVIDIA GPU (batch size 1000):
- unlike on the Intel GPU, ResNet trained without any errors (so the failure may have been an Intel driver issue)
- DirectML came out about 3 times faster than that machine's CPU (i7-8700K)
- DirectML came out about 12 times slower than regular tensorflow-gpu with CUDA
So far I have mixed feelings, but I am excited to see how it runs on AMD GPUs and on Windows on ARM64 (e.g. Surface Pro X).
P.S. I ran https://github.com/losttech/Gradient-Samples/tree/master/Fas... and https://github.com/losttech/Gradient-Samples/tree/master/Res...
I had to add "batch_size: 1000" to the fit call to see speedups over CPU.
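
In case it helps anyone reproduce this, here is a rough sketch of why the batch size might matter. It assumes the usual explanation: every training step carries some fixed dispatch overhead, so fewer, larger steps give the GPU a better chance to pull ahead. I did not measure that overhead. The dataset size below is a made-up illustrative number, not the size of the samples above; 32 is the Keras default batch size.

```python
import math

def steps_per_epoch(num_samples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

# Hypothetical dataset size, for illustration only.
NUM_SAMPLES = 60_000

for batch_size in (32, 1000):  # 32 is the Keras default for fit()
    steps = steps_per_epoch(NUM_SAMPLES, batch_size)
    print(f"batch_size={batch_size}: {steps} steps per epoch")
```

In Python Keras the equivalent of my change would be something like model.fit(x, y, batch_size=1000); the samples I ran are C#, hence the "batch_size: 1000" named-argument syntax.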