From GAN to WGAN (2017)
lilianweng.github.io
If anyone is looking for a good project, alternate Wasserstein distance/EMD solvers would be a good place to start.
AFAIK everyone (e.g., pyemd, gensim, textacy) uses wrappers around the EMD solver from http://ofirpele.droppages.com/, which is a zip file from some time in 2008. Its performance limits mean it can't practically be used for things like interactive nearest-neighbor calculations (FAISS, nmslib, annoy, etc.).
Looks like a complex topic, but the code is small enough to help remove the rust from my math gears. I guess everybody would want it ported to CUDA/OpenCL/Vulkan?
It is weird that everyone uses this relatively old EMD solver.
In this case the WGAN doesn't actually compute the discrete EMD like that. Instead it puts a constraint on the critic during optimization (weight clipping in the original WGAN), which arguably makes the training objective equivalent, in the limit, to minimizing the continuous Wasserstein distance between probability distributions.
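To make the clipping idea concrete, here is a toy sketch, not the actual WGAN training loop. It assumes a hypothetical 1-D linear critic f(x) = w * x, plain gradient ascent, and made-up sample lists; a real WGAN uses a neural-network critic and a generator that is trained alongside it. The names `train_clipped_critic`, `clip`, and `lr` are illustrative only.

```python
def train_clipped_critic(real, fake, clip=0.01, lr=0.005, steps=100):
    """Toy critic f(x) = w * x, trained by gradient ascent with weight clipping."""
    mean_real = sum(real) / len(real)
    mean_fake = sum(fake) / len(fake)
    w = 0.0
    for _ in range(steps):
        # Objective: mean f(real) - mean f(fake) = w * (mean_real - mean_fake),
        # so its gradient with respect to w is just the difference of means.
        grad_w = mean_real - mean_fake
        w += lr * grad_w
        # Clipping keeps |w| <= clip, which bounds the critic's slope
        # (its Lipschitz constant) by clip.
        w = max(-clip, min(clip, w))
    gap = w * (mean_real - mean_fake)
    return w, gap

# Hypothetical data: the "real" samples are the "fake" ones shifted by 2.
fake = [0.0, 1.0, 2.0, 3.0]
real = [x + 2.0 for x in fake]

clip = 0.01
w, gap = train_clipped_critic(real, fake, clip=clip)
estimate = gap / clip  # rescale by the Lipschitz bound
print(w, estimate)
```

For a pure shift like this, a linear critic happens to be optimal, so the rescaled gap lands at about 2, the size of the shift. In general a clipped critic only gives a rough lower-bound-style estimate, and nothing here ever solves a discrete transport problem.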
FWIW, the code on that page computes EMD for histograms, which is basically the easy case. The EMD over a high-dimensional continuous distribution like images is intractable by comparison.
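For a sense of how easy the histogram case can get, here is a sketch of the simplest special case: two 1-D histograms over the same equally spaced bins, with the ground distance assumed to be the number of bins moved. This is not the solver from the linked page, which handles general ground distances; the function name `emd_1d` is made up for illustration.

```python
from itertools import accumulate

def emd_1d(p, q):
    """EMD between two 1-D histograms on identical, unit-spaced bins.

    In one dimension the optimal transport cost equals the summed absolute
    difference between the two cumulative distributions.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    # Normalize so both histograms carry the same total mass (1.0).
    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    cdf_p = accumulate(p)
    cdf_q = accumulate(q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))

# All mass moves from the first bin to the third: two bins of travel.
print(emd_1d([1, 0, 0], [0, 0, 1]))
# Each half of the mass shifts one bin to the right.
print(emd_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5]))
```

This runs in linear time in the number of bins. The shortcut only works because 1-D bins have a natural ordering; with a general ground distance, or in higher dimensions, you are back to solving a transport problem.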