Ask HN: Which of the following ML topics do you wish had good tutorials?
2. Distributed Reinforcement Learning with RLlib
2. Distributed Deep Learning with PyTorch
3. Reinforcement Learning with PyTorch
4. Linear Algebra for ML with numpy
5. Other (please specify)
I like to teach what I learn and have a few tutorials up on YouTube. I need your help figuring out what I should put up next.

I'd be interested in Distributed Deep Learning with PyTorch, but only if you really know what you're talking about. I wouldn't want you to repeat what is already on pytorch.org on this topic.
Cool, thanks for the response. Yes, I do find that the PyTorch tutorials on distributed training are a work-in-progress.
I was thinking of starting with a basic implementation of the original paper by Jeff Dean et al. on synchronous data parallelism, then implementing basic model parallelism, explaining why asynchronous parallelism works, doing a simple implementation of HOGWILD!, and finally doing "hello world" training with existing distributed training systems like Horovod, Distributed PyTorch, RayLib, Microsoft DeepSpeed, etc.
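For anyone curious what the HOGWILD! step amounts to: the core idea is that several workers run SGD against one shared weight vector with no locks at all, and it still converges when updates are sparse enough that they rarely collide. Here is a minimal sketch of that idea, under my own toy assumptions (plain NumPy and Python threads on a noiseless sparse linear-regression problem, nothing from the paper's actual setup):

```python
import threading
import numpy as np

n_features = 20
true_w = np.random.default_rng(0).normal(size=n_features)
w = np.zeros(n_features)  # shared state, updated lock-free by all workers

def worker(seed, n_steps=2000, lr=0.05):
    rng = np.random.default_rng(seed)
    for _ in range(n_steps):
        # Each sample touches only 3 of 20 features: this sparsity is
        # exactly what makes the lock-free updates mostly non-conflicting.
        idx = rng.choice(n_features, size=3, replace=False)
        x = np.zeros(n_features)
        x[idx] = rng.normal(size=3)
        y = x @ true_w                 # noiseless label
        grad = (x @ w - y) * x         # gradient of 0.5 * squared error
        w[idx] -= lr * grad[idx]      # sparse update, no lock taken

threads = [threading.Thread(target=worker, args=(s,)) for s in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(np.linalg.norm(w - true_w))  # small: the racy updates still converge
```

A few updates do get lost to races, but because each one only touches a few coordinates, the error they introduce is bounded and SGD absorbs it.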
"Hello world" examples already exist for all of those; reproducing them is not very interesting. If you're willing to dive a little deeper, try implementing SyncBatchNorm: explain the design choices, measure the performance impact, and describe any bugs you ran into along the way. A case study like that would be very interesting to read, and would probably get you noticed.
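To make the suggestion concrete: the heart of SyncBatchNorm is computing batch statistics over the *global* batch rather than each worker's local shard. A sketch of that computation, with the all-reduce simulated by summing over a list of per-worker batches (a real implementation would all-reduce three per-channel quantities with something like torch.distributed):

```python
import numpy as np

def sync_batchnorm(worker_batches, eps=1e-5):
    # Each worker contributes its local count, sum, and sum of squares;
    # these are the only quantities that need to cross the network.
    total_n = sum(b.shape[0] for b in worker_batches)
    total_sum = sum(b.sum(axis=0) for b in worker_batches)
    total_sqsum = sum((b ** 2).sum(axis=0) for b in worker_batches)

    mean = total_sum / total_n               # global batch mean
    var = total_sqsum / total_n - mean ** 2  # global (biased) variance
    return [(b - mean) / np.sqrt(var + eps) for b in worker_batches]

# Normalizing with the global statistics matches normalizing the
# concatenated batch on a single device:
rng = np.random.default_rng(0)
batches = [rng.normal(loc=2.0, scale=3.0, size=(8, 4)) for _ in range(4)]
out = sync_batchnorm(batches)
full = np.concatenate(batches)
ref = (full - full.mean(axis=0)) / np.sqrt(full.var(axis=0) + 1e-5)
print(np.allclose(np.concatenate(out), ref))  # True
```

The interesting design choice to write up is exactly this reduction: you sync sums and sums of squares (one communication round), not the raw activations, and the backward pass needs a matching reduction of its own.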
+1 for Distributed Learning