MrUssek
- Karma: 208
- Created: 6 years ago
Recent Submissions
1. China has trained a 10 trillion parameter language model (twitter.com)
2. What is your backup if the tech industry crashes?
3. The Future of Deep Learning Is Photonic (spectrum.ieee.org)
4. Separating MNIST digits using Optimal Transport (mrussek.com)
5. Enigma: GPT-2 trained on 10K Nature Papers: Can you spot the difference? (stefanzukin.com)
6. GShard: Scaling giant models with conditional computation and automatic sharding (arxiv.org)