PyTorch Internals: Ezyang's Blog

blog.ezyang.com

443 points by Anon84 a year ago · 34 comments


smokel a year ago

Also interesting in this context is the PyTorch Developer Podcast [1] by the same author. Very comforting to learn about PyTorch internals while doing the dishes.

[1] https://pytorch-dev-podcast.simplecast.com/

  • swyx a year ago

    I think the problem with the podcast format (ironic for me to say) is that it assumes a much higher familiarity with the APIs than any visual medium, including blogs, can afford.

    • smokel a year ago

      Agreed, but I'm still very happy that some people try. I'm really not that interested in the weather or in idle chit-chat, and for some reason most podcasts seem to focus on exactly that.

alexrigler a year ago

This is a fun blast from the near past. I helped organize the PyTorch NYC meetup where Ed presented this, and I still think it's one of the best technical presentations I've seen. Hand-drawn slides for the win. Wish I had recorded it :\

zcbenz a year ago

For learning internals of ML frameworks I recommend reading the source code of MLX: https://github.com/ml-explore/mlx .

It is a modern and clean codebase without legacy baggage, and I could understand most of it without consulting external articles.

  • ForceBru a year ago

    Why is MLX Apple silicon only? Is there something fundamental that prevents it from working on x86? Are some core features only possible on Apple silicon? Or do the devs specifically refuse to port to x86? (Which is understandable, I guess)

    I'm asking because it seems to have nice autodiff functionality. It even supports differentiating array mutation (https://ml-explore.github.io/mlx/build/html/usage/indexing.h...), which is something JAX and Zygote.jl can't do. Instead, both have ugly tricks like `array.at[index].set` and the `Buffer` struct.

    So it would be cool to have this functionality on a "regular" CPU.
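    To illustrate the JAX idiom mentioned above, here is a minimal sketch (the toy function `f` is my own illustration, not from JAX's docs or the comment): since JAX arrays are immutable, indexed updates go through the functional `.at[...].set(...)` form, and the result still composes with autodiff.

    ```python
    import jax
    import jax.numpy as jnp

    # JAX arrays are immutable, so in-place mutation like x[0] = 5.0 raises.
    # Updates instead use the functional .at[...] syntax, which returns a
    # new array and leaves the original untouched.
    x = jnp.zeros(3)
    y = x.at[0].set(5.0)   # x unchanged; y == [5., 0., 0.]

    # The functional update remains differentiable:
    def f(v):
        return jnp.sum(x.at[1].set(v) ** 2)   # equals v**2, so df/dv = 2v

    grad_f = jax.grad(f)   # grad_f(3.0) -> 6.0
    ```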

    • zcbenz a year ago

      Most features are already supported on x86 CPUs: you can pip install mlx on Linux, and you can even use it on Windows (no official binary release yet, but it builds and the tests pass).

    • saagarjha a year ago

      I think it relies heavily on unified memory.

chuckledog a year ago

Great article, thanks for posting. Here’s a nice summary of automatic differentiation, mentioned in the article and core to how NN’s are implemented: https://medium.com/@rhome/automatic-differentiation-26d5a993...
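The linked summary covers reverse-mode automatic differentiation; as a companion, here is a toy sketch of the core idea in plain Python (my own illustration with made-up names like `Var`; real frameworks such as torch.autograd record the same graph far more efficiently).

```python
# Toy reverse-mode autodiff: each Var remembers which inputs produced it
# and the local gradient with respect to each, so gradients can be
# pushed backward through the expression graph.
class Var:
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # (input Var, local gradient) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, upstream=1.0):
        # Accumulate the upstream gradient, then propagate it to parents
        # via the chain rule. (Naive recursion; real engines topologically
        # sort the graph to avoid revisiting shared nodes.)
        self.grad += upstream
        for parent, local in self.parents:
            parent.backward(upstream * local)

x = Var(3.0)
y = x * x + x    # y = x^2 + x = 12; dy/dx = 2x + 1 = 7 at x = 3
y.backward()     # x.grad is now 7.0
```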

hargun2010 a year ago

I guess it's a longer version of the slides, but not new; I saw comments on it from as far back as 2023. Nonetheless, good content (worth resharing).

https://web.mit.edu/~ezyang/Public/pytorch-internals.pdf

aduffy a year ago

Edward taught a Programming Languages class I took nearly a decade ago, and clicking through here I immediately recognized the illustrated slides; it brought a smile to my face.

  • lyeager a year ago

    Me too, he was great. Tried his darndest to help me understand Haskell monads.

  • aostiles a year ago

    He was really nice in Stanford's CS 240h. He helped me better understand Safe Haskell and GHC internals.

vimgrinder a year ago

In case it helps someone: if you have trouble reading long articles, try text-to-audio with line highlighting. It helps a lot; it cured my lack of attention.

quotemstr a year ago

Huh. I'd have written TORCH_CHECK like this:

    TORCH_CHECK(self.dim() == 1) 
      << "Expected dim to be a 1-D tensor "
      << "but was " << self.dim() << "-D tensor";

Turns out it's possible to write TORCH_CHECK() so that it evaluates the streaming operators only if the check fails. (Check out how glog works.)

bilal2vec a year ago

See also dev forum roadmaps [1] and design docs (e.g. [2], [3],[4])

[1]: https://dev-discuss.pytorch.org/t/meta-pytorch-team-2025-h1-...

[2]: https://dev-discuss.pytorch.org/t/pytorch-symmetricmemory-ha...

[3]: https://dev-discuss.pytorch.org/t/where-do-the-2000-pytorch-...

[4]: https://dev-discuss.pytorch.org/t/rethinking-pytorch-fully-s...

nitrogen99 a year ago

2019. How much of this is still relevant?

  • mlazos a year ago

    I used this to onboard to the PyTorch team a few years ago. It's useful for understanding the key concepts of the framework. torch.compile isn't covered, but the rest is still pretty relevant.

  • kadushka a year ago

    I’m guessing about 80%

    • sidkshatriya a year ago

      To understand a complex system, it is sometimes better to first understand a (simpler) model system, and sometimes an older version of the same system is that model. This is not always true, but it's a good rule of thumb.

pizza a year ago

Btw, would anyone have any good resources on using PyTorch as a general-purpose graph library? I.e., beyond the assumption that nets are forward-only (acyclic) digraphs.

brutus1979 a year ago

Is there a video version of this? It seems to be from a talk.
