Visualize Any Hugging Face Model
hfviewer.com

This is a neat idea. When I'm looking up models I usually want to see something about the architecture, but also some of the hyperparameters for the specific model: residual dimension, total number of layers, tokenizer configs. There's some of that in the visualization, but it's spotty.
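Much of that hyperparameter information lives in a model repo's config.json, so a viewer could surface it directly. A minimal sketch of the idea (the dict below is an illustrative example, not any real model's config; key names follow common Hugging Face conventions):

```python
import json

# Illustrative stand-in for a model's config.json; real configs have many
# more keys, and key names vary by architecture.
config = json.loads("""{
  "hidden_size": 2048,
  "num_hidden_layers": 32,
  "num_attention_heads": 16,
  "vocab_size": 128000
}""")

# Pull out the headline hyperparameters a viewer might display up front.
summary = {k: config[k] for k in ("hidden_size", "num_hidden_layers")}
print(summary)
```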
The results for Nemotron 3 Nano are hard to parse, and I think actually incorrect: https://hfviewer.com/nvidia/NVIDIA-Nemotron-3-Nano-30B-A3B-B... I'm guessing this is because the implementation uses layers that are all instances of the same class, with forward passes that branch on the layer type specified at construction time.
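To illustrate why that pattern trips up structure extraction, here is a hypothetical sketch (not Nemotron's actual code, and deliberately torch-free): every block is an instance of one class, and the forward pass branches on a type chosen at construction time, so any tool that only inspects class names sees a stack of identical layers.

```python
class Block:
    """Hypothetical layer: one class, behavior selected at construction."""

    def __init__(self, layer_type):
        # e.g. "attention" vs. "mamba"; a class-based inspector can't
        # distinguish instances without reading this attribute.
        self.layer_type = layer_type

    def forward(self, x):
        if self.layer_type == "attention":
            return x + 1  # stand-in for an attention block
        else:
            return x * 2  # stand-in for an SSM/MoE block

# Three layers, all the same class, two different behaviors.
model = [Block(t) for t in ("attention", "mamba", "attention")]
x = 1
for blk in model:
    x = blk.forward(x)
print(x)
```

Tracing the actual forward pass (rather than the module tree) would recover the per-layer behavior here.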
Hi, I'm from Embedl (embedl.com / https://huggingface.co/embedl) and we made hfviewer. Could you elaborate on why the Nemotron model visualization might be incorrect? A number of passes are performed to get the graph structure from the Hugging Face config, including sometimes exporting the model with torch.export and then recombining it to make the view meaningful. We would love to fix any issues and make the viewer better.
Where is it capturing the model "structure" from?
Most Hugging Face models are implemented in PyTorch, with an architecture specified as a series of layers. This looks like a nice visualization of that.