Zml-smi: universal monitoring tool for GPUs, TPUs and NPUs

zml.ai

76 points by steeve 6 days ago · 12 comments

rdyro a day ago

Looks cool!

nvtop can actually support TPUs too via https://github.com/rdyro/libtpuinfo/ https://github.com/Syllo/nvtop/blob/76890233d759199f50ad3bdb...

serialx 17 hours ago

Look into all-smi: https://github.com/lablup/all-smi. It supports every GPU imaginable, including Apple Silicon, as well as many AI accelerator cards.

mrflop 6 days ago

Renaming fopen64 to intercept library calls feels like a brittle hack masquerading as "sandboxing." Why not just upstream this hardware support to nvtop instead of fragmenting the ecosystem?

  • steeve (OP) 6 days ago

    sadly, sandboxing is something that can't be upstreamed. this way, sandboxing is kept in zml instead of patching mesa.

    as for nvtop, great program, but we missed a few features (such as sandboxing)

    • pstuart 21 hours ago

      It looks cool, and I was excited to get monitoring for the NPU on my Ryzen AI 395+. Unfortunately, it does not show up. NPU support in Linux really seems to be an afterthought.

      • steeve (OP) 21 hours ago

        Weird, because we tried it. It doesn’t show anything?

        We use amdsmi to get metrics. I’ll investigate.

  • marwanet 21 hours ago

    If this logic were pushed into nvtop, wouldn't the codebase become unmaintainable? Each vendor's interception method is going to be different.

imcritic 14 hours ago

Is it capable of exposing metrics in Prometheus format?

synergy20 14 hours ago

Would be nice to have CPU usage added so I have it all in one place.

Currently I use btop, which shows basic GPU load along with CPU, network, etc.

152334H 19 hours ago

"NPU" seems to refer to trainium only?
