Ask HN: Which laptop can run the largest LLM model?
I’d like to experiment with LLMs locally and understand their infrastructure better.

https://rog.asus.com/us/laptops/rog-flow/rog-flow-z13-2025/s...

The out-of-stock one has 128GB of unified system RAM and the AMD Ryzen AI Max+ 395 chip. It can easily run 70B models in that much VRAM, just more slowly: probably in the 30-40 tokens/s range, which is very usable. Qwen3 30B will be in the 60 tokens/s range, and Llama 4 Scout around 20-30 tokens/s.

Interesting, I found it on Amazon for $5k:
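For a rough sense of what fits in 128GB, here is a back-of-the-envelope sketch (my own arithmetic, not a benchmark): a quantized model needs roughly params × bits-per-weight / 8 bytes, plus headroom for KV cache and the OS. The ~4.5 bits/weight figure approximates a common 4-bit quant and is an assumption, as are the parameter counts.

```python
# Back-of-the-envelope footprint of quantized models in unified memory.
# The ~4.5 bits/weight figure approximates a Q4-style quant and is an
# assumption, as are the rough parameter counts listed below.

def model_size_gb(params_b: float, bits_per_weight: float = 4.5) -> float:
    """Approximate in-memory size in GB for params_b billion parameters."""
    return params_b * bits_per_weight / 8

UNIFIED_RAM_GB = 128
for name, params_b in [("70B dense", 70),
                       ("Qwen3 30B", 30),
                       ("Llama 4 Scout (~109B total)", 109)]:
    size = model_size_gb(params_b)
    fits = "fits" if size + 16 < UNIFIED_RAM_GB else "tight"  # ~16 GB headroom
    print(f"{name}: ~{size:.0f} GB at ~4.5 bits/weight -> {fits}")
```

By this arithmetic even a ~109B-total model quantized to 4 bits leaves plenty of room in 128GB, which is why the unified-memory machines are attractive for larger models.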
https://a.co/d/h085rvP

That’s the same price as an M4 Max MBP with the same RAM and storage. Any idea how they compare in performance?

Don’t the M-series processors in MacBook Pros have a large amount of high-bandwidth unified memory, which is good for models? (It’s fast LPDDR, not HBM, but the whole pool is GPU-accessible.) I see you can get a Pro with 48GB of unified memory, whereas Alienware will sell you a machine with 32GB of regular RAM and 24GB of graphics RAM on a discrete 5090 GPU. So the Pro has twice the RAM accessible to the GPU.

Looks like the MacBook Pro might be more cost-effective? I like the support for larger models. Thanks!
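To make that comparison concrete, here is a hedged sketch of how large a dense model each machine’s GPU-accessible memory could hold. The memory sizes come from the comments above; the ~4.5 bits/weight quantization and the headroom reserved for KV cache and the OS are my assumptions.

```python
# Largest dense model (billions of parameters) fitting in GPU-accessible
# memory, assuming ~4.5 bits/weight quantization and reserving headroom
# for KV cache, activations, and the OS. Assumptions, not measurements.

def max_params_b(mem_gb: float, bits_per_weight: float = 4.5,
                 headroom_gb: float = 8) -> float:
    """Billions of parameters that fit after reserving headroom."""
    return max(mem_gb - headroom_gb, 0) * 8 / bits_per_weight

machines = {
    "MacBook Pro, 48GB unified": 48,
    "Alienware, 24GB 5090 VRAM": 24,   # the 32GB of system RAM is separate
    "M4 Max MBP, 128GB unified": 128,
}
for name, mem in machines.items():
    print(f"{name}: up to ~{max_params_b(mem):.0f}B params")
```

The discrete-GPU machine is capped by its 24GB of VRAM unless you spill layers to slower system RAM, which is the usual argument for unified memory when the goal is simply running the largest possible model.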