Microsoft Foundry Local for Windows and Mac
learn.microsoft.com
Foundry Local is an on-device AI inference solution offering performance, privacy, customization, and cost advantages.
Foundry Local optimizes performance using ONNX Runtime and hardware acceleration. It automatically selects and downloads the model variant that performs best on your hardware: a CUDA variant if you have an NVIDIA GPU, an NPU-optimized variant if you have a Qualcomm NPU, and a CPU-optimized variant otherwise.
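The fallback order described above can be pictured with a small sketch. This is only an illustration of the idea, not Foundry Local's actual selection logic: the function name pick_variant, the hardware labels, and the assumption that CUDA is checked before the NPU are all hypothetical.

```python
from typing import Iterable

# Hypothetical priority table: (detected hardware label, model variant).
# The order mirrors the note above and is an assumption, not documented behavior.
VARIANT_PRIORITY = [
    ("nvidia_gpu", "cuda"),
    ("qualcomm_npu", "npu"),
]


def pick_variant(detected_hardware: Iterable[str]) -> str:
    """Return the first matching variant, falling back to the CPU variant."""
    detected = set(detected_hardware)
    for hardware, variant in VARIANT_PRIORITY:
        if hardware in detected:
            return variant
    # No accelerator matched, so use the CPU-optimized variant.
    return "cpu"


if __name__ == "__main__":
    print(pick_variant(["nvidia_gpu"]))
    print(pick_variant(["qualcomm_npu"]))
    print(pick_variant([]))
```

In practice you do not write this yourself; the point is only that the choice is a first-match lookup with CPU as the default.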
Python and JavaScript SDKs are available.
If a model is not already available in ONNX format, Olive (https://microsoft.github.io/Olive/) can compile existing models in Safetensors or PyTorch format into the ONNX format.
Foundry Local is licensed under the Microsoft Software License Terms.