Local Model Router: Ollama/OpenAI-compat bridges for local LLMs via llama.cpp

1 points by g023 3 months ago · 0 comments · 1 min read

A high-performance local LLM server providing drop-in API compatibility with Ollama and OpenAI, built on llama.cpp's llama-server. Features automatic VRAM management, Hugging Face integration, and modular architecture. Unlike Ollama which bundles its own inference engine, LMR leverages the battle-tested llama.cpp backend while providing familiar APIs and intelligent model management.

https://github.com/g023/localmodelrouter

No comments yet.

Settings

Local Model Router: Ollama/OpenAI-compat bridges for local LLMs via llama.cpp

Keyboard Shortcuts