# lmcli - Large ____ Model CLI

`lmcli` is a versatile CLI and TUI for interacting with LLMs and LMMs.
## Features

- Clean & snappy
- Sandboxed tool execution
- OpenAI-compatible and Anthropic-compatible API clients (i.e. talk to anything)
- Branching, persisted conversations (SQLite)
- vi-like movements
- Export conversations to JSON or HTML
- Image support - attach images for vision models
- < 50MB RSS per instance
## Installation

```shell
go install codeberg.org/mlow/lmcli@latest
```
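`go install` places binaries in `GOPATH/bin` (by default `~/go/bin`), which may not be on your `PATH`. A quick check-and-fix sketch (the `gobin` variable name is just for illustration):

```shell
# go install places binaries in GOPATH/bin (default ~/go/bin);
# add that directory to PATH if it isn't there yet
gobin="$(go env GOPATH 2>/dev/null || echo "$HOME/go")/bin"
case ":$PATH:" in
  *":$gobin:"*) ;;                      # already on PATH
  *) export PATH="$PATH:$gobin" ;;
esac
echo "$gobin"
```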
## Dependencies

lmcli works best when the following tools are available:

- ripgrep - `Grep` tool
- libchafa - image rendering
- uv - `Python` tool `--with` dependencies
- bubblewrap - Sandbox tools with `bwrap`
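None of these are hard requirements. A quick way to see which of the helper binaries are installed (assuming the usual binary names: `chafa` is the CLI that ships with libchafa, and names may differ by distro):

```shell
# Report which of lmcli's optional helper binaries are on PATH
for tool in rg chafa uv bwrap; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing (optional)"
  fi
done
```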
## Configuration

```yaml
# ~/.config/lmcli/config.yaml
defaults:
  model: deepseek-v4-pro
  maxTokens: 64000
  temperature: 0.8
  reasoningEffort: medium
  agent: default

agents:
  # Default agent, no tools or persona
  - name: default
    systemPrompt: |-
      You are a helpful assistant.
  # Coding agent
  - name: coding
    # When true, append AGENTS.md / CLAUDE.md content to the system prompt
    # and store the CWD as the conversation's workspace
    code: true
    tools:
      - Bash
      - Glob
      - Grep
      - Read
      - Write
      - Edit
      - Python
    systemPrompt: |-
      <agent>
      <persona>You are an expert software engineer helping with coding tasks in a local repository. You have access to tools for reading, searching, and editing files, and for running shell commands.</persona>
      <tools_guide>
      Prefer dedicated tools over Bash whenever possible:
      - To read a file, use Read — not Bash with cat/head/tail
      - To search file contents, use Grep — not Bash with grep/rg
      - To find files by name, use Glob — not Bash with find/ls
      - To edit a file, use Edit — not Bash with sed/awk
      - To create a new file, use Write
      - Reserve Bash for tasks which aren't covered by the other tools: running builds, tests, or other shell commands as requested
      - Check with the user before using Bash to build, run, or test code
      - Prefer the Python tool over complex bash scripts.
      </tools_guide>
      <guiding_principles>
      - **Data over algorithms.** Most real-world problems are solved by applying simple operations to well-defined, domain-specific shapes of data.
      - **Separate data from behavior, but keep them co-located.** Data definitions should be distinct from the logic that manipulates them, but they shouldn't live far apart in the codebase.
      - **Avoid accidental complexity.** True complexity should usually be a result of optimizing for performance, not solving problems *with the program's own structure*. More code that *solves domain problems*, less code that *prepares to solve problems*.
      - **Domain understanding is the bottleneck.** It is impossible to write a good solution without a deep understanding of the domain. Code falls naturally from an excellent understanding of the domain. If implementation is hard, we might be missing key details. (Acknowledging that we sometimes write code to discover what is domain-relevant and what isn't.)
      - **Strong naming.** Variables, functions, and types map directly back to real-world domain concerns.
      - **Narrative control flow.** The execution path reads like a straightforward, step-by-step narrative of the real-world process. We avoid complex, nested, or inlined logic that obscures the narrative.
      - **Ease of comprehension.** The data structures and overall flow should be grokkable to anyone familiar with the real-world domain, even with minimal programming experience.
      </guiding_principles>
      <workflow>
      - Always read a file before editing it. Understand the existing code before making changes.
      - Pay careful attention to tab-versus-space indentation.
      - Make the minimum change needed. Do not refactor surrounding code, add comments, or introduce abstractions unless requested.
      - Always preserve indentation style (tabs versus spaces)
      - Follow existing error handling practices; do not introduce new error handling unless requested.
      - Avoid creating new files unless necessary. Prefer editing existing ones.
      - Follow established project coding style and patterns
      - Edit performs an exact string replacement. old_string must match the file content exactly, including all whitespace and indentation. **It is important to search and replace with tab characters when the existing source file(s) use tabs.** If old_string appears more than once, provide more surrounding context to make it unique. Never guess at whitespace — read the file first.
      </workflow>
      <tone>Prefer concise, professional responses. You are working with an experienced full-stack software and infrastructure engineer.</tone>
      </agent>

# Provider configuration
providers:
  # llama.cpp
  - kind: openai
    name: llama.cpp
    baseUrl: http://localhost:8980/v1
    models:
      # Whichever model llama-server is serving
      - llama-server
  # Ollama
  - name: ollama
    display: Ollama
    kind: openai
    baseUrl: http://localhost:11434/v1
    models:
      - name: qwen3.5:4b
  # DeepSeek
  - name: deepseek
    display: DeepSeek
    kind: openai
    baseUrl: https://api.deepseek.com/
    apiKey: ...
    models:
      - deepseek-v4-pro
      - deepseek-v4-flash
  # OpenRouter
  - name: openrouter
    display: OpenRouter
    kind: openai
    baseUrl: https://openrouter.ai/api/v1
    apiKey: ...
    models:
      # Proprietary
      - name: google/gemini-3.1-pro-preview
      # Open weight
      - name: z-ai/glm-4.7
  # DeepInfra
  - name: deepinfra
    kind: openai
    display: DeepInfra
    baseUrl: https://api.deepinfra.com/v1/openai
    apiKey: ...
    models:
      - name: zai-org/GLM-5
        temperature: 0.75
  # Anthropic
  - name: anthropic
    kind: anthropic
    apiKey: ...
    models:
      - name: claude-sonnet-4-6
        temperature: 1.0
  # Cerebras
  - name: cerebras
    kind: openai
    baseUrl: https://api.cerebras.ai/v1
    apiKey: ...
    models:
      - name: zai-glm-4.7
  # Z.ai coding plan
  - name: zai-plan
    display: Z.ai (Plan)
    kind: openai
    apiKey: ...
    baseUrl: https://api.z.ai/api/coding/paas/v4
    models:
      - name: glm-5.1
        temperature: 1.0

# Conversation management
conversations:
  titleGenerationModel: llama-server
  titleGenerationPrompt: |
    When provided the conversation history between a user and an AI assistant, you respond *only* with a JSON object containing a single `title` field which provides a helpful 6-10 word title for the conversation, and no other text (no additional fields, no preamble, no markdown formatting, etc).

    Example response:
    {"title": "Help with math homework"}

# Sandbox
sandbox:
  enabled: true # whether to enable the bubblewrap sandbox, default false
  # Host directories to mount into the sandbox, Docker volume syntax
  bindDirs:
    - /home/mlow/.local/state/nvim
    - /home/mlow/.local/share/nvim
    - /home/mlow/.config/nvim:ro
    - /home/mlow/go
  # Directories which should persist across tool runs
  # Stored to ~/.local/share/lmcli/sandbox/<path>
  persistDirs:
    - /tmp

# Look & Feel
chroma:
  style: onedark
  formatter: terminal16m
tui:
  hideHelpHint: false # whether to hide the 'press ctrl+h for help' hint in the TUI, default false
```
## Syntax highlighting

Syntax highlighting is performed by [Chroma](https://github.com/alecthomas/chroma).
Refer to [Chroma/styles](https://github.com/alecthomas/chroma/tree/master/styles) for
available styles (TODO: add support for custom Chroma styles).

Available formatters:

- `terminal` - 8 colors
- `terminal16` - 16 colors
- `terminal256` - 256 colors
- `terminal16m` - true color (default)
## Agents

An 'agent' in lmcli is the combination of a system prompt and a toolbox (the
agent's available tools). Agents are defined in `config.yaml` and are invoked
with the `-a`/`--agent` flag.
## Tools

These built-in tools are available:

- `Glob`: List files based on glob patterns
- `Grep`: ripgrep-powered code search
- `Read`: Read contents of a file
- `Write`: Write contents of a file
- `Edit`: Search and replace contents of a file
- `Bash`: Execute shell commands
- `Python`: `uv run --with=<deps>`-powered python execution
Obviously, some of these tools carry significant risk. See Sandbox for a layer of protection, but do not consider it a silver bullet. Use caution and common sense!
Tools run in their own process via the lmcli tool command, which is called
with everything it needs to execute any given tool.
Note: within the TUI, tool usage is gated behind a user prompt. Press ctrl+a
to cycle through read-only, read-write, and read-write-execute unattended
permissions.
## Sandbox

When bubblewrap is available and sandboxing is enabled, lmcli runs each of its
tool invocations inside a bubblewrap sandbox.

Since lmcli's tools run in their own sandbox, they cannot e.g. read lmcli's
own configuration.

The sandbox still has access to your host network, so it is not intended as a
hard security boundary. Its primary purpose is to keep the model from having
easy access to e.g. the API keys in your `config.yaml`.

Host paths may be mounted in as needed; see the `sandbox.bindDirs`
configuration (which follows Docker's volume mount syntax).

Paths may be made persistent across tool calls via `sandbox.persistDirs`
(e.g. build caches, /tmp, etc).

Sandboxing is currently opt-in: `sandbox.enabled` must be `true`.
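Since `bindDirs` follows Docker's `src[:dest[:options]]` volume syntax, a spec splits on colons into source path, in-sandbox destination, and options. A sketch of that split (illustrative only; this is not lmcli's actual parser, and the `rw` default is an assumption borrowed from Docker):

```shell
# Split a Docker-style bind spec "src[:dest[:options]]" into its parts
spec="/home/mlow/.config/nvim:/root/.config/nvim:ro"
IFS=: read -r src dest opts <<EOF
$spec
EOF
echo "source=$src dest=${dest:-$src} options=${opts:-rw}"
# prints: source=/home/mlow/.config/nvim dest=/root/.config/nvim options=ro
```

When only a single path is given (as in most of the example config above), it serves as both source and destination.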
## Usage

```
$ lmcli help
lmcli - Large Language Model CLI

Usage:
  lmcli <command> [flags]
  lmcli [command]

Available Commands:
  chat        Open the chat interface
  clone       Clone conversations
  code        Open the chat interface with a code agent
  completion  Generate the autocompletion script for the specified shell
  edit        Edit the last user reply in a conversation
  export      Export a conversation
  help        Help about any command
  list        List conversations
  new         Start a new conversation
  prompt      Do a one-shot prompt
  rename      Rename a conversation
  reply       Reply to a conversation
  retry       Retry the last user reply in a conversation
  rm          Remove conversations
  view        View messages in a conversation

Flags:
  -h, --help   help for lmcli

Use "lmcli [command] --help" for more information about a command.
```
Note: Use ctrl+h in the TUI view for keybindings.
## Examples

Start a new chat with the code agent wired up to perform file editing:

```shell
$ lmcli code
```

Start a new conversation, imperative style (no TUI):

```shell
$ lmcli new "Help me plan meals for the next week"
```

Send a one-shot prompt (no persistence):

```shell
$ lmcli prompt "What is the answer to life, the universe, and everything?"
```
## tmux

Do you run lmcli with tmux? To get scroll-wheel scrolling working, you'll want the following in your `~/.tmux.conf`:

```
# Emulate scrolling by sending up and down keys if these commands are running in the pane
tmux_commands_with_legacy_scroll="lmcli"

bind-key -T root WheelUpPane \
    if-shell -Ft= '#{?mouse_any_flag,1,#{pane_in_mode}}' \
        'send -Mt=' \
        'if-shell -t= "#{?alternate_on,true,false} || echo \"#{tmux_commands_with_legacy_scroll}\" | grep -q \"#{pane_current_command}\"" \
            "send -t= Up" "copy-mode -et="'

bind-key -T root WheelDownPane \
    if-shell -Ft= '#{?pane_in_mode,1,#{mouse_any_flag}}' \
        'send -Mt=' \
        'if-shell -t= "#{?alternate_on,true,false} || echo \"#{tmux_commands_with_legacy_scroll}\" | grep -q \"#{pane_current_command}\"" \
            "send -t= Down" "send -Mt="'
```

This example will be removed when bubbletea is able to handle scrolling in tmux on its own.
## Roadmap
I aim to keep lmcli lightweight and focused on strong execution of simple
ideas. The driving philosophy is that models will grow capable of accomplishing
increasingly complex tasks with access to the same set of simple tools.
lmcli intends to keep the user in the loop. Background and scheduled jobs are
currently not planned.
- Image output
- Custom tools (both local and MCP)
- Built-in web search tool
- RAG-driven prior conversation search
- Conversation categorization/tagging
- Token accounting
## License
MIT
## Acknowledgements
lmcli is a hobby project. Special thanks to the Go community and the creators
of the libraries used in this project.
