# lmcli - Large ____ Model CLI

`lmcli` is a versatile CLI and TUI for interacting with LLMs and LMMs.
## Features

- Clean & snappy
- Sandboxed tool execution
- OpenAI-compatible and Anthropic-compatible API clients (i.e. talk to anything)
- Branching, persisted conversations (SQLite)
- vi-like movements
- Export conversations to JSON or HTML
- Image support - attach images for vision models
- < 50MB RSS per instance
## Installation

```shell
go install codeberg.org/mlow/lmcli@latest
```
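`go install` places binaries in `GOPATH/bin` (by default `~/go/bin`), which may not be on your `PATH`. A quick check-and-fix sketch (the `gobin` variable name is just for illustration):

```shell
# go install places binaries in GOPATH/bin (default ~/go/bin);
# add that directory to PATH if it isn't there yet
gobin="$(go env GOPATH 2>/dev/null || echo "$HOME/go")/bin"
case ":$PATH:" in
  *":$gobin:"*) ;;                      # already on PATH
  *) export PATH="$PATH:$gobin" ;;
esac
echo "$gobin"
```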
## Dependencies

lmcli works best when the following tools are available:

- ripgrep - `Grep` tool
- libchafa - image rendering
- uv - `Python` tool `--with` dependencies
- bubblewrap - Sandbox tools with `bwrap`
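None of these are hard requirements. A quick way to see which of the helper binaries are installed (assuming the usual binary names: `chafa` is the CLI that ships with libchafa, and names may differ by distro):

```shell
# Report which of lmcli's optional helper binaries are on PATH
for tool in rg chafa uv bwrap; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing (optional)"
  fi
done
```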
## Configuration

```yaml
# ~/.config/lmcli/config.yaml
defaults:
  model: deepseek-v4-pro
  maxTokens: 64000
  temperature: 0.8
  reasoningEffort: medium
  agent: default

agents:
  # Default agent, no tools or persona
  - name: default
    systemPrompt: |-
      You are a helpful assistant.
  # Coding agent
  - name: coding
    # When true, append AGENTS.md / CLAUDE.md content to the system prompt
    # and store the CWD as the conversation's workspace
    code: true
    tools:
      - Bash
      - Glob
      - Grep
      - Read
      - Write
      - Edit
      - Python
    systemPrompt: |-
      <agent>
      <persona>You are an expert software engineer helping with coding tasks in a local repository. You have access to tools for reading, searching, and editing files, and for running shell commands.</persona>
      <tools_guide>
      Prefer dedicated tools over Bash whenever possible:
      - To read a file, use Read — not Bash with cat/head/tail
      - To search file contents, use Grep — not Bash with grep/rg
      - To find files by name, use Glob — not Bash with find/ls
      - To edit a file, use Edit — not Bash with sed/awk
      - To create a new file, use Write
      - Reserve Bash for tasks which aren't covered by the other tools: running builds, tests, or other shell commands as requested
      - Check with the user before using Bash to build, run, or test code
      - Prefer the Python tool over complex bash scripts.
      </tools_guide>
      <guiding_principles>
      - **Data over algorithms.** Most real-world problems are solved by applying simple operations to well-defined, domain-specific shapes of data.
      - **Separate data from behavior, but keep them co-located.** Data definitions should be distinct from the logic that manipulates them, but they shouldn't live far apart in the codebase.
      - **Avoid accidental complexity.** True complexity should usually be a result of optimizing for performance, not solving problems *with the program's own structure*. More code that *solves domain problems*, less code that *prepares to solve problems*.
      - **Domain understanding is the bottleneck.** It is impossible to write a good solution without a deep understanding of the domain. Code falls naturally from an excellent understanding of the domain. If implementation is hard, we might be missing key details. (Acknowledging that we sometimes write code to discover what is domain-relevant and what isn't.)
      - **Strong naming.** Variables, functions, and types map directly back to real-world domain concerns.
      - **Narrative control flow.** The execution path reads like a straightforward, step-by-step narrative of the real-world process. We avoid complex, nested, or inlined logic that obscures the narrative.
      - **Ease of comprehension.** The data structures and overall flow should be grokkable to anyone familiar with the real-world domain, even with minimal programming experience.
      </guiding_principles>
      <workflow>
      - Always read a file before editing it. Understand the existing code before making changes.
      - Pay careful attention to tab-versus-space indentation.
      - Make the minimum change needed. Do not refactor surrounding code, add comments, or introduce abstractions unless requested.
      - Always preserve indentation style (tabs versus spaces)
      - Follow existing error handling practices; do not introduce new error handling unless requested.
      - Avoid creating new files unless necessary. Prefer editing existing ones.
      - Follow established project coding style and patterns
      - Edit performs an exact string replacement. old_string must match the file content exactly, including all whitespace and indentation. **It is important to search and replace with tab characters when the existing source file(s) use tabs.** If old_string appears more than once, provide more surrounding context to make it unique. Never guess at whitespace — read the file first.
      </workflow>
      <tone>Prefer concise, professional responses. You are working with an experienced full-stack software and infrastructure engineer.</tone>
      </agent>

# Provider configuration
providers:
  # llama.cpp
  - kind: openai
    name: llama.cpp
    baseUrl: http://localhost:8980/v1
    models:
      # Whichever model llama-server is serving
      - llama-server
  # Ollama
  - name: ollama
    display: Ollama
    kind: openai
    baseUrl: http://localhost:11434/v1
    models:
      - name: qwen3.5:4b
  # DeepSeek
  - name: deepseek
    display: DeepSeek
    kind: openai
    baseUrl: https://api.deepseek.com/
    apiKey: ...
    models:
      - deepseek-v4-pro
      - deepseek-v4-flash
  # OpenRouter
  - name: openrouter
    display: OpenRouter
    kind: openai
    baseUrl: https://openrouter.ai/api/v1
    apiKey: ...
    models:
      # Proprietary
      - name: google/gemini-3.1-pro-preview
      # Open weight
      - name: z-ai/glm-4.7
  # DeepInfra
  - name: deepinfra
    kind: openai
    display: DeepInfra
    baseUrl: https://api.deepinfra.com/v1/openai
    apiKey: ...
    models:
      - name: zai-org/GLM-5
        temperature: 0.75
  # Anthropic
  - name: anthropic
    kind: anthropic
    apiKey: ...
    models:
      - name: claude-sonnet-4-6
        temperature: 1.0
  # Cerebras
  - name: cerebras
    kind: openai
    baseUrl: https://api.cerebras.ai/v1
    apiKey: ...
    models:
      - name: zai-glm-4.7
  # Z.ai coding plan
  - name: zai-plan
    display: Z.ai (Plan)
    kind: openai
    apiKey: ...
    baseUrl: https://api.z.ai/api/coding/paas/v4
    models:
      - name: glm-5.1
        temperature: 1.0

# Conversation management
conversations:
  titleGenerationModel: llama-server
  titleGenerationPrompt: |
    When provided the conversation history between a user and an AI assistant, you respond *only* with a JSON object containing a single `title` field which provides a helpful 6-10 word title for the conversation, and no other text (no additional fields, no preamble, no markdown formatting, etc).

    Example response:
    {"title": "Help with math homework"}

# Sandbox
sandbox:
  enabled: true # whether to enable the bubblewrap sandbox, default false
  # Host directories to mount into the sandbox, Docker volume syntax
  bindDirs:
    - /home/mlow/.local/state/nvim
    - /home/mlow/.local/share/nvim
    - /home/mlow/.config/nvim:ro
    - /home/mlow/go
  # Directories which should persist across tool runs
  # Stored to ~/.local/share/lmcli/sandbox/<path>
  persistDirs:
    - /tmp

# Look & Feel
chroma:
  style: onedark
  formatter: terminal16m
tui:
  hideHelpHint: false # whether to hide the 'press ctrl+h for help' hint in the TUI, default false
```
## Syntax highlighting

Syntax highlighting is performed by [Chroma](https://github.com/alecthomas/chroma).
Refer to [Chroma/styles](https://github.com/alecthomas/chroma/tree/master/styles) for
available styles (TODO: add support for custom Chroma styles).

Available formatters:

- `terminal` - 8 colors
- `terminal16` - 16 colors
- `terminal256` - 256 colors
- `terminal16m` - true color (default)
## Agents

An 'agent' in lmcli is the combination of a system prompt and a toolbox (the
agent's available tools). Agents are defined in `config.yaml` and are invoked
with the `-a`/`--agent` flag.
## Tools

These built-in tools are available:

- `Glob`: List files based on glob patterns
- `Grep`: ripgrep-powered code search
- `Read`: Read contents of a file
- `Write`: Write contents of a file
- `Edit`: Search and replace contents of a file
- `Bash`: Execute shell commands
- `Python`: `uv run --with=<deps>`-powered python execution
Obviously, some of these tools carry significant risk. See Sandbox for a layer of protection, but do not consider it a silver bullet. Use caution and common sense!
Tools run in their own process via the lmcli tool command, which is called
with everything it needs to execute any given tool.
Note: within the TUI, tool usage is gated behind a user prompt. Press ctrl+a
to cycle through read-only, read-write, and read-write-execute unattended
permissions.
## Sandbox

When bubblewrap is available and sandboxing is enabled, lmcli runs each of its
tool invocations inside a bubblewrap sandbox.

Since lmcli's tools run in their own sandbox, they cannot e.g. read lmcli's
own configuration.

The sandbox still has access to your host network, so it is not intended as a
hard security boundary. Its primary purpose is to keep the model from having
easy access to e.g. the API keys in your `config.yaml`.

Host paths may be mounted in as needed; see the `sandbox.bindDirs`
configuration (which follows Docker's volume mount syntax).

Paths may be made persistent across tool calls via `sandbox.persistDirs`
(e.g. build caches, /tmp, etc).

Sandboxing is currently opt-in: `sandbox.enabled` must be `true`.
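Since `bindDirs` follows Docker's `src[:dest[:options]]` volume syntax, a spec splits on colons into source path, in-sandbox destination, and options. A sketch of that split (illustrative only; this is not lmcli's actual parser, and the `rw` default is an assumption borrowed from Docker):

```shell
# Split a Docker-style bind spec "src[:dest[:options]]" into its parts
spec="/home/mlow/.config/nvim:/root/.config/nvim:ro"
IFS=: read -r src dest opts <<EOF
$spec
EOF
echo "source=$src dest=${dest:-$src} options=${opts:-rw}"
# prints: source=/home/mlow/.config/nvim dest=/root/.config/nvim options=ro
```

When only a single path is given (as in most of the example config above), it serves as both source and destination.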
## Usage

```
$ lmcli help
lmcli - Large Language Model CLI

Usage:
  lmcli <command> [flags]
  lmcli [command]

Available Commands:
  chat        Open the chat interface
  clone       Clone conversations
  code        Open the chat interface with a code agent
  completion  Generate the autocompletion script for the specified shell
  edit        Edit the last user reply in a conversation
  export      Export a conversation
  help        Help about any command
  list        List conversations
  new         Start a new conversation
  prompt      Do a one-shot prompt
  rename      Rename a conversation
  reply       Reply to a conversation
  retry       Retry the last user reply in a conversation
  rm          Remove conversations
  view        View messages in a conversation

Flags:
  -h, --help   help for lmcli

Use "lmcli [command] --help" for more information about a command.
```
Note: Use ctrl+h in the TUI view for keybindings.
## Examples

Start a new chat with the code agent wired up to perform file editing:

```shell
$ lmcli code
```

Start a new conversation, imperative style (no TUI):

```shell
$ lmcli new "Help me plan meals for the next week"
```

Send a one-shot prompt (no persistence):

```shell
$ lmcli prompt "What is the answer to life, the universe, and everything?"
```
## tmux

Do you run lmcli with tmux? To get scroll-wheel scrolling working, you'll want the following in your `~/.tmux.conf`:

```
# Emulate scrolling by sending up and down keys if these commands are running in the pane
tmux_commands_with_legacy_scroll="lmcli"

bind-key -T root WheelUpPane \
    if-shell -Ft= '#{?mouse_any_flag,1,#{pane_in_mode}}' \
        'send -Mt=' \
        'if-shell -t= "#{?alternate_on,true,false} || echo \"#{tmux_commands_with_legacy_scroll}\" | grep -q \"#{pane_current_command}\"" \
            "send -t= Up" "copy-mode -et="'

bind-key -T root WheelDownPane \
    if-shell -Ft= '#{?pane_in_mode,1,#{mouse_any_flag}}' \
        'send -Mt=' \
        'if-shell -t= "#{?alternate_on,true,false} || echo \"#{tmux_commands_with_legacy_scroll}\" | grep -q \"#{pane_current_command}\"" \
            "send -t= Down" "send -Mt="'
```

This example will be removed when bubbletea is able to handle scrolling in tmux on its own.
## Roadmap
I aim to keep lmcli lightweight and focused on strong execution of simple
ideas. The driving philosophy is that models will grow capable of accomplishing
increasingly complex tasks with access to the same set of simple tools.
lmcli intends to keep the user in the loop. Background and scheduled jobs are
currently not planned.
- Image output
- Custom tools (both local and MCP)
- Built-in web search tool
- RAG-driven prior conversation search
- Conversation categorization/tagging
- Token accounting
## License
MIT
## Acknowledgements
lmcli is a hobby project. Special thanks to the Go community and the creators
of the libraries used in this project.
