
OfflineAI: tiny JetBrains LLM Agent

OfflineAI is the little LLM coding agent that we use. It's optimized for privacy :)

It works quite well with Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf on an RTX 3090 or better: at this roughly 4-bit quantization, the 30B weights come to around 18 GB, which fits in the 3090's 24 GB of VRAM.

To activate the AI agent, click anywhere in your source code and press

Ctrl+Alt+Shift+H (for Help!)

and the OfflineAI Chat window should open. Type a task and start it with Ctrl+Return, or cancel a running task with the "Reset Chat History" button in the top right corner.

The agent never modifies existing files directly; instead, it opens a diff inside the IDE so you can review and apply the change yourself.
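
For the curious, here is a minimal sketch of how a plugin can do this with the IntelliJ Platform diff API. The function and parameter names below are illustrative, not taken from the plugin's source:

```kotlin
// Hypothetical sketch: preview an agent's proposed change as an in-IDE diff
// instead of writing to disk. Names like `proposedText` are illustrative.
import com.intellij.diff.DiffContentFactory
import com.intellij.diff.DiffManager
import com.intellij.diff.requests.SimpleDiffRequest
import com.intellij.openapi.project.Project
import com.intellij.openapi.vfs.VirtualFile

fun showProposedChange(project: Project, file: VirtualFile, proposedText: String) {
    val factory = DiffContentFactory.getInstance()
    val current = factory.create(project, file)          // left side: file on disk
    val proposed = factory.create(project, proposedText) // right side: agent's suggestion
    val request = SimpleDiffRequest(
        "OfflineAI: proposed change to ${file.name}",
        current, proposed,
        "Current", "Proposed"
    )
    DiffManager.getInstance().showDiff(project, request)
}
```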

How to compile
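
Assuming the standard Gradle setup for JetBrains plugins (the build/distributions output path suggests it, but this command is an assumption, not taken from the original text):

```bash
./gradlew buildPlugin
```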

That will create build/distributions/jetbrains-mini-agent-0.1.0.zip

Installation

Compile or download jetbrains-mini-agent-0.1.0.zip and install it via Settings/Preferences > Plugins > ⚙️ > Install plugin from disk...

Usage

Download Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf
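One way to fetch it is via the Hugging Face CLI; the repository name below is an assumption (Unsloth publishes the UD quants), so verify it before downloading:

```bash
# Assumed repo name; the "-UD-" quant naming suggests Unsloth's GGUF builds.
huggingface-cli download unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF \
  Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf \
  --local-dir /path/to/gguf/file/
```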

Run the LLM locally, for example with llama.cpp's CUDA server image:

```bash
docker run --gpus all -v /path/to/gguf/file/:/models -p 8081:8081 \
  ghcr.io/ggml-org/llama.cpp:server-cuda \
  -m /models/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf \
  --port 8081 --host 0.0.0.0 --jinja -ngl 99 --threads -1 \
  --ctx-size 32768 --temp 0.7 --min-p 0.0 --top-p 0.80 --top-k 20 \
  --repeat-penalty 1.05 --verbose
```
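
Before opening the IDE, you can sanity-check that the server answers. The paths below are llama.cpp's built-in health and OpenAI-compatible endpoints; adjust the port if you changed it:

```bash
# Should return an "ok" status once the model has loaded.
curl http://localhost:8081/health

# Minimal chat round-trip against the OpenAI-compatible API.
curl http://localhost:8081/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Say hi"}]}'
```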

And now the plugin will work 🎉

(Or you can patch OfflineAiInvocationService.kt line 21 to use a different LLM.)
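Whatever the exact code on that line, the idea is to point the client at any other OpenAI-compatible endpoint. A purely hypothetical illustration, not the actual source:

```kotlin
// Hypothetical: the real constant name and value in
// OfflineAiInvocationService.kt may look different.
private const val LLM_BASE_URL = "http://localhost:8081" // any OpenAI-compatible server
```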

Just fork it!

I'm putting this out here in case it's useful, but please don't expect any support or feature improvements. If in doubt, just fork it and create your own customized variant.