LSP, Hooks, and Workflow Design: What Actually Differentiates AI Coding Tools


Why toolchain integration outweighs model choice

Stéphane Derosiaux


Created by the author

Something strange is happening in developer tooling. Experienced JetBrains users, people with a decade of muscle memory and carefully tuned keybindings, are abandoning their IDE for VS Code with Cursor or Claude Code. These aren’t junior developers chasing shiny objects. They’re senior engineers who’ve built their entire workflow around IntelliJ’s refactoring capabilities.

The conventional explanation is that Claude or GPT-5.2 is simply “smarter” than whatever model powers Junie or Copilot. That explanation doesn’t hold up. Run the same model through different tools and you get dramatically different experiences. The model is identical. The wiring is not.

I watched a team spend three months chasing model upgrades, switching from Claude 3 to 3.5 to GPT-5 to 5.2, convinced each new release would fix their agent’s refactoring failures. It didn’t. The problem was never the model. Their agent was using grep to find function references instead of asking the language server (LSP). No amount of model intelligence fixes fundamentally broken plumbing.

What LSP Gives AI Agents That Grep Never Could


Created by the author

Before AI agents entered the picture, Language Server Protocol (LSP) solved an industry-wide coordination problem. Every editor needed to support every language, creating O(M×N) integration work. LSP reduced this to O(M+N): one server per language, one client per editor.
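The arithmetic is easy to sketch. With hypothetical counts (10 editors, 20 languages), the pairwise approach needs 200 integrations while LSP needs only 30 components:

```python
# Illustrative arithmetic for the LSP coordination win (counts are hypothetical).
editors = 10      # M: editor clients
languages = 20    # N: languages

# Without LSP: every editor implements every language integration.
pairwise_integrations = editors * languages   # O(M*N)

# With LSP: one server per language, one client per editor.
lsp_components = editors + languages          # O(M+N)

print(pairwise_integrations, lsp_components)  # 200 30
```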

LSP gives agents what grep never could: precise location of every function call, type information on hover, symbol navigation across thousands of files. This isn’t a convenience feature. It’s the difference between an agent that searches and an agent that knows.

When an agent needs to rename a function across 47 files, it has two choices.

  • It can grep for the function name and hope it doesn’t catch comments, strings, or identically-named variables in different scopes.
  • Or it can ask the LSP for all references and get a precise list with location and context. In practice, LSP-based renames are dramatically faster and don’t break builds.
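A toy sketch of why the grep path over-matches (the snippet and reference count are contrived for illustration; a real LSP server would compute references semantically rather than by hand):

```python
import re

# A toy source file where a naive grep for "render" over-matches:
source = '''\
def render(page):           # definition
    return page.html

# render is called below    <- comment mention
def main():
    title = "render farm"   # string mention, not a call
    render(make_page())     # real reference
'''

# grep-style: every textual occurrence, including comments and strings.
grep_hits = len(re.findall(r"render", source))

# LSP-style: only semantic references (definition + the one call site),
# hand-counted for this sketch rather than produced by a real server.
lsp_refs = 2

print(grep_hits, lsp_refs)  # 4 2
```

Two of the four textual hits would corrupt a rename; the semantic list contains exactly the locations that matter.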

Claude Code now exposes LSP operations to the agent (see the changelog: https://github.com/anthropics/claude-code/blob/main/CHANGELOG.md#2074): textDocument/definition, textDocument/references, textDocument/hover, and workspace/symbol. This means the agent can perform the same navigation a human developer uses in their IDE.
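Under the hood these are ordinary JSON-RPC requests. A minimal sketch of the textDocument/references message as defined by the LSP specification, with a made-up file URI and cursor position:

```python
import json

# Sketch of the JSON-RPC message an agent (or editor) sends for
# textDocument/references, per the LSP specification. The URI and
# position are invented for illustration.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/references",
    "params": {
        "textDocument": {"uri": "file:///project/src/billing.ts"},
        "position": {"line": 41, "character": 9},   # cursor on the symbol
        "context": {"includeDeclaration": True},
    },
}

payload = json.dumps(request)
# LSP frames each message with a Content-Length header over stdio.
framed = f"Content-Length: {len(payload)}\r\n\r\n{payload}"
print(framed.splitlines()[0])
```

The response is a list of precise locations (URI plus character range), which is exactly the shape an agent needs for a safe rename.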

A caveat on LSP quality: TypeScript’s LSP is fast and comprehensive. Rust-analyzer can be sluggish on large codebases. Python’s ecosystem is fragmented between Pylance, Pyright, and Jedi. The promise of “LSP solves everything” assumes a good LSP exists for your language, which isn’t always true.

The Real Competition Isn’t Between Models


Created by the author

The AI coding tool landscape is fragmented: Claude Code, Codex, OpenCode, Cursor, JetBrains AI Assistant, Zed, plus the protocol layer (MCP, ACP/A2A). Product announcements focus on model benchmarks.

The real battle is about integration architecture.

OpenCode, an open-source alternative, supported LSP before Claude Code did. mcp-language-server lets you expose any language server to Claude Code via the Model Context Protocol (MCP is a standard for connecting AI agents to external tools and data sources). The tooling to wire AI agents into existing infrastructure exists. The question is who assembles it best.
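As a rough sketch, wiring mcp-language-server into Claude Code amounts to a project-level .mcp.json entry. The server name, workspace path, and flags below are illustrative assumptions, not verbatim documentation; check the mcp-language-server README for the exact invocation for your language server:

```json
{
  "mcpServers": {
    "language-server": {
      "command": "mcp-language-server",
      "args": ["--workspace", "/path/to/project", "--lsp", "typescript-language-server", "--", "--stdio"]
    }
  }
}
```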

Same models. Different plumbing. Different outcomes.


The differentiation comes from:

  • Architecture: How the tool structures context, manages token budgets, and handles long sessions without losing track (today, mostly via context compaction)
  • Hooks and feedback loops: When tests run (after every edit? at task completion?), how errors are summarized for the agent, what happens after failures
  • LSP depth: Whether the agent actually uses structured code intelligence or falls back to grep because it’s “easier”
  • Form factor: CLI gives context control and portability; IDE gives visual integration but creates lock-in
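To make the hooks point concrete: Claude Code reads hook definitions from its settings files. A sketch of a PostToolUse hook that type-checks after edit tool calls; the matcher and command here are assumptions for illustration, so verify the schema against the current Claude Code hooks documentation:

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          { "type": "command", "command": "npx tsc --noEmit" }
        ]
      }
    ]
  }
}
```

The design choice this encodes is exactly the one the article argues about: the feedback loop (when checks run, what the agent sees on failure) lives in configuration, not in the model.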

What This Means for Teams

If the model isn’t the bottleneck, what is?

  • Workflow orchestration becomes the new constraint. I watched an agent waste 20 minutes re-running tests after every single file edit. The context window filled with expected failures. By the time it got to the actual bug, it had forgotten what it was doing. The fix wasn’t a better model; it was instructing the agent (via CLAUDE.md configuration) to run tests at task completion, not after each edit.
  • Human-agent UX becomes the friction point. Permission prompts that stack up without clear resolution. Diffs that are hard to review in a terminal. No clean way to interrupt and redirect without losing context. These sound like minor annoyances, but they compound. Death by a thousand paper cuts.
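The test-timing fix described above is plain configuration. A hypothetical CLAUDE.md fragment encoding the "tests at task completion, not per edit" rule (the npm test command is an assumption; substitute your own runner):

```markdown
## Testing conventions

- Do NOT run the test suite after every file edit.
- Run `npm test` once, when you believe the task is complete.
- On failure, report only the failing test names and the first error
  line of each; do not paste full stack traces into context.
```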

What to invest in:

  • Configuration as code: CLAUDE.md files, hook definitions, runbooks that work across sessions and team members
  • Curated tool access: A small set of approved MCP servers, not everything from the marketplace. Each tool adds context load; more tools can mean worse outcomes if the agent drowns in options
  • Standard LSP adapters: One well-maintained bridge to your language servers, not custom integrations per tool
  • Test orchestration: Decide when tests run and how failures are summarized. This is an architectural decision, not an afterthought

What not to invest in:

  • Model provider chasing: If you’re switching providers every quarter based on benchmark leaderboards, you’re optimizing the wrong variable
  • Over-tooling: Token budgets are real. An agent with access to 15 MCP servers spends tokens figuring out which tool to use instead of using it

CLI versus IDE is the most underrated decision in agent tooling. Claude Code as CLI means your agent setup works in tmux, on remote servers, in CI pipelines, anywhere you have a shell. IDE integrations lock you to one editor and break when you SSH into a production box. The tradeoff is real: CLI means no inline diff viewing, no syntax highlighting of changes, harder code review. Pick portability or pick visual integration, but don’t pretend you’re not choosing.

The Hybrid Architecture Future

One objection to all this: won’t 1-million-token context windows make LSP irrelevant? If you can dump an entire codebase in context, why bother with structured queries? Three reasons: cost, latency, and noise.

Dumping 500K tokens of code costs real money and adds seconds of latency per request. Worse, the model has to find the needle in the haystack. An LSP query for “all references to this function” costs nearly nothing and returns exactly what’s needed. The token-efficient path wins.
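Back-of-envelope numbers make the point. The per-token price below is a hypothetical figure, not any provider’s actual rate:

```python
# Hypothetical input pricing: $3 per million input tokens.
PRICE_PER_MILLION_TOKENS = 3.00

def input_cost(tokens: int) -> float:
    """Dollar cost of sending this many input tokens, at the assumed rate."""
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

dump_cost = input_cost(500_000)  # shoving half a codebase into context
lsp_cost = input_cost(300)       # rough size of a references query result

print(f"${dump_cost:.2f} vs ${lsp_cost:.4f} per request")
```

At these assumed numbers the context dump costs over a thousand times more per request, before counting the latency and the needle-in-a-haystack problem.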

The winning architecture is a powerful model at the center, surrounded by a ring of specialized tools: LSP for navigation, database engines for schema validation, static analyzers for type checking. Every token the model spends on tasks a deterministic tool could handle is wasted money and added latency. The teams that architect around this win. The teams that throw everything at the LLM burn budget.

This is the O(M+N) pattern applied to AI. Instead of building integrations for each model×tool combination, we get standard protocols (MCP, ACP) where any model can call any tool adapter.

For teams navigating this landscape:

  1. Standardize your internal agent platform: Approved plugins, LSP configuration, hook conventions, CLAUDE.md templates shared across projects
  2. Invest in adapters, not providers: Build or adopt MCP servers that expose your internal systems (databases, CI, documentation) to any model
  3. Measure what matters: Token consumption per task type, latency from agent action to feedback, ratio of agent suggestions accepted versus rejected

The model is the commodity. Every major provider is converging on similar capabilities. The wiring is the moat. The team that builds the best plumbing between AI agents and their existing developer infrastructure wins.