Taming Claude Code: Taking Back Control


I started using Claude Code in July last year. I was a Cursor user looking for something better, and I was skeptical at first. Claude Code is a terminal-based tool, and I was used to IDEs. With Cursor, reviewing AI-generated changes was straightforward: you see the diff, you accept or reject. How would that work in a terminal?

It took me a while to adapt. The breakthrough was realizing I could run Claude Code inside VS Code’s integrated terminal and use the Git extension to review changes. This became central to my workflow: I don’t let AI tools do whatever they want. I stay in the loop. Even when I’m vibe-coding a throwaway script, I want to quickly review what the AI is doing.

Once I figured out the workflow, I was pretty happy with Claude Code. It explored codebases more efficiently than Cursor, and I could see the model’s thinking traces, the reasoning behind each decision. This was huge for me. I could see how the model interpreted my request, catch wrong assumptions early, and course-correct before wasting time on the wrong path.

Then Claude Code 2.0 happened.

The Breaking Point

Anthropic updated Claude Code to be more accessible to everyone, not just power users. The VS Code extension got prettier, the UI got cleaner, and the thinking traces got hidden.

This was the moment I realized I couldn’t continue using Claude Code as-is. The thinking traces weren’t just a nice-to-have for me; they were essential. Without them, I had no way of knowing whether the model had misunderstood my request until it finished working. Sometimes you ask something ambiguous, and instead of asking for clarification, the model assumes. Those assumptions are obvious in the thinking traces. Without them, you’re flying blind.

Claude Code thinking traces showing self-correction

In this example, you can see the model debugging a SQL error, catching its own mistakes (“Actually wait”, “Wait, let me count…”), and reconsidering assumptions in real-time. This is exactly what you lose when thinking traces are hidden.

So I did what any stubborn power user would do: I disabled auto-updates and pinned my version to the latest 1.x release.

Why I Customize Claude Code

Here’s my problem with the defaults: Claude Code has access to too many tools. In theory, more tools means more capabilities. In practice, it means more complexity for the model to navigate, more tokens consumed on tool selection, and worse results.

Take plan mode. Claude Code has multiple tools for entering plan mode, exiting plan mode, asking questions in plan mode. All of this consumes tokens and adds cognitive load for the model. But plan mode is just… a markdown file where the model writes without editing your actual code. You can achieve the same thing by telling the model: “Don’t edit anything. Plan first, ask me questions, save your plan in a markdown file.” You don’t need dedicated tools for this.
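In practice, the replacement is just a prompt. A sketch of the kind of instruction I mean (the wording and the PLAN.md filename are illustrative, not any special syntax):

```text
Don't edit any code yet. First explore the relevant parts of the
codebase, then write an implementation plan to PLAN.md. Ask me any
clarifying questions before you start. Once I approve the plan,
implement it step by step.
```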

Or sub-agents. The main model can spawn another agent to explore the codebase and summarize findings. Sounds useful, but I found it performs worse for my use cases. The sub-agent summarizes what it found, and the main model works with that summary instead of the actual code. It’s like playing telephone: context gets lost. I’d rather have the main model read the code itself, even if it uses more tokens and hits the context limit faster. At least it’s working with the real thing.

My Setup

I’ve ended up with a Claude Code config that’s leaner, faster, and more predictable. Here’s what I changed.

Restoring Thinking Traces

The API still returns thinking traces; Claude Code just hides them in the UI since version 2.0. I found claude-code-patches, a community project that patches thinking traces back to being fully expanded by default.

First, install a compatible Claude Code version:
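Assuming you’re on the standard npm-distributed install, that means pinning an exact version:

```shell
# Pin the global install to the version the patch targets
npm install -g @anthropic-ai/claude-code@2.0.62
```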

Then apply the patch:

git clone git@github.com:aleks-apostle/claude-code-patches.git
cd claude-code-patches
node patch-thinking.js

Restart Claude Code, and thinking blocks now display inline.

Why version 2.0.62? That’s the latest version the patch maintainer has officially supported. Other contributors have submitted PRs for newer versions, and the patch is just one JS file, so it’s trivial to adapt. But 2.0.62 has everything I need, and I’d rather wait for the maintainer to officially update than risk a broken patch.

Recent versions have actually gotten worse for power users. Version 2.1.20 started hiding even more information, showing vague summaries like “Read 3 files” instead of the actual file paths. When I read about this, I was glad I’d pinned my version. If a future Claude Code version introduces a feature I really want, I’ll upgrade and use a community patch. Until then, I’m staying put.

Settings

In ~/.claude/settings.json:

{
  "includeCoAuthoredBy": false,
  "env": {
    "DISABLE_AUTOUPDATER": "1",
    "CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY": "1",
    "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC": "1",
    "CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS": "50000"
  }
}

What each setting does:

  • includeCoAuthoredBy: false - Disables the “Co-authored-by: Claude” line in git commits. I don’t ask Claude to commit often, and when I do, I prefer clean commit messages.
  • DISABLE_AUTOUPDATER - Prevents automatic updates. I update manually after reviewing changelogs.
  • CLAUDE_CODE_DISABLE_FEEDBACK_SURVEY - Disables the “rate your session” popups. They appear frequently and break the flow.
  • CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC - Disables non-essential network requests such as telemetry and update checks. Keeps things fast and minimal.
  • CLAUDE_CODE_FILE_READ_MAX_OUTPUT_TOKENS: 50000 - The default is 25,000 tokens. When Claude reads files larger than that, it gets truncated and errors out. I keep documentation files in my project directories (Cloudflare Workers docs, API references, etc.) that exceed 25k tokens, and I want Claude to read them fully.
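A quick way to check whether a doc will blow past the default limit is a rough character-based estimate. This is a sketch using the common ~4 characters per token heuristic (an approximation, not an exact count):

```shell
# Rough token estimate for a file, using the ~4 chars/token rule of thumb.
approx_tokens() {
  local chars
  chars=$(wc -c < "$1")
  echo $(( chars / 4 ))
}
```

Run it against a docs file (the path is just an example): `approx_tokens docs/workers-api.md`. If the result is well above 25,000, the default read limit would truncate the file.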

Disabling Token-Heavy Tools

I use a shell alias that runs Claude Code with max thinking tokens and disables tools I don’t want:

# In ~/.zshrc
alias claude='MAX_THINKING_TOKENS=31999 claude --disallowedTools \
  "Task,AgentOutputTool,ExitPlanMode,NotebookEdit,AskUserQuestion,SlashCommand,EnterPlanMode"'

This disables:

  • Task / AgentOutputTool - Sub-agents that explore and summarize. I prefer the main model to read code directly.
  • EnterPlanMode / ExitPlanMode - Plan mode tools. I just ask the model to plan in a markdown file instead.
  • NotebookEdit - Jupyter notebook editing conflicts with VS Code’s live rendering. It doesn’t work well and wastes tokens. I don’t use Claude for notebooks much anyway.
  • AskUserQuestion - Part of plan mode. Claude can just ask questions in regular text, and I respond in text. No special tool needed.
  • SlashCommand - I’ve never used it.

The last three are more opinionated. If you use Jupyter notebooks heavily or like slash commands, keep those tools enabled.

Why use an alias instead of settings? When you specify --disallowedTools, these tools are completely removed from the model’s context. The model doesn’t even know they exist. This makes the context leaner and, I believe, improves the model’s focus.

Disabling Auto-Compaction

You can toggle this via /config in Claude Code, or directly in ~/.claude.json:

{
  "autoCompactEnabled": false
}

This one is important. By default, Claude Code reserves 22.5% of the model’s context (45k tokens out of 200k) for auto-compaction. When you approach this threshold, it summarizes the conversation and starts a new context. The summary is lossy: 99% of the time, the model in the new session doesn’t have enough context about what you’ve tried, what worked, and what didn’t.

After disabling auto-compaction, I rarely hit the context limit anyway. I keep my sessions short by default: start a session, work on a feature or bug, clear and start fresh. This leads to higher quality results.

For those times when I’m debugging something nasty and the context grows long, I’d rather continue the current session until it hits the actual limit. Here’s something nice about Claude Code: when you’re at 0% context remaining, it doesn’t immediately error out. It starts removing tool call results from the oldest messages (that test run from 30 messages ago, for example). You can often continue for 10-20 more messages before hitting the real API limit.

When I do hit the limit, I use /export to save user messages, assistant messages, and tool call titles to a text file. Then I start a fresh session and reference the transcript. This preserves more context than auto-compaction’s summary. I sometimes paste the export to another model (Gemini, for example) to get a second opinion on a tricky problem.
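The handoff looks something like this (the transcript filename and the prompt wording are illustrative; the export path is whatever you chose in the export dialog):

```shell
# In the old session: run /export and save to e.g. session-auth-bug.txt.
# Then start a fresh session and point the model at the transcript:
claude "Read session-auth-bug.txt for the full context of my previous
session, then continue debugging from where we left off."
```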

Why I Don’t Use MCPs

MCP servers let you connect Claude Code to external services: Google Drive, Jira, Notion, databases, etc. I don’t use any of them, except the VS Code MCP that Claude Code adds automatically when running in VS Code’s terminal (it provides useful things like IDE diagnostics).

My reasoning is the same as with the built-in tools: each MCP adds tools to the context. Some people add tens or even hundreds of tools via MCPs. The model then has to evaluate all of them for every request. More tools means more token overhead and, I believe, worse decision-making.

Claude Code context with multiple MCPs enabled

For services I actually need, I use CLI tools instead:

  • GitHub - I mention in my CLAUDE.md that the gh CLI is available and authenticated. Claude knows how to use it.
  • BigQuery - Same approach. I have bq CLI authenticated and add minimal instructions to CLAUDE.md.
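For reference, the CLAUDE.md entries for this can be a couple of lines each; something like the following sketch (the project details and the `--max_rows` preference are illustrative):

```markdown
## Available CLIs

- `gh` is installed and authenticated. Use it for GitHub issues, PRs,
  and CI status.
- `bq` is installed and authenticated against our main GCP project.
  Use it for BigQuery queries; pass `--max_rows` to keep output small.
```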

For rarely-used integrations like Google Docs or Jira, I ask myself: what percentage of my Claude Code messages actually need Google Drive access? Maybe 1 in 100? 1 in 1000? For those rare cases, I just:

  • Export the Google Doc as markdown, add it to the project directory, reference it manually
  • Or ask Claude to write to a markdown file, then copy-paste into Google Docs

This takes 30 seconds and doesn’t pollute every session with tools I rarely use.

The exception: Skills. Skills are similar to MCPs but more minimal. The model only sees a one-sentence description of each skill until it’s actually needed. When triggered, it dynamically loads the full instructions.

For example, I have an Obsidian skill. The description says “use when the user mentions Obsidian or daily notes.” The full instructions (vault location, how to find daily notes, formatting preferences) only load when relevant. When I’m working on code and not mentioning Obsidian, my context stays clean.
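As a sketch, a skill like that is just a directory containing a SKILL.md whose frontmatter description is the only part the model sees up front (the vault path and formatting details here are illustrative, not my actual setup):

```markdown
---
name: obsidian-notes
description: Use when the user mentions Obsidian or daily notes.
---

My Obsidian vault lives at ~/Documents/vault. Daily notes are in
~/Documents/vault/daily/, named YYYY-MM-DD.md. When appending to a
daily note, add entries under the "## Log" heading with an HH:MM
timestamp.
```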

For use cases like this, Skills are more efficient than MCPs. I’ll continue using them selectively.

The Result

My Claude Code setup is leaner: fewer tools in context, no auto-compaction overhead, full visibility into thinking traces. I spend less time wondering what the model is doing and more time actually working. When something goes wrong, I can see exactly where the model’s assumptions diverged from mine.

Here’s the difference. Running /context in a fresh session:

Before (default settings): Claude Code context usage with default settings

After (my setup): Claude Code context usage with my settings

Is it perfect? No. I’m pinned to an older version and relying on a community patch. But the tradeoff is worth it for the control I get back.

Your mileage may vary. If you’re happy with the defaults, that’s fine. But if you’re a power user who wants more control, and who wants to understand what the AI is doing rather than just trusting it, these customizations might help.

The goal isn’t to use AI tools less. It’s to use them with your eyes open.
