How ChatShell Solves MCP Context Bloat with Progressive Disclosure


Six weeks into building ChatShell, I connected three MCP servers for a power user test. The agent got noticeably slower and dumber. I opened the context window log and found thousands of tokens of tool schemas sitting there before the user had typed a single word. That is when I knew something had to change.

Every tool you expose to an LLM costs tokens. The name, description, and full JSON Schema for parameters all land in the model’s context window before the conversation even begins. With a handful of built-in tools this is manageable. Add a few MCP servers — each advertising ten or twenty tools — plus a library of skill workflows, and you are burning thousands of tokens on metadata the model will never use in that conversation.

This is the context bloat problem: the more capable you make your agent, the less room it has to think.

ChatShell solves this with progressive disclosure — a two-layer architecture where the model sees a compact catalog of what is available, then pulls full definitions only when a task demands them. Instead of N tool definitions, the model sees two or three meta-tools and discovers everything else on demand.

The scale of the problem

Before diving into the solution, it helps to see just how bad the naive approach gets:

| Approach                       | Startup Token Cost |
| ------------------------------ | ------------------ |
| Register 50 MCP tools directly | ~15,000 tokens     |
| ChatShell meta-tool approach   | ~300 tokens        |
| Savings                        | ~98%               |

These are conservative estimates. A single MCP tool’s JSON Schema — with its properties, required fields, and nested objects — can easily run 200–400 tokens. Ten tools from a database server, twenty from a filesystem server, more from a custom API: it adds up fast. With ChatShell’s meta-tool approach, the model sees only tool names and short descriptions — a few dozen tokens total — until it explicitly asks for more.
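As a back-of-the-envelope check, the table's numbers fall out of simple arithmetic. This sketch assumes the ~300-token midpoint of the schema range above and a few tokens per catalog entry; exact figures vary by model and tokenizer:

```rust
// Rough startup-cost model. Every directly registered tool pays its full
// JSON Schema up front; the meta-tool approach pays a fixed overhead plus
// a short catalog entry per tool.
fn naive_cost(tool_count: u32, avg_schema_tokens: u32) -> u32 {
    tool_count * avg_schema_tokens
}

fn catalog_cost(tool_count: u32, avg_entry_tokens: u32, meta_tool_overhead: u32) -> u32 {
    meta_tool_overhead + tool_count * avg_entry_tokens
}
```

With 50 tools at ~300 tokens per schema, the naive approach lands at ~15,000 tokens, while ~4-token catalog entries plus ~100 tokens of fixed meta-tool schema stay around 300.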

Layer one: MCP schema-first discovery

MCP servers can expose dozens of tools each. A PostgreSQL server might advertise query, list_tables, describe_table, insert, update, delete, and more — each with a substantial JSON Schema. Registering all of them as separate LLM functions would repeat those schemas in the context on every turn.

ChatShell splits this into two meta-tools with distinct responsibilities.

mcp_schema: the catalog and schema lookup

The backend registers a tool named mcp_schema. Its description embeds an XML catalog of all available MCP tools — grouped by server, with only short descriptions:

```rust
async fn definition(&self, _prompt: String) -> ToolDefinition {
    let mut desc = String::from(
        "Look up the schema for an MCP tool before calling it with mcp_tool_use. \
         You MUST call this first to understand the required parameters.\n\n\
         <available_mcp_tools>\n",
    );
    for server in &self.catalog {
        desc.push_str(&format!("  <server name=\"{}\">\n", server.name));
        for (tool_name, tool_desc) in &server.tools {
            desc.push_str(&format!(
                "    <tool name=\"{}\">{}</tool>\n",
                tool_name, tool_desc
            ));
        }
        desc.push_str("  </server>\n");
    }
    desc.push_str("</available_mcp_tools>");
    // ...
}
```
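Rendered for a hypothetical PostgreSQL server, the embedded catalog portion of the description would look something like this (server name, tool names, and descriptions are illustrative):

```xml
<available_mcp_tools>
  <server name="postgres">
    <tool name="query">Run a SQL query against the database</tool>
    <tool name="list_tables">List tables in the current schema</tool>
    <tool name="describe_table">Show columns and types for a table</tool>
  </server>
</available_mcp_tools>
```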

The model sees tool names and short descriptions — enough to decide which tool to use — but not the full parameter schemas. When it needs to call a specific tool, it first calls mcp_schema with the server and tool name. The handler reads the full JSON Schema from a cached file on disk:

```rust
async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
    let path = std::path::Path::new(&self.schema_dir)
        .join(&args.server)
        .join(format!("{}.json", args.tool));
    let content = tokio::fs::read_to_string(&path).await?;
    Ok(format!(
        "<mcp_schema server=\"{}\" tool=\"{}\">\n{}\n</mcp_schema>",
        args.server, args.tool, content
    ))
}
```

mcp_tool_use: the single execution surface

The second meta-tool is intentionally minimal. Its definition simply states that the model should read parameters via mcp_schema first:

```rust
ToolDefinition {
    name: "mcp_tool_use".to_string(),
    description: "Call an MCP tool by server and name. \
        Always read the schema with mcp_schema first to understand \
        the required parameters.".to_string(),
    parameters: json!({
        "type": "object",
        "properties": {
            "server": { "type": "string" },
            "tool": { "type": "string" },
            "arguments": { "type": "object" }
        },
        "required": ["server", "tool"]
    }),
}
```

At runtime, call looks up the MCP client by a "server/tool" composite key in a shared map and invokes the tool through the rmcp client. If the tool is unknown, the error message steers the model back to the catalog:

```rust
#[error("Unknown MCP tool: {0}. Check the available tools via mcp_schema.")]
UnknownTool(String),
```
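A minimal sketch of that dispatch, assuming a shared map keyed by the "server/tool" composite string. The McpClient trait and EchoClient below are stand-ins for the real rmcp client, not ChatShell's actual types:

```rust
use std::collections::HashMap;

// Stand-in for the real rmcp client; only the call surface matters here.
trait McpClient {
    fn call_tool(&self, tool: &str, arguments: &str) -> String;
}

// Dummy client for illustration.
struct EchoClient;

impl McpClient for EchoClient {
    fn call_tool(&self, tool: &str, arguments: &str) -> String {
        format!("{} <- {}", tool, arguments)
    }
}

struct ToolRouter {
    // One map covers every connected server, keyed by "server/tool".
    clients: HashMap<String, Box<dyn McpClient>>,
}

impl ToolRouter {
    fn dispatch(&self, server: &str, tool: &str, arguments: &str) -> Result<String, String> {
        let key = format!("{}/{}", server, tool);
        match self.clients.get(&key) {
            Some(client) => Ok(client.call_tool(tool, arguments)),
            // Unknown tools steer the model back to the catalog.
            None => Err(format!(
                "Unknown MCP tool: {}. Check the available tools via mcp_schema.",
                key
            )),
        }
    }
}

fn demo_router() -> ToolRouter {
    let mut clients: HashMap<String, Box<dyn McpClient>> = HashMap::new();
    clients.insert("postgres/query".to_string(), Box::new(EchoClient));
    ToolRouter { clients }
}
```

Because the key is the full composite string, adding a new server never changes the tool surface the model sees; only the catalog text grows.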

The result: the model discovers tools through the embedded catalog, loads the exact schema on demand, then executes through one stable interface — two tool definitions instead of N, regardless of how many MCP tools are connected.

How schemas reach disk

Before mcp_schema can serve schemas, the streaming pipeline syncs tool definitions from each connected MCP server into the app cache directory (under mcp-tools/). Each server gets a subdirectory, and each tool’s full JSON Schema is written as {tool}.json. This happens once when the MCP server is connected for a conversation — the live catalog (names and descriptions) stays in the tool definition, while the heavy schema payloads sit on disk until the model explicitly requests them.
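A sketch of that sync step under the layout just described (mcp-tools/{server}/{tool}.json); the function name and cache-root handling are assumptions, not ChatShell's actual code:

```rust
use std::fs;
use std::path::{Path, PathBuf};

// Write one tool's full JSON Schema into the per-server cache directory,
// returning the path that mcp_schema will later read from.
fn cache_tool_schema(
    cache_root: &Path,
    server: &str,
    tool: &str,
    schema_json: &str,
) -> std::io::Result<PathBuf> {
    let dir = cache_root.join("mcp-tools").join(server);
    fs::create_dir_all(&dir)?;
    let path = dir.join(format!("{}.json", tool));
    fs::write(&path, schema_json)?;
    Ok(path)
}
```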

Skills face a structurally identical problem — and the same catalog pattern solves it.

Progressive disclosure for skills is common practice — far more so than for MCP tools. Two approaches are typical: injecting a skill catalog directly into the system prompt, or embedding it in the description of a dedicated skill tool. ChatShell uses the second approach to stay consistent with the MCP layer.

Skills in ChatShell are structured packages — each a directory containing a SKILL.md file with instructions and optional resource files. On the TypeScript side, the Skill type captures everything the system needs to know:

```typescript
export interface Skill {
  id: string
  name: string
  description?: string
  source: SkillSource
  path: string
  required_tool_ids: string[]
  allow_model_invocation: boolean
  allow_user_invocation: boolean
  is_enabled: boolean
  // ...
}
```

The required_tool_ids field is the key architectural choice: skills declare which built-in tools they need, and those tools are enabled automatically when the skill is active. This means a skill for “code review” can require read, grep, and glob without the user manually toggling them.
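The union itself is straightforward; here is a sketch with assumed Rust-side names mirroring the TypeScript fields above:

```rust
use std::collections::BTreeSet;

struct SkillEntry {
    name: String,
    required_tool_ids: Vec<String>,
}

// Union the explicitly enabled built-in tools with every active skill's
// declared dependencies, deduplicating along the way.
fn effective_tool_ids(explicit: &[&str], skills: &[SkillEntry]) -> BTreeSet<String> {
    let mut ids: BTreeSet<String> =
        explicit.iter().map(|s| s.to_string()).collect();
    for skill in skills {
        ids.extend(skill.required_tool_ids.iter().cloned());
    }
    ids
}
```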

The important design decision is what not to do: we do not paste every skill’s instructions into the system prompt.

How the catalog works

The backend registers a single tool named skill. Its description embeds an XML catalog of all available skills — just names, paths, and short descriptions:

```rust
async fn definition(&self, _prompt: String) -> ToolDefinition {
    let mut desc = String::from(
        "Load the full instructions for a skill by name. \
         Use this when a task matches an available skill.\n\n\
         <available_skills>\n",
    );
    for entry in &self.skills {
        let d = entry.description.as_deref().unwrap_or("No description");
        desc.push_str(&format!(
            "  <skill name=\"{}\" path=\"{}\">{}</skill>\n",
            entry.name, entry.path, d
        ));
    }
    desc.push_str("</available_skills>");
    ToolDefinition {
        name: "skill".to_string(),
        description: desc,
        parameters: json!({
            "type": "object",
            "properties": {
                "name": {
                    "type": "string",
                    "description": "Skill name from the available_skills catalog"
                }
            },
            "required": ["name"]
        }),
    }
}
```

The model sees something like:

```xml
<available_skills>
  <skill name="code-review" path="/Users/me/.chatshell/skills/code-review/SKILL.md">
    Review code changes for bugs, style issues, and improvements
  </skill>
  <skill name="create-rule" path="/Users/me/.chatshell/skills/create-rule/SKILL.md">
    Create persistent AI guidance rules for a project
  </skill>
</available_skills>
```

This is a table of contents, not a textbook. The model can scan it in a few dozen tokens and decide whether any skill is relevant to the current task.

Loading a skill on demand

When the model calls skill with { "name": "code-review" }, the implementation resolves the entry from the catalog, reads the SKILL.md file, strips YAML frontmatter, and returns the instructions wrapped in <skill_content>. It also lists up to 20 ancillary files in the skill directory — templates, example configs, reference files — so the model knows what resources are available without dumping their contents:

```rust
async fn call(&self, args: Self::Args) -> Result<Self::Output, Self::Error> {
    let entry = self.skills.iter()
        .find(|e| e.name == args.name)
        .ok_or_else(|| SkillError(format!("Unknown skill: {}", args.name)))?;
    let path = std::path::Path::new(&entry.path);
    let skill_dir = path.parent().unwrap_or(path);
    let content = tokio::fs::read_to_string(path).await?;
    let body = strip_frontmatter(&content);
    // List up to 20 ancillary files so the model knows what else exists.
    let resource_files = collect_resource_files(skill_dir).await;
    // Summarize the skill location and resources (exact format abridged here).
    let path_note = format!(
        "Skill directory: {}\nResource files: {}",
        skill_dir.display(),
        resource_files.join(", ")
    );
    Ok(format!(
        "<skill_content name=\"{}\">\n{}\n\n{}\n</skill_content>",
        args.name, body, path_note
    ))
}
```
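The strip_frontmatter helper is referenced but not shown in the article; a plausible sketch for frontmatter delimited by --- lines:

```rust
// Remove a leading YAML frontmatter block ("---\n ... \n---\n") if present,
// returning the instruction body unchanged otherwise.
fn strip_frontmatter(content: &str) -> &str {
    let Some(rest) = content.strip_prefix("---\n") else {
        return content;
    };
    match rest.find("\n---\n") {
        // Skip past the closing delimiter ("\n---\n" is 5 bytes).
        Some(end) => &rest[end + 5..],
        None => content,
    }
}
```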

The system prompt stays thin. When skills are available, ChatShell appends a single sentence via SKILL_INSTRUCTIONS: “When a task matches one of the available skills, use the skill tool to load its full instructions before proceeding.” That is the entire system-prompt cost of the skill system, regardless of how many skills exist.

Wiring meta-tools into the agent

ChatShell’s Rust backend uses the rig crate for agent construction. AgentConfig holds optional instances of SkillTool, McpSchemaTool, and McpToolUseTool alongside boolean flags for each built-in capability:

```rust
pub struct AgentConfig {
    pub system_prompt: Option<String>,
    pub model_params: ModelParameters,
    // Built-in tool flags
    pub enable_web_search: bool,
    pub enable_bash: bool,
    pub enable_read: bool,
    pub enable_edit: bool,
    // ...
    // Progressive disclosure tools
    pub mcp_schema_tool: Option<McpSchemaTool>,
    pub mcp_tool_use: Option<McpToolUseTool>,
    pub skill_tool: Option<SkillTool>,
}
```

The agent builder only attaches each meta-tool when its Option is populated. The provider sees at most two MCP meta-tools (regardless of how many MCP tools exist) plus one skill tool (regardless of how many skills are on disk) — alongside whichever built-in tools are enabled for that specific conversation.

Progressive disclosure would be undermined if every server and skill were always wired in. ChatShell assembles the tool set per streaming request in handle_agent_streaming.

The enabled tool set is the union of:

  1. Assistant configuration — the tools and skills attached to the active assistant preset.
  2. Skill dependencies — each enabled skill contributes its required_tool_ids, auto-enabling the built-in tools it needs.
  3. Conversation settings — per-conversation overrides that can additively enable MCP servers and skills without changing global defaults.

Skills from the assistant and the conversation are deduplicated by SKILL.md path, then converted into SkillCatalogEntry structs and passed to SkillTool::new. Non-builtin tool IDs that correspond to MCP servers trigger connection to those servers, syncing of schema files, construction of McpServerCatalog entries, and attachment of both McpSchemaTool and McpToolUseTool to the agent config.
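The dedup step can be sketched as follows (type and field names are assumed, not ChatShell's actual ones):

```rust
use std::collections::HashSet;

#[derive(Clone)]
struct SkillCandidate {
    name: String,
    path: String, // absolute path to the skill's SKILL.md
}

// Merge assistant-level and conversation-level skills, keeping the first
// occurrence of each SKILL.md path so overrides never duplicate entries.
fn dedup_skills(
    assistant: Vec<SkillCandidate>,
    conversation: Vec<SkillCandidate>,
) -> Vec<SkillCandidate> {
    let mut seen = HashSet::new();
    assistant
        .into_iter()
        .chain(conversation)
        .filter(|s| seen.insert(s.path.clone()))
        .collect()
}
```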

If the model does not support tool use (detected via a capabilities check), both skill entries and tool IDs are cleared — no point paying for catalog text the model cannot act on.

The system prompt is then conditionally extended:

```rust
if !skill_entries.is_empty() {
    effective_system_prompt.push_str("\n\n");
    effective_system_prompt.push_str(prompts::SKILL_INSTRUCTIONS);
}
if config.mcp_schema_tool.is_some() {
    effective_system_prompt.push_str("\n\n");
    effective_system_prompt.push_str(prompts::MCP_INSTRUCTIONS);
}
```

Each conversation carries only the tools and skills the user actually needs for that chat. And even then, full skill text and MCP schemas only appear after the model makes a targeted tool call to request them.

What this means for users

Assistants can ship with a default tool and skill mix. Conversations can layer on additional MCP servers or skills without changing the assistant’s global configuration. From the model’s perspective, the XML catalog in each meta-tool’s description is a lightweight, always-visible index. Everything beyond that is pay-as-you-go.

It is the same way humans work with large APIs: skim the table of contents, then open the reference page when you are ready to make a call.

Credits and lineage

This design draws directly from two ideas in the ecosystem:

  • Cursor — Dynamic Context Discovery introduced the principle that agents should discover context and capabilities progressively — loading what they need, when they need it — instead of front-loading every rule and tool definition into the prompt. ChatShell’s XML catalogs and schema-on-demand flow apply that principle to skills and MCP tools.

  • AgentSkills.io documents how clients can treat skills as structured, loadable resources (a SKILL.md file with frontmatter and metadata) rather than static prompt text permanently embedded in the system message. ChatShell follows this model: skills live as catalog entries in the tool description, full instructions are loaded on invocation, and resource files are listed for discoverability.

Together, these influences shaped a concrete Rust implementation: XML catalogs embedded in tool definitions, thin execution surfaces, cached JSON schemas on disk, and conversation-scoped tool enablement — lean context by default, full capability on request.