The 22,000 Token Tax: Why I Replaced My MCP Server with Shell Scripts
By Nino, Senior Tech Editor
In a recent company workshop, I found myself in a heated debate with a group of developers who were just beginning their journey with Large Language Models (LLMs). Their primary concern was cost—specifically, the €25 a week they were spending on API calls. They wanted to optimize their prompts to bring that figure down to €20. My response was blunt: "You are in the learning phase. Spend more, not less. Break things. Create costs. The insight you gain from a €50 bill is worth ten times more than the €5 you save by being cautious."
However, I quickly followed that up with a caveat. There is one scenario where token consumption matters critically, and it has nothing to do with your credit card balance. It is about Context Preservation. When you are using high-performance models via n1n.ai, every token you send at the start of a session is a tax on the quality of every response that follows. This is the story of how I realized I was paying a 22,000 token tax just to say 'hello' to my AI agent, and why I eventually killed my MCP server to reclaim that context.
The Silent Killer: Context Rot
Research and practical experience both confirm a phenomenon often referred to as "context rot" or the "lost in the middle" problem. As the context window of an LLM fills up, its ability to reason effectively over the provided information degrades. Even with models like Claude 3.5 Sonnet or GPT-4o, which boast massive context windows, the density of information matters. If you fill the first 20% of your window with useless metadata, the model's 'attention' is spread thinner.
In the early days of tools like Claude Code, sessions would simply crash when the limit was reached. Today, we have auto-compaction, which summarizes previous parts of the conversation to make room for new ones. But auto-compaction is a destructive process. You never truly know which vital piece of information survived the squeeze. Therefore, every unnecessary token loaded at startup is a direct threat to the long-term intelligence of your session.
The 22,000 Token Audit
One evening, I decided to audit my own environment. I was running three Model Context Protocol (MCP) servers:
- mcp-atlassian: For Jira and Confluence integration.
- chrome-devtools: For browser automation.
- context7: For local documentation lookups.
I opened a fresh session and ran the /context command. The result was staggering: 22,000 tokens were consumed before I had even typed a single prompt.
The biggest offender was the Atlassian MCP server. It was registering 33 different tools. I only used six of them. I had attempted to use the disabledTools configuration in Claude Code to trim the fat, but the audit revealed a frustrating truth: disabledTools is merely a runtime filter.
When you disable a tool in the config, the AI is instructed not to call it, but the MCP server still starts, the Docker container still spins up, and—most importantly—the tool definitions and schemas are still injected into the context window. Each tool schema can take up 300-500 tokens of metadata. Multiply that by 33, and you are burning 10,000+ tokens on tools you have explicitly said you don't want to use.
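You can sanity-check numbers like these yourself from a dump of a server's tool list. The sketch below rests on two assumptions of mine: the common rule of thumb of roughly four characters per token (not an exact tokenizer), and a file you have captured from the server's tools/list response — the filename and function name are hypothetical, not part of any MCP tooling:

```bash
# Rough token cost of MCP tool schemas, assuming ~4 characters per token.
estimate_tool_tokens() {
  # $1: a JSON file shaped like {"tools": [ ...tool definitions... ]}
  jq -c '.tools[]' "$1" | awk '{ chars += length($0) } END { printf "%d\n", chars / 4 }'
}
```

At 300-500 tokens per schema, 33 tools lands squarely in the 10,000+ range described above.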
The Solution: Graduation to Shell Scripts
I realized that the MCP was abstracting away complexity that didn't actually exist. Most modern enterprise tools, like Jira, have robust REST APIs. If you are already using a platform like n1n.ai to access stable LLM endpoints, you don't need a heavy middleware protocol to talk to a web service.
I decided to replace the entire Atlassian MCP server with a set of lightweight shell scripts. The pattern is simple: credentials stored in a secure JSON file, curl for the requests, and jq for parsing.
Step 1: Secure Credential Management
Instead of environment variables in a Docker container, I use a local config file at `~/.config/jira/credentials.json`:

```json
{
  "personal_token": "your_api_token_here",
  "base_url": "https://your-company.atlassian.net"
}
```
Secure the file with `chmod 600 ~/.config/jira/credentials.json` so only your user can read it.
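For completeness, the entire setup is three commands; the heredoc simply writes the placeholder values shown above, which you would replace with your real token and instance URL:

```bash
# Create the config directory, write the credentials file, lock it down.
mkdir -p ~/.config/jira
cat > ~/.config/jira/credentials.json <<'EOF'
{
  "personal_token": "your_api_token_here",
  "base_url": "https://your-company.atlassian.net"
}
EOF
chmod 600 ~/.config/jira/credentials.json
```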
Step 2: The Logic Script
Here is a simplified version of the script I use to fetch Jira issues. It replaces a 300-token MCP schema with 0 tokens of overhead until it is actually called.
```bash
#!/bin/bash
# jira-get-issue.sh
TOKEN=$(jq -r '.personal_token' ~/.config/jira/credentials.json)
BASE_URL=$(jq -r '.base_url' ~/.config/jira/credentials.json)
ISSUE_ID=$1

if [ -z "$ISSUE_ID" ]; then
  echo "Usage: jira-get-issue <ISSUE-KEY>"
  exit 1
fi

# Note: Use -k only if dealing with internal self-signed certs
curl -s -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  "$BASE_URL/rest/api/2/issue/$ISSUE_ID"
```
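The raw issue JSON that comes back can easily run to tens of kilobytes, so I usually pipe it through a jq filter before handing it to the agent — the same context-preservation logic applied to output instead of input. The filter below is a sketch of my own; the field paths follow Jira's standard issue schema, and your instance may expose different fields:

```bash
# Strip a Jira issue response down to the fields the agent actually needs,
# so a large payload doesn't eat context on its way in.
summarize_issue() {
  jq '{key: .key, summary: .fields.summary, status: .fields.status.name}'
}

# Typical use: ./jira-get-issue.sh PROJ-123 | summarize_issue
```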
Why Scripts Win Over MCP
- Zero Startup Cost: Unlike MCP tools, which must be described to the LLM at the start of every session, shell scripts (or 'skills') are only invoked when needed. You save 10,000+ tokens instantly.
- Customization: I can bake project-specific defaults directly into the script. For example, I can force every new ticket to include a specific 'Team' label or 'Component' without having to explain that to the AI every time.
- Reliability: I found that creating Jira tickets via MCP often failed due to complex custom fields. With a direct curl POST request, I can map the JSON exactly as Jira expects it.
Example of a ticket creation script that "just works":
```bash
# TOKEN and BASE_URL are loaded from the credentials file as before.
curl -s -X POST \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "fields": {
      "project": {"key": "PROJ"},
      "issuetype": {"name": "Task"},
      "summary": "'"$1"'",
      "components": [{"name": "API"}]
    }
  }' "$BASE_URL/rest/api/2/issue"
```
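One caveat with that snippet: splicing $1 into the payload by string concatenation breaks as soon as the summary contains a double quote. When that bites, building the body with jq escapes everything correctly — a sketch, with build_issue_payload as my own helper name:

```bash
# Build the creation payload with jq so quotes, newlines, and other
# special characters in the summary are escaped properly.
build_issue_payload() {
  jq -n --arg summary "$1" '{
    fields: {
      project: {key: "PROJ"},
      issuetype: {name: "Task"},
      summary: $summary,
      components: [{name: "API"}]
    }
  }'
}

# Then pipe it into curl:
#   build_issue_payload "$1" | curl -s -X POST \
#     -H "Authorization: Bearer $TOKEN" \
#     -H "Content-Type: application/json" \
#     -d @- "$BASE_URL/rest/api/2/issue"
```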
The Comparison Table
| Feature | Atlassian MCP Server | Native Shell Scripts |
|---|---|---|
| Startup Token Cost | ~10,000 Tokens | 0 Tokens |
| Setup Complexity | Low (Docker-based) | Medium (Writing bash) |
| Maintenance | Third-party updates | You own the code |
| Flexibility | Rigid Schemas | Infinite (Any API) |
| Security | Opaque Docker Image | Transparent Source |
When to Use What?
MCP is an excellent "getting started" tool. If you are experimenting and want to see what is possible, the low barrier to entry is fantastic. However, once you move into professional production workflows—especially if you are using high-speed, high-reliability APIs from n1n.ai—you will eventually hit the "Abstraction Wall."
When your agent is performing multi-step workflows that consume 100k+ tokens, you cannot afford to waste 10% of your context on unused tool definitions. Graduating to shell scripts is not just an optimization; it is a necessity for maintaining the "intelligence" of your AI sessions over long periods.
Conclusion: Engineering over Abstraction
The move from MCP back to shell scripts felt like a regression at first, but it was actually a move toward engineering maturity. By removing the middleware, I gained visibility, saved money, and significantly improved the performance of my LLM.
If you find your AI agent becoming forgetful or making errors late in a session, check your token tax. You might find that you're paying for a lot of "tools" you never actually use.
Get a free API key at n1n.ai