Stop overspending on LLM tokens

See what your prompts cost across GPT, Claude, and Gemini, then cut that cost. Use the free calculator below, or install the MCP server that does it automatically inside Claude and Cursor.

npm: mcp-token-optimizer · works with any MCP client · MIT
AI-compressed output: review before use, because AI compression can occasionally drop nuance. The same compression engine powers the Pro tier.
Estimated tokens (input plus output) are a browser approximation of about characters / 4. Install the MCP server for exact token counts, prompt slimming, and per-call automation.
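
For reference, the characters / 4 rule of thumb can be sketched in a few lines. This is an illustration only, not the calculator's actual source: the function name estimateTokens and the choice to round up are assumptions, and real tokenizers give different counts depending on the model and the language of the text.

```typescript
// Rough estimate: about 4 characters per token.
// Heuristic only; a real tokenizer will give different counts.
// The name estimateTokens and rounding up are assumptions for this sketch.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

const prompt = "You are a helpful assistant. Answer concisely.";
console.log(`~${estimateTokens(prompt)} tokens (estimate)`);
```

Rounding up keeps the estimate slightly conservative for short prompts. For billing-grade numbers, use the exact counts from the MCP server.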

Install in Claude / Cursor (30 seconds)

Add this to your MCP config (claude_desktop_config.json or .cursor/mcp.json):

{
  "mcpServers": {
    "token-optimizer": {
      "command": "npx",
      "args": ["-y", "mcp-token-optimizer"]
    }
  }
}

Then ask: "slim this system prompt and show what I'd save at 50k calls a month" or "which model is cheapest for this prompt?"

What it does

count_tokens

Exact token count + cost across models.

estimate_cost

Per-call + monthly/yearly spend.

slim_prompt

Compress prompts, measure $ saved.

compare_model_costs

Find the cheapest capable model.
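
The arithmetic behind these tools can be sketched as follows. This is a minimal illustration under stated assumptions, not the server's implementation: the model names and per-million-token prices are placeholders rather than real rates, every function name is hypothetical, and "cheapest" here compares price only and does not judge whether a model is capable enough for the task.

```typescript
// Hypothetical price table in USD per million tokens.
// Placeholder numbers for illustration; check each provider's current pricing.
interface ModelPrice {
  name: string;
  inputPerMTok: number;
  outputPerMTok: number;
}

const PRICES: ModelPrice[] = [
  { name: "model-a", inputPerMTok: 3.0, outputPerMTok: 15.0 },
  { name: "model-b", inputPerMTok: 0.5, outputPerMTok: 1.5 },
];

// Cost of a single call: input and output tokens are priced separately.
function costPerCall(p: ModelPrice, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1e6;
}

// Monthly spend is the per-call cost times call volume; yearly is 12x that.
function monthlyCost(p: ModelPrice, inputTokens: number, outputTokens: number, callsPerMonth: number): number {
  return costPerCall(p, inputTokens, outputTokens) * callsPerMonth;
}

// Monthly savings from slimming a prompt: only input tokens shrink.
function monthlySavings(p: ModelPrice, tokensBefore: number, tokensAfter: number, callsPerMonth: number): number {
  return ((tokensBefore - tokensAfter) * p.inputPerMTok / 1e6) * callsPerMonth;
}

// Lowest-cost model for a given token profile (price only, not capability).
// Throws on an empty price list because reduce has no initial value.
function cheapestModel(prices: ModelPrice[], inputTokens: number, outputTokens: number): ModelPrice {
  return prices.reduce((best, p) =>
    costPerCall(p, inputTokens, outputTokens) < costPerCall(best, inputTokens, outputTokens) ? p : best
  );
}

// Example: a 1,000-token prompt slimmed to 600 tokens at 50k calls a month.
const model = PRICES[0];
console.log(monthlyCost(model, 1000, 500, 50_000).toFixed(2));
console.log(monthlySavings(model, 1000, 600, 50_000).toFixed(2));
console.log(cheapestModel(PRICES, 1000, 500).name);
```

Because savings scale linearly with call volume, trimming a few hundred tokens from a system prompt matters most on high-traffic endpoints.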

Read: how to reduce LLM token costs →