Claude Code

Claude Code is the primary target for Squeezr. Because it sends large tool results (file reads, shell output, search results) in every turn, it benefits the most from compression — typically saving 50–70% of input tokens per session.

Setup

Run squeezr setup once. It automatically sets ANTHROPIC_BASE_URL=http://localhost:8080 in your environment. No manual configuration needed.

Then start the proxy:

squeezr setup   # one-time
squeezr start

Claude Code reads ANTHROPIC_BASE_URL automatically and routes all API calls through the proxy. Your existing ANTHROPIC_API_KEY or OAuth token continues to work — Squeezr forwards it to the Anthropic API transparently.

OAuth support

Claude Max, Team, and Enterprise plans authenticate via OAuth tokens instead of API keys. Squeezr detects and forwards OAuth tokens transparently. No additional configuration is needed.
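A minimal sketch of what "forwards transparently" means at the header level. The header names here are assumptions for illustration: `x-api-key` is Anthropic's API-key header, and OAuth clients are assumed to send `Authorization: Bearer …`; the proxy simply copies whichever credential is present, without inspecting it.

```python
def forward_auth_headers(incoming: dict) -> dict:
    """Copy the client's credential header through unchanged.

    Claude Code sends either `x-api-key` (API keys) or an
    `Authorization: Bearer ...` header (OAuth plans); the proxy
    forwards whichever one is present and never stores it.
    """
    out = {}
    for name in ("x-api-key", "authorization"):
        if name in incoming:
            out[name] = incoming[name]
    return out
```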

How it works with Claude Code

Layer 1: System prompt compression

Claude Code's system prompt is ~13KB and is sent with every request. Squeezr compresses it once using a cheap AI model (Haiku) and caches the result. Every subsequent request reuses the cached version, saving ~3,000 tokens per request.
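The compress-once-then-cache behavior can be sketched as a content-addressed cache. `compress_fn` is a hypothetical stand-in for the Haiku call; keying on a hash of the prompt means the expensive call happens once per unique system prompt, and every later request is a dictionary lookup.

```python
import hashlib

class SystemPromptCache:
    """Compress a system prompt once and reuse the cached result.

    `compress_fn` is a placeholder for the cheap-model (Haiku) call;
    it is invoked only on a cache miss.
    """

    def __init__(self, compress_fn):
        self._compress = compress_fn
        self._cache = {}  # sha256(prompt) -> compressed prompt

    def get(self, prompt: str) -> str:
        key = hashlib.sha256(prompt.encode()).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._compress(prompt)  # paid once per unique prompt
        return self._cache[key]
```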

Layer 2: Deterministic preprocessing

Zero-latency rule-based transforms applied to every tool result before anything else:

  • ANSI escape codes and progress bars stripped
  • Duplicate stack frames and repeated lines deduplicated
  • JSON whitespace collapsed
  • Blank lines consolidated
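The transforms above are pure string rewrites, which is why they add no latency. A sketch of three of them (ANSI stripping, consecutive-duplicate removal, blank-line consolidation); the exact rules Squeezr applies may differ:

```python
import re

ANSI_RE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")  # CSI escape sequences

def preprocess(text: str) -> str:
    """Zero-latency, rule-based cleanup applied before any other layer."""
    text = ANSI_RE.sub("", text)                 # strip ANSI escape codes
    out, prev = [], None
    for line in text.split("\n"):
        if line == prev and line.strip():        # drop consecutive duplicate lines
            continue
        out.append(line)
        prev = line
    text = "\n".join(out)
    return re.sub(r"\n{3,}", "\n\n", text)       # consolidate runs of blank lines
```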

Layer 3: Tool-specific patterns

Each tool result is matched against 30+ specialized patterns. Errors and actionable information are always preserved:

Tool       Compression strategy                             Typical savings
Read       Cross-turn dedup, large file → signatures only   40–90%
Bash       Pattern matching (git, test, build)              50–80%
Grep       Grouped by file, matches capped                  30–60%
Glob       Directory tree summary (>30 files)               30–50%
WebFetch   AI summarization                                 60–80%
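As an illustration of one row of the table, here is a sketch of the Grep strategy: group `path:line:text` matches by file and cap the matches shown per file. The `cap` parameter and output format are assumptions, not Squeezr's actual output:

```python
from collections import defaultdict

def compress_grep(result: str, cap: int = 3) -> str:
    """Group grep-style "path:line:text" matches by file, capping per file."""
    groups = defaultdict(list)
    for line in result.splitlines():
        path, _, rest = line.partition(":")
        groups[path].append(rest)
    out = []
    for path, matches in groups.items():
        out.append(path)
        out.extend(f"  {m}" for m in matches[:cap])
        if len(matches) > cap:
            out.append(f"  ({len(matches) - cap} more matches omitted)")
    return "\n".join(out)
```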

Cross-turn file read deduplication

When Claude Code reads the same file multiple times in a conversation, Squeezr detects the duplication and replaces repeated reads with a compact reference pointer. This alone can save thousands of tokens in long sessions.
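A sketch of the mechanism, assuming dedup keys on a hash of the file content: the first read passes through untouched, and any later read of identical content is replaced by a pointer back to the first one. The pointer format here is illustrative only.

```python
import hashlib

class ReadDeduper:
    """Replace repeated reads of identical file content with a compact pointer."""

    def __init__(self):
        self._seen = {}  # content hash -> (path, turn) of the first read

    def process(self, path: str, content: str, turn: int) -> str:
        key = hashlib.sha256(content.encode()).hexdigest()
        if key in self._seen:
            first_path, first_turn = self._seen[key]
            return f"[file unchanged since turn {first_turn}: {first_path}]"
        self._seen[key] = (path, turn)
        return content  # first read passes through in full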

Adaptive pressure

Compression aggressiveness scales automatically with context window usage:

Context usage   Threshold     Behavior
< 50%           1,500 chars   Light: only compress large results
50–75%          800 chars     Normal: standard compression
75–90%          400 chars     Aggressive: compress most results
> 90%           150 chars     Critical: compress everything, 0 git diff context
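The thresholds in the table can be read as a simple step function from context usage to the minimum result size (in characters) that triggers compression:

```python
def compression_threshold(context_usage: float) -> int:
    """Map context-window usage (0.0-1.0) to the size threshold, in chars,
    above which a tool result is compressed."""
    if context_usage < 0.50:
        return 1500  # light: only compress large results
    if context_usage < 0.75:
        return 800   # normal
    if context_usage < 0.90:
        return 400   # aggressive
    return 150       # critical: compress nearly everything
```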

Per-command skip

Add # squeezr:skip anywhere in a Bash command to bypass compression for that specific result:

cat important-file.txt  # squeezr:skip

Configuration tips

  • Skip specific tools:
    [compression]
    skip_tools = ["Read"]
  • Only compress specific tools:
    [compression]
    only_tools = ["Bash"]
  • Raise the threshold to compress less (fewer false positives):
    [compression]
    threshold = 1500

Verifying it works

squeezr status

After a few interactions, you should see a positive savings percentage. Check the troubleshooting guide if you see 502 errors or the proxy is not intercepting requests.