# Claude Code
Claude Code is the primary target for Squeezr. Because it sends large tool results (file reads, shell output, search results) in every turn, it benefits the most from compression — typically saving 50–70% of input tokens per session.
## Setup
Run `squeezr setup` once. It automatically sets `ANTHROPIC_BASE_URL=http://localhost:8080` in your environment; no manual configuration is needed.
Then start the proxy:

```bash
squeezr setup   # one-time
squeezr start
```

Claude Code reads `ANTHROPIC_BASE_URL` automatically and routes all API calls through the proxy. Your existing `ANTHROPIC_API_KEY` or OAuth token continues to work; Squeezr forwards it to the Anthropic API transparently.
## OAuth support
Claude Max, Team, and Enterprise plans authenticate via OAuth tokens instead of API keys. Squeezr detects and forwards OAuth tokens transparently. No additional configuration is needed.
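The forwarding behavior can be pictured as a pass-through of whichever credential header the client sent. This is an illustrative sketch, not Squeezr's actual code; the header names are the standard Anthropic conventions (`x-api-key` for API keys, `Authorization` for OAuth bearer tokens):

```python
# Sketch only: copy the client's credential header through to the upstream
# request untouched, whether it is an API key or an OAuth bearer token.
AUTH_HEADERS = ("authorization", "x-api-key")

def forward_auth_headers(incoming: dict[str, str]) -> dict[str, str]:
    """Return only the credential headers, values unchanged."""
    return {
        name: value
        for name, value in incoming.items()
        if name.lower() in AUTH_HEADERS
    }
```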
## How it works with Claude Code
### Layer 1: System prompt compression
Claude Code's system prompt is ~13KB and is sent with every request. Squeezr compresses it once using a cheap AI model (Haiku) and caches the result. Every subsequent request reuses the cached version, saving ~3,000 tokens per request.
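The compress-once-then-cache idea amounts to a content-addressed cache. A minimal sketch, where `compress` stands in for the one-time AI summarization call; the caching scheme is illustrative, not Squeezr's actual implementation:

```python
import hashlib

_cache: dict[str, str] = {}

def compress_system_prompt(prompt: str, compress) -> str:
    """Compress each unique prompt once; reuse the cached result afterwards."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = compress(prompt)   # the expensive AI call, paid once
    return _cache[key]                   # free on every subsequent request
```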
### Layer 2: Deterministic preprocessing

Zero-latency, rule-based transforms are applied to every tool result before anything else:
- ANSI escape codes and progress bars stripped
- Duplicate stack frames and repeated lines deduplicated
- JSON whitespace collapsed
- Blank lines consolidated
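The four rules above can be sketched in a few lines of Python. This is illustrative only; Squeezr's real rule set is more extensive:

```python
import json
import re

ANSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")   # CSI sequences: colors, cursor moves

def collapse_json(text: str) -> str:
    """Re-serialize valid JSON without whitespace; pass other text through."""
    try:
        return json.dumps(json.loads(text), separators=(",", ":"))
    except ValueError:
        return text

def preprocess(text: str) -> str:
    text = ANSI.sub("", text)                  # strip ANSI escape codes
    kept, seen = [], set()
    for line in text.splitlines():
        line = line.split("\r")[-1]            # keep only the final state of a progress bar
        if line.strip() and line in seen:      # drop exact repeated lines
            continue
        seen.add(line)
        kept.append(line)
    out = "\n".join(kept)
    return re.sub(r"\n{3,}", "\n\n", out)      # consolidate runs of blank lines
```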
### Layer 3: Tool-specific patterns
Each tool result is matched against 30+ specialized patterns. Errors and actionable information are always preserved:
| Tool | Compression strategy | Typical savings |
|---|---|---|
| Read | Cross-turn dedup, large file → signatures only | 40–90% |
| Bash | Pattern matching (git, test, build) | 50–80% |
| Grep | Grouped by file, matches capped | 30–60% |
| Glob | Directory tree summary (>30 files) | 30–50% |
| WebFetch | AI summarization | 60–80% |
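The per-tool routing can be sketched as a dispatch table mapping tool names to strategy functions. `summarize_grep` below is a hypothetical stand-in for the Grep strategy (group by file, cap matches); the other strategies are elided:

```python
from typing import Callable

def summarize_grep(text: str, max_matches: int = 20) -> str:
    """Group grep-style `path:match` lines by file and cap output per file."""
    by_file: dict[str, list[str]] = {}
    for line in text.splitlines():
        path, _, match = line.partition(":")
        if match:
            by_file.setdefault(path, []).append(match)
    parts = []
    for path, matches in by_file.items():
        parts.append(f"{path} ({len(matches)} matches)")
        parts.extend(f"  {m}" for m in matches[:max_matches])
    return "\n".join(parts)

# Tools without a registered strategy pass through unchanged.
STRATEGIES: dict[str, Callable[[str], str]] = {
    "Grep": summarize_grep,
}

def compress_tool_result(tool: str, text: str) -> str:
    handler = STRATEGIES.get(tool)
    return handler(text) if handler else text
```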
## Cross-turn file read deduplication
When Claude Code reads the same file multiple times in a conversation, Squeezr detects the duplication and replaces repeated reads with a compact reference pointer. This alone can save thousands of tokens in long sessions.
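The mechanism can be sketched as a content-hash lookup: the first read passes through, and any later read of identical content is replaced by a pointer. The pointer text here is hypothetical, not Squeezr's actual reference format:

```python
import hashlib

class ReadDeduper:
    """Replace repeated reads of identical file content with a short pointer."""

    def __init__(self) -> None:
        self._seen: dict[str, str] = {}   # content hash -> path of first read

    def process(self, path: str, content: str) -> str:
        key = hashlib.sha256(content.encode()).hexdigest()
        if key in self._seen:
            return f"[unchanged since earlier read of {self._seen[key]}]"
        self._seen[key] = path
        return content
```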
## Adaptive pressure
Compression aggressiveness scales automatically with context window usage:
| Context usage | Threshold | Behavior |
|---|---|---|
| < 50% | 1,500 chars | Light — only compress large results |
| 50–75% | 800 chars | Normal — standard compression |
| 75–90% | 400 chars | Aggressive — compress most results |
| > 90% | 150 chars | Critical — compress everything, 0 git diff context |
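The tiers above translate directly into a threshold lookup. A minimal sketch, assuming context usage is expressed as a fraction of the context window:

```python
def compression_threshold(context_usage: float) -> int:
    """Map context-window usage (0.0-1.0) to a size threshold in characters."""
    if context_usage < 0.50:
        return 1500   # light: only compress large results
    if context_usage < 0.75:
        return 800    # normal: standard compression
    if context_usage < 0.90:
        return 400    # aggressive: compress most results
    return 150        # critical: compress everything
```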
## Per-command skip

Add `# squeezr:skip` anywhere in a Bash command to bypass compression for that specific result:
```bash
cat important-file.txt  # squeezr:skip
```

## Configuration tips
- Skip specific tools:

  ```toml
  [compression]
  skip_tools = ["Read"]
  ```

- Only compress specific tools:

  ```toml
  [compression]
  only_tools = ["Bash"]
  ```

- Raise the threshold to compress less (fewer false positives):

  ```toml
  [compression]
  threshold = 1500
  ```
## Verifying it works
```bash
squeezr status
```

After a few interactions, you should see a positive savings percentage. Check the troubleshooting guide if you see 502 errors or the proxy is not intercepting requests.