Quick Start
Get Squeezr running in under two minutes. Two commands and you're done.
Step 1: Install
npm install -g squeezr-ai
Step 2: Run setup
squeezr setup
That's it. squeezr setup handles everything automatically:
- Sets ANTHROPIC_BASE_URL and GEMINI_API_BASE_URL to point at the proxy.
- Installs a shell wrapper in your PowerShell profile (Windows) or ~/.bashrc / ~/.zshrc (Linux/macOS/WSL) so environment variables refresh automatically after each squeezr command, with no need to restart your terminal.
- Registers auto-start so the proxy comes back up after a reboot (Task Scheduler on Windows, systemd on Linux, launchd on macOS).
- Imports the MITM CA certificate so Codex trusts the proxy's TLS (Windows Certificate Store on Windows; ~/.squeezr/mitm-ca/bundle.crt on macOS/Linux/WSL).
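To confirm that setup took effect, a small check along the following lines may help. This is a minimal sketch, not part of Squeezr: check_setup is a hypothetical helper, and it only looks at the two environment variables and the CA bundle path named above. The bundle check applies to macOS/Linux/WSL; on Windows the certificate goes into the Windows Certificate Store, so that entry will report as missing there.

```python
import os
from pathlib import Path

# Variable names and bundle path are taken from the setup description above.
PROXY_ENV_VARS = ("ANTHROPIC_BASE_URL", "GEMINI_API_BASE_URL")
CA_BUNDLE_RELATIVE = Path(".squeezr") / "mitm-ca" / "bundle.crt"


def check_setup(env=None, home=None):
    """Hypothetical helper: map each setup check to True (present) or False."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    # A variable counts as set only if it exists and is non-empty.
    report = {name: bool(env.get(name)) for name in PROXY_ENV_VARS}
    # macOS/Linux/WSL only; Windows uses the certificate store instead.
    report["ca_bundle"] = (home / CA_BUNDLE_RELATIVE).is_file()
    return report


if __name__ == "__main__":
    for check, ok in check_setup().items():
        print(f"{check}: {'ok' if ok else 'missing'}")
```

Run it in a fresh terminal after squeezr setup; any entry reported as missing suggests the shell profile has not been reloaded or that setup did not complete.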
Step 3: Start the proxy
squeezr start
Use your coding tool exactly as before; Squeezr compresses transparently.
What happens behind the scenes
Every request passes through a three-layer compression pipeline:
- System prompt compression — Claude Code's ~13KB system prompt is compressed once and cached. Subsequent requests reuse the cached version, saving ~3,000 tokens per request.
- Deterministic preprocessing — Zero-latency rule-based transforms: ANSI escape codes stripped, repeated stack frames deduplicated, JSON whitespace collapsed, progress bars removed.
- Tool-specific patterns — 30+ rules matched against git, test runners, build output, package managers, infra tools, and more. Errors and actionable information are always preserved.
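The deterministic preprocessing layer can be illustrated with a short sketch. This is not Squeezr's actual implementation: the function names, the progress-bar pattern, and the repeat marker are assumptions chosen for illustration. It only shows how the four transforms listed above (ANSI stripping, JSON whitespace collapse, progress-bar removal, repeated-line deduplication) could be written as pure, rule-based functions.

```python
import json
import re
from itertools import groupby

# CSI escape sequences such as colour codes ("\x1b[31m") and cursor moves.
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# Assumed progress-bar shape: a bracketed run of fill characters, e.g. "[####----]".
PROGRESS_BAR = re.compile(r"\[[#=>\-. ]{4,}\]")


def strip_ansi(text):
    """Remove ANSI CSI escape sequences."""
    return ANSI_CSI.sub("", text)


def collapse_json(text):
    """Return compact JSON if text parses as JSON, otherwise None."""
    try:
        parsed = json.loads(text)
    except ValueError:
        return None
    return json.dumps(parsed, separators=(",", ":"), ensure_ascii=False)


def dedupe_consecutive(lines):
    """Keep one copy of each run of identical lines, plus a repeat marker."""
    out = []
    for line, group in groupby(lines):
        repeats = sum(1 for _ in group)
        out.append(line)
        if repeats > 1:
            out.append(f"[repeated {repeats - 1} more time(s)]")
    return out


def preprocess(text):
    """Apply the rule-based transforms in order; no model call is involved."""
    text = strip_ansi(text)
    compact = collapse_json(text)
    if compact is not None:
        return compact
    lines = [ln.rstrip() for ln in text.splitlines()]
    lines = [ln for ln in lines if not PROGRESS_BAR.search(ln)]
    return "\n".join(dedupe_consecutive(lines))


if __name__ == "__main__":
    raw = "\x1b[31mError: boom\x1b[0m\n  at f (a.js:1)\n  at f (a.js:1)\n[####----] 50%\ndone"
    print(preprocess(raw))
```

Because every step is a regular expression or a single pass over the lines, the cost is negligible next to a network round trip, and the error line survives untouched.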
Typical savings
- Per tool result: 70–95% reduction depending on tool
- Per session (2 hours): ~200K tokens → ~80K tokens (60% savings)
- System prompt: ~13KB → ~600 tokens (cached)
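The per-session percentage follows directly from the before and after token counts. The sketch below shows the arithmetic; reduction is a hypothetical helper name. The system prompt figure is not computed because it compares kilobytes with tokens.

```python
def reduction(before_tokens, after_tokens):
    """Fraction of tokens removed, e.g. 0.6 for a 60% reduction."""
    if before_tokens <= 0:
        raise ValueError("before_tokens must be positive")
    return 1 - after_tokens / before_tokens


if __name__ == "__main__":
    # Per-session figures from the list above: ~200K tokens down to ~80K.
    print(f"{reduction(200_000, 80_000):.0%}")
```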
Next steps
- Read the guide for your specific tool: Claude Code, Codex, Aider, Gemini CLI, Ollama.
- Learn how the compression pipeline works.
- Explore the full configuration reference.