Squeezr is an HTTP proxy running on Node.js. It runs on any machine from the last 10 years. No GPU required. AI compression is delegated to cloud APIs (Haiku / GPT-mini / Gemini Flash) or an optional local Ollama server.
Minimum requirements
| Resource | Minimum | Recommended | Notes |
|---|
| CPU | 1 core x64/arm64 | 2+ cores | Regex + MD5 + Myers diff; brief spikes when processing 1–2 MB requests. |
| RAM | 256 MB free | 512 MB free | Measured: ~140 MB working set with all 3 processes (proxy + MITM + desktop proxy). |
| Disk | 200 MB | 300 MB | 128 MB node_modules + ~10 MB code + bounded caches in ~/.squeezr/. |
| Network | Outbound HTTPS (443) | — | api.anthropic.com, api.openai.com, generativelanguage.googleapis.com |
| GPU | Not required | All AI compression is cloud-based or via optional external Ollama. |
Software
| Requirement | Version |
|---|
| Node.js | ≥ 18 (tested on 24.x) |
| OS | Windows 10/11, macOS, Linux, WSL2 |
| npm | Whichever ships with your Node version |
Measured resource usage
- Main proxy RAM: ~140 MB working set (proxy + dashboard + MITM 8081).
- Desktop proxy (optional, Claude/Codex Desktop only): separate process, ~80 MB.
- CPU idle: ~0% — only active while a request is in flight.
- CPU peak: one core for 10–50 ms per request (deterministic pipeline is synchronous).
- Disk in
~/.squeezr/: bounded — cache.json (LRU 1000 entries), history.json (500 sessions max), captures/ (20 files max, opt-in).
Added latency per request
| Pipeline stage | Latency |
|---|
| Deterministic (regex, dedup, Myers diff) | 5–50 ms |
| AI compression (Haiku / GPT-mini / Gemini Flash) | 300–1500 ms — only on old blocks above threshold, runs in parallel |
| Forward to upstream | Passthrough — streaming preserved |
Ports used (configurable)
| Port | Purpose |
|---|
8080 | Main HTTP proxy (Claude Code, Codex CLI, Aider, Gemini CLI) + dashboard |
8081 | MITM proxy (Codex OAuth) |
8443 / 8088 | Desktop proxy — only needed for Claude Desktop / Codex Desktop |
What you do NOT need
- ❌ GPU / CUDA
- ❌ Docker
- ❌ A database — everything is JSON in
~/.squeezr/ - ❌ Ollama — optional, only if you want a fully local AI compression backend
With Zest local model (future)
When the Zest local model (zest-0.8b) ships via Ollama, requirements increase: ~2 GB extra RAM for the model (~1 GB in Q4 quantization) and a CPU with AVX2 or a small GPU for reasonable inference speed. This will be documented separately.