All docs

Requirements

Squeezr is an HTTP proxy running on Node.js. It runs on any machine from the last 10 years. No GPU required. AI compression is delegated to cloud APIs (Haiku / GPT-mini / Gemini Flash) or an optional local Ollama server.

Minimum requirements

ResourceMinimumRecommendedNotes
CPU1 core x64/arm642+ coresRegex + MD5 + Myers diff; brief spikes when processing 1–2 MB requests.
RAM256 MB free512 MB freeMeasured: ~140 MB working set with all 3 processes (proxy + MITM + desktop proxy).
Disk200 MB300 MB128 MB node_modules + ~10 MB code + bounded caches in ~/.squeezr/.
NetworkOutbound HTTPS (443)api.anthropic.com, api.openai.com, generativelanguage.googleapis.com
GPUNot requiredAll AI compression is cloud-based or via optional external Ollama.

Software

RequirementVersion
Node.js≥ 18 (tested on 24.x)
OSWindows 10/11, macOS, Linux, WSL2
npmWhichever ships with your Node version

Measured resource usage

  • Main proxy RAM: ~140 MB working set (proxy + dashboard + MITM 8081).
  • Desktop proxy (optional, Claude/Codex Desktop only): separate process, ~80 MB.
  • CPU idle: ~0% — only active while a request is in flight.
  • CPU peak: one core for 10–50 ms per request (deterministic pipeline is synchronous).
  • Disk in ~/.squeezr/: bounded — cache.json (LRU 1000 entries), history.json (500 sessions max), captures/ (20 files max, opt-in).

Added latency per request

Pipeline stageLatency
Deterministic (regex, dedup, Myers diff)5–50 ms
AI compression (Haiku / GPT-mini / Gemini Flash)300–1500 ms — only on old blocks above threshold, runs in parallel
Forward to upstreamPassthrough — streaming preserved

Ports used (configurable)

PortPurpose
8080Main HTTP proxy (Claude Code, Codex CLI, Aider, Gemini CLI) + dashboard
8081MITM proxy (Codex OAuth)
8443 / 8088Desktop proxy — only needed for Claude Desktop / Codex Desktop

What you do NOT need

  • ❌ GPU / CUDA
  • ❌ Docker
  • ❌ A database — everything is JSON in ~/.squeezr/
  • ❌ Ollama — optional, only if you want a fully local AI compression backend

With Zest local model (future)

When the Zest local model (zest-0.8b) ships via Ollama, requirements increase: ~2 GB extra RAM for the model (~1 GB in Q4 quantization) and a CPU with AVX2 or a small GPU for reasonable inference speed. This will be documented separately.