Config File Reference

Squeezr uses TOML for configuration. This page is the complete reference for every key in the config file, organized by section.

File locations

# Global config: next to the installed binary (in the npm global prefix)
squeezr.toml

# Project config: deep-merged over the global config; use it for per-repo overrides
.squeezr.toml   (in your project root)

Use squeezr config to print the resolved path and current values.
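
As a sketch of what a project override might look like, the fragment below assumes a repository that wants a lower compression threshold and leaves everything else to the global config. The values are illustrative, not recommendations. Because the project file is deep-merged over the global one, keys it omits should keep their global values.

```toml
# .squeezr.toml in the project root (illustrative values only)
# Only the keys listed here override the global squeezr.toml.

[compression]
threshold = 500          # hypothetical per-repo value; the documented default is 800
skip_tools = ["Read"]    # example tool name from the [compression] reference
```

Running squeezr config afterwards is a quick way to check which values actually resolved.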

[proxy]

Controls the proxy server ports.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| port | integer | 8080 | HTTP proxy port (Claude Code, Aider, Gemini CLI). |
| mitm_port | integer | 8081 | MITM proxy port (Codex). Defaults to port + 1. |
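
A minimal sketch, assuming port 8080 is already taken on your machine. The number 9000 is an arbitrary example. Since mitm_port is documented as defaulting to port + 1, leaving it unset should place the MITM proxy on 9001.

```toml
[proxy]
port = 9000        # arbitrary example port
# mitm_port is omitted, so it should default to port + 1 (9001 here)
```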

[compression]

Controls how and when content is compressed.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| threshold | integer | 800 | Minimum content size (chars) to trigger compression. |
| keep_recent | integer | 3 | Last N tool results to leave uncompressed. |
| compress_system_prompt | boolean | true | Compress and cache the system prompt. |
| compress_conversation | boolean | false | Also compress assistant messages (aggressive mode). |
| skip_tools | array | [] | Tool names to never compress (e.g. ["Read"]). |
| only_tools | array | [] | Only compress these tools, skip all others (e.g. ["Bash"]). |
| ai_compression | boolean | false | Enable AI-based compression (Haiku/GPT-mini/Gemini Flash). Off by default; deterministic compression still runs. Can also be toggled from the dashboard and is persisted in ~/.squeezr/ai-compression.json. |
| ai_min_chars | integer | 1500 | Minimum block size (chars) to send to the AI backend. Blocks smaller than this use deterministic-only compression. Data shows blocks under 500 chars are often expanded by AI; blocks of 1500 chars or more save 70–91%. |
| stale_turns | boolean | true | Collapse old assistant/user turns in very long sessions to save context. Only triggers when the session exceeds stale_turn_threshold user turns. Never touches the last stale_turn_keep_recent turns. |
| stale_turn_threshold | integer | 50 | Number of user turns after which stale turn summarization activates. |
| stale_turn_keep_recent | integer | 20 | Number of recent turns to always keep at full fidelity (never summarized). |
| capture_requests | boolean | false | Save anonymized incoming request payloads to ~/.squeezr/captures/ for debugging. Auth headers are redacted. Stops after capture_limit files. |
| capture_limit | integer | 20 | Maximum number of capture files to write before stopping. |
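
As an illustration of how these keys combine, the fragment below sketches a setup that turns on AI compression for Bash output only and briefly captures requests for debugging. The combination and the reduced capture_limit are assumptions chosen for the example. How only_tools interacts with skip_tools when both are set is not specified here, so the sketch sets only one of them.

```toml
[compression]
ai_compression = true      # off by default
ai_min_chars = 1500        # documented default; smaller blocks stay deterministic-only
only_tools = ["Bash"]      # compress only this tool's results
capture_requests = true    # debug aid; writes to ~/.squeezr/captures/
capture_limit = 5          # illustrative; the documented default is 20
```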

[cache]

Controls in-process caching of compressed results.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable the cache. |
| max_entries | integer | 1000 | Maximum number of cached compressed results. |

[adaptive]

Adaptive pressure automatically increases compression aggressiveness as the context window fills up.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable adaptive compression. |
| low_threshold | integer | 1500 | Min chars to compress when context is below 50%. |
| mid_threshold | integer | 800 | Min chars to compress when context is 50–75%. |
| high_threshold | integer | 400 | Min chars to compress when context is 75–90%. |
| critical_threshold | integer | 150 | Min chars to compress when context exceeds 90%. Git diff context set to 0. |
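
A sketch of a gentler profile, assuming you would rather compress less while the context window is mostly empty. The raised values are hypothetical. Each comment gives the context band the key applies to and its documented default. How exact boundary values (for example, a context at exactly 50%) are assigned is not stated here.

```toml
[adaptive]
enabled = true
low_threshold = 2000       # context below 50% (default 1500)
mid_threshold = 1200       # context 50–75% (default 800)
high_threshold = 600       # context 75–90% (default 400)
critical_threshold = 150   # context above 90% (default kept)
```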

[local]

Configuration for local model servers (Ollama) used as the compression backend.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| enabled | boolean | true | Enable local model support. |
| upstream_url | string | "http://localhost:11434" | URL of the local model server. |
| compression_model | string | "qwen2.5-coder:1.5b" | Local model to use for AI compression. |
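
A sketch assuming Ollama runs on another machine on your network rather than on localhost. The host address and the larger model tag are hypothetical; substitute whatever your server actually exposes.

```toml
[local]
enabled = true
upstream_url = "http://192.168.1.50:11434"   # hypothetical LAN host, same port as the documented default
compression_model = "qwen2.5-coder:7b"       # hypothetical alternative to the default 1.5b tag
```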

Full example

# squeezr.toml — v1.67.4 complete reference

[proxy]
port = 8080
mitm_port = 8081

[compression]
threshold = 800
keep_recent = 3
compress_system_prompt = true
compress_conversation = false
ai_compression = false        # OFF by default — toggle from dashboard or set here
ai_min_chars = 1500           # Minimum block size for AI compression
stale_turns = true            # Summarize old turns in long sessions
stale_turn_threshold = 50     # Activate after N user turns
stale_turn_keep_recent = 20   # Keep last N turns at full fidelity
# skip_tools = ["Read"]
# only_tools = ["Bash"]
# capture_requests = false    # Debug: save anonymized payloads to ~/.squeezr/captures/
# capture_limit = 20

[cache]
enabled = true
max_entries = 1000

[adaptive]
enabled = true
low_threshold = 1500
mid_threshold = 800
high_threshold = 400
critical_threshold = 150

[local]
enabled = true
upstream_url = "http://localhost:11434"
compression_model = "qwen2.5-coder:1.5b"