token-optimizer
High-risk skill: executes shell commands and downloads remote scripts (curl -fsSL https://ollama.ai/install.sh | sh) and runs installers (python3 src/optimizer.py, ollama CLI). It also calls external endpoints (https://api.anthropic.com/v1/usage, https://clawhub.ai), references ANTHROPIC_API_KEY/tokens, and modifies agent config/identity files (~/.openclaw/openclaw.json, ~/.openclaw/workspace/SOUL.md).
Token Optimizer for OpenClaw
Reduce your AI costs by 97% - From $1,500+/month to under $50/month
The Problem
If you've been running OpenClaw and watching your API bills climb, you're not alone. The default configuration prioritizes capability over cost, which means you're probably burning through tokens on routine tasks that don't need expensive models.
Common issues:
- Loading 50KB of history on every message (2-3M wasted tokens/session)
- Using Sonnet/Opus for simple tasks that Haiku handles perfectly
- Paying for API heartbeats that could run on a free local LLM
- No rate limits leading to runaway automation costs
The Solution
Token Optimizer applies four key optimizations that work together to slash your costs:
| Optimization | Before | After | Savings |
|---|---|---|---|
| Session Management | 50KB context | 8KB context | 80% |
| Model Routing | Sonnet for everything | Haiku default | 92% |
| Heartbeat to Ollama | Paid API | Free local LLM | 100% |
| Prompt Caching | No caching | 90% cache hits | 90% |
Combined result: 97% cost reduction
Cost Comparison
| Time Period | Before | After |
|---|---|---|
| Daily | $2-3 | $0.10 |
| Monthly | $70-90 | $3-5 |
| Yearly | $800+ | $40-60 |
Quick Start
Installation
Windows (PowerShell):
.\scripts\install.ps1
macOS/Linux:
chmod +x scripts/install.sh
./scripts/install.sh
Manual Python:
python src/optimizer.py --mode full
Verify Setup
python src/verify.py
Features
1. Intelligent Model Routing
Sets Haiku as the default model with easy aliases for switching:
haiku- Fast, cheap, perfect for 80% of taskssonnet- Complex reasoning, architecture decisionsopus- Mission-critical only
2. Free Heartbeats via Ollama
Routes heartbeat checks to a local LLM (llama3.2:3b) instead of paid API:
- Zero API calls for status checks
- No impact on rate limits
- Saves $5-15/month automatically
3. Lean Session Management
Optimized context loading rules that reduce startup context from 50KB to 8KB:
- Load only essential files (SOUL.md, USER.md)
- On-demand history retrieval
- Daily memory notes instead of history bloat
4. Prompt Caching
Automatic 90% discount on repeated content:
- System prompts cached and reused
- 5-minute TTL for optimal cache hits
- Per-model cache configuration
5. Budget Controls
Built-in rate limits and budget warnings:
- Daily/monthly budget caps
- Warning at 75% threshold
- Rate limiting between API calls
Usage
Analyze Current Setup
python src/analyzer.py
Shows:
- Current configuration status
- Workspace file sizes
- Optimization opportunities
- Estimated monthly savings
Apply Full Optimization
python src/optimizer.py --mode full
Applies all optimizations:
- Updates
~/.openclaw/openclaw.json - Generates workspace templates
- Creates system prompt rules
- Sets up Ollama heartbeat
Apply Specific Optimizations
# Model routing only
python src/optimizer.py --mode routing
# Heartbeat to Ollama only
python src/optimizer.py --mode heartbeat
# Prompt caching only
python src/optimizer.py --mode caching
# Rate limits only
python src/optimizer.py --mode limits
Dry Run (Preview Changes)
python src/optimizer.py --mode full --dry-run
Configuration
After installation, edit these files:
~/.openclaw/workspace/SOUL.md
Agent principles and operating rules. Includes:
- Model selection rules
- Session initialization rules
- Rate limit rules
~/.openclaw/workspace/USER.md
Your context: name, role, mission, success metrics.
~/.openclaw/prompts/OPTIMIZATION-RULES.md
Copy these rules into your agent's system prompt.
Requirements
- Python 3.8+
- OpenClaw installed and configured
- Ollama (optional, for free heartbeats)
Installing Ollama
macOS/Linux:
curl -fsSL https://ollama.ai/install.sh | sh
ollama pull llama3.2:3b
ollama serve
Windows: Download from ollama.ai and run:
ollama pull llama3.2:3b
ollama serve
File Structure
token-optimizer/
├── skill.json # Skill manifest
├── README.md # This file
├── src/
│ ├── __init__.py
│ ├── analyzer.py # Analyzes current config
│ ├── optimizer.py # Applies optimizations
│ └── verify.py # Verifies setup
├── templates/
│ ├── openclaw-config-optimized.json
│ ├── SOUL.md
│ ├── USER.md
│ └── OPTIMIZATION-RULES.md
└── scripts/
├── install.sh # Unix installer
└── install.ps1 # Windows installer
Troubleshooting
Context size still large
- Ensure SESSION INITIALIZATION RULE is in your system prompt
- Check that SOUL.md and USER.md are lean (<15KB total)
Still using Sonnet for everything
- Verify
~/.openclaw/openclaw.jsonhas correct model configuration - Ensure MODEL SELECTION RULE is in system prompt
Heartbeat errors
- Make sure Ollama is running:
ollama serve - Verify model is installed:
ollama list
Costs haven't dropped
- Run
python src/verify.pyto check all optimizations - Ensure system prompt includes all optimization rules
Support
- Documentation: docs.tokenoptimizer.ai
- Issues: GitHub Issues
- Email: support@tokenoptimizer.ai
License
Commercial license. See LICENSE for details.
Built with care by TokenOptimizer
Stop burning tokens. Start building things.