Routerly
One gateway. Any AI model. Total control.
Routerly is a self-hosted LLM API gateway with intelligent routing, cost tracking, and budget enforcement. It sits between your application and your LLM providers — OpenAI, Anthropic, Google Gemini, Ollama, and more — and decides which model to use for every request based on configurable policies.
Compatible with OpenAI and Anthropic SDKs out of the box. Change the base URL in your client, nothing else.
Key Features
| Feature | Description |
|---|---|
| Intelligent routing | 9 configurable policies score every request in parallel — cheapest, fastest, healthiest, most capable, or LLM-powered |
| Budget enforcement | Hard limits per model, project, and token. Requests are blocked before being sent to providers |
| Full cost tracking | Every API call logged with tokens, USD cost, latency, and routing trace |
| Multi-project isolation | Each project has its own API token, model list, routing config, and budget envelope |
| OpenAI + Anthropic compatible | Drop-in replacement for both APIs — /v1/chat/completions and /v1/messages |
| Zero infrastructure | No database, no Redis, no PostgreSQL — config lives in JSON files |
| Web dashboard | Register models, configure projects, monitor usage, manage users |
| Admin CLI | Full management from the terminal with routerly commands |
| RBAC | Role-based access control with 7 granular permissions and custom roles |
| Notifications | Email and webhook alerts via SMTP, SES, SendGrid, Azure, Google, or custom webhook |
How It Works
Point your app — or any AI tool like Cursor, Open WebUI, OpenClaw, or LangChain — at localhost:3000 instead of the provider's URL. From that moment, Routerly takes over: it picks the best available model for each request, enforces your budget, tracks every token spent, and automatically reroutes if a provider fails.
Any Client Routerly Providers
────────── ──────── ─────────
Your App ──▶ Authenticate OpenAI
Cursor Policy scoring ──▶ Anthropic
Open WebUI POST /v1/ Select model Gemini
OpenClaw chat/ Forward request Ollama
LangChain completions Track cost ◀── ...
◀── Return response
- Your app sends a request to Routerly using a project token as the Bearer header
- Routerly authenticates the token and resolves the project
- Enabled routing policies run in parallel and score each candidate model
- The highest-scoring model within budget receives the request
- Response is forwarded back; tokens, cost, and latency are recorded
- On error or timeout, the next candidate is tried automatically
Use Cases
| Scenario | How Routerly helps |
|---|---|
| SaaS & multi-tenant | One project per tenant, hard spend caps, automatic cheapest-model routing |
| Local-first development | Develop against Ollama locally, promote to GPT-4o in production — same API, same code |
| Automatic cost reduction | Route simple tasks to cheap models, complex ones to capable models |
| Resilience & failover | Register the same capability across OpenAI, Claude, and Gemini — Routerly detects failures and reroutes in real time |
| AI tools & chat UIs | Cursor, Open WebUI, OpenClaw, LibreChat — any tool that speaks OpenAI or Anthropic format |
| Model evaluation / A/B testing | Split traffic with the fairness policy; compare cost, latency, and error rate in the dashboard |
Quick Start
macOS / Linux:
curl -fsSL https://www.routerly.ai/install.sh | bash
Windows (PowerShell):
powershell -c "irm https://www.routerly.ai/install.ps1 | iex"
Then register a model, create a project, and start:
routerly model add --id gpt-5-mini --provider openai --api-key sk-YOUR_KEY
routerly project add --name "My App" --slug my-app --models gpt-5-mini
routerly start
Open http://localhost:3000/dashboard to manage everything from the web UI.
→ See the Installation guide and Quick Start tutorial for full details.
Comparison
Routerly is the only gateway that combines self-hosting, native Anthropic support, LLM-powered routing, and zero external dependencies.
| Routerly | LiteLLM | OpenRouter | |
|---|---|---|---|
| Self-hosted | ✅ | ✅ | ❌ cloud-only |
| OpenAI-compatible API | ✅ | ✅ | ✅ |
| Native Anthropic API format | ✅ | ❌ | ❌ |
| Local model support (Ollama) | ✅ | ✅ | ❌ |
| LLM-powered smart routing | ✅ | ❌ | ❌ |
| Deterministic routing policies | ✅ | ⚠️ limited | ❌ |
| Budget enforcement | ✅ | ✅ | ✅ |
| Database required | None | SQLite / PostgreSQL | N/A |
| External infrastructure | None | Optional Redis | N/A |
| Web dashboard | ✅ built-in | ✅ | ✅ |
| Admin CLI | ✅ | ✅ | ❌ |
| Data privacy | ✅ stays local | ✅ | ❌ transits cloud |