# Architecture
Routerly is a self-hosted API gateway that sits between your application and one or more LLM providers. It exposes standard-compatible endpoints (/v1/chat/completions, /v1/responses, /v1/messages) so existing SDKs work without modification.
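Because the endpoints are wire-compatible, an existing client only needs a new base URL and a Routerly project token. A minimal sketch in TypeScript, assuming a Routerly instance at `http://localhost:4000` (the host, port, token, and model name below are placeholders, not defaults documented here):

```typescript
// Build an OpenAI-compatible chat request aimed at a Routerly instance.
// Only the endpoint path and Bearer scheme come from the docs above;
// everything else (URL, token, model) is an illustrative placeholder.
interface ChatRequest {
  model: string;
  messages: { role: string; content: string }[];
}

function buildChatRequest(baseUrl: string, token: string, body: ChatRequest) {
  return {
    url: `${baseUrl}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

const req = buildChatRequest("http://localhost:4000", "sk-lr-example", {
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello" }],
});
// Pass to fetch(req.url, req.init) — or simply point an OpenAI SDK's
// baseURL at the Routerly host and use the project token as the API key.
```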
## Component Overview
```
┌────────────────────────────────────────────────────────────────┐
│                           Any Client                           │
│                                                                │
│  Your App │ OpenAI / Anthropic SDK │ Cursor │ Open WebUI       │
│           │ LibreChat │ OpenClaw │ LangChain / LlamaIndex      │
└───────────────────────┬────────────────────────────────────────┘
                        │  Bearer sk-lr-<token>
                        │  POST /v1/chat/completions  (OpenAI)
                        │  POST /v1/messages          (Anthropic)
                        ▼
        ┌─────────────────────────────────────────────────────┐
        │                  Routerly Service                   │
        │  ┌────────────┐  ┌────────────┐  ┌──────────────┐   │
        │  │ Auth Guard │  │   Router   │  │ Budget Guard │   │
        │  └────────────┘  └─────┬──────┘  └──────────────┘   │
        │                        │                            │
        │  ┌─────────────────────▼────────────────────────┐   │
        │  │              Provider Adapters               │   │
        │  │  OpenAI · Anthropic · Gemini · Mistral · …   │   │
        │  └──────────────────────────────────────────────┘   │
        └─────────────────────────────────────────────────────┘
                                   │
                        ┌──────────┴──────────┐
                        ▼                     ▼
                 ┌─────────────┐       ┌─────────────┐
                 │ OpenAI API  │   …   │ Ollama API  │
                 └─────────────┘       └─────────────┘
```
## Packages
Routerly is a monorepo composed of four packages:
| Package | Description |
|---|---|
| `packages/service` | The core Fastify HTTP server, routing engine, and provider adapters |
| `packages/dashboard` | The React + Vite web UI served at `/dashboard` |
| `packages/cli` | The `routerly` CLI tool (Commander.js) |
| `packages/shared` | Shared TypeScript types, provider definitions, and utilities |
## Request Lifecycle
When your application sends a chat request to Routerly:
1. **Authentication** — The Bearer token is validated against the list of project tokens.
2. **Project resolution** — The project's routing configuration and budget are loaded.
3. **Budget pre-check** — If the project or any parent budget is exhausted, Routerly returns `503` immediately.
4. **Routing** — The configured routing policies are applied in priority order to select a model. Each policy can score or filter the candidate set.
5. **Provider dispatch** — The request is translated to the target provider's wire format (OpenAI, Anthropic Messages, Gemini, …) and forwarded.
6. **Streaming or buffering** — If `stream: true`, Routerly proxies the provider stream over SSE. Otherwise it buffers and returns a standard response.
7. **Cost accounting** — Token counts and cost are computed and appended to `usage.json`.
8. **Budget update** — All applicable budget windows (token, project, global) are incremented.
9. **Notifications** — If any budget threshold was crossed, alert channels (email, webhook) are triggered.
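The gate-keeping steps at the front of this lifecycle can be sketched in plain TypeScript. Every name below (`handleChat`, `budgetRemaining`, the single flat budget window) is illustrative and not Routerly's actual internals:

```typescript
// Illustrative sketch of steps 1-4 above; shapes and names are hypothetical.
type Decision = { model: string; provider: string };
type Rejection = { status: number };

interface Project {
  tokens: string[];        // project tokens ("sk-lr-…")
  budgetRemaining: number; // simplified: one budget window
}

function handleChat(
  projects: Map<string, Project>,
  token: string
): Decision | Rejection {
  // 1. Authentication: find the project that owns this Bearer token
  const project = [...projects.values()].find((p) => p.tokens.includes(token));
  if (!project) return { status: 401 };

  // 3. Budget pre-check: refuse before any provider call is made
  if (project.budgetRemaining <= 0) return { status: 503 };

  // 4. Routing: a real policy chain would score/filter candidates here
  return { model: "gpt-4o-mini", provider: "openai" };
}

const projects = new Map<string, Project>([
  ["alpha", { tokens: ["sk-lr-alpha-1"], budgetRemaining: 12.5 }],
  ["beta", { tokens: ["sk-lr-beta-1"], budgetRemaining: 0 }],
]);

const routed = handleChat(projects, "sk-lr-alpha-1");     // a Decision
const overBudget = handleChat(projects, "sk-lr-beta-1");  // { status: 503 }
const unknown = handleChat(projects, "sk-lr-nope");       // { status: 401 }
```

The key design point the sketch preserves is ordering: the budget is checked before dispatch, so an exhausted project never incurs provider cost.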
## Configuration Storage
All state is stored as JSON files on disk under `~/.routerly/` (override with `$ROUTERLY_HOME`). There is no external database dependency.
| File | Contents |
|---|---|
| `config/settings.json` | Service settings |
| `config/models.json` | Registered LLM models (API keys AES-encrypted) |
| `config/projects.json` | Projects, routing, tokens, member roles |
| `config/users.json` | Dashboard users (passwords bcrypt-hashed) |
| `config/roles.json` | Custom RBAC roles |
| `data/usage.json` | Per-request usage records (append-only) |
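To make the usage file concrete, here is one hypothetical shape a `usage.json` record could take; the field names and the per-million-token rates are assumptions for illustration, not Routerly's actual schema or pricing:

```typescript
// Hypothetical usage record; the real usage.json schema may differ.
interface UsageRecord {
  timestamp: string;
  project: string;
  model: string;
  promptTokens: number;
  completionTokens: number;
  costUsd: number;
}

// Rates are USD per million tokens — illustrative numbers only.
function makeRecord(
  project: string,
  model: string,
  promptTokens: number,
  completionTokens: number,
  inputRatePerM: number,
  outputRatePerM: number
): UsageRecord {
  return {
    timestamp: new Date().toISOString(),
    project,
    model,
    promptTokens,
    completionTokens,
    costUsd:
      (promptTokens * inputRatePerM + completionTokens * outputRatePerM) /
      1_000_000,
  };
}

const rec = makeRecord("demo", "gpt-4o-mini", 1000, 500, 150, 600);
// rec.costUsd === (1000 * 150 + 500 * 600) / 1e6 === 0.45
```

Append-only storage of such records is what lets the budget windows in the lifecycle above be recomputed or audited after the fact.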
## Ports and Protocols
| Endpoint prefix | Protocol | Purpose |
|---|---|---|
| `/v1/*` | HTTP/1.1 + SSE | LLM proxy — authenticated with project tokens |
| `/api/*` | HTTP/1.1 | Management API — authenticated with JWT session |
| `/dashboard` | HTTP/1.1 | React SPA |
| `/health` | HTTP/1.1 | Health check (unauthenticated) |
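A small sketch of how these prefixes map to authentication schemes; only the prefixes and auth modes come from the table above, while the matcher itself is illustrative:

```typescript
// Map an incoming request path to the auth scheme from the table above.
type Auth = "project-token" | "jwt-session" | "none";

function authFor(path: string): Auth {
  if (path.startsWith("/v1/")) return "project-token"; // LLM proxy
  if (path.startsWith("/api/")) return "jwt-session";  // management API
  return "none"; // /dashboard (SPA assets) and /health
}
```

Keeping the two token types on disjoint prefixes means a leaked project token can never reach the management API, and vice versa.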