
Architecture

Routerly is a self-hosted API gateway that sits between your application and one or more LLM providers. It exposes provider-compatible endpoints (/v1/chat/completions and /v1/responses in the OpenAI format, /v1/messages in the Anthropic format), so existing SDKs work without modification.
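Because the surface is OpenAI-compatible, any HTTP client can talk to it directly. A minimal sketch, assuming a local instance at http://localhost:4000 and a placeholder project token (both illustrative):

```typescript
// Build the same JSON body an unmodified OpenAI SDK would send to
// /v1/chat/completions. Base URL and token are illustrative placeholders.
const ROUTERLY_URL = "http://localhost:4000";
const PROJECT_TOKEN = "sk-lr-example";

interface ChatRequest {
  method: string;
  headers: Record<string, string>;
  body: string;
}

function buildChatRequest(model: string, prompt: string): ChatRequest {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${PROJECT_TOKEN}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
    }),
  };
}

// To actually send it (Node 18+ built-in fetch):
// const res = await fetch(`${ROUTERLY_URL}/v1/chat/completions`,
//                         buildChatRequest("gpt-4o", "Hello"));
```

Pointing an existing OpenAI or Anthropic SDK at Routerly is just a matter of overriding its base URL and supplying the project token as the API key.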


Component Overview

┌────────────────────────────────────────────────────────────┐
│                         Any Client                         │
│                                                            │
│  Your App │ OpenAI / Anthropic SDK │ Cursor │ Open WebUI   │
│           │ LibreChat │ OpenClaw │ LangChain / LlamaIndex  │
└───────────────────────┬────────────────────────────────────┘
                        │ Bearer sk-lr-<token>
                        │ POST /v1/chat/completions (OpenAI)
                        │ POST /v1/messages (Anthropic)
                        ▼
┌─────────────────────────────────────────────────────┐
│                  Routerly Service                   │
│  ┌────────────┐  ┌────────────┐  ┌──────────────┐   │
│  │ Auth Guard │  │   Router   │  │ Budget Guard │   │
│  └────────────┘  └─────┬──────┘  └──────────────┘   │
│                        │                            │
│  ┌─────────────────────▼────────────────────────┐   │
│  │              Provider Adapters               │   │
│  │  OpenAI · Anthropic · Gemini · Mistral · …   │   │
│  └──────────────────────────────────────────────┘   │
└──────────────────────────┬──────────────────────────┘
                ┌──────────┴──────────┐
                ▼                     ▼
         ┌─────────────┐       ┌─────────────┐
         │ OpenAI API  │   …   │ Ollama API  │
         └─────────────┘       └─────────────┘

Packages

Routerly is a monorepo composed of four packages:

Package              Description
packages/service     The core Fastify HTTP server, routing engine, and provider adapters
packages/dashboard   The React + Vite web UI served at /dashboard
packages/cli         The routerly CLI tool (Commander.js)
packages/shared      Shared TypeScript types, provider definitions, and utilities

Request Lifecycle

When your application sends a chat request to Routerly:

  1. Authentication — The Bearer token is validated against the list of project tokens.
  2. Project resolution — The project's routing configuration and budget are loaded.
  3. Budget pre-check — If the project or any parent budget is exhausted, Routerly returns 503 immediately.
  4. Routing — The configured routing policies are applied in priority order to select a model. Each policy can score or filter the candidate set.
  5. Provider dispatch — The request is translated to the target provider's wire format (OpenAI, Anthropic Messages, Gemini, …) and forwarded.
  6. Streaming or buffering — If stream: true, Routerly SSE-proxies the provider stream. Otherwise it buffers and returns a standard response.
  7. Cost accounting — Token counts and cost are computed and appended to data/usage.json.
  8. Budget update — All applicable budget windows (token, project, global) are incremented.
  9. Notifications — If any budget threshold was crossed, alert channels (email, webhook) are triggered.
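The guard ordering above can be condensed into code. Everything in this sketch — the types, helper names, and the trivial single-model "routing" — is a hypothetical simplification for illustration, not Routerly's actual internals:

```typescript
// Hypothetical, condensed version of the request lifecycle: authenticate,
// pre-check the budget, route, then dispatch. Not Routerly's real code.
type Project = { token: string; model: string; budgetRemaining: number };
type Result = { status: number; body: string };

function handleChat(token: string, projects: Project[]): Result {
  // Steps 1-2: authentication and project resolution.
  const project = projects.find((p) => p.token === token);
  if (!project) return { status: 401, body: "invalid token" };

  // Step 3: budget pre-check — fail fast before touching any provider.
  if (project.budgetRemaining <= 0)
    return { status: 503, body: "budget exhausted" };

  // Step 4: routing (trivially one model here; real policies score/filter).
  const model = project.model;

  // Steps 5-6: dispatch to the provider adapter, stream or buffer…
  // Steps 7-9: …then record usage, update budgets, and fire alerts.
  return { status: 200, body: `routed to ${model}` };
}

const projects = [{ token: "sk-lr-demo", model: "gpt-4o", budgetRemaining: 0 }];
console.log(handleChat("sk-lr-demo", projects).status); // → 503
```

The important property the sketch preserves is ordering: an exhausted budget is rejected before any provider call, so no tokens are spent on a request that cannot be paid for.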

Configuration Storage

All state is stored as JSON files on disk under ~/.routerly/ (override with $ROUTERLY_HOME). There is no external database dependency.
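A small sketch of resolving that layout from a script — the routerlyHome helper is illustrative, not part of Routerly's API:

```typescript
// Resolve Routerly's state directory: $ROUTERLY_HOME if set, ~/.routerly
// otherwise. routerlyHome() is an illustrative helper, not a Routerly API.
import * as os from "node:os";
import * as path from "node:path";

type Env = Record<string, string | undefined>;

function routerlyHome(env: Env = process.env): string {
  return env.ROUTERLY_HOME ?? path.join(os.homedir(), ".routerly");
}

// The documented files hang off that root:
const settingsPath = path.join(routerlyHome(), "config", "settings.json");
const usagePath = path.join(routerlyHome(), "data", "usage.json");
```

Because everything lives under one directory, backing up or migrating an instance is a matter of copying that tree.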

File                   Contents
config/settings.json   Service settings
config/models.json     Registered LLM models (API keys AES-encrypted)
config/projects.json   Projects, routing, tokens, member roles
config/users.json      Dashboard users (passwords bcrypt-hashed)
config/roles.json      Custom RBAC roles
data/usage.json        Per-request usage records (append-only)

Ports and Protocols

Endpoint prefix   Protocol         Purpose
/v1/*             HTTP/1.1 + SSE   LLM proxy — authenticated with project tokens
/api/*            HTTP/1.1         Management API — authenticated with JWT session
/dashboard        HTTP/1.1         React SPA
/health           HTTP/1.1         Health check (unauthenticated)
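Since /health requires no credentials, it makes a convenient liveness probe. A sketch using Node 18+'s built-in fetch (the base URL is an illustrative placeholder):

```typescript
// Probe the unauthenticated /health endpoint; resolves false if the
// service is unreachable rather than throwing.
async function isAlive(base: string): Promise<boolean> {
  try {
    const res = await fetch(`${base}/health`);
    return res.ok;
  } catch {
    return false; // connection refused, DNS failure, timeout, etc.
  }
}

// isAlive("http://localhost:4000").then((ok) => console.log(ok));
```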