HTTP Endpoints

The service exposes four groups of HTTP endpoints on the same port (default: 3000):

| Group | Path prefix | Auth | Purpose |
| --- | --- | --- | --- |
| LLM Proxy | /v1/* | Bearer project token (sk-rt-…) | Forward requests to LLM providers |
| Management API | /api/* | Bearer JWT (dashboard session) | Configure models, projects, users |
| Dashboard | /dashboard/* | Browser session (cookie) | Serve the React web UI |
| Health | /health | None | Liveness probe |

For the full request/response schemas of each route, see API — LLM Proxy and API — Management.


LLM Proxy

These routes accept the same request bodies as the original provider APIs. Authentication is via a project token (Authorization: Bearer sk-rt-…).

Every request goes through the full routing and budget stack before being forwarded to a provider.

POST /v1/chat/completions

OpenAI Chat Completions format. Supports both streaming ("stream": true) and non-streaming responses.

```http
POST /v1/chat/completions
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
Content-Type: application/json

{
  "model": "gpt-5-mini",
  "messages": [{ "role": "user", "content": "Hello!" }],
  "stream": false
}
```

The model field is the model ID registered in your project. Routerly does not pass it to the provider as-is; the routing engine picks the actual provider model based on your policies.
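Because the route is OpenAI-compatible, the official openai Node SDK can be pointed straight at the proxy. A minimal sketch (the base URL assumes the default port; the token is a placeholder):

```ts
import OpenAI from "openai";

// Point the standard OpenAI client at Routerly instead of api.openai.com.
const client = new OpenAI({
  baseURL: "http://localhost:3000/v1", // LLM Proxy routes (default port 3000)
  apiKey: "sk-rt-YOUR_PROJECT_TOKEN",  // project token, not a provider key
});

// Non-streaming request: same body as the raw HTTP example above.
const completion = await client.chat.completions.create({
  model: "gpt-5-mini", // model ID registered in your project
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(completion.choices[0].message.content);

// Streaming variant ("stream": true): deltas arrive as server-sent events.
const stream = await client.chat.completions.create({
  model: "gpt-5-mini",
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```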

POST /v1/responses

OpenAI Responses API format (the newer API surface). It uses input instead of messages, and responses are always streamed. Routerly normalises the request to the chat/completions shape internally before routing.

```http
POST /v1/responses
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
Content-Type: application/json

{
  "model": "gpt-5-mini",
  "input": [{ "role": "user", "content": "Hello!" }]
}
```

POST /v1/messages

Anthropic Messages API format. The request body matches the Anthropic SDK wire format exactly.

```http
POST /v1/messages
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
Content-Type: application/json

{
  "model": "claude-haiku-4-5",
  "max_tokens": 1024,
  "messages": [{ "role": "user", "content": "Hello!" }]
}
```

Routerly proxies this to the Anthropic provider adapter. If the selected model is an OpenAI model, the adapter translates the request format automatically.
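Since the body matches the Anthropic wire format, the official @anthropic-ai/sdk client can talk to the proxy directly. A sketch, assuming the SDK's authToken option (which sends Authorization: Bearer, matching the proxy's auth scheme):

```ts
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({
  baseURL: "http://localhost:3000",      // the SDK appends /v1/messages itself
  authToken: "sk-rt-YOUR_PROJECT_TOKEN", // sent as Authorization: Bearer …
});

const msg = await anthropic.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});
console.log(msg.content); // array of content blocks
```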

GET /v1/models

Returns the list of models available in the project associated with the token, in the OpenAI GET /v1/models response format.

```http
GET /v1/models
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
```
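With the OpenAI client from the chat/completions sketch above, this maps to models.list():

```ts
// Lists the models available to the project tied to the token.
for await (const model of client.models.list()) {
  console.log(model.id);
}
```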

Error format

All LLM Proxy errors follow the OpenAI error envelope:

```json
{
  "error": {
    "message": "Budget exceeded for model gpt-5-mini",
    "type": "budget_exceeded",
    "code": "budget_exceeded"
  }
}
```

Common status codes:

| Code | Cause |
| --- | --- |
| 401 | Missing or invalid project token |
| 503 | No model passed all routing filters (all excluded or over budget) |
| 503 | Budget exhausted for the project or token |
| 504 | Provider timeout |
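Because the envelope matches OpenAI's, SDK error classes surface these fields. A sketch of telling routing/budget failures apart from auth errors with the openai Node SDK (the backoff is illustrative, not something Routerly prescribes):

```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/v1",
  apiKey: "sk-rt-YOUR_PROJECT_TOKEN",
});

try {
  await client.chat.completions.create({
    model: "gpt-5-mini",
    messages: [{ role: "user", content: "Hello!" }],
  });
} catch (err) {
  if (err instanceof OpenAI.APIError && err.status === 401) {
    throw new Error("missing or invalid sk-rt-… project token");
  } else if (err instanceof OpenAI.APIError && err.status === 503) {
    // All candidate models excluded or budget exhausted:
    // back off before retrying (retry loop omitted).
    await new Promise((resolve) => setTimeout(resolve, 5_000));
  } else {
    throw err;
  }
}
```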

Management API

The Management API is used by the dashboard and the CLI. Authentication is via a JWT obtained from POST /api/auth/login.

Full endpoint catalogue: API — Management.

Key routes

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/auth/login | Obtain a JWT |
| GET | /api/models | List registered models |
| POST | /api/models | Register a new model |
| PUT | /api/models/:id | Update a model |
| DELETE | /api/models/:id | Remove a model |
| GET | /api/projects | List projects |
| POST | /api/projects | Create a project |
| GET | /api/usage | Query usage records |
| GET | /api/settings | Read service settings |
| PUT | /api/settings | Update service settings |
| GET | /api/users | List users (admin only) |
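As an illustration, logging in and listing models with plain fetch. The login body fields and the { token } response shape are assumptions; see API — Management for the actual schemas:

```ts
const BASE = "http://localhost:3000";

// Hypothetical credentials; check API — Management for the real login schema.
const login = await fetch(`${BASE}/api/auth/login`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ username: "admin", password: "…" }),
});
const { token } = await login.json(); // assumed response shape

// The JWT is sent as a Bearer token on every Management API call.
const models = await fetch(`${BASE}/api/models`, {
  headers: { Authorization: `Bearer ${token}` },
});
console.log(await models.json());
```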

Dashboard

When dashboardEnabled: true (default), the service bundles and serves the React web UI as static files.

| Path | Behaviour |
| --- | --- |
| GET /dashboard/ | Serves index.html (React app entry point) |
| GET /dashboard/* | Static assets (JS, CSS, icons); falls back to index.html for client-side routes |
| GET /dashboard | Redirects to /dashboard/ |
| GET / | Redirects to /dashboard/ |

To disable the dashboard (e.g. in a headless production deployment):

```jsonc
// settings.json
{ "dashboardEnabled": false }
```

Health Check

GET /health

No authentication required. Returns HTTP 200 with a JSON body:

```json
{
  "status": "ok",
  "version": "0.1.5",
  "timestamp": "2026-03-27T12:00:00.000Z"
}
```

Suitable for Docker HEALTHCHECK, Kubernetes liveness probes, and load balancer checks.
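A deploy script can also poll the endpoint until the service comes up. A minimal sketch (the helper below is hypothetical, not part of Routerly):

```ts
// Poll GET /health until it returns 200 with status "ok", or give up.
async function waitForHealthy(base = "http://localhost:3000", attempts = 30) {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(`${base}/health`);
      if (res.ok && (await res.json()).status === "ok") return;
    } catch {
      // not accepting connections yet
    }
    await new Promise((resolve) => setTimeout(resolve, 1000));
  }
  throw new Error("Routerly did not become healthy in time");
}

await waitForHealthy();
```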


Trace Header

Every LLM Proxy response includes an x-routerly-trace-id header containing a UUID that identifies the routing trace for that request. You can use this ID to look up the routing decision in the dashboard's Playground trace viewer.

```http
x-routerly-trace-id: 3fa85f64-5717-4562-b3fc-2c963f66afa6
```
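With the openai Node SDK, the raw response (and therefore the header) is reachable via .withResponse(); a sketch:

```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/v1",
  apiKey: "sk-rt-YOUR_PROJECT_TOKEN",
});

// .withResponse() exposes the underlying HTTP response next to the parsed body.
const { data, response } = await client.chat.completions
  .create({
    model: "gpt-5-mini",
    messages: [{ role: "user", content: "Hello!" }],
  })
  .withResponse();

// Log the trace ID to look the request up in the Playground trace viewer.
console.log("trace:", response.headers.get("x-routerly-trace-id"));
console.log(data.choices[0].message.content);
```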