# HTTP Endpoints
The service exposes four groups of HTTP endpoints on the same port (default: 3000):
| Group | Path prefix | Auth | Purpose |
|---|---|---|---|
| LLM Proxy | /v1/* | Bearer project token (sk-rt-…) | Forward requests to LLM providers |
| Management API | /api/* | Bearer JWT (dashboard session) | Configure models, projects, users |
| Dashboard | /dashboard/* | Browser session (cookie) | Serve the React web UI |
| Health | /health | None | Liveness probe |
For the full request/response schemas of each route, see API — LLM Proxy and API — Management.
## LLM Proxy
These routes accept the same request bodies as the original provider APIs. Authentication is via a project token (`Authorization: Bearer sk-rt-…`).
Every request goes through the full routing and budget stack before being forwarded to a provider.
### POST /v1/chat/completions
OpenAI Chat Completions format. Supports both streaming (`"stream": true`) and non-streaming responses.
```http
POST /v1/chat/completions
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
Content-Type: application/json

{
  "model": "gpt-5-mini",
  "messages": [{ "role": "user", "content": "Hello!" }],
  "stream": false
}
```
The `model` field is the model ID registered in your project. Routerly does not pass it to the provider verbatim; the routing engine picks the actual provider model based on your policies.
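Because this route speaks the OpenAI wire format, you can point an existing OpenAI SDK client at the proxy. A minimal TypeScript sketch, assuming the service runs at http://localhost:3000 and the token is in a `ROUTERLY_TOKEN` environment variable (both names are placeholders):

```typescript
import OpenAI from "openai";

// Point the stock OpenAI client at the proxy instead of api.openai.com.
const client = new OpenAI({
  baseURL: "http://localhost:3000/v1",
  apiKey: process.env.ROUTERLY_TOKEN, // your sk-rt-… project token
});

const completion = await client.chat.completions.create({
  model: "gpt-5-mini",
  messages: [{ role: "user", content: "Hello!" }],
  stream: false,
});

console.log(completion.choices[0].message.content);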
### POST /v1/responses
OpenAI Responses API format (the newer API surface). It uses `input` instead of `messages`, and responses are always streamed. Routerly normalises the request to the chat/completions shape internally before routing.
```http
POST /v1/responses
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
Content-Type: application/json

{
  "model": "gpt-5-mini",
  "input": [{ "role": "user", "content": "Hello!" }]
}
```
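Since this route always streams, the client has to consume server-sent events. A minimal sketch using `fetch`, again assuming http://localhost:3000 and a `ROUTERLY_TOKEN` env var (the exact event shapes are documented in API — LLM Proxy):

```typescript
const res = await fetch("http://localhost:3000/v1/responses", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.ROUTERLY_TOKEN}`, // sk-rt-… token
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-5-mini",
    input: [{ role: "user", content: "Hello!" }],
  }),
});

// Read the SSE stream chunk by chunk as it arrives.
const reader = res.body!.getReader();
const decoder = new TextDecoder();
for (;;) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value, { stream: true }));
}
```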
### POST /v1/messages
Anthropic Messages API format. The request body matches the Anthropic SDK wire format exactly.
```http
POST /v1/messages
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
Content-Type: application/json

{
  "model": "claude-haiku-4-5",
  "max_tokens": 1024,
  "messages": [{ "role": "user", "content": "Hello!" }]
}
```
Routerly proxies this to the Anthropic provider adapter. If the selected model is an OpenAI model, the adapter translates the request format automatically.
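As with the OpenAI routes, an existing Anthropic SDK client can be pointed at the proxy. A minimal TypeScript sketch, assuming http://localhost:3000 (note there is no `/v1` suffix on the base URL, since the SDK appends `/v1/messages` itself):

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  baseURL: "http://localhost:3000", // the SDK appends /v1/messages
  apiKey: process.env.ROUTERLY_TOKEN, // sk-rt-… project token
});

const msg = await client.messages.create({
  model: "claude-haiku-4-5",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

// The Anthropic response body carries a list of content blocks.
const block = msg.content[0];
if (block.type === "text") console.log(block.text);
```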
### GET /v1/models
Returns the list of models available in the project associated with the token, in the OpenAI `GET /v1/models` response format.
```http
GET /v1/models
Authorization: Bearer sk-rt-YOUR_PROJECT_TOKEN
```
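The response uses the OpenAI list envelope. An illustrative body only: the model ids are whatever your project has registered, and the per-model field values shown here are placeholders (see API — LLM Proxy for the exact schema):

```json
{
  "object": "list",
  "data": [
    { "id": "gpt-5-mini", "object": "model", "created": 1764547200, "owned_by": "routerly" },
    { "id": "claude-haiku-4-5", "object": "model", "created": 1764547200, "owned_by": "routerly" }
  ]
}
```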
### Error format
All LLM Proxy errors follow the OpenAI error envelope:
```json
{
  "error": {
    "message": "Budget exceeded for model gpt-5-mini",
    "type": "budget_exceeded",
    "code": "budget_exceeded"
  }
}
```
Common status codes:
| Code | Cause |
|---|---|
| 401 | Missing or invalid project token |
| 503 | No model passed all routing filters (all excluded or over budget) |
| 503 | Budget exhausted for the project or token |
| 504 | Provider timeout |
## Management API
The Management API is used by the dashboard and the CLI. Authentication is via a JWT obtained from POST /api/auth/login.
Full endpoint catalogue: API — Management.
### Key routes
| Method | Path | Description |
|---|---|---|
| POST | /api/auth/login | Obtain a JWT |
| GET | /api/models | List registered models |
| POST | /api/models | Register a new model |
| PUT | /api/models/:id | Update a model |
| DELETE | /api/models/:id | Remove a model |
| GET | /api/projects | List projects |
| POST | /api/projects | Create a project |
| GET | /api/usage | Query usage records |
| GET | /api/settings | Read service settings |
| PUT | /api/settings | Update service settings |
| GET | /api/users | List users (admin only) |
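Putting the first two rows together, a sketch of an authenticated management call in TypeScript. The credential fields and the `token` field in the login response are assumptions for this sketch; see API — Management for the exact schemas:

```typescript
const base = "http://localhost:3000";

// 1. Obtain a JWT. The email/password body is a placeholder; check
//    API — Management for the real login payload.
const login = await fetch(`${base}/api/auth/login`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    email: "admin@example.com",
    password: process.env.ROUTERLY_ADMIN_PASSWORD,
  }),
});
const { token } = await login.json(); // assumed response field name

// 2. Call a management route with the JWT.
const models = await fetch(`${base}/api/models`, {
  headers: { Authorization: `Bearer ${token}` },
});
console.log(await models.json());
```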
## Dashboard
When `dashboardEnabled` is `true` (the default), the service serves the bundled React web UI as static files.
| Route | Behaviour |
|---|---|
| GET /dashboard/ | Serves index.html (React app entry point) |
| GET /dashboard/* | Static assets (JS, CSS, icons); falls back to index.html for client-side routes |
| GET /dashboard | Redirects to /dashboard/ |
| GET / | Redirects to /dashboard/ |
To disable the dashboard (e.g. in a headless production deployment):
```jsonc
// settings.json
{ "dashboardEnabled": false }
```
## Health Check
```http
GET /health
```
No authentication required. Returns HTTP 200 with a JSON body:
```json
{
  "status": "ok",
  "version": "0.1.5",
  "timestamp": "2026-03-27T12:00:00.000Z"
}
```
Suitable for Docker HEALTHCHECK, Kubernetes liveness probes, and load balancer checks.
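For example, a Kubernetes liveness probe (a sketch assuming the default port 3000; tune the delays to your deployment):

```yaml
livenessProbe:
  httpGet:
    path: /health
    port: 3000
  initialDelaySeconds: 5
  periodSeconds: 10
```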
## Trace Header
Every LLM Proxy response includes an x-routerly-trace-id header containing a UUID that identifies the routing trace for that request. You can use this ID to look up the routing decision in the dashboard's Playground trace viewer.
```http
x-routerly-trace-id: 3fa85f64-5717-4562-b3fc-2c963f66afa6
```
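A minimal sketch of capturing the header in TypeScript, assuming the same local setup as the earlier examples:

```typescript
const res = await fetch("http://localhost:3000/v1/chat/completions", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.ROUTERLY_TOKEN}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-5-mini",
    messages: [{ role: "user", content: "Hello!" }],
  }),
});

// Log the trace ID so the routing decision can be looked up later
// in the dashboard's Playground trace viewer.
console.log("trace:", res.headers.get("x-routerly-trace-id"));
```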
## Related
- API — LLM Proxy — full request/response schemas
- API — Management — full management endpoint catalogue
- Service — Routing Engine — how the model is selected for each request