
Providers

A provider is an LLM platform that Routerly knows how to communicate with. Each provider has its own wire protocol, authentication scheme, and model catalogue.


Supported Providers

| Provider | ID | Authentication | Notes |
| --- | --- | --- | --- |
| OpenAI | openai | API key | Chat Completions + Responses API + token counting |
| Anthropic | anthropic | API key | Messages API + token counting |
| Google Gemini | gemini | API key | OpenAI-compatible endpoint |
| Mistral | mistral | API key | OpenAI-compatible endpoint |
| Cohere | cohere | API key | OpenAI-compatible endpoint |
| xAI (Grok) | xai | API key | OpenAI-compatible endpoint |
| Ollama | ollama | None | Local inference; set baseUrl to your Ollama host |
| Custom | custom | Optional | Any OpenAI-compatible endpoint |

OpenAI

| Model ID | Context | Input price | Output price | Capabilities |
| --- | --- | --- | --- | --- |
| gpt-5.2 | 128k | $1.75 / 1M | $14 / 1M | Vision, function calling, JSON |
| gpt-5.1 | 128k | $1.25 / 1M | $10 / 1M | Vision, function calling, JSON |
| gpt-5 | 128k | $1.25 / 1M | $10 / 1M | Vision, function calling, JSON |
| gpt-5-mini | 128k | $0.25 / 1M | $2 / 1M | Vision, function calling, JSON |
| gpt-5-nano | 128k | $0.05 / 1M | $0.40 / 1M | Function calling, JSON |
| gpt-4.1 | 1M | $2 / 1M | $8 / 1M | Vision, function calling, JSON |
| gpt-4.1-mini | 1M | $0.40 / 1M | $1.60 / 1M | Vision, function calling, JSON |
| gpt-4.1-nano | 1M | $0.10 / 1M | $0.40 / 1M | Function calling, JSON |
| gpt-4o | 128k | $2.50 / 1M | $10 / 1M | Vision, function calling, JSON |
| gpt-4o-mini | 128k | $0.15 / 1M | $0.60 / 1M | Vision, function calling, JSON |
| o1 | 200k | $15 / 1M | $60 / 1M | Thinking, function calling, JSON |
| o3 | 200k | $2 / 1M | $8 / 1M | Thinking, function calling, JSON |
| o4-mini | 200k | $1.10 / 1M | $4.40 / 1M | Thinking, function calling, JSON |

Prices are per 1 million tokens unless otherwise noted.
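Per-request cost is simply tokens multiplied by the per-million rate for each side. A minimal sketch (the token counts here are illustrative, not from any real request):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Dollar cost of one request; prices are dollars per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example: gpt-4o-mini at $0.15 / 1M input, $0.60 / 1M output
print(round(request_cost(10_000, 2_000, 0.15, 0.60), 6))  # → 0.0027
```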


Anthropic

| Model ID | Context | Input price | Output price | Notes |
| --- | --- | --- | --- | --- |
| claude-opus-4-6 | 200k | $5 / 1M | $25 / 1M | Tier >200k tokens: $10 / $37.50 |
| claude-sonnet-4-6 | 200k | $3 / 1M | $15 / 1M | |
| claude-sonnet-4-5 | 200k | $3 / 1M | $15 / 1M | Tier >200k tokens: $6 / $22.50 |
| claude-haiku-4-5 | 200k | $1 / 1M | $5 / 1M | |
| claude-opus-4-1 | 200k | $15 / 1M | $75 / 1M | Vision, function calling, JSON |
| claude-sonnet-4-1 | 200k | $3 / 1M | $15 / 1M | Vision, function calling, JSON |

Google Gemini

| Model ID | Context | Input price | Output price | Notes |
| --- | --- | --- | --- | --- |
| gemini-2.5-pro | 2M | $1.25 / 1M | $10 / 1M | Tier >200k: $2.50 / $15 |
| gemini-2.5-flash | 1M | $0.30 / 1M | $2.50 / 1M | |
| gemini-2.5-flash-lite | 1M | $0.10 / 1M | $0.40 / 1M | |
| gemini-3.1-pro-preview | 2M | $2 / 1M | $12 / 1M | Tier >200k: higher |
| gemini-3-pro-preview | 2M | | | Experimental |
| gemini-3-flash-preview | 1M | | | Experimental |
| gemini-2.0-flash | 1M | $0.10 / 1M | $0.40 / 1M | |
| gemini-2.0-flash-lite | 1M | $0.075 / 1M | $0.30 / 1M | |
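Several entries above carry a "Tier >200k" note: prompts beyond 200k tokens are billed at a higher rate. A minimal sketch of that billing rule, assuming the tier price applies to the entire request once it crosses the threshold (check the provider's pricing page for the exact rule):

```python
def tiered_cost(tokens: int, base_price: float, tier_price: float,
                threshold: int = 200_000) -> float:
    """Dollar cost for one side (input or output) of a request.

    Prices are dollars per 1M tokens. Assumption: the whole request
    is billed at tier_price once it exceeds the threshold.
    """
    price = tier_price if tokens > threshold else base_price
    return tokens * price / 1_000_000

# gemini-2.5-pro input: $1.25 / 1M below 200k tokens, $2.50 / 1M above
print(tiered_cost(100_000, 1.25, 2.50))  # → 0.125
print(tiered_cost(300_000, 1.25, 2.50))  # → 0.75
```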

Mistral

| Model ID | Notes |
| --- | --- |
| mistral-large-latest | Flagship model |
| mistral-small-latest | Efficient, low cost |
| mistral-nemo | Open-weight, 12B |
| codestral-latest | Code specialised |
| ministral-8b-latest | Ultra-small |

Cohere

| Model ID | Notes |
| --- | --- |
| command-r-plus | Best quality |
| command-r | Balanced |
| command-a-03-2025 | Latest generation |
| command-nightly | Bleeding edge |
| c4ai-aya-expanse-8b | Multilingual, 8B |
| c4ai-aya-expanse-32b | Multilingual, 32B |
| embed-english-v3.0 | Embeddings |

xAI (Grok)

| Model ID | Notes |
| --- | --- |
| grok-3 | Latest flagship |
| grok-3-fast | Optimised for speed |
| grok-3-mini | Efficient |
| grok-3-mini-fast | Smallest / fastest |

Ollama (Local)

| Model ID | Notes |
| --- | --- |
| ollama/llama3.2 | Meta Llama 3.2, 3B |
| ollama/llama3.1:8b | Meta Llama 3.1, 8B |
| ollama/qwen3:4b | Qwen3, 4B |
| ollama/qwen3:8b | Qwen3, 8B |
| ollama/mistral | Mistral 7B |
| ollama/phi4-mini | Microsoft Phi-4 Mini |
| ollama/gemma3:4b | Google Gemma 3, 4B |
| ollama/deepseek-r1:7b | DeepSeek R1, 7B |

Ollama models require a running Ollama server. The default base URL is http://localhost:11434. Override it per model in the dashboard with the Base URL field.
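To confirm the host you point Routerly at is actually serving models, you can query Ollama's tags endpoint (GET /api/tags, part of Ollama's REST API). A minimal stdlib sketch:

```python
import json
import urllib.error
import urllib.request

def list_ollama_models(base_url: str = "http://localhost:11434"):
    """Return the model names a local Ollama server has pulled,
    or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except (urllib.error.URLError, OSError):
        return None

models = list_ollama_models()
print(models if models is not None else "Ollama server not reachable")
```

Any model returned here can be registered in Routerly under the ollama/ prefix.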


Custom / Self-hosted

Use provider ID custom for any OpenAI-compatible endpoint (vLLM, LM Studio, LocalAI, etc.):

```shell
routerly model add \
  --id my-custom-model \
  --provider custom \
  --base-url http://192.168.1.50:8000/v1 \
  --input-price 0 \
  --output-price 0
```

Adding a Provider Model

All models must be registered in Routerly before they can be used. See Concepts: Models for registration details.