Routing Configuration
Routing rules determine how the gateway distributes incoming requests across providers and models. Each rule targets a specific capability and applies a strategy to select the best provider for each request.
Route List
The Routing page displays all configured routes as cards. Each card shows:
- Name and description
- Strategy — the algorithm used to select a provider
- Capability — which request type this route handles (e.g. chat, embedding)
- Provider count — how many providers are assigned
- Status — enabled or disabled
- Provider weights (for the weighted strategy)
- Fallback chain (if configured)
Creating a Route
Click Create Route to open the form with these fields:
| Field | Description |
|---|---|
| Name | A descriptive name for the route |
| Description | Optional notes about the route’s purpose |
| Strategy | The routing algorithm (see below) |
| Capability | The request type this route applies to |
| Enabled | Toggle to activate or deactivate the route |
| Provider Weights | Shown only for the weighted strategy; assign a numeric weight per provider |
| Fallback Chain | Optional comma-separated list of provider IDs to try, in order, if the selected provider fails |
Routing Strategies
| Strategy | Behavior |
|---|---|
| Priority | Routes to the highest-priority provider that is healthy |
| Round Robin | Distributes requests evenly across providers in rotation |
| Weighted | Distributes requests proportionally based on assigned weights |
| Least Latency | Selects the provider with the lowest recent average latency |
| Least Cost | Selects the provider with the lowest per-token cost for the requested model |
| Free Tier First | Prefers providers with remaining free-tier quota before falling back to paid |
| Task Optimized | Selects the provider best suited for the specific task type |
| Cost Optimized | Balances cost and quality based on request characteristics |
| Failover | Uses the primary provider and switches to the next on failure |
| Random | Selects a provider at random from the available pool |
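To make the table concrete, here is a minimal sketch of three of the simpler strategies (Priority, Least Latency, Round Robin). The `Provider` record, its field names, and the convention that a lower number means higher priority are assumptions made for illustration, as is the choice to skip unhealthy providers in the least-latency case. The gateway's actual implementation is not shown in this guide.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Provider:
    # Hypothetical record; field names are illustrative only.
    id: str
    priority: int           # assumption: lower number = higher priority
    healthy: bool
    avg_latency_ms: float   # recent average latency


def pick_priority(providers):
    """Priority: the highest-priority provider that is healthy, else None."""
    healthy = [p for p in providers if p.healthy]
    return min(healthy, key=lambda p: p.priority, default=None)


def pick_least_latency(providers):
    """Least Latency: lowest recent average latency among healthy providers."""
    healthy = [p for p in providers if p.healthy]
    return min(healthy, key=lambda p: p.avg_latency_ms, default=None)


class RoundRobin:
    """Round Robin: hand out providers in rotation, one per request."""

    def __init__(self, providers):
        self._providers = list(providers)
        self._counter = count()

    def pick(self):
        if not self._providers:
            return None
        index = next(self._counter) % len(self._providers)
        return self._providers[index]


providers = [
    Provider("openai", priority=1, healthy=False, avg_latency_ms=420.0),
    Provider("anthropic", priority=2, healthy=True, avg_latency_ms=380.0),
    Provider("azure", priority=3, healthy=True, avg_latency_ms=250.0),
]
```

With this sample data, the priority picker skips the unhealthy first provider, while the least-latency picker chooses whichever healthy provider has the smallest `avg_latency_ms`.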
Capabilities
Routes are scoped to a single capability. The available capabilities are:
- chat: Chat completions
- completion: Text completions
- embedding: Vector embeddings
- image: Image generation
- audio: Audio transcription and translation
- text-to-speech: Speech synthesis
- rerank: Reranking
- video-generation: Video generation
You can create multiple routes for the same capability; the gateway evaluates them in order and uses the first enabled match.
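The first-enabled-match rule can be sketched as a simple scan. The `Route` record and the assumption that "in order" means the order of the list passed in are illustrative; this guide does not specify how the gateway orders routes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Route:
    # Hypothetical record mirroring the form fields described above.
    name: str
    capability: str
    strategy: str
    enabled: bool = True


def match_route(routes: List[Route], capability: str) -> Optional[Route]:
    """Return the first enabled route for the capability, or None."""
    for route in routes:  # assumption: list order is evaluation order
        if route.enabled and route.capability == capability:
            return route
    return None


routes = [
    Route("chat-canary", "chat", "weighted", enabled=False),
    Route("chat-main", "chat", "least_latency"),
    Route("chat-backup", "chat", "failover"),
    Route("embed-cheap", "embedding", "least_cost"),
]
```

Here a chat request matches `chat-main`: the disabled `chat-canary` route is skipped, and `chat-backup` is never reached because an earlier enabled route already matched.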
Fallback Chains
Every route supports an optional fallback chain. When the selected provider returns an error or is unhealthy, the gateway tries each provider in the fallback chain in order. Specify provider IDs separated by commas (e.g. openai, anthropic, azure).
Fallback chains work with any strategy. For example, a least-latency route can still fall back to a manually specified chain if all preferred providers are down.
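As a rough illustration of the behavior described above, the sketch below tries the selected provider first and then walks the fallback chain in order. `ProviderError`, `parse_chain`, `send_with_fallback`, and the `call` callback are hypothetical names introduced for this example and are not part of the gateway's API.

```python
from typing import Callable, List, Tuple


class ProviderError(Exception):
    """Hypothetical error for a provider that fails or is unhealthy."""


def parse_chain(text: str) -> List[str]:
    """Split a comma-separated fallback chain into provider IDs."""
    return [part.strip() for part in text.split(",") if part.strip()]


def send_with_fallback(
    selected: str,
    fallback_chain: List[str],
    call: Callable[[str], str],
) -> Tuple[str, str]:
    """Try the selected provider, then each fallback provider in order.

    Returns (provider_id, response) for the first provider that succeeds.
    """
    failed = []
    for provider_id in [selected, *fallback_chain]:
        try:
            return provider_id, call(provider_id)
        except ProviderError:
            failed.append(provider_id)
    raise ProviderError(f"all providers failed: {failed}")


def fake_call(provider_id: str) -> str:
    # Stand-in for a real request; only "azure" succeeds in this demo.
    if provider_id != "azure":
        raise ProviderError(provider_id)
    return "ok"


chain = parse_chain("openai, anthropic, azure")
used, response = send_with_fallback("mistral", chain, fake_call)
```

In this demo the selected provider and the first two fallback entries fail, so the request is served by the last entry in the chain.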
Weighted Distribution
When using the weighted strategy, you assign a numeric weight to each provider. The gateway distributes traffic in proportion to each weight's share of the total. For example, weights of openai: 3, anthropic: 1 add up to 4, so roughly 3/4 (75%) of traffic goes to OpenAI and 1/4 (25%) to Anthropic.
Add or remove provider weight entries using the form controls. Each entry requires a provider ID and a weight between 0 and 100.
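The proportional split can be sketched with Python's standard `random.choices`. The function names and the validation below are assumptions for illustration, and the gateway may use a different selection mechanism (for example, a deterministic weighted rotation).

```python
import random
from typing import Dict


def validate_weights(weights: Dict[str, float]) -> None:
    """Check the documented 0-100 range and that some weight is positive."""
    for provider_id, weight in weights.items():
        if not 0 <= weight <= 100:
            raise ValueError(f"weight for {provider_id} must be between 0 and 100")
    if sum(weights.values()) <= 0:
        raise ValueError("at least one weight must be greater than 0")


def traffic_shares(weights: Dict[str, float]) -> Dict[str, float]:
    """Expected fraction of traffic per provider: weight / total weight."""
    validate_weights(weights)
    total = sum(weights.values())
    return {provider_id: w / total for provider_id, w in weights.items()}


def pick_weighted(weights: Dict[str, float], rng=random) -> str:
    """Pick one provider at random, in proportion to its weight."""
    validate_weights(weights)
    ids = list(weights)
    return rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]


weights = {"openai": 3, "anthropic": 1}
shares = traffic_shares(weights)  # {"openai": 0.75, "anthropic": 0.25}
```

Because each pick is random, the observed split over a small number of requests will only approximate these shares; it converges toward 75/25 as request volume grows.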
Editing and Deleting Routes
- Click Edit on any route card to modify its settings.
- Click Delete to remove a route. A confirmation dialog warns that traffic using this route will fall back to default routing.
- Use the Enabled toggle to temporarily disable a route without deleting it.
Default Routing Strategy
A system-wide default routing strategy is configured on the Settings page. Routes defined on the Routing page override that default for their specific capability; capabilities without a route use the default.
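The override relationship can be sketched as a lookup with a fallback value. The dictionary keys, the strategy identifiers, and the assumption that a disabled route behaves like a missing one are illustrative only.

```python
def resolve_strategy(routes, capability, default_strategy):
    """Strategy of the first enabled route for the capability, else the default."""
    for route in routes:
        if route["enabled"] and route["capability"] == capability:
            return route["strategy"]
    # No enabled route covers this capability: use the system-wide default.
    return default_strategy


# Hypothetical configuration: one enabled chat route, one disabled image route.
configured_routes = [
    {"capability": "chat", "strategy": "least_latency", "enabled": True},
    {"capability": "image", "strategy": "weighted", "enabled": False},
]
```

With this configuration and a default of `"priority"`, chat requests would use `least_latency`, while image and embedding requests would use `priority`.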