Routing Configuration

Routing rules determine how the gateway distributes incoming requests across providers and models. Each rule targets a specific capability and applies a strategy to select the best provider for each request.

Route List

The Routing page displays all configured routes as cards. Each card shows:

  • Name and description
  • Strategy — the algorithm used to select a provider
  • Capability — which request type this route handles (e.g. chat, embedding)
  • Provider count — how many providers are assigned
  • Status — enabled or disabled
  • Provider weights (for the weighted strategy)
  • Fallback chain (if configured)

Creating a Route

Click Create Route to open the form with these fields:

| Field | Description |
| --- | --- |
| Name | A descriptive name for the route |
| Description | Optional notes about the route’s purpose |
| Strategy | The routing algorithm (see below) |
| Capability | The request type this route applies to |
| Enabled | Toggle to activate or deactivate the route |
| Provider Weights | Shown only for the weighted strategy; assign a numeric weight per provider |
| Fallback Chain | Comma-separated list of provider IDs to try in order if the primary fails |
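
To make the form fields concrete, here is a minimal sketch of how a route could be represented in code. The `Route` class, its field names, and `parse_fallback_chain` are hypothetical and only mirror the form labels; they are not the gateway's actual schema or API.

```python
from dataclasses import dataclass, field


def parse_fallback_chain(raw: str) -> list[str]:
    """Split a comma-separated string of provider IDs, dropping blank entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


@dataclass
class Route:
    # Hypothetical record: field names follow the form labels, not a real schema.
    name: str
    strategy: str
    capability: str
    description: str = ""
    enabled: bool = True
    provider_weights: dict[str, float] = field(default_factory=dict)
    fallback_chain: list[str] = field(default_factory=list)


# Example values are illustrative only.
route = Route(
    name="chat-primary",
    strategy="weighted",
    capability="chat",
    provider_weights={"openai": 3, "anthropic": 1},
    fallback_chain=parse_fallback_chain("openai, anthropic, azure"),
)
```

Keeping the fallback chain as an ordered list preserves the order in which providers are tried.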

Routing Strategies

| Strategy | Behavior |
| --- | --- |
| Priority | Routes to the highest-priority provider that is healthy |
| Round Robin | Distributes requests evenly across providers in rotation |
| Weighted | Distributes requests proportionally based on assigned weights |
| Least Latency | Selects the provider with the lowest recent average latency |
| Least Cost | Selects the provider with the lowest per-token cost for the requested model |
| Free Tier First | Prefers providers with remaining free-tier quota before falling back to paid |
| Task Optimized | Selects the provider best suited for the specific task type |
| Cost Optimized | Balances cost and quality based on request characteristics |
| Failover | Uses the primary provider and switches to the next on failure |
| Random | Selects a provider at random from the available pool |
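
As an illustration of three of these strategies, the sketch below shows one plausible reading of Priority, Least Latency, and Round Robin. The `Provider` class and its fields are assumptions made for the example (in particular, that a lower `priority` number means higher priority); the gateway's internal implementation may differ.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Iterator


@dataclass
class Provider:
    id: str
    priority: int          # assumption: lower number = higher priority
    healthy: bool
    avg_latency_ms: float  # assumption: recent average latency, measured elsewhere


def pick_priority(providers: list[Provider]) -> Provider:
    """Priority: the highest-priority provider that is healthy."""
    healthy = [p for p in providers if p.healthy]
    if not healthy:
        raise LookupError("no healthy provider available")
    return min(healthy, key=lambda p: p.priority)


def pick_least_latency(providers: list[Provider]) -> Provider:
    """Least Latency: the provider with the lowest recent average latency."""
    return min(providers, key=lambda p: p.avg_latency_ms)


def round_robin(providers: list[Provider]) -> Iterator[Provider]:
    """Round Robin: yield providers in rotation, indefinitely."""
    return cycle(providers)


# Illustrative pool; the IDs and numbers are made up.
providers = [
    Provider("openai", priority=1, healthy=False, avg_latency_ms=120.0),
    Provider("anthropic", priority=2, healthy=True, avg_latency_ms=95.0),
    Provider("azure", priority=3, healthy=True, avg_latency_ms=80.0),
]
```

With this pool, `pick_priority` skips the unhealthy top-priority provider, while `pick_least_latency` looks only at latency.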

Capabilities

Routes are scoped to a single capability. The available capabilities are:

  • chat — Chat completions
  • completion — Text completions
  • embedding — Vector embeddings
  • image — Image generation
  • audio — Audio transcription and translation
  • text-to-speech — Speech synthesis
  • rerank — Reranking
  • video-generation — Video generation

You can create multiple routes for the same capability; the gateway evaluates them in order and uses the first enabled match.
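
The "first enabled match" rule can be sketched as a simple scan. The `Route` class and `select_route` function are hypothetical names used for illustration, and the list order stands in for whatever ordering the gateway applies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Route:
    # Hypothetical minimal route record for this example.
    name: str
    capability: str
    enabled: bool


def select_route(routes: list[Route], capability: str) -> Optional[Route]:
    """Return the first enabled route for the capability, or None if there is none."""
    for route in routes:
        if route.enabled and route.capability == capability:
            return route
    return None


routes = [
    Route("chat-a", "chat", enabled=False),  # disabled, so it is skipped
    Route("chat-b", "chat", enabled=True),   # first enabled match for "chat"
    Route("chat-c", "chat", enabled=True),   # never reached for "chat"
    Route("embed-a", "embedding", enabled=True),
]
```

A `None` result corresponds to the case where no route applies and the default routing strategy takes over.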

Fallback Chains

Every route supports an optional fallback chain. When the selected provider returns an error or is unhealthy, the gateway tries each provider in the fallback chain in order. Specify provider IDs separated by commas (e.g. openai, anthropic, azure).

Fallback chains work with any strategy. For example, a least-latency route can still fall back to a manually specified chain if all preferred providers are down.
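
The fallback behavior described above might look like the following sketch. `ProviderError`, `call_with_fallback`, and the `send` callback are hypothetical names; skipping a fallback entry that repeats the already-tried provider is also an assumption, not documented gateway behavior.

```python
from typing import Callable


class ProviderError(Exception):
    """Hypothetical error for a failed call or an unhealthy provider."""


def call_with_fallback(
    selected: str,
    fallback_chain: list[str],
    send: Callable[[str], str],
) -> str:
    """Try the strategy's pick first, then each fallback provider in order."""
    # Assumption: do not retry the selected provider if it also appears in the chain.
    candidates = [selected] + [p for p in fallback_chain if p != selected]
    last_error = None
    for provider_id in candidates:
        try:
            return send(provider_id)
        except ProviderError as exc:
            last_error = exc  # remember the failure and move to the next provider
    raise ProviderError(f"all providers failed: {candidates}") from last_error


# Illustrative stand-in for a real provider call.
def fake_send(provider_id: str) -> str:
    if provider_id == "openai":
        raise ProviderError("openai is down")
    return f"handled by {provider_id}"


result = call_with_fallback("openai", ["anthropic", "azure"], fake_send)
```

Because the selection step and the fallback step are separate, the same wrapper works regardless of which strategy produced `selected`.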

Weighted Distribution

When using the weighted strategy, you assign a numeric weight to each provider. The gateway distributes traffic proportionally. For example, weights of openai: 3, anthropic: 1 send roughly 75% of traffic to OpenAI and 25% to Anthropic.

Add or remove provider weight entries using the form controls. Each entry requires a provider ID and a weight between 0 and 100.
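
The proportional split can be checked with a short sketch. The function names are hypothetical, and rejecting an all-zero set of weights is an assumption added so the arithmetic is well defined; the 0 to 100 range check follows the form's rule.

```python
import random


def validate_weights(weights: dict[str, float]) -> None:
    """Enforce the 0-100 range per entry; assume at least one weight must be positive."""
    for provider_id, weight in weights.items():
        if not 0 <= weight <= 100:
            raise ValueError(f"weight for {provider_id} must be between 0 and 100")
    if sum(weights.values()) <= 0:
        raise ValueError("at least one weight must be positive")


def traffic_shares(weights: dict[str, float]) -> dict[str, float]:
    """Expected fraction of traffic per provider: weight / sum of all weights."""
    validate_weights(weights)
    total = sum(weights.values())
    return {provider_id: w / total for provider_id, w in weights.items()}


def pick_weighted(weights: dict[str, float], rng: random.Random) -> str:
    """Pick one provider at random, with probability proportional to its weight."""
    validate_weights(weights)
    ids = list(weights)
    return rng.choices(ids, weights=[weights[i] for i in ids], k=1)[0]


shares = traffic_shares({"openai": 3, "anthropic": 1})
# shares == {"openai": 0.75, "anthropic": 0.25}
```

Since each request is an independent random draw in this sketch, the observed split approaches 75/25 only over many requests, which matches the "roughly" in the description above.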

Editing and Deleting Routes

  • Click Edit on any route card to modify its settings.
  • Click Delete to remove a route. A confirmation dialog warns that traffic using this route will fall back to default routing.
  • Use the Enabled toggle to temporarily disable a route without deleting it.

Default Routing Strategy

A system-wide default routing strategy is configured on the Settings page. Routes defined on the Routing page override the default for their specific capability.