Introduction

Gatewyse is a unified API gateway that sits between your applications and AI providers. Point your apps at a single endpoint and the gateway picks the best provider for the job — prioritizing free-tier usage, respecting budgets, and falling back automatically on failure.

Why Gatewyse?

No vendor lock-in. Switch providers without changing application code. The gateway exposes an OpenAI-compatible API regardless of which provider handles the request.
Cost control. Free-tier-first routing uses up free inference from Groq, DeepSeek, and other providers before touching paid keys. Budgets are enforced automatically per tenant, per organization, and per user.
One API for everything. Text, images, audio, embeddings, reranking — configure which capabilities are enabled and the gateway handles provider selection.
Enterprise controls. Multi-tenant isolation, RBAC with 6 default roles, API key management, PII detection, prompt injection guards, and immutable audit logs.
Your keys, your rules. Each tenant configures their own provider API keys, enables or disables capabilities, and sets routing preferences through the admin dashboard.
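
Because the gateway exposes an OpenAI-compatible API, a client call could look like a standard chat-completions request pointed at the gateway instead of at a provider. The sketch below only builds such a request. The base URL, the `/v1/chat/completions` path (borrowed from the OpenAI convention), the bearer-token header, and the model name are all assumptions for illustration, not confirmed Gatewyse values.

```typescript
// Hypothetical values: replace with your gateway's real address and API key.
const GATEWAY_BASE_URL = "http://localhost:3000"; // assumed local address
const GATEWAY_API_KEY = "gw-example-key"; // placeholder, not a real key

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build an OpenAI-style chat completion request aimed at the gateway.
// The path follows the OpenAI convention; the gateway's actual path may differ.
function buildChatRequest(model: string, messages: ChatMessage[]): GatewayRequest {
  return {
    url: `${GATEWAY_BASE_URL}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${GATEWAY_API_KEY}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const chatRequest = buildChatRequest("example-model", [
  { role: "user", content: "Summarize this ticket in one sentence." },
]);
// To send it: await fetch(chatRequest.url, chatRequest.init);
```

Since the request shape does not name a provider, switching providers would be a gateway-side configuration change rather than an application code change.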

Supported Providers

Gatewyse supports 28 providers across cloud and self-hosted deployments:

Cloud Providers	Self-Hosted
OpenAI, Anthropic, Google Gemini, Azure OpenAI, Groq, Mistral, Cohere, DeepSeek, Together AI, Perplexity, Fireworks AI, Replicate, AI21 Labs, HuggingFace, xAI, Cerebras, SambaNova, AssemblyAI, ElevenLabs	Ollama, vLLM, LM Studio, LocalAI, llama.cpp, Whisper Local, ComfyUI, Stability AI

Routing Strategies

The gateway provides 10 routing strategies that can be configured per tenant and per capability:

Strategy	Description
Priority	Try providers in a fixed order; fail over to the next on error
Round-robin	Distribute requests evenly across providers
Weighted	Route based on configured weight percentages
Least-cost	Prefer the cheapest available provider
Least-latency	Prefer the provider with the lowest recent latency
Free-tier-first	Exhaust free-tier providers before using paid ones
Task-optimized	Select the best provider based on task type and model capabilities
Cost-optimized	Route to the cheapest provider based on model pricing
Failover	Priority ordering with automatic demotion of degraded providers
Random	Randomly select a provider for simple load distribution
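
To make a few of the strategies in the table concrete, here is a minimal sketch of least-cost, free-tier-first, weighted, and priority-with-failover selection. It is not the gateway's implementation: the `Provider` fields, the function names, and the sample prices are hypothetical and chosen only to illustrate the ideas.

```typescript
// Hypothetical provider description; field names are illustrative only.
interface Provider {
  name: string;
  weight: number; // relative share for weighted routing
  costPer1kTokens: number; // made-up pricing unit
  freeTierRemaining: number; // free requests left, 0 if none
}

// A strategy returns providers in the order they should be attempted.
type Strategy = (providers: Provider[]) => Provider[];

// Least-cost: cheapest provider first.
const leastCost: Strategy = (providers) =>
  [...providers].sort((a, b) => a.costPer1kTokens - b.costPer1kTokens);

// Free-tier-first: providers with free quota first, then paid ones by cost.
const freeTierFirst: Strategy = (providers) => {
  const free = providers.filter((p) => p.freeTierRemaining > 0);
  const paid = providers.filter((p) => p.freeTierRemaining <= 0);
  return [...free, ...leastCost(paid)];
};

// Weighted: pick one provider with probability proportional to its weight.
// `rand` is injectable so the choice can be made deterministic in tests.
function pickWeighted(providers: Provider[], rand: () => number = Math.random): Provider {
  const total = providers.reduce((sum, p) => sum + p.weight, 0);
  let threshold = rand() * total;
  let last: Provider | undefined;
  for (const p of providers) {
    last = p;
    threshold -= p.weight;
    if (threshold < 0) return p;
  }
  if (last === undefined) throw new Error("no providers configured");
  return last; // reached only if rounding or all-zero weights prevent a match
}

// Priority with failover: try each provider in order until one succeeds.
function callWithFailover<T>(ordered: Provider[], call: (p: Provider) => T): T {
  let lastError: unknown = new Error("no providers configured");
  for (const p of ordered) {
    try {
      return call(p);
    } catch (err) {
      lastError = err; // remember the failure and move to the next provider
    }
  }
  throw lastError;
}

// Sample data with made-up numbers.
const sampleProviders: Provider[] = [
  { name: "free-a", weight: 1, costPer1kTokens: 0.5, freeTierRemaining: 100 },
  { name: "paid-b", weight: 3, costPer1kTokens: 2.0, freeTierRemaining: 0 },
  { name: "paid-c", weight: 1, costPer1kTokens: 1.0, freeTierRemaining: 0 },
];
const attemptOrder = freeTierFirst(sampleProviders).map((p) => p.name);
```

Returning an ordered list rather than a single provider lets any ordering strategy be combined with the failover loop.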

Architecture Overview

Your Application (OpenAI / Anthropic SDK)
  │
  ▼
AI Gateway (Express middleware pipeline)
  ├── Auth ─► Tenant Resolver ─► RBAC ─► Validation
  ├── Format Detection ─► Normalizer ─► Prompt Guards
  ├── Budget Check ─► Semantic Cache ─► Usage Tracking
  │
  ▼
Routing Service (10 strategies, LRU-cached)
  │
  ▼
Provider Adapter ──► OpenAI / Anthropic / Gemini / ...
  │
  ▼
Response ─► Cache ─► Usage Tracking ─► Audit Log ─► Client
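
The flow above can be sketched as a chain of stages that share a request context. This is a simplified, synchronous illustration without Express, covering only the budget check, cache lookup, and provider call. The stage names, the in-memory stores, and the exact-match cache are assumptions. In particular, a real semantic cache would match on meaning rather than on identical prompt strings.

```typescript
// Hypothetical request context passed through the pipeline.
interface GatewayContext {
  tenantId: string;
  prompt: string;
  estimatedCostUsd: number;
  response?: string; // set once the cache or a provider has answered
  log: string[]; // records which stages ran, for illustration
}

// A stage inspects or mutates the context; throwing aborts the pipeline.
type Stage = (ctx: GatewayContext) => void;

// Assumed in-memory stand-ins for the real budget store and cache.
const remainingBudgetUsd = new Map<string, number>([["tenant-a", 1.0]]);
const responseCache = new Map<string, string>();

// Reject the request if its estimated cost exceeds the tenant's budget.
const budgetCheck: Stage = (ctx) => {
  ctx.log.push("budget");
  const remaining = remainingBudgetUsd.get(ctx.tenantId) ?? 0;
  if (ctx.estimatedCostUsd > remaining) {
    throw new Error(`budget exceeded for ${ctx.tenantId}`);
  }
};

// Exact-match lookup standing in for the semantic cache.
const cacheLookup: Stage = (ctx) => {
  ctx.log.push("cache");
  const hit = responseCache.get(ctx.prompt);
  if (hit !== undefined) ctx.response = hit;
};

// Stand-in for routing plus the provider adapter call.
const callProvider: Stage = (ctx) => {
  ctx.log.push("provider");
  const answer = `echo: ${ctx.prompt}`;
  ctx.response = answer;
  responseCache.set(ctx.prompt, answer);
  const remaining = remainingBudgetUsd.get(ctx.tenantId) ?? 0;
  remainingBudgetUsd.set(ctx.tenantId, remaining - ctx.estimatedCostUsd);
};

// Run stages in order, stopping early once a response is available.
function runPipeline(stages: Stage[], ctx: GatewayContext): GatewayContext {
  for (const stage of stages) {
    if (ctx.response !== undefined) break;
    stage(ctx);
  }
  return ctx;
}

function handle(tenantId: string, prompt: string, estimatedCostUsd: number): GatewayContext {
  const ctx: GatewayContext = { tenantId, prompt, estimatedCostUsd, log: [] };
  return runPipeline([budgetCheck, cacheLookup, callProvider], ctx);
}
```

In this sketch a repeated prompt is answered from the cache without reaching the provider stage, so it does not reduce the tenant's remaining budget.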

Who Is This For?

Gatewyse is built for engineering teams that:

Use multiple AI providers and need a unified API
Want to control costs with budgets and free-tier optimization
Require enterprise security: multi-tenancy, RBAC, audit logs, PII guards
Need an admin dashboard for non-technical team members to manage providers and routing
Want to avoid vendor lock-in while keeping their integration code simple

Tech Stack

Component	Technology
Runtime	Node.js 24+, TypeScript (strict)
Server	Express 5
Database	MongoDB 7+ (replica set)
Cache / Queue	Redis 7+, BullMQ
Admin UI	Nuxt 4, Vue 3, PrimeVue 4
Real-time	Socket.io
Validation	Zod
Deployment	Docker, Kubernetes

Next Steps

Installation — set up Gatewyse locally
Quick Start — make your first API call in 5 minutes
Configuration — understand environment variables and feature flags