Introduction
Gatewyse is a unified API gateway that sits between your applications and AI providers. Point your apps at a single endpoint and the gateway picks the best provider for the job — prioritizing free-tier usage, respecting budgets, and falling back automatically on failure.
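As a rough illustration of pointing an application at the gateway, the sketch below builds a chat request in the OpenAI chat-completions shape. The base URL, API key, `/v1/chat/completions` path, and model name are assumptions made for illustration and are not taken from the Gatewyse documentation; the Quick Start guide lists the values for a real deployment.

```typescript
// Hypothetical values: replace with your own deployment's URL and key.
const GATEWAY_BASE_URL = "http://localhost:3000/v1"; // assumed address
const GATEWAY_API_KEY = "gw-example-key"; // placeholder, not a real key

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build a request in the OpenAI chat-completions shape.
// The application only ever talks to the gateway URL, never to a provider.
function buildChatRequest(model: string, messages: ChatMessage[]): GatewayRequest {
  return {
    url: `${GATEWAY_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${GATEWAY_API_KEY}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

const chatRequest = buildChatRequest("gpt-4o-mini", [
  { role: "user", content: "Hello" },
]);

// Against a running gateway you would send it with:
//   const res = await fetch(chatRequest.url, chatRequest.init);
console.log(chatRequest.url);
```

Because only the base URL and key are gateway-specific, an existing OpenAI-style client should need little more than a changed base URL.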
Why Gatewyse?
- No vendor lock-in. Switch providers without changing application code. The gateway exposes an OpenAI-compatible API regardless of which provider handles the request.
- Cost control. Free-tier-first routing uses up free-tier inference from Groq, DeepSeek, and other providers before touching paid keys. Budgets can be set per tenant, per org, and per user, and are enforced automatically.
- One API for everything. Text, images, audio, embeddings, reranking — configure which capabilities are enabled and the gateway handles provider selection.
- Enterprise controls. Multi-tenant isolation, RBAC with 6 default roles, API key management, PII detection, prompt injection guards, and immutable audit logs.
- Your keys, your rules. Each tenant configures their own provider API keys, enables or disables capabilities, and sets routing preferences through the admin dashboard.
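The layered budgets mentioned above can be pictured as a check that runs before a request is routed. The following is a minimal sketch under stated assumptions: the `BudgetState` shape, the field names, and the dollar figures are hypothetical and do not describe Gatewyse's actual data model.

```typescript
// Hypothetical budget model; scopes mirror the tenant / org / user levels.
type BudgetScope = "tenant" | "org" | "user";

interface BudgetState {
  scope: BudgetScope;
  limitUsd: number; // configured ceiling for the period
  spentUsd: number; // spend recorded so far
}

interface BudgetDecision {
  allowed: boolean;
  blockedBy?: BudgetScope;
}

// A request is allowed only if every applicable budget can absorb
// its estimated cost; the first exhausted budget blocks it.
function checkBudgets(budgets: BudgetState[], estimatedCostUsd: number): BudgetDecision {
  for (const budget of budgets) {
    if (budget.spentUsd + estimatedCostUsd > budget.limitUsd) {
      return { allowed: false, blockedBy: budget.scope };
    }
  }
  return { allowed: true };
}

// Made-up numbers for illustration.
const budgetLevels: BudgetState[] = [
  { scope: "tenant", limitUsd: 500, spentUsd: 120 },
  { scope: "org", limitUsd: 100, spentUsd: 40 },
  { scope: "user", limitUsd: 10, spentUsd: 9.5 },
];

console.log(checkBudgets(budgetLevels, 0.25)); // fits within all three budgets
console.log(checkBudgets(budgetLevels, 1)); // exceeds the user budget
```

Checking every level rather than only the narrowest one means a single user cannot exceed an org or tenant ceiling even when their personal budget still has room.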
Supported Providers
Gatewyse supports 28 providers across cloud and self-hosted deployments:
| Cloud Providers | Self-Hosted |
|---|---|
| OpenAI, Anthropic, Google Gemini, Azure OpenAI, Groq, Mistral, Cohere, DeepSeek, Together AI, Perplexity, Fireworks AI, Replicate, AI21 Labs, HuggingFace, xAI, Cerebras, SambaNova, AssemblyAI, ElevenLabs | Ollama, vLLM, LM Studio, LocalAI, llama.cpp, Whisper Local, ComfyUI, Stability AI |
Routing Strategies
The gateway provides 10 routing strategies that can be configured per tenant and per capability:
| Strategy | Description |
|---|---|
| Priority | Try providers in a fixed order; fail over to the next on error |
| Round-robin | Distribute requests evenly across providers |
| Weighted | Route based on configured weight percentages |
| Least-cost | Prefer the cheapest available provider |
| Least-latency | Prefer the provider with the lowest recent latency |
| Free-tier-first | Exhaust free-tier providers before using paid ones |
| Task-optimized | Select the best provider based on task type and model capabilities |
| Cost-optimized | Route to the cheapest provider based on model pricing |
| Failover | Priority ordering with automatic demotion of degraded providers |
| Random | Randomly select a provider for simple load distribution |
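To make a few of the strategies in the table concrete, here is a minimal sketch that ranks candidate providers and then fails over down the ranked list. The `ProviderInfo` fields, the `rankProviders` and `callWithFailover` helpers, and all numbers are hypothetical; this shows the general technique, not Gatewyse's routing service.

```typescript
// Hypothetical provider shape; field names are illustrative only.
interface ProviderInfo {
  name: string;
  priority: number; // lower is tried first
  weight: number; // relative share for weighted routing
  costPer1kTokens: number; // made-up USD figures
  freeTierRemaining: number; // free-tier requests left
  healthy: boolean;
}

type StrategyName = "priority" | "round-robin" | "weighted" | "least-cost" | "free-tier-first";

let roundRobinCursor = 0;

// Return healthy providers in the order they should be attempted.
function rankProviders(
  strategy: StrategyName,
  candidates: ProviderInfo[],
  random: () => number = Math.random,
): ProviderInfo[] {
  const healthy = candidates.filter((p) => p.healthy);
  const byPriority = [...healthy].sort((a, b) => a.priority - b.priority);
  if (byPriority.length === 0) return [];

  switch (strategy) {
    case "priority":
      return byPriority;
    case "least-cost":
      return [...healthy].sort((a, b) => a.costPer1kTokens - b.costPer1kTokens);
    case "free-tier-first": {
      // Providers with free quota come first, paid ones after.
      const free = byPriority.filter((p) => p.freeTierRemaining > 0);
      const paid = byPriority.filter((p) => p.freeTierRemaining <= 0);
      return [...free, ...paid];
    }
    case "round-robin": {
      // Rotate the starting point on every call.
      const start = roundRobinCursor++ % byPriority.length;
      return [...byPriority.slice(start), ...byPriority.slice(0, start)];
    }
    case "weighted": {
      // Pick the first attempt in proportion to weight; the rest are fallbacks.
      const total = byPriority.reduce((sum, p) => sum + p.weight, 0);
      let remaining = random() * total;
      let chosen: ProviderInfo | undefined;
      for (const p of byPriority) {
        chosen = p; // falls back to the last provider if weights sum to zero
        remaining -= p.weight;
        if (remaining < 0) break;
      }
      return chosen ? [chosen, ...byPriority.filter((p) => p !== chosen)] : [];
    }
  }
}

// Try each ranked provider in turn; move on when an attempt throws.
async function callWithFailover<T>(
  ranked: ProviderInfo[],
  attempt: (provider: ProviderInfo) => Promise<T>,
): Promise<T> {
  let lastError: unknown = new Error("no healthy providers");
  for (const provider of ranked) {
    try {
      return await attempt(provider);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

const exampleProviders: ProviderInfo[] = [
  { name: "openai", priority: 1, weight: 1, costPer1kTokens: 0.5, freeTierRemaining: 0, healthy: true },
  { name: "groq", priority: 2, weight: 3, costPer1kTokens: 0.1, freeTierRemaining: 100, healthy: true },
  { name: "ollama", priority: 3, weight: 1, costPer1kTokens: 0, freeTierRemaining: 0, healthy: false },
];

// groq is ranked ahead of openai because it still has free-tier quota.
console.log(rankProviders("free-tier-first", exampleProviders).map((p) => p.name));
```

Returning an ordered list instead of a single provider keeps failover independent of the strategy: whichever strategy produced the ranking, the caller simply walks it until one attempt succeeds.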
Architecture Overview
    Your Application (OpenAI / Anthropic SDK)
            │
            ▼
    AI Gateway (Express middleware pipeline)
      ├── Auth ─► Tenant Resolver ─► RBAC ─► Validation
      ├── Format Detection ─► Normalizer ─► Prompt Guards
      ├── Budget Check ─► Semantic Cache ─► Usage Tracking
            │
            ▼
    Routing Service (10 strategies, LRU-cached)
            │
            ▼
    Provider Adapter ──► OpenAI / Anthropic / Gemini / ...
            │
            ▼
    Response ─► Cache ─► Usage Tracking ─► Audit Log ─► Client

Who Is This For?
Gatewyse is built for engineering teams that:
- Use multiple AI providers and need a unified API
- Want to control costs with budgets and free-tier optimization
- Require enterprise security: multi-tenancy, RBAC, audit logs, PII guards
- Need an admin dashboard for non-technical team members to manage providers and routing
- Want to avoid vendor lock-in while keeping their integration code simple
Tech Stack
| Component | Technology |
|---|---|
| Runtime | Node.js 24+, TypeScript (strict) |
| Server | Express 5 |
| Database | MongoDB 7+ (replica set) |
| Cache / Queue | Redis 7+, BullMQ |
| Admin UI | Nuxt 4, Vue 3, PrimeVue 4 |
| Real-time | Socket.io |
| Validation | Zod |
| Deployment | Docker, Kubernetes |
Next Steps
- Installation — set up Gatewyse locally
- Quick Start — make your first API call in 5 minutes
- Configuration — understand environment variables and feature flags