Architecture Overview
Gatewyse is an enterprise multi-tenant system that sits between your applications and AI providers. It provides unified API access, intelligent routing, security guards, budget enforcement, and observability across 28 AI providers.
Monorepo Structure
ai-gateway/
+-- packages/
|   +-- shared/        # Types, Zod schemas, utilities (consumed by all packages)
|   +-- server/        # Express 5 API server (gateway + admin API)
|   +-- worker/        # BullMQ background job processor
|   +-- admin/         # Nuxt 4 SPA dashboard (PrimeVue 4)
|   +-- docs/          # Astro Starlight documentation site
+-- docker/            # Docker Compose configs, Dockerfiles, nginx
+-- tests/
|   +-- unit/          # Jest unit tests (777 tests)
|   +-- integration/   # Jest integration tests with Docker MongoDB/Redis (46 tests)
|   +-- smoke/         # Live server smoke tests (6 phases)
+-- e2e/               # Playwright E2E tests for admin dashboard (46 tests)

Key Technologies
| Component | Technology |
|---|---|
| Runtime | Node.js 24+ |
| API Server | Express 5 |
| Admin Dashboard | Nuxt 4, PrimeVue 4, Pinia |
| Database | MongoDB (Mongoose ODM) |
| Cache / Queue | Redis (ioredis) |
| Job Queue | BullMQ |
| Package Manager | pnpm 10+ (workspaces) |
| Auth | JWT + RBAC + SSO (OIDC/SAML) |
Request Lifecycle
Every gateway request follows this pipeline:
Client Request
    |
    v
[1] Authentication -- Validate API key or JWT
    |
    v
[2] Rate Limiting -- Per-IP and per-tenant rate limits
    |
    v
[3] Budget Check -- Verify spending limits (tenant/org/dept/user)
    |
    v
[4] Semantic Cache Lookup -- Check for cached similar responses
    |   (cache hit? return cached response)
    v
[5] Guard Pipeline -- PII detection, injection defense, content filter,
    |   toxicity scoring, token/cost limits, custom rules
    |   (blocked? return error)
    v
[6] Routing -- Resolve provider chain using configured strategy
    |   (priority, round-robin, weighted, least-cost,
    |   least-latency, free-tier-first, task-optimized)
    v
[7] Provider Execution -- Adapter translates to provider format, sends request
    |   with retry/backoff on transient failures
    |   (failure? try next provider in chain)
    v
[8] Response Formatting -- Adapter translates provider response to unified format
    |
    v
[9] Cache Write -- Store response in semantic cache
    |
    v
[10] Usage Tracking -- Record tokens, cost, latency in usage + budget counters
    |
    v
Client Response

Data Flow Between Services
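Step 7 of the request lifecycle is the point where the server hands a request to a provider, retries transient failures, and falls back to the next provider in the chain. The sketch below illustrates that behavior only; it is not Gatewyse's implementation. The `ProviderAdapter` interface, the `transient` flag on errors, and the `executeWithFailover` function with its retry and delay parameters are all hypothetical names chosen for this example.

```typescript
// Hypothetical adapter shape for illustration; not Gatewyse's actual interface.
interface ProviderAdapter {
  name: string;
  // Sends the prompt in the provider's own format and resolves to unified output.
  send(prompt: string): Promise<string>;
}

// Assumption: transient failures (timeouts, HTTP 429/503) carry `transient: true`.
function isTransient(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { transient?: boolean }).transient === true
  );
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Tries each provider in chain order. Transient errors are retried with
// exponential backoff; any other error, or exhausted retries, moves on to the
// next provider. Rejects only when every provider in the chain has failed.
async function executeWithFailover(
  chain: ProviderAdapter[],
  prompt: string,
  maxRetries = 2,
  baseDelayMs = 100,
): Promise<{ provider: string; output: string }> {
  const failures: string[] = [];
  for (const provider of chain) {
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        const output = await provider.send(prompt);
        return { provider: provider.name, output };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        failures.push(`${provider.name} (attempt ${attempt + 1}): ${message}`);
        if (!isTransient(err) || attempt === maxRetries) break;
        // Backoff doubles each retry: baseDelayMs, then 2x, then 4x, ...
        await sleep(baseDelayMs * Math.pow(2, attempt));
      }
    }
  }
  throw new Error(`All providers failed: ${failures.join("; ")}`);
}
```

The chain passed in would be the ordered provider list produced by the routing step (step 6), so the routing strategy decides which provider is tried first and this loop decides what happens when it fails.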
+------------+       +-----------+       +------------+
|   Client   | ----> |  Server   | ----> |  Provider  |
| (your app) | <---- | (Express) | <---- | (OpenAI,   |
+------------+       +-----+-----+       | Anthropic, |
                           |             | etc.)      |
                           v             +------------+
                     +-----+------+
                     |  MongoDB   |  Configs, users, tenants,
                     |            |  audit logs, usage records
                     +-----+------+
                           |
                     +-----+------+
                     |   Redis    |  Cache, rate limits, routing
                     |            |  counters, session state, queues
                     +-----+------+
                           |
                     +-----+------+
                     |   Worker   |  Budget resets, SIEM export,
                     |  (BullMQ)  |  backup jobs, health checks
                     +------------+

Admin Dashboard
The admin dashboard is a Nuxt 4 single-page application (SPA mode, ssr: false) that communicates with the server’s /api/admin/* endpoints. It provides management interfaces for:
- Tenants, users, and RBAC role management
- Provider and model configuration
- Routing rule management
- Guard configuration
- Budget creation and monitoring
- Usage analytics and audit logs
- Semantic cache management
- System settings and backups
API Surface
The server exposes two groups of endpoints:
Gateway API (/v1/*) — OpenAI-compatible endpoints consumed by applications:
- /v1/chat/completions: Chat completions (streaming and non-streaming)
- /v1/completions: Text completions
- /v1/embeddings: Vector embeddings
- /v1/audio/transcriptions, /v1/audio/translations, /v1/audio/speech: Audio
- /v1/images/generations: Image generation
- /v1/rerank: Document reranking
- /v1/video/generations: Video generation
- /v1/models: List available models
- /v1/usage, /v1/budget: API key self-service
Admin API (/api/admin/*) — Dashboard management endpoints for tenants, users, providers, models, routing, guards, budgets, usage, audit logs, cache, settings, documents, backups, and model intelligence.
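Because the Gateway API is OpenAI-compatible, an application can usually call it the same way it would call OpenAI, with only the base URL changed. The sketch below builds such a request for /v1/chat/completions. The base URL, API key, and model name are placeholders, and sending the key as a `Bearer` token in the `Authorization` header is an assumption borrowed from the OpenAI convention; check your deployment for the actual header format. `buildChatRequest` is a hypothetical helper, not part of any Gatewyse SDK.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds a request for the gateway's OpenAI-compatible chat endpoint.
// Assumption: the API key is sent as a Bearer token (OpenAI convention).
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false,
): GatewayRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Placeholder values; replace with your gateway host, key, and a configured model.
const request = buildChatRequest(
  "https://gateway.example.com",
  "YOUR_API_KEY",
  "example-model",
  [{ role: "user", content: "Hello" }],
);

// To send it: const response = await fetch(request.url, request.init);
```

Keeping request construction separate from the network call makes it easy to point an existing OpenAI client or test harness at the gateway without touching the rest of the application.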
Next Steps
- Routing Strategies — Deep dive into the ten routing algorithms
- Provider Adapters — How adapters translate between the unified API and 28 providers
- Contributing — Development setup and testing