
Architecture Overview

Gatewyse is an enterprise multi-tenant system that sits between your applications and AI providers. It provides unified API access, intelligent routing, security guards, budget enforcement, and observability across 28 AI providers.

Monorepo Structure

ai-gateway/
+-- packages/
|   +-- shared/       # Types, Zod schemas, utilities (consumed by all packages)
|   +-- server/       # Express 5 API server (gateway + admin API)
|   +-- worker/       # BullMQ background job processor
|   +-- admin/        # Nuxt 4 SPA dashboard (PrimeVue 4)
|   +-- docs/         # Astro Starlight documentation site
+-- docker/           # Docker Compose configs, Dockerfiles, nginx
+-- tests/
|   +-- unit/         # Jest unit tests (777 tests)
|   +-- integration/  # Jest integration tests with Docker MongoDB/Redis (46 tests)
|   +-- smoke/        # Live server smoke tests (6 phases)
+-- e2e/              # Playwright E2E tests for admin dashboard (46 tests)

Key Technologies

| Component       | Technology                   |
|-----------------|------------------------------|
| Runtime         | Node.js 24+                  |
| API Server      | Express 5                    |
| Admin Dashboard | Nuxt 4, PrimeVue 4, Pinia    |
| Database        | MongoDB (Mongoose ODM)       |
| Cache / Queue   | Redis (ioredis)              |
| Job Queue       | BullMQ                       |
| Package Manager | pnpm 10+ (workspaces)        |
| Auth            | JWT + RBAC + SSO (OIDC/SAML) |

Request Lifecycle

Every gateway request follows this pipeline:

Client Request
      |
      v
[1] Authentication         -- Validate API key or JWT
      |
      v
[2] Rate Limiting          -- Per-IP and per-tenant rate limits
      |
      v
[3] Budget Check           -- Verify spending limits (tenant/org/dept/user)
      |
      v
[4] Semantic Cache Lookup  -- Check for cached similar responses
      |                       (cache hit? return cached response)
      v
[5] Guard Pipeline         -- PII detection, injection defense, content filter,
      |                       toxicity scoring, token/cost limits, custom rules
      |                       (blocked? return error)
      v
[6] Routing                -- Resolve provider chain using configured strategy
      |                       (priority, round-robin, weighted, least-cost,
      |                       least-latency, free-tier-first, task-optimized)
      v
[7] Provider Execution     -- Adapter translates to provider format, sends request
      |                       with retry/backoff on transient failures
      |                       (failure? try next provider in chain)
      v
[8] Response Formatting    -- Adapter translates provider response to unified format
      |
      v
[9] Cache Write            -- Store response in semantic cache
      |
      v
[10] Usage Tracking        -- Record tokens, cost, latency in usage + budget counters
      |
      v
Client Response
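
The control flow of this pipeline, with its early exits and provider failover, can be sketched in TypeScript. This is an illustrative sketch only, not the Gatewyse implementation: every type and function name here (`handleRequest`, `PipelineDeps`, `Provider`) is hypothetical, the HTTP status codes are assumptions, the cache is an exact-match `Map` standing in for the semantic cache, and per-provider retry/backoff is left out for brevity.

```typescript
// Hypothetical types: none of these names come from the Gatewyse codebase.
type GatewayRequest = { apiKey: string; prompt: string };
type GatewayResponse = { status: number; body: string; provider?: string; cached?: boolean };

// A provider adapter; `call` rejects on failure so the chain can fall through.
type Provider = { name: string; call: (prompt: string) => Promise<string> };

interface PipelineDeps {
  authenticate: (apiKey: string) => boolean;     // step 1
  withinRateLimit: (apiKey: string) => boolean;  // step 2
  withinBudget: (apiKey: string) => boolean;     // step 3
  cache: Map<string, string>;                    // steps 4 and 9 (exact-match stand-in)
  guard: (prompt: string) => string | null;      // step 5: block reason, or null to allow
  route: (req: GatewayRequest) => Provider[];    // step 6: ordered provider chain
  recordUsage: (provider: string) => void;       // step 10
}

async function handleRequest(req: GatewayRequest, deps: PipelineDeps): Promise<GatewayResponse> {
  // Steps 1-3: reject early (status codes are assumed, not documented).
  if (!deps.authenticate(req.apiKey)) return { status: 401, body: "unauthorized" };
  if (!deps.withinRateLimit(req.apiKey)) return { status: 429, body: "rate limited" };
  if (!deps.withinBudget(req.apiKey)) return { status: 402, body: "budget exceeded" };

  // Step 4: a cache hit short-circuits the rest of the pipeline.
  const hit = deps.cache.get(req.prompt);
  if (hit !== undefined) return { status: 200, body: hit, cached: true };

  // Step 5: a guard may block the request.
  const blockReason = deps.guard(req.prompt);
  if (blockReason !== null) return { status: 400, body: blockReason };

  // Steps 6-10: try each provider in the resolved chain until one succeeds.
  for (const provider of deps.route(req)) {
    try {
      const body = await provider.call(req.prompt); // steps 7 and 8
      deps.cache.set(req.prompt, body);             // step 9
      deps.recordUsage(provider.name);              // step 10
      return { status: 200, body, provider: provider.name };
    } catch {
      // Provider failed: fall through to the next provider in the chain.
    }
  }
  return { status: 502, body: "all providers failed" };
}
```

Passing the stages in as a dependency object keeps each step independently replaceable, which mirrors how the diagram treats them as separate stages.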

Data Flow Between Services

+------------+       +----------+       +-----------+
|   Client   | ----> |  Server  | ----> | Provider  |
| (your app) | <---- | (Express)| <---- | (OpenAI,  |
+------------+       +----+-----+       | Anthropic,|
                          |             | etc.)     |
                          v             +-----------+
                    +-----+------+
                    |  MongoDB   |  Configs, users, tenants,
                    |            |  audit logs, usage records
                    +-----+------+
                          |
                    +-----+------+
                    |   Redis    |  Cache, rate limits, routing
                    |            |  counters, session state, queues
                    +-----+------+
                          |
                    +-----+------+
                    |   Worker   |  Budget resets, SIEM export,
                    |  (BullMQ)  |  backup jobs, health checks
                    +------------+

Admin Dashboard

The admin dashboard is a Nuxt 4 single-page application (SPA mode, ssr: false) that communicates with the server’s /api/admin/* endpoints. It provides management interfaces for:

  • Tenants, users, and RBAC role management
  • Provider and model configuration
  • Routing rule management
  • Guard configuration
  • Budget creation and monitoring
  • Usage analytics and audit logs
  • Semantic cache management
  • System settings and backups

API Surface

The server exposes two groups of endpoints:

Gateway API (/v1/*) — OpenAI-compatible endpoints consumed by applications:

  • /v1/chat/completions — Chat completions (streaming and non-streaming)
  • /v1/completions — Text completions
  • /v1/embeddings — Vector embeddings
  • /v1/audio/transcriptions, /v1/audio/translations, /v1/audio/speech — Audio
  • /v1/images/generations — Image generation
  • /v1/rerank — Document reranking
  • /v1/video/generations — Video generation
  • /v1/models — List available models
  • /v1/usage, /v1/budget — API key self-service
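
Because the gateway endpoints are OpenAI-compatible, an application can call them with a plain HTTP client. The sketch below shows a non-streaming chat completion request. It rests on several assumptions that are not confirmed by this page: the base URL and port are placeholders, the API key is assumed to be sent as a `Bearer` token in the `Authorization` header (the OpenAI convention), and the helper names `buildChatRequest` and `chat` are hypothetical.

```typescript
// Assumed placeholder: replace with the address of your gateway deployment.
const GATEWAY_URL = "http://localhost:3000";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the request separately so it can be inspected or logged before sending.
function buildChatRequest(apiKey: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${GATEWAY_URL}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: the gateway accepts its API key as a Bearer token.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages, stream: false }),
    },
  };
}

// Send the request and return the parsed JSON body.
async function chat(apiKey: string, model: string, messages: ChatMessage[]): Promise<unknown> {
  const { url, init } = buildChatRequest(apiKey, model, messages);
  const res = await fetch(url, init);
  if (!res.ok) {
    throw new Error(`Gateway returned HTTP ${res.status}`);
  }
  return res.json();
}
```

A call might look like `await chat(myKey, "some-model", [{ role: "user", content: "Hello" }])`, where the model name is whatever your gateway has configured.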

Admin API (/api/admin/*) — Dashboard management endpoints for tenants, users, providers, models, routing, guards, budgets, usage, audit logs, cache, settings, documents, backups, and model intelligence.

Next Steps