Architecture Overview

Gatewyse is an enterprise, multi-tenant AI gateway that sits between your applications and AI providers. It provides unified API access, intelligent routing, security guards, budget enforcement, and observability across 28 AI providers.

Monorepo Structure

ai-gateway/
+-- packages/
|   +-- shared/       # Types, Zod schemas, utilities (consumed by all packages)
|   +-- server/       # Express 5 API server (gateway + admin API)
|   +-- worker/       # BullMQ background job processor
|   +-- admin/        # Nuxt 4 SPA dashboard (PrimeVue 4)
|   +-- docs/         # Astro Starlight documentation site
+-- docker/           # Docker Compose configs, Dockerfiles, nginx
+-- tests/
|   +-- unit/         # Jest unit tests (777 tests)
|   +-- integration/  # Jest integration tests with Docker MongoDB/Redis (46 tests)
|   +-- smoke/        # Live server smoke tests (6 phases)
+-- e2e/              # Playwright E2E tests for admin dashboard (46 tests)

Key Technologies

Component	Technology
Runtime	Node.js 24+
API Server	Express 5
Admin Dashboard	Nuxt 4, PrimeVue 4, Pinia
Database	MongoDB (Mongoose ODM)
Cache / Queue	Redis (ioredis)
Job Queue	BullMQ
Package Manager	pnpm 10+ (workspaces)
Auth	JWT + RBAC + SSO (OIDC/SAML)

Request Lifecycle

Every gateway request follows this pipeline:

Client Request
      |
      v
[1] Authentication        -- Validate API key or JWT
      |
      v
[2] Rate Limiting         -- Per-IP and per-tenant rate limits
      |
      v
[3] Budget Check          -- Verify spending limits (tenant/org/dept/user)
      |
      v
[4] Semantic Cache Lookup -- Check for cached similar responses
      |  (cache hit? return cached response)
      v
[5] Guard Pipeline        -- PII detection, injection defense, content filter,
      |                       toxicity scoring, token/cost limits, custom rules
      |  (blocked? return error)
      v
[6] Routing               -- Resolve provider chain using configured strategy
      |                       (priority, round-robin, weighted, least-cost,
      |                        least-latency, free-tier-first, task-optimized)
      v
[7] Provider Execution    -- Adapter translates to provider format, sends request
      |                       with retry/backoff on transient failures
      |  (failure? try next provider in chain)
      v
[8] Response Formatting   -- Adapter translates provider response to unified format
      |
      v
[9] Cache Write           -- Store response in semantic cache
      |
      v
[10] Usage Tracking       -- Record tokens, cost, latency in usage + budget counters
      |
      v
Client Response
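
The control flow of the pipeline above can be sketched roughly as follows. This is a minimal illustration, not Gatewyse's actual code: every type and function name here (GatewayRequest, Stage, executeChain, handleRequest) is hypothetical. It assumes that steps 1-5 are modelled as stages that may end the request early, that cache writes and usage tracking run only after a successful provider call, and it leaves out the retry/backoff from step 7 for brevity.

```typescript
// Hypothetical types and names throughout; a sketch of the control flow only.
type GatewayRequest = { apiKey: string; model: string; prompt: string };
type GatewayResponse = { status: number; body: string; provider?: string };

// A provider call either resolves with text or throws on failure.
type Provider = { name: string; call: (req: GatewayRequest) => Promise<string> };

// A pre-execution stage returns a response to short-circuit, or null to continue.
type Stage = (req: GatewayRequest) => Promise<GatewayResponse | null>;

// Step 7: try each provider in order, moving to the next one on failure.
async function executeChain(chain: Provider[], req: GatewayRequest): Promise<GatewayResponse> {
  for (const provider of chain) {
    try {
      const body = await provider.call(req);
      return { status: 200, body, provider: provider.name };
    } catch {
      // Fall through to the next provider in the chain.
    }
  }
  return { status: 502, body: "all providers failed" };
}

async function handleRequest(
  req: GatewayRequest,
  stages: Stage[], // steps 1-5: auth, rate limit, budget, cache lookup, guards
  resolveChain: (req: GatewayRequest) => Provider[], // step 6: routing
  onComplete: (req: GatewayRequest, res: GatewayResponse) => Promise<void>, // steps 9-10
): Promise<GatewayResponse> {
  // Any stage may end the request early (auth failure, cache hit, guard block).
  for (const stage of stages) {
    const early = await stage(req);
    if (early !== null) return early;
  }
  // Steps 6-8: resolve the provider chain and execute it with failover.
  const res = await executeChain(resolveChain(req), req);
  // Steps 9-10 (assumed to run only on success): cache write and usage tracking.
  if (res.status === 200) await onComplete(req, res);
  return res;
}
```

Modelling the early steps as a list of stages keeps the short-circuit behaviour (cache hit, guard block) in one place, while provider failover stays isolated in executeChain.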

Data Flow Between Services

+------------+       +----------+       +-----------+
|   Client   | ----> |  Server  | ----> | Provider  |
| (your app) | <---- | (Express)| <---- | (OpenAI,  |
+------------+       +----+-----+       | Anthropic,|
                          |             | etc.)     |
                          v             +-----------+
                    +-----+------+
                    |  MongoDB   |  Configs, users, tenants,
                    |            |  audit logs, usage records
                    +-----+------+
                          |
                    +-----+------+
                    |   Redis    |  Cache, rate limits, routing
                    |            |  counters, session state, queues
                    +-----+------+
                          |
                    +-----+------+
                    |   Worker   |  Budget resets, SIEM export,
                    | (BullMQ)   |  backup jobs, health checks
                    +------------+
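
To make the role of the Redis counters concrete, here is a small sketch of a per-tenant rate limit. It is written under stated assumptions rather than taken from the Gatewyse source: the overview does not say which rate-limiting algorithm is used, so a fixed-window counter is assumed, and an in-memory CounterStore class (hypothetical) stands in for Redis so that the example runs without a server. In a real deployment the same idea would map onto a Redis increment with an expiry on the key.

```typescript
// Hypothetical in-memory stand-in for a Redis counter with a TTL.
class CounterStore {
  private counters = new Map<string, { count: number; expiresAt: number }>();

  // Increment the counter for `key`, starting a new window if none is active.
  incr(key: string, ttlMs: number, now: number): number {
    const entry = this.counters.get(key);
    if (!entry || entry.expiresAt <= now) {
      this.counters.set(key, { count: 1, expiresAt: now + ttlMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }
}

// Fixed-window limiter (an assumption): at most `limit` requests per window per tenant.
function allowRequest(
  store: CounterStore,
  tenantId: string,
  limit: number,
  windowMs: number,
  now: number,
): boolean {
  const key = `ratelimit:${tenantId}`; // hypothetical key naming scheme
  return store.incr(key, windowMs, now) <= limit;
}
```

Passing the current time in as a parameter keeps the function deterministic, which makes the window rollover easy to test.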

Admin Dashboard

The admin dashboard is a Nuxt 4 single-page application (SPA mode, ssr: false) that communicates with the server’s /api/admin/* endpoints. It provides management interfaces for:

Tenants, users, and RBAC role management
Provider and model configuration
Routing rule management
Guard configuration
Budget creation and monitoring
Usage analytics and audit logs
Semantic cache management
System settings and backups

API Surface

The server exposes two groups of endpoints:

Gateway API (/v1/*) — OpenAI-compatible endpoints consumed by applications:

/v1/chat/completions — Chat completions (streaming and non-streaming)
/v1/completions — Text completions
/v1/embeddings — Vector embeddings
/v1/audio/transcriptions, /v1/audio/translations, /v1/audio/speech — Audio
/v1/images/generations — Image generation
/v1/rerank — Document reranking
/v1/video/generations — Video generation
/v1/models — List available models
/v1/usage, /v1/budget — API key self-service
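
Because the gateway endpoints are described as OpenAI-compatible, a chat completion call could look roughly like the sketch below. The base URL, API key, and model ID are placeholders, and sending the key as an Authorization: Bearer header is an assumption borrowed from the OpenAI convention; this overview does not specify the header format. The helper buildChatRequest is hypothetical and only assembles the request, so it can be inspected without a running gateway.

```typescript
// Hypothetical helper; field names follow the OpenAI chat completions request shape.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
  stream = false,
) {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`, // assumed header format
      },
      body: JSON.stringify({ model, messages, stream }),
    },
  };
}

// Usage against a running gateway (placeholder values):
// const { url, init } = buildChatRequest(
//   "https://gateway.example.com", "YOUR_API_KEY", "your-model-id",
//   [{ role: "user", content: "Hello" }],
// );
// const res = await fetch(url, init);
```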

Admin API (/api/admin/*) — Dashboard management endpoints for tenants, users, providers, models, routing, guards, budgets, usage, audit logs, cache, settings, documents, backups, and model intelligence.

Next Steps

Routing Strategies — Deep dive into the ten routing algorithms
Provider Adapters — How adapters translate between the unified API and 28 providers
Contributing — Development setup and testing