Kubernetes Deployment
Gatewyse can be deployed on Kubernetes for production workloads. The repository includes manifests in the k8s/ directory.
Architecture
A typical Kubernetes deployment includes:
| Resource | Purpose |
|---|---|
| Deployment (server) | Gateway API pods |
| Deployment (worker) | BullMQ background worker pods |
| Deployment (admin) | Admin dashboard pods |
| Service | Internal and external networking |
| ConfigMap | Non-sensitive configuration |
| Secret | JWT secrets, encryption keys, database credentials |
| HPA | Horizontal Pod Autoscaler for the server |
| Ingress | External traffic routing |
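The table above does not list a Namespace, but every manifest on this page sets `namespace: ai-gateway`, so that namespace has to exist before the other resources are applied. A minimal sketch, assuming the namespace is not already created by other tooling in your cluster:

```yaml
# Create this first; the Secret, ConfigMap, Deployments, Service,
# Ingress, and HPA on this page all reference this namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: ai-gateway
```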
Secret
Store sensitive values in a Kubernetes Secret:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aigw-secrets
  namespace: ai-gateway
type: Opaque
stringData:
  JWT_SECRET: "<random-64-char-string>"
  JWT_REFRESH_SECRET: "<random-64-char-string>"
  ENCRYPTION_KEY: "<random-64-hex-chars>"
  REDIS_PASSWORD: "<redis-password>"
  SUPER_ADMIN_PASSWORD: "<complex-password>"
  MONGODB_URI: "mongodb://mongo-0.mongo:27017/ai-gateway?replicaSet=rs0"
```
ConfigMap
Store non-sensitive configuration in a ConfigMap:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aigw-config
  namespace: ai-gateway
data:
  NODE_ENV: "production"
  PORT: "3000"
  LOG_LEVEL: "info"
  REDIS_HOST: "redis-master"
  REDIS_PORT: "6379"
  ADMIN_URL: "https://admin.your-domain.com"
  BULLMQ_PREFIX: "aigw"
  CACHE_SIMILARITY_THRESHOLD: "0.96"
  CACHE_DEFAULT_TTL_SECONDS: "86400"
```
Server Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aigw-server
  template:
    metadata:
      labels:
        app: aigw-server
    spec:
      containers:
        - name: server
          image: ai-gateway/server:latest
          ports:
            - containerPort: 3000
          envFrom:
            - configMapRef:
                name: aigw-config
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "2"
              memory: 1Gi
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```
Worker Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-worker
  namespace: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aigw-worker
  template:
    metadata:
      labels:
        app: aigw-worker
    spec:
      containers:
        - name: worker
          image: ai-gateway/worker:latest
          envFrom:
            - configMapRef:
                name: aigw-config
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 512Mi
```
Service and Ingress
```yaml
apiVersion: v1
kind: Service
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  selector:
    app: aigw-server
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aigw-ingress
  namespace: ai-gateway
spec:
  rules:
    - host: gateway.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: aigw-server
                port:
                  number: 80
```
Horizontal Pod Autoscaler
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aigw-server-hpa
  namespace: ai-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aigw-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
Resource Recommendations
| Component | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|
| Server | 500m | 2 | 512Mi | 1Gi |
| Worker | 250m | 1 | 256Mi | 512Mi |
| Admin | 100m | 500m | 128Mi | 256Mi |
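This page gives full manifests for the server and worker but not for the admin dashboard. The following is a rough sketch of how the Admin row above might translate into a Deployment. The image name `ai-gateway/admin:latest`, the container port `8080`, the single replica, and the choice to load only the ConfigMap are all assumptions made for illustration; check the manifests in the k8s/ directory for the actual values.

```yaml
# Sketch only: image, port, replica count, and envFrom are assumptions,
# not values taken from the repository.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-admin
  namespace: ai-gateway
spec:
  replicas: 1                              # assumed
  selector:
    matchLabels:
      app: aigw-admin
  template:
    metadata:
      labels:
        app: aigw-admin
    spec:
      containers:
        - name: admin
          image: ai-gateway/admin:latest   # hypothetical image name
          ports:
            - containerPort: 8080          # hypothetical port
          envFrom:
            - configMapRef:
                name: aigw-config
          resources:
            # Values from the Admin row of the table above.
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
```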
Infrastructure Dependencies
MongoDB and Redis should be deployed separately using their respective Helm charts or managed services:
- MongoDB: Use the MongoDB Community Operator or a managed service (Atlas, DocumentDB). Replica set mode is required.
- Redis: Use the Bitnami Redis Helm chart or a managed service (ElastiCache, Memorystore).
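If you take the Bitnami Redis chart route, a values file along these lines could reuse the `REDIS_PASSWORD` key from the `aigw-secrets` Secret instead of duplicating the password. This is a sketch under assumptions: the file name is hypothetical, the chart is assumed to be installed into the `ai-gateway` namespace so it can read that Secret, and the value names should be verified against the values reference for the chart version you install.

```yaml
# values-redis.yaml (hypothetical file name)
architecture: replication
auth:
  enabled: true
  # Read the password from the Secret defined earlier on this page
  # rather than setting it inline.
  existingSecret: aigw-secrets
  existingSecretPasswordKey: REDIS_PASSWORD
```

With a Helm release named `redis`, the chart normally exposes the primary through a Service called `redis-master`, which matches the `REDIS_HOST` value in the ConfigMap. A different release name would require updating `REDIS_HOST` accordingly.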