Kubernetes Deployment

Gatewyse can be deployed on Kubernetes for production workloads. The repository includes manifests in the k8s/ directory.

Architecture

A typical Kubernetes deployment includes:

Resource	Purpose
Deployment (server)	Gateway API pods
Deployment (worker)	BullMQ background worker pods
Deployment (admin)	Admin dashboard pods
Service	Stable in-cluster endpoint for the server pods
ConfigMap	Non-sensitive configuration
Secret	JWT secrets, encryption keys, database credentials
HPA	Horizontal Pod Autoscaler for the server
Ingress	External traffic routing
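
All manifests on this page target the ai-gateway namespace, and namespaced resources cannot be created until that namespace exists. If it is not already present in your cluster, a minimal Namespace manifest, applied before everything else, might look like this:

```yaml
# Create the namespace referenced by every other manifest on this page.
apiVersion: v1
kind: Namespace
metadata:
  name: ai-gateway
```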

Secret

Store sensitive values in a Kubernetes Secret:

apiVersion: v1
kind: Secret
metadata:
  name: aigw-secrets
  namespace: ai-gateway
type: Opaque
stringData:
  JWT_SECRET: "<random-64-char-string>"
  JWT_REFRESH_SECRET: "<random-64-char-string>"
  ENCRYPTION_KEY: "<random-64-hex-chars>"
  REDIS_PASSWORD: "<redis-password>"
  SUPER_ADMIN_PASSWORD: "<complex-password>"
  MONGODB_URI: "mongodb://mongo-0.mongo:27017/ai-gateway?replicaSet=rs0"

ConfigMap

Store non-sensitive configuration in a ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: aigw-config
  namespace: ai-gateway
data:
  NODE_ENV: "production"
  PORT: "3000"
  LOG_LEVEL: "info"
  REDIS_HOST: "redis-master"
  REDIS_PORT: "6379"
  ADMIN_URL: "https://admin.your-domain.com"
  BULLMQ_PREFIX: "aigw"
  CACHE_SIMILARITY_THRESHOLD: "0.96"
  CACHE_DEFAULT_TTL_SECONDS: "86400"

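Server Deployment

The server Deployment loads its environment from the ConfigMap and Secret defined above and uses the /health endpoint on port 3000 for both the liveness and readiness probes: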

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  replicas: 2  # starting count; the server HPA scales this between 2 and 10
  selector:
    matchLabels:
      app: aigw-server
  template:
    metadata:
      labels:
        app: aigw-server
    spec:
      containers:
        - name: server
          image: ai-gateway/server:latest  # pin a specific version tag in production
          ports:
            - containerPort: 3000
          envFrom:
            - configMapRef:
                name: aigw-config
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 500m
              memory: 512Mi
            limits:
              cpu: "2"
              memory: 1Gi
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5

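Worker Deployment

The worker Deployment loads the same ConfigMap and Secret as the server but exposes no container port: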

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-worker
  namespace: ai-gateway
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aigw-worker
  template:
    metadata:
      labels:
        app: aigw-worker
    spec:
      containers:
        - name: worker
          image: ai-gateway/worker:latest  # pin a specific version tag in production
          envFrom:
            - configMapRef:
                name: aigw-config
            - secretRef:
                name: aigw-secrets
          resources:
            requests:
              cpu: 250m
              memory: 256Mi
            limits:
              cpu: "1"
              memory: 512Mi
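
The architecture table also lists an admin Deployment, and the resource recommendations give values for it. A minimal sketch follows the same pattern as the server and worker. The image name ai-gateway/admin, the aigw-admin name and label, the single replica, and container port 8080 are all assumptions for illustration; check the manifests in the k8s/ directory for the real values.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aigw-admin            # hypothetical name, following the aigw-* pattern
  namespace: ai-gateway
spec:
  replicas: 1                 # assumed; the dashboard is not autoscaled here
  selector:
    matchLabels:
      app: aigw-admin
  template:
    metadata:
      labels:
        app: aigw-admin
    spec:
      containers:
        - name: admin
          image: ai-gateway/admin:latest   # assumed image name
          ports:
            - containerPort: 8080          # assumed port
          resources:
            # Values taken from the Admin row of the resource recommendations.
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 256Mi
```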

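Service and Ingress

The Service exposes the server pods inside the cluster on port 80, and the Ingress routes external traffic for gateway.your-domain.com to that Service: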

apiVersion: v1
kind: Service
metadata:
  name: aigw-server
  namespace: ai-gateway
spec:
  selector:
    app: aigw-server
  ports:
    - port: 80
      targetPort: 3000
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aigw-ingress
  namespace: ai-gateway
spec:
  rules:
    - host: gateway.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: aigw-server
                port:
                  number: 80
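
The Ingress above serves plain HTTP and does not name an ingress class. A production variant would usually add both. The sketch below is one way to do that, assuming an NGINX ingress controller registered under the class name nginx and a TLS Secret named aigw-gateway-tls (type kubernetes.io/tls) that already exists in the namespace. Both names are hypothetical.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: aigw-ingress
  namespace: ai-gateway
spec:
  ingressClassName: nginx          # assumed controller class
  tls:
    - hosts:
        - gateway.your-domain.com
      secretName: aigw-gateway-tls # hypothetical TLS Secret holding tls.crt and tls.key
  rules:
    - host: gateway.your-domain.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: aigw-server
                port:
                  number: 80
```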

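Horizontal Pod Autoscaler

The HPA scales the server Deployment between 2 and 10 replicas, targeting 70% average CPU utilization: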

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aigw-server-hpa
  namespace: ai-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aigw-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
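
Utilization targets are measured against each pod's CPU request, so with the 500m request on the server container a 70% target corresponds to roughly 350m of average CPU per pod. The HPA also needs a resource metrics source such as metrics-server in the cluster. The variant below is a sketch that adds an explicit scale-down policy to reduce replica flapping; the window and policy values are illustrative assumptions, not tuned recommendations.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: aigw-server-hpa
  namespace: ai-gateway
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: aigw-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # 70% of the 500m request, about 350m per pod
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # act on the highest recommendation from the last 5 minutes
      policies:
        - type: Pods
          value: 1                     # remove at most one pod
          periodSeconds: 60            # per minute
```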

Resource Recommendations

Component	CPU Request	CPU Limit	Memory Request	Memory Limit
Server	500m	2	512Mi	1Gi
Worker	250m	1	256Mi	512Mi
Admin	100m	500m	128Mi	256Mi

Infrastructure Dependencies

Deploy MongoDB and Redis separately from the gateway, using an operator, a Helm chart, or a managed service:

MongoDB: Use the MongoDB Community Operator or a managed service (Atlas, DocumentDB). A replica set is required; the example MONGODB_URI in the Secret expects one named rs0.
Redis: Use the Bitnami Redis Helm chart or a managed service (ElastiCache, Memorystore).
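
For the Bitnami Redis chart, a values file along the following lines would reuse the password already stored in aigw-secrets. This is a sketch that assumes the chart is installed into the ai-gateway namespace under the release name redis, in which case the master Service is normally named redis-master and matches REDIS_HOST in the ConfigMap. Verify the value names against the version of the chart you install.

```yaml
# values-redis.yaml (hypothetical file name)
architecture: replication        # one master plus replicas
auth:
  enabled: true
  existingSecret: aigw-secrets   # reuse the gateway Secret; must be in the release namespace
  existingSecretPasswordKey: REDIS_PASSWORD
```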