Configure Guards

Guards are a pipeline of checks that run on every request (and optionally every response) before the request reaches a provider. This tutorial walks you through configuring a PII detection guard.

Prerequisites

A running Gatewyse instance with at least one provider configured
Admin dashboard access

Guard Types

The gateway supports seven guard types:

Type	Purpose
`pii-detector`	Detects emails, phone numbers, SSNs, credit cards, IP addresses
`injection-detector`	Scans for prompt injection patterns (e.g., “ignore previous instructions”)
`content-filter`	Blocks requests containing configurable keywords
`toxicity-scorer`	Heuristic toxicity detection based on pattern matching
`token-limit`	Blocks requests exceeding an estimated token count
`cost-limit`	Blocks requests exceeding an estimated cost threshold
`custom-rules`	User-defined regex rules with named patterns

Step 1 — Navigate to the Guards Page

In the sidebar, click Guards.
The guards page shows the current guard configuration for your tenant.
Click Create Guard Config if no configuration exists, or Edit to modify an existing one.

Step 2 — Add a PII Detection Guard

Click Add Guard within the configuration.
Set Type to pii-detector.
Set Level to one of:
- block — Reject the request entirely and return an error.
- warn — Allow the request but log a warning. The warning appears in audit logs.
- monitor — Log silently for analytics. No user-visible effect.
- off — Disable this guard.
Set Apply To:
- request — Only scan incoming prompts.
- response — Only scan provider responses.
- both — Scan in both directions.
Set Priority — Lower numbers run first. If you have multiple guards, set PII detection to 10 so it runs early.

Step 3 — Configure PII Sensitivity

The PII detector scans for five pattern types by default:

Email addresses — Standard email format
Phone numbers — US phone formats with optional country code
Social Security Numbers — XXX-XX-XXXX pattern
Credit card numbers — 16-digit patterns with optional separators
IP addresses — IPv4 dotted-decimal format

Confidence is calculated as min(detectionCount / 3, 1). A single PII match triggers the configured action.

Step 4 — Add an Injection Detection Guard

Click Add Guard again.
Set Type to injection-detector.
Set Level to block.
Set Apply To to request.
Set Priority to 5 (runs before PII detection).

The injection detector checks for patterns such as:

“ignore all previous instructions”
“you are now a…”
“new system instructions:”
“override system/safety”
INST/im_start tokens
DAN mode / developer mode references

A confidence score is computed as min(matchedPatterns / 3, 1). The default threshold is 0.7.

Step 5 — Save and Test

Click Save to persist the guard configuration.
Test with a request containing PII:

curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "My SSN is 123-45-6789"}]
  }'

With the PII guard set to block, the response will be a 403 error:

{
  "error": {
    "message": "PII detected: ssn",
    "type": "guard_blocked"
  }
}

With the guard set to warn, the request proceeds but a warning is logged in the audit trail.

How the Guard Pipeline Works

Load config — The PromptGuardService loads the tenant’s guard configuration from Redis cache (10-minute TTL) or falls back to MongoDB.
Filter — Guards are filtered by direction (request or response) and removed if level is off.
Sort — Guards execute in priority order (ascending).
Execute — Each guard runs against the request content and produces a GuardResult with an action (pass, warn, monitor, or block) and confidence score.
Short-circuit — If any guard returns block, the pipeline stops immediately and returns the blocked response.
Content modification — Some guards can modify content (e.g., redacting PII). Modified content flows to subsequent guards.

Next Steps

Add a content-filter guard with custom blocked keywords for your organization’s compliance requirements
Configure custom-rules with regex patterns specific to your data governance policies
Review guard activity in the Audit Logs section of the dashboard