Budget Management
Budgets let you set spending limits and usage caps at multiple levels of your organization. When a budget is exceeded, the gateway can block further requests or issue warnings. This tutorial walks through creating and testing a budget.
Prerequisites
- A running Gatewyse instance with at least one provider and routing rule configured
- Admin dashboard access
Step 1 — Navigate to the Budgets Page
- In the sidebar, click Budgets.
- The budgets list shows all active budget configurations for your tenant.
- Click Create Budget.
Step 2 — Set the Budget Scope
Budgets operate in a hierarchy. Each request is checked against all applicable budgets from the most specific scope upward:
| Scope | Description |
|---|---|
| `user` | Limits spending for a single user |
| `department` | Limits spending for a department |
| `organization` | Limits spending for an organization |
| `tenant` | Limits spending for the entire tenant |
- Select a Scope, for example `department`.
- Select the Scope ID: the specific department this budget applies to.
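The scope hierarchy can be pictured as an ordered walk from the most specific scope to the tenant. The following is a minimal sketch, not the gateway's implementation: `RequestContext` and `applicable_scopes` are hypothetical names, and only the four scope names are taken from the table above.

```python
from dataclasses import dataclass
from typing import Optional

# Scope names come from the table above, ordered most specific first.
SCOPE_ORDER = ["user", "department", "organization", "tenant"]


@dataclass(frozen=True)
class RequestContext:
    """Hypothetical identifiers attached to an incoming request."""
    tenant: str
    organization: Optional[str] = None
    department: Optional[str] = None
    user: Optional[str] = None


def applicable_scopes(ctx: RequestContext) -> list[tuple[str, str]]:
    """Return (scope, scope_id) pairs to check, most specific first."""
    pairs = []
    for scope in SCOPE_ORDER:
        scope_id = getattr(ctx, scope)
        if scope_id is not None:
            pairs.append((scope, scope_id))
    return pairs


ctx = RequestContext(tenant="acme", organization="emea",
                     department="research", user="u-42")
for scope, scope_id in applicable_scopes(ctx):
    print(scope, scope_id)
```

A request that carries only a tenant identifier would be checked against tenant budgets alone, since the other scopes have nothing to match.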
Step 3 — Configure Limits
Set one or more spending limits. All limits are optional; configure only the ones you need.
| Limit Type | Description |
|---|---|
| `dailyUsd` | Maximum spend in USD per day |
| `weeklyUsd` | Maximum spend in USD per week |
| `monthlyUsd` | Maximum spend in USD per month |
| `dailyTokens` | Maximum tokens consumed per day |
| `monthlyTokens` | Maximum tokens consumed per month |
| `dailyRequests` | Maximum number of requests per day |
Example configuration for a department:
- Daily USD: `50.00`
- Monthly USD: `1000.00`
- Daily Requests: `5000`
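Because every limit is optional, a budget can be modeled as a record in which unset limits are simply absent. The sketch below assumes a hypothetical `BudgetLimits` class; the field names match the limit types in the table, but how the gateway actually stores them is not specified here.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class BudgetLimits:
    """Limit types from the table above; None means 'not configured'."""
    dailyUsd: Optional[float] = None
    weeklyUsd: Optional[float] = None
    monthlyUsd: Optional[float] = None
    dailyTokens: Optional[int] = None
    monthlyTokens: Optional[int] = None
    dailyRequests: Optional[int] = None

    def configured(self) -> dict:
        """Return only the limits that were actually set."""
        return {name: value for name, value in asdict(self).items()
                if value is not None}


# The department example above: three of the six limits are set.
limits = BudgetLimits(dailyUsd=50.00, monthlyUsd=1000.00, dailyRequests=5000)
print(limits.configured())
```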
Step 4 — Set the Enforcement Action
Choose what happens when a budget limit is exceeded:
- Block — Reject the request with a budget-exceeded error. The API returns a clear error message identifying which budget was exceeded.
- Warn — Allow the request but include a warning in the response metadata and log the event.
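The difference between the two actions can be sketched as a small decision function. This is an illustration under assumptions: `enforce`, the `allowed` flag, and the `warnings` list are hypothetical names, while the blocked result reuses the error body shown in Step 7.

```python
from typing import Optional


def enforce(action: str, exceeded: Optional[dict]) -> dict:
    """Decide the outcome of a request given the exceeded limit, if any.

    `exceeded` is a hypothetical dict with scope, limitType, limit, current.
    """
    if exceeded is None:
        return {"allowed": True}
    if action == "block":
        # Reject and report which budget was exceeded.
        return {"allowed": False,
                "error": {"message": "Budget exceeded", "details": exceeded}}
    if action == "warn":
        # Let the request through, but attach a warning for the caller.
        warning = f"{exceeded['scope']} budget exceeded for {exceeded['limitType']}"
        return {"allowed": True, "warnings": [warning]}
    raise ValueError(f"unknown enforcement action: {action!r}")


over = {"scope": "department", "limitType": "dailyUsd",
        "limit": 50, "current": 50.12}
print(enforce("block", over))
print(enforce("warn", over))
```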
Step 5 — Configure Alert Thresholds (Optional)
Alerts notify you when usage approaches a budget limit, before it is actually exceeded.
- Click Add Alert.
- Set Threshold as a percentage, for example `80`.
- When usage reaches 80% of any limit, the alert fires.
- You can add multiple thresholds (e.g., 50%, 80%, 95%).
Alert thresholds produce warnings in the budget check response. They reset when the budget period resets.
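As a rough illustration of the threshold check, the sketch below reports which percentages have been reached for one limit and which of those have not fired yet in the current period. The function names and the `already_fired` set are assumptions made for this example.

```python
def crossed_thresholds(current: float, limit: float, thresholds: list[int]) -> list[int]:
    """Return the alert thresholds (in percent) that current usage has reached."""
    if limit <= 0:
        return []
    # Compare current/limit >= t/100 without dividing, to avoid rounding noise.
    return [t for t in sorted(thresholds) if current * 100 >= t * limit]


def newly_crossed(current: float, limit: float,
                  thresholds: list[int], already_fired: set[int]) -> list[int]:
    """Return crossed thresholds that have not fired yet in this period."""
    return [t for t in crossed_thresholds(current, limit, thresholds)
            if t not in already_fired]


# 41 USD used against a 50 USD daily limit is 82% of the limit.
print(crossed_thresholds(41.0, 50.0, [50, 80, 95]))   # [50, 80]
print(newly_crossed(41.0, 50.0, [50, 80, 95], {50}))  # [80]
```

Clearing `already_fired` when the budget period resets would let the same thresholds fire again in the next period.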
Step 6 — Save and Enable
- Toggle Enabled to on.
- Click Save.
Step 7 — Test Budget Enforcement
Send requests until usage reaches the budget limit:

    curl -X POST http://localhost:3000/v1/chat/completions \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hello"}]
      }'

When the budget is exceeded with the action set to block, the response will be:

    {
      "error": {
        "message": "Budget exceeded",
        "details": {
          "scope": "department",
          "limitType": "dailyUsd",
          "limit": 50,
          "current": 50.12
        }
      }
    }

How Budget Tracking Works
The BudgetService performs the following on each request:
- Check: Queries all enabled budgets matching the request's tenant, user, org, and department scopes. Results are cached in Redis for 5 minutes with a jittered TTL.
- Evaluate: For each budget, compares `currentUsage` against `limits` for every limit type (daily/weekly/monthly USD, tokens, requests).
- Alert: If the usage percentage crosses an alert threshold, a warning is included in the result.
- Enforce: If any limit is exceeded and the action is `block`, the request is rejected.
- Update: After a successful request, `updateUsage` increments all applicable counters. Costs are rounded to micro-dollar precision (6 decimal places) to prevent floating-point accumulation errors.
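Two details from the list above are easy to see in a few lines: rounding counters to six decimal places, and spreading cache expiry with jitter. The sketch below is illustrative only. `add_cost` and `jittered_ttl` are hypothetical helpers, and the 10% jitter spread is an assumed value, since the actual spread is not stated.

```python
import random

MICRO_DOLLAR_PLACES = 6   # micro-dollar precision, as described above
CACHE_TTL_SECONDS = 300   # 5 minutes, as described above


def add_cost(current_usd: float, cost_usd: float) -> float:
    """Increment a USD counter and round the result to micro-dollars."""
    return round(current_usd + cost_usd, MICRO_DOLLAR_PLACES)


def jittered_ttl(base: int = CACHE_TTL_SECONDS, spread: float = 0.1) -> int:
    """Return the base TTL plus or minus up to `spread` (10% is an assumption)."""
    delta = base * spread
    return int(base + random.uniform(-delta, delta))


naive = 0.0
rounded = 0.0
for _ in range(10):
    naive += 0.1                       # plain float accumulation drifts
    rounded = add_cost(rounded, 0.1)   # rounding after each update does not

print(naive)    # 0.9999999999999999
print(rounded)  # 1.0
print(jittered_ttl())  # somewhere between 270 and 330
```

Jitter keeps many cached budget lookups from expiring at the same instant, which would otherwise send a burst of queries to the database at once.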
Budget Reset Schedule
Counters reset automatically on a schedule managed by the worker service:
- Daily counters reset at midnight UTC (dailyUsd, dailyTokens, dailyRequests)
- Weekly counters reset on Monday at midnight UTC (weeklyUsd)
- Monthly counters reset on the 1st at midnight UTC (monthlyUsd, monthlyTokens)
Alert-triggered flags also reset when the period resets, so alerts can fire again in the new period.
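To make the schedule concrete, the sketch below computes the next reset instant for each period from a given UTC time. `next_reset` is a hypothetical helper written for this tutorial, not the worker service's code, and it expects a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone


def next_reset(period: str, now: datetime) -> datetime:
    """Return the next reset instant in UTC for 'daily', 'weekly' or 'monthly'."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight + timedelta(days=1)
    if period == "weekly":
        # weekday() is 0 for Monday, so this lands on the next Monday.
        return midnight + timedelta(days=7 - now.weekday())
    if period == "monthly":
        if now.month == 12:
            year, month = now.year + 1, 1
        else:
            year, month = now.year, now.month + 1
        return midnight.replace(year=year, month=month, day=1)
    raise ValueError(f"unknown period: {period!r}")


now = datetime(2024, 12, 31, 15, 30, tzinfo=timezone.utc)  # a Tuesday
print(next_reset("daily", now))    # 2025-01-01 00:00:00+00:00
print(next_reset("weekly", now))   # 2025-01-06 00:00:00+00:00
print(next_reset("monthly", now))  # 2025-01-01 00:00:00+00:00
```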
Next Steps
- Create budgets at multiple scopes (user + department + tenant) for layered cost control
- Monitor usage trends on the Usage page in the dashboard
- Set up Guards for additional request-level protections