Budgets
Budgets let you set spending and usage limits at multiple levels of your organization hierarchy. When a budget is exceeded, the gateway automatically blocks further requests for that scope until the period resets.
Budget List
The Budgets page displays a paginated table with these columns:
| Column | Description |
|---|---|
| Scope | The scope type and name (e.g. “tenant / Acme Corp”) |
| Limits | Cost, request, and token limits for the period |
| Current Usage | A progress bar showing consumption as a percentage of the cost limit |
| Status | active, warning, exceeded, or inactive |
Creating a Budget
Click Create Budget to open the form, which contains the fields described below.
Scope Type
Budgets can be scoped to any level in the multi-tenancy hierarchy:
| Scope Type | Description |
|---|---|
| Tenant | Applies to an entire tenant and all its users |
| Organization | Applies to an organization within a tenant |
| Department | Applies to a department within an organization |
| User | Applies to a single user |
| API Key | Applies to a specific API key |
Period
Choose how frequently the budget resets:
| Period | Reset Cadence |
|---|---|
| Hourly | Every hour on the hour |
| Daily | Midnight UTC each day |
| Weekly | Midnight UTC each Monday |
| Monthly | First of each month at midnight UTC |
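The reset cadences above can be sketched as a small date calculation. This is an illustrative sketch only: the function name `next_reset` and the lowercase period strings are assumptions made for the example, not part of the gateway's API.

```python
from datetime import datetime, timedelta, timezone


def next_reset(period: str, now: datetime) -> datetime:
    """Return the next reset boundary (in UTC) strictly after `now`.

    `now` must be timezone-aware. The period names are hypothetical
    lowercase versions of the options in the table above.
    """
    now = now.astimezone(timezone.utc)
    if period == "hourly":
        # Top of the next hour.
        return now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)

    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if period == "daily":
        return midnight + timedelta(days=1)
    if period == "weekly":
        # weekday() is 0 on Monday, so this adds 7 days on a Monday
        # and 1 day on a Sunday.
        return midnight + timedelta(days=7 - now.weekday())
    if period == "monthly":
        first = midnight.replace(day=1)
        # Roll over to January of the next year after December.
        if first.month == 12:
            return first.replace(year=first.year + 1, month=1)
        return first.replace(month=first.month + 1)
    raise ValueError(f"unknown period: {period!r}")
```

For example, calling `next_reset("weekly", ...)` with a Wednesday timestamp returns midnight UTC on the following Monday.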
Limits
A budget can combine any of the following limits. Each one is optional, so set only those relevant to your use case:
- Cost Limit ($) — Maximum dollar spend for the period
- Request Limit — Maximum number of requests
- Token Limit — Maximum total tokens (input + output combined)
Alert Threshold
The Alert Threshold slider (range: 50% to 100%, default: 80%) determines when the budget transitions to warning status. When usage reaches this percentage of any limit, the gateway marks the budget as warning and can trigger notification alerts.
Budget Statuses
| Status | Meaning |
|---|---|
| active | Budget is tracking usage; limits not yet reached |
| warning | Usage has crossed the alert threshold |
| exceeded | One or more limits have been reached; requests are blocked |
| inactive | Budget is disabled and not enforcing limits |
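To make the relationship between limits, the alert threshold, and the four statuses concrete, here is a minimal sketch. The function `budget_status` and the dictionary keys (`cost`, `requests`, `tokens`) are hypothetical names chosen for illustration. The gateway's internal logic is not documented here, and treating an unset limit as "ignored" is an assumption.

```python
def budget_status(usage, limits, alert_threshold=0.80, enabled=True):
    """Derive a status string from usage and limits.

    `usage` and `limits` map a limit name (e.g. "cost", "requests",
    "tokens") to a number. A limit that is missing, None, or not
    positive is treated as unset and ignored (an assumption).
    """
    if not enabled:
        return "inactive"

    # Fraction of each configured limit that has been consumed.
    ratios = [
        usage.get(name, 0) / limit
        for name, limit in limits.items()
        if limit is not None and limit > 0
    ]
    peak = max(ratios, default=0.0)

    if peak >= 1.0:
        return "exceeded"  # one or more limits reached
    if peak >= alert_threshold:
        return "warning"   # some limit crossed the alert threshold
    return "active"
```

With a $500 cost limit and the default 80% threshold, this sketch reports `warning` once spend reaches $400 and `exceeded` at $500. Because the highest ratio across all limits decides the status, a budget can be `exceeded` on tokens while its cost is still low.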
Automatic Enforcement
When a budget enters the exceeded state, the gateway returns a 429 Too Many Requests response for any further requests from that scope. The response includes a Retry-After header indicating when the budget period resets.
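A client can use the `Retry-After` header to decide how long to wait before retrying. The sketch below is client-side code, not part of the gateway. The function name `retry_delay_seconds` is hypothetical, and because this page does not state which `Retry-After` form the gateway sends, the sketch accepts both forms the HTTP specification allows: a number of seconds or an HTTP date.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], now: datetime
) -> Optional[float]:
    """Return seconds to wait after a 429, or None if it cannot be determined.

    `now` must be timezone-aware.
    """
    if status != 429:
        return None

    # Header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        return None
    value = value.strip()

    # Form 1: delay in whole seconds, e.g. "120".
    if value.isascii() and value.isdigit():
        return float(value)

    # Form 2: an HTTP date, e.g. "Thu, 16 May 2024 00:00:00 GMT".
    try:
        reset_at = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if reset_at.tzinfo is None:
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    return max(0.0, (reset_at - now).total_seconds())
```

Since a blocked budget stays blocked until the period resets, waiting for the returned delay is generally more useful than retrying with a short exponential backoff.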
Enforcement is hierarchical — if a tenant budget is exceeded, all users, departments, and organizations within that tenant are blocked, regardless of their individual budget status.
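The hierarchical check can be pictured as walking a request's scope chain from the outermost scope inward and stopping at the first exceeded budget. This is a sketch under assumptions: the function name `blocking_scope` and the `"type/name"` scope keys are made up for illustration, and applying the same rule at every level (not only the tenant level described above) is an assumption.

```python
from typing import Mapping, Optional, Sequence


def blocking_scope(
    scope_chain: Sequence[str], statuses: Mapping[str, str]
) -> Optional[str]:
    """Return the outermost scope whose budget is exceeded, or None.

    `scope_chain` lists the scopes a request belongs to, outermost
    first (tenant, organization, department, user, API key).
    `statuses` maps a scope key to its budget status; scopes with no
    budget are simply absent from the mapping.
    """
    for scope in scope_chain:
        if statuses.get(scope) == "exceeded":
            return scope
    return None


# Hypothetical request: the user's own budget is healthy, but the
# tenant budget is exceeded, so the request is blocked.
chain = ["tenant/acme", "organization/emea", "department/research", "user/dana"]
statuses = {"tenant/acme": "exceeded", "user/dana": "active"}
blocked_by = blocking_scope(chain, statuses)
```

In this example `blocked_by` is `"tenant/acme"`, which reflects the rule that an exceeded tenant budget overrides the status of budgets beneath it.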
Editing and Deleting Budgets
- Click Edit to modify limits, period, scope, or alert threshold.
- Click Delete to remove a budget. A confirmation dialog warns that usage tracking for that scope will stop. Deletion does not affect historical usage data.
Best Practices
- Set tenant-level budgets as overall spending caps, then use department or user budgets for finer-grained control.
- Use the hourly period for rate-sensitive workloads that could spike unexpectedly.
- Keep alert thresholds at or near the default of 80% so teams have time to react before hitting hard limits.
- Combine cost limits with token limits for defense in depth — a low-cost model can still consume excessive tokens.