Spending Controls#
Spending controls enforce AI budgets at execution time. Set limits at the organization, team, or individual agent level — agents are stopped when limits are reached.
How it works#
Spending limits apply to managed agents using the LLM proxy. Before every model request:
- The LLM proxy checks the agent's spending against its applicable limits
- If the budget is exceeded, the request is rejected with a budget-exceeded notification
- The agent receives the notification and stops gracefully
- If the budget service is temporarily unavailable, the request is allowed (fail-open for availability)
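The per-request flow above can be sketched as follows. This is an illustrative outline, not the proxy's actual implementation; `BudgetStatus` and `lookup_spend` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class BudgetStatus:
    exceeded: bool
    scope: str  # "agent", "team", or "organization"

def check_before_request(agent_id: str, lookup_spend) -> bool:
    """Return True if the model request may proceed.

    `lookup_spend` stands in for the budget service call: it returns the
    agent's current BudgetStatus, and may raise if the service is down.
    """
    try:
        status = lookup_spend(agent_id)
    except ConnectionError:
        # Fail-open: if the budget service is unavailable,
        # allow the request rather than stall the agent.
        return True
    # Reject when any applicable limit is exhausted.
    return not status.exceeded
```

The fail-open branch mirrors the availability trade-off described above: a budget-service outage never blocks agents, at the cost of briefly unenforced limits.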
Setting limits#
Limits can be set at three levels:
| Level | Scope | Example |
|---|---|---|
| Organization | Total AI spending across all teams and agents | $5,000/month |
| Team | Spending for a specific team's agents | $1,000/month |
| Agent | Spending for a single agent | $100/month |
Limits stack — an agent is bound by its own limit, its team's limit, and the org's limit. The most restrictive limit applies.
Example: Organization limit is $5,000/month. Team limit is $1,000/month. Agent limit is $100/month.
- Agent hits $100 → this agent is stopped, other team agents continue
- Team hits $1,000 → all agents in this team are stopped, other teams continue
- Organization hits $5,000 → all agents in the organization are stopped
Spending controls track LLM costs only (input tokens, output tokens, cache tokens as reported by model providers). Agent-hour platform costs and infrastructure costs are not included in spending limit calculations.
Periods#
| Period | Resets |
|---|---|
| Daily | Every 24 hours |
| Weekly | Every 7 days |
| Monthly | Every calendar month |
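The reset schedule in the table can be computed like this. An illustrative sketch: it approximates the daily and weekly periods at date granularity, and the monthly period resets on the first day of the next calendar month, per the table above.

```python
from datetime import date, timedelta

def next_reset(period: str, today: date) -> date:
    """Date on which a spending-limit period next resets (illustrative)."""
    if period == "daily":
        return today + timedelta(days=1)
    if period == "weekly":
        return today + timedelta(days=7)
    if period == "monthly":
        # First day of the next calendar month
        if today.month == 12:
            return date(today.year + 1, 1, 1)
        return date(today.year, today.month + 1, 1)
    raise ValueError(f"unknown period: {period}")
```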
Configuration#
Spending limits can be set at any of three scopes:
- Organization level — caps total spend across the entire organization.
- Team level — caps spend for a specific team, within the organization's overall cap.
- Agent level — caps spend for a single agent, within the team's cap.
Limits at more specific scopes always bind more tightly than broader ones — an agent's limit cannot exceed its team's, and a team's cannot exceed its organization's.
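The nesting constraint above (narrower scopes may not exceed broader ones) suggests a validation check at configuration time. A hypothetical sketch, not the product's actual validation code:

```python
def validate_limits(agent_limit: float, team_limit: float, org_limit: float) -> None:
    """Reject configurations where a narrower scope's limit exceeds its parent's.

    Assumes all three limits share the same period (e.g. monthly).
    """
    if agent_limit > team_limit:
        raise ValueError("agent limit cannot exceed its team's limit")
    if team_limit > org_limit:
        raise ValueError("team limit cannot exceed the organization's limit")

validate_limits(100, 1000, 5000)   # valid: $100 <= $1,000 <= $5,000
```

Validating at configuration time keeps the runtime rule simple: if limits are properly nested, the narrowest exhausted scope is always the one that binds.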
Cost tracking#
The LLM proxy automatically extracts usage data from every model request:
- Input tokens — tokens sent to the model
- Output tokens — tokens generated by the model
- Cache tokens — read and write cache hits (for prompt caching)
- Model — which model was used
- Provider — which provider processed the request
- Cost — calculated cost per request
This data feeds into the Cost Explorer for visualization and the spending limit checks for enforcement.
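Per-request cost follows from the extracted token counts and the provider's per-token rates. A minimal sketch; the price figures below are placeholders, not real provider rates, and the usage keys are illustrative.

```python
def request_cost(usage: dict, prices_per_million: dict) -> float:
    """Dollar cost of one request from token counts.

    `prices_per_million` maps token kind -> USD per million tokens.
    """
    return sum(
        usage.get(kind, 0) * prices_per_million[kind] / 1_000_000
        for kind in ("input", "output", "cache_read", "cache_write")
    )

# Placeholder rates (USD per million tokens), not any provider's real pricing:
prices = {"input": 3.0, "output": 15.0, "cache_read": 0.3, "cache_write": 3.75}
usage = {"input": 12_000, "output": 800, "cache_read": 50_000}

request_cost(usage, prices)  # -> 0.063
```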
What happens when a limit is reached#
When an agent's spending exceeds its limit:
- The next LLM request is rejected with an HTTP 429 response carrying a Retry-After header and an RFC 7807 budget-exceeded error body
- The agent is notified with a budget-exceeded message
- Heartbeat checks skip gracefully (not counted as errors)
- Scheduled tasks notify the user of the budget situation
The agent does not crash or lose state. It pauses LLM-dependent operations until the budget resets or the limit is increased.
For integration partners: the error response follows RFC 7807 Problem Details format with type budget-exceeded, including the limit period and reset time.
Cost visibility#
Monitor spending through:
- Cost Explorer — usage summaries, time-series breakdowns, drill-down by team/agent/model
- Agent detail page — spending for a specific agent
- Spending limit status — agents can check their own budget via tools
Scope#
Spending controls currently apply to managed agents using the LLM proxy. Desktop AI tools and external agents use their own model providers — their LLM costs are billed directly by the provider, not through Lens Agents.
Agent-hour billing applies to all agent types. Spending controls govern the LLM costs on top of agent-hour billing.
Related#
- Cost Explorer — visualize spending
- What Is an Agent-Hour? — billing unit measurement
- Policies — access control
- Audit trail — track what agents spend