Spending Controls#

Spending controls enforce AI budgets at execution time. Set limits at the organization, team, or individual agent level — agents are stopped when limits are reached.


How it works#

Spending limits apply to managed agents using the LLM proxy. Before every model request:

  1. The LLM proxy checks the agent's spending against its applicable limits
  2. If the budget is exceeded, the request is rejected with a budget-exceeded notification
  3. The agent receives the notification and stops gracefully
  4. If the budget service is temporarily unavailable, the request is allowed (fail-open for availability)
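The per-request flow above can be sketched as a proxy-side gate. This is an illustrative sketch only; the names (`check_budget`, `BudgetExceeded`, `BudgetServiceUnavailable`, `handle_request`) are hypothetical and not part of the Lens Agents API.

```python
# Hypothetical sketch of the LLM proxy's per-request budget check.

class BudgetExceeded(Exception):
    """Raised when the agent's spend has reached an applicable limit."""


class BudgetServiceUnavailable(Exception):
    """Raised when the budget service cannot be reached."""


def check_budget(spend_so_far: float, limit: float) -> None:
    """Step 1: compare the agent's spend against its applicable limit."""
    if spend_so_far >= limit:
        raise BudgetExceeded(f"spend {spend_so_far:.2f} >= limit {limit:.2f}")


def handle_request(spend_so_far: float, limit: float, forward):
    """Gate a model request: reject over-budget agents, fail open on outages."""
    try:
        check_budget(spend_so_far, limit)
    except BudgetExceeded:
        # Steps 2-3: reject with a budget-exceeded notification;
        # the agent sees this and stops gracefully.
        return {"status": 429, "type": "budget-exceeded"}
    except BudgetServiceUnavailable:
        # Step 4: fail open for availability when the budget service is down.
        pass
    return forward()
```

The fail-open branch is the key availability trade-off: a budget-service outage never blocks agents, at the cost of briefly unenforced limits.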

Setting limits#

Limits can be set at three levels:

| Level | Scope | Example |
| --- | --- | --- |
| Organization | Total AI spending across all teams and agents | $5,000/month |
| Team | Spending for a specific team's agents | $1,000/month |
| Agent | Spending for a single agent | $100/month |

Limits stack — an agent is bound by its own limit, its team's limit, and the org's limit. The most restrictive limit applies.

Example: Organization limit is $5,000/month. Team limit is $1,000/month. Agent limit is $100/month.

  • Agent hits $100 → this agent is stopped, other team agents continue
  • Team hits $1,000 → all agents in this team are stopped, other teams continue
  • Organization hits $5,000 → all agents in the organization are stopped
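The stacking behavior can be expressed as a check over every applicable scope, from most to least specific. A minimal sketch (the function name and the scope/limit dictionaries are hypothetical, not a real API):

```python
# Hypothetical sketch of stacked limit enforcement: an agent is stopped
# when ANY applicable scope (agent, team, organization) hits its limit.

def first_exceeded_scope(spend: dict, limits: dict):
    """Return the most specific scope whose limit is reached, or None.

    `spend` and `limits` map scope name -> dollars, e.g.
    {"agent": 40, "team": 400, "organization": 900}.
    """
    for scope in ("agent", "team", "organization"):
        if spend[scope] >= limits[scope]:
            return scope
    return None
```

With the example limits above, an agent at $100 trips the `"agent"` scope even though its team and organization are under budget, matching the bullet list: the most restrictive applicable limit wins.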

Spending controls track LLM costs only (input tokens, output tokens, cache tokens as reported by model providers). Agent-hour platform costs and infrastructure costs are not included in spending limit calculations.

Periods#

| Period | Resets |
| --- | --- |
| Daily | Every 24 hours |
| Weekly | Every 7 days |
| Monthly | Every calendar month |
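The reset boundaries can be computed as follows. This is a sketch under stated assumptions: daily and weekly windows are assumed to roll a fixed interval after the window start, and monthly windows are assumed to reset on the first day of the next calendar month.

```python
# Hypothetical sketch of period reset calculation.

from datetime import datetime, timedelta


def next_reset(window_start: datetime, period: str) -> datetime:
    """Return when the current budget window resets."""
    if period == "daily":
        return window_start + timedelta(hours=24)
    if period == "weekly":
        return window_start + timedelta(days=7)
    if period == "monthly":
        # First day of the month after window_start.
        year, month = window_start.year, window_start.month
        if month == 12:
            return datetime(year + 1, 1, 1)
        return datetime(year, month + 1, 1)
    raise ValueError(f"unknown period: {period}")
```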

Configuration#

Spending limits can be set at any of three scopes:

  • Organization level — caps total spend across the entire organization.
  • Team level — caps spend for a specific team, within the organization's overall cap.
  • Agent level — caps spend for a single agent, within the team's cap.

Limits at more specific scopes always bind more tightly than broader ones — an agent's limit cannot exceed its team's, and a team's cannot exceed its organization's.


Cost tracking#

The LLM proxy automatically extracts usage data from every model request:

  • Input tokens — tokens sent to the model
  • Output tokens — tokens generated by the model
  • Cache tokens — read and write cache hits (for prompt caching)
  • Model — which model was used
  • Provider — which provider processed the request
  • Cost — calculated cost per request

This data feeds into the Cost Explorer for visualization and the spending limit checks for enforcement.
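A per-request usage record mirroring the fields above might look like the following sketch. The record shape, function name, and the per-million-token prices are illustrative assumptions, not real provider rates or a real schema.

```python
# Hypothetical per-request usage record and cost calculation.

from dataclasses import dataclass


@dataclass
class UsageRecord:
    model: str
    provider: str
    input_tokens: int
    output_tokens: int
    cache_read_tokens: int


def request_cost(rec: UsageRecord, prices: dict) -> float:
    """Dollar cost of one request from per-million-token prices.

    `prices` maps model name -> {"input", "output", "cache_read"} rates
    in dollars per million tokens (placeholder values, not real rates).
    """
    p = prices[rec.model]
    return (
        rec.input_tokens * p["input"]
        + rec.output_tokens * p["output"]
        + rec.cache_read_tokens * p["cache_read"]
    ) / 1_000_000
```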


What happens when a limit is reached#

When an agent's spending exceeds its limit:

  1. The next LLM request is rejected with an HTTP 429 response carrying a Retry-After header and an RFC 7807 budget-exceeded error body
  2. The agent is notified with a budget-exceeded message
  3. Heartbeat checks are skipped gracefully (not counted as errors)
  4. Scheduled tasks notify the user of the budget situation

The agent does not crash or lose state. It pauses LLM-dependent operations until the budget resets or the limit is increased.

For integration partners: the error response follows RFC 7807 Problem Details format with type budget-exceeded, including the limit period and reset time.
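An integration might detect this case as sketched below. The RFC 7807 member names (`type`, `title`, `detail`) come from the standard; the exact `type` URI and the extension members carrying the limit period and reset time are assumptions about this API, not confirmed field names.

```python
# Hypothetical sketch of client-side detection of the budget-exceeded error.

def is_budget_exceeded(status: int, problem: dict) -> bool:
    """True when an HTTP 429 carries an RFC 7807 body of type budget-exceeded.

    `problem` is the parsed application/problem+json body; the `type`
    member may be a full URI, so we match on its final segment.
    """
    return status == 429 and problem.get("type", "").endswith("budget-exceeded")
```

Because the agent pauses rather than crashes, a well-behaved client should honor the Retry-After header instead of retrying immediately.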


Cost visibility#

Monitor spending through:

  • Cost Explorer — usage summaries, time-series breakdowns, drill-down by team/agent/model
  • Agent detail page — spending for a specific agent
  • Spending limit status — agents can check their own budget via tools

Scope#

Spending controls currently apply to managed agents using the LLM proxy. Desktop AI tools and external agents use their own model providers — their LLM costs are billed directly by the provider, not through Lens Agents.

Agent-hour billing applies to all agent types. Spending controls govern LLM costs, which are charged in addition to agent-hour billing.