Supported Models#

Lens Agents is designed to be model-agnostic. The platform provides tools, connectivity, and governance; the model provides intelligence. The set of integrated providers expands as customer engagements require.


Currently integrated providers#

As of the current release:

| Provider | Models | How integration works |
| --- | --- | --- |
| Anthropic | Claude Opus, Sonnet, Haiku | Direct Anthropic API |
| AWS Bedrock | Claude, Llama, Mistral, and other Bedrock-hosted models | Regional deployment, IAM-based access |

Additional providers — including Azure OpenAI, OpenAI direct, Google, and self-hosted endpoints — are configured to customer requirements during evaluation engagements. Tell us during onboarding which providers your deployment requires.


How models work by agent type#

Desktop AI tools#

Desktop tools like Claude Desktop, Cursor, ChatGPT, and Copilot use their own built-in model. Lens Agents has no involvement in the model layer — it provides tools and connectivity via MCP. The user's existing subscription or API key powers the intelligence. This means desktop tools can use any model their vendor supports, independently of what Lens Agents integrates directly.

External agents#

External agents (LangChain, CrewAI, Claude Agent SDK, custom frameworks) bring their own model configuration. The agent developer chooses the provider, model, and parameters. Lens Agents provides the tool execution environment through MCP — the model choice is outside the platform's scope.
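To make the division of responsibility concrete, here is a minimal sketch of an external agent's configuration. All names here are illustrative, not platform or framework APIs: the point is that provider, model, and parameters live in the agent developer's code, while Lens Agents contributes only the MCP tool endpoint.

```python
from dataclasses import dataclass

# Hypothetical config object an external agent might carry.
# The model settings belong to the agent developer; the MCP
# endpoint is the only piece Lens Agents supplies.
@dataclass
class ExternalAgentConfig:
    provider: str              # chosen by the agent developer
    model: str                 # any model the framework supports
    temperature: float = 0.7   # parameters are also the developer's choice
    mcp_endpoint: str = ""     # tool execution environment via MCP

cfg = ExternalAgentConfig(
    provider="anthropic",
    model="claude-sonnet",
    mcp_endpoint="https://example.invalid/mcp",  # placeholder URL
)
```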

Managed agents#

Managed agents route model requests through the platform's LLM proxy. The proxy supports the providers listed above and provides:

  • Usage extraction — token counts (input, output, cache read, cache write) and cost per request
  • Spending enforcement — budget checked before every LLM call. Exceeded budgets return a 429 error.
  • Audit logging — every model request is recorded with model name, token usage, cost, and prompt metadata
  • Prompt caching — stable context sections are cached across invocations to reduce latency and cost
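The per-request flow above can be sketched in a few lines: check the budget before the call, forward the request, extract token usage, compute cost, and record an audit entry. The function names and the price table are illustrative assumptions, not the proxy's actual code.

```python
# Hypothetical per-1K-token rates for cost computation.
PRICE_PER_1K = {"input": 0.003, "output": 0.015}

class BudgetExceeded(Exception):
    status_code = 429  # exceeded budgets surface to the agent as a 429

def proxy_call(budget_remaining, call_model, prompt):
    """Budget is checked before every LLM call; usage and cost are logged."""
    if budget_remaining <= 0:
        raise BudgetExceeded("LLM budget exhausted")
    usage = call_model(prompt)  # token counts reported by the provider
    cost = (usage["input"] / 1000 * PRICE_PER_1K["input"]
            + usage["output"] / 1000 * PRICE_PER_1K["output"])
    # Audit logging: every request is recorded with usage and cost.
    return {"tokens": usage, "cost": cost}

fake_model = lambda prompt: {"input": 1000, "output": 2000}
record = proxy_call(budget_remaining=5.0, call_model=fake_model, prompt="hi")
```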

The default model for managed agents is Claude (via AWS Bedrock). The classification model for alert suppression and channel routing is Claude Haiku.


Spending controls#

Spending controls apply to managed agent LLM usage. They do not apply to desktop or external agents, because those agents use their own model provider and billing.

Spending limits can be set at three levels:

| Level | What it controls |
| --- | --- |
| Organization | Total LLM spend across all managed agents |
| Team | LLM spend for all managed agents in a team |
| Agent | LLM spend for a single managed agent |

When a limit is exceeded, the agent's next LLM request is rejected. Heartbeat monitoring skips gracefully and does not count as an error. Scheduled tasks notify the user that the budget has been exceeded.
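Because the three levels nest, a request is allowed only if the agent, its team, and the organization all have budget remaining. A minimal sketch of that check, with illustrative limits and spend figures:

```python
LEVELS = ("organization", "team", "agent")

def spend_allowed(spend, limits):
    """spend/limits: dicts keyed by level, in the account currency."""
    return all(spend[level] < limits[level] for level in LEVELS)

limits = {"organization": 1000.0, "team": 200.0, "agent": 50.0}
spend = {"organization": 640.0, "team": 199.5, "agent": 12.0}
assert spend_allowed(spend, limits)   # within all three budgets

spend["team"] = 200.0                 # team limit reached
# the agent's next LLM request would now be rejected (429),
# even though the agent's own budget still has headroom
assert not spend_allowed(spend, limits)
```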


Cost visibility#

LLM costs are tracked per request and visible in:

  • Cost Explorer — time-series cost breakdown by agent, team, or organization
  • Audit trail — LLM proxy events include token counts and cost per request
  • Agent overview — each agent's detail page shows 30-day LLM cost
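The views above are aggregations over the same per-request audit records. A sketch of how per-request costs roll up into the per-agent and per-team breakdowns, with illustrative record fields:

```python
from collections import defaultdict

# Illustrative per-request audit records (fields are assumptions).
records = [
    {"agent": "triage-bot", "team": "ops", "cost": 0.02},
    {"agent": "triage-bot", "team": "ops", "cost": 0.05},
    {"agent": "reporter",   "team": "ops", "cost": 0.01},
]

def cost_by(records, key):
    """Sum request costs grouped by the given record field."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r["cost"]
    return dict(totals)

by_agent = cost_by(records, "agent")  # per-agent totals, as on the detail page
by_team = cost_by(records, "team")    # per-team totals, as in Cost Explorer
```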

Model configuration for managed agents#

| Setting | Default |
| --- | --- |
| Primary model | Claude (via AWS Bedrock) |
| Classification model | Claude Haiku |
| Temperature | 0.3 |
| Max output tokens | 16,000 per response |
| Step limit | 100 tool-use steps per invocation |
| Tool output truncation | 30,000 characters per tool call |

Prompt caching is applied automatically. Stable prompt sections (identity, workspace files, memory summary) are cached. Dynamic sections (timestamp, heartbeat context, recent messages) are not.
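The stable/dynamic split can be sketched as follows. The section names mirror the text above; the block structure and the boolean cache marker are assumptions about how such a split might be represented, not the platform's wire format.

```python
# Sections cached across invocations (stable between requests).
STABLE = {"identity", "workspace_files", "memory_summary"}

def build_prompt(sections):
    """sections: ordered (name, text) pairs -> list of prompt blocks,
    with stable sections marked cacheable and dynamic ones not."""
    return [
        {"name": name, "text": text, "cache": name in STABLE}
        for name, text in sections
    ]

blocks = build_prompt([
    ("identity", "You are a managed agent..."),
    ("workspace_files", "<file listing>"),
    ("memory_summary", "<summary>"),
    ("timestamp", "2025-01-01T00:00:00Z"),   # dynamic: never cached
    ("recent_messages", "<messages>"),        # dynamic: never cached
])
```

Keeping the dynamic sections last preserves the longest stable prefix, which is what makes prefix-style prompt caching effective.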