LLM gateway configuration

Learn how to configure Claude Code to work with LLM gateway solutions. Covers gateway requirements, authentication configuration, model selection, and provider-specific endpoint setup.

LLM gateways provide a centralized proxy layer between Claude Code and model providers, often providing:

Centralized authentication - Single point for API key management
Usage tracking - Monitor usage across teams and projects
Cost controls - Implement budgets and rate limits
Audit logging - Track all model interactions for compliance
Model routing - Switch between providers without code changes

Gateway requirements

For an LLM gateway to work with Claude Code, it must meet the following requirements:

API format

The gateway must expose to clients at least one of the following API formats:

Anthropic Messages: /v1/messages, /v1/messages/count_tokens
- Must forward request headers: anthropic-beta, anthropic-version
Bedrock InvokeModel: /invoke, /invoke-with-response-stream
- Must preserve request body fields: anthropic_beta, anthropic_version
Vertex rawPredict: :rawPredict, :streamRawPredict, /count-tokens:rawPredict
- Must forward request headers: anthropic-beta, anthropic-version

Failure to forward headers or preserve body fields may result in reduced functionality or inability to use Claude Code features.

Request headers

Claude Code includes the following headers on API requests:

Header	Description
`X-Claude-Code-Session-Id`	A unique identifier for the current Claude Code session. Proxies can use this to aggregate all API requests from a single session without parsing the request body.
`X-Claude-Code-Agent-Id`	Identifier of the subagent or teammate that issued the request. Your proxy can use this to attribute API cost to individual parallel subagents within a session, without parsing the request body. Present only for requests made by an in-process subagent or teammate.
`X-Claude-Code-Parent-Agent-Id`	Identifier of the agent that spawned the agent making the request. Use this with `X-Claude-Code-Agent-Id` to attribute API costs across nested agents in your proxy. Present only when the requesting agent was itself spawned by another agent.

Both agent ID headers are ephemeral per-spawn identifiers, not persistent user or device IDs.

Claude Code also prepends a short attribution block to the system prompt containing the client version and a fingerprint derived from the conversation. The Anthropic API strips this block before processing, so it does not affect first-party prompt caching. If your gateway implements its own prompt cache keyed on the full request body, set CLAUDE_CODE_ATTRIBUTION_HEADER=0 to omit it.

Configuration

Model selection

By default, Claude Code uses standard model names for the selected API format.

When ANTHROPIC_BASE_URL points at a gateway that exposes the Anthropic Messages format, Claude Code can query the gateway's /v1/models endpoint at startup and add the returned models to the /model picker. Set CLAUDE_CODE_ENABLE_GATEWAY_MODEL_DISCOVERY=1 to enable this. Discovery is off by default so that gateways backed by a shared API key do not surface every model the key can access to every user. Each discovered entry is labeled "From gateway" and uses the display_name field from the response when one is provided. This requires Claude Code v2.1.129 or later.

Discovery applies only to the Anthropic Messages format. It does not run for Bedrock or Vertex pass-through endpoints, and it does not run when ANTHROPIC_BASE_URL is unset or points at api.anthropic.com.

The discovery request authenticates the same way as inference requests: it sends ANTHROPIC_AUTH_TOKEN as a bearer token, or ANTHROPIC_API_KEY as the x-api-key header when no auth token is set, along with any headers from ANTHROPIC_CUSTOM_HEADERS. Only models whose ID begins with claude or anthropic are added to the picker. Results are cached to ~/.claude/cache/gateway-models.json and refreshed on each startup. If the request fails or the gateway does not implement /v1/models, the picker falls back to the cached list from the previous startup or to the built-in model list.

If your gateway uses model names that do not match the discovery filter, use the environment variables documented in Model configuration to add them manually.

LiteLLM configuration

Prerequisites

Claude Code updated to the latest version
LiteLLM Proxy Server deployed and accessible
Access to Claude models through your chosen provider

Basic LiteLLM setup

Configure Claude Code:

Authentication methods

Static API key

Simplest method using a fixed API key:

# Set in environment
export ANTHROPIC_AUTH_TOKEN=sk-litellm-static-key

# Or in Claude Code settings
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "sk-litellm-static-key"
  }
}

This value will be sent as the Authorization header.

Dynamic API key with helper

For rotating keys or per-user authentication:

Create an API key helper script:

#!/bin/bash
# ~/bin/get-litellm-key.sh

# Example: Fetch key from vault
vault kv get -field=api_key secret/litellm/claude-code

# Example: Generate JWT token
jwt encode \
  --secret="${JWT_SECRET}" \
  --exp="+1h" \
  '{"user":"'${USER}'","team":"engineering"}'

Configure Claude Code settings to use the helper:

{
  "apiKeyHelper": "~/bin/get-litellm-key.sh"
}

Set token refresh interval:

# Refresh every hour (3600000 ms)
export CLAUDE_CODE_API_KEY_HELPER_TTL_MS=3600000

This value will be sent as Authorization and X-Api-Key headers. The apiKeyHelper has lower precedence than ANTHROPIC_AUTH_TOKEN or ANTHROPIC_API_KEY.

Unified endpoint (recommended)

Using LiteLLM's Anthropic format endpoint:

export ANTHROPIC_BASE_URL=https://litellm-server:4000

Benefits of the unified endpoint over pass-through endpoints:

Load balancing
Fallbacks
Consistent support for cost tracking and end-user tracking

Provider-specific pass-through endpoints (alternative)

Claude API through LiteLLM

Using pass-through endpoint:

export ANTHROPIC_BASE_URL=https://litellm-server:4000/anthropic

Amazon Bedrock through LiteLLM

Using pass-through endpoint:

export ANTHROPIC_BEDROCK_BASE_URL=https://litellm-server:4000/bedrock
export CLAUDE_CODE_SKIP_BEDROCK_AUTH=1
export CLAUDE_CODE_USE_BEDROCK=1

Google Vertex AI through LiteLLM

Using pass-through endpoint:

export ANTHROPIC_VERTEX_BASE_URL=https://litellm-server:4000/vertex_ai/v1
export ANTHROPIC_VERTEX_PROJECT_ID=your-gcp-project-id
export CLAUDE_CODE_SKIP_VERTEX_AUTH=1
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION=us-east5

Claude Platform on AWS through a gateway

Route to a gateway that forwards to the Claude Platform on AWS endpoint:

export ANTHROPIC_AWS_BASE_URL=https://litellm-server:4000/anthropic-aws
export ANTHROPIC_AWS_WORKSPACE_ID=wrkspc_01ABCDEFGHIJKLMN
export CLAUDE_CODE_SKIP_ANTHROPIC_AWS_AUTH=1
export CLAUDE_CODE_USE_ANTHROPIC_AWS=1

For more detailed information, refer to the LiteLLM documentation.

Additional resources

llm-gateway.md +17 −2

39 39

40**Request headers**40**Request headers**

41 41

~~42Claude Code includes the following headers on every API request:~~42Claude Code includes the following headers on API requests:

43 43

44| Header | Description |44| Header | Description |

~~45| :------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------ |~~45| :------------------------------ | :----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |

46| `X-Claude-Code-Session-Id` | A unique identifier for the current Claude Code session. Proxies can use this to aggregate all API requests from a single session without parsing the request body. |46| `X-Claude-Code-Session-Id` | A unique identifier for the current Claude Code session. Proxies can use this to aggregate all API requests from a single session without parsing the request body. |

47| `X-Claude-Code-Agent-Id` | Identifier of the subagent or teammate that issued the request. Your proxy can use this to attribute API cost to individual parallel subagents within a session, without parsing the request body. Present only for requests made by an in-process subagent or teammate. |

48| `X-Claude-Code-Parent-Agent-Id` | Identifier of the agent that spawned the agent making the request. Use this with `X-Claude-Code-Agent-Id` to attribute API costs across nested agents in your proxy. Present only when the requesting agent was itself spawned by another agent. |

50Both agent ID headers are ephemeral per-spawn identifiers, not persistent user or device IDs.

47 51

48Claude Code also prepends a short attribution block to the system prompt containing the client version and a fingerprint derived from the conversation. The Anthropic API strips this block before processing, so it does not affect first-party prompt caching. If your gateway implements its own prompt cache keyed on the full request body, set [`CLAUDE_CODE_ATTRIBUTION_HEADER=0`](/en/env-vars) to omit it.52Claude Code also prepends a short attribution block to the system prompt containing the client version and a fingerprint derived from the conversation. The Anthropic API strips this block before processing, so it does not affect first-party prompt caching. If your gateway implements its own prompt cache keyed on the full request body, set [`CLAUDE_CODE_ATTRIBUTION_HEADER=0`](/en/env-vars) to omit it.

49 53

186export CLOUD_ML_REGION=us-east5190export CLOUD_ML_REGION=us-east5

187```191```

188 192

193##### Claude Platform on AWS through a gateway

194

195Route to a gateway that forwards to the [Claude Platform on AWS](/en/claude-platform-on-aws) endpoint:

196

197```bash theme={null}

198export ANTHROPIC_AWS_BASE_URL=https://litellm-server:4000/anthropic-aws

199export ANTHROPIC_AWS_WORKSPACE_ID=wrkspc_01ABCDEFGHIJKLMN

200export CLAUDE_CODE_SKIP_ANTHROPIC_AWS_AUTH=1

201export CLAUDE_CODE_USE_ANTHROPIC_AWS=1

202```

203

189For more detailed information, refer to the [LiteLLM documentation](https://docs.litellm.ai/).204For more detailed information, refer to the [LiteLLM documentation](https://docs.litellm.ai/).

190 205

191## Additional resources206## Additional resources

llm-gateway.md 2026-05-10 23:03 UTC to 2026-05-11 23:00 UTC

LLM gateway configuration

Gateway requirements

Configuration

Model selection

LiteLLM configuration

Prerequisites

Basic LiteLLM setup

Authentication methods

Static API key

Dynamic API key with helper

Unified endpoint (recommended)

Provider-specific pass-through endpoints (alternative)

Claude API through LiteLLM

Amazon Bedrock through LiteLLM

Google Vertex AI through LiteLLM

Claude Platform on AWS through a gateway

Additional resources