# Claude apps gateway configuration

> Reference for every gateway.yaml option: listener and TLS, OIDC, session, Postgres store, Bedrock/Agent Platform/Foundry upstreams, model routing, managed policies, and telemetry.

A Claude apps gateway deployment is configured by one YAML file, conventionally `gateway.yaml`. The file defines everything the gateway does: where it listens, how developers sign in, where inference goes, and which policies and telemetry apply. This page is the reference for every option in that file. To write your first one, start from the [quickstart](/en/claude-apps-gateway#quickstart), which builds a minimal working config and runs it; once you have a config you're happy with, the [deployment guide](/en/claude-apps-gateway-deploy) covers containerizing and hosting it on Kubernetes, Cloud Run, or your own platform.

The gateway reads the file once, at startup, with `claude gateway --config /path/to/gateway.yaml`. Every option is validated against a schema at boot, so a malformed config fails at start with a field-level error rather than at first use.

The [complete example](#complete-example) at the end of this page exercises every section.

## File structure

Five sections are [required](#required-sections). Every other section is [optional](#optional-sections), and an omitted section takes its defaults. Unknown keys fail boot, so a typo surfaces as a named error rather than a silently ignored setting.

**Required sections:**

* [`listen`](#listen): bind address, public URL, TLS termination
* [`oidc`](#oidc): your identity provider (IdP), including issuer, client, claim mapping, and who may sign in
* [`session`](#session): the bearer tokens the gateway mints, with secret and lifetime
* [`store`](#store): PostgreSQL, for device grants and rate-limit counters
* [`upstreams`](#upstreams): where inference goes, whether Anthropic, Bedrock, Agent Platform, or Foundry

**Optional sections:**

* [`admin`](#admin): Admin API auth and retention for spend limits
* [`enforcement`](#enforcement): spend-limit fail-open or fail-closed behavior
* [`models`](#models) and `auto_include_builtin_models`: admin-curated model list and per-upstream IDs
* [`managed`](#managed): managed settings policies by IdP group
* [`telemetry`](#telemetry): OTLP forwarding to your observability stack
* [`access_control`, `limits`, `timeouts`, `rate_limits`](#http-tuning): IP allow/deny, request size caps, upstream time-to-first-byte, and per-IP sign-in limits

## Secret expansion

Don't write secrets such as `client_secret`, `jwt_secret`, or `postgres_url` directly in `gateway.yaml`. Reference them with one of the forms below, and the gateway resolves the value at boot from an environment variable or a file:

| Form | Resolves to | Use for |
| --------------- | -------------------------------------------------------- | ---------------------------------------------------------------------- |
| `${VAR}` | The environment variable `VAR`. Boot fails if undefined. | Container environment variables, AWS Secrets Manager via env injection |
| `${file:/path}` | File contents, trimmed | Kubernetes Secret volume mounts, Vault Agent, SOPS |
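
As a minimal sketch, the two forms might appear in a config as follows. The environment variable names and the file path are placeholders chosen for illustration, not names the gateway requires, and the other required fields of each section are omitted:

```yaml
oidc:
  # Resolved from the environment at boot; boot fails if the variable is undefined.
  client_secret: ${OIDC_CLIENT_SECRET}
session:
  # Resolved from a mounted file; the file contents are trimmed.
  jwt_secret: ${file:/var/run/secrets/gateway/jwt-secret}
store:
  postgres_url: ${POSTGRES_URL}
```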

## Required sections

### `listen`

The `listen` block controls where the gateway serves: the bind address and port, the externally visible origin, and optional TLS termination.

| Field | Required | Description |
| ---------------------- | -------------- | ------------------------------------------------------------------------------------------ |
| `host` | No | Bind address. Default `0.0.0.0`. |
| `port` | No | Bind port. Default `8080`. |
| `public_url` | Behind a proxy | The externally visible `https://` origin, used to build the IdP `redirect_uri` and discovery metadata. Required behind any TLS-terminating proxy such as an ALB, Ingress, or Cloud Run, because the gateway doesn't trust `X-Forwarded-*` headers when constructing its own origin; they are client-spoofable. `trusted_proxies` below governs client-IP resolution only. Also required to enable [telemetry](#telemetry), because the gateway builds the OTLP endpoint it pushes to clients from this URL. |
| `tls.cert` / `tls.key` | No | PEM paths if the gateway terminates TLS itself |
| `trusted_proxies` | No | CIDRs or IPs of load balancers in front of the gateway. When set, the gateway trusts `X-Forwarded-For` only from these peers and records the real client IP for per-IP rate limiting and audit. Equivalent to nginx `set_real_ip_from`. |
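
A sketch of a `listen` block for a gateway behind a load balancer, built only from the fields in the table above. The hostname, CIDR, and certificate paths are hypothetical, and the list form of `trusted_proxies` and the nesting of `tls.cert` / `tls.key` under a `tls:` key are assumptions inferred from the table rather than confirmed syntax:

```yaml
listen:
  host: 0.0.0.0          # default
  port: 8080             # default
  # Required behind a TLS-terminating proxy; hypothetical hostname.
  public_url: https://claude-gateway.example.com
  # Assumed list form: trust X-Forwarded-For only from the load balancer's range.
  trusted_proxies:
    - 10.0.0.0/8
  # Only if the gateway terminates TLS itself (assumed nesting):
  # tls:
  #   cert: /etc/gateway/tls/tls.crt
  #   key: /etc/gateway/tls/tls.key
```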

### `oidc`

OpenID Connect (OIDC) is the SSO protocol the gateway uses with your identity provider; see [Identity provider setup](/en/claude-apps-gateway-deploy#identity-provider-setup) for what to register on the IdP side. The `oidc` block connects the gateway to your identity provider and decides who can sign in. It names the issuer and OAuth client, maps the claims that carry email and groups, and restricts sign-in by email domain or group.

| Field | Required | Description |
| ------------------------------- | -------- | ------------------------------------------------------------------------------------------ |
| `issuer` | Yes | OIDC discovery base. Must serve discovery at `/.well-known/openid-configuration`. Use HTTPS in production. The gateway accepts an `http://` issuer, but a loopback issuer such as `http://localhost:8081` is rejected by the [SSRF guard](/en/claude-apps-gateway-deploy#threat-model-summary) unless `CLAUDE_GATEWAY_ALLOW_LOOPBACK=1` is set in the gateway's environment. |
| `client_id` / `client_secret` | Yes | From your OAuth client registration |
| `allowed_email_domains` | No | Reject id\_tokens whose `email` claim isn't in one of these domains, case-insensitive. Defense-in-depth against multi-tenant IdP misconfiguration. Independent of this setting, an id\_token whose `email_verified` claim is explicitly `false` is always rejected. |
| `allowed_groups` | No | Restrict sign-in to members of these IdP groups, matched against `groups_claim`. A user in an allowed email domain but in none of these groups is rejected. Requires the IdP to emit the groups claim. |
| `groups_claim` | No | Which id\_token claim carries group membership. Default `groups`. Microsoft Entra emits app roles under `roles`. Accepts a flat key or an RFC 6901 JSON Pointer such as `/resource_access/gateway/roles` for nested claims. |
| `google_groups` | No | Look up the signed-in user's groups through the Google Workspace Admin SDK Directory API, because Google's id\_token carries no groups claim. Set `service_account_json_path` to a service-account key file with domain-wide delegation on the `https://www.googleapis.com/auth/admin.directory.group.readonly` scope, and `admin_email` to a Workspace administrator the service account impersonates; the Directory API requires a real admin subject. Each user's group email addresses become their groups claim, so `allowed_groups` and `managed.policies.match.groups` match on group emails. |
| `email_claim` | No | Which id\_token claim carries the user's email. Default `email`. Some IdPs, such as ADFS and Entra B2C, emit `upn` or `preferred_username` instead. Accepts a flat key, a JSON Pointer, or a list of fallback keys where the first present key is used. |
| `scopes` | No | Full override of the OIDC scopes the gateway requests. Default `[openid, profile, email, offline_access]`. Set when your IdP rejects scopes it doesn't recognize, or requires a custom scope to emit groups or email. Must include `openid`. Dropping `offline_access` disables refresh tokens, so developers re-run the browser login every `session.ttl_hours`. See [Identity provider setup](/en/claude-apps-gateway-deploy#identity-provider-setup) for per-IdP scope recipes such as Google's refresh-token flow. |
| `extra_auth_params` | No | Extra query parameters appended to the IdP authorization request, verbatim. This is the override mechanism for IdP-specific behavior, such as `access_type: offline` for Google refresh tokens, `domain_hint` for some Entra tenants, or `acr_values` for step-up flows. Cannot override the gateway-managed protocol params: `state`, `nonce`, `redirect_uri`, PKCE, `scope`, `response_type`, `response_mode`, and `client_id`. |
| `userinfo_fallback` | No | When the id\_token omits email or groups, fetch them from `/userinfo`. Needed for Keycloak lightweight access tokens, the Okta org server, and ADFS minimal tokens. The id\_token stays authoritative; userinfo only fills gaps. Default `false`. |
| `use_pkce` | No | Send a PKCE (S256) challenge on the authorization request. Default `true`. Set `false` only if your IdP rejects PKCE for this confidential client. |
| `clock_skew_seconds` | No | Tolerate clock drift when validating id\_token time claims. Default `0`, which is strict. Raise it if you see "token expired / not yet valid" errors right after sign-in caused by clock skew between the host and the IdP. |
| `token_endpoint_auth_method` | No | Override the token-endpoint auth method. Accepts `client_secret_basic` or `client_secret_post`. Auto-negotiated by default. |
| `id_token_signed_response_alg` | No | Expected id\_token signing algorithm. Default `RS256`. Set for IdPs that sign with ES256, PS256, or EdDSA. |
| `additional_authorized_parties` | No | Extra `azp` values to accept beyond `client_id`, for Keycloak broker and token-exchange flows |
| `discovery_url` | No | Fetch the discovery document from this URL instead of deriving it from `issuer`, for IdPs behind a proxy that rewrites the issuer host. The path must contain `/.well-known/`. |
| `form_action_origins` | No | Additional origins for the `/device` page's `Content-Security-Policy: form-action` directive. The gateway already allows `'self'` and the discovered `authorization_endpoint` origin, but Chrome enforces `form-action` against the entire redirect chain. If your IdP redirects through a second host, such as Azure AD federated to ADFS, hub-spoke Okta, or a corporate SSO interceptor, list every origin the authorization request may redirect through. |
| `ca_cert_pem` | No | PEM CA cert that replaces the system trust store for IdP requests only. Use for Keycloak or Dex behind corporate PKI. |
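
To show how the common fields fit together, here is a sketch of an `oidc` block that restricts sign-in by domain and group. The issuer URL, client ID, domain, group names, and environment variable name are all hypothetical; only the field names come from the table above:

```yaml
oidc:
  issuer: https://idp.example.com          # hypothetical IdP
  client_id: claude-gateway                # hypothetical client registration
  client_secret: ${OIDC_CLIENT_SECRET}
  # Both checks apply: the email domain must match AND the user must be in a group.
  allowed_email_domains: [example.com]
  allowed_groups: [eng, eng-contractors]
  groups_claim: groups                     # default
  email_claim: [email, upn]                # fallback list: the first present key is used
  clock_skew_seconds: 30                   # tolerate small host/IdP clock drift
```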

### `session`

The `session` block shapes the bearer tokens the gateway mints after sign-in: the secret that signs them and how long they live.

| Field | Required | Description |
| ------------ | -------- | ------------------------------------------------------------------------------------------ |
| `jwt_secret` | Yes | At least 32 bytes of entropy, for example from `openssl rand -base64 32`. Signs the gateway's HS256 bearer tokens. Accepts a single string or an array for rotation: index 0 signs and all entries verify. To rotate, prepend a new secret, wait `ttl_hours`, then drop the old one. |
| `ttl_hours` | No | Gateway bearer token lifetime. Default `1`. The CLI silently refreshes before expiry when the IdP issues refresh tokens. A shorter lifetime deprovisions faster; a longer one makes fewer IdP round-trips. If your IdP can't issue refresh tokens because `offline_access` is unavailable, there is no silent refresh, so raise this to `8` or `12` to avoid sending developers back to the browser login every hour. |
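
The rotation procedure described for `jwt_secret` can be sketched as the mid-rotation state of the config. The environment variable names are placeholders:

```yaml
session:
  jwt_secret:
    - ${JWT_SECRET_NEW}   # index 0: signs every newly minted token
    - ${JWT_SECRET_OLD}   # still verifies tokens minted before the rotation
  ttl_hours: 1            # default; once this has elapsed, remove the old entry
```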

### `store`

The `store` block points the gateway at its PostgreSQL database, which holds device grants and rate-limit counters.

| Field | Required | Description |
| ----------------- | -------- | ------------------------------------------------------------------------------------------ |
| `postgres_url` | Yes | `postgres://` or `postgresql://` URL. Required: the device-grant rendezvous, where the browser callback writes and the polling CLI reads, needs cross-replica state. The gateway runs its own schema migrations at boot, so the role needs `CREATE TABLE` on the target schema. If your security policy prohibits DDL from the application role, run the migrations with an admin role, initially and again whenever a new release ships migrations, and grant the app role `SELECT, INSERT, UPDATE, DELETE` on the gateway's tables. See [Upgrades](/en/claude-apps-gateway-deploy#upgrades) and [Postgres](/en/claude-apps-gateway-deploy#postgres). |
| `username` | No | Overrides the user in `postgres_url` |
| `password` | No | Database credential. Set it here rather than in `postgres_url` so the credential stays out of the URL. Accepts any characters and takes precedence over URL credentials. |
| `max_connections` | No | Postgres connection-pool size per replica. Default `5`, which is conservative and friendly to shared databases. With [spend limits](#admin) enabled, the hot path does a few operations per inference request, so raise it for a dedicated database under load, and keep the number of replicas × this value below the database's own `max_connections` limit. |

For local development, point `postgres_url` at a throwaway Postgres container, for example `docker run --rm -p 5432:5432 -e POSTGRES_HOST_AUTH_METHOD=trust postgres`.
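
For a production-style setup, a `store` block might look like the sketch below. The host, database name, role name, and secret path are hypothetical, and the replica count in the comment is an assumed deployment size used only to illustrate the pool-sizing rule:

```yaml
store:
  postgres_url: postgres://db.internal.example.com:5432/gateway
  username: gateway_app
  # Kept out of the URL; takes precedence over any credentials in postgres_url.
  password: ${file:/var/run/secrets/postgres/password}
  # With an assumed 4 gateway replicas, 4 x 10 = 40 pooled connections,
  # which must stay below the database's own max_connections.
  max_connections: 10
```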

### `upstreams`

`upstreams` is an ordered list. The gateway forwards inference to the first upstream that resolves the requested model. On `5xx`, `429`, or timeout it fails over to the next upstream; other `4xx` responses don't trigger failover, because those errors are attributable to the request rather than the upstream. Multiple upstreams of the same provider must each set a distinct `name:`.

Bedrock, Agent Platform, and Foundry clients are built once at startup, and their SDKs refresh credentials internally, so rotating cloud credentials doesn't require a restart. Static Anthropic API keys and bearers are read at startup; see [Anthropic API](#anthropic-api).

#### Anthropic API

The minimal Anthropic upstream is an API key from the [Claude Console](https://platform.claude.com):

```yaml theme={null}
upstreams:
  - provider: anthropic
    auth:
      api_key: ${ANTHROPIC_API_KEY}
      # OR an OAuth bearer (e.g. a Workload-Identity-Federation-exchanged token):
      # oauth_token: ${file:/var/run/secrets/anthropic-oauth-token}
    # base_url: https://api.anthropic.com   # default; override for a forward proxy
```

The two credential forms differ in the header they send:

* **`api_key`**: sends `x-api-key`. Rotate it in the Claude Console and update the env var.
* **`oauth_token`**: sends `Authorization: Bearer`. Use the bearer form when your org issues short-lived tokens instead of long-lived API keys. The bearer is read once at startup, so refresh by remounting the secret and restarting.

Instead of a static key or bearer, you can use Workload Identity Federation. Create a federation rule by following the [Workload Identity Federation guide](https://platform.claude.com/docs/en/manage-claude/workload-identity-federation), then mount your workload's OIDC JWT as a file, such as a Kubernetes projected service-account token or a CI platform's id-token. The gateway exchanges the JWT for a short-lived bearer and refreshes it automatically. The token file is re-read on every exchange, so rotated projected tokens are picked up without a restart.

```yaml theme={null}
upstreams:
  - provider: anthropic
    auth:
      federation_rule_id: ${ANTHROPIC_FEDERATION_RULE_ID}
      organization_id: ${ANTHROPIC_ORGANIZATION_ID}
      identity_token_file: /var/run/secrets/anthropic/id-token
      # workspace_id: wrkspc_...        # required if the rule covers >1 workspace
      # service_account_id: svac_...    # optional expected-target check
```

#### Amazon Bedrock

For the client-side Bedrock deployment that the gateway replaces or fronts, see [Claude Code on Amazon Bedrock](/en/amazon-bedrock). The gateway-side upstream:

```yaml theme={null}
upstreams:
  - provider: bedrock
    region: us-east-1
    auth: {}   # preferred: AWS default credential chain
    # OR explicit credentials:
    # auth:
    #   aws_access_key_id: ${AWS_AKID}
    #   aws_secret_access_key: ${AWS_SK}
    #   aws_session_token: ${AWS_ST}
    # OR a Bedrock API bearer token:
    # auth:
    #   aws_bearer_token: ${AWS_BEARER_TOKEN}
    # Override the bedrock-runtime endpoint for FIPS or VPC-endpoint deployments:
    # base_url: https://bedrock-runtime-fips.us-east-1.amazonaws.com
```

An empty `auth` block uses the AWS SDK's default credential chain: env vars, `~/.aws/credentials`, ECS task role, EC2 instance metadata, or IRSA on EKS. In production, give the gateway pod an IAM role instead of embedding static keys in a container image.

| Setup | How |
| --------------- | ------------------------------------------------------------------------------------------ |
| IAM permissions | Grant the gateway's principal `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` on both the inference-profile ARNs and the underlying foundation-model ARNs. For the built-in catalog in US regions: `arn:aws:bedrock:<region>:<account>:inference-profile/us.anthropic.*` and `arn:aws:bedrock:*::foundation-model/anthropic.*`. |
| Model access | In the Bedrock console, per region, request and enable model access for the Claude models you want. Cross-region inference profiles (`us.anthropic.*`) require model access in each region the profile spans. |
| EKS (IRSA) | Create an IAM role with the policy above and a trust policy for your cluster's OIDC provider scoped to the gateway's service account. Annotate the service account with `eks.amazonaws.com/role-arn: arn:aws:iam::<acct>:role/claude-gateway`. `auth: {}` picks it up. |
| ECS / EC2 | Attach the IAM role to the task definition or instance profile. `auth: {}` picks it up. |
| Anywhere else | Pass credentials via the `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` env vars, or set them explicitly in `auth:` with `${VAR}` expansion |
| Region | `region:` is the API endpoint region. Cross-region inference profiles route across the geo (US, EU, APAC) regardless of which one you pick. For non-US regions or provisioned-throughput ARNs, add a [`models:`](#models) block with the right per-upstream IDs. |

#### Google Cloud Agent Platform

For the equivalent client-side setup, see [Claude Code on Google Cloud](/en/google-vertex-ai). The gateway-side upstream:

```yaml theme={null}
upstreams:
  - provider: vertex
    region: us-east5
    project_id: example-prod
    auth: {}   # preferred: Application Default Credentials
    # OR a service account key file:
    # auth: { service_account_json: /secrets/sa.json }
    # Override the aiplatform endpoint for Private Service Connect:
    # base_url: https://us-east5-aiplatform.p.googleapis.com
```

An empty `auth` block uses Application Default Credentials: `GOOGLE_APPLICATION_CREDENTIALS`, GCE metadata, or GKE Workload Identity. Service-account JSON key files are supported but discouraged; use Workload Identity or attach a service account to the GCE or Cloud Run instance.

Set `region: global` to use [Agent Platform's global endpoint](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/locations) instead of a regional one. Google then routes each request to an available region, so you don't track per-region model availability. Setting a specific region pins every request to it.
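
A sketch of the global-endpoint variant, reusing the project ID from the example above. Only the `region:` value changes:

```yaml
upstreams:
  - provider: vertex
    region: global          # Google picks an available region per request
    project_id: example-prod
    auth: {}                # Application Default Credentials
```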

| Setup | How |
| ----------------------- | ------------------------------------------------------------------------------------------ |
| IAM permissions | Grant the gateway's service account `roles/aiplatform.user` on the project, or a custom role with `aiplatform.endpoints.predict`. Enable the Agent Platform API (`aiplatform.googleapis.com`). |
| Model access | In Model Garden, enable the Claude models for your project. They publish to specific regions; check the model card for supported regions. |
| GKE (Workload Identity) | Bind a GCP service account to the gateway's Kubernetes service account and annotate the KSA with `iam.gke.io/gcp-service-account: claude-gateway@<proj>.iam.gserviceaccount.com`. `auth: {}` picks it up. |
| Cloud Run / GCE | Set the service's service account to one with `roles/aiplatform.user`. `auth: {}` picks it up. |
| Anywhere else | `auth: { service_account_json: /secrets/sa.json }`, the path to a JSON key file mounted as a secret. The field takes a file path, not the key contents, so no `${file:…}` expansion is involved. |

#### Microsoft Foundry

For the client-side Foundry deployment, see [Claude Code on Microsoft Foundry](/en/microsoft-foundry). The gateway-side upstream:

```yaml theme={null}
upstreams:
  - provider: foundry
    resource: example-foundry      # https://example-foundry.services.ai.azure.com
    auth: { use_azure_ad: true }   # preferred: DefaultAzureCredential / Managed Identity
    # OR an API key:
    # auth:
    #   api_key: ${FOUNDRY_API_KEY}
```

`use_azure_ad: true` resolves through `DefaultAzureCredential`: Managed Identity on AKS, ACI, or App Service; the Azure CLI; or environment credentials. API keys work but are project-wide and don't rotate automatically. Foundry's endpoint is derived from `resource:`; set the optional `base_url` to override it for sovereign clouds such as Azure Government.

| Setup | How |
| ----------------------- | ------------------------------------------------------------------------------------------ |
| RBAC | Grant the gateway's identity `Azure AI User` or `Cognitive Services User` on the Foundry resource |
| Deployments | Foundry uses admin-chosen deployment names, not canonical model IDs. Add a [`models:`](#models) block mapping each canonical ID to your deployment name. |
| AKS (workload identity) | Federate a User-Assigned Managed Identity with the cluster's OIDC issuer and bind it to the gateway's service account. `use_azure_ad: true` picks it up via `WorkloadIdentityCredential`. |
| ACI / App Service | Enable system-assigned or user-assigned managed identity on the resource. `use_azure_ad: true` picks it up. |
| Anywhere else | `auth: { api_key: "${FOUNDRY_API_KEY}" }`. Quote `${…}` inside `{ }`. |
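
The quoting note in the last row follows from YAML itself: inside a flow mapping, `{` and `}` are structural characters, so an unquoted secret reference there is not a valid plain scalar. A sketch of the two spellings, using the same key as the example above:

```yaml
upstreams:
  - provider: foundry
    resource: example-foundry
    # Flow style: quote the reference so its braces are read as text.
    auth: { api_key: "${FOUNDRY_API_KEY}" }
    # Block style: no quotes needed.
    # auth:
    #   api_key: ${FOUNDRY_API_KEY}
```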

#### Multiple upstreams

The same provider can appear more than once with a distinct `name:`. This covers different regions, different accounts via different credential chains, provisioned throughput versus on-demand, and cross-provider fallback.

The gateway tries upstreams in order. `5xx`, `429`, timeouts, and missing-endpoint (`501`) responses fail over; other `4xx` responses don't. `429` is per-upstream capacity, so provisioned-throughput (PT) exhaustion fails over to on-demand. An upstream that can't resolve the requested model is skipped without a network round-trip.

This example routes to a provisioned-throughput Bedrock allotment first, overflows to on-demand and a second account, and falls back to the Anthropic API last:

```yaml theme={null}
upstreams:
  # Primary: provisioned throughput in your home region.
  - name: bedrock-pt
    provider: bedrock
    region: us-east-1
    auth: {}
  # Overflow: on-demand cross-region.
  - name: bedrock-od
    provider: bedrock
    region: us-west-2
    auth: {}
  # Different account: a separate Bedrock allotment via assumed-role creds.
  - name: bedrock-acct2
    provider: bedrock
    region: us-east-1
    auth:
      aws_access_key_id: ${ACCT2_AKID}
      aws_secret_access_key: ${ACCT2_SK}
  # Last resort: direct Anthropic API.
  - name: anthropic-fallback
    provider: anthropic
    auth:
      api_key: ${ANTHROPIC_API_KEY}

# Per-upstream model IDs are keyed on the upstream's `name:`; an upstream
# without a `name:` defaults to its provider string (e.g. `bedrock`). Any
# upstream not listed for a model is skipped, which is how you route a model
# to provisioned throughput while everything else stays on-demand.
models:
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    upstream_model:
      bedrock-pt: arn:aws:bedrock:us-east-1:111111111111:provisioned-model/abcdef
      bedrock-od: us.anthropic.claude-opus-4-8
      bedrock-acct2: us.anthropic.claude-opus-4-8
      anthropic-fallback: claude-opus-4-8
```

| Lever | How |
| ---------------------- | ------------------------------------------------------------------------------------------ |
| Different regions | One Bedrock upstream per region, each with its own `region:`. With [`auto_include_builtin_models: true`](#models) the cross-region inference profiles route automatically; for region-pinned deployments use a `models:` block. |
| Different accounts | One Bedrock upstream per account, each with its own credentials in `auth:`. The default chain (`auth: {}`) uses the pod's identity; for a second account, set explicit credentials or a bearer token. |
| Provisioned throughput | Map the model to the provisioned-throughput ARN in `models:` for that upstream's name. Other upstreams keep the on-demand ID, so PT capacity is exhausted before failing over. |
| VPC / FIPS endpoints | Set `base_url:` on the upstream to your VPC endpoint or FIPS endpoint URL |
| Model-scoped routing | Omit an upstream from a model's `upstream_model:` map and that upstream is skipped for that model. For example, route Opus to provisioned throughput and Sonnet and Haiku to on-demand. |

Failing over between cloud providers, or to the direct Anthropic API, changes which agreement, geography, and other terms govern the request.

The CLI applies the same feature gating to gateways regardless of which upstream serves a given request, so failover doesn't send a body field an upstream would reject.
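
The model-scoped routing lever from the table above can be sketched as a `models:` block in which each model lists only the upstreams that should serve it. The upstream names match the example earlier in this section; the Sonnet label and its Bedrock ID are assumptions formed by analogy with the Opus entry, not confirmed catalog values:

```yaml
models:
  # Opus is served only by the provisioned-throughput upstream.
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    upstream_model:
      bedrock-pt: arn:aws:bedrock:us-east-1:111111111111:provisioned-model/abcdef
  # Sonnet skips bedrock-pt because it isn't listed here.
  - id: claude-sonnet-4-6
    label: Claude Sonnet 4.6                         # assumed label
    upstream_model:
      bedrock-od: us.anthropic.claude-sonnet-4-6     # assumed ID, by analogy
```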

## Optional sections

### `admin`

Optional. Enables `/v1/organizations/spend_limits`, which mirrors Anthropic's public Admin API, and per-developer spend enforcement on `/v1/messages`. See [Spend limits](/en/claude-apps-gateway-spend-limits) for how caps are set and enforced; this section covers the `gateway.yaml` keys that turn the feature on and tune it.

```yaml theme={null}
admin:
  # Named static API keys for the admin endpoints, sent as x-api-key.
  # The id appears in the audit log as admin-key:<id> so each key is
  # attributable. Array for rotation: add the new key, roll clients,
  # remove the old.
  write_keys:
    - { id: terraform, key: "${GATEWAY_ADMIN_WRITE_KEY_TF}" }
    - { id: ci, key: "${GATEWAY_ADMIN_WRITE_KEY_CI}" }
  read_keys:
    - { id: reporting, key: "${GATEWAY_ADMIN_READ_KEY}" }
  # IdP groups granted full admin via the normal gateway JWT (no API key).
  admin_groups: [platform-finops]
  blocked_message: request an increase at https://go.example.com/claude-limits
```

| Field | Required | Description |
| ------------------------- | -------- | ------------------------------------------------------------------------------------------ |
| `write_keys` | No | Array of `{id, key}`. An `x-api-key` matching one of these can list, set, and delete spend limits. Key values must be at least 32 characters; `id`s must be unique across `read_keys` and `write_keys`. |
| `read_keys` | No | Array of `{id, key}`. Read-only: every `GET` endpoint, including listing caps, fetching one by ID, and reading [`/effective`](/en/claude-apps-gateway-spend-limits#%2Feffective) and [`/audit`](/en/claude-apps-gateway-spend-limits#%2Faudit). |
| `admin_groups` | No | IdP group names. A gateway JWT whose `groups` claim includes one of these has full admin access, read and write, and audits as `oidc:<sub>`. Use this for human admins; use API keys for machines. |
| `blocked_message` | No | Appended verbatim to the `429 billing_error` a blocked developer sees. Write the whole instruction, such as a URL or a Slack channel. Unset, the error is `spend limit reached`. |
| `audit_retention_days` | No | Default `365`. Older `admin_audit` rows are swept. |
| `spend_retention_months` | No | Default `13`. `spend` counter rows older than this are swept. The default keeps a full year plus the current partial month for year-over-year reporting. |
| `identity_retention_days` | No | Default `90`. Last-seen TTL for `principal_emails` rows, which hold each developer's email, display name, and groups (PII). Deliberately shorter than spend retention so a deprovisioned identity ages out while its anonymous spend counters remain. |
| `group_limit_mode` | No | `min` (default) or `max`. When a developer is in several groups with caps, `min` enforces the most restrictive and `max` the least. Used by both enforcement and `/effective`. |
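
The tuning keys that the example above leaves out could be written as in this sketch, which spells out the documented defaults explicitly. The group name is the one used in the example above:

```yaml
admin:
  admin_groups: [platform-finops]
  audit_retention_days: 365      # default
  spend_retention_months: 13     # default: a full year plus the current partial month
  identity_retention_days: 90    # default: PII ages out before spend counters do
  # A developer in two capped groups gets the lower cap under `min`
  # and the higher cap under `max`.
  group_limit_mode: min          # default
```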

### `enforcement`

The `enforcement` block controls how spend-limit checks behave when the store is unavailable.

| Field | Required | Description |
| ---------------------- | -------- | ------------------------------------------------------------------------------------------ |
| `fail_closed_on_error` | No | Default `false`. Spend enforcement fails open on a Postgres outage, so inference stays up. Set `true` to fail closed: over-cap developers are blocked, but so is everyone else if the store is unreachable. Has no effect without an [`admin:`](#admin) block. |
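
A sketch of a fail-closed setup. Because the setting has no effect without an `admin:` block, the fragment includes a minimal one; the group name is illustrative:

```yaml
admin:
  admin_groups: [platform-finops]
enforcement:
  # Block all inference while the store is unreachable, rather than
  # letting over-cap developers through during an outage.
  fail_closed_on_error: true
```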

### `models`

The `models` block is an optional admin-curated model list, served at `/v1/models` and used to translate model IDs per upstream. It is required for non-US Bedrock regions, Bedrock provisioned-throughput ARNs, and Foundry deployment names.

```yaml theme={null}
auto_include_builtin_models: true   # false: expose only the list below
models:
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    # description: optional text shown in clients that surface it
    upstream_model:
      anthropic: claude-opus-4-8
      bedrock: us.anthropic.claude-opus-4-8   # or an inference-profile ARN
      foundry: your-opus-deployment-name
```
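For a deployment that exposes only a curated list and routes through a non-US Bedrock region, the entry might look like the following sketch. The `eu.` inference-profile prefix is an assumption about your Bedrock setup, so substitute the profile ID or ARN your account actually uses.

```yaml
auto_include_builtin_models: false   # serve only the entries listed here
models:
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    upstream_model:
      # Assumed EU cross-region profile ID; replace with your own
      # profile ID or provisioned-throughput ARN.
      bedrock: eu.anthropic.claude-opus-4-8
```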

### `managed`

The `managed` block defines role-based access policies keyed on IdP groups or email domain. Policies are evaluated in order; the first match is selected, then merged onto the `match: {}` catch-all base described below. They are served per-user at `GET /managed/settings` with ETag/304 caching.

```yaml theme={null}
managed:
  policies:
    # Specific groups first.
    - match: { groups: [eng-contractors] }
      cli:
        availableModels: [claude-sonnet-4-6]
        permissions: { deny: ["WebFetch", "WebSearch"] }
    # Default catch-all last: matches everyone who authenticated.
    - match: {}
      cli:
        availableModels: [claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5]
```

A `match: {}` catch-all, conventionally listed last, is treated as a base layer. Every other policy inherits any key it doesn't set from the catch-all, so per-role entries only need to list what differs from the org default. The merge rules depend on the key type:

* **Allow-lists**: `availableModels` and `permissions.allow`. A specific policy's list fully replaces the base's.
* **Deny-lists and hook arrays**: `permissions.deny`, `permissions.ask`, `disabledMcpjsonServers`, `deniedMcpServers`, `blockedMarketplaces`, and every `hooks` event-type array. These take the union of base and policy, so an org-wide deny or audit hook can't be accidentally dropped by a per-role override.
* **Record-typed keys**: `env`, `modelOverrides`, and `skillOverrides`. These shallow-merge, so a per-role `env` block overrides keys it sets and inherits the rest from the base.

`availableModels` is also enforced server-side at `/v1/messages`, so a denied model returns `400` regardless of what the client sends.
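As a worked illustration of the merge rules above, consider the two hypothetical policies below. The group name, the proxy URLs, and the effective result in the trailing comments are assumptions derived from the rules as described, not output captured from a gateway; the order of entries in the unioned list is likewise assumed.

```yaml
managed:
  policies:
    - match: { groups: [eng-contractors] }
      cli:
        availableModels: [claude-sonnet-4-6]    # allow-list: replaces the base
        permissions: { deny: ["WebSearch"] }    # deny-list: unioned with the base
        env: { HTTP_PROXY: "http://contractor-proxy.example.com:8080" }   # record: shallow-merged
    - match: {}
      cli:
        availableModels: [claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5]
        permissions: { deny: ["WebFetch"] }
        env:
          HTTP_PROXY: "http://proxy.example.com:8080"
          DISABLE_UPDATES: "1"

# Effective settings for a member of eng-contractors:
#   availableModels:  [claude-sonnet-4-6]
#   permissions.deny: [WebFetch, WebSearch]
#   env.HTTP_PROXY:   http://contractor-proxy.example.com:8080
#   env.DISABLE_UPDATES: "1"   (inherited from the base)
# A request from that user for claude-opus-4-8 returns 400 at /v1/messages.
```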

| Matcher | Behavior |
| --- | --- |
| `match: {}` | Matches every authenticated user. Start with one of these and add group-scoped policies above it later. |
| `match: { groups: [a, b] }` | Matches if the JWT's `groups` claim contains any of the listed groups. Case-sensitive: groups must match the IdP's exact casing. |
| `match: { email_domain: example.com }` | Matches the part after the last `@` in the JWT's `email` claim, case-insensitive. Accepts one domain per policy. |
| `match: { groups: [a], email_domain: example.com }` | Both conditions must match. |

An authenticated user who matches no policy gets the gateway's defaults, which means every model in the catalog and no managed settings. Add a `match: {}` catch-all last if you want a guaranteed default policy.
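The matchers can be combined in one ordered list. In this sketch the group names, domains, and per-policy settings are hypothetical; the point is the ordering, with the narrowest matcher first and the catch-all last.

```yaml
managed:
  policies:
    # Both conditions must hold: in the group AND on the partner domain.
    - match: { groups: [eng-contractors], email_domain: partner.example.com }
      cli:
        availableModels: [claude-haiku-4-5]
    # Any of the listed groups; names must match the IdP's exact casing.
    - match: { groups: [platform-eng, Data-Science] }
      cli:
        availableModels: [claude-opus-4-8, claude-sonnet-4-6]
    # One domain per policy, compared case-insensitively.
    - match: { email_domain: example.com }
      cli:
        availableModels: [claude-sonnet-4-6, claude-haiku-4-5]
    # Catch-all: the guaranteed default, and the base layer the
    # policies above inherit from.
    - match: {}
      cli:
        availableModels: [claude-haiku-4-5]
        permissions: { deny: ["WebFetch"] }
```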

<Note>
  The gateway keeps no user directory of its own. It authorizes each request from the user's IdP token, reading group membership from the token's `groups` claim and evaluating policies against it. There is no roster to enumerate and no accounts to pre-create, and therefore no SCIM endpoint, because there is nothing for SCIM to sync into.

  Run user and group lifecycle management at the source of truth, which is your IdP's native SCIM provisioning or a dedicated identity-governance platform. Membership and deprovisioning governed there flow into the gateway automatically through the token. If you want SCIM provisioning of Claude accounts themselves, that is a [Claude for Enterprise](/en/admin-setup) capability.

  Two propagation clocks apply:

  * **Policy contents**: editing a policy and redeploying reaches connected clients on their next managed-settings poll, within an hour.
  * **Group membership**: changing a user's group membership changes which policy matches them. This takes effect on the next session re-mint, meaning the next silent refresh, bounded by `session.ttl_hours`.
</Note>

#### What goes in `cli`

Each `cli` value is a complete Claude Code `managed-settings.json` document, the same schema you would deploy via MDM or `/etc/claude-code/managed-settings.json`, expressed here as YAML. The CLI applies the delivered document at the managed tier, above user and project settings.

The gateway validates each document against the CLI's settings schema at boot, so an unrecognized top-level key or a recognized key with a malformed value fails boot with an error naming every offending key. Deliberately open parts of the schema still accept arbitrary values, because newer clients may recognize entries the gateway's schema doesn't. These open keys are `env`, `pluginConfigs`, and keys nested under `permissions`.

Because validation uses the schema bundled with the gateway's installed version, putting a top-level settings key introduced by a newer Claude Code release into managed config requires upgrading the gateway first. Smoke-test a new policy on one client before rolling it out.

The full key reference is in [Claude Code settings](/en/settings#available-settings). The keys most operators reach for first:

```yaml theme={null}
managed:
  policies:
    - match: {}
      cli:
        # Model access (also enforced server-side at /v1/messages)
        availableModels: [claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5]

        # Permission policy
        permissions:
          deny:
            - "WebFetch"
            - "Read(./.env)"
            - "Read(./secrets/**)"
          disableBypassPermissionsMode: disable   # blocks --dangerously-skip-permissions
        allowManagedPermissionRulesOnly: true     # ignore user/project permission rules

        # Environment pushed into the CLI process. DISABLE_UPDATES blocks
        # background and manual updates; DISABLE_AUTOUPDATER stops only
        # background updates.
        env:
          DISABLE_UPDATES: "1"   # pin versions via your own distribution

        # Org-wide hooks. Hook commands run on developer machines, not the
        # gateway, so the path must exist on every client OS in the policy.
        hooks:
          PostToolUse:
            - matcher: "Edit|Write"
              hooks:
                - { type: command, command: /usr/local/bin/audit-edit.sh }
```

| Key | Enforced by | Effect |
| --- | --- | --- |
| `availableModels` | Gateway + CLI | Model allowlist. Also checked at `/v1/messages`, so a patched client can't bypass it. |
| `permissions.allow` / `.deny` | CLI | Tool and command rules. See [Permissions](/en/permissions). |
| `permissions.disableBypassPermissionsMode` | CLI | Set to `disable` to block [`bypassPermissions`](/en/permission-modes#skip-all-checks-with-bypasspermissions-mode), the mode that auto-approves every tool call, and the `--dangerously-skip-permissions` flag. |
| `allowManagedPermissionRulesOnly` | CLI | When `true`, user and project permission rules are ignored; only rules from this document apply. |
| `env` | CLI | Environment variables merged into the CLI process. Use for telemetry, auto-update, and model-name overrides. |
| `hooks` | CLI | Org-wide [hooks](/en/hooks). |

Because these settings arrive over the network, the CLI shows each developer a one-time security approval dialog before applying anything that can run a shell command or alter where traffic goes. The dialog covers:

* `hooks`
* `env` variables that aren't on the CLI's built-in safe list
* shell-execution settings such as `apiKeyHelper` and `statusLine`
* managed CLAUDE.md content

The safe list determines which `env` variables apply without approval:

* **On the safe list**: auto-update and model-name vars
* **Not on the safe list**: proxy vars, base-URL vars, and `OTEL_EXPORTER_OTLP_ENDPOINT`

The gateway's [telemetry](#telemetry) configuration pushes `OTEL_EXPORTER_OTLP_ENDPOINT`, so setting `telemetry.forward_to` triggers the dialog on each interactive client. Non-interactive runs with the `-p` flag skip the dialog and apply settings without approval. The dialog protects the developer's machine from a compromised or hostile gateway, not the organization from the developer, so the `-p` skip is intentional rather than a gap.

If a developer declines, Claude Code exits rather than applying the policy. Pushing a new hook or non-safe env var to a broad policy therefore means an approval prompt on every matching developer's next startup.
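One way to limit approval prompts is to keep dialog-triggering keys out of the broad catch-all and scope them to the group that needs them. The sketch below assumes `DISABLE_AUTOUPDATER` counts as one of the auto-update variables on the safe list and that `HTTP_PROXY` counts as a proxy variable; the group name and proxy URL are hypothetical.

```yaml
managed:
  policies:
    # Only this group is prompted: a proxy variable is not on the
    # safe list.
    - match: { groups: [restricted-network] }
      cli:
        env:
          HTTP_PROXY: "http://proxy.example.com:8080"
    # Assumed to apply without a dialog: an auto-update variable only,
    # with no hooks or shell-execution settings.
    - match: {}
      cli:
        env:
          DISABLE_AUTOUPDATER: "1"
# Caveat: if telemetry.forward_to is set, the pushed OTLP endpoint
# triggers the dialog on every interactive client regardless.
```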

The `cli` key was named `settings` in earlier releases. That spelling is still accepted as an alias, but new deployments should use `cli`.

#### Precedence with other managed sources

If a device also has a local `managed-settings.json` or MDM-delivered policy, the managed sources don't merge. The highest-priority source provides all policy settings, ranked in this order with highest priority first:

1. The [policy helper](/en/settings#compute-managed-settings-with-a-policy-helper)
2. Gateway-delivered settings
3. MDM, via the HKLM registry on Windows or a plist on macOS
4. The `managed-settings.json` file
5. The HKCU registry, on Windows only

Embedding hosts can supply policy through the SDK `managedSettings` option. It is ignored by default and applies only when a managed source opts in with [`parentSettingsBehavior: "merge"`](/en/settings#available-settings), filtered so it can tighten policy but not loosen it.

The exception is a small set of cross-source keys, honored when any admin source sets them; the user-writable HKCU tier is excluded:

* `sandbox.network.allowManagedDomainsOnly` and `sandbox.filesystem.allowManagedReadPathsOnly`: when locked, the corresponding allowlists are unioned across sources
* [`allowAllClaudeAiMcps`](/en/settings#available-settings): allow-only override for the claude.ai MCP server allowlist
* `sandbox.bwrapPath` and `sandbox.socatPath`: filesystem paths to the [sandbox](/en/sandboxing) helper binaries

`allowManagedPermissionRulesOnly` and `disableBypassPermissionsMode` are not cross-source, so only the winning source's value applies.

Gateway policies apply to every Claude Code invocation on the machine, including non-interactive `claude -p` runs and sessions spawned by the Agent SDK. If the gateway is unreachable at startup, signed-in sessions exit with an error rather than running without their policy.

<Warning>
  `mcpServers` inside a policy's `cli` block is rejected at gateway boot. Per-group MCP distribution is not available; deploy MCP servers via the file-based `managed-mcp.json` on each device or let developers add them locally.
</Warning>

### `telemetry`

The CLI sends metrics, logs, and, when enabled, traces to the gateway using the OpenTelemetry Protocol (OTLP) over HTTP. The gateway relays them verbatim to each configured destination. See [Monitoring usage](/en/monitoring-usage) for the metrics and events the CLI emits.

The CLI stamps each export with the authenticated user's identity, read from the gateway-issued JWT: the `user.id`, `user.email`, and `user.groups` attributes. Per-developer cost and usage attribution therefore works with no developer-side configuration.

```yaml theme={null}
telemetry:
  forward_to:
    - url: https://otel-collector.internal.example.com
      headers:
        Authorization: ${OTLP_TOKEN}
      # Per-signal opt-in. Default: metrics only.
      metrics: true
      logs: false
      traces: false
    - url: https://api.datadoghq.com/api/v2/otlp
      headers:
        DD-API-KEY: ${DD_API_KEY}
```

<Warning>
  Each destination opts into `metrics`, `logs`, and `traces` independently, and the default is metrics only. The signals differ in sensitivity:

  * **Metrics**: aggregate counters such as token counts, request counts, and latency
  * **Logs and traces**: can carry full bash commands, tool inputs, and file paths, covering anything Claude Code does on a developer's machine

  Enable logs and traces only on destinations with the access controls and retention policy that data warrants.
</Warning>

Telemetry is off in the CLI by default. Configuring `telemetry.forward_to` together with `listen.public_url` turns it on. The gateway pushes five env vars to every connected client through `/managed/settings`:

* `CLAUDE_CODE_ENABLE_TELEMETRY=1`
* `OTEL_METRICS_EXPORTER=otlp`
* `OTEL_LOGS_EXPORTER=otlp`
* `OTEL_TRACES_EXPORTER=otlp`
* `OTEL_EXPORTER_OTLP_ENDPOINT=<public_url>`

The pushed endpoint is built from the public URL, so metrics and logs need no OTEL configuration from developers or policies. The pushed configuration is applied at the managed tier, overriding `OTEL_*` variables a developer sets locally.

[Traces](/en/monitoring-usage#traces-beta) additionally require `CLAUDE_CODE_ENHANCED_TELEMETRY_BETA=1` on each client. The gateway doesn't push that variable, so set it through a managed policy's `env` block. It isn't on the CLI's safe list, so delivering it through a policy is covered by the same [security approval dialog](#managed) that the pushed OTLP endpoint already triggers.

Both protobuf and JSON OTLP encodings are relayed, and any OpenTelemetry-compatible backend works as a destination.
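Putting the trace requirements together, a config that forwards traces to one destination might look like the sketch below. It assumes `listen.public_url` is already set; the collector URL and token variable are placeholders, and the `env` entry is delivered through the catch-all policy because the gateway doesn't push it.

```yaml
telemetry:
  forward_to:
    - url: https://otel-collector.internal.example.com
      headers:
        Authorization: ${OTLP_TOKEN}
      metrics: true
      traces: true   # logs stay off for this destination

managed:
  policies:
    - match: {}
      cli:
        env:
          # Not pushed by the gateway and not on the safe list, so it
          # is covered by the security approval dialog.
          CLAUDE_CODE_ENHANCED_TELEMETRY_BETA: "1"
```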

### HTTP tuning

Four optional top-level blocks, `access_control`, `limits`, `timeouts`, and `rate_limits`, tune the HTTP surface. The defaults suit most deployments.

| Block | Key | Default | Description |
| --- | --- | --- | --- |
| `access_control` | `allow_cidrs` / `deny_cidrs` | empty | Inbound IP allow/deny by client address, after `trusted_proxies` resolution. `deny_cidrs` is checked first; a client it matches is rejected even if `allow_cidrs` also matches. If `allow_cidrs` is non-empty the gateway is default-deny. `/healthz` and `/readyz` are exempt from `allow_cidrs`. |
| `limits` | `max_request_bytes` | 32 MiB | Max inbound request body; oversize requests get `413` before the body is buffered. Raise for large file or image requests. |
| `limits` | `max_request_header_bytes` | unset | When set, oversize headers return `431`. |
| `limits` | `max_url_length` | unset | When set, an over-long URL returns `414`. |
| `timeouts` | `upstream_ttfb_ms` | 120000 | Max wait for the upstream's response headers (time to first byte). The response body then streams with no wall-clock cap. Applies to the direct Anthropic upstream path; Bedrock, Agent Platform, and Foundry are bounded by their provider SDK's own timeout. |
| `rate_limits` | `device_authorization.max` / `.window_seconds` | 30 / 600 | Per-IP rate limit on the unauthenticated device-authorization endpoint. Raise for a large org behind a shared egress IP or NAT. These limits apply only to the device-grant sign-in flow, not to `/v1/messages` inference. See [User-code brute-force resistance](/en/claude-apps-gateway-deploy#user-code-brute-force-resistance). |
| `rate_limits` | `device_verify.max` / `.window_seconds` | 10 / 600 | Per-IP rate limit on `user_code` submissions at `/device`. |
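The blocks in the table map onto YAML roughly as sketched below. The nesting follows the dotted key names in the table, the CIDR ranges and raised limits are placeholders, and writing `max_request_bytes` as a plain integer byte count is an assumption inferred from the key name.

```yaml
access_control:
  allow_cidrs: [10.0.0.0/8]      # non-empty: every other client is denied
  deny_cidrs: [10.99.0.0/16]     # checked first, wins over allow_cidrs

limits:
  max_request_bytes: 67108864    # 64 MiB, up from the 32 MiB default

timeouts:
  upstream_ttfb_ms: 180000       # wait up to 3 minutes for response headers

rate_limits:
  device_authorization:
    max: 120                     # raised for an org behind a shared NAT
    window_seconds: 600
  device_verify:
    max: 10
    window_seconds: 600
```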

## Complete example

This full reference config exercises every core section; the [HTTP tuning blocks](#http-tuning) keep their defaults. Copy it, delete what you don't need, and fill in your values. The config in the [Quickstart](/en/claude-apps-gateway#quickstart) is a minimal version of this.

```yaml gateway.yaml theme={null}
# Run with:
#   claude gateway --config gateway.yaml
#
# Operational log verbosity is controlled by the CLAUDE_GATEWAY_LOG_LEVEL
# environment variable (info | warn | error; default info). It does not
# affect audit events, which are always emitted.

listen:
  host: 0.0.0.0
  port: 8080
  public_url: https://claude-gateway.internal.example.com
  # Omit the tls block when running behind a TLS-terminating ingress.
  # tls:
  #   cert: /certs/gateway.crt
  #   key: /certs/gateway.key
  # trusted_proxies:
  #   - 10.0.0.0/8

oidc:
  issuer: https://example.okta.com
  client_id: 0oa1example2
  client_secret: ${OIDC_CLIENT_SECRET}
  allowed_email_domains:
    - example.com
  # Required when the issuer is the Okta org server, whose id_tokens
  # can omit email and groups; the gateway fills them from /userinfo.
  userinfo_fallback: true
  # allowed_groups: [claude-code-users]
  # Okta emits groups only when the `groups` scope is requested and the
  # app's groups claim filter allows them. The contractors policy below
  # matches on groups, so the scope is requested here.
  scopes: [openid, profile, email, offline_access, groups]
  # extra_auth_params: { access_type: offline, prompt: consent }   # Google
  # groups_claim: groups   # Entra app roles: use `roles`
  # email_claim: email

session:
  jwt_secret: ${GATEWAY_JWT_SECRET}   # openssl rand -base64 32
  # ttl_hours: 1

store:
  postgres_url: ${GATEWAY_POSTGRES_URL}
  # max_connections: 5

# Enables /v1/organizations/spend_limits (mirrors the Anthropic Admin API)
# and per-developer spend enforcement on /v1/messages. Omit to disable.
# Caps themselves are set via the admin API, not here.
# admin:
#   write_keys:
#     - { id: terraform, key: "${GATEWAY_ADMIN_WRITE_KEY_TF}" }
#   read_keys:
#     - { id: reporting, key: "${GATEWAY_ADMIN_READ_KEY}" }
#   admin_groups: [platform-finops]
#   blocked_message: request an increase at https://go.example.com/claude-limits
#   # audit_retention_days: 365
#   # spend_retention_months: 13
#   # identity_retention_days: 90
#   # group_limit_mode: min

# enforcement:
#   fail_closed_on_error: false

upstreams:
  - provider: anthropic
    auth:
      api_key: ${ANTHROPIC_API_KEY}

  # - provider: bedrock
  #   region: us-east-1
  #   auth: {}

  # - provider: vertex
  #   region: us-east5
  #   project_id: example-prod
  #   auth: {}

  # - provider: foundry
  #   resource: example-foundry
  #   auth: { use_azure_ad: true }

auto_include_builtin_models: true
models:
  - id: claude-opus-4-8
    label: Claude Opus 4.8
    upstream_model:
      anthropic: claude-opus-4-8
      # bedrock: us.anthropic.claude-opus-4-8
      # vertex: claude-opus-4-8
      # foundry: <your-opus-deployment-name>
  - id: claude-sonnet-4-6
    label: Claude Sonnet 4.6
    upstream_model:
      anthropic: claude-sonnet-4-6
  - id: claude-haiku-4-5
    label: Claude Haiku 4.5
    upstream_model:
      anthropic: claude-haiku-4-5

managed:
  policies:
    - match: { groups: [contractors] }
      cli:
        availableModels: [claude-haiku-4-5]
        # Constrain the Default picker option to availableModels instead of
        # the tier default, so contractors don't get a 400 on the default.
        enforceAvailableModels: true
        # allow auto-approves these tools; it does not block the rest.
        # Add deny rules to restrict tools.
        permissions: { allow: [Read, Grep] }
    - match: {}
      cli:
        availableModels: [claude-opus-4-8, claude-sonnet-4-6, claude-haiku-4-5]
        permissions:
          allow: [Read, Grep, Bash, Edit]
          deny: ["WebFetch"]
        env: { HTTP_PROXY: http://proxy.example.com:8080 }

telemetry:
  forward_to:
    - url: https://otel.internal.example.com:4318
      headers:
        Authorization: Bearer ${OTEL_TOKEN}
```

## Client-side managed settings

Everything above configures the gateway server. Pointing developer machines at it is configured separately, on each device, through Claude Code's [managed settings](/en/settings#settings-files). The gateway can't push these keys itself, because they're what tell the client where the gateway is.

For the CLI, set both keys in the per-OS `managed-settings.json`:

```json theme={null}
{
  "forceLoginMethod": "gateway",
  "forceLoginGatewayUrl": "https://claude-gateway.internal.example.com"
}
```

Deploy that file to each device, typically via your MDM platform. The file path differs by platform:

| Platform | Path |
| --- | --- |
| macOS | `/Library/Application Support/ClaudeCode/managed-settings.json`, or the `com.anthropic.claudecode` managed preferences domain |
| Linux and WSL | `/etc/claude-code/managed-settings.json` |
| Windows | `C:\Program Files\ClaudeCode\managed-settings.json`, or Group Policy via the HKLM registry |

`forceLoginGatewayUrl`, and the `"gateway"` value of `forceLoginMethod`, are honored only from the admin-controlled managed tier. A developer setting them in their own `~/.claude/settings.json` has no effect.

## Related

* [Claude apps gateway overview](/en/claude-apps-gateway): quickstart and developer connection
* [Deployment guide](/en/claude-apps-gateway-deploy): IdP setup, container image, Kubernetes and Cloud Run, and operations
* [Spend limits](/en/claude-apps-gateway-spend-limits): per-developer caps and the Admin API