If you have Google Cloud credentials and want to start using Claude Code through Vertex AI, the login wizard walks you through it. You complete the GCP-side prerequisites once per project; the wizard handles the Claude Code side.
Run claude. At the login prompt, select 3rd-party platform, then Google Vertex AI.
Follow the wizard prompts
Choose how you authenticate to Google Cloud: Application Default Credentials from gcloud, a service account key file, or credentials already in your environment. The wizard detects your project and region, verifies which Claude models your project can invoke, and lets you pin them. It saves the result to the env block of your user settings file, so you don't need to export environment variables yourself.
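Outside the wizard, the first two credential options map onto standard Google Cloud mechanisms. The sketch below is a minimal illustration, not what the wizard itself runs: `gcloud auth application-default login` is the usual way to create Application Default Credentials, and `GOOGLE_APPLICATION_CREDENTIALS` is the standard variable Google auth libraries read for a key file path. The helper name `use_key_file` is hypothetical.

```shell
# Option 1: Application Default Credentials from gcloud (opens a browser).
#   gcloud auth application-default login

# Option 2: a service account key file.
# use_key_file is a hypothetical helper: it exports
# GOOGLE_APPLICATION_CREDENTIALS only if the file is readable.
use_key_file() {
  if [ -r "$1" ]; then
    export GOOGLE_APPLICATION_CREDENTIALS="$1"
  else
    echo "key file not readable: $1" >&2
    return 1
  fi
}
```

For example, `use_key_file "$HOME/keys/vertex-sa.json"` (an assumed path) sets the variable for the current shell only.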
After you've signed in, run /setup-vertex any time to reopen the wizard and change your credentials, project, region, or model pins.
Region configuration
Claude Code supports Vertex AI global, multi-region, and regional endpoints. Set CLOUD_ML_REGION to global, a multi-region location such as eu or us, or a specific region such as us-east5. Claude Code selects the correct Vertex AI hostname for each form, including the aiplatform.eu.rep.googleapis.com and aiplatform.us.rep.googleapis.com hosts for multi-region locations.
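To make the three location forms concrete, here is a small sketch of how a `CLOUD_ML_REGION` value could map to a hostname. It is an illustration only, not Claude Code's implementation: `vertex_host` is a hypothetical function, and the regional pattern `REGION-aiplatform.googleapis.com` is assumed from Google's usual Vertex AI endpoint naming rather than stated on this page.

```shell
# vertex_host LOCATION
# Prints the Vertex AI hostname assumed for a given location value.
vertex_host() {
  case "$1" in
    global) echo "aiplatform.googleapis.com" ;;          # global endpoint
    eu|us)  echo "aiplatform.$1.rep.googleapis.com" ;;   # multi-region
    *)      echo "$1-aiplatform.googleapis.com" ;;       # specific region (assumed pattern)
  esac
}
```

Running `vertex_host eu` prints the multi-region host named above; you normally never need this mapping yourself, since setting `CLOUD_ML_REGION` is enough.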
Set up manually
To configure Vertex AI through environment variables instead of the wizard, for example in CI or a scripted enterprise rollout, follow the steps below.
1. Enable Vertex AI API
Enable the Vertex AI API in your GCP project:
# Set your project ID
gcloud config set project YOUR-PROJECT-ID
# Enable Vertex AI API
gcloud services enable aiplatform.googleapis.com
# Enable Vertex AI integration
export CLAUDE_CODE_USE_VERTEX=1
export CLOUD_ML_REGION=global
export ANTHROPIC_VERTEX_PROJECT_ID=YOUR-PROJECT-ID
# Optional: Override the Vertex endpoint URL for custom endpoints or gateways
# export ANTHROPIC_VERTEX_BASE_URL=https://aiplatform.googleapis.com
# Optional: Disable prompt caching if needed
export DISABLE_PROMPT_CACHING=1
# Optional: Request 1-hour prompt cache TTL instead of the 5-minute default
export ENABLE_PROMPT_CACHING_1H=1
# When CLOUD_ML_REGION=global, override region for models that don't support global endpoints
export VERTEX_REGION_CLAUDE_HAIKU_4_5=us-east5
export VERTEX_REGION_CLAUDE_4_6_SONNET=europe-west1
Most model versions have a corresponding VERTEX_REGION_CLAUDE_* variable. See the Environment variables reference for the full list. Check Vertex Model Garden to determine which models support global endpoints versus regional only.
Prompt caching is enabled automatically. To disable it, set DISABLE_PROMPT_CACHING=1. To request a 1-hour cache TTL instead of the 5-minute default, set ENABLE_PROMPT_CACHING_1H=1; cache writes with a 1-hour TTL are billed at a higher rate. For higher rate limits, contact Google Cloud support. The /login and /logout commands are disabled when using Vertex AI, because authentication is handled through Google Cloud credentials.
MCP tool search is disabled by default on Vertex AI because the endpoint does not accept the required beta header. All MCP tool definitions load upfront instead. To opt in, set ENABLE_TOOL_SEARCH=true.
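In CI or a scripted rollout it can help to confirm the core variables are set before launching claude. The sketch below assumes the three variables under "Enable Vertex AI integration" are the minimum needed; `check_vertex_env` is a hypothetical helper, not part of Claude Code.

```shell
# check_vertex_env
# Returns non-zero and names each core Vertex variable that is unset or empty.
check_vertex_env() {
  missing=0
  for name in CLAUDE_CODE_USE_VERTEX CLOUD_ML_REGION ANTHROPIC_VERTEX_PROJECT_ID; do
    eval "value=\${$name:-}"   # indirect lookup that works in POSIX sh
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script might call `check_vertex_env || exit 1` right before invoking claude, so a misconfigured job fails early with a readable message.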
5. Pin model versions
Set these environment variables to specific Vertex AI model IDs.
Without ANTHROPIC_DEFAULT_OPUS_MODEL, the opus alias on Vertex resolves to Opus 4.6. Set it to the Opus 4.7 ID to use the latest model:
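As a sketch of the shape this takes, with a placeholder standing in for the model ID (look up the exact Opus 4.7 ID in Model Garden, since it is not reproduced here):

```shell
# YOUR-OPUS-4-7-MODEL-ID is a placeholder, not a real model ID.
export ANTHROPIC_DEFAULT_OPUS_MODEL='YOUR-OPUS-4-7-MODEL-ID'
```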
When Claude Code starts with Vertex AI configured, it verifies that the models it intends to use are accessible in your project. This check requires Claude Code v2.1.98 or later.
If you have pinned a model version that is older than the current Claude Code default, and your project can invoke the newer version, Claude Code prompts you to update the pin. Accepting writes the new model ID to your user settings file and restarts Claude Code. Declining is remembered until the next default version change.
If you have not pinned a model and the current default is unavailable in your project, Claude Code falls back to the previous version for the current session and shows a notice. The fallback is not persisted. Enable the newer model in Model Garden or pin a version to make the choice permanent.
IAM configuration
Assign the required IAM permissions:
The roles/aiplatform.user role includes the required permissions:
aiplatform.endpoints.predict - Required for model invocation and token counting
For more restrictive permissions, create a custom role with only the permissions above.
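The two approaches above can be sketched with gcloud. The helpers below only print the commands so you can review them before running anything; their names are hypothetical, and the project, member, and role ID arguments are placeholders you supply.

```shell
# Dry-run helpers: each prints a gcloud command instead of executing it.

# Usage: print_grant_vertex_user PROJECT_ID MEMBER
# Grants the predefined roles/aiplatform.user role to MEMBER.
print_grant_vertex_user() {
  printf 'gcloud projects add-iam-policy-binding %s --member=%s --role=roles/aiplatform.user\n' "$1" "$2"
}

# Usage: print_create_predict_role PROJECT_ID ROLE_ID
# Creates a custom role holding only the permission listed above.
print_create_predict_role() {
  printf 'gcloud iam roles create %s --project=%s --title=%s --permissions=aiplatform.endpoints.predict\n' "$2" "$1" "$2"
}
```

For example, `print_grant_vertex_user YOUR-PROJECT-ID user:you@example.com` prints the binding command for that account; copy it into your shell once it looks right.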
Claude Opus 4.7, Opus 4.6, and Sonnet 4.6 support the 1M token context window on Vertex AI. Claude Code automatically enables the extended context window when you select a 1M model variant.
Verify the model is available in the location you specified. Some models are offered only on global or multi-region locations such as eu and us, not in specific regions.
If using CLOUD_ML_REGION=global, check that your models support global endpoints in Model Garden under "Supported features". For models that don't support global endpoints, either:
Specify a supported model via ANTHROPIC_MODEL or ANTHROPIC_DEFAULT_HAIKU_MODEL, or
Set a region or multi-region location using VERTEX_REGION_<MODEL_NAME> environment variables
If you encounter 429 errors:
For regional endpoints, ensure the primary model and small/fast model are supported in your selected region
Consider switching to CLOUD_ML_REGION=global for better availability