community/google-cloud-vertex-ai.md +184 −0 created
#### Community Integrations

# Google Cloud Vertex AI

Access xAI’s Grok models through Google Cloud’s managed platform with enterprise security, governance, and unified billing.

This guide walks through setting up and using Grok models on Google Cloud Vertex AI / Gemini Enterprise Agent Platform. Grok on Vertex AI is accessed as a partner model through the OpenAI-compatible API, including the Responses API and Chat Completions. Models are enabled through Model Garden.

## Prerequisites

Before you begin, ensure you have:

* An active Google Cloud Platform (GCP) project with billing enabled.
* Permissions to enable APIs and access Model Garden, such as the Vertex AI User or Project Editor role.
* The `aiplatform.googleapis.com` API, or equivalent Agent Platform API, enabled in your project.
* Google Cloud CLI (`gcloud`) installed and authenticated for Application Default Credentials (ADC).

Set up ADC and your project:

```bash customLanguage="bash"
gcloud auth application-default login
gcloud config set project YOUR_PROJECT_ID
```

Enable the required API if it is not already enabled:

```bash customLanguage="bash"
gcloud services enable aiplatform.googleapis.com
```

## Install required packages

```bash customLanguage="bash"
pip install -U openai google-cloud-aiplatform
```

## Enable Grok models in Model Garden

1. Go to the Google Cloud Console Model Garden, or search for “Model Garden” in the console.
2. Search for “Grok”, or browse by publisher xAI.
3. Select the desired Grok model, such as Grok 4.2 or Grok 4.3.
4. Review the model card for capabilities, quotas, pricing, and regions.
5. Click **Enable** or **Deploy / request access** if prompted.
6. Once enabled, the model becomes available for API calls.

Use the model ID shown in Model Garden. Vertex model names may use a publisher prefix, for example:

* `xai/grok-4.3`

Model availability generally matches the xAI API, subject to Google Cloud regional availability and quotas.

## Make your first API call

Grok on Vertex uses the OpenAI-compatible interface. You can use the standard `openai` Python library.

### Authentication

Use Application Default Credentials (ADC). The `openai` client does not read ADC on its own: obtain a short-lived access token from your `gcloud` login or service account with `google-auth` (installed as a dependency of `google-cloud-aiplatform`) and pass it to the client as the API key.

Set the OpenAI-compatible base URL for Vertex with the `OPENAI_BASE_URL` environment variable or directly in the client; otherwise the SDK sends requests to its default OpenAI endpoint. Use the exact endpoint from the model card or Google documentation for the Agent Platform.

```bash customLanguage="bash"
export OPENAI_BASE_URL="https://YOUR_VERTEX_ENDPOINT"
```
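
ADC access tokens are short-lived, so a long-running service has to refresh them rather than fetch one at startup. The following is a minimal sketch of a refresh-on-demand cache, not an official pattern: `TokenCache`, `fetch`, `margin_s`, and the 5-minute margin are hypothetical names and values chosen for illustration, and the commented `adc_fetch` shows one way it might be wired to `google-auth`.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Caches a bearer token and refetches it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        margin_s: float = 300.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch = fetch        # returns (token, expiry as Unix seconds)
        self._margin_s = margin_s  # refresh this many seconds before expiry
        self._clock = clock
        self._token: Optional[str] = None
        self._expiry = 0.0

    def get(self) -> str:
        """Returns a cached token, refetching when it is missing or near expiry."""
        if self._token is None or self._clock() >= self._expiry - self._margin_s:
            self._token, self._expiry = self._fetch()
        return self._token


# Possible wiring to google-auth (assumes ADC is configured):
#   import datetime
#   import google.auth
#   from google.auth.transport.requests import Request
#
#   creds, _ = google.auth.default(
#       scopes=["https://www.googleapis.com/auth/cloud-platform"]
#   )
#
#   def adc_fetch():
#       creds.refresh(Request())
#       # creds.expiry is a naive UTC datetime, so attach UTC before converting.
#       expiry = creds.expiry.replace(tzinfo=datetime.timezone.utc)
#       return creds.token, expiry.timestamp()
#
#   cache = TokenCache(adc_fetch)
```

A client can then be constructed with `OpenAI(api_key=cache.get())` whenever a fresh token is needed.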
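
### Responses API example

```python customLanguage="pythonOpenAISDK"
import google.auth
from google.auth.transport.requests import Request
from openai import OpenAI

# The OpenAI SDK does not read ADC itself: fetch a short-lived access token
# and pass it as the API key. The base URL is read from OPENAI_BASE_URL.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(Request())
client = OpenAI(api_key=credentials.token)

response = client.responses.create(
    model="xai/grok-4.3",
    input="Explain the advantages of using Grok for agentic workflows with parallel tool calling.",
    max_output_tokens=800,
)

print(response.output_text)
```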

### Chat Completions example

```python customLanguage="pythonOpenAISDK"
import google.auth
from google.auth.transport.requests import Request
from openai import OpenAI

# Authenticate with an ADC access token; the base URL is read from OPENAI_BASE_URL.
credentials, _ = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
credentials.refresh(Request())
client = OpenAI(api_key=credentials.token)

response = client.chat.completions.create(
    model="xai/grok-4.3",
    messages=[
        {
            "role": "user",
            "content": "Which city has a higher temperature right now, Boston or New Delhi, and by how much in Fahrenheit?",
        }
    ],
    tools=[
        {
            "type": "function",
            "function": {
                "name": "get_current_weather",
                "description": "Get the current weather in a given location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g., San Francisco, CA",
                        },
                        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
                    },
                    "required": ["location"],
                },
            },
        }
    ],
    tool_choice="auto",
)

message = response.choices[0].message
if message.tool_calls:
    # With tools available, the model may request tool calls instead of returning text.
    for call in message.tool_calls:
        print(call.function.name, call.function.arguments)
else:
    print(message.content)
```

Both interfaces support streaming, which reduces perceived latency by returning output as it is generated.

## Function calling and tool use

Grok excels at tool use and parallel function calling across the Responses API and Chat Completions interfaces. Define clear, strict schemas for tools so the model can select and call them reliably.
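
With parallel function calling, a single assistant message can carry several tool calls, and each one needs a `tool` message that echoes its `tool_call_id`. The sketch below shows one way to dispatch them in the Chat Completions format. `run_tool_calls`, `TOOLS`, and the stubbed `get_current_weather` are hypothetical; a real implementation would call an actual weather service.

```python
import json
from types import SimpleNamespace


def get_current_weather(location: str, unit: str = "fahrenheit") -> dict:
    """Hypothetical stub; replace with a real weather lookup."""
    return {"location": location, "unit": unit, "temperature": None}


# Maps the tool names declared in the request to local Python callables.
TOOLS = {"get_current_weather": get_current_weather}


def run_tool_calls(tool_calls) -> list:
    """Runs each requested tool and returns one `tool` message per call."""
    messages = []
    for call in tool_calls:
        func = TOOLS.get(call.function.name)
        try:
            args = json.loads(call.function.arguments or "{}")
            if func is None:
                result = {"error": f"unknown tool {call.function.name}"}
            else:
                result = func(**args)
        except (json.JSONDecodeError, TypeError) as exc:
            # Malformed JSON or arguments that do not match the function.
            result = {"error": str(exc)}
        messages.append(
            {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}
        )
    return messages


# Offline demo with objects shaped like `message.tool_calls`.
demo_calls = [
    SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(
            name="get_current_weather", arguments='{"location": "Boston, MA"}'
        ),
    ),
    SimpleNamespace(
        id="call_2",
        function=SimpleNamespace(
            name="get_current_weather", arguments='{"location": "New Delhi"}'
        ),
    ),
]
demo_messages = run_tool_calls(demo_calls)
print(len(demo_messages))  # 2
```

To continue the conversation, append the assistant message and these `tool` messages to `messages` and call the API again.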

## Data retention and compliance

Data retention and processing for Grok models on Google Cloud are governed by Google Cloud Vertex AI policies.

* Many deployments support Zero Data Retention (ZDR) options.
* Review the specific model card and your organization’s Google Cloud data governance settings.
* Activity logging can be enabled with Vertex AI request-response logging for audit and debugging purposes.

See Google Cloud documentation on Vertex AI data governance and logging for details.

## Feature support

Supported capabilities include:

* Responses API and Chat Completions.
* Function calling and tool use, including parallel function calling.
* Reasoning modes / extended thinking.
* Structured outputs / JSON mode.
* Streaming.
* Fixed quotas and committed use discounts through Google Cloud.

Context windows vary by model. Check the specific Grok model card in Model Garden for the current limit.
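
For the structured outputs capability listed above, a request might look like the sketch below. It assumes the Vertex endpoint accepts the standard OpenAI `response_format` of type `json_schema`; that is an assumption to confirm on the model card. `structured_request` and `parse_structured` are hypothetical helpers that only build the request arguments and check the reply.

```python
import json


def structured_request(model: str, prompt: str, schema_name: str, schema: dict) -> dict:
    """Builds Chat Completions keyword arguments that request JSON matching `schema`."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": schema_name, "schema": schema, "strict": True},
        },
    }


def parse_structured(content: str, required: list) -> dict:
    """Parses the reply text and verifies that the required keys are present."""
    data = json.loads(content)
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


weather_schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}, "temp_f": {"type": "number"}},
    "required": ["city", "temp_f"],
    "additionalProperties": False,
}
request_kwargs = structured_request(
    "xai/grok-4.3", "Report the weather in Boston.", "weather_report", weather_schema
)

# With a real client:
#   response = client.chat.completions.create(**request_kwargs)
#   report = parse_structured(response.choices[0].message.content, ["city", "temp_f"])
```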

## Global, multi-region, and regional endpoints

Vertex AI / Gemini Enterprise Agent Platform offers flexible endpoint routing:

* **Global endpoints:** maximum availability with dynamic routing; recommended for most use cases.
* **Regional endpoints:** routing through specific regions for strict compliance requirements.
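
To make the global versus regional choice concrete, the sketch below builds a base URL for either option. It assumes Grok is served from the same URL pattern as Vertex AI's general OpenAI-compatible endpoint, which this guide does not confirm; treat `openai_base_url` as a hypothetical helper and prefer the exact endpoint shown on the model card.

```python
def openai_base_url(project_id: str, location: str = "global") -> str:
    """Builds an OpenAI-compatible base URL (assumed pattern; verify on the model card)."""
    if location == "global":
        host = "aiplatform.googleapis.com"
    else:
        # Regional endpoints prefix the host with the region name.
        host = f"{location}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project_id}"
        f"/locations/{location}/endpoints/openapi"
    )


print(openai_base_url("YOUR_PROJECT_ID"))
print(openai_base_url("YOUR_PROJECT_ID", "us-central1"))
```

The result can be exported as `OPENAI_BASE_URL` or passed to the client as `base_url`.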

## Best practices

* Choose the Grok model and endpoint configuration that match your latency, throughput, and reasoning requirements.
* Prefer Application Default Credentials and IAM roles over long-lived keys. Use service accounts for production workloads.
* Monitor usage in Google Cloud Billing and Quotas pages. Request quota increases as needed.
* Use clear tool schemas and explicit output formats.
* Enable request logging and integrate with Google Cloud Monitoring / Logging.
* When migrating from the direct xAI API, update the base URL, client configuration, and model prefix. Most prompts and tool definitions transfer with minimal changes.
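
The migration bullet above names three changes: base URL, client configuration, and model prefix. The sketch below applies them to a dictionary of client settings. It assumes the Vertex model ID is the xAI model name with an `xai/` prefix, as in the `xai/grok-4.3` example; `to_vertex_model_id` and `migrate_client_kwargs` are hypothetical helpers.

```python
def to_vertex_model_id(xai_model: str, publisher: str = "xai") -> str:
    """Adds the publisher prefix unless the ID already has it."""
    prefix = f"{publisher}/"
    return xai_model if xai_model.startswith(prefix) else prefix + xai_model


def migrate_client_kwargs(kwargs: dict, vertex_base_url: str, access_token: str) -> dict:
    """Returns a copy of the client settings pointed at Vertex instead of the xAI API."""
    migrated = dict(kwargs)
    migrated["base_url"] = vertex_base_url  # exact endpoint from the model card
    migrated["api_key"] = access_token      # ADC access token, not an xAI API key
    return migrated


print(to_vertex_model_id("grok-4.3"))      # xai/grok-4.3
print(to_vertex_model_id("xai/grok-4.3"))  # xai/grok-4.3
```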

## Troubleshooting

| Issue | What to check |
|---|---|
| Authentication errors | Run `gcloud auth application-default login` and verify project permissions. |
| Model not found | Confirm the model is enabled in Model Garden and use the exact `xai/...` ID. |
| Quota exceeded | Check quotas in the Google Cloud console and request increases as needed. |
| Endpoint / base URL issues | Use the exact endpoint or environment variable from the model card or Google documentation. |

Start in the Google Cloud console playground / Model Garden interface when available, then move to code.
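
The checks in the table above can also be surfaced from code. The sketch below maps HTTP status codes to those checks; the mapping is an assumption that the endpoint reports these conditions with the conventional 401, 403, 404, and 429 codes, and `troubleshooting_hint` is a hypothetical helper. The `openai` library exposes the code as `status_code` on `openai.APIStatusError`.

```python
HINTS = {
    401: "Authentication: run `gcloud auth application-default login` and refresh the token.",
    403: "Permissions: verify the caller's IAM roles on the project.",
    404: "Not found: confirm the model is enabled in Model Garden, then check the `xai/...` ID and base URL.",
    429: "Quota exceeded: check quotas in the Google Cloud console and request an increase.",
}


def troubleshooting_hint(status_code: int) -> str:
    """Returns the matching check, or a generic hint for other codes."""
    return HINTS.get(
        status_code,
        f"Unexpected status {status_code}: see the response body for details.",
    )


print(troubleshooting_hint(404))

# With a real client:
#   try:
#       client.chat.completions.create(...)
#   except openai.APIStatusError as exc:
#       print(troubleshooting_hint(exc.status_code))
```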

## Next steps

* Explore enabled models in Model Garden.
* Build agentic applications that use Grok’s tool-calling strengths.
* Integrate with Google Cloud services such as Cloud Functions and Vertex AI Pipelines.
* Review the full xAI Grok documentation and model cards for prompting tips and capabilities.