Documentation — Spybara

Files

advanced-api-usage
- batch-api.md
- priority-processing.md
model-capabilities
- audio
  - voice-agent.md
rate-limits.md

advanced-api-usage/batch-api.md +1 −1


# Batch API

~~The Batch API lets you process large volumes of requests asynchronously with reduced pricing and higher rate limits. For pricing details, see [Batch API Pricing](/developers/pricing#batch-api-pricing).~~

The Batch API lets you process large volumes of requests asynchronously with reduced pricing and higher rate limits. For pricing details, see [Batch API Pricing](/developers/pricing#batch-api-pricing). If you need lower latency on real-time requests instead, see [Priority Processing](/developers/advanced-api-usage/priority-processing).

## What is the Batch API?

advanced-api-usage/priority-processing.md +113 −0 created


#### Advanced API Usage

# Priority Processing

Priority Processing gives your xAI API requests higher scheduling priority, which typically results in lower time-to-first-token (TTFT) and lower inter-token latency (ITL), especially during periods of high demand. Add `service_tier: "priority"` to any request body to opt in; no capacity reservations or advance provisioning are required. The parameter is supported on text inference endpoints: Chat Completions and Responses.

When priority capacity is available, requests are scheduled ahead of standard traffic. The response always includes a `service_tier` field indicating whether priority was granted; check it to confirm.

## How it works

Add the `service_tier` field to any supported request. The API returns the tier that was actually used in the response, so you can confirm the upgrade took effect.

The `service_tier` field accepts the following values:

| Value | Meaning |
|-------|---------|
| `"default"` | Standard processing. This is the same as omitting the field entirely. |
| `"priority"` | Request higher scheduling priority at a premium token price. |

Priority requests are billed at a premium per-token rate. Cache discounts still apply to cached input tokens before the multiplier. For current per-model rates and the exact priority premium, see the [Pricing](/developers/pricing) page.
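
To make the billing order concrete, here is a minimal sketch of how a priority request's cost might be estimated. Everything numeric in it is a placeholder: the per-token rates, the cached-input rate, and the 2x multiplier are invented for illustration and are not xAI prices. It also assumes that cached tokens are a subset of the input tokens and that the multiplier applies uniformly to every token class. The Pricing page remains the authority on the real values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical USD prices per one million tokens (not real xAI rates)."""
    input_per_m: float
    cached_input_per_m: float
    output_per_m: float


def estimate_cost(rates: Rates, input_tokens: int, cached_tokens: int,
                  output_tokens: int, multiplier: float = 1.0) -> float:
    """Estimate request cost in USD.

    The cache discount is applied first (cached tokens use the cheaper rate),
    and the priority multiplier is applied to the discounted total afterwards.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    base = (
        uncached_tokens * rates.input_per_m
        + cached_tokens * rates.cached_input_per_m
        + output_tokens * rates.output_per_m
    ) / 1_000_000
    return base * multiplier


if __name__ == "__main__":
    example = Rates(input_per_m=2.0, cached_input_per_m=0.5, output_per_m=10.0)
    default_cost = estimate_cost(example, 1_000_000, 400_000, 100_000)
    priority_cost = estimate_cost(example, 1_000_000, 400_000, 100_000, multiplier=2.0)
    print(f"default: ${default_cost:.2f}, priority: ${priority_cost:.2f}")
```

Because the discount is applied before the multiplier, a cached token stays cheaper than an uncached one at either tier.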

## Quick start

Pass `service_tier: "priority"` in your request body. The response includes a `service_tier` field confirming which tier was used.

```bash customLanguage="bash"
curl https://api.x.ai/v1/responses \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4.3",
    "input": "Explain the Riemann hypothesis in one paragraph.",
    "service_tier": "priority"
  }'
```

```python customLanguage="pythonXAI"
import os

from xai_sdk import Client
from xai_sdk.chat import user

client = Client(api_key=os.getenv("XAI_API_KEY"))

chat = client.chat.create(
    model="grok-4.3",
    service_tier="priority",
)
chat.append(user("Explain the Riemann hypothesis in one paragraph."))

response = chat.sample()

print(response.content)
print(f"Tier used: {response.service_tier}")
```

```python customLanguage="pythonOpenAISDK"
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("XAI_API_KEY"),
    base_url="https://api.x.ai/v1",
)

response = client.responses.create(
    model="grok-4.3",
    input="Explain the Riemann hypothesis in one paragraph.",
    service_tier="priority",
)

print(response.output_text)
print(f"Tier used: {response.service_tier}")
```

```javascript customLanguage="javascriptOpenAISDK"
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.XAI_API_KEY,
  baseURL: "https://api.x.ai/v1",
});

const response = await client.responses.create({
  model: "grok-4.3",
  input: "Explain the Riemann hypothesis in one paragraph.",
  service_tier: "priority",
});

console.log(response.output_text);
console.log(`Tier used: ${response.service_tier}`);
```

The response includes `"service_tier": "priority"` when the request was served at the priority tier, or `"service_tier": "default"` if it was served at the default tier instead. You are only billed at the priority rate when the response confirms `"priority"`.

```json customLanguage="json"
{
  "id": "resp_abc123",
  "model": "grok-4.3",
  "service_tier": "priority",
  "usage": {
    "input_tokens": 42,
    "output_tokens": 156,
    "cost_in_usd_ticks": 37756000
  }
}
```
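
As a small illustration of the check described above, the sketch below parses a raw response body and reports whether the request was served (and therefore billed) at the priority tier. The helper names `served_tier` and `billed_at_priority` are hypothetical and not part of any SDK. Treating a missing field as "not priority" is an assumption made here for safety.

```python
import json
from typing import Optional


def served_tier(response_body: str) -> Optional[str]:
    """Return the `service_tier` value from a JSON response body, if present."""
    data = json.loads(response_body)
    return data.get("service_tier")


def billed_at_priority(response_body: str) -> bool:
    """True only when the response explicitly confirms the priority tier."""
    return served_tier(response_body) == "priority"


if __name__ == "__main__":
    body = """
    {
      "id": "resp_abc123",
      "model": "grok-4.3",
      "service_tier": "priority",
      "usage": {"input_tokens": 42, "output_tokens": 156}
    }
    """
    print(f"Tier used: {served_tier(body)}")
    print(f"Billed at priority rate: {billed_at_priority(body)}")
```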

## Best practices

* **Latency-sensitive paths first**: Priority Processing is most valuable for user-facing requests where response time directly affects experience. Background jobs, evaluations, and bulk processing are better served by the [Batch API](/developers/advanced-api-usage/batch-api).
* **Monitor the `service_tier` field**: Log the returned tier to track how often your requests are served at priority versus default and to correlate with your latency metrics.
* **Combine with prompt caching**: Cached input tokens are discounted before the priority multiplier is applied, so [prompt caching](/developers/advanced-api-usage/prompt-caching) and priority processing complement each other well.
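
One way to act on the monitoring advice is to keep a small in-process tally of returned tiers alongside measured latency. The sketch below is an assumption-laden illustration: `TierMonitor` is a hypothetical helper, latencies are whatever your application measures (in seconds here), and a real deployment would more likely emit these values to its existing metrics system.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Optional


class TierMonitor:
    """Tracks how often requests are served at each tier and how fast they were."""

    def __init__(self) -> None:
        self._latencies: Dict[str, List[float]] = defaultdict(list)

    def record(self, tier: str, latency_s: float) -> None:
        """Store one observation: the returned `service_tier` and its latency."""
        self._latencies[tier].append(latency_s)

    def priority_rate(self) -> float:
        """Fraction of recorded requests that were served at the priority tier."""
        total = sum(len(values) for values in self._latencies.values())
        if total == 0:
            return 0.0
        return len(self._latencies.get("priority", [])) / total

    def mean_latency(self, tier: str) -> Optional[float]:
        """Mean latency for a tier, or None if nothing was recorded for it."""
        values = self._latencies.get(tier)
        return mean(values) if values else None


if __name__ == "__main__":
    monitor = TierMonitor()
    # Illustrative observations; in practice use response.service_tier and a timer.
    monitor.record("priority", 0.2)
    monitor.record("priority", 0.4)
    monitor.record("default", 0.9)
    print(f"priority rate: {monitor.priority_rate():.0%}")
    print(f"mean priority latency: {monitor.mean_latency('priority')}")
```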

model-capabilities/audio/voice-agent.md +88 −2


This section outlines key recommendations for building low-latency, reliable, and natural-feeling voice experiences using the xAI Voice Agent API.

~~### Minimize Perceived Latency – Parallel Initialization~~

### Minimize perceived latency with parallel initialization

~~**Start the WebSocket connection and microphone input streaming in parallel.**~~

Start the WebSocket connection and microphone input streaming in parallel.

* Initiate the WebSocket connection (including authentication via ephemeral token or API key) **as early as possible**, ideally when the voice interface loads or the user opens the mic-enabled screen.
* Simultaneously begin capturing microphone audio (using `getUserMedia` in browsers or equivalent APIs on mobile/native platforms).
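
The effect of starting both steps together can be illustrated with a small `asyncio` simulation. Nothing here talks to the Voice Agent API or a real microphone: `connect_websocket` and `start_microphone` are hypothetical stand-ins that only sleep, so the sketch shows the scheduling pattern rather than an actual integration.

```python
import asyncio
import time
from typing import Tuple


async def connect_websocket(delay_s: float) -> str:
    """Stand-in for opening and authenticating the WebSocket connection."""
    await asyncio.sleep(delay_s)
    return "socket-ready"


async def start_microphone(delay_s: float) -> str:
    """Stand-in for requesting microphone access and starting capture."""
    await asyncio.sleep(delay_s)
    return "mic-ready"


async def initialize(delay_s: float) -> Tuple[str, str]:
    """Run both setup steps concurrently instead of one after the other."""
    socket, mic = await asyncio.gather(
        connect_websocket(delay_s),
        start_microphone(delay_s),
    )
    return socket, mic


def timed_initialize(delay_s: float = 0.2) -> Tuple[str, str, float]:
    """Return both results plus the wall-clock time the setup took."""
    start = time.perf_counter()
    socket, mic = asyncio.run(initialize(delay_s))
    return socket, mic, time.perf_counter() - start


if __name__ == "__main__":
    socket, mic, elapsed = timed_initialize()
    print(socket, mic, f"{elapsed:.2f}s")
```

With both steps simulated at the same duration, the total setup time is close to one step's duration rather than the sum of the two.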

[...]

* **Domain Expertise**: Precise transcription of medical, legal, financial, and technical terminology (names, codes, and addresses).

## Telephony Providers

Use Direct SIP to route calls from your carrier or PBX into a voice agent. Configure your provider to send calls to the xAI SIP host.

| Value | Use |
|-------|-----|
| SIP host | `sip.voice.x.ai` |
| SIP URI | `sip:{number}@sip.voice.x.ai;transport=tls` |

Replace `{number}` with your Direct SIP phone number. If you restrict inbound calls by source IP, add your provider's signaling CIDR ranges to the allowlist before testing.
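
The two steps above (filling in the SIP URI and checking a source IP against an allowlist) can be sketched with the standard library. The helper names are hypothetical, the phone number is a placeholder, and the CIDR ranges are reserved documentation ranges rather than any provider's real signaling addresses. No particular number format is validated, since the expected format is not stated here.

```python
import ipaddress
from typing import Iterable

SIP_HOST = "sip.voice.x.ai"


def build_sip_uri(number: str) -> str:
    """Fill the `{number}` placeholder in the documented SIP URI template."""
    number = number.strip()
    if not number:
        raise ValueError("number must not be empty")
    return f"sip:{number}@{SIP_HOST};transport=tls"


def is_allowed(source_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if source_ip falls inside any of the allowlisted CIDR ranges."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


if __name__ == "__main__":
    # Placeholder number and documentation-only CIDR ranges (RFC 5737).
    print(build_sip_uri("+15550100"))
    allowlist = ["203.0.113.0/24", "198.51.100.0/28"]
    print(is_allowed("203.0.113.10", allowlist))
    print(is_allowed("192.0.2.55", allowlist))
```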

#### Twilio

1. In the Twilio Console, create a TwiML Bin and paste the following:

```text
<Response>
  <Dial answerOnBridge="true">
    <Sip>sip:{number}@sip.voice.x.ai;transport=tls</Sip>
  </Dial>
</Response>
```

2. Open your number's Voice configuration and set the handler to this TwiML Bin.
3. Place a test call to confirm the agent answers.
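
If you would rather generate the TwiML than paste it by hand, for example to serve it from your own webhook, the same document can be built with Python's standard XML library. This is a sketch, not a Twilio SDK call: `build_twiml` is a hypothetical helper and the number shown is a placeholder.

```python
import xml.etree.ElementTree as ET


def build_twiml(sip_uri: str) -> str:
    """Build the <Response><Dial><Sip> document shown above as a string."""
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial", {"answerOnBridge": "true"})
    sip = ET.SubElement(dial, "Sip")
    sip.text = sip_uri  # ElementTree escapes characters such as & automatically
    return ET.tostring(response, encoding="unicode")


if __name__ == "__main__":
    # Placeholder number; substitute your Direct SIP phone number.
    print(build_twiml("sip:+15550100@sip.voice.x.ai;transport=tls"))
```

Building the XML through a library rather than string formatting means any reserved characters in the URI are escaped correctly.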

#### Telnyx

1. In the Telnyx Portal, create a SIP Connection (FQDN) and an Outbound Voice Profile.
2. Set the connection's outbound destination to `sip.voice.x.ai`.
3. Place a test call to confirm the agent answers.

#### Microsoft Teams

1. Stand up a Microsoft-certified SBC (Ribbon, AudioCodes, or Cisco CUBE) with an outbound SIP trunk to `sip.voice.x.ai`.
2. In the Teams admin center, go to **Voice** → **Direct Routing** → **Add** and register the SBC's public FQDN as a PSTN gateway.
3. From the Teams PowerShell module, create a voice route and routing policy (`New-CsOnlineVoiceRoute` / `New-CsOnlineVoiceRoutingPolicy`) and grant the policy to your users.

#### Cisco Webex Calling

1. Stand up a Local Gateway (Cisco CUBE or equivalent SBC) with an outbound dial-peer to `sip.voice.x.ai`.
2. In Webex Control Hub, go to **Calling** → **Locations** → **Add trunk**, choose **Premises-based**, and point it at the Local Gateway's FQDN.
3. Under **Calling** → **Dial Plans**, route the relevant numbers or prefixes to the new trunk.

#### Genesys Cloud

1. In Genesys Cloud Admin, go to **Telephony** → **Trunks** → **Create New** and pick an External Trunk of type SIP.
2. Under **SIP Servers or Proxies**, add `sip.voice.x.ai`.
3. In **Routing** → **Architect Flows**, add a Transfer to External action to the agent's SIP URI and assign a DID to the flow.

#### NICE CXone

1. In CXone Admin, go to **Voice** → **SIP Connectivity** → **Create New** and pick External SIP.
2. Set the trunk's Destination to `sip.voice.x.ai`.
3. In Studio, add a SIP Transfer action targeting the agent's SIP URI and assign a DID to the script.

#### Amazon Chime SDK

1. In the AWS console, open **Amazon Chime SDK** → **Voice Connectors** → **Create Voice Connector** and enable Encryption.
2. On the connector's **Termination** tab, add `sip.voice.x.ai` and allowlist your origination CIDR ranges.
3. Assign your DIDs, then add a SIP rule that bridges inbound calls out to the agent's SIP URI.

#### Amazon Connect

1. Create an Amazon Chime SDK Voice Connector with Encryption enabled.
2. On the Voice Connector's **Termination** tab, add `sip.voice.x.ai`.
3. In your Connect contact flow, add a Lambda block that dials out through the Voice Connector to the agent's SIP URI, passing contact attributes as SIP headers.

#### RingCentral

1. Confirm BYOC is enabled on your plan (Ultimate or Premium).
2. In the admin portal, go to **Phone System** → **Phone Numbers** → **Carriers** → **Add Carrier** and set the endpoint to `sip.voice.x.ai`.
3. Under **Phone Numbers**, route the DIDs you want through this carrier.

#### Zoom Phone

1. Make sure BYOC is enabled on your Zoom account.
2. In the Zoom admin, go to **Phone System Management** → **Carrier Configuration** → **Add Carrier Trunk** and set the trunk address to `sip.voice.x.ai`.
3. Under the BYOC settings, assign the DIDs you want routed through this trunk.

#### Generic SIP / Other

1. In your carrier or PBX, create an outbound route or SIP trunk.
2. Point its destination at `sip.voice.x.ai`.
3. Place a test call to confirm the agent answers.

## Migrating from OpenAI Realtime

If you have an existing application built on the [OpenAI Realtime API](https://developers.openai.com/api/docs/guides/realtime-conversations), switching to the Grok Voice Agent API requires only a few changes: update the base URL, swap your API key, and choose a Grok voice model.

rate-limits.md +2 −2


~~| grok-imagine-image | 300 | 0 |~~
| grok-imagine-image-quality | 300 | 0 |
~~| grok-imagine-video-1.5-preview | 60 | 0 |~~
| grok-imagine-image | 300 | 0 |
| grok-imagine-video | 70 | 0 |
| grok-imagine-video-1.5-preview | 60 | 0 |

### What counts toward TPM

Documentation 2026-06-14 22:02 UTC to 2026-06-15 23:02 UTC