Documentation — Spybara

assistants/deep-dive.md +181 −175


# Assistants API deep dive

## Overview

Don't start a new integration on the Assistants API. We've announced plans to deprecate it soon, as the Responses API now provides the same features and a more elegant integration.

There are several concepts involved in building an app with the Assistants API, covered below in case it helps with your [migration to Responses](https://developers.openai.com/api/docs/guides/assistants/migration).

## Creating assistants

We recommend using OpenAI's <a href="/api/docs/models">latest models</a> with the Assistants API for best results and maximum compatibility with tools.

To get started, creating an Assistant only requires specifying the `model` to use. But you can further customize the behavior of the Assistant:

1. Use the `instructions` parameter to guide the personality of the Assistant and define its goals. Instructions are similar to system messages in the Chat Completions API.
2. Use the `tools` parameter to give the Assistant access to up to 128 tools. You can give it access to OpenAI built-in tools like `code_interpreter` and `file_search`, or call third-party tools via `function` calling.
3. Use the `tool_resources` parameter to give tools like `code_interpreter` and `file_search` access to files. Files are uploaded using the `File` [upload endpoint](https://developers.openai.com/api/docs/api-reference/files/create) and must have the `purpose` set to `assistants` to be used with this API.

For example, to create an Assistant that can create data visualizations based on a `.csv` file, first upload a file.

```python
file = client.files.create(
  file=open("revenue-forecast.csv", "rb"),
  purpose='assistants'
)
```

```javascript
const file = await openai.files.create({
  file: fs.createReadStream("revenue-forecast.csv"),
  purpose: "assistants",
});
```

```bash
curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose="assistants" \
  -F file="@revenue-forecast.csv"
```

Then, create the Assistant with the `code_interpreter` tool enabled and provide the file as a resource to the tool.

~~24export const snippetAssistantCreation = {~~46```python

~~25 python: `~~

26assistant = client.beta.assistants.create(47assistant = client.beta.assistants.create(

27 name="Data visualizer",48 name="Data visualizer",

28 description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",49 description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",

34 }55 }

35 }56 }

36)57)

~~37 `.trim(),~~58```

~~38 "node.js": `~~59

60```javascript

39const assistant = await openai.beta.assistants.create({61const assistant = await openai.beta.assistants.create({

40 name: "Data visualizer",62 name: "Data visualizer",

41 description: "You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",63 description: "You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",

47 }69 }

48 }70 }

49});71});

~~50 `.trim(),~~72```

~~51 curl: `~~73

~~52curl https://api.openai.com/v1/assistants \\~~74```bash

~~53 -H "Authorization: Bearer $OPENAI_API_KEY" \\~~75curl https://api.openai.com/v1/assistants \

~~54 -H "Content-Type: application/json" \\~~76 -H "Authorization: Bearer $OPENAI_API_KEY" \

~~55 -H "OpenAI-Beta: assistants=v2" \\~~77 -H "Content-Type: application/json" \

78 -H "OpenAI-Beta: assistants=v2" \

56 -d '{79 -d '{

57 "name": "Data visualizer",80 "name": "Data visualizer",

58 "description": "You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",81 "description": "You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",

64 }87 }

65 }88 }

66 }'89 }'

~~67 `.trim(),~~90```

~~68};~~91

You can attach a maximum of 20 files to `code_interpreter` and 10,000 files to `file_search` (using `vector_store` [objects](https://developers.openai.com/api/docs/api-reference/vector-stores/object)). For vector stores created starting in November 2025, the `file_search` limit is 100,000,000 files.

Each file can be at most 512 MB in size and have a maximum of 5,000,000 tokens. By default, each project can store up to 2.5 TB of files total. There is no organization-wide storage limit. You can reach out to our support team to increase the per-project limit.
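The parameters and limits described above can be sketched as a request body. This is a minimal illustration, not the code from the snippets in this guide: `build_assistant_payload` is a hypothetical helper, the `tool_resources.code_interpreter.file_ids` layout is an assumption about the request shape, and the model name and file ID are placeholders. The sketch only builds and checks a dictionary; it does not call the API.

```python
MAX_TOOLS = 128                   # tool limit stated in this guide
MAX_CODE_INTERPRETER_FILES = 20   # code_interpreter file limit stated in this guide


def build_assistant_payload(model, file_ids, instructions=None, tools=None):
    """Build a request body for creating an Assistant (hypothetical helper)."""
    # Default to the code_interpreter tool used in the example above.
    tools = tools if tools is not None else [{"type": "code_interpreter"}]
    if len(tools) > MAX_TOOLS:
        raise ValueError(f"at most {MAX_TOOLS} tools are allowed, got {len(tools)}")
    if len(file_ids) > MAX_CODE_INTERPRETER_FILES:
        raise ValueError(
            f"at most {MAX_CODE_INTERPRETER_FILES} code_interpreter files "
            f"are allowed, got {len(file_ids)}"
        )
    payload = {
        "model": model,
        "tools": tools,
        # Assumed layout: files are attached per tool under tool_resources.
        "tool_resources": {"code_interpreter": {"file_ids": list(file_ids)}},
    }
    if instructions is not None:
        payload["instructions"] = instructions
    return payload


# "file-abc123" is a placeholder for the ID returned by the file upload.
payload = build_assistant_payload("gpt-4o", ["file-abc123"])
```

Checking the limits client-side is a design choice for faster feedback; the API remains the source of truth for what it accepts.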

## Managing Threads and Messages

Threads and Messages represent a conversation session between an Assistant and a user. There is a limit of 100,000 Messages per Thread. Once the size of the Messages exceeds the context window of the model, the Thread will attempt to smartly truncate messages, before fully dropping the ones it considers the least important.

You can create a Thread with an initial list of Messages like this:

69 102

~~70export const snippetThreadCreation = {~~103```python

~~71 python: `~~

72thread = client.beta.threads.create(104thread = client.beta.threads.create(

73 messages=[105 messages=[

74 {106 {

83 }115 }

84 ]116 ]

85)117)

~~86 `.trim(),~~118```

~~87 "node.js": `~~119

120```javascript

88const thread = await openai.beta.threads.create({121const thread = await openai.beta.threads.create({

89 messages: [122 messages: [

90 {123 {

99 }132 }

100 ]133 ]

101});134});

102 `.trim(),135```

103 curl: `136

104curl https://api.openai.com/v1/threads \\137```bash

105 -H "Authorization: Bearer $OPENAI_API_KEY" \\138curl https://api.openai.com/v1/threads \

106 -H "Content-Type: application/json" \\139 -H "Authorization: Bearer $OPENAI_API_KEY" \

107 -H "OpenAI-Beta: assistants=v2" \\140 -H "Content-Type: application/json" \

141 -H "OpenAI-Beta: assistants=v2" \

108 -d '{142 -d '{

109 "messages": [143 "messages": [

110 {144 {

119 }153 }

120 ]154 ]

121 }'155 }'

122 `.trim(),156```

Messages can contain text, images, or file attachments. Message `attachments` are helper methods that add files to a thread's `tool_resources`. You can also choose to add files to the `thread.tool_resources` directly.

### Creating image input content

Message content can contain either external image URLs or File IDs uploaded via the [File API](https://developers.openai.com/api/docs/api-reference/files/create). Only [models](https://developers.openai.com/api/docs/models) with Vision support can accept image input. Supported image content types include png, jpg, gif, and webp. When creating image files, pass `purpose="vision"` to allow you to later download and display the input content. Projects are limited to 2.5 TB total file storage, and there is no organization-wide storage limit. Please contact us to request a limit increase.

Tools cannot access image content unless specified. To pass image files to Code Interpreter, add the file ID in the message `attachments` list to allow the tool to read and analyze the input. Image URLs cannot be downloaded in Code Interpreter today.

124 166

125export const snippetImageCreation = {167```python

126 python: `

127file = client.files.create(168file = client.files.create(

128 file=open("myimage.png", "rb"),169 file=open("myimage.png", "rb"),

129 purpose="vision"170 purpose="vision"

149 }190 }

150 ]191 ]

151)192)

152 `.trim(),193```

153 "node.js": `

154 194

195```javascript

196import fs from "fs";

155const file = await openai.files.create({197const file = await openai.files.create({

156 file: fs.createReadStream("myimage.png"),198 file: fs.createReadStream("myimage.png"),

157 purpose: "vision",199 purpose: "vision",

177 }219 }

178 ]220 ]

179});221});

180 `.trim(),222```

181 curl: `223

224```bash

# Upload a file with a "vision" purpose
curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose="vision" \
  -F file="@/path/to/myimage.png"

# Pass the file ID in the content

189 232

190curl https://api.openai.com/v1/threads \\233curl https://api.openai.com/v1/threads \

191-H "Authorization: Bearer $OPENAI_API_KEY" \\234-H "Authorization: Bearer $OPENAI_API_KEY" \

192-H "Content-Type: application/json" \\235-H "Content-Type: application/json" \

193-H "OpenAI-Beta: assistants=v2" \\236-H "OpenAI-Beta: assistants=v2" \

194-d '{237-d '{

195"messages": [238"messages": [

196{239{

212}255}

213]256]

214}'257}'

215`.trim(),258```

#### Low or high fidelity image understanding

The `detail` parameter accepts `low`, `high`, or `auto` and controls how the model processes the image and generates its textual understanding.

- `low` enables "low res" mode. The model receives a low-res 512px x 512px version of the image and represents it with a budget of 85 tokens. This allows the API to return faster responses and consume fewer input tokens for use cases that do not require high detail.
- `high` enables "high res" mode, which first lets the model see the low-res image and then creates detailed crops of the input image based on its size. Use the [pricing calculator](https://openai.com/api/pricing/) to see token counts for various image sizes.

268```python

220thread = client.beta.threads.create(269thread = client.beta.threads.create(

221 messages=[270 messages=[

222 {271 {

237 }286 }

238 ]287 ]

239)288)

240 `.trim(),289```

241 "node.js": `290

291```javascript

242const thread = await openai.beta.threads.create({292const thread = await openai.beta.threads.create({

243 messages: [293 messages: [

244 {294 {

259 }309 }

260 ]310 ]

261});311});

262 `.trim(),312```

263 curl: `313

264curl https://api.openai.com/v1/threads \\314```bash

265 -H "Authorization: Bearer $OPENAI_API_KEY" \\315curl https://api.openai.com/v1/threads \

266 -H "Content-Type: application/json" \\316 -H "Authorization: Bearer $OPENAI_API_KEY" \

267 -H "OpenAI-Beta: assistants=v2" \\317 -H "Content-Type: application/json" \

318 -H "OpenAI-Beta: assistants=v2" \

268 -d '{319 -d '{

269 "messages": [320 "messages": [

270 {321 {

285 }336 }

286 ]337 ]

287 }'338 }'

288 `.trim(),339```
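As a compact illustration of the `detail` parameter, the sketch below builds message content parts for an image URL and an uploaded image file. The helper names are hypothetical, and the `image_url` / `image_file` field layout is an assumption about the request shape; the URL and file ID are placeholders. Nothing here calls the API.

```python
VALID_DETAILS = {"low", "high", "auto"}  # the three options described above


def _check_detail(detail):
    if detail not in VALID_DETAILS:
        raise ValueError(f"detail must be one of {sorted(VALID_DETAILS)}, got {detail!r}")


def image_url_part(url, detail="auto"):
    """Content part for an external image URL (hypothetical helper)."""
    _check_detail(detail)
    return {"type": "image_url", "image_url": {"url": url, "detail": detail}}


def image_file_part(file_id, detail="auto"):
    """Content part for an image uploaded with purpose="vision" (hypothetical helper)."""
    _check_detail(detail)
    return {"type": "image_file", "image_file": {"file_id": file_id, "detail": detail}}


message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is the difference between these images?"},
        image_url_part("https://example.com/image.png", detail="high"),
        image_file_part("file-abc123", detail="low"),  # placeholder file ID
    ],
}
```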

### Context window management

The Assistants API automatically manages the truncation to ensure it stays within the model's maximum context length. You can customize this behavior by specifying the maximum tokens you'd like a run to utilize and/or the maximum number of recent messages you'd like to include in a run.

#### Max Completion and Max Prompt Tokens

To control the token usage in a single Run, set `max_prompt_tokens` and `max_completion_tokens` when creating the Run. These limits apply to the total number of tokens used in all completions throughout the Run's lifecycle.

For example, initiating a Run with `max_prompt_tokens` set to 500 and `max_completion_tokens` set to 1000 means the first completion will truncate the thread to 500 tokens and cap the output at 1000 tokens. If only 200 prompt tokens and 300 completion tokens are used in the first completion, the second completion will have available limits of 300 prompt tokens and 700 completion tokens.

If a completion reaches the `max_completion_tokens` limit, the Run will terminate with a status of `incomplete`, and details will be provided in the `incomplete_details` field of the Run object.
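The budget arithmetic in the example above can be sketched as local bookkeeping. This models the description in this section only; `RunBudget` is a hypothetical class, not part of any SDK, and it does not reflect how the API implements the limits internally.

```python
from dataclasses import dataclass


@dataclass
class RunBudget:
    """Tracks token limits shared by all completions in one Run (illustrative)."""
    max_prompt_tokens: int
    max_completion_tokens: int
    prompt_used: int = 0
    completion_used: int = 0

    def remaining(self):
        # Limits available to the next completion in the same Run.
        return (
            self.max_prompt_tokens - self.prompt_used,
            self.max_completion_tokens - self.completion_used,
        )

    def record(self, prompt_tokens, completion_tokens):
        # Add one completion's usage and report the resulting Run status.
        self.prompt_used += prompt_tokens
        self.completion_used += completion_tokens
        if self.completion_used >= self.max_completion_tokens:
            return "incomplete"
        return "in_progress"


budget = RunBudget(max_prompt_tokens=500, max_completion_tokens=1000)
status = budget.record(prompt_tokens=200, completion_tokens=300)
left = budget.remaining()  # (300, 700), matching the example above
```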

When using the File Search tool, we recommend setting `max_prompt_tokens` to no less than 20,000. For longer conversations or multiple interactions with File Search, consider increasing this limit to 50,000, or ideally, removing the `max_prompt_tokens` limit altogether to get the highest quality results.

#### Truncation Strategy

You may also specify a truncation strategy to control how your thread should be rendered into the model's context window. Using a truncation strategy of type `auto` will use OpenAI's default truncation strategy. Using a truncation strategy of type `last_messages` will allow you to specify the number of the most recent messages to include in the context window.
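A minimal sketch of how these Run-level settings might be assembled follows. `build_run_limits` is a hypothetical helper, and the `truncation_strategy` field names (`type`, `last_messages`) are assumptions based on the strategy types named above. The function only returns a dictionary of parameters.

```python
def build_run_limits(max_prompt_tokens=None, max_completion_tokens=None, last_messages=None):
    """Collect optional token limits and a truncation strategy (hypothetical helper)."""
    params = {}
    if max_prompt_tokens is not None:
        params["max_prompt_tokens"] = max_prompt_tokens
    if max_completion_tokens is not None:
        params["max_completion_tokens"] = max_completion_tokens
    if last_messages is None:
        # Fall back to the default strategy.
        params["truncation_strategy"] = {"type": "auto"}
    else:
        if last_messages < 1:
            raise ValueError("last_messages must be at least 1")
        # Keep only the N most recent messages in the context window.
        params["truncation_strategy"] = {
            "type": "last_messages",
            "last_messages": last_messages,
        }
    return params


# File Search guidance above: at least 20,000 prompt tokens.
limits = build_run_limits(max_prompt_tokens=20000, last_messages=10)
```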

### Message annotations

Messages created by Assistants may contain [`annotations`](https://developers.openai.com/api/docs/api-reference/messages/object#messages/object-content) within the `content` array of the object. Annotations provide information about how you should annotate the text in the Message.

There are two types of Annotations:

1. `file_citation`: File citations are created by the [`file_search`](https://developers.openai.com/api/docs/assistants/tools/file-search) tool and define references to a specific file that was uploaded and used by the Assistant to generate the response.
2. `file_path`: File path annotations are created by the [`code_interpreter`](https://developers.openai.com/api/docs/assistants/tools/code-interpreter) tool and contain references to the files generated by the tool.

When annotations are present in the Message object, you'll see illegible model-generated substrings in the text that you should replace with the annotations. These strings may look something like `【13†source】` or `sandbox:/mnt/data/file.csv`. Here's an example Python code snippet that replaces these strings with the annotations.

375```python

293# Retrieve the message object376# Retrieve the message object

294message = client.beta.threads.messages.retrieve(377message = client.beta.threads.messages.retrieve(

295 thread_id="...",378 thread_id="...",

318 401

319# Add footnotes to the end of the message before displaying to user402# Add footnotes to the end of the message before displaying to user

320 403

321message_content.value += '\\n' + '\\n'.join(citations)404message_content.value += '\n' + '\n'.join(citations)

322`.trim(),405```
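For experimenting without an API call, the same replacement idea can be sketched on plain data. This is an illustrative variant, not the snippet above: the `Annotation` class, the `apply_annotations` function, and the footnote wording are hypothetical, and the file names are passed in as a lookup table instead of being retrieved from the Files API.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    text: str     # marker substring found in the message text
    file_id: str  # file the annotation refers to
    kind: str     # "file_citation" or "file_path"


def apply_annotations(text, annotations, filenames):
    """Replace each marker with a numbered reference and append footnotes."""
    footnotes = []
    for index, annotation in enumerate(annotations):
        text = text.replace(annotation.text, f" [{index}]")
        name = filenames.get(annotation.file_id, annotation.file_id)
        if annotation.kind == "file_citation":
            footnotes.append(f"[{index}] Cited from {name}")
        elif annotation.kind == "file_path":
            footnotes.append(f"[{index}] Download {name}")
    if footnotes:
        text += "\n" + "\n".join(footnotes)
    return text


rendered = apply_annotations(
    "Revenue grew 12%【13†source】.",
    [Annotation("【13†source】", "file-abc123", "file_citation")],  # placeholder ID
    {"file-abc123": "report.pdf"},
)
```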

## Runs and Run Steps

When you have all the context you need from your user in the Thread, you can run the Thread with an Assistant of your choice.

```python
run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)
```

```javascript
const run = await openai.beta.threads.runs.create(
  thread.id,
  { assistant_id: assistant.id }
);
```

```bash
curl https://api.openai.com/v1/threads/THREAD_ID/runs \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "assistant_id": "asst_ToSF7Gb04YMj8AMMm50ZLLtY"
  }'
```

By default, a Run will use the `model` and `tools` configuration specified in the Assistant object, but you can override most of these when creating the Run for added flexibility:

439```python

351run = client.beta.threads.runs.create(440run = client.beta.threads.runs.create(

352 thread_id=thread.id,441 thread_id=thread.id,

353 assistant_id=assistant.id,442 assistant_id=assistant.id,

355 instructions="New instructions that override the Assistant instructions",444 instructions="New instructions that override the Assistant instructions",

356 tools=[{"type": "code_interpreter"}, {"type": "file_search"}]445 tools=[{"type": "code_interpreter"}, {"type": "file_search"}]

357)446)

358 `.trim(),447```

359 "node.js": `448

449```javascript

360const run = await openai.beta.threads.runs.create(450const run = await openai.beta.threads.runs.create(

361 thread.id,451 thread.id,

362 {452 {

366 tools: [{"type": "code_interpreter"}, {"type": "file_search"}]456 tools: [{"type": "code_interpreter"}, {"type": "file_search"}]

367 }457 }

368);458);

369 `.trim(),459```

```bash
curl https://api.openai.com/v1/threads/THREAD_ID/runs \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "assistant_id": "ASSISTANT_ID",
    "model": "gpt-4o",
    "instructions": "New instructions that override the Assistant instructions",
    "tools": [{"type": "code_interpreter"}, {"type": "file_search"}]
  }'
```


Note: `tool_resources` associated with the Assistant cannot be overridden during Run creation. You must use the [modify Assistant](https://developers.openai.com/api/docs/api-reference/assistants/modifyAssistant) endpoint to do this.
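The override rule in the note can be sketched as a small guard that runs before a request is sent. `build_run_payload` is a hypothetical helper, and `OVERRIDABLE` lists only the fields shown in the override example (`model`, `instructions`, `tools`); the API may accept others.

```python
# Fields shown in the override example; an assumption, not an exhaustive list.
OVERRIDABLE = {"model", "instructions", "tools"}


def build_run_payload(assistant_id, **overrides):
    """Build a Run creation body, rejecting overrides the note rules out."""
    if "tool_resources" in overrides:
        raise ValueError(
            "tool_resources cannot be overridden during Run creation; "
            "modify the Assistant instead"
        )
    unknown = set(overrides) - OVERRIDABLE
    if unknown:
        raise ValueError(f"unsupported overrides in this sketch: {sorted(unknown)}")
    return {"assistant_id": assistant_id, **overrides}


run_payload = build_run_payload(
    "ASSISTANT_ID",  # placeholder, as in the curl example
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}, {"type": "file_search"}],
)
```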

assistants/migration.md +1 −1


322 input=[{ "role": "user", "content": message.content }]323 input=[{ "role": "user", "content": message.content }]

323 )324 )

324 325

325 return { "content": response.output_text }'326 return { "content": response.output_text }

326```327```

327 328

328 329

assistants/tools/code-interpreter.md +134 −117


# Assistants Code Interpreter

## Overview

Code Interpreter allows Assistants to write and run Python code in a sandboxed execution environment. This tool can process files with diverse data and formatting, and generate files with data and images of graphs. Code Interpreter allows your Assistant to run code iteratively to solve challenging code and math problems. When your Assistant writes code that fails to run, it can iterate on this code by attempting to run different code until the code execution succeeds.

See the quickstart for getting started with Code Interpreter [here](https://developers.openai.com/api/docs/assistants/overview#step-1-create-an-assistant?context=with-streaming).

## How it works

Code Interpreter is charged at $0.03 per session. If your Assistant calls Code Interpreter simultaneously in two different threads (e.g., one thread per end-user), two Code Interpreter sessions are created. Each session is active by default for one hour, which means that you only pay for one session if users interact with Code Interpreter in the same thread for up to one hour.
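A rough way to reason about this pricing is to count sessions from a log of Code Interpreter calls. The sketch below is an estimate under stated assumptions, not billing logic: `count_sessions` is a hypothetical helper, and it assumes a session's one-hour window starts at the first call in that session, which this page does not specify.

```python
SESSION_PRICE_CENTS = 3   # $0.03 per session, as stated above
SESSION_SECONDS = 3600    # default session lifetime of one hour


def count_sessions(calls):
    """Count sessions from (thread_id, timestamp_in_seconds) pairs.

    Assumption: a new session starts when a thread has no session yet,
    or when its current session began an hour or more ago.
    """
    session_start = {}
    sessions = 0
    for thread_id, timestamp in sorted(calls, key=lambda call: call[1]):
        start = session_start.get(thread_id)
        if start is None or timestamp - start >= SESSION_SECONDS:
            session_start[thread_id] = timestamp
            sessions += 1
    return sessions


calls = [("thread_a", 0), ("thread_b", 600), ("thread_a", 1800), ("thread_a", 4000)]
sessions = count_sessions(calls)            # thread_a twice, thread_b once
cost_cents = sessions * SESSION_PRICE_CENTS
```

Keeping the price in integer cents avoids floating-point rounding in the estimate.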

### Enabling Code Interpreter

Pass `code_interpreter` in the `tools` parameter of the Assistant object to enable Code Interpreter:

```python
assistant = client.beta.assistants.create(
  instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",
  model="gpt-4o",
  tools=[{"type": "code_interpreter"}]
)
```

```javascript
const assistant = await openai.beta.assistants.create({
  instructions: "You are a personal math tutor. When asked a math question, write and run code to answer the question.",
  model: "gpt-4o",
  tools: [{"type": "code_interpreter"}]
});
```

~~18 curl: `~~32

~~19curl https://api.openai.com/v1/assistants \\~~33```bash

~~20 -u :$OPENAI_API_KEY \\~~34curl https://api.openai.com/v1/assistants \

~~21 -H 'Content-Type: application/json' \\~~35 -u :$OPENAI_API_KEY \

~~22 -H 'OpenAI-Beta: assistants=v2' \\~~36 -H 'Content-Type: application/json' \

37 -H 'OpenAI-Beta: assistants=v2' \

23 -d '{38 -d '{

24 "instructions": "You are a personal math tutor. When asked a math question, write and run code to answer the question.",39 "instructions": "You are a personal math tutor. When asked a math question, write and run code to answer the question.",

25 "tools": [40 "tools": [

27 ],42 ],

28 "model": "gpt-4o"43 "model": "gpt-4o"

29 }'44 }'

~~30 `.trim(),~~45```

The model then decides when to invoke Code Interpreter in a Run based on the nature of the user request. This behavior can be promoted by prompting in the Assistant's `instructions` (e.g., “write code to solve this problem”).

### Passing files to Code Interpreter

Files that are passed at the Assistant level are accessible by all Runs with this Assistant:

54```python

35# Upload a file with an "assistants" purpose55# Upload a file with an "assistants" purpose

36file = client.files.create(56file = client.files.create(

37 file=open("mydata.csv", "rb"),57 file=open("mydata.csv", "rb"),

38 purpose='assistants'58 purpose='assistants'

~~39)\n~~59)

40# Create an assistant using the file ID61# Create an assistant using the file ID

41assistant = client.beta.assistants.create(62assistant = client.beta.assistants.create(

42 instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",63 instructions="You are a personal math tutor. When asked a math question, write and run code to answer the question.",

48 }69 }

49 }70 }

50)71)

~~51 `.trim(),~~72```

~~52 "node.js": `~~73

74```javascript

53// Upload a file with an "assistants" purpose75// Upload a file with an "assistants" purpose

54const file = await openai.files.create({76const file = await openai.files.create({

55 file: fs.createReadStream("mydata.csv"),77 file: fs.createReadStream("mydata.csv"),

56 purpose: "assistants",78 purpose: "assistants",

~~57});\n~~79});

58// Create an assistant using the file ID81// Create an assistant using the file ID

59const assistant = await openai.beta.assistants.create({82const assistant = await openai.beta.assistants.create({

60 instructions: "You are a personal math tutor. When asked a math question, write and run code to answer the question.",83 instructions: "You are a personal math tutor. When asked a math question, write and run code to answer the question.",

66 }89 }

67 }90 }

68});91});

~~69 `.trim(),~~92```

~~70 curl: `~~93

94```bash

71# Upload a file with an "assistants" purpose95# Upload a file with an "assistants" purpose

~~72curl https://api.openai.com/v1/files \\~~96curl https://api.openai.com/v1/files \

~~73 -H "Authorization: Bearer $OPENAI_API_KEY" \\~~97 -H "Authorization: Bearer $OPENAI_API_KEY" \

~~74 -F purpose="assistants" \\~~98 -F purpose="assistants" \

~~75 -F file="@/path/to/mydata.csv"\n~~99 -F file="@/path/to/mydata.csv"

100

76# Create an assistant using the file ID101# Create an assistant using the file ID

~~77curl https://api.openai.com/v1/assistants \\~~102curl https://api.openai.com/v1/assistants \

~~78 -u :$OPENAI_API_KEY \\~~103 -u :$OPENAI_API_KEY \

~~79 -H 'Content-Type: application/json' \\~~104 -H 'Content-Type: application/json' \

~~80 -H 'OpenAI-Beta: assistants=v2' \\~~105 -H 'OpenAI-Beta: assistants=v2' \

81 -d '{106 -d '{

82 "instructions": "You are a personal math tutor. When asked a math question, write and run code to answer the question.",107 "instructions": "You are a personal math tutor. When asked a math question, write and run code to answer the question.",

83 "tools": [{"type": "code_interpreter"}],108 "tools": [{"type": "code_interpreter"}],

88 }113 }

89 }114 }

90 }'115 }'

~~91 `.trim(),~~116```

~~92};~~117

93 118

~~94export const snippetPassingFilesThread = {~~119Files can also be passed at the Thread level. These files are only accessible in the specific Thread. Upload the File using the [File upload](https://developers.openai.com/api/docs/api-reference/files/create) endpoint and then pass the File ID as part of the Message creation request:

~~95 python: `~~120

121```python

96thread = client.beta.threads.create(122thread = client.beta.threads.create(

97 messages=[123 messages=[

98 {124 {

99 "role": "user",125 "role": "user",

100 "content": "I need to solve the equation \`3x + 11 = 14\`. Can you help me?",126 "content": "I need to solve the equation `3x + 11 = 14`. Can you help me?",

101 "attachments": [127 "attachments": [

102 {128 {

103 "file_id": file.id,129 "file_id": file.id,

107 }133 }

108 ]134 ]

109)135)

110 `.trim(),136```

111 "node.js": `137

138```javascript

112const thread = await openai.beta.threads.create({139const thread = await openai.beta.threads.create({

113 messages: [140 messages: [

114 {141 {

115 "role": "user",142 "role": "user",

116 "content": "I need to solve the equation \`3x + 11 = 14\`. Can you help me?",143 "content": "I need to solve the equation `3x + 11 = 14`. Can you help me?",

117 "attachments": [144 "attachments": [

118 {145 {

119 file_id: file.id,146 file_id: file.id,

123 }150 }

124 ]151 ]

125});152});

126 `.trim(),153```

127 curl: `154

128curl https://api.openai.com/v1/threads/thread_abc123/messages \\155```bash

129 -u :$OPENAI_API_KEY \\156curl https://api.openai.com/v1/threads/thread_abc123/messages \

130 -H 'Content-Type: application/json' \\157 -u :$OPENAI_API_KEY \

131 -H 'OpenAI-Beta: assistants=v2' \\158 -H 'Content-Type: application/json' \

159 -H 'OpenAI-Beta: assistants=v2' \

132 -d '{160 -d '{

133 "role": "user",161 "role": "user",

134 "content": "I need to solve the equation \`3x + 11 = 14\`. Can you help me?",162 "content": "I need to solve the equation `3x + 11 = 14`. Can you help me?",

135 "attachments": [163 "attachments": [

136 {164 {

137 "file_id": "file-ACq8OjcLQm2eIG0BvRM4z5qX",165 "file_id": "file-ACq8OjcLQm2eIG0BvRM4z5qX",

139 }167 }

140 ]168 ]

141 }'169 }'

142 `.trim(),170```

143};

~~144~~

145export const snippetReadingImages = {

146 python: `

147from openai import OpenAI\n

148client = OpenAI()\n

149image_data = client.files.content("file-abc123")

150image_data_bytes = image_data.read()\n

151with open("./my-image.png", "wb") as file:

152 file.write(image_data_bytes)

153 `.trim(),

154 "node.js": `

~~155~~

156import OpenAI from "openai";\n

157const openai = new OpenAI();\n

158async function main() {

159 const response = await openai.files.content("file-abc123");\n

160 // Extract the binary data from the Response object

161 const image_data = await response.arrayBuffer();\n

162 // Convert the binary data to a Buffer

163 const image_data_buffer = Buffer.from(image_data);\n

164 // Save the image to a specific location

165 fs.writeFileSync("./my-image.png", image_data_buffer);

166}\n

167main();

168 `.trim(),

169 curl: `

170curl https://api.openai.com/v1/files/file-abc123/content \\

171 -H "Authorization: Bearer $OPENAI_API_KEY" \\

172 --output image.png

173 `.trim(),

174};

~~175~~

176export const snippetInputOutputLogs = {

177 python: `

178run_steps = client.beta.threads.runs.steps.list(

179 thread_id=thread.id,

180 run_id=run.id

181)

182 `.trim(),

183 "node.js": `

184const runSteps = await openai.beta.threads.runs.steps.list(

185 thread.id,

186 run.id

187);

188 `.trim(),

189 curl: `

190curl https://api.openai.com/v1/threads/thread_abc123/runs/RUN_ID/steps \\

191 -H "Authorization: Bearer $OPENAI_API_KEY" \\

192 -H "OpenAI-Beta: assistants=v2"

193 `.trim(),

194};

~~195~~

196## Overview

~~197~~

198Code Interpreter allows Assistants to write and run Python code in a sandboxed execution environment. This tool can process files with diverse data and formatting, and generate files with data and images of graphs. Code Interpreter allows your Assistant to run code iteratively to solve challenging code and math problems. When your Assistant writes code that fails to run, it can iterate on this code by attempting to run different code until the code execution succeeds.

~~199~~

200See a quickstart of how to get started with Code Interpreter [here](https://developers.openai.com/api/docs/assistants/overview#step-1-create-an-assistant?context=with-streaming).

~~201~~

202## How it works

~~203~~

204Code Interpreter is charged at $0.03 per session. If your Assistant calls Code Interpreter simultaneously in two different threads (e.g., one thread per end-user), two Code Interpreter sessions are created. Each session is active by default for one hour, which means that you only pay for one session if users interact with Code Interpreter in the same thread for up to one hour.
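As a rough illustration of the billing rule above, the following sketch counts sessions from a list of `(thread_id, timestamp)` calls. It assumes that a session starts at the first call in a thread and that any later call within one hour of that start reuses it; the function name `estimate_sessions` and this renewal behavior are assumptions made for illustration, not documented API behavior.

```python
from datetime import datetime, timedelta

SESSION_PRICE_USD = 0.03               # price per session, from the text above
SESSION_LIFETIME = timedelta(hours=1)  # default session lifetime


def estimate_sessions(calls):
    """Count Code Interpreter sessions for (thread_id, timestamp) calls.

    Assumption: a call reuses the thread's session if it arrives within one
    hour of that session's start; otherwise a new session begins.
    """
    session_start = {}  # thread_id -> start time of the current session
    sessions = 0
    for thread_id, ts in sorted(calls, key=lambda c: c[1]):
        start = session_start.get(thread_id)
        if start is None or ts - start >= SESSION_LIFETIME:
            session_start[thread_id] = ts
            sessions += 1
    return sessions


t0 = datetime(2024, 1, 1, 12, 0)
calls = [
    ("thread_a", t0),
    ("thread_a", t0 + timedelta(minutes=30)),  # same session
    ("thread_b", t0 + timedelta(minutes=5)),   # different thread, new session
    ("thread_a", t0 + timedelta(minutes=90)),  # first session expired
]
n = estimate_sessions(calls)
print(n, f"${n * SESSION_PRICE_USD:.2f}")
```

Under these assumptions, two users in separate threads always cost at least two sessions, while one user working for under an hour in a single thread costs one.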

~~205~~

206### Enabling Code Interpreter

~~207~~

208Pass `code_interpreter` in the `tools` parameter of the Assistant object to enable Code Interpreter:

~~209~~

210The model then decides when to invoke Code Interpreter in a Run based on the nature of the user request. This behavior can be promoted by prompting in the Assistant's `instructions` (e.g., “write code to solve this problem”).

~~211~~

212### Passing files to Code Interpreter

~~213~~

214Files that are passed at the Assistant level are accessible by all Runs with this Assistant:

215 171

216Files can also be passed at the Thread level. These files are only accessible in the specific Thread. Upload the File using the [File upload](https://developers.openai.com/api/docs/api-reference/files/create) endpoint and then pass the File ID as part of the Message creation request:

217 172

218Files have a maximum size of 512 MB. Code Interpreter supports a variety of file formats including `.csv`, `.pdf`, `.json` and many more. More details on the file extensions (and their corresponding MIME-types) supported can be found in the [Supported files](#supported-files) section below.173Files have a maximum size of 512 MB. Code Interpreter supports a variety of file formats including `.csv`, `.pdf`, `.json` and many more. More details on the file extensions (and their corresponding MIME-types) supported can be found in the [Supported files](#supported-files) section below.

219 174

247 202

248The file content can then be downloaded by passing the file ID to the Files API:203The file content can then be downloaded by passing the file ID to the Files API:

249 204

205```python

206from openai import OpenAI

207

208client = OpenAI()

209

210image_data = client.files.content("file-abc123")

211image_data_bytes = image_data.read()

212

213with open("./my-image.png", "wb") as file:

214 file.write(image_data_bytes)

215```

216

217```javascript

218import fs from "fs";

219import OpenAI from "openai";

220

221const openai = new OpenAI();

222

223async function main() {

224 const response = await openai.files.content("file-abc123");

225

226 // Extract the binary data from the Response object

227 const image_data = await response.arrayBuffer();

228

229 // Convert the binary data to a Buffer

230 const image_data_buffer = Buffer.from(image_data);

231

232 // Save the image to a specific location

233 fs.writeFileSync("./my-image.png", image_data_buffer);

234}

235

236main();

237```

238

239```bash

240curl https://api.openai.com/v1/files/file-abc123/content \

241 -H "Authorization: Bearer $OPENAI_API_KEY" \

242 --output image.png

243```

244

245

250When Code Interpreter references a file path (e.g., “Download this csv file”), file paths are listed as annotations. You can convert these annotations into links to download the file:246When Code Interpreter references a file path (e.g., “Download this csv file”), file paths are listed as annotations. You can convert these annotations into links to download the file:

251 247

252```json248```json

278 274

279By listing the steps of a Run that called Code Interpreter, you can inspect the code `input` and `outputs` logs of Code Interpreter:275By listing the steps of a Run that called Code Interpreter, you can inspect the code `input` and `outputs` logs of Code Interpreter:

280 276

277```python

278run_steps = client.beta.threads.runs.steps.list(

279 thread_id=thread.id,

280 run_id=run.id

281)

282```

283

284```javascript

285const runSteps = await openai.beta.threads.runs.steps.list(

286 thread.id,

287 run.id

288);

289```

290

291```bash

292curl https://api.openai.com/v1/threads/thread_abc123/runs/RUN_ID/steps \

293 -H "Authorization: Bearer $OPENAI_API_KEY" \

294 -H "OpenAI-Beta: assistants=v2"

295```

296

297

281```bash298```bash

282{299{

283 "object": "list",300 "object": "list",

assistants/tools/file-search.md +281 −279


1# Assistants File Search1# Assistants File Search

2 2

~~3export const snippetStep1 = {~~3## Overview

~~4 python: `~~4

5File Search augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. OpenAI automatically parses and chunks your documents, creates and stores the embeddings, and uses both vector and keyword search to retrieve relevant content to answer user queries.

7## Quickstart

9In this example, we’ll create an assistant that can help answer questions about companies’ financial statements.

11### Step 1: Create a new Assistant with File Search Enabled

13Create a new assistant with `file_search` enabled in the `tools` parameter of the Assistant.

15```python

5from openai import OpenAI16from openai import OpenAI

6 17

7client = OpenAI()18client = OpenAI()

12model="gpt-4o",23model="gpt-4o",

13tools=[{"type": "file_search"}],24tools=[{"type": "file_search"}],

14)25)

~~15`.trim(),~~26```

~~16 "node.js": `~~

17 27

28```javascript

29import OpenAI from "openai";

18const openai = new OpenAI();30const openai = new OpenAI();

19 31

20async function main() {32async function main() {

27}39}

28 40

29main();41main();

~~30`.trim(),~~42```

~~31 curl: `~~43

~~32curl https://api.openai.com/v1/assistants \\~~44```bash

~~33-H "Content-Type: application/json" \\~~45curl https://api.openai.com/v1/assistants \

~~34-H "Authorization: Bearer $OPENAI_API_KEY" \\~~46-H "Content-Type: application/json" \

~~35-H "OpenAI-Beta: assistants=v2" \\~~47-H "Authorization: Bearer $OPENAI_API_KEY" \

48-H "OpenAI-Beta: assistants=v2" \

36-d '{49-d '{

37"name": "Financial Analyst Assistant",50"name": "Financial Analyst Assistant",

38"instructions": "You are an expert financial analyst. Use your knowledge base to answer questions about audited financial statements.",51"instructions": "You are an expert financial analyst. Use your knowledge base to answer questions about audited financial statements.",

39"tools": [{"type": "file_search"}],52"tools": [{"type": "file_search"}],

40"model": "gpt-4o"53"model": "gpt-4o"

41}'54}'

~~42`.trim(),~~55```

~~43};~~

44 56

~~45export const snippetStep2 = {~~57

~~46 python: `~~58Once the `file_search` tool is enabled, the model decides when to retrieve content based on user messages.

60### Step 2: Upload files and add them to a Vector Store

62To access your files, the `file_search` tool uses the Vector Store object.

63Upload your files and create a Vector Store to contain them.

64Once the Vector Store is created, you should poll its status until all files are out of the `in_progress` state to

65ensure that all content has finished processing. The SDK provides helpers for uploading and polling in one shot.

67```python

47# Create a vector store called "Financial Statements"68# Create a vector store called "Financial Statements"

48vector_store = client.vector_stores.create(name="Financial Statements")69vector_store = client.vector_stores.create(name="Financial Statements")

49 70

64 85

65print(file_batch.status)86print(file_batch.status)

66print(file_batch.file_counts)87print(file_batch.file_counts)

~~67`.trim(),~~88```

~~68 "node.js": `~~89

90```javascript

69const fileStreams = ["edgar/goog-10k.pdf", "edgar/brka-10k.txt"].map((path) =>91const fileStreams = ["edgar/goog-10k.pdf", "edgar/brka-10k.txt"].map((path) =>

70fs.createReadStream(path),92fs.createReadStream(path),

71);93);

76});98});

77 99

78await openai.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, fileStreams)100await openai.vectorStores.fileBatches.uploadAndPoll(vectorStore.id, fileStreams)

~~79`.trim(),~~101```

~~80};~~102

81 103

~~82export const snippetStep3 = {~~104### Step 3: Update the assistant to use the new Vector Store

~~83 python: `~~105

106To make the files accessible to your assistant, update the assistant’s `tool_resources` with the new `vector_store` id.

107

108```python

84assistant = client.beta.assistants.update(109assistant = client.beta.assistants.update(

85 assistant_id=assistant.id,110 assistant_id=assistant.id,

86 tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},111 tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},

87)112)

~~88 `.trim(),~~113```

~~89 "node.js": `~~114

115```javascript

90await openai.beta.assistants.update(assistant.id, {116await openai.beta.assistants.update(assistant.id, {

91 tool_resources: { file_search: { vector_store_ids: [vectorStore.id] } },117 tool_resources: { file_search: { vector_store_ids: [vectorStore.id] } },

92});118});

~~93 `.trim(),~~119```

~~94};~~120

121

122### Step 4: Create a thread

123

124You can also attach files as Message attachments on your thread. Doing so will create another `vector_store` associated with the thread, or, if there is already a vector store attached to this thread, attach the new files to the existing thread vector store. When you create a Run on this thread, the file search tool will query both the `vector_store` from your assistant and the `vector_store` on the thread.

125

126In this example, the user attached a copy of Apple’s latest 10-K filing.

95 127

~~96export const snippetStep4 = {~~128```python

~~97 python: `~~

98# Upload the user provided file to OpenAI129# Upload the user provided file to OpenAI

99message_file = client.files.create(130message_file = client.files.create(

100 file=open("edgar/aapl-10k.pdf", "rb"), purpose="assistants"131 file=open("edgar/aapl-10k.pdf", "rb"), purpose="assistants"

117# The thread now has a vector store with that file in its tool resources.148# The thread now has a vector store with that file in its tool resources.

118 149

119print(thread.tool_resources.file_search)150print(thread.tool_resources.file_search)

120`.trim(),151```

121 "node.js": `152

153```javascript

122// A user wants to attach a file to a specific message, let's upload it.154// A user wants to attach a file to a specific message, let's upload it.

123const aapl10k = await openai.files.create({155const aapl10k = await openai.files.create({

124file: fs.createReadStream("edgar/aapl-10k.pdf"),156file: fs.createReadStream("edgar/aapl-10k.pdf"),

139 171

140// The thread now has a vector store in its tool resources.172// The thread now has a vector store in its tool resources.

141console.log(thread.tool_resources?.file_search);173console.log(thread.tool_resources?.file_search);

142`.trim(),174```

143};175

176

177Vector stores created using message attachments have a default expiration policy of 7 days after they were last active (defined as the last time the vector store was part of a run). This default exists to help you manage your vector storage costs. You can override these expiration policies at any time. Learn more [here](#managing-costs-with-expiration-policies).
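To make the default policy concrete, here is a small sketch that computes when such a vector store would expire. It assumes the policy is expressed as an `expires_after` object with an `anchor` of `last_active_at` and a `days` count; the helper name `expires_at` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumed shape of the default policy for thread-attachment vector stores.
DEFAULT_EXPIRES_AFTER = {"anchor": "last_active_at", "days": 7}


def expires_at(last_active_at, policy=None):
    """Return the expiry time implied by a last-active timestamp and a policy."""
    policy = policy or DEFAULT_EXPIRES_AFTER
    return last_active_at + timedelta(days=policy["days"])


last_run = datetime(2024, 3, 1, 9, 30, tzinfo=timezone.utc)
print(expires_at(last_run).isoformat())
```

Because the anchor is the last activity rather than the creation time, each new run on the thread pushes the expiry out by another seven days.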

178

179### Step 5: Create a run and check the output

144 180

145export const snippetStep5WithStreaming = {181Now, create a Run and observe that the model uses the File Search tool to provide a response to the user’s question.

146 python: `182

183

184

185<div data-content-switcher-pane data-value="streaming">

186 <div class="hidden">With streaming</div>

187 ```python

147from typing_extensions import override188from typing_extensions import override

148from openai import AssistantEventHandler, OpenAI189from openai import AssistantEventHandler, OpenAI

149 190

152class EventHandler(AssistantEventHandler):193class EventHandler(AssistantEventHandler):

153@override194@override

154def on_text_created(self, text) -> None:195def on_text_created(self, text) -> None:

155print(f"\\nassistant > ", end="", flush=True)196print(f"\nassistant > ", end="", flush=True)

156 197

157 @override198 @override

158 def on_tool_call_created(self, tool_call):199 def on_tool_call_created(self, tool_call):

159 print(f"\\nassistant > {tool_call.type}\\n", flush=True)200 print(f"\nassistant > {tool_call.type}\n", flush=True)

160 201

161 @override202 @override

162 def on_message_done(self, message) -> None:203 def on_message_done(self, message) -> None:

173 citations.append(f"[{index}] {cited_file.filename}")214 citations.append(f"[{index}] {cited_file.filename}")

174 215

175 print(message_content.value)216 print(message_content.value)

176 print("\\n".join(citations))217 print("\n".join(citations))

177 218

178# Then, we use the stream SDK helper219# Then, we use the stream SDK helper

179 220

188event_handler=EventHandler(),229event_handler=EventHandler(),

189) as stream:230) as stream:

190stream.until_done()231stream.until_done()

191`.trim(),232```

192 "node.js": `233

234```javascript

193const stream = openai.beta.threads.runs235const stream = openai.beta.threads.runs

194.stream(thread.id, {236.stream(thread.id, {

195assistant_id: assistant.id,237assistant_id: assistant.id,

214 }256 }

215 257

216 console.log(text.value);258 console.log(text.value);

217 console.log(citations.join("\\n"));259 console.log(citations.join("\n"));

218 }260 }

261```

219 262

220`.trim(),263 </div>

221};264 <div data-content-switcher-pane data-value="without-streaming" hidden>

~~222~~ 265 <div class="hidden">Without streaming</div>

223export const snippetStep5WithoutStreaming = {266 ```python

224 python: `

225# Use the create and poll SDK helper to create a run and poll the status of267# Use the create and poll SDK helper to create a run and poll the status of

226# the run until it's in a terminal state.268# the run until it's in a terminal state.

227 269

241citations.append(f"[{index}] {cited_file.filename}")283citations.append(f"[{index}] {cited_file.filename}")

242 284

243print(message_content.value)285print(message_content.value)

244print("\\n".join(citations))286print("\n".join(citations))

245`.trim(),287```

246 "node.js": `288

289```javascript

247const run = await openai.beta.threads.runs.createAndPoll(thread.id, {290const run = await openai.beta.threads.runs.createAndPoll(thread.id, {

248assistant_id: assistant.id,291assistant_id: assistant.id,

249});292});

270}313}

271 314

272console.log(text.value);315console.log(text.value);

273console.log(citations.join("\\n"));316console.log(citations.join("\n"));

274}317}

275`.trim(),318```

276};319

320 </div>

321

322

323

324Your new assistant will query both attached vector stores (one containing `goog-10k.pdf` and `brka-10k.txt`, and the other containing `aapl-10k.pdf`) and return this result from `aapl-10k.pdf`.

325

326To retrieve the contents of the file search results that were used by the model, use the `include` query parameter and provide a value of `step_details.tool_calls[*].file_search.results[*].content` in the format `?include[]=step_details.tool_calls[*].file_search.results[*].content`.
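The query string described above can be built with the standard library. This sketch assumes the run step retrieval path `/threads/{thread_id}/runs/{run_id}/steps/{step_id}`; the IDs are placeholders and `run_step_url` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

INCLUDE_CONTENT = "step_details.tool_calls[*].file_search.results[*].content"


def run_step_url(thread_id, run_id, step_id, base="https://api.openai.com/v1"):
    """Build a run step URL that asks for file search result content."""
    # urlencode percent-encodes the brackets and asterisks in the value.
    query = urlencode({"include[]": INCLUDE_CONTENT})
    return f"{base}/threads/{thread_id}/runs/{run_id}/steps/{step_id}?{query}"


url = run_step_url("thread_abc123", "run_abc123", "step_abc123")
print(url)

# Decoding the query string recovers the original include value.
print(parse_qs(urlsplit(url).query)["include[]"])
```

The request itself still needs the `Authorization` and `OpenAI-Beta: assistants=v2` headers used in the other examples.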

327

328---

329

330## How it works

331

332The `file_search` tool implements several retrieval best practices out of the box to help you extract the right data from your files and augment the model’s responses. The `file_search` tool:

333

334- Rewrites user queries to optimize them for search.

335- Breaks down complex user queries into multiple searches it can run in parallel.

336- Runs both keyword and semantic searches across both assistant and thread vector stores.

337- Reranks search results to pick the most relevant ones before generating the final response.

338

339By default, the `file_search` tool uses the following settings but these can be [configured](#customizing-file-search-settings) to suit your needs:

340

341- Chunk size: 800 tokens

342- Chunk overlap: 400 tokens

343- Embedding model: `text-embedding-3-large` at 256 dimensions

344- Maximum number of chunks added to context: 20 (could be fewer)

345- Ranker: `auto` (OpenAI will choose which ranker to use)

346- Score threshold: 0 minimum ranking score
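To see what the default chunk size and overlap imply, the sketch below computes token spans for a fixed-size chunker with overlap. It is only an illustration of the arithmetic, not OpenAI's actual chunking implementation, and `chunk_spans` is a hypothetical name.

```python
def chunk_spans(num_tokens, chunk_size=800, overlap=400):
    """Return (start, end) token offsets for fixed-size chunks with overlap."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far each new chunk advances
    spans = []
    start = 0
    while start < num_tokens:
        end = min(start + chunk_size, num_tokens)
        spans.append((start, end))
        if end == num_tokens:
            break  # the last chunk reached the end of the document
        start += step
    return spans


print(chunk_spans(2000))
```

With the defaults, each chunk starts 400 tokens after the previous one, so most tokens appear in two consecutive chunks.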

347

348**Known Limitations**

349

350We have a few known limitations we're working on adding support for in the coming months:

351

3521. Support for deterministic pre-search filtering using custom metadata.

3532. Support for parsing images within documents (including images of charts, graphs, tables etc.)

3543. Support for retrievals over structured file formats (like `csv` or `jsonl`).

3554. Better support for summarization — the tool today is optimized for search queries.

356

357## Vector stores

358

359Vector Store objects give the File Search tool the ability to search your files. Adding a file to a `vector_store` automatically parses, chunks, embeds and stores the file in a vector database that's capable of both keyword and semantic search. Each `vector_store` can hold up to 10,000 files. For vector stores created starting in November 2025, this limit is 100,000,000 files. Vector stores can be attached to both Assistants and Threads. Today, you can attach at most one vector store to an assistant and at most one vector store to a thread.

360

361#### Creating vector stores and adding files

277 362

278export const snippetCreatingVectorStores = {363You can create a vector store and add files to it in a single API call:

279 python: `364

365```python

280vector_store = client.vector_stores.create(366vector_store = client.vector_stores.create(

281 name="Product Documentation",367 name="Product Documentation",

282 file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5']368 file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5']

283)369)

284 `.trim(),370```

285 "node.js": `371

372```javascript

286const vectorStore = await openai.vectorStores.create({373const vectorStore = await openai.vectorStores.create({

287 name: "Product Documentation",374 name: "Product Documentation",

288 file_ids: ['file_1', 'file_2', 'file_3', 'file_4', 'file_5']375 file_ids: ['file_1', 'file_2', 'file_3', 'file_4', 'file_5']

289});376});

290 `.trim(),377```

291};378

292 379

293export const snippetVectorStoresAddFile = {380Adding files to vector stores is an async operation. To ensure the operation is complete, we recommend that you use the 'create and poll' helpers in our official SDKs. If you're not using the SDKs, you can retrieve the `vector_store` object and monitor its [`file_counts`](https://developers.openai.com/api/docs/api-reference/vector-stores/object#vector-stores/object-file_counts) property to see the result of the file ingestion operation.

294 python: `381

382Files can also be added to a vector store after it's created by [creating vector store files](https://developers.openai.com/api/docs/api-reference/vector-stores/createFile).

383

384Adding files is rate limited per vector store ID. Requests to `/vector_stores/{vector_store_id}/files` and `/vector_stores/{vector_store_id}/file_batches` share a per-vector-store limit of 300 requests per minute.

385

386```python

295file = client.vector_stores.files.create_and_poll(387file = client.vector_stores.files.create_and_poll(

296 vector_store_id="vs_abc123",388 vector_store_id="vs_abc123",

297 file_id="file-abc123"389 file_id="file-abc123"

298)390)

299 `.trim(),391```

300 "node.js": `392

393```javascript

301const file = await openai.vectorStores.files.createAndPoll(394const file = await openai.vectorStores.files.createAndPoll(

302 "vs_abc123",395 "vs_abc123",

303 { file_id: "file-abc123" }396 { file_id: "file-abc123" }

304);397);

305 `.trim(),398```

306};399

307 400

308export const snippetVectorStoresAddBatch = {401Alternatively, you can add several files to a vector store by [creating batches](https://developers.openai.com/api/docs/api-reference/vector-stores/createBatch) of up to 500 files.

309 python: `402

403Batch creation accepts either a simple list of `file_ids` or a `files` array made up of objects with a `file_id` plus optional `attributes` and `chunking_strategy`. Use `files` when you need per-file metadata or chunking settings, and note that `file_ids` and `files` are mutually exclusive in a single request.

404

405For high-throughput ingestion into one vector store, prefer file batches whenever possible to reduce request volume and improve latency.

406

407```python

310batch = client.vector_stores.file_batches.create_and_poll(408batch = client.vector_stores.file_batches.create_and_poll(

311 vector_store_id="vs_abc123",409 vector_store_id="vs_abc123",

312 files=[410 files=[

324 }422 }

325 ]423 ]

326)424)

327 `.trim(),425```

328 "node.js": `426

427```javascript

329const batch = await openai.vectorStores.fileBatches.createAndPoll(428const batch = await openai.vectorStores.fileBatches.createAndPoll(

330 "vs_abc123",429 "vs_abc123",

331 {430 {

345 ],444 ],

346 },445 },

347);446);

348 `.trim(),447```
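When there are more files than fit in one batch, the list of file IDs can be split before calling the batch endpoint. The sketch below assumes the 500-file batch limit mentioned above; `batch_file_ids` is a hypothetical helper and the file IDs are placeholders.

```python
MAX_BATCH_SIZE = 500  # batch limit stated in the text above


def batch_file_ids(file_ids, batch_size=MAX_BATCH_SIZE):
    """Split file IDs into consecutive batches of at most batch_size items."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")
    return [file_ids[i:i + batch_size] for i in range(0, len(file_ids), batch_size)]


ids = [f"file-{i}" for i in range(1200)]
batches = batch_file_ids(ids)
print([len(b) for b in batches])
```

Here 1,200 files need three batch requests instead of 1,200 single-file requests, which keeps ingestion well under the per-vector-store rate limit.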

349};448

449

450Similarly, these files can be removed from a vector store by either:

451

452- Deleting the [vector store file object](https://developers.openai.com/api/docs/api-reference/vector-stores/deleteFile) or,

453- Deleting the underlying [file object](https://developers.openai.com/api/docs/api-reference/files/delete) (which removes the file from all `vector_store` and `code_interpreter` configurations across all assistants and threads in your organization)

350 454

351export const snippetAttachingVectorStores = {455The maximum file size is 512 MB. Each file should contain no more than 5,000,000 tokens (computed automatically when you attach a file).

352 python: `456

457File Search supports a variety of file formats including `.pdf`, `.md`, and `.docx`. More details on the file extensions (and their corresponding MIME-types) supported can be found in the [Supported files](#supported-files) section below.

458

459#### Attaching vector stores

460

461You can attach vector stores to your Assistant or Thread using the `tool_resources` parameter.

462

463```python

353assistant = client.beta.assistants.create(464assistant = client.beta.assistants.create(

354 instructions="You are a helpful product support assistant and you answer questions based on the files provided to you.",465 instructions="You are a helpful product support assistant and you answer questions based on the files provided to you.",

355 model="gpt-4o",466 model="gpt-4o",

369}480}

370}481}

371)482)

372`.trim(),483```

373 "node.js": `484

485```javascript

374const assistant = await openai.beta.assistants.create({486const assistant = await openai.beta.assistants.create({

375instructions: "You are a helpful product support assistant and you answer questions based on the files provided to you.",487instructions: "You are a helpful product support assistant and you answer questions based on the files provided to you.",

376model: "gpt-4o",488model: "gpt-4o",

390}502}

391}503}

392});504});

393`.trim(),505```

394};506

507

508You can also attach a vector store to Threads or Assistants after they're created by updating them with the right `tool_resources`.

509

510#### Ensuring vector store readiness before creating runs

511

512We highly recommend that you ensure all files in a `vector_store` are fully processed before you create a run. This will ensure that all the data in your `vector_store` is searchable. You can check for `vector_store` readiness by using the polling helpers in our SDKs, or by manually polling the `vector_store` object to ensure the [`status`](https://developers.openai.com/api/docs/api-reference/vector-stores/object#vector-stores/object-status) is `completed`.
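If you are polling manually rather than using the SDK helpers, the loop might look like the sketch below. `fetch_status` is a hypothetical callable that returns the vector store's current `status` string; the timeout and interval values are arbitrary choices, not documented recommendations.

```python
import time


def wait_until_ready(fetch_status, timeout_s=60.0, interval_s=1.0, sleep=time.sleep):
    """Poll until the status is "completed"; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status == "completed":
            return True
        if status != "in_progress":
            # For example "expired": waiting longer will not help.
            raise RuntimeError(f"unexpected vector store status: {status}")
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Simulated status sequence standing in for real API calls. In real use,
# fetch_status might wrap a call that retrieves the vector store object and
# reads its status (an assumption about your SDK version).
statuses = iter(["in_progress", "in_progress", "completed"])
ready = wait_until_ready(lambda: next(statuses), sleep=lambda s: None)
print(ready)
```

Creating the run only after this returns `True` avoids searching a vector store that is still being indexed.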

513

514As a fallback, we've built a **60 second maximum wait** in the Run object when the **thread’s** vector store contains files that are still being processed. This is to ensure that any files your users upload in a thread are fully searchable before the run proceeds. This fallback wait _does not_ apply to the assistant's vector store.

#### Customizing File Search settings

You can customize how the `file_search` tool chunks your data and how many chunks it returns to the model context.

**Chunking configuration**

By default, `max_chunk_size_tokens` is set to `800` and `chunk_overlap_tokens` is set to `400`, meaning every file is indexed by being split up into 800-token chunks, with 400-token overlap between consecutive chunks.

You can adjust this by setting [`chunking_strategy`](https://developers.openai.com/api/docs/api-reference/vector-stores-files/createFile#vector-stores-files-createfile-chunking_strategy) when adding files to the vector store. There are certain limitations to `chunking_strategy`:

- `max_chunk_size_tokens` must be between 100 and 4096 inclusive.
- `chunk_overlap_tokens` must be non-negative and should not exceed `max_chunk_size_tokens / 2`.
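
The two limits above are easy to check before sending a request. The sketch below validates them and returns a payload. The helper name `static_chunking_strategy` is hypothetical, and the `{"type": "static", "static": {...}}` payload shape is not spelled out in this section, so treat it as an assumption to confirm against the linked API reference.

```python
def static_chunking_strategy(max_chunk_size_tokens=800, chunk_overlap_tokens=400):
    """Build a `chunking_strategy` payload after checking the documented limits.

    Hypothetical helper; the defaults mirror the documented defaults (800 / 400).
    """
    if not 100 <= max_chunk_size_tokens <= 4096:
        raise ValueError("max_chunk_size_tokens must be between 100 and 4096 inclusive")
    if chunk_overlap_tokens < 0:
        raise ValueError("chunk_overlap_tokens must be non-negative")
    if chunk_overlap_tokens > max_chunk_size_tokens / 2:
        raise ValueError("chunk_overlap_tokens must not exceed max_chunk_size_tokens / 2")
    # Assumed payload shape: a static strategy carrying the two settings.
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_chunk_size_tokens,
            "chunk_overlap_tokens": chunk_overlap_tokens,
        },
    }
```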

**Number of chunks**

By default, the `file_search` tool outputs up to 20 chunks for `gpt-4*` and o-series models and up to 5 chunks for `gpt-3.5-turbo`. You can adjust this by setting [`file_search.max_num_results`](https://developers.openai.com/api/docs/api-reference/assistants/createAssistant#assistants-createassistant-tools) in the tool when creating the assistant or the run.

Note that the `file_search` tool may output fewer chunks than this, for example when:

- The total number of chunks is fewer than `max_num_results`.
- The total token size of all the retrieved chunks exceeds the token "budget" assigned to the `file_search` tool. The `file_search` tool currently has a token budget of:
  - 4,000 tokens for `gpt-3.5-turbo`
  - 16,000 tokens for `gpt-4*` models
  - 16,000 tokens for o-series models
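
To see how these limits interact, the following back-of-the-envelope sketch estimates an upper bound on the number of chunks that can reach the model. It assumes every chunk is exactly `chunk_size_tokens` long, which is a simplification (real chunks may be shorter), and the dictionary keys are hypothetical labels for the model families listed above, not API values.

```python
# Token budgets as listed above; the keys are illustrative labels only.
TOKEN_BUDGETS = {"gpt-3.5-turbo": 4_000, "gpt-4": 16_000, "o-series": 16_000}


def max_chunks_in_context(model_family, max_num_results, total_chunks,
                          chunk_size_tokens=800):
    """Estimate an upper bound on the chunks returned to the model.

    Simplifying assumption: every chunk uses `chunk_size_tokens` tokens.
    """
    budget_limit = TOKEN_BUDGETS[model_family] // chunk_size_tokens
    return min(max_num_results, total_chunks, budget_limit)
```

With the default 800-token chunks, the budgets work out to 20 full chunks for a 16,000-token budget and 5 for a 4,000-token budget, which lines up with the default result counts. With 4,096-token chunks, the same 16,000-token budget would fit at most 3 full chunks regardless of `max_num_results`.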

#### Improve file search result relevance with chunk ranking

By default, the file search tool returns to the model every search result that it considers to have any level of relevance when generating a response. However, if responses are generated using content that has low relevance, it can lead to lower quality responses. You can address this by first inspecting the file search results that are returned when generating responses, and then tuning the file search tool's ranker to change how relevant results must be before they are used to generate a response.

**Inspecting file search chunks**

The first step in improving the quality of your file search results is inspecting the current behavior of your assistant. Most often, this will involve investigating responses from your assistant that are not performing well. You can get [granular information about a past run step](https://developers.openai.com/api/docs/api-reference/run-steps/getRunStep) using the REST API, specifically using the `include` query parameter to get the file chunks that are being used to generate results.

Include file search results in response when creating a run

```python
from openai import OpenAI
client = OpenAI()

# ...
)

print(run_step)
```

```javascript
import OpenAI from "openai";
const openai = new OpenAI();

const runStep = await openai.beta.threads.runs.steps.retrieve(
  // ...
);

console.log(runStep);
```

```bash
# The URL is quoted so the shell does not try to expand ?, [] or *.
curl -g "https://api.openai.com/v1/threads/thread_abc123/runs/run_abc123/steps/step_abc123?include[]=step_details.tool_calls[*].file_search.results[*].content" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2"
```

You can then log and inspect the search results used during the run step, and determine whether or not they are consistently relevant to the responses your assistant should generate.

**Configure ranking options**

If you have determined that your file search results are not sufficiently relevant to generate high quality responses, you can adjust the settings of the result ranker used to choose which search results should be used to generate responses. You can adjust this by setting [`file_search.ranking_options`](https://developers.openai.com/api/docs/api-reference/assistants/createAssistant#assistants-createassistant-tools) in the tool when **creating the assistant** or **creating the run**.

The settings you can configure are:

- `ranker` - which ranker to use in determining which chunks to use. The available values are `auto`, which uses the latest available ranker, and `default_2024_08_21`.
- `score_threshold` - a ranking score between 0.0 and 1.0, with 1.0 being the highest ranking. A higher number will constrain the file chunks used to generate a result to only chunks with a higher possible relevance, at the cost of potentially leaving out relevant chunks.
- `hybrid_search.embedding_weight` (also referred to as `rrf_embedding_weight`) - determines how much weight to give to semantic similarity when combining dense (embedding) and sparse (text) rankings with [reciprocal rank fusion](https://en.wikipedia.org/wiki/Reciprocal_rank_fusion). Increase this weight to favor chunks that are close in embedding space.
- `hybrid_search.text_weight` (also referred to as `rrf_text_weight`) - determines how much weight to give to keyword/text matching when hybrid search is enabled. Increase this weight to favor chunks that share exact terms with the query.

At least one of `hybrid_search.embedding_weight` or `hybrid_search.text_weight` must be greater than zero when hybrid search is configured.
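
The settings above can be assembled into a `file_search` tool entry along the following lines. This is a sketch, not a reference implementation: `file_search_tool` is a hypothetical helper, the placement of `hybrid_search` inside `ranking_options` is inferred from the dotted names above, and the non-negativity check on the weights is an extra assumption. Confirm the exact shape against the linked API reference.

```python
def file_search_tool(max_num_results=None, ranker="auto", score_threshold=0.0,
                     embedding_weight=None, text_weight=None):
    """Build a `file_search` tool entry with ranking options (illustrative helper)."""
    if not 0.0 <= score_threshold <= 1.0:
        raise ValueError("score_threshold must be between 0.0 and 1.0")
    ranking_options = {"ranker": ranker, "score_threshold": score_threshold}

    if embedding_weight is not None or text_weight is not None:
        weights = {
            "embedding_weight": 0.0 if embedding_weight is None else embedding_weight,
            "text_weight": 0.0 if text_weight is None else text_weight,
        }
        # Assumption: weights are non-negative numbers.
        if min(weights.values()) < 0:
            raise ValueError("hybrid search weights must be non-negative")
        # Documented rule: at least one weight must be greater than zero.
        if max(weights.values()) <= 0:
            raise ValueError("at least one hybrid search weight must be greater than zero")
        # Assumed nesting, inferred from the dotted setting names.
        ranking_options["hybrid_search"] = weights

    file_search = {"ranking_options": ranking_options}
    if max_num_results is not None:
        file_search["max_num_results"] = max_num_results
    return {"type": "file_search", "file_search": file_search}
```

For example, `file_search_tool(max_num_results=10, score_threshold=0.5)` yields a tool entry that keeps the `auto` ranker but drops results scoring below 0.5.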

#### Managing costs with expiration policies

The `file_search` tool uses the `vector_stores` object as its resource and you will be billed based on the [size](https://developers.openai.com/api/docs/api-reference/vector-stores/object#vector-stores/object-bytes) of the `vector_store` objects created. The size of the vector store object is the sum of all the parsed chunks from your files and their corresponding embeddings.

Your first GB is free and beyond that, usage is billed at $0.10/GB/day of vector storage. There are no other costs associated with vector store operations.
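
As a rough illustration of the pricing above, the sketch below estimates storage cost for a given size and duration. It assumes the stored size stays constant over the period and that the free first GB applies to the total being measured; `vector_storage_cost` is a hypothetical helper, not part of any SDK.

```python
FREE_GB = 1.0            # first GB is free, per the pricing above
PRICE_PER_GB_DAY = 0.10  # USD per GB per day beyond the free GB


def vector_storage_cost(total_gb, days):
    """Estimate vector storage cost in USD for a constant size over `days` days."""
    billable_gb = max(0.0, total_gb - FREE_GB)
    return billable_gb * PRICE_PER_GB_DAY * days
```

Under these assumptions, 3 GB held for 30 days comes to roughly $6, while anything at or below 1 GB costs nothing.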

To help you manage the costs associated with these `vector_store` objects, we have added support for expiration policies in the `vector_store` object. You can set these policies when creating or updating the `vector_store` object.

```python
vector_store = client.vector_stores.create_and_poll(
  name="Product Documentation",
  file_ids=['file_1', 'file_2', 'file_3', 'file_4', 'file_5'],
  # ...
      "days": 7
  }
)
```

```javascript
let vectorStore = await openai.vectorStores.create({
  name: "rag-store",
  file_ids: ['file_1', 'file_2', 'file_3', 'file_4', 'file_5'],
  // ...
    days: 7
  }
});
```

**Thread vector stores have default expiration policies**

Vector stores created using thread helpers (like [`tool_resources.file_search.vector_stores`](https://developers.openai.com/api/docs/api-reference/threads/createThread#threads-createthread-tool_resources) in Threads or [message.attachments](https://developers.openai.com/api/docs/api-reference/messages/createMessage#messages-createmessage-attachments) in Messages) have a default expiration policy of 7 days after they were last active (defined as the last time the vector store was part of a run).
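
The 7-day default described above amounts to simple timestamp arithmetic. The sketch below shows it under the assumption that the last-active time is available as a Unix timestamp in seconds. Both helper names are hypothetical, and the actual expiry is computed by the API, not by client code.

```python
SECONDS_PER_DAY = 86_400


def expires_at(last_active_at, days=7):
    """Return the Unix timestamp at which a store would expire.

    `last_active_at` is assumed to be a Unix timestamp in seconds for the
    last time the vector store was part of a run.
    """
    return last_active_at + days * SECONDS_PER_DAY


def is_expired(last_active_at, now, days=7):
    """True once `now` has reached the computed expiry time."""
    return now >= expires_at(last_active_at, days)
```

Because the policy is anchored to the last activity, every run that uses the store effectively pushes the expiry back by another 7 days.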

When a vector store expires, runs on that thread will fail. To fix this, create a new `vector_store` with the same files and reattach it to the thread.

```python
all_files = list(client.vector_stores.files.list("vs_expired"))

vector_store = client.vector_stores.create(name="rag-store")
# ...
    client.vector_stores.file_batches.create_and_poll(
        vector_store_id=vector_store.id, file_ids=[file.id for file in file_batch]
    )
```

```javascript
const fileIds = [];
for await (const file of openai.vectorStores.files.list(
  "vs_toWTk90YblRLCkbE2xSVoJlF",
  // ...
  tool_resources: { file_search: { vector_store_ids: [vectorStore.id] } },
});

for (const fileBatch of _.chunk(fileIds, 100)) {
  await openai.vectorStores.fileBatches.create(vectorStore.id, {
    file_ids: fileBatch,
  });
}
```

## Overview

File Search augments the Assistant with knowledge from outside its model, such as proprietary product information or documents provided by your users. OpenAI automatically parses and chunks your documents, creates and stores the embeddings, and uses both vector and keyword search to retrieve relevant content to answer user queries.

## Quickstart

In this example, we’ll create an assistant that can help answer questions about companies’ financial statements.

### Step 1: Create a new Assistant with File Search Enabled

Create a new assistant with `file_search` enabled in the `tools` parameter of the Assistant.

Once the `file_search` tool is enabled, the model decides when to retrieve content based on user messages.

### Step 2: Upload files and add them to a Vector Store

To access your files, the `file_search` tool uses the Vector Store object.
Upload your files and create a Vector Store to contain them.
Once the Vector Store is created, you should poll its status until all files are out of the `in_progress` state to
ensure that all content has finished processing. The SDK provides helpers for uploading and polling in one step.

### Step 3: Update the assistant to use the new Vector Store

To make the files accessible to your assistant, update the assistant’s `tool_resources` with the new `vector_store` id.

### Step 4: Create a thread

You can also attach files as Message attachments on your thread. Doing so will create another `vector_store` associated with the thread, or, if there is already a vector store attached to this thread, attach the new files to the existing thread vector store. When you create a Run on this thread, the file search tool will query both the `vector_store` from your assistant and the `vector_store` on the thread.

In this example, the user attached a copy of Apple’s latest 10-K filing.

Vector stores created using message attachments have a default expiration policy of 7 days after they were last active (defined as the last time the vector store was part of a run). This default exists to help you manage your vector storage costs. You can override these expiration policies at any time. Learn more [here](#managing-costs-with-expiration-policies).

### Step 5: Create a run and check the output

Now, create a Run and observe that the model uses the File Search tool to provide a response to the user’s question.

<div data-content-switcher-pane data-value="streaming">
    <div class="hidden">With streaming</div>
  </div>
  <div data-content-switcher-pane data-value="without-streaming" hidden>
    <div class="hidden">Without streaming</div>
  </div>

Your new assistant will query both attached vector stores (one containing `goog-10k.pdf` and `brka-10k.txt`, and the other containing `aapl-10k.pdf`) and return this result from `aapl-10k.pdf`.

To retrieve the contents of the file search results that were used by the model, use the `include` query parameter and provide a value of `step_details.tool_calls[*].file_search.results[*].content` in the format `?include[]=step_details.tool_calls[*].file_search.results[*].content`.

## How it works

The `file_search` tool implements several retrieval best practices out of the box to help you extract the right data from your files and augment the model’s responses. The `file_search` tool:

- Rewrites user queries to optimize them for search.
- Breaks down complex user queries into multiple searches it can run in parallel.
- Runs both keyword and semantic searches across both assistant and thread vector stores.
- Reranks search results to pick the most relevant ones before generating the final response.

By default, the `file_search` tool uses the following settings, but these can be [configured](#customizing-file-search-settings) to suit your needs:

- Chunk size: 800 tokens
- Chunk overlap: 400 tokens
- Embedding model: `text-embedding-3-large` at 256 dimensions
- Maximum number of chunks added to context: 20 (could be fewer)
- Ranker: `auto` (OpenAI will choose which ranker to use)
- Score threshold: 0 minimum ranking score

**Known Limitations**

File Search has a few known limitations. We're working on adding support for the following in the coming months:

1. Support for deterministic pre-search filtering using custom metadata.
2. Support for parsing images within documents (including images of charts, graphs, tables, etc.).
3. Support for retrievals over structured file formats (like `csv` or `jsonl`).
4. Better support for summarization. The tool today is optimized for search queries.

## Vector stores

Vector Store objects give the File Search tool the ability to search your files. Adding a file to a `vector_store` automatically parses, chunks, embeds and stores the file in a vector database that's capable of both keyword and semantic search. Each `vector_store` can hold up to 10,000 files. For vector stores created starting in November 2025, this limit is 100,000,000 files. Vector stores can be attached to both Assistants and Threads. Today, you can attach at most one vector store to an assistant and at most one vector store to a thread.

#### Creating vector stores and adding files

You can create a vector store and add files to it in a single API call:

Adding files to vector stores is an async operation. To ensure the operation is complete, we recommend that you use the 'create and poll' helpers in our official SDKs. If you're not using the SDKs, you can retrieve the `vector_store` object and monitor its [`file_counts`](https://developers.openai.com/api/docs/api-reference/vector-stores/object#vector-stores/object-file_counts) property to see the result of the file ingestion operation.

Files can also be added to a vector store after it's created by [creating vector store files](https://developers.openai.com/api/docs/api-reference/vector-stores/createFile).

Adding files is rate limited per vector store ID. Requests to `/vector_stores/{vector_store_id}/files` and `/vector_stores/{vector_store_id}/file_batches` share a per-vector-store limit of 300 requests per minute.

Alternatively, you can add several files to a vector store by [creating batches](https://developers.openai.com/api/docs/api-reference/vector-stores/createBatch) of up to 500 files.

Batch creation accepts either a simple list of `file_ids` or a `files` array made up of objects with a `file_id` plus optional `attributes` and `chunking_strategy`. Use `files` when you need per-file metadata or chunking settings, and note that `file_ids` and `files` are mutually exclusive in a single request.

For high-throughput ingestion into one vector store, prefer file batches whenever possible to reduce request volume and improve latency.
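
To stay within the 500-file batch limit mentioned above, a long list of file IDs can be split before submission. The helper below is a hypothetical sketch; it only groups IDs and makes no API calls itself.

```python
MAX_FILES_PER_BATCH = 500  # batch size limit stated above


def batch_file_ids(file_ids, batch_size=MAX_FILES_PER_BATCH):
    """Split `file_ids` into consecutive lists of at most `batch_size` items."""
    if not 1 <= batch_size <= MAX_FILES_PER_BATCH:
        raise ValueError("batch_size must be between 1 and 500")
    return [file_ids[i:i + batch_size] for i in range(0, len(file_ids), batch_size)]
```

Each resulting list can then be passed as `file_ids` to a file batch request. For 1,200 files this produces three requests instead of 1,200 individual file additions, which also keeps you well under the shared 300 requests per minute limit.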

Similarly, these files can be removed from a vector store by either:

- Deleting the [vector store file object](https://developers.openai.com/api/docs/api-reference/vector-stores/deleteFile), or
- Deleting the underlying [file object](https://developers.openai.com/api/docs/api-reference/files/delete) (which removes the file from all `vector_store` and `code_interpreter` configurations across all assistants and threads in your organization)

The maximum file size is 512 MB. Each file should contain no more than 5,000,000 tokens (computed automatically when you attach a file).

File Search supports a variety of file formats including `.pdf`, `.md`, and `.docx`. More details on the file extensions (and their corresponding MIME-types) supported can be found in the [Supported files](#supported-files) section below.

#### Attaching vector stores

You can attach vector stores to your Assistant or Thread using the `tool_resources` parameter.


## Supported files

assistants/tools/function-calling.md +83 −86

Details

# Assistants Function Calling

## Overview

Similar to the Chat Completions API, the Assistants API supports function calling. Function calling allows you to describe functions to the Assistants API and have it intelligently return the functions that need to be called along with their arguments.

## Quickstart

In this example, we'll create a weather assistant and define two functions,
`get_current_temperature` and `get_rain_probability`, as tools that the Assistant can call.
Depending on the user query, the model will invoke parallel function calling if you are using one of our
latest models released on or after Nov 6, 2023.
In our example that uses parallel function calling, we will ask the Assistant what the weather in
San Francisco is like today and the chances of rain. We also show how to output the Assistant's response with streaming.

With the launch of Structured Outputs, you can now use the parameter `strict: true` when using function calling with the Assistants API. For more information, refer to the [Function calling guide](https://developers.openai.com/api/docs/guides/function-calling#function-calling-with-structured-outputs). Please note that Structured Outputs are not supported in the Assistants API when using vision.

### Step 1: Define functions

When creating your assistant, you will first define the functions under the `tools` param of the assistant.

```python
from openai import OpenAI
client = OpenAI()

# ...
    }
  ]
)
```

```javascript
const assistant = await client.beta.assistants.create({
  model: "gpt-4o",
  instructions:
  // ...
    },
  ],
});
```

### Step 2: Create a Thread and add Messages

Create a Thread when a user starts a conversation and add Messages to the Thread as the user asks questions.

```python
thread = client.beta.threads.create()
message = client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content="What's the weather in San Francisco today and the likelihood it'll rain?",
)
```

```javascript
const thread = await client.beta.threads.create();
// Await the call so `message` holds the created Message rather than a Promise.
const message = await client.beta.threads.messages.create(thread.id, {
  role: "user",
  content: "What's the weather in San Francisco today and the likelihood it'll rain?",
});
```

### Step 3: Initiate a Run

When you initiate a Run on a Thread containing a user Message that triggers one or more functions,
the Run will enter a `pending` status. After it processes, the run will enter a `requires_action` state which you can
verify by checking the Run’s `status`. This indicates that you need to run tools and submit their outputs to the
Assistant to continue Run execution. In our case, we will see two `tool_calls`, which indicates that the
user query resulted in parallel function calling.

Note that runs expire ten minutes after creation. Be sure to submit your tool outputs before the 10-minute mark.

You will see two `tool_calls` within `required_action`, which indicates the user query triggered parallel function calling.

```json
{
  "id": "run_qJL1kI9xxWlfE0z1yfL0fGg9",
  ...
    "type": "submit_tool_outputs"
  }
}
```

<figcaption>Run object truncated here for readability</figcaption>
<br />

How you initiate a Run and submit tool outputs differs depending on whether you are using streaming or not,
although in both cases the outputs for all `tool_calls` need to be submitted at the same time.
You can then complete the Run by submitting the tool outputs from the functions you called.
Pass each `tool_call_id` referenced in the `required_action` object to match outputs to each function call.
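
The matching of outputs to `tool_call_id` values described above can be sketched as follows. Everything here is illustrative: the two handlers are stubs returning placeholder strings, their argument names are assumptions (the function schemas are not shown in full in this section), and the tool calls are represented as plain dictionaries with `id`, `function.name`, and JSON-encoded `function.arguments`.

```python
import json


def get_current_temperature(location, unit="Fahrenheit"):
    """Stub handler; a real implementation would query a weather service."""
    return "57"


def get_rain_probability(location):
    """Stub handler returning a placeholder probability."""
    return "0.06"


# Map function names from the tool calls to local handlers.
HANDLERS = {
    "get_current_temperature": get_current_temperature,
    "get_rain_probability": get_rain_probability,
}


def build_tool_outputs(tool_calls):
    """Run every requested function and pair each result with its tool_call_id."""
    outputs = []
    for call in tool_calls:
        handler = HANDLERS[call["function"]["name"]]
        arguments = json.loads(call["function"]["arguments"])
        outputs.append({
            "tool_call_id": call["id"],
            "output": str(handler(**arguments)),
        })
    return outputs
```

Building the complete list first makes it straightforward to submit all outputs together in a single request, as required above.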

<div data-content-switcher-pane data-value="streaming">
    <div class="hidden">With streaming</div>
  </div>
  <div data-content-switcher-pane data-value="without-streaming" hidden>
    <div class="hidden">Without streaming</div>
  </div>

### Using Structured Outputs

When you enable [Structured Outputs](https://developers.openai.com/api/docs/guides/structured-outputs) by supplying `strict: true`, the OpenAI API will pre-process your supplied schema on your first request, and then use this artifact to constrain the model to your schema.

```python
from openai import OpenAI
client = OpenAI()

# ...
    }
  ]
)
```

```javascript
const assistant = await client.beta.assistants.create({
  model: "gpt-4o-2024-08-06",
  instructions:
  // ...
    },
  ],
});

280`.trim(),

281};

~~282~~

283## Overview

~~284~~

285Similar to the Chat Completions API, the Assistants API supports function calling. Function calling allows you to describe functions to the Assistants API and have it intelligently return the functions that need to be called along with their arguments.

~~286~~

287## Quickstart

~~288~~

289In this example, we'll create a weather assistant and define two functions,

290`get_current_temperature` and `get_rain_probability`, as tools that the Assistant can call.

291Depending on the user query, the model will invoke parallel function calling if using our

292latest models released on or after Nov 6, 2023.

293In our example that uses parallel function calling, we will ask the Assistant what the weather in

294San Francisco is like today and the chances of rain. We also show how to output the Assistant's response with streaming.

~~295~~

296With the launch of Structured Outputs, you can now use the parameter `strict:

297 true` when using function calling with the Assistants API. For more

298 information, refer to the [Function calling

299 guide](https://developers.openai.com/api/docs/guides/function-calling#function-calling-with-structured-outputs).

300 Please note that Structured Outputs are not supported in the Assistants API

301 when using vision.

~~302~~

303### Step 1: Define functions

~~304~~

305When creating your assistant, you will first define the functions under the `tools` param of the assistant.

~~306~~

307### Step 2: Create a Thread and add Messages

~~308~~

309Create a Thread when a user starts a conversation and add Messages to the Thread as the user asks questions.

~~310~~

311### Step 3: Initiate a Run

~~312~~

313When you initiate a Run on a Thread containing a user Message that triggers one or more functions,

314the Run will enter a `queued` status. After it processes, the Run will enter a `requires_action` state, which you can

315verify by checking the Run’s `status`. This indicates that you need to run tools and submit their outputs to the

316Assistant to continue Run execution. In our case, we will see two `tool_calls`, which indicates that the

317user query resulted in parallel function calling.

~~318~~

319Note that runs expire ten minutes after creation. Be sure to submit your

320 tool outputs before the 10-minute mark.

~~321~~

322You will see two `tool_calls` within `required_action`, which indicates the user query triggered parallel function calling.

~~323~~

324<figcaption>Run object truncated here for readability</figcaption>

325<br />

~~326~~

327How you initiate a Run and submit `tool_calls` will differ depending on whether you are using streaming or not,

328although in both cases all `tool_calls` need to be submitted at the same time.

329You can then complete the Run by submitting the tool outputs from the functions you called.

330Pass each `tool_call_id` referenced in the `required_action` object to match outputs to each function call.

~~331~~

~~332~~

~~333~~

334<div data-content-switcher-pane data-value="streaming">

335 <div class="hidden">With streaming</div>

336 </div>

337 <div data-content-switcher-pane data-value="without-streaming" hidden>

338 <div class="hidden">Without streaming</div>

339 </div>

~~340~~

~~341~~

~~342~~

343### Using Structured Outputs

~~344~~

345When you enable [Structured Outputs](https://developers.openai.com/api/docs/guides/structured-outputs) by supplying `strict: true`, the OpenAI API will pre-process your supplied schema on your first request, and then use this artifact to constrain the model to your schema.

342```

guides/admin-apis.md +403 −0

18 18

19Set `OPENAI_ADMIN_KEY`, then initialize the SDK for your language.19Set `OPENAI_ADMIN_KEY`, then initialize the SDK for your language.

20 20

21Set up the SDK with an Admin API key

23```javascript

24import OpenAI from "openai";

26const client = new OpenAI({

27 adminAPIKey: process.env.OPENAI_ADMIN_KEY,

28});

29```

31```python

32import os

33from openai import OpenAI

35client = OpenAI(

36 admin_api_key=os.environ["OPENAI_ADMIN_KEY"],

37)

38```

40```go

41package main

43import (

44 "os"

46 "github.com/openai/openai-go/v3"

47 "github.com/openai/openai-go/v3/option"

48)

50func main() {

51 client := openai.NewClient(

52 option.WithAdminAPIKey(os.Getenv("OPENAI_ADMIN_KEY")),

53 )

55 _ = client

56}

57```

59```ruby

60require "openai"

62openai = OpenAI::Client.new(

63 admin_api_key: ENV.fetch("OPENAI_ADMIN_KEY")

64)

65```

67```java

68import com.openai.client.OpenAIClient;

69import com.openai.client.okhttp.OpenAIOkHttpClient;

71OpenAIClient client = OpenAIOkHttpClient.builder()

72 .adminApiKey(System.getenv("OPENAI_ADMIN_KEY"))

73 .build();

74```

21## Restrict model access for projects77## Restrict model access for projects

22 78

23Use project model permissions to set an allowlist or denylist for a project. Set `mode` to `allow_list` to allow only the listed models, or set `mode` to `deny_list` to block the listed models while allowing other available models. Model IDs must be visible to the organization, including visible fine-tuned model snapshots.79Use project model permissions to set an allowlist or denylist for a project. Set `mode` to `allow_list` to allow only the listed models, or set `mode` to `deny_list` to block the listed models while allowing other available models. Model IDs must be visible to the organization, including visible fine-tuned model snapshots.
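
To make the two modes concrete, here is a minimal local sketch of the allowlist/denylist semantics described above. It is only an illustration of the rule, not how the API enforces it; `is_model_allowed` is a hypothetical helper, and it ignores the separate requirement that model IDs be visible to the organization.

```python
def is_model_allowed(mode: str, model_ids: list[str], model: str) -> bool:
    """Evaluate one model against a project's allowlist or denylist."""
    listed = model in set(model_ids)
    if mode == "allow_list":
        # Only the listed models may be used.
        return listed
    if mode == "deny_list":
        # Listed models are blocked; everything else stays available.
        return not listed
    raise ValueError(f"unknown mode: {mode!r}")


models = ["gpt-4.1", "o3"]
allowed_under_allow_list = is_model_allowed("allow_list", models, "o3")
allowed_under_deny_list = is_model_allowed("deny_list", models, "o3")
```

With the same `model_ids`, switching `mode` inverts the result for every model, which is why it is worth double-checking the mode before updating a project.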

24 80

81Set a project model allowlist/denylist

83```javascript

84const modelPermissions =

85 await client.admin.organization.projects.modelPermissions.update("proj_abc", {

86 mode: "allow_list",

87 model_ids: ["gpt-4.1", "o3"],

88 });

90console.log(modelPermissions.mode);

91```

93```python

94model_permissions = client.admin.organization.projects.model_permissions.update(

95 "proj_abc",

96 mode="allow_list",

97 model_ids=["gpt-4.1", "o3"],

98)

100print(model_permissions.mode)

101```

102

103```go

104ctx := context.Background()

105

106modelPermissions, err := client.Admin.Organization.Projects.ModelPermissions.Update(

107 ctx,

108 "proj_abc",

109 openai.AdminOrganizationProjectModelPermissionUpdateParams{

110 Mode: openai.AdminOrganizationProjectModelPermissionUpdateParamsModeAllowList,

111 ModelIDs: []string{"gpt-4.1", "o3"},

112 },

113)

114if err != nil {

115 panic(err)

116}

117

118println(modelPermissions.Mode)

119```

120

121```ruby

122model_permissions = openai.admin.organization.projects.model_permissions.update(

123 "proj_abc",

124 mode: :allow_list,

125 model_ids: ["gpt-4.1", "o3"]

126)

127

128puts(model_permissions.mode)

129```

130

131```java

132import com.openai.models.admin.organization.projects.modelpermissions.ModelPermissionUpdateParams;

133import com.openai.models.admin.organization.projects.modelpermissions.ProjectModelPermissions;

134import java.util.List;

135

136ProjectModelPermissions modelPermissions = client.admin()

137 .organization()

138 .projects()

139 .modelPermissions()

140 .update(

141 "proj_abc",

142 ModelPermissionUpdateParams.builder()

143 .mode(ModelPermissionUpdateParams.Mode.ALLOW_LIST)

144 .modelIds(List.of("gpt-4.1", "o3"))

145 .build()

146 );

147

148System.out.println(modelPermissions.mode());

149```

150

151

25## Manage spend limit alerts152## Manage spend limit alerts

26 153

27Use project spend alerts to notify your team when project spend reaches a threshold. Threshold amounts are specified in cents.154Use project spend alerts to notify your team when project spend reaches a threshold. Threshold amounts are specified in cents.
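
Because thresholds are expressed in cents, a small conversion helper can help avoid off-by-100 mistakes. The sketch below assumes a currency with 100 minor units per major unit (such as USD); `dollars_to_cents` is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal, ROUND_HALF_UP


def dollars_to_cents(amount: str) -> int:
    """Convert a decimal dollar string to an integer number of cents."""
    # Decimal avoids binary floating-point rounding on values like 19.99.
    cents = (Decimal(amount) * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return int(cents)


# A 500-dollar monthly threshold expressed in cents.
threshold_amount = dollars_to_cents("500.00")
```

The resulting integer is the kind of value the `threshold_amount` field expects.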

28 155

156Create a project spend limit alert

157

158```javascript

159const spendAlert =

160 await client.admin.organization.projects.spendAlerts.create("proj_abc", {

161 currency: "USD",

162 interval: "month",

163 notification_channel: {

164 recipients: ["billing@example.com"],

165 type: "email",

166 subject_prefix: "[OpenAI spend]",

167 },

168 threshold_amount: 50000,

169 });

170

171console.log(spendAlert.id);

172```

173

174```python

175spend_alert = client.admin.organization.projects.spend_alerts.create(

176 "proj_abc",

177 currency="USD",

178 interval="month",

179 notification_channel={

180 "recipients": ["billing@example.com"],

181 "type": "email",

182 "subject_prefix": "[OpenAI spend]",

183 },

184 threshold_amount=50000,

185)

186

187print(spend_alert.id)

188```

189

190```go

191ctx := context.Background()

192

193spendAlert, err := client.Admin.Organization.Projects.SpendAlerts.New(

194 ctx,

195 "proj_abc",

196 openai.AdminOrganizationProjectSpendAlertNewParams{

197 Currency: openai.AdminOrganizationProjectSpendAlertNewParamsCurrencyUsd,

198 Interval: openai.AdminOrganizationProjectSpendAlertNewParamsIntervalMonth,

199 NotificationChannel: openai.AdminOrganizationProjectSpendAlertNewParamsNotificationChannel{

200 Recipients: []string{"billing@example.com"},

201 Type: "email",

202 SubjectPrefix: openai.String("[OpenAI spend]"),

203 },

204 ThresholdAmount: 50000,

205 },

206)

207if err != nil {

208 panic(err)

209}

210

211println(spendAlert.ID)

212```

213

214```ruby

215spend_alert = openai.admin.organization.projects.spend_alerts.create(

216 "proj_abc",

217 currency: :USD,

218 interval: :month,

219 notification_channel: {

220 recipients: ["billing@example.com"],

221 type: :email,

222 subject_prefix: "[OpenAI spend]"

223 },

224 threshold_amount: 50_000

225)

226

227puts(spend_alert.id)

228```

229

230```java

231import com.openai.models.admin.organization.projects.spendalerts.ProjectSpendAlert;

232import com.openai.models.admin.organization.projects.spendalerts.SpendAlertCreateParams;

233

234ProjectSpendAlert spendAlert = client.admin()

235 .organization()

236 .projects()

237 .spendAlerts()

238 .create(

239 "proj_abc",

240 SpendAlertCreateParams.builder()

241 .currency(SpendAlertCreateParams.Currency.USD)

242 .interval(SpendAlertCreateParams.Interval.MONTH)

243 .notificationChannel(

244 SpendAlertCreateParams.NotificationChannel.builder()

245 .addRecipient("billing@example.com")

246 .subjectPrefix("[OpenAI spend]")

247 .build()

248 )

249 .thresholdAmount(50000L)

250 .build()

251 );

252

253System.out.println(spendAlert.id());

254```

255

256

29## Manage data retention257## Manage data retention

30 258

31Use project data retention controls to override or inherit the organization's retention policy for a project. Set `retention_type` to `organization_default` to inherit the organization setting.259Use project data retention controls to override or inherit the organization's retention policy for a project. Set `retention_type` to `organization_default` to inherit the organization setting.

32 260

261Set project data retention

262

263```javascript

264const dataRetention =

265 await client.admin.organization.projects.dataRetention.update("proj_abc", {

266 retention_type: "organization_default",

267 });

268

269console.log(dataRetention.type);

270```

271

272```python

273data_retention = client.admin.organization.projects.data_retention.update(

274 "proj_abc",

275 retention_type="organization_default",

276)

277

278print(data_retention.type)

279```

280

281```go

282ctx := context.Background()

283

284dataRetention, err := client.Admin.Organization.Projects.DataRetention.Update(

285 ctx,

286 "proj_abc",

287 openai.AdminOrganizationProjectDataRetentionUpdateParams{

288 RetentionType: openai.AdminOrganizationProjectDataRetentionUpdateParamsRetentionTypeOrganizationDefault,

289 },

290)

291if err != nil {

292 panic(err)

293}

294

295println(dataRetention.Type)

296```

297

298```ruby

299data_retention = openai.admin.organization.projects.data_retention.update(

300 "proj_abc",

301 retention_type: :organization_default

302)

303

304puts(data_retention.type)

305```

306

307```java

308import com.openai.models.admin.organization.projects.dataretention.DataRetentionUpdateParams;

309import com.openai.models.admin.organization.projects.dataretention.ProjectDataRetention;

310

311ProjectDataRetention dataRetention = client.admin()

312 .organization()

313 .projects()

314 .dataRetention()

315 .update(

316 "proj_abc",

317 DataRetentionUpdateParams.builder()

318 .retentionType(DataRetentionUpdateParams.RetentionType.ORGANIZATION_DEFAULT)

319 .build()

320 );

321

322System.out.println(dataRetention.type());

323```

324

325

33## Invite a user by email326## Invite a user by email

34 327

35Use the Invites endpoint to send an organization invitation to an email address.328Use the Invites endpoint to send an organization invitation to an email address.

36 329

330Invite a user by email

331

332```javascript

333const invite = await client.admin.organization.invites.create({

334 email: "user@example.com",

335 role: "reader",

336});

337

338console.log(invite.id);

339```

340

341```python

342invite = client.admin.organization.invites.create(

343 email="user@example.com",

344 role="reader",

345)

346

347print(invite.id)

348```

349

350```go

351ctx := context.Background()

352

353invite, err := client.Admin.Organization.Invites.New(ctx, openai.AdminOrganizationInviteNewParams{

354 Email: "user@example.com",

355 Role: openai.AdminOrganizationInviteNewParamsRoleReader,

356})

357if err != nil {

358 panic(err)

359}

360

361println(invite.ID)

362```

363

364```ruby

365invite = openai.admin.organization.invites.create(

366 email: "user@example.com",

367 role: :reader

368)

369

370puts(invite.id)

371```

372

373```java

374import com.openai.models.admin.organization.invites.Invite;

375import com.openai.models.admin.organization.invites.InviteCreateParams;

376

377Invite invite = client.admin().organization().invites().create(

378 InviteCreateParams.builder()

379 .email("user@example.com")

380 .role(InviteCreateParams.Role.READER)

381 .build()

382);

383

384System.out.println(invite.id());

385```

386

387

37## Retrieve audit logs388## Retrieve audit logs

38 389

39Use the Audit Logs endpoint to list recent user actions and configuration changes for the organization.390Use the Audit Logs endpoint to list recent user actions and configuration changes for the organization.

391

392Retrieve audit logs

393

394```javascript

395const auditLogs = await client.admin.organization.auditLogs.list({

396 limit: 10,

397});

398

399console.log(auditLogs.data);

400```

401

402```python

403audit_logs = client.admin.organization.audit_logs.list(limit=10)

404

405for audit_log in audit_logs.data:

406 print(audit_log.id)

407```

408

409```go

410ctx := context.Background()

411

412auditLogs, err := client.Admin.Organization.AuditLogs.List(ctx, openai.AdminOrganizationAuditLogListParams{

413 Limit: openai.Int(10),

414})

415if err != nil {

416 panic(err)

417}

418

419for _, auditLog := range auditLogs.Data {

420 println(auditLog.ID)

421}

422```

423

424```ruby

425audit_logs = openai.admin.organization.audit_logs.list(limit: 10)

426

427audit_logs.data.each do |audit_log|

428 puts(audit_log.id)

429end

430```

431

432```java

433import com.openai.models.admin.organization.auditlogs.AuditLogListParams;

434

435var page = client.admin().organization().auditLogs().list(

436 AuditLogListParams.builder()

437 .limit(10L)

438 .build()

439);

440

441page.data().forEach(auditLog -> System.out.println(auditLog.id()));

442```

guides/amazon-bedrock.md +9 −4

35 35

36- Instantiate `BedrockOpenAI` instead of the default `OpenAI` client. The client36- Instantiate `BedrockOpenAI` instead of the default `OpenAI` client. The client

37 derives the regional Mantle base URL from the AWS Region.37 derives the regional Mantle base URL from the AWS Region.

~~38- For the initial `openai.gpt-5.5` deployment, use `us-east-2`. This resolves to~~38- This guide's examples use `us-east-2`, which resolves to

39 `https://bedrock-mantle.us-east-2.api.aws/openai/v1`.39 `https://bedrock-mantle.us-east-2.api.aws/openai/v1`.

40- Use a Bedrock model ID with the `openai.` prefix, such as40- Use a Bedrock model ID with the `openai.` prefix, such as

41 `openai.gpt-5.5`.41 `openai.gpt-5.5`.

148before rollout.148before rollout.

149 149

150Amazon Bedrock provides Responses API-compatible inference for supported OpenAI150Amazon Bedrock provides Responses API-compatible inference for supported OpenAI

151models in supported AWS Regions in the United States. AWS manages authentication,151models in supported AWS Regions. AWS manages authentication, account access,

152account access, procurement, and billing.152procurement, and billing.

153 153

154AWS Regions are physical deployment locations, which differ from OpenAI data154AWS Regions are physical deployment locations, which differ from OpenAI data

155residency jurisdictions. Teams with residency requirements should evaluate the155residency jurisdictions. Teams with residency requirements should evaluate the

162the initial Amazon Bedrock offering. It excludes transient availability and162the initial Amazon Bedrock offering. It excludes transient availability and

163service status.163service status.

164 164

165The information below represents feature availability as of June 1, 2026.165The information below represents feature availability as of June 8, 2026.

166 Model and Region availability can also change. For the latest information, see166 Model and Region availability can also change. For the latest information, see

167 the [AWS documentation for OpenAI models in Amazon167 the [AWS documentation for OpenAI models in Amazon

168 Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/model-cards-openai.html)168 Bedrock](https://docs.aws.amazon.com/bedrock/latest/userguide/model-cards-openai.html)

182| Context window | Model-dependent | 272,000 tokens for GPT-5.4 and GPT-5.5 |

195support. Hosted tools run through OpenAI-operated service infrastructure and196support. Hosted tools run through OpenAI-operated service infrastructure and

196are unavailable in the initial Amazon Bedrock offering.197are unavailable in the initial Amazon Bedrock offering.

197 198

199GPT-5.4 and GPT-5.5 have a 272,000-token context window on Amazon Bedrock.

200Amazon Bedrock rejects requests that exceed this limit. See the AWS model cards

201for current model-specific limits.
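
Since over-limit requests are rejected, a client-side pre-flight check can fail fast before a request is sent. This is a minimal sketch under stated assumptions: token counts are computed elsewhere, `fits_context_window` is a hypothetical helper, and counting reserved output tokens toward the limit is an assumption you should confirm against the AWS model cards.

```python
BEDROCK_CONTEXT_WINDOW = 272_000  # GPT-5.4 and GPT-5.5 limit stated above


def fits_context_window(
    input_tokens: int,
    reserved_output_tokens: int = 0,
    limit: int = BEDROCK_CONTEXT_WINDOW,
) -> bool:
    """Return True when the request is expected to fit in the context window."""
    if input_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Assumption: output tokens share the same window as input tokens.
    return input_tokens + reserved_output_tokens <= limit


ok = fits_context_window(input_tokens=250_000, reserved_output_tokens=16_000)
too_big = fits_context_window(input_tokens=270_000, reserved_output_tokens=16_000)
```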

202

198Treat feature parity as workload-specific. If your application depends on a203Treat feature parity as workload-specific. If your application depends on a

199specific tool, response mode, or service tier, test that behavior through204specific tool, response mode, or service tier, test that behavior through

200Bedrock before you commit to the deployment path.205Bedrock before you commit to the deployment path.

guides/citation-formatting.md +191 −189

1# Citation Formatting1# Citation Formatting

2 2

~~3export const parseCitationsExample = {~~

~~4 python: [~~

~~5 "import re",~~

~~6 "from typing import Iterable, TypedDict",~~

~~7 "",~~

~~8 'CITATION_START = "\\ue200"',~~

~~9 'CITATION_DELIMITER = "\\ue202"',~~

~~10 'CITATION_STOP = "\\ue201"',~~

~~11 "",~~

~~12 'SOURCE_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")',~~

~~13 'LINE_LOCATOR_RE = re.compile(r"^L\\\\d+(?:-L\\\\d+)?$")',~~

~~14 "",~~

~~15 "",~~

~~16 "class Citation(TypedDict):",~~

~~17 " raw: str",~~

~~18 " family: str",~~

~~19 " source_ids: list[str]",~~

~~20 " locator: str | None",~~

~~21 " start: int",~~

~~22 " end: int",~~

~~23 "",~~

~~24 "",~~

~~25 "def extract_citations(",~~

~~26 " text: str,",~~

~~27 " *,",~~

~~28 ' families: tuple[str, ...] = ("cite",),',~~

~~29 ") -> list[Citation]:",~~

~~30 ' """',~~

~~31 " Extract citations such as:",~~

~~32 "",~~

~~33 " {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_STOP}",~~

~~34 " {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_DELIMITER}L8-L13{CITATION_STOP}",~~

~~35 " {CITATION_START}cite{CITATION_DELIMITER}turn0search0{CITATION_DELIMITER}turn1news2{CITATION_STOP}",~~

~~36 ' """',~~

~~37 " if not families:",~~

~~38 " return []",~~

~~39 "",~~

~~40 ' family_pattern = "|".join(re.escape(family) for family in families)',~~

~~41 " token_re = re.compile(",~~

~~42 ' rf"{re.escape(CITATION_START)}"',~~

~~43 ' rf"(?P<family>{family_pattern})"',~~

~~44 ' rf"{re.escape(CITATION_DELIMITER)}"',~~

~~45 ' rf"(?P<body>.*?)"',~~

~~46 ' rf"{re.escape(CITATION_STOP)}",',~~

~~47 " re.DOTALL,",~~

~~48 " )",~~

~~49 "",~~

~~50 " citations: list[Citation] = []",~~

~~51 "",~~

~~52 " for match in token_re.finditer(text):",~~

~~53 ' parts = [part.strip() for part in match.group("body").split(CITATION_DELIMITER)]',~~

~~54 " parts = [part for part in parts if part]",~~

~~55 "",~~

~~56 " if not parts:",~~

~~57 " continue",~~

~~58 "",~~

~~59 " locator = None",~~

~~60 " if LINE_LOCATOR_RE.fullmatch(parts[-1]):",~~

~~61 " locator = parts.pop()",~~

~~62 "",~~

~~63 " if not parts or any(not SOURCE_ID_RE.fullmatch(part) for part in parts):",~~

~~64 " continue",~~

~~65 "",~~

~~66 " citations.append(",~~

~~67 " {",~~

~~68 ' "raw": match.group(0),',~~

~~69 ' "family": match.group("family"),',~~

~~70 ' "source_ids": parts,',~~

~~71 ' "locator": locator,',~~

~~72 ' "start": match.start(),',~~

~~73 ' "end": match.end(),',~~

~~74 " }",~~

~~75 " )",~~

~~76 "",~~

~~77 " return citations",~~

~~78 "",~~

~~79 "",~~

~~80 "def strip_citations(text: str, citations: Iterable[Citation]) -> str:",~~

~~81 ' """',~~

~~82 " Remove raw citation markers from text using offsets returned by",~~

~~83 " extract_citations().",~~

~~84 ' """',~~

~~85 " clean_text = text",~~

~~86 "",~~

~~87 ' for citation in sorted(citations, key=lambda item: item["start"], reverse=True):',~~

~~88 ' clean_text = clean_text[: citation["start"]] + clean_text[citation["end"] :]',~~

~~89 "",~~

~~90 " return clean_text",~~

~~91 ].join("\n"),~~

~~92 "node.js": [~~

~~93 'const CITATION_START = "\\uE200";',~~

~~94 'const CITATION_DELIMITER = "\\uE202";',~~

~~95 'const CITATION_STOP = "\\uE201";',~~

~~96 "",~~

~~97 "const SOURCE_ID_RE = /^[A-Za-z0-9_-]+$/;",~~

~~98 "const LINE_LOCATOR_RE = /^L\\d+(?:-L\\d+)?$/;",~~

~~99 "",~~

100 "/**",

101 " * @typedef {Object} Citation",

102 " * @property {string} raw",

103 " * @property {string} family",

104 " * @property {string[]} source_ids",

105 " * @property {string | null} locator",

106 " * @property {number} start",

107 " * @property {number} end",

108 " */",

109 "",

110 "/**",

111 " * Extract citations such as:",

112 " *",

113 " * {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_STOP}",

114 " * {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_DELIMITER}L8-L13{CITATION_STOP}",

115 " * {CITATION_START}cite{CITATION_DELIMITER}turn0search0{CITATION_DELIMITER}turn1news2{CITATION_STOP}",

116 " *",

117 " * @param {string} text",

118 " * @param {{ families?: string[] }} [options]",

119 " * @returns {Citation[]}",

120 " */",

121 'function extractCitations(text, { families = ["cite"] } = {}) {',

122 " if (families.length === 0) {",

123 " return [];",

124 " }",

125 "",

126 " const familyPattern = families",

127 ' .map((family) => family.replace(/[.*+?^${}()|[\\]\\\\]/g, "\\\\$&"))',

128 ' .join("|");',

129 "",

130 " const tokenRe = new RegExp(",

131 " `${CITATION_START}(?<family>${familyPattern})${CITATION_DELIMITER}(?<body>[\\\\s\\\\S]*?)${CITATION_STOP}`,",

132 ' "g"',

133 " );",

134 "",

135 " /** @type {Citation[]} */",

136 " const citations = [];",

137 "",

138 " for (const match of text.matchAll(tokenRe)) {",

139 ' const body = match.groups?.body ?? "";',

140 " const parts = body",

141 " .split(CITATION_DELIMITER)",

142 " .map((part) => part.trim())",

143 " .filter(Boolean);",

144 "",

145 " if (parts.length === 0) {",

146 " continue;",

147 " }",

148 "",

149 " let locator = null;",

150 " const lastPart = parts[parts.length - 1];",

151 " if (LINE_LOCATOR_RE.test(lastPart)) {",

152 " locator = parts.pop() ?? null;",

153 " }",

154 "",

155 " if (parts.length === 0 || parts.some((part) => !SOURCE_ID_RE.test(part))) {",

156 " continue;",

157 " }",

158 "",

159 " citations.push({",

160 " raw: match[0],",

161 ' family: match.groups?.family ?? "",',

162 " source_ids: parts,",

163 " locator,",

164 " start: match.index ?? 0,",

165 " end: (match.index ?? 0) + match[0].length,",

166 " });",

167 " }",

168 "",

169 " return citations;",

170 "}",

171 "",

172 "/**",

173 " * @param {string} text",

174 " * @param {Iterable<Citation>} citations",

175 " * @returns {string}",

176 " */",

177 "function stripCitations(text, citations) {",

178 " let cleanText = text;",

179 " const sortedCitations = Array.from(citations).sort(",

180 " (left, right) => right.start - left.start",

181 " );",

182 "",

183 " for (const citation of sortedCitations) {",

184 " cleanText = cleanText.slice(0, citation.start) + cleanText.slice(citation.end);",

185 " }",

186 "",

187 " return cleanText;",

188 "}",

189 ].join("\n"),

190};

~~191~~

192Reliable citations build trust and help readers verify the accuracy of responses. This guide provides practical guidance on how to prepare citable material and instruct the model to format citations effectively, using patterns that are familiar to OpenAI models.3Reliable citations build trust and help readers verify the accuracy of responses. This guide provides practical guidance on how to prepare citable material and instruct the model to format citations effectively, using patterns that are familiar to OpenAI models.

193 4

194## Overview5## Overview

382 193

383Post-processor examples194Post-processor examples

384 195

196Citation parsing helpers

197

198```python

199import re

200from typing import Iterable, TypedDict

201

202CITATION_START = "\ue200"

203CITATION_DELIMITER = "\ue202"

204CITATION_STOP = "\ue201"

205

206SOURCE_ID_RE = re.compile(r"^[A-Za-z0-9_-]+$")

207LINE_LOCATOR_RE = re.compile(r"^L\d+(?:-L\d+)?$")

208

209

210class Citation(TypedDict):

211 raw: str

212 family: str

213 source_ids: list[str]

214 locator: str | None

215 start: int

216 end: int

217

218

219def extract_citations(

220 text: str,

221 *,

222 families: tuple[str, ...] = ("cite",),

223) -> list[Citation]:

224 """

225 Extract citations such as:

226

227 {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_STOP}

228 {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_DELIMITER}L8-L13{CITATION_STOP}

229 {CITATION_START}cite{CITATION_DELIMITER}turn0search0{CITATION_DELIMITER}turn1news2{CITATION_STOP}

230 """

231 if not families:

232 return []

233

234 family_pattern = "|".join(re.escape(family) for family in families)

235 token_re = re.compile(

236 rf"{re.escape(CITATION_START)}"

237 rf"(?P<family>{family_pattern})"

238 rf"{re.escape(CITATION_DELIMITER)}"

239 rf"(?P<body>.*?)"

240 rf"{re.escape(CITATION_STOP)}",

241 re.DOTALL,

242 )

243

244 citations: list[Citation] = []

245

246 for match in token_re.finditer(text):

247 parts = [part.strip() for part in match.group("body").split(CITATION_DELIMITER)]

248 parts = [part for part in parts if part]

249

250 if not parts:

251 continue

252

253 locator = None

254 if LINE_LOCATOR_RE.fullmatch(parts[-1]):

255 locator = parts.pop()

256

257 if not parts or any(not SOURCE_ID_RE.fullmatch(part) for part in parts):

258 continue

259

260 citations.append(

261 {

262 "raw": match.group(0),

263 "family": match.group("family"),

264 "source_ids": parts,

265 "locator": locator,

266 "start": match.start(),

267 "end": match.end(),

268 }

269 )

270

271 return citations

272

273

274def strip_citations(text: str, citations: Iterable[Citation]) -> str:

275 """

276 Remove raw citation markers from text using offsets returned by

277 extract_citations().

278 """

279 clean_text = text

280

281 for citation in sorted(citations, key=lambda item: item["start"], reverse=True):

282 clean_text = clean_text[: citation["start"]] + clean_text[citation["end"] :]

283

284 return clean_text

285```

286

287```javascript

288const CITATION_START = "\uE200";

289const CITATION_DELIMITER = "\uE202";

290const CITATION_STOP = "\uE201";

291

292const SOURCE_ID_RE = /^[A-Za-z0-9_-]+$/;

293const LINE_LOCATOR_RE = /^L\d+(?:-L\d+)?$/;

294

295/**

296 * @typedef {Object} Citation

297 * @property {string} raw

298 * @property {string} family

299 * @property {string[]} source_ids

300 * @property {string | null} locator

301 * @property {number} start

302 * @property {number} end

303 */

304

305/**

306 * Extract citations such as:

307 *

308 * {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_STOP}

309 * {CITATION_START}cite{CITATION_DELIMITER}turn0file0{CITATION_DELIMITER}L8-L13{CITATION_STOP}

310 * {CITATION_START}cite{CITATION_DELIMITER}turn0search0{CITATION_DELIMITER}turn1news2{CITATION_STOP}

311 *

312 * @param {string} text

313 * @param {{ families?: string[] }} [options]

314 * @returns {Citation[]}

315 */

316function extractCitations(text, { families = ["cite"] } = {}) {

317 if (families.length === 0) {

318 return [];

319 }

320

321 const familyPattern = families

322 .map((family) => family.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))

323 .join("|");

324

325 const tokenRe = new RegExp(

326 `${CITATION_START}(?<family>${familyPattern})${CITATION_DELIMITER}(?<body>[\\s\\S]*?)${CITATION_STOP}`,

327 "g"

328 );

329

330 /** @type {Citation[]} */

331 const citations = [];

332

333 for (const match of text.matchAll(tokenRe)) {

334 const body = match.groups?.body ?? "";

335 const parts = body

336 .split(CITATION_DELIMITER)

337 .map((part) => part.trim())

338 .filter(Boolean);

339

340 if (parts.length === 0) {

341 continue;

342 }

343

344 let locator = null;

345 const lastPart = parts[parts.length - 1];

346 if (LINE_LOCATOR_RE.test(lastPart)) {

347 locator = parts.pop() ?? null;

348 }

349

350 if (parts.length === 0 || parts.some((part) => !SOURCE_ID_RE.test(part))) {

351 continue;

352 }

353

354 citations.push({

355 raw: match[0],

356 family: match.groups?.family ?? "",

357 source_ids: parts,

358 locator,

359 start: match.index ?? 0,

360 end: (match.index ?? 0) + match[0].length,

361 });

362 }

363

364 return citations;

365}

366

367/**

368 * @param {string} text

369 * @param {Iterable<Citation>} citations

370 * @returns {string}

371 */

372function stripCitations(text, citations) {

373 let cleanText = text;

374 const sortedCitations = Array.from(citations).sort(

375 (left, right) => right.start - left.start

376 );

377

378 for (const citation of sortedCitations) {

379 cleanText = cleanText.slice(0, citation.start) + cleanText.slice(citation.end);

380 }

381

382 return cleanText;

383}

384```

385

386

385If your source IDs use a different shape, update `SOURCE_ID_RE` to match your387If your source IDs use a different shape, update `SOURCE_ID_RE` to match your

386system.388system.
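
For instance, if your system used namespaced numeric IDs, the pattern could be swapped as in the sketch below. The `kb:10482` shape is purely hypothetical; substitute whatever your own source IDs look like.

```python
import re

# Hypothetical ID shape: a lowercase namespace, a colon, then digits (e.g. "kb:10482").
SOURCE_ID_RE = re.compile(r"^[a-z]+:\d+$")

accepts_namespaced = SOURCE_ID_RE.fullmatch("kb:10482") is not None
accepts_default_shape = SOURCE_ID_RE.fullmatch("turn0file0") is not None
```

Note that tightening the pattern also rejects the default `turn0file0` shape, so citations with IDs that fail the pattern would be skipped by the helpers above.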

387 389

guides/completions.md +9 −9

1# Completions API1# Completions API

2 2

~~3export const snippetLegacyCompletions = {~~3The completions API endpoint received its final update in July 2023 and has a different interface than the new Chat Completions endpoint. Instead of the input being a list of messages, the input is a freeform text string called a `prompt`.

~~4 python: `~~4

5An example legacy Completions API call looks like the following:

7```python

5from openai import OpenAI8from openai import OpenAI

6client = OpenAI()9client = OpenAI()

7 10

9model="gpt-3.5-turbo-instruct",12model="gpt-3.5-turbo-instruct",

10prompt="Write a tagline for an ice cream shop."13prompt="Write a tagline for an ice cream shop."

11)14)

~~12`.trim(),~~15```

~~13 "node.js": `~~16

17```javascript

14const completion = await openai.completions.create({18const completion = await openai.completions.create({

15model: 'gpt-3.5-turbo-instruct',19model: 'gpt-3.5-turbo-instruct',

16prompt: 'Write a tagline for an ice cream shop.'20prompt: 'Write a tagline for an ice cream shop.'

17});21});

~~18`.trim(),~~22```

~~19};~~

21The completions API endpoint received its final update in July 2023 and has a different interface than the new Chat Completions endpoint. Instead of the input being a list of messages, the input is a freeform text string called a `prompt`.

22 23

~~23An example legacy Completions API call looks like the following:~~

24 24

25See the full [API reference documentation](https://platform.openai.com/docs/api-reference/completions) to learn more.25See the full [API reference documentation](https://platform.openai.com/docs/api-reference/completions) to learn more.

26 26

guides/deep-research.md +424 −0


8 8

9To use deep research, use the [Responses API](https://developers.openai.com/api/docs/api-reference/responses) with the model set to `o3-deep-research` or `o4-mini-deep-research`. You must include at least one data source: web search, remote MCP servers, or file search with vector stores. You can also include the [code interpreter](https://developers.openai.com/api/docs/guides/tools-code-interpreter) tool to allow the model to perform complex analysis by writing code.9To use deep research, use the [Responses API](https://developers.openai.com/api/docs/api-reference/responses) with the model set to `o3-deep-research` or `o4-mini-deep-research`. You must include at least one data source: web search, remote MCP servers, or file search with vector stores. You can also include the [code interpreter](https://developers.openai.com/api/docs/guides/tools-code-interpreter) tool to allow the model to perform complex analysis by writing code.

10 10

11Kick off a deep research task

13```python

14from openai import OpenAI

15client = OpenAI(timeout=3600)

17input_text = """

18Research the economic impact of semaglutide on global healthcare systems.

19Do:

20- Include specific figures, trends, statistics, and measurable outcomes.

21- Prioritize reliable, up-to-date sources: peer-reviewed research, health

22 organizations (e.g., WHO, CDC), regulatory agencies, or pharmaceutical

23 earnings reports.

24- Include inline citations and return all source metadata.

26Be analytical, avoid generalities, and ensure that each section supports

27data-backed reasoning that could inform healthcare policy or financial modeling.

28"""

30response = client.responses.create(

31 model="o3-deep-research",

32 input=input_text,

33 background=True,

34 tools=[

35 {"type": "web_search_preview"},

36 {

37 "type": "file_search",

38 "vector_store_ids": [

39 "vs_68870b8868b88191894165101435eef6",

40 "vs_12345abcde6789fghijk101112131415"

41 ]

42 },

43 {

44 "type": "code_interpreter",

45 "container": {"type": "auto"}

46 },

47 ],

48)

51print(response.output_text)

52```

54```javascript

55import OpenAI from "openai";

56const openai = new OpenAI({ timeout: 3600 * 1000 });

59const input = `

60Research the economic impact of semaglutide on global healthcare systems.

61Do:

62- Include specific figures, trends, statistics, and measurable outcomes.

63- Prioritize reliable, up-to-date sources: peer-reviewed research, health

64 organizations (e.g., WHO, CDC), regulatory agencies, or pharmaceutical

65 earnings reports.

66- Include inline citations and return all source metadata.

68Be analytical, avoid generalities, and ensure that each section supports

69data-backed reasoning that could inform healthcare policy or financial modeling.

70`;

72const response = await openai.responses.create({

73 model: "o3-deep-research",

74 input,

75 background: true,

76 tools: [

77 { type: "web_search_preview" },

78 {

79 type: "file_search",

80 vector_store_ids: [

81 "vs_68870b8868b88191894165101435eef6",

82 "vs_12345abcde6789fghijk101112131415"

83 ],

84 },

85 { type: "code_interpreter", container: { type: "auto" } },

86 ],

87});

89console.log(response);

90```

92```bash

93curl https://api.openai.com/v1/responses \

94 -H "Authorization: Bearer $OPENAI_API_KEY" \

95 -H "Content-Type: application/json" \

96 -d '{

97 "model": "o3-deep-research",

98 "input": "Research the economic impact of semaglutide on global healthcare systems. Include specific figures, trends, statistics, and measurable outcomes. Prioritize reliable, up-to-date sources: peer-reviewed research, health organizations (e.g., WHO, CDC), regulatory agencies, or pharmaceutical earnings reports. Include inline citations and return all source metadata. Be analytical, avoid generalities, and ensure that each section supports data-backed reasoning that could inform healthcare policy or financial modeling.",

99 "background": true,

100 "tools": [

101 { "type": "web_search_preview" },

102 {

103 "type": "file_search",

104 "vector_store_ids": [

105 "vs_68870b8868b88191894165101435eef6",

106 "vs_12345abcde6789fghijk101112131415"

107 ]

108 },

109 { "type": "code_interpreter", "container": { "type": "auto" } }

110 ]

111 }'

112```

113

114

11Deep research requests can take a long time, so we recommend running them in [background mode](https://developers.openai.com/api/docs/guides/background). You can configure a [webhook](https://developers.openai.com/api/docs/guides/webhooks) that will be notified when a background request is complete. Background mode retains response data for roughly 10 minutes so that polling works reliably, which makes it incompatible with Zero Data Retention (ZDR) requirements. We continue to accept `background=true` on ZDR credentials for legacy reasons, but you should leave it off if you require ZDR. Modified Abuse Monitoring (MAM) projects can safely use background mode.115Deep research requests can take a long time, so we recommend running them in [background mode](https://developers.openai.com/api/docs/guides/background). You can configure a [webhook](https://developers.openai.com/api/docs/guides/webhooks) that will be notified when a background request is complete. Background mode retains response data for roughly 10 minutes so that polling works reliably, which makes it incompatible with Zero Data Retention (ZDR) requirements. We continue to accept `background=true` on ZDR credentials for legacy reasons, but you should leave it off if you require ZDR. Modified Abuse Monitoring (MAM) projects can safely use background mode.
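If you poll instead of using a webhook, the loop can be sketched roughly as follows. This is a minimal sketch under stated assumptions: `wait_for_response` and its `fetch` callable are hypothetical helpers, and the pending status names (`queued`, `in_progress`) are assumed rather than taken from this section, so check them against the background mode guide.

```python
import time
from typing import Any, Callable

# Assumption: a background response is still running while it has one of these statuses.
PENDING_STATUSES = {"queued", "in_progress"}


def wait_for_response(
    fetch: Callable[[], Any],
    poll_interval: float = 2.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() repeatedly until the returned object leaves a pending status."""
    deadline = time.monotonic() + timeout
    while True:
        response = fetch()
        if response.status not in PENDING_STATUSES:
            return response  # finished, failed, or cancelled; the caller inspects it
        if time.monotonic() >= deadline:
            raise TimeoutError("background response did not finish in time")
        sleep(poll_interval)
```

With the Python SDK, `fetch` would typically be something like `lambda: client.responses.retrieve(response.id)`. Because the response data is retained only for a limited window, read the final output soon after the loop returns.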

12 116

13### Output structure117### Output structure

80 184

81Deep research via the Responses API does not include a clarification or prompt rewriting step. As a developer, you can configure this processing step to rewrite the user prompt or ask a set of clarifying questions, since the model expects fully-formed prompts up front and will not ask for additional context or fill in missing information; it simply starts researching based on the input it receives. These steps are optional: if you have a sufficiently detailed prompt, there's no need to clarify or rewrite it. Below we include examples of asking clarifying questions and rewriting the prompt before passing it to the deep research models.185Deep research via the Responses API does not include a clarification or prompt rewriting step. As a developer, you can configure this processing step to rewrite the user prompt or ask a set of clarifying questions, since the model expects fully-formed prompts up front and will not ask for additional context or fill in missing information; it simply starts researching based on the input it receives. These steps are optional: if you have a sufficiently detailed prompt, there's no need to clarify or rewrite it. Below we include examples of asking clarifying questions and rewriting the prompt before passing it to the deep research models.

82 186

187Asking clarifying questions using a faster, smaller model

188

189```python

190from openai import OpenAI

191client = OpenAI()

192

193instructions = """

194You are talking to a user who is asking for a research task to be conducted. Your job is to gather more information from the user to successfully complete the task.

195

196GUIDELINES:

197- Be concise while gathering all necessary information.

198- Make sure to gather all the information needed to carry out the research task in a concise, well-structured manner.

199- Use bullet points or numbered lists if appropriate for clarity.

200- Don't ask for unnecessary information, or information that the user has already provided.

201

202IMPORTANT: Do NOT conduct any research yourself, just gather information that will be given to a researcher to conduct the research task.

203"""

204

205input_text = "Research surfboards for me. I'm interested in ..."

206

207response = client.responses.create(

208 model="gpt-5.5",

209 input=input_text,

210 instructions=instructions,

211)

212

213print(response.output_text)

214```

215

216```javascript

217import OpenAI from "openai";

218const openai = new OpenAI();

219

220const instructions = `

221You are talking to a user who is asking for a research task to be conducted. Your job is to gather more information from the user to successfully complete the task.

222

223GUIDELINES:

224- Be concise while gathering all necessary information.

225- Make sure to gather all the information needed to carry out the research task in a concise, well-structured manner.

226- Use bullet points or numbered lists if appropriate for clarity.

227- Don't ask for unnecessary information, or information that the user has already provided.

228

229IMPORTANT: Do NOT conduct any research yourself, just gather information that will be given to a researcher to conduct the research task.

230`;

231

232const input = "Research surfboards for me. I'm interested in ...";

233

234const response = await openai.responses.create({

235 model: "gpt-5.5",

236 input,

237 instructions,

238});

239

240console.log(response.output_text);

241```

242

243```bash

244curl https://api.openai.com/v1/responses \

245 -H "Authorization: Bearer $OPENAI_API_KEY" \

246 -H "Content-Type: application/json" \

247 -d '{

248 "model": "gpt-5.5",

249 "input": "Research surfboards for me. I am interested in ...",

250 "instructions": "You are talking to a user who is asking for a research task to be conducted. Your job is to gather more information from the user to successfully complete the task. GUIDELINES: - Be concise while gathering all necessary information. - Make sure to gather all the information needed to carry out the research task in a concise, well-structured manner. - Use bullet points or numbered lists if appropriate for clarity. - Do not ask for unnecessary information, or information that the user has already provided. IMPORTANT: Do NOT conduct any research yourself, just gather information that will be given to a researcher to conduct the research task."

251}'

252```

253

254

255Enrich a user prompt using a faster, smaller model

256

257```python

258from openai import OpenAI

259client = OpenAI()

260

261instructions = """

262You will be given a research task by a user. Your job is to produce a set of

263instructions for a researcher that will complete the task. Do NOT complete the

264task yourself, just provide instructions on how to complete it.

265

266GUIDELINES:

2671. **Maximize Specificity and Detail**

268- Include all known user preferences and explicitly list key attributes or

269 dimensions to consider.

270- It is of utmost importance that all details from the user are included in

271 the instructions.

272

2732. **Fill in Unstated But Necessary Dimensions as Open-Ended**

274- If certain attributes are essential for a meaningful output but the user

275 has not provided them, explicitly state that they are open-ended or default

276 to no specific constraint.

277

2783. **Avoid Unwarranted Assumptions**

279- If the user has not provided a particular detail, do not invent one.

280- Instead, state the lack of specification and guide the researcher to treat

281 it as flexible or accept all possible options.

282

2834. **Use the First Person**

284- Phrase the request from the perspective of the user.

285

2865. **Tables**

287- If you determine that including a table will help illustrate, organize, or

288 enhance the information in the research output, you must explicitly request

289 that the researcher provide them.

290

291Examples:

292- Product Comparison (Consumer): When comparing different smartphone models,

293 request a table listing each model's features, price, and consumer ratings

294 side-by-side.

295- Project Tracking (Work): When outlining project deliverables, create a table

296 showing tasks, deadlines, responsible team members, and status updates.

297- Budget Planning (Consumer): When creating a personal or household budget,

298 request a table detailing income sources, monthly expenses, and savings goals.

299- Competitor Analysis (Work): When evaluating competitor products, request a

300 table with key metrics, such as market share, pricing, and main differentiators.

301

3026. **Headers and Formatting**

303- You should include the expected output format in the prompt.

304- If the user is asking for content that would be best returned in a

305 structured format (e.g. a report, plan, etc.), ask the researcher to format

306 as a report with the appropriate headers and formatting that ensures clarity

307 and structure.

308

3097. **Language**

310- If the user input is in a language other than English, tell the researcher

311 to respond in this language, unless the user query explicitly asks for the

312 response in a different language.

313

3148. **Sources**

315- If specific sources should be prioritized, specify them in the prompt.

316- For product and travel research, prefer linking directly to official or

317 primary websites (e.g., official brand sites, manufacturer pages, or

318 reputable e-commerce platforms like Amazon for user reviews) rather than

319 aggregator sites or SEO-heavy blogs.

320- For academic or scientific queries, prefer linking directly to the original

321 paper or official journal publication rather than survey papers or secondary

322 summaries.

323- If the query is in a specific language, prioritize sources published in that

324 language.

325"""

326

327input_text = "Research surfboards for me. I'm interested in ..."

328

329response = client.responses.create(

330 model="gpt-5.5",

331 input=input_text,

332 instructions=instructions,

333)

334

335print(response.output_text)

336```

337

338```javascript

339import OpenAI from "openai";

340const openai = new OpenAI();

341

342const instructions = `

343You will be given a research task by a user. Your job is to produce a set of

344instructions for a researcher that will complete the task. Do NOT complete the

345task yourself, just provide instructions on how to complete it.

346

347GUIDELINES:

3481. **Maximize Specificity and Detail**

349- Include all known user preferences and explicitly list key attributes or

350 dimensions to consider.

351- It is of utmost importance that all details from the user are included in

352 the instructions.

353

3542. **Fill in Unstated But Necessary Dimensions as Open-Ended**

355- If certain attributes are essential for a meaningful output but the user

356 has not provided them, explicitly state that they are open-ended or default

357 to no specific constraint.

358

3593. **Avoid Unwarranted Assumptions**

360- If the user has not provided a particular detail, do not invent one.

361- Instead, state the lack of specification and guide the researcher to treat

362 it as flexible or accept all possible options.

363

3644. **Use the First Person**

365- Phrase the request from the perspective of the user.

366

3675. **Tables**

368- If you determine that including a table will help illustrate, organize, or

369 enhance the information in the research output, you must explicitly request

370 that the researcher provide them.

371

372Examples:

373- Product Comparison (Consumer): When comparing different smartphone models,

374 request a table listing each model's features, price, and consumer ratings

375 side-by-side.

376- Project Tracking (Work): When outlining project deliverables, create a table

377 showing tasks, deadlines, responsible team members, and status updates.

378- Budget Planning (Consumer): When creating a personal or household budget,

379 request a table detailing income sources, monthly expenses, and savings goals.

380- Competitor Analysis (Work): When evaluating competitor products, request a

381 table with key metrics, such as market share, pricing, and main differentiators.

382

3836. **Headers and Formatting**

384- You should include the expected output format in the prompt.

385- If the user is asking for content that would be best returned in a

386 structured format (e.g. a report, plan, etc.), ask the researcher to format

387 as a report with the appropriate headers and formatting that ensures clarity

388 and structure.

389

3907. **Language**

391- If the user input is in a language other than English, tell the researcher

392 to respond in this language, unless the user query explicitly asks for the

393 response in a different language.

394

3958. **Sources**

396- If specific sources should be prioritized, specify them in the prompt.

397- For product and travel research, prefer linking directly to official or

398 primary websites (e.g., official brand sites, manufacturer pages, or

399 reputable e-commerce platforms like Amazon for user reviews) rather than

400 aggregator sites or SEO-heavy blogs.

401- For academic or scientific queries, prefer linking directly to the original

402 paper or official journal publication rather than survey papers or secondary

403 summaries.

404- If the query is in a specific language, prioritize sources published in that

405 language.

406`;

407

408const input = "Research surfboards for me. I'm interested in ...";

409

410const response = await openai.responses.create({

411 model: "gpt-5.5",

412 input,

413 instructions,

414});

415

416console.log(response.output_text);

417```

418

419```bash

420curl https://api.openai.com/v1/responses \

421 -H "Authorization: Bearer $OPENAI_API_KEY" \

422 -H "Content-Type: application/json" \

423 -d '{

424 "model": "gpt-5.5",

425 "input": "Research surfboards for me. I am interested in ...",

426 "instructions": "You are a helpful assistant that generates a prompt for a deep research task. Examine the user prompt and generate a set of clarifying questions that will help the deep research model generate a better response."

427 }'

428```

429

430

83## Research with your own data431## Research with your own data

84 432

85Deep research models are designed to access both public and private data sources, but they require a specific setup for private or internal data. By default, these models can access information on the public internet via the [web search tool](https://developers.openai.com/api/docs/guides/tools-web-search). To give the model access to your own data, you have several options:433Deep research models are designed to access both public and private data sources, but they require a specific setup for private or internal data. By default, these models can access information on the public internet via the [web search tool](https://developers.openai.com/api/docs/guides/tools-web-search). To give the model access to your own data, you have several options:

114 462

115Lastly, in deep research, the approval mode for MCP tools must have `require_approval` set to `never`. Because both the search and fetch actions are read-only, human-in-the-loop reviews add less value and are currently unsupported.463Lastly, in deep research, the approval mode for MCP tools must have `require_approval` set to `never`. Because both the search and fetch actions are read-only, human-in-the-loop reviews add less value and are currently unsupported.
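Both constraints (at least one data source, and `require_approval` set to `never` on every MCP tool) can be checked locally before sending a request. The helper below is a hypothetical sketch, not part of the SDK; the tool type strings are the ones used in this guide's examples and may need extending if your account uses other tool type names.

```python
# Assumption: these type strings, taken from the examples in this guide,
# are the tools that count as a data source for deep research.
DATA_SOURCE_TYPES = {"web_search_preview", "file_search", "mcp"}


def check_deep_research_tools(tools: list[dict]) -> list[str]:
    """Return a list of problems found in a deep research `tools` list."""
    problems = []
    # Deep research needs at least one data source to search.
    if not any(tool.get("type") in DATA_SOURCE_TYPES for tool in tools):
        problems.append("no data source: add web search, file search, or an MCP server")
    # Approval prompts are unsupported, so every MCP tool must opt out of them.
    for tool in tools:
        if tool.get("type") == "mcp" and tool.get("require_approval") != "never":
            label = tool.get("server_label", "<unlabeled>")
            problems.append(f"MCP server {label!r} must set require_approval to 'never'")
    return problems


tools = [
    {
        "type": "mcp",
        "server_label": "mycompany_mcp_server",
        "server_url": "https://mycompany.com/mcp",
        "require_approval": "never",
    },
]
print(check_deep_research_tools(tools))  # an empty list means no problems were found
```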

116 464

465Remote MCP server configuration for deep research

466

467```bash

468curl https://api.openai.com/v1/responses \

469 -H "Content-Type: application/json" \

470 -H "Authorization: Bearer $OPENAI_API_KEY" \

471 -d '{

472 "model": "o3-deep-research",

473 "tools": [

474 {

475 "type": "mcp",

476 "server_label": "mycompany_mcp_server",

477 "server_url": "https://mycompany.com/mcp",

478 "require_approval": "never"

479 }

480 ],

481 "input": "What similarities are in the notes for our closed/lost Salesforce opportunities?"

482}'

483```

484

485```javascript

486import OpenAI from "openai";

487const client = new OpenAI();

488

489const instructions = "<deep research instructions...>";

490

491const resp = await client.responses.create({

492 model: "o3-deep-research",

493 background: true,

494 reasoning: {

495 summary: "auto",

496 },

497 tools: [

498 {

499 type: "mcp",

500 server_label: "mycompany_mcp_server",

501 server_url: "https://mycompany.com/mcp",

502 require_approval: "never",

503 },

504 ],

505 instructions,

506 input: "What similarities are in the notes for our closed/lost Salesforce opportunities?",

507});

508

509console.log(resp.output_text);

510```

511

512```python

513from openai import OpenAI

514

515client = OpenAI()

516

517instructions = "<deep research instructions...>"

518

519resp = client.responses.create(

520 model="o3-deep-research",

521 background=True,

522 reasoning={

523 "summary": "auto",

524 },

525 tools=[

526 {

527 "type": "mcp",

528 "server_label": "mycompany_mcp_server",

529 "server_url": "https://mycompany.com/mcp",

530 "require_approval": "never",

531 },

532 ],

533 instructions=instructions,

534 input="What similarities are in the notes for our closed/lost Salesforce opportunities?",

535)

536

537print(resp.output_text)

538```

539

540

117[541[

118 542

119<span slot="icon">543<span slot="icon">

guides/deployment-checklist.md +529 −0


37problem, compare options, write a plan, or reason through code. Reserve `xhigh`37problem, compare options, write a plan, or reason through code. Reserve `xhigh`

38for cases where your evals show the extra latency is worth it.38for cases where your evals show the extra latency is worth it.

39 39

40Tune reasoning effort for the task

42```javascript

43import OpenAI from "openai";

45const openai = new OpenAI();

47const prompt = [

48 "Our CI job started failing after a dependency bump.",

49 "",

50 "Error:",

51 "TypeError: Timeout.__init__() got an unexpected keyword argument 'connect'",

52 "",

53 "Identify the likeliest root cause and the smallest safe fix.",

54].join("\n");

56const response = await openai.responses.create({

57 model: "gpt-5.5",

58 reasoning: { effort: "high" },

59 input: prompt,

60});

62console.log(response.output_text);

63```

65```python

66from openai import OpenAI

68client = OpenAI()

70prompt = """

71Our CI job started failing after a dependency bump.

73Error:

74TypeError: Timeout.__init__() got an unexpected keyword argument 'connect'

76Identify the likeliest root cause and the smallest safe fix.

77"""

79response = client.responses.create(

80 model="gpt-5.5",

81 reasoning={"effort": "high"},

82 input=prompt,

83)

85print(response.output_text)

86```

40## Set up `text.verbosity`89## Set up `text.verbosity`

41 90

42`text.verbosity` is the main lever for balancing brevity against completeness.91`text.verbosity` is the main lever for balancing brevity against completeness.

48For coding, `medium` and `high` tend to produce longer, more organized output97For coding, `medium` and `high` tend to produce longer, more organized output

49with clearer structure. `low` keeps the answer tighter and more minimal.98with clearer structure. `low` keeps the answer tighter and more minimal.

50 99

100Set lower verbosity for compact output

101

102```javascript

103import OpenAI from "openai";

104

105const openai = new OpenAI();

106

107const incident = [

108 "Summarize this incident for the next on-call engineer.",

109 "- checkout latency spiked from 220 ms to 4.8 s",

110 "- only us-east-1 was affected",

111 "- rollback is complete",

112 "- likely trigger: cache stampede after deploy",

113].join("\n");

114

115const response = await openai.responses.create({

116 model: "gpt-5.5",

117 text: { verbosity: "low" },

118 input: incident,

119});

120

121console.log(response.output_text);

122```

123

124```python

125from openai import OpenAI

126

127client = OpenAI()

128

129response = client.responses.create(

130 model="gpt-5.5",

131 text={"verbosity": "low"},

132 input="""

133 Summarize this incident for the next on-call engineer.

134 - checkout latency spiked from 220 ms to 4.8 s

135 - only us-east-1 was affected

136 - rollback is complete

137 - likely trigger: cache stampede after deploy

138 """,

139)

140

141print(response.output_text)

142```

143

144

51## Set up the assistant `phase` parameter145## Set up the assistant `phase` parameter

52 146

53`phase` is a label on assistant messages in the conversation history. It147`phase` is a label on assistant messages in the conversation history. It

99instructions inside the deferred tool definitions. Avoid making one giant193instructions inside the deferred tool definitions. Avoid making one giant

100namespace for everything.194namespace for everything.

101 195

196Use hosted tool search with deferred tools

197

198```javascript

199import OpenAI from "openai";

200

201const openai = new OpenAI();

202

203const billingLookupInvoice = {

204 type: "function",

205 name: "billing.lookup_invoice",

206 description: "Look up invoice state, taxes, credits, and payment attempts.",

207 parameters: {

208 type: "object",

209 properties: {

210 invoice_id: { type: "string" },

211 },

212 required: ["invoice_id"],

213 additionalProperties: false,

214 },

215 strict: true,

216 defer_loading: true,

217};

218

219const crmGetAccount = {

220 type: "function",

221 name: "crm.get_account",

222 description: "Fetch account owner, plan, health, and payment history.",

223 parameters: {

224 type: "object",

225 properties: {

226 account_id: { type: "string" },

227 },

228 required: ["account_id"],

229 additionalProperties: false,

230 },

231 strict: true,

232 defer_loading: true,

233};

234

235const response = await openai.responses.create({

236 model: "gpt-5.5",

237 input:

238 "Find the right billing tool and explain why invoice INV-1043 still " +

239 "shows overdue after a payment yesterday.",

240 tools: [

241 { type: "tool_search" },

242 billingLookupInvoice,

243 crmGetAccount,

244 ],

245});

246

247console.log(response.output_text);

248```

249

250```python

251from openai import OpenAI

252

253client = OpenAI()

254

255billing_lookup_invoice = {

256 "type": "function",

257 "name": "billing.lookup_invoice",

258 "description": "Look up invoice state, taxes, credits, and payment attempts.",

259 "parameters": {

260 "type": "object",

261 "properties": {

262 "invoice_id": {"type": "string"},

263 },

264 "required": ["invoice_id"],

265 "additionalProperties": False,

266 },

267 "strict": True,

268 "defer_loading": True,

269}

270

271crm_get_account = {

272 "type": "function",

273 "name": "crm.get_account",

274 "description": "Fetch account owner, plan, health, and payment history.",

275 "parameters": {

276 "type": "object",

277 "properties": {

278 "account_id": {"type": "string"},

279 },

280 "required": ["account_id"],

281 "additionalProperties": False,

282 },

283 "strict": True,

284 "defer_loading": True,

285}

286

287response = client.responses.create(

288 model="gpt-5.5",

289 input=(

290 "Find the right billing tool and explain why invoice INV-1043 still "

291 "shows overdue after a payment yesterday."

292 ),

293 tools=[

294 {"type": "tool_search"},

295 billing_lookup_invoice,

296 crm_get_account,

297 ],

298)

299

300print(response.output_text)

301```

302

303

102## Leverage built-in tools304## Leverage built-in tools

103 305

104[Built-in tools](https://developers.openai.com/api/docs/guides/tools) are the API's native capabilities.306[Built-in tools](https://developers.openai.com/api/docs/guides/tools) are the API's native capabilities.

157state that helps the model continue. Pass it forward as-is, then add the next359state that helps the model continue. Pass it forward as-is, then add the next

158user message.360user message.

159 361

362Continue from compacted response state

363

364```javascript

365import OpenAI from "openai";

366

367const openai = new OpenAI();

368

369// Full window collected from a long debugging session:

370// user messages, assistant outputs, tool calls, and tool outputs.

371const longWindow = sessionItems;

372

373const compacted = await openai.responses.compact({

374 model: "gpt-5.5",

375 input: longWindow,

376});

377

378const nextResponse = await openai.responses.create({

379 model: "gpt-5.5",

380 store: false,

381 input: [

382 ...compacted.output, // Use compact output as-is.

383 {

384 type: "message",

385 role: "user",

386 content:

387 "We found the bad cache invalidation path. Write the fix plan " +

388 "and the verification checklist.",

389 },

390 ],

391});

392

393console.log(nextResponse.output_text);

394```

395

396```python

397from openai import OpenAI

398

399client = OpenAI()

400

401# Full window collected from a long debugging session:

402# user messages, assistant outputs, tool calls, and tool outputs.

403long_window = session_items

404

405compacted = client.responses.compact(

406 model="gpt-5.5",

407 input=long_window,

408)

409

410next_response = client.responses.create(

411 model="gpt-5.5",

412 store=False,

413 input=[

414 *compacted.output, # Use compact output as-is.

415 {

416 "type": "message",

417 "role": "user",

418 "content": (

419 "We found the bad cache invalidation path. Write the fix plan "

420 "and the verification checklist."

421 ),

422 },

423 ],

424)

425

426print(next_response.output_text)

427```

428

429

160## Use `prompt_cache_key`430## Use `prompt_cache_key`

161 431

162[Prompt caching](https://developers.openai.com/api/docs/guides/prompt-caching) automatically reduces latency432[Prompt caching](https://developers.openai.com/api/docs/guides/prompt-caching) automatically reduces latency

172combination exceeds about 15 requests per minute, requests may overflow to442combination exceeds about 15 requests per minute, requests may overflow to

173additional machines and reduce cache effectiveness.443additional machines and reduce cache effectiveness.
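One possible way to stay under that rate is to build the key from stable identifiers and, for very busy tenants, spread traffic across a small number of buckets. The sketch below is an assumption about how you might structure keys, not a documented recommendation: `prompt_cache_key`, its parameters, and the `-s<bucket>` suffix are all hypothetical.

```python
import hashlib


def prompt_cache_key(tenant: str, agent: str, user_id: str = "", shards: int = 1) -> str:
    """Build a stable cache key, optionally spread across `shards` buckets."""
    base = f"tenant-{tenant}-{agent}"
    if shards <= 1:
        return base
    # A cryptographic hash is stable across processes (unlike the built-in hash()),
    # so the same user always lands in the same bucket.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % shards
    return f"{base}-s{bucket}"


print(prompt_cache_key("acme", "support-agent"))
print(prompt_cache_key("acme", "support-agent", user_id="user-42", shards=4))
```

Sharding trades some cache reuse for a lower request rate per key, so it is only worth considering when a single key would see sustained traffic well above the threshold; measure cache hit rates before and after.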

174 444

445Route related requests to the same prompt cache

446

447```javascript

448import OpenAI from "openai";

449

450const openai = new OpenAI();

451

452const instructions = [

453 "You are the support agent for Acme.",

454 "Follow the Acme support policy and escalation rubric.",

455 "Use the same tone, safety rules, and tool plan for each ticket.",

456].join("\n");

457

458const response = await openai.responses.create({

459 model: "gpt-5.5",

460 prompt_cache_key: "tenant-acme-support-agent",

461 instructions,

462 input: "Summarize the current escalation for the on-call lead.",

463});

464

465console.log(response.output_text);

466```

467

468```python

469from openai import OpenAI

470

471client = OpenAI()

472

473instructions = """

474You are the support agent for Acme.

475Follow the Acme support policy and escalation rubric.

476Use the same tone, safety rules, and tool plan for each ticket.

477"""

478

479response = client.responses.create(

480 model="gpt-5.5",

481 prompt_cache_key="tenant-acme-support-agent",

482 instructions=instructions,

483 input="Summarize the current escalation for the on-call lead.",

484)

485

486print(response.output_text)

487```

488

489

## Use `reasoning.encrypted_content`

Always round-trip reasoning items. This helps the model by allowing it to work

[...]

keeps the reasoning item exactly as returned and sends it back during the next
turn, so the model can use it to continue the workflow.

Pass encrypted reasoning between stateless turns

```javascript
import OpenAI from "openai";

const openai = new OpenAI();

const first = await openai.responses.create({
  model: "gpt-5.5",
  store: false,
  reasoning: { effort: "medium" },
  include: ["reasoning.encrypted_content"],
  input: "Investigate why invoice INV-1043 has mismatched tax totals.",
});

const second = await openai.responses.create({
  model: "gpt-5.5",
  store: false,
  reasoning: { effort: "medium" },
  include: ["reasoning.encrypted_content"],
  input: [
    ...first.output,
    {
      role: "user",
      content: "Now write the customer-facing explanation in plain English.",
    },
  ],
});

console.log(second.output_text);
```

```python
from openai import OpenAI

client = OpenAI()

first = client.responses.create(
    model="gpt-5.5",
    store=False,
    reasoning={"effort": "medium"},
    include=["reasoning.encrypted_content"],
    input="Investigate why invoice INV-1043 has mismatched tax totals.",
)

second = client.responses.create(
    model="gpt-5.5",
    store=False,
    reasoning={"effort": "medium"},
    include=["reasoning.encrypted_content"],
    input=[
        *first.output,
        {
            "role": "user",
            "content": "Now write the customer-facing explanation in plain English.",
        },
    ],
)

print(second.output_text)
```

566
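For longer stateless workflows, the same pattern can be wrapped in a small helper that appends every returned item, unmodified, to a running history. This is a sketch under stated assumptions: `StatelessConversation` and its `send` method are hypothetical names, and `create` is any callable with the shape of `client.responses.create`. Unlike the two-turn sample above, it also keeps earlier user messages in the history.

```python
from typing import Any, Callable


class StatelessConversation:
    """Hypothetical helper that round-trips all output items between turns."""

    def __init__(self, create: Callable[..., Any], **request_defaults: Any) -> None:
        self._create = create
        self._defaults = request_defaults
        self.history: list[Any] = []

    def send(self, user_text: str) -> Any:
        """Send one user message together with everything seen so far."""
        turn_input = [*self.history, {"role": "user", "content": user_text}]
        response = self._create(input=turn_input, **self._defaults)
        # Store returned items exactly as received, including reasoning items
        # that carry encrypted content, so the next turn can send them back.
        self.history = [*turn_input, *response.output]
        return response
```

With the SDK this might be constructed as `StatelessConversation(client.responses.create, model="gpt-5.5", store=False, reasoning={"effort": "medium"}, include=["reasoning.encrypted_content"])`, followed by one `send(...)` call per turn.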

## Use `background=True`

Use [`background=True`](https://developers.openai.com/api/docs/guides/background) for requests that may take

[...]

`background=True` **requires `store=True`**.

Run and poll a background response

```javascript
import OpenAI from "openai";

const openai = new OpenAI();

// logBundleFileId: ID of a previously uploaded file (assumed to be defined elsewhere).
let job = await openai.responses.create({
  model: "gpt-5.5",
  background: true,
  store: true,
  input: "Analyze this large log bundle and cluster the primary failure modes.",
  tools: [
    {
      type: "code_interpreter",
      container: {
        type: "auto",
        file_ids: [logBundleFileId],
      },
    },
  ],
});

// Poll every two seconds until the job leaves the pending states.
while (["queued", "in_progress"].includes(job.status)) {
  await new Promise((resolve) => setTimeout(resolve, 2000));
  job = await openai.responses.retrieve(job.id);
}

console.log(job.output_text);
```

```python
from openai import OpenAI
import time

client = OpenAI()

# log_bundle_file_id: ID of a previously uploaded file (assumed to be defined elsewhere).
job = client.responses.create(
    model="gpt-5.5",
    background=True,
    store=True,
    input="Analyze this large log bundle and cluster the primary failure modes.",
    tools=[
        {
            "type": "code_interpreter",
            "container": {
                "type": "auto",
                "file_ids": [log_bundle_file_id],
            },
        }
    ],
)

# Poll every two seconds until the job leaves the pending states.
while job.status in {"queued", "in_progress"}:
    time.sleep(2)
    job = client.responses.retrieve(job.id)

print(job.output_text)
```

637
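The polling loops above run until the job leaves the `queued` and `in_progress` states, however long that takes. A bounded variant is sketched below. `wait_for_response`, its default interval and timeout, and the injectable `sleep` and `clock` parameters are assumptions for illustration; `retrieve` stands for any callable shaped like `client.responses.retrieve`.

```python
import time
from typing import Any, Callable

PENDING_STATUSES = frozenset({"queued", "in_progress"})


def wait_for_response(
    retrieve: Callable[[str], Any],
    response_id: str,
    *,
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Any:
    """Poll until the response leaves a pending status or the timeout passes."""
    deadline = clock() + timeout
    job = retrieve(response_id)
    while job.status in PENDING_STATUSES:
        if clock() >= deadline:
            raise TimeoutError(
                f"response {response_id} still {job.status} after {timeout}s"
            )
        sleep(interval)
        job = retrieve(response_id)
    return job
```

A call such as `job = wait_for_response(client.responses.retrieve, job.id)` would stand in for the `while` loop in the Python sample. The caller should still inspect `job.status` before reading `job.output_text`, because the helper returns on any non-pending status, not only on success.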

You can combine it with `stream=True` for progress events, but the first event
may take longer than a normal request.

[...]

The default Python sample uses `websocket-client` (`pip install
websocket-client`). The JavaScript sample uses `ws` (`npm install ws`).

Start a Responses API WebSocket session

```javascript
import OpenAI from "openai";
import WebSocket from "ws";

const openai = new OpenAI();

const ws = new WebSocket("wss://api.openai.com/v1/responses", {
  headers: {
    Authorization: "Bearer " + openai.apiKey,
  },
});

// testLogTool and codeSearchTool are tool definitions assumed to be defined elsewhere.
ws.on("open", () => {
  ws.send(
    JSON.stringify({
      type: "response.create",
      model: "gpt-5.5",
      store: false,
      input: [
        {
          type: "message",
          role: "user",
          content: [
            {
              type: "input_text",
              text:
                "Find the flaky test in this run, call the tools you need, " +
                "and keep going until you can explain the root cause.",
            },
          ],
        },
      ],
      tools: [testLogTool, codeSearchTool],
    })
  );
});

// Log the type of each event as it arrives.
ws.on("message", (data) => {
  const event = JSON.parse(data.toString());
  console.log(event.type);
});
```

```python
from openai import OpenAI
from websocket import create_connection
import json

client = OpenAI()

ws = create_connection(
    "wss://api.openai.com/v1/responses",
    header=[f"Authorization: Bearer {client.api_key}"],
)

# Same request body you would send to client.responses.create(...).
# test_log_tool and code_search_tool are tool definitions assumed to be defined elsewhere.
ws.send(
    json.dumps(
        {
            "type": "response.create",
            "model": "gpt-5.5",
            "store": False,
            "input": [
                {
                    "type": "message",
                    "role": "user",
                    "content": [
                        {
                            "type": "input_text",
                            "text": (
                                "Find the flaky test in this run, call the tools "
                                "you need, and keep going until you can explain "
                                "the root cause."
                            ),
                        }
                    ],
                }
            ],
            "tools": [test_log_tool, code_search_tool],
        }
    )
)

# Read and print the type of the first event only.
first_event = json.loads(ws.recv())
print(first_event["type"])
```

768
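The Python sample above reads only the first event. A loop that keeps reading until the response finishes could look like the sketch below. It assumes that the WebSocket session emits the same terminal event types as Responses API streaming (`response.completed`, `response.failed`, `response.incomplete`, `error`), which this guide does not confirm; `iter_events` is a hypothetical helper, and `recv` stands for any callable shaped like `ws.recv`.

```python
import json
from typing import Any, Callable, Iterator

# Assumed terminal event types, borrowed from Responses API streaming.
TERMINAL_EVENT_TYPES = frozenset(
    {"response.completed", "response.failed", "response.incomplete", "error"}
)


def iter_events(recv: Callable[[], str]) -> Iterator[dict[str, Any]]:
    """Yield decoded events from `recv` until a terminal event arrives."""
    while True:
        event = json.loads(recv())
        yield event
        if event.get("type") in TERMINAL_EVENT_TYPES:
            return
```

With the connection from the sample, usage might be `for event in iter_events(ws.recv): print(event["type"])`.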

## Final takeaway

Responses API is the foundation for building smarter, more capable OpenAI

guides/moderation.md +52 −0


Set `moderation.model` when you create a response:

Generate a response with moderation scores

```python
from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-5.5",
    input=[
        {
            "role": "user",
            "content": (
                "A user asks for instructions to make a harmful weapon. "
                "Draft a brief refusal and offer a safer alternative."
            ),
        }
    ],
    moderation={"model": "omni-moderation-latest"},
)

input_moderation = response.moderation.input
output_moderation = response.moderation.output

print(input_moderation.flagged)
print(output_moderation.flagged)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.responses.create({
  model: "gpt-5.5",
  input: [
    {
      role: "user",
      content:
        "A user asks for instructions to make a harmful weapon. Draft a brief refusal and offer a safer alternative.",
    },
  ],
  moderation: { model: "omni-moderation-latest" },
});

const inputModeration = response.moderation.input;
const outputModeration = response.moderation.output;

console.log(inputModeration.flagged);
console.log(outputModeration.flagged);
```

The Responses API returns an input `moderation_result` object at `response.moderation.input` and an output `moderation_result` object at `response.moderation.output`.

28 80
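A caller would typically branch on these two results before showing the output. The sketch below is illustrative only: `flagged_sides` is a hypothetical helper, and it relies solely on the `response.moderation.input.flagged` and `response.moderation.output.flagged` attributes used in the sample above.

```python
from typing import Any


def flagged_sides(response: Any) -> list[str]:
    """Return which sides of the exchange ("input", "output") were flagged."""
    moderation = response.moderation
    sides = []
    if moderation.input.flagged:
        sides.append("input")
    if moderation.output.flagged:
        sides.append("output")
    return sides
```

An empty list means neither side was flagged; an application might withhold `response.output_text` whenever `"output"` appears in the result.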

guides/rate-limits.md +0 −100


# Rate limits

~~3export const snippetTenacityLibrary = {~~

~~4 python: `~~

~~5from openai import OpenAI~~

~~6client = OpenAI()~~

~~8from tenacity import (~~

~~9retry,~~

~~10stop_after_attempt,~~

~~11wait_random_exponential,~~

~~12) # for exponential backoff~~

~~14@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))~~

~~15def completion_with_backoff(**kwargs):~~

~~16return client.completions.create(**kwargs)~~

~~18completion_with_backoff(model="gpt-3.5-turbo-instruct", prompt="Once upon a time,")~~

~~19`.trim(),~~

~~20};~~

~~22export const snippetBackoffLibrary = {~~

~~23 python: `~~

~~24import backoff~~

~~25import openai~~

~~26from openai import OpenAI~~

~~27client = OpenAI()~~

~~29@backoff.on_exception(backoff.expo, openai.RateLimitError)~~

~~30def completions_with_backoff(**kwargs):~~

~~31return client.completions.create(**kwargs)~~

~~33completions_with_backoff(model="gpt-3.5-turbo-instruct", prompt="Once upon a time,")~~

~~34`.trim(),~~

~~35};~~

~~37export const snippetManualBackoffImplementation = {~~

~~38 python: `~~

~~39# imports~~

~~40import random~~

~~41import time~~

~~43import openai~~

~~44from openai import OpenAI~~

~~45client = OpenAI()~~

~~47# define a retry decorator~~

~~49def retry_with_exponential_backoff(~~

~~50func,~~

~~51initial_delay: float = 1,~~

~~52exponential_base: float = 2,~~

~~53jitter: bool = True,~~

~~54max_retries: int = 10,~~

~~55errors: tuple = (openai.RateLimitError,),~~

~~56):~~

~~57"""Retry a function with exponential backoff."""~~

~~59 def wrapper(*args, **kwargs):~~

~~60 # Initialize variables~~

~~61 num_retries = 0~~

~~62 delay = initial_delay~~

~~64 # Loop until a successful response or max_retries is hit or an exception is raised~~

~~65 while True:~~

~~66 try:~~

~~67 return func(*args, **kwargs)~~

~~69 # Retry on specific errors~~

~~70 except errors as e:~~

~~71 # Increment retries~~

~~72 num_retries += 1~~

~~74 # Check if max retries has been reached~~

~~75 if num_retries > max_retries:~~

~~76 raise Exception(~~

~~77 f"Maximum number of retries ({max_retries}) exceeded."~~

~~78 )~~

~~80 # Increment the delay~~

~~81 delay *= exponential_base * (1 + jitter * random.random())~~

~~83 # Sleep for the delay~~

~~84 time.sleep(delay)~~

~~86 # Raise exceptions for any errors not specified~~

~~87 except Exception as e:~~

~~88 raise e~~

~~90 return wrapper~~

~~92@retry_with_exponential_backoff~~

~~93def completions_with_backoff(**kwargs):~~

~~94return client.completions.create(**kwargs)~~

~~95`.trim(),~~

~~96};~~

Rate limits are restrictions that our API imposes on the number of times a user or client can
access our services within a specified period of time.

[...]

The fine-tuning rate limits for your organization can be [found in the dashboard as well](https://platform.openai.com/settings/organization/limits), and can also be retrieved via API:

163```bash

164curl https://api.openai.com/v1/fine_tuning/model_limits \

165 -H "Authorization: Bearer $OPENAI_API_KEY"

166```

~~167~~

## Error mitigation

### What are some steps I can take to mitigate this?

guides/realtime-server-controls.md +1 −1


35 `wss://api.openai.com/v1/realtime?call_id=rtc_xxxxx`, as shown below:37 `wss://api.openai.com/v1/realtime?call_id=rtc_xxxxx`, as shown below:

36 38

37```javascript39```javascript

38 40import WebSocket from "ws";

39const callId = "rtc_u1_9c6574da8b8a41a18da9308f4ad974ce";41const callId = "rtc_u1_9c6574da8b8a41a18da9308f4ad974ce";

40 42

41// Connect to a WebSocket for the in-progress call43// Connect to a WebSocket for the in-progress call

guides/realtime-sip.md +1 −1


126for more information.127for more information.

127 128

128```javascript129```javascript

~~129~~ 130import WebSocket from "ws";

130 131

131const callId = "rtc_u1_9c6574da8b8a41a18da9308f4ad974ce";132const callId = "rtc_u1_9c6574da8b8a41a18da9308f4ad974ce";

132const ws = new WebSocket(`wss://api.openai.com/v1/realtime?call_id=${callId}`, {133const ws = new WebSocket(`wss://api.openai.com/v1/realtime?call_id=${callId}`, {

guides/reasoning.md +59 −0


### Round-trip assistant phase values

Round-trip assistant phase values

```javascript
import OpenAI from "openai";
const client = new OpenAI();

const response = await client.responses.create({
  model: "gpt-5.5",
  input: [
    {
      role: "assistant",
      phase: "commentary",
      content:
        "I’ll inspect the logs and then summarize root cause and remediation.",
    },
    {
      role: "assistant",
      phase: "final_answer",
      content: "Root cause: cache invalidation race.",
    },
    {
      role: "user",
      content: "Great—now give me a rollout-safe fix plan.",
    },
  ],
});

console.log(response.output_text);
```

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5.5",
    input=[
        {
            "role": "assistant",
            "phase": "commentary",
            "content": "I’ll inspect the logs and then summarize root cause and remediation.",
        },
        {
            "role": "assistant",
            "phase": "final_answer",
            "content": "Root cause: cache invalidation race.",
        },
        {
            "role": "user",
            "content": "Great—now give me a rollout-safe fix plan.",
        },
    ],
)

print(response.output_text)
```

## Advice on prompting

There are some differences to consider when prompting a reasoning model. Reasoning-capable GPT-5 models usually work best when you give them a clear goal, strong constraints, and an explicit output contract without prescribing every intermediate step.

guides/secure-mcp-tunnels.md +15 −2


- A tunnel manager with Tunnels **Read** + **Manage** if you need to create or edit tunnel metadata.
- An MCP server that `tunnel-client` can reach over stdio or HTTP from inside your network.

## Associate tunnels with the right organizations and workspaces

A tunnel can be associated with one or more Platform organizations or ChatGPT workspaces. Use these associations to define every OpenAI context that should be allowed to find or use the tunnel.

- Include the Platform organization that owns or manages the tunnel.
- Include the ChatGPT workspace that should list the tunnel in connector settings.
- Include another Platform organization when Codex, the Responses API, or another supported product will call the private MCP server from that organization.
- Use the same `tunnel_id` for `tunnel-client`; adding organizations or workspaces does not create a second tunnel or change the private MCP server endpoint.

For personal accounts, use the personal Platform organization that belongs to that account. A tunnel associated only with a personal account won't automatically appear in an enterprise ChatGPT workspace.

If the Platform organization and ChatGPT workspace are already linked, you can add the missing organization or workspace in [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels). If your enterprise setup can't be verified automatically, such as when the Platform organization has no corresponding ChatGPT workspace, contact your OpenAI account team to request a reviewed manual association override for the enterprise account mapping that should use the tunnel.

## Network requirements

`tunnel-client` does not need inbound internet access. It needs outbound HTTPS to OpenAI and local reachability to the private MCP server:

[...]

Open [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors), create a custom connector, and choose **Tunnel** under **Connection**. Select an available tunnel when ChatGPT lists it, or paste a valid `tunnel_id` if you already have one.

~~If the tunnel does not appear in ChatGPT, verify that the tunnel is associated with the target workspace and that the connector operator has Tunnels **Read** + **Use**.~~

If the tunnel does not appear in ChatGPT, verify that the tunnel is associated with the target ChatGPT workspace, not only with a Platform organization, and that the connector operator has Tunnels **Read** + **Use**.

## Security and networking

[...]

## Troubleshooting

~~- **Tunnel not visible in ChatGPT:** Check the tunnel workspace scope and the connector operator's Tunnels **Use** permission.~~

- **Tunnel not visible in ChatGPT:** Check that the tunnel includes the target ChatGPT workspace, not only a Platform organization; then check the connector operator's Tunnels **Use** permission. If the workspace cannot be linked automatically for an enterprise account, contact your OpenAI account team for a reviewed manual association override.
- **Connector discovery or tool calls fail:** Confirm that `tunnel-client run ...` is still running, then re-run `tunnel-client doctor --profile <name> --explain`.
- **You can inspect a tunnel but cannot edit it:** The operator likely has Tunnels **Read** but not Tunnels **Manage**.
- `tunnel-client` exposes `/healthz`, `/readyz`, `/metrics`, and a local admin UI at `/ui`.

guides/token-counting.md +334 −0


## Count tokens in basic messages

Simple text input

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.input_tokens.count(
    model="gpt-5.5",
    input="Tell me a joke."
)
print(response.input_tokens)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.responses.input_tokens.count({
  model: "gpt-5.5",
  input: "Tell me a joke.",
});

console.log(response.input_tokens);
```

```bash
curl https://api.openai.com/v1/responses/input_tokens \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": "Tell me a joke."
  }'
```

```cli
openai responses:input-tokens count \
  --model gpt-5.5 \
  --input "Tell me a joke." \
  --raw-output \
  --transform input_tokens
```

## Count tokens in conversations

Multi-turn conversation

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.input_tokens.count(
    model="gpt-5.5",
    input=[
        {"role": "user", "content": "What is 2 + 2?"},
        {"role": "assistant", "content": "2 + 2 equals 4."},
        {"role": "user", "content": "What about 3 + 3?"},
    ],
)
print(response.input_tokens)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.responses.input_tokens.count({
  model: "gpt-5.5",
  input: [
    { role: "user", content: "What is 2 + 2?" },
    { role: "assistant", content: "2 + 2 equals 4." },
    { role: "user", content: "What about 3 + 3?" },
  ],
});

console.log(response.input_tokens);
```

```bash
curl https://api.openai.com/v1/responses/input_tokens \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": [
      {"role": "user", "content": "What is 2 + 2?"},
      {"role": "assistant", "content": "2 + 2 equals 4."},
      {"role": "user", "content": "What about 3 + 3?"}
    ]
  }'
```

```cli
openai responses:input-tokens count \
  --raw-output \
  --transform input_tokens <<'YAML'
model: gpt-5.5
input:
  - role: user
    content: What is 2 + 2?
  - role: assistant
    content: 2 + 2 equals 4.
  - role: user
    content: What about 3 + 3?
YAML
```

## Count tokens with instructions

Input with system instructions

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.input_tokens.count(
    model="gpt-5.5",
    instructions="You are a helpful assistant that explains concepts simply.",
    input="Explain quantum computing in one sentence.",
)
print(response.input_tokens)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.responses.input_tokens.count({
  model: "gpt-5.5",
  instructions:
    "You are a helpful assistant that explains concepts simply.",
  input: "Explain quantum computing in one sentence.",
});

console.log(response.input_tokens);
```

```bash
curl https://api.openai.com/v1/responses/input_tokens \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "instructions": "You are a helpful assistant that explains concepts simply.",
    "input": "Explain quantum computing in one sentence."
  }'
```

```cli
openai responses:input-tokens count \
  --raw-output \
  --transform input_tokens <<'YAML'
model: gpt-5.5
instructions: You are a helpful assistant that explains concepts simply.
input: Explain quantum computing in one sentence.
YAML
```

## Count tokens with images

Images consume tokens based on size and detail level. The token counting API returns the exact count—no guesswork.

Input with an image

```python
from openai import OpenAI

client = OpenAI()

# Use file_id from uploaded file, or image_url for a URL
response = client.responses.input_tokens.count(
    model="gpt-5.5",
    input=[
        {
            "role": "user",
            "content": [
                {"type": "input_image", "image_url": "https://example.com/chart.png"},
                {"type": "input_text", "text": "Summarize this chart."},
            ],
        }
    ],
)
print(response.input_tokens)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.responses.input_tokens.count({
  model: "gpt-5.5",
  input: [
    {
      role: "user",
      content: [
        {
          type: "input_image",
          image_url: "https://example.com/chart.png",
        },
        { type: "input_text", text: "Summarize this chart." },
      ],
    },
  ],
});

console.log(response.input_tokens);
```

```bash
curl https://api.openai.com/v1/responses/input_tokens \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": [{
      "role": "user",
      "content": [
        {"type": "input_image", "image_url": "https://example.com/chart.png"},
        {"type": "input_text", "text": "Summarize this chart."}
      ]
    }]
  }'
```

```cli
openai responses:input-tokens count \
  --raw-output \
  --transform input_tokens <<'YAML'
model: gpt-5.5
input:
  - role: user
    content:
      - type: input_image
        image_url: https://example.com/chart.png
      - type: input_text
        text: Summarize this chart.
YAML
```

You can use `file_id` (from the [Files API](https://developers.openai.com/api/docs/api-reference/files)) or `image_url` (a URL or base64 data URL). See [images and vision](https://developers.openai.com/api/docs/guides/images-vision) for details.

## Count tokens with tools

Tool definitions (function schemas, MCP servers, etc.) add tokens to the context. Count them together with your input:

Input with function tools

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.input_tokens.count(
    model="gpt-5.5",
    tools=[
        {
            "type": "function",
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}},
                "required": ["location"],
            },
        }
    ],
    input="What is the weather in San Francisco?",
)
print(response.input_tokens)
```

```javascript
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.responses.input_tokens.count({
  model: "gpt-5.5",
  tools: [
    {
      type: "function",
      name: "get_weather",
      description: "Get the current weather in a location",
      parameters: {
        type: "object",
        properties: { location: { type: "string" } },
        required: ["location"],
      },
    },
  ],
  input: "What is the weather in San Francisco?",
});

console.log(response.input_tokens);
```

```bash
curl https://api.openai.com/v1/responses/input_tokens \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "tools": [{
      "type": "function",
      "name": "get_weather",
      "description": "Get the current weather in a location",
      "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"]
      }
    }],
    "input": "What is the weather in San Francisco?"
  }'
```

```cli
openai responses:input-tokens count \
  --raw-output \
  --transform input_tokens <<'YAML'
model: gpt-5.5
tools:
  - type: function
    name: get_weather
    description: Get the current weather in a location
    parameters:
      type: object
      properties:
        location:
          type: string
      required:
        - location
input: What is the weather in San Francisco?
YAML
```

371
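Because instructions, images, and tool definitions all add to the total, one use of the count is a pre-flight check before sending the real request. The following is a sketch only: `check_budget` and any budget value passed to it are assumptions, and `count` stands for any callable shaped like `client.responses.input_tokens.count`.

```python
from typing import Any, Callable


def check_budget(
    count: Callable[..., Any], budget: int, **request: Any
) -> tuple[bool, int]:
    """Count input tokens for `request` and compare them with `budget`.

    Returns (fits, used), where `used` is the reported input token count.
    """
    used = count(**request).input_tokens
    return used <= budget, used
```

With the SDK, a call might look like `fits, used = check_budget(client.responses.input_tokens.count, 4000, model="gpt-5.5", tools=tools, input="What is the weather in San Francisco?")`, where `4000` is an arbitrary example budget and `tools` is the list from the sample above.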

## Count tokens with files

[File inputs](https://developers.openai.com/api/docs/guides/pdf-files)—currently PDFs—are supported. Pass `file_id`, `file_url`, or `file_data` as you would for `responses.create`. The token count reflects the model’s full processed input.

guides/tools.md +113 −0


64 </div>64 </div>

65 <div data-content-switcher-pane data-value="tool-search" hidden>65 <div data-content-switcher-pane data-value="tool-search" hidden>

66 <div class="hidden">Tool search</div>66 <div class="hidden">Tool search</div>

Load deferred tools at runtime

```python
from openai import OpenAI

client = OpenAI()

crm_namespace = {
    "type": "namespace",
    "name": "crm",
    "description": "CRM tools for customer lookup and order management.",
    "tools": [
        {
            "type": "function",
            "name": "get_customer_profile",
            "description": "Fetch a customer profile by customer ID.",
            "parameters": {
                "type": "object",
                "properties": {
                    "customer_id": {"type": "string"},
                },
                "required": ["customer_id"],
                "additionalProperties": False,
            },
        },
        {
            "type": "function",
            "name": "list_open_orders",
            "description": "List open orders for a customer ID.",
            # highlight-start:subtle
            "defer_loading": True,
            # highlight-end
            "parameters": {
                "type": "object",
                "properties": {
                    "customer_id": {"type": "string"},
                },
                "required": ["customer_id"],
                "additionalProperties": False,
            },
        },
    ],
}

response = client.responses.create(
    model="gpt-5.5",
    input="List open orders for customer CUST-12345.",
    tools=[
        crm_namespace,
        # highlight-start:subtle
        {"type": "tool_search"},
        # highlight-end
    ],
    parallel_tool_calls=False,
)

print(response.output)
```

128

129const client = new OpenAI();

130

131const crmNamespace = {

132 type: "namespace",

133 name: "crm",

134 description: "CRM tools for customer lookup and order management.",

135 tools: [

136 {

137 type: "function",

138 name: "get_customer_profile",

139 description: "Fetch a customer profile by customer ID.",

140 parameters: {

141 type: "object",

142 properties: {

143 customer_id: { type: "string" },

144 },

145 required: ["customer_id"],

146 additionalProperties: false,

147 },

148 },

149 {

150 type: "function",

151 name: "list_open_orders",

152 description: "List open orders for a customer ID.",

153 // highlight-start:subtle

154 defer_loading: true,

155 // highlight-end

156 parameters: {

157 type: "object",

158 properties: {

159 customer_id: { type: "string" },

160 },

161 required: ["customer_id"],

162 additionalProperties: false,

163 },

164 },

165 ],

166};

167

168const response = await client.responses.create({

169 model: "gpt-5.5",

170 input: "List open orders for customer CUST-12345.",

171 // highlight-start:subtle

172 tools: [crmNamespace, { type: "tool_search" }],

173 // highlight-end

174 parallel_tool_calls: false,

175});

176

177console.log(response.output);

178```

179

67 </div>180 </div>

68 <div data-content-switcher-pane data-value="function-calling" hidden>181 <div data-content-switcher-pane data-value="function-calling" hidden>

69 <div class="hidden">Function calling</div>182 <div class="hidden">Function calling</div>

guides/tools-apply-patch.md +113 −0


194 194

195Alternatively, you can use the [Agents SDK](https://developers.openai.com/api/docs/guides/tools#usage-in-the-agents-sdk) to use the apply patch tool. You'll still have to implement the harness that handles the actual file operations but you can use the `applyDiff` function to handle the diff processing.195Alternatively, you can use the [Agents SDK](https://developers.openai.com/api/docs/guides/tools#usage-in-the-agents-sdk) to use the apply patch tool. You'll still have to implement the harness that handles the actual file operations but you can use the `applyDiff` function to handle the diff processing.

196 196

197Use the apply patch tool with the Agents SDK

198

199```typescript

200import { applyDiff, Agent, run, applyPatchTool, Editor } from "@openai/agents";

201

202class WorkspaceEditor implements Editor {

203 async createFile(operation) {

204 // convert the diff to the file content

205 const content = applyDiff("", operation.diff, "create");

206 // write the file content to the file system

207 return { status: "completed", output: `Created ${operation.path}` };

208 }

209

210 async updateFile(operation) {

211 // read the file content from the file system

212 const current = "";

213 // convert the diff to the new file content

214 const newContent = applyDiff(current, operation.diff);

215 // write the updated file content to the file system

216 return { status: "completed", output: `Updated ${operation.path}` };

217 }

218

219 async deleteFile(operation) {

220 // delete the file from the file system

221 return { status: "completed", output: `Deleted ${operation.path}` };

222 }

223}

224

225const editor = new WorkspaceEditor();

226

227const agent = new Agent({

228 name: "Patch Assistant",

229 model: "gpt-5.5",

230 instructions: "You can edit files inside the /tmp directory using the apply_patch tool.",

231 tools: [

232 applyPatchTool({

233 editor,

234 // could also be a function for you to determine if approval is needed

235 needsApproval: true,

236 onApproval: async (_ctx, _approvalItem) => {

237 // create your own approval logic

238 return { approve: true };

239 },

240 }),

241 ],

242});

243

244const result = await run(

245 agent,

246 "Create tasks.md with a shopping checklist of 5 entries."

247);

248

249console.log(`\nFinal response:\n${result.finalOutput}`);

250```

251

252```python

253from agents import Agent, ApplyPatchTool, Runner, apply_diff

254

255

256class WorkspaceEditor:

257 async def create_file(self, operation):

258 # convert the diff to the file content

259 content = apply_diff("", operation.diff, create=True)

260 # write the file content to the file system

261 return {"status": "completed", "output": f"Created {operation.path}"}

262

263 async def update_file(self, operation):

264 # read the file content from the file system

265 current = ""

266 # convert the diff to the new file content

267 new_content = apply_diff(current, operation.diff)

268 # write the updated file content to the file system

269 return {"status": "completed", "output": f"Updated {operation.path}"}

270

271 async def delete_file(self, operation):

272 # delete the file from the file system

273 return {"status": "completed", "output": f"Deleted {operation.path}"}

274

275

276editor = WorkspaceEditor()

277

278agent = Agent(

279 name="Patch Assistant",

280 model="gpt-5.5",

281 instructions="You can edit files inside the /tmp directory using the apply_patch tool.",

282 tools=[

283 ApplyPatchTool(

284 editor=editor,

285 # could also be a function for you to determine if approval is needed

286 needs_approval=True,

287 # Implement your own approval logic

288 on_approval=lambda _ctx, _approval_item: {"approve": True},

289 ),

290 ],

291)

292

293

294async def main():

295 result = await Runner.run(

296 agent,

297 input="Create tasks.md with a shopping checklist of 5 entries.",

298 )

299

300 print(f"\nFinal response:\n{result.final_output}")

301

302

303if __name__ == "__main__":

304 import asyncio

305

306 asyncio.run(main())

307```
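The harness is responsible for the real file operations, and the agent instructions above restrict edits to a single directory. A minimal sketch of a path guard such a harness might call before reading or writing, assuming a hypothetical helper name `resolve_in_workspace` (not part of the Agents SDK) and Python 3.9+ for `Path.is_relative_to`:

```python
from pathlib import Path


def resolve_in_workspace(root: str, relative_path: str) -> Path:
    """Return the absolute path for relative_path, refusing paths outside root.

    Hypothetical helper for illustration; not part of the Agents SDK.
    """
    root_path = Path(root).resolve()
    # Resolving collapses ".." segments and follows symlinks before the check.
    candidate = (root_path / relative_path).resolve()
    if not candidate.is_relative_to(root_path):
        raise ValueError(f"{relative_path!r} escapes the workspace root")
    return candidate
```

An editor's `create_file`, `update_file`, and `delete_file` methods could pass `operation.path` through a check like this before touching the file system.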

308

309

You can find full working examples on GitHub.

198 311

199<a312<a

guides/tools-shell.md +426 −0

Details

21 21

22Shell tool with container_auto22Shell tool with container_auto

23 23

24```bash

25curl -L 'https://api.openai.com/v1/responses' \

26 -H "Content-Type: application/json" \

27 -H "Authorization: Bearer $OPENAI_API_KEY" \

28 -d '{

29 "model": "gpt-5.5",

30 "tools": [

31 { "type": "shell", "environment": { "type": "container_auto" } }

32 ],

33 "input": [

34 {

35 "type": "message",

36 "role": "user",

37 "content": [

38 { "type": "input_text", "text": "Execute: ls -lah /mnt/data && python --version && node --version" }

39 ]

40 }

41 ],

42 "tool_choice": "auto"

43 }'

44```

24```javascript46```javascript

25import OpenAI from "openai";47import OpenAI from "openai";

26 48

100 122

101Create a reusable container123Create a reusable container

102 124

125```bash

126curl -L 'https://api.openai.com/v1/containers' \

127 -H "Content-Type: application/json" \

128 -H "Authorization: Bearer $OPENAI_API_KEY" \

129 -d '{

130 "name": "analysis-container",

131 "memory_limit": "1g",

132 "expires_after": { "anchor": "last_active_at", "minutes": 20 }

133 }'

134```

135

103```javascript136```javascript

104import OpenAI from "openai";137import OpenAI from "openai";

105 138

133 166

134Use shell with container_reference167Use shell with container_reference

135 168

169```bash

170curl -L 'https://api.openai.com/v1/responses' \

171 -H "Content-Type: application/json" \

172 -H "Authorization: Bearer $OPENAI_API_KEY" \

173 -d '{

174 "model": "gpt-5.5",

175 "tools": [

176 {

177 "type": "shell",

178 "environment": {

179 "type": "container_reference",

180 "container_id": "cntr_08f3d96c87a585390069118b594f7481a088b16cda7d9415fe"

181 }

182 }

183 ],

184 "input": "List files in the container and show disk usage."

185 }'

186```

187

136```javascript188```javascript

137import OpenAI from "openai";189import OpenAI from "openai";

138 190

186 238

187Create a container with attached skills239Create a container with attached skills

188 240

241```bash

242curl -L 'https://api.openai.com/v1/containers' \

243 -H "Content-Type: application/json" \

244 -H "Authorization: Bearer $OPENAI_API_KEY" \

245 -d '{

246 "name": "skill-container",

247 "skills": [

248 { "type": "skill_reference", "skill_id": "skill_4db6f1a2c9e73508b41f9da06e2c7b5f" },

249 { "type": "skill_reference", "skill_id": "openai-spreadsheets", "version": "latest" }

250 ]

251 }'

252```

253

189```javascript254```javascript

190import OpenAI from "openai";255import OpenAI from "openai";

191 256

230 295

231Shell tool with network allowlist296Shell tool with network allowlist

232 297

298```bash

299curl -L 'https://api.openai.com/v1/responses' \

300 -H "Authorization: Bearer $OPENAI_API_KEY" \

301 -H "Content-Type: application/json" \

302 -d '{

303 "model": "gpt-5.5",

304 "tool_choice": "required",

305 "tools": [

306 {

307 "type": "shell",

308 "environment": {

309 "type": "container_auto",

310 "network_policy": {

311 "type": "allowlist",

312 "allowed_domains": ["pypi.org", "files.pythonhosted.org", "github.com"]

313 }

314 }

315 }

316 ],

317 "input": [

318 {

319 "role": "user",

320 "content": "In the container, pip install httpx beautifulsoup4, fetch release pages, and write /mnt/data/release_digest.md."

321 }

322 ]

323 }'

324```

325

233```javascript326```javascript

234import OpenAI from "openai";327import OpenAI from "openai";

235 328

323 416

324Use inline files and inline skills417Use inline files and inline skills

325 418

419```bash

420INLINE_ZIP=$(base64 -i ./csv_insights.zip)

421REPORT_CSV=$(base64 -i ./report.csv)

422

423CONTAINER_ID=$(

424 curl -sL 'https://api.openai.com/v1/containers' \

425 -H "Content-Type: application/json" \

426 -H "Authorization: Bearer $OPENAI_API_KEY" \

427 -d '{

428 "name": "inline-skill-container",

429 "skills": [

430 {

431 "type": "inline",

432 "name": "csv-insights",

433 "description": "Summarize CSV files and produce a markdown report.",

434 "source": {

435 "type": "base64",

436 "media_type": "application/zip",

437 "data": "'"$INLINE_ZIP"'"

438 }

439 }

440 ]

441 }' | jq -r '.id'

442)

443

444curl -L 'https://api.openai.com/v1/responses' \

445 -H "Content-Type: application/json" \

446 -H "Authorization: Bearer $OPENAI_API_KEY" \

447 -d '{

448 "model": "gpt-5.5",

449 "tools": [

450 {

451 "type": "shell",

452 "environment": {

453 "type": "container_reference",

454 "container_id": "'"$CONTAINER_ID"'"

455 }

456 }

457 ],

458 "input": [

459 {

460 "role": "user",

461 "content": [

462 {

463 "type": "input_file",

464 "filename": "report.csv",

465 "file_data": "data:text/csv;base64,'"${REPORT_CSV}"'"

466 },

467 {

468 "type": "input_text",

469 "text": "Use the csv-insights skill to summarize report.csv."

470 }

471 ]

472 }

473 ]

474 }'

475```

476

326```javascript477```javascript

327import fs from "fs";478import fs from "fs";

328import OpenAI from "openai";479import OpenAI from "openai";

449 600

450Delete a container601Delete a container

451 602

603```bash

604curl -L -X DELETE 'https://api.openai.com/v1/containers/container_id' \

605 -H "Authorization: Bearer $OPENAI_API_KEY"

606```

607

452```javascript608```javascript

453import OpenAI from "openai";609import OpenAI from "openai";

454 610

490 646

491Shell tool with domain_secrets647Shell tool with domain_secrets

492 648

649```bash

650curl -L 'https://api.openai.com/v1/responses' \

651 -H "Authorization: Bearer $OPENAI_API_KEY" \

652 -H "Content-Type: application/json" \

653 -d '{

654 "model": "gpt-5.5",

655 "input": [

656 {

657 "role": "user",

658 "content": "Use curl to call https://httpbin.org/headers with header Authorization: Bearer $API_KEY. Tell me what you see in the final text response."

659 }

660 ],

661 "tool_choice": "required",

662 "tools": [

663 {

664 "type": "shell",

665 "environment": {

666 "type": "container_auto",

667 "network_policy": {

668 "type": "allowlist",

669 "allowed_domains": ["httpbin.org"],

670 "domain_secrets": [

671 {

672 "domain": "httpbin.org",

673 "name": "API_KEY",

674 "value": "debug-secret-123"

675 }

676 ]

677 }

678 }

679 }

680 ]

681 }'

682```

683

493```javascript684```javascript

494import OpenAI from "openai";685import OpenAI from "openai";

495 686

574 765

575Continue a shell workflow766Continue a shell workflow

576 767

768```bash

769curl -L 'https://api.openai.com/v1/responses' \

770 -H "Content-Type: application/json" \

771 -H "Authorization: Bearer $OPENAI_API_KEY" \

772 -d '{

773 "model": "gpt-5.5",

774 "previous_response_id": "resp_2a8e5c9174d63b0f18a4c572de9f64a1b3c76d508e12f9ab47",

775 "tools": [

776 {

777 "type": "shell",

778 "environment": {

779 "type": "container_reference",

780 "container_id": "cntr_f19c2b51e4a06793d82d54a7be0fc9154d3361ab28ce7f6041"

781 }

782 }

783 ],

784 "input": "Read /mnt/data/top5.csv and report the top candidate."

785 }'

786```

787

577```javascript788```javascript

578import OpenAI from "openai";789import OpenAI from "openai";

579 790

650 861

Use this mode when you need full control over the execution environment, filesystem access, or existing internal tooling.

652 863

864Local shell request

865

866```bash

867curl -L 'https://api.openai.com/v1/responses' \

868 -H "Content-Type: application/json" \

869 -H "Authorization: Bearer $OPENAI_API_KEY" \

870 -d '{

871 "model": "gpt-5.5",

872 "instructions": "The local bash shell environment is on Mac.",

873 "input": "find me the largest pdf file in ~/Documents",

874 "tools": [{ "type": "shell", "environment": { "type": "local" } }]

875 }'

876```

877

878```python

879from openai import OpenAI

880

881client = OpenAI()

882

883response = client.responses.create(

884 model="gpt-5.5",

885 instructions="The local bash shell environment is on Mac.",

886 input="find me the largest pdf file in ~/Documents",

887 tools=[{"type": "shell", "environment": {"type": "local"}}],

888)

889

890print(response)

891```

892

893```javascript

894import OpenAI from "openai";

895

896const client = new OpenAI();

897

898const response = await client.responses.create({

899 model: "gpt-5.5",

900 instructions: "The local bash shell environment is on Mac.",

901 input: "find me the largest pdf file in ~/Documents",

902 tools: [{ type: "shell", environment: { type: "local" } }],

903});

904

905console.log(response);

906```

907

908

When you receive `shell_call` output items:

- Execute requested commands in your runtime.
- Capture `stdout`, `stderr`, and outcome.
- Return results as `shell_call_output` in the next request.

658 914

915Local shell executor example

916

```python
import subprocess
from dataclasses import dataclass


@dataclass
class CmdResult:
    stdout: str
    stderr: str
    exit_code: int | None
    timed_out: bool


class ShellExecutor:
    def __init__(self, default_timeout: float = 60):
        self.default_timeout = default_timeout

    def run(self, cmd: str, timeout: float | None = None) -> CmdResult:
        # Fall back to the default when no per-call timeout is given.
        t = timeout or self.default_timeout
        p = subprocess.Popen(
            cmd,
            shell=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
        )
        try:
            out, err = p.communicate(timeout=t)
            return CmdResult(out, err, p.returncode, False)
        except subprocess.TimeoutExpired:
            # Kill the process, then collect whatever output it produced.
            p.kill()
            out, err = p.communicate()
            return CmdResult(out, err, p.returncode, True)
```

946

```javascript
import { exec as execCallback } from "node:child_process";
import { promisify } from "node:util";

// Node has no "node:child_process/promises" module, so wrap exec with promisify.
const exec = promisify(execCallback);

class ShellExecutor {
  constructor(defaultTimeoutMs = 60_000) {
    this.defaultTimeoutMs = defaultTimeoutMs;
  }

  async run(cmd, timeoutMs) {
    const timeout = timeoutMs ?? this.defaultTimeoutMs;

    try {
      const { stdout, stderr } = await exec(cmd, { timeout });
      return { stdout, stderr, exitCode: 0, timedOut: false };
    } catch (error) {
      // exec kills the child with SIGTERM by default when the timeout elapses.
      const timedOut = Boolean(error?.killed) && error?.signal === "SIGTERM";
      const exitCode = timedOut ? null : error?.code ?? null;
      return {
        stdout: error?.stdout ?? "",
        stderr: error?.stderr ?? String(error),
        exitCode,
        timedOut,
      };
    }
  }
}
```
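The last step in the list above, returning results as `shell_call_output`, can be sketched as a small conversion function. This is a sketch under stated assumptions: the `stdout`, `stderr`, `outcome`, `exit_code`, and `max_output_length` field names mirror the Agents SDK example in this guide, while the `{"type": "timeout"}` outcome and the helper name `to_shell_call_output` are assumptions to check against the API reference.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CmdResult:
    stdout: str
    stderr: str
    exit_code: Optional[int]
    timed_out: bool


def to_shell_call_output(call_id, results, max_output_length=None):
    """Build a shell_call_output item from executed command results.

    Field names are assumptions based on the examples in this guide.
    """
    output = []
    for result in results:
        if result.timed_out:
            # Assumed outcome shape for commands that hit the timeout.
            outcome = {"type": "timeout"}
        else:
            outcome = {"type": "exit", "exit_code": result.exit_code}
        output.append(
            {"stdout": result.stdout, "stderr": result.stderr, "outcome": outcome}
        )

    item = {"type": "shell_call_output", "call_id": call_id, "output": output}
    if max_output_length is not None:
        item["max_output_length"] = max_output_length
    return item
```

The returned dictionary would be appended to the `input` of the next request, with `call_id` copied from the `shell_call` item it answers.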

974

975

659Example shell_call_output payload976Example shell_call_output payload

660 977

661```json978```json

690 1007

If you are using the [Agents SDK](https://developers.openai.com/api/docs/guides/tools#usage-in-the-agents-sdk), you can pass your own shell executor implementation to the shell tool helper.

692 1009

1010Use local shell with Agents SDK

1011

1012```typescript

1013import {

1014 Agent,

1015 run,

1016 withTrace,

1017 Shell,

1018 ShellAction,

1019 ShellResult,

1020 shellTool,

1021} from "@openai/agents";

1022

1023class LocalShell implements Shell {

1024 async run(action: ShellAction): Promise<ShellResult> {

1025 return {

1026 output: [

1027 {

1028 stdout: "Shell is not available. Needs to be implemented first.",

1029 stderr: "",

1030 outcome: {

1031 type: "exit",

1032 exitCode: 1,

1033 },

1034 },

1035 ],

1036 maxOutputLength: action.maxOutputLength,

1037 };

1038 }

1039}

1040

1041const shell = new LocalShell();

1042

1043const agent = new Agent({

1044 name: "Shell Assistant",

1045 model: "gpt-5.5",

1046 instructions:

1047 "You can execute shell commands to inspect the repository. Keep responses concise and include command output when helpful.",

1048 tools: [

1049 shellTool({

1050 shell,

1051 needsApproval: true,

1052 onApproval: async (_ctx, _approvalItem) => {

1053 return { approve: true };

1054 },

1055 }),

1056 ],

1057});

1058

1059await withTrace("shell-tool-example", async () => {

1060 const result = await run(agent, "Show the Node.js version.");

1061 console.log(`\nFinal response:\n${result.finalOutput}`);

1062});

1063```

1064

1065```python

1066from agents import (

1067 Agent,

1068 Runner,

1069 ShellCallOutcome,

1070 ShellCommandOutput,

1071 ShellCommandRequest,

1072 ShellResult,

1073 ShellTool,

1074)

1075

1076

1077class LocalShell:

1078 async def __call__(self, request: ShellCommandRequest) -> ShellResult:

1079 action = request.data.action

1080 return ShellResult(

1081 output=[

1082 ShellCommandOutput(

1083 command="(not executed)",

1084 stdout="Shell is not available. Needs to be implemented first.",

1085 stderr="",

1086 outcome=ShellCallOutcome(type="exit", exit_code=1),

1087 )

1088 ],

1089 max_output_length=action.max_output_length,

1090 )

1091

1092

1093shell_tool = ShellTool(

1094 executor=LocalShell(),

1095 needs_approval=True,

1096 on_approval=lambda _ctx, _approval_item: {"approve": True},

1097)

1098

1099agent = Agent(

1100 name="Shell Assistant",

1101 model="gpt-5.5",

1102 instructions="You can execute shell commands to inspect the repository. Keep responses concise and include command output when helpful.",

1103 tools=[shell_tool],

1104)

1105

1106

1107async def main():

1108 result = await Runner.run(agent, input="Show the Node.js version.")

1109 print(f"\nFinal response:\n{result.final_output}")

1110

1111

1112if __name__ == "__main__":

1113 import asyncio

1114

1115 asyncio.run(main())

1116```

1117

1118

You can find working examples in the SDK repositories.

694 1120

695<a href="https://github.com/openai/openai-agents-js/blob/main/examples/tools/shell.ts" target="_blank" rel="noreferrer">1121<a href="https://github.com/openai/openai-agents-js/blob/main/examples/tools/shell.ts" target="_blank" rel="noreferrer">

guides/tools-skills.md +47 −0

Details

61 61

62Use skills in hosted shell62Use skills in hosted shell

63 63

64```bash

65curl -L 'https://api.openai.com/v1/responses' \

66 -H "Content-Type: application/json" \

67 -H "Authorization: Bearer $OPENAI_API_KEY" \

68 -d '{

69 "model": "gpt-5.5",

70 "tools": [

71 {

72 "type": "shell",

73 "environment": {

74 "type": "container_auto",

75 "skills": [

76 { "type": "skill_reference", "skill_id": "<skill_id>" },

77 { "type": "skill_reference", "skill_id": "<skill_id>", "version": 2 }

78 ]

79 }

80 }

81 ],

82 "input": "Use the skills to add 144 and 377, then compute triangle area with base 9 height 13."

83 }'

84```

64```javascript86```javascript

65import OpenAI from "openai";87import OpenAI from "openai";

66 88

127 149

128Use skills in local shell mode150Use skills in local shell mode

129 151

152```bash

153curl -L 'https://api.openai.com/v1/responses' \

154 -H "Content-Type: application/json" \

155 -H "Authorization: Bearer $OPENAI_API_KEY" \

156 -d '{

157 "model": "gpt-5.5",

158 "tools": [

159 {

160 "type": "shell",

161 "environment": {

162 "type": "local",

163 "skills": [

164 {

165 "name": "csv-insights",

166 "description": "Summarize CSV files and produce a markdown report.",

167 "path": "<path-to-skill-folder>"

168 }

169 ]

170 }

171 }

172 ],

173 "input": "Use the csv-insights skill and run locally to summarize today'\''s CSV reports in this repo."

174 }'

175```

176

130```javascript177```javascript

131import OpenAI from "openai";178import OpenAI from "openai";

132 179

guides/tools-tool-search.md +376 −0

Details

73 73

Hosted tool search is the simplest path when you already know the full inventory of [functions](https://developers.openai.com/api/docs/guides/function-calling#defining-functions), [namespaces](https://developers.openai.com/api/docs/guides/function-calling#defining-namespaces), or [MCP servers](https://developers.openai.com/api/docs/guides/tools-connectors-mcp) you want the model to search. You declare them up front, add `{"type": "tool_search"}`, and let the API decide what to load.

75 75

76Configure hosted tool search

78```python

79from openai import OpenAI

81client = OpenAI()

83crm_namespace = {

84 "type": "namespace",

85 "name": "crm",

86 "description": "CRM tools for customer lookup and order management.",

87 "tools": [

88 {

89 "type": "function",

90 "name": "get_customer_profile",

91 "description": "Fetch a customer profile by customer ID.",

92 "parameters": {

93 "type": "object",

94 "properties": {

95 "customer_id": {"type": "string"},

96 },

97 "required": ["customer_id"],

98 "additionalProperties": False,

99 },

100 },

101 {

102 "type": "function",

103 "name": "list_open_orders",

104 "description": "List open orders for a customer ID.",

105 # highlight-start:subtle

106 "defer_loading": True,

107 # highlight-end

108 "parameters": {

109 "type": "object",

110 "properties": {

111 "customer_id": {"type": "string"},

112 },

113 "required": ["customer_id"],

114 "additionalProperties": False,

115 },

116 },

117 ],

118}

119

120response = client.responses.create(

121 model="gpt-5.5",

122 input="List open orders for customer CUST-12345.",

123 tools=[

124 crm_namespace,

125 # highlight-start:subtle

126 {"type": "tool_search"},

127 # highlight-end

128 ],

129 parallel_tool_calls=False,

130)

131

132print(response.output)

133```

134

135```javascript

136import OpenAI from "openai";

137

138const client = new OpenAI();

139

140const crmNamespace = {

141 type: "namespace",

142 name: "crm",

143 description: "CRM tools for customer lookup and order management.",

144 tools: [

145 {

146 type: "function",

147 name: "get_customer_profile",

148 description: "Fetch a customer profile by customer ID.",

149 parameters: {

150 type: "object",

151 properties: {

152 customer_id: { type: "string" },

153 },

154 required: ["customer_id"],

155 additionalProperties: false,

156 },

157 },

158 {

159 type: "function",

160 name: "list_open_orders",

161 description: "List open orders for a customer ID.",

162 // highlight-start:subtle

163 defer_loading: true,

164 // highlight-end

165 parameters: {

166 type: "object",

167 properties: {

168 customer_id: { type: "string" },

169 },

170 required: ["customer_id"],

171 additionalProperties: false,

172 },

173 },

174 ],

175};

176

177const response = await client.responses.create({

178 model: "gpt-5.5",

179 input: "List open orders for customer CUST-12345.",

180 // highlight-start:subtle

181 tools: [crmNamespace, { type: "tool_search" }],

182 // highlight-end

183 parallel_tool_calls: false,

184});

185

186console.log(response.output);

187```

188

189

If the model decides it needs a deferred tool, the response includes two additional output items before the eventual function call:

- `tool_search_call`, which records the hosted search step.
- `tool_search_output`, which contains the loaded subset that becomes callable.

80 194

195Hosted tool search response

196

197```json

198[

199 {

200 // highlight-start:subtle

201 "type": "tool_search_call",

202 // highlight-end

203 "execution": "server",

204 "call_id": null,

205 "status": "completed",

206 "arguments": {

207 "paths": ["crm"]

208 }

209 },

210 {

211 // highlight-start:subtle

212 "type": "tool_search_output",

213 // highlight-end

214 "execution": "server",

215 "call_id": null,

216 "status": "completed",

217 "tools": [

218 {

219 "type": "namespace",

220 "name": "crm",

221 "description": "CRM tools for customer lookup and order management.",

222 "tools": [

223 {

224 "type": "function",

225 "name": "list_open_orders",

226 "description": "List open orders for a customer ID.",

227 "defer_loading": true,

228 "parameters": {

229 "type": "object",

230 "properties": {

231 "customer_id": { "type": "string" }

232 },

233 "required": ["customer_id"],

234 "additionalProperties": false

235 }

236 }

237 ]

238 }

239 ]

240 },

241 {

242 "type": "function_call",

243 "name": "list_open_orders",

244 "namespace": "crm",

245 "call_id": "call_abc123",

246 "arguments": "{\"customer_id\":\"CUST-12345\"}"

247 }

248]

249```

250

251

In hosted mode, `execution` is set to `server` and `call_id` is set to `null`.

82 253

For more complex tasks, the model can also load multiple namespaces or MCP servers in the same `tool_search_call`. For example, if it needs functions from different namespaces to complete one task, it may choose to search and load those surfaces together before making the subsequent function calls.
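When inspecting a hosted tool search response, it can help to list which tools were loaded and which functions were then called. A minimal sketch, assuming the output items are plain dictionaries shaped like the JSON in this guide (SDK response objects would need attribute access instead); `summarize_tool_search` is a hypothetical helper name:

```python
def summarize_tool_search(items):
    """Return (loaded_tools, function_calls) from a list of output items.

    Hypothetical helper; expects dicts shaped like the JSON in this guide.
    """
    loaded = []
    calls = []
    for item in items:
        if item["type"] == "tool_search_output":
            for tool in item.get("tools", []):
                if tool["type"] == "namespace":
                    # Qualify nested functions with their namespace name.
                    for nested in tool.get("tools", []):
                        loaded.append(f'{tool["name"]}.{nested["name"]}')
                else:
                    loaded.append(tool["name"])
        elif item["type"] == "function_call":
            calls.append((item.get("namespace"), item["name"]))
    return loaded, calls
```

Because one `tool_search_call` may load several namespaces, the helper collects every nested function instead of assuming a single namespace per response.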

88 259

Configure the `tool_search` tool with `execution: "client"` and a schema for the search arguments your application expects:

90 261

262Configure client-executed tool search

263

264```python

265from openai import OpenAI

266

267client = OpenAI()

268

269first_response = client.responses.create(

270 model="gpt-5.5",

271 input="Find the shipping ETA tool first, then use it for order_42.",

272 tools=[

273 {

274 "type": "tool_search",

275 # highlight-start:subtle

276 "execution": "client",

277 # highlight-end

278 "description": "Find the project-specific tools needed to continue the task.",

279 "parameters": {

280 "type": "object",

281 "properties": {

282 "goal": {"type": "string"},

283 },

284 "required": ["goal"],

285 "additionalProperties": False,

286 },

287 }

288 ],

289 parallel_tool_calls=False,

290)

291

292search_call = next(

293 item for item in first_response.output if item.type == "tool_search_call"

294)

295

296loaded_tools = [

297 {

298 "type": "function",

299 "name": "get_shipping_eta",

300 "description": "Look up shipping ETA details for an order.",

301 "defer_loading": True,

302 "parameters": {

303 "type": "object",

304 "properties": {

305 "order_id": {"type": "string"},

306 },

307 "required": ["order_id"],

308 "additionalProperties": False,

309 },

310 }

311]

312

313second_response = client.responses.create(

314 model="gpt-5.5",

315 input=[

316 *first_response.output,

317 {

318 # highlight-start:subtle

319 "type": "tool_search_output",

320 # highlight-end

321 "execution": "client",

322 "call_id": search_call.call_id,

323 "status": "completed",

324 # highlight-start:subtle

325 "tools": loaded_tools,

326 # highlight-end

327 },

328 ],

329)

330

331print(second_response.output)

332```

333

334```javascript

335import OpenAI from "openai";

336

337const client = new OpenAI();

338

339const firstResponse = await client.responses.create({

340 model: "gpt-5.5",

341 input: "Find the shipping ETA tool first, then use it for order_42.",

342 tools: [

343 {

344 type: "tool_search",

345 // highlight-start:subtle

346 execution: "client",

347 // highlight-end

348 description: "Find the project-specific tools needed to continue the task.",

349 parameters: {

350 type: "object",

351 properties: {

352 goal: { type: "string" },

353 },

354 required: ["goal"],

355 additionalProperties: false,

356 },

357 },

358 ],

359 parallel_tool_calls: false,

360});

361

362const searchCall = firstResponse.output.find(

363 (item) => item.type === "tool_search_call",

364);

365

366const loadedTools = [

367 {

368 type: "function",

369 name: "get_shipping_eta",

370 description: "Look up shipping ETA details for an order.",

371 defer_loading: true,

372 parameters: {

373 type: "object",

374 properties: {

375 order_id: { type: "string" },

376 },

377 required: ["order_id"],

378 additionalProperties: false,

379 },

380 },

381];

382

383const secondResponse = await client.responses.create({

384 model: "gpt-5.5",

385 input: [

386 ...firstResponse.output,

387 {

388 // highlight-start:subtle

389 type: "tool_search_output",

390 // highlight-end

391 execution: "client",

392 call_id: searchCall.call_id,

393 status: "completed",

394 // highlight-start:subtle

395 tools: loadedTools,

396 // highlight-end

397 },

398 ],

399});

400

401console.log(secondResponse.output);

402```

403

404

On the first turn, the model emits a `tool_search_call` and stops there:

92 406

407Client tool search call

408

409```json

410[

411 {

412 "type": "tool_search_call",

413 "execution": "client",

414 "call_id": "call_abc123",

415 "status": "completed",

416 "arguments": {

417 "goal": "Find the shipping ETA tool for order_42."

418 }

419 }

420]

421```

422

423

Your application then performs the search and returns a `tool_search_output` with the tools it wants to load:

94 425

426Return tool_search_output

427

428```json

429[

430 {

431 "type": "tool_search_output",

432 "execution": "client",

433 "call_id": "call_abc123",

434 "status": "completed",

435 "tools": [

436 {

437 "type": "function",

438 "name": "get_shipping_eta",

439 "description": "Look up shipping ETA details for an order.",

440 "defer_loading": true,

441 "parameters": {

442 "type": "object",

443 "properties": {

444 "order_id": { "type": "string" }

445 },

446 "required": ["order_id"],

447 "additionalProperties": false

448 }

449 }

450 ]

451 }

452]

453```

454

455

On the next turn, the loaded tool is callable like a normal function:

96 457

458Loaded function call

459

460```json

461[

462 {

463 "type": "function_call",

464 "name": "get_shipping_eta",

465 "namespace": "get_shipping_eta",

466 "call_id": "call_xyz456",

467 "arguments": "{\"order_id\":\"order_42\"}"

468 }

469]

470```

471

472

In client mode, `execution` is set to `client` and `call_id` is defined. Echo the same `call_id` from the `tool_search_call` in your `tool_search_output`.
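The search step itself is up to your application. The sketch below shows one naive way to do it: score a local registry by keyword overlap with the `goal` argument, then build a `tool_search_output` that echoes the `call_id`. The registry contents, the `cancel_subscription` tool, the scoring rule, and the helper names are all hypothetical; a real application would likely use a better retrieval method.

```python
import re

# Hypothetical local registry of tools the application can offer.
TOOL_REGISTRY = [
    {
        "type": "function",
        "name": "get_shipping_eta",
        "description": "Look up shipping ETA details for an order.",
        "defer_loading": True,
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
            "additionalProperties": False,
        },
    },
    {
        "type": "function",
        "name": "cancel_subscription",
        "description": "Cancel a customer subscription.",
        "defer_loading": True,
        "parameters": {
            "type": "object",
            "properties": {"customer_id": {"type": "string"}},
            "required": ["customer_id"],
            "additionalProperties": False,
        },
    },
]


def _words(text):
    """Lowercase alphanumeric tokens; underscores act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search_tools(goal, registry, min_score=2):
    """Rank tools by how many words they share with the goal (naive)."""
    goal_words = _words(goal)
    scored = []
    for tool in registry:
        tool_words = _words(tool["name"] + " " + tool["description"])
        score = len(goal_words & tool_words)
        if score >= min_score:
            scored.append((score, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored]


def build_tool_search_output(search_call, registry):
    """Answer a client-executed tool_search_call, echoing its call_id."""
    goal = search_call["arguments"]["goal"]
    return {
        "type": "tool_search_output",
        "execution": "client",
        "call_id": search_call["call_id"],
        "status": "completed",
        "tools": search_tools(goal, registry),
    }
```

The helper takes the call as a plain dictionary; with SDK response objects, read `call_id` and `arguments` from the item's attributes before calling it.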

98 474

99## Advanced usage475## Advanced usage

guides/video-generation.md +10 −9

Details

341 341

For each completed video, you can also download a **thumbnail** and a **spritesheet**. These are lightweight assets useful for previews, scrubbers, or catalog displays. Use the `variant` query parameter to specify what you want to download. The default is `variant=video` for the MP4.
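As a small illustration of the `variant` parameter, the sketch below builds the download URL for each asset type. The helper name `content_url` is hypothetical, and the set of accepted variants is taken only from the values named in the paragraph above.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.openai.com/v1"
# Variants named in this guide; treat the set as an assumption.
VARIANTS = {"video", "thumbnail", "spritesheet"}


def content_url(video_id, variant="video"):
    """Build the content download URL for a completed video."""
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant: {variant!r}")
    query = urlencode({"variant": variant})
    return f"{BASE_URL}/videos/{quote(video_id, safe='')}/content?{query}"
```

Send the request with the same `Authorization: Bearer $OPENAI_API_KEY` header used in the other examples.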

343 343

344```shell344```bash

345# Download a thumbnail345# Download a thumbnail

346curl -L "https://api.openai.com/v1/videos/video_abc123/content?variant=thumbnail" \346curl -L "https://api.openai.com/v1/videos/video_abc123/content?variant=thumbnail" \

347 -H "Authorization: Bearer $OPENAI_API_KEY" \347 -H "Authorization: Bearer $OPENAI_API_KEY" \
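
If you build download URLs in code, a small helper can keep the `variant` values in one place. This is a minimal sketch: the URL shape and the three variant names come from the text and curl example above, while the helper name `video_content_url` and the constants are illustrative assumptions. It only builds the URL; you still need to send the request with your `Authorization` header.

```python
from urllib.parse import urlencode

API_BASE = "https://api.openai.com/v1"

# Variants named in the text above: the MP4 (default), a thumbnail, and a spritesheet.
VARIANTS = ("video", "thumbnail", "spritesheet")


def video_content_url(video_id: str, variant: str = "video") -> str:
    """Return the content download URL for a completed video and asset variant."""
    if variant not in VARIANTS:
        raise ValueError(f"variant must be one of {VARIANTS}, got {variant!r}")
    query = urlencode({"variant": variant})
    return f"{API_BASE}/videos/{video_id}/content?{query}"


print(video_content_url("video_abc123", "thumbnail"))
# https://api.openai.com/v1/videos/video_abc123/content?variant=thumbnail
```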


Supported file formats are `image/jpeg`, `image/png`, and `image/webp`.

```bash
curl -X POST "https://api.openai.com/v1/videos" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
```

  team](https://openai.com/contact-sales/) to learn more about eligibility for
  human-likeness access.

```bash
curl -X POST "https://api.openai.com/v1/videos/characters" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
```

Characters can be combined with `input_reference`. Extensions don't support
characters.

```bash
curl -X POST "https://api.openai.com/v1/videos" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
```

  currently accept only a source video and prompt. They don't support characters
  or image references.

```bash
curl -X POST "https://api.openai.com/v1/videos/extensions" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
```

  account manager or [reach out to our sales
  team](https://openai.com/contact-sales/) if you need this workflow.

```bash
curl -X POST "https://api.openai.com/v1/videos/edits" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
```

If you upload a new video instead of editing an existing generation, set
`model` explicitly in the request.

```bash
curl -X POST "https://api.openai.com/v1/videos/edits" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
```


Use `GET /videos` to enumerate your videos. The endpoint supports optional query parameters for pagination and sorting.

```bash
curl "https://api.openai.com/v1/videos?limit=20&after=video_123&order=asc" \
  -H "Authorization: Bearer $OPENAI_API_KEY" | jq .
```
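
When paging through results programmatically, it can help to build the list URL from optional arguments rather than concatenating strings. The sketch below assumes only the `limit`, `after`, and `order` parameters shown in the curl example; the helper name `list_videos_url` is illustrative, and parameters you leave out are omitted from the query string.

```python
from typing import Optional
from urllib.parse import urlencode

VIDEOS_URL = "https://api.openai.com/v1/videos"


def list_videos_url(
    limit: Optional[int] = None,
    after: Optional[str] = None,
    order: Optional[str] = None,
) -> str:
    """Build the GET /videos URL, including only the parameters that were provided."""
    candidates = (("limit", limit), ("after", after), ("order", order))
    params = {name: value for name, value in candidates if value is not None}
    return f"{VIDEOS_URL}?{urlencode(params)}" if params else VIDEOS_URL


print(list_videos_url(limit=20, after="video_123", order="asc"))
# https://api.openai.com/v1/videos?limit=20&after=video_123&order=asc
```

To fetch the next page, pass the ID of the last video you received as `after`, as the curl example does with `video_123`.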


Use `DELETE /videos/{video_id}` to remove videos you no longer need from OpenAI’s storage.

```bash
curl -X DELETE "https://api.openai.com/v1/videos/REPLACE_WITH_YOUR_VIDEO_ID" \
  -H "Authorization: Bearer $OPENAI_API_KEY" | jq .
```

guides/workload-identity-federation/aws.md +640 −2



Set `OPENAI_WIF_AUDIENCE` to the same audience configured on the OpenAI Workload Identity Provider. The subject token provider calls AWS STS `GetWebIdentityToken` with that audience, returns the AWS-issued JWT as the subject token, and the OpenAI SDK exchanges it for an OpenAI-issued access token.

Authenticate from an AWS-issued OIDC token

```typescript
import { GetWebIdentityTokenCommand, STSClient } from "@aws-sdk/client-sts";
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;
const audience = process.env.OPENAI_WIF_AUDIENCE;
const awsRegion = process.env.AWS_REGION;

if (!identityProviderId || !serviceAccountId || !audience || !awsRegion) {
  throw new Error(
    "Set OPENAI_IDENTITY_PROVIDER_ID, OPENAI_SERVICE_ACCOUNT_ID, OPENAI_WIF_AUDIENCE, and AWS_REGION"
  );
}

const sts = new STSClient({ region: awsRegion });

function awsOutboundWebIdentityTokenProvider(): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const response = await sts.send(
        new GetWebIdentityTokenCommand({
          Audience: [audience],
          SigningAlgorithm: "ES384",
          DurationSeconds: 300,
        })
      );

      if (!response.WebIdentityToken) {
        throw new Error("AWS STS did not return a web identity token.");
      }

      return response.WebIdentityToken;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: awsOutboundWebIdentityTokenProvider(),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from AWS outbound workload identity federation.",
});

console.log(response.output_text);
```

```python
import os

import boto3
from openai import OpenAI
from openai.auth import SubjectTokenProvider


def aws_outbound_web_identity_token_provider(audience: str) -> SubjectTokenProvider:
    sts = boto3.client("sts", region_name=os.environ["AWS_REGION"])

    def get_token() -> str:
        response = sts.get_web_identity_token(
            Audience=[audience],
            SigningAlgorithm="ES384",
            DurationSeconds=300,
        )
        token = response.get("WebIdentityToken", "")
        if not token:
            raise RuntimeError("AWS STS did not return a web identity token.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": aws_outbound_web_identity_token_provider(
            os.environ["OPENAI_WIF_AUDIENCE"]
        ),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from AWS outbound workload identity federation.",
)

print(response.output_text)
```

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	awssdk "github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/sts"
	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/auth"
	"github.com/openai/openai-go/v3/option"
	"github.com/openai/openai-go/v3/responses"
)

type awsOutboundWebIdentityTokenProvider struct {
	client   *sts.Client
	audience string
}

func (p awsOutboundWebIdentityTokenProvider) TokenType() auth.SubjectTokenType {
	return auth.SubjectTokenTypeJWT
}

func (p awsOutboundWebIdentityTokenProvider) GetToken(ctx context.Context, _ auth.HTTPDoer) (string, error) {
	output, err := p.client.GetWebIdentityToken(ctx, &sts.GetWebIdentityTokenInput{
		Audience:         []string{p.audience},
		DurationSeconds:  awssdk.Int32(300),
		SigningAlgorithm: awssdk.String("ES384"),
	})
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "aws-outbound",
			Message:  "failed to request AWS web identity token",
			Cause:    err,
		}
	}

	token := awssdk.ToString(output.WebIdentityToken)
	if token == "" {
		return "", &auth.SubjectTokenProviderError{
			Provider: "aws-outbound",
			Message:  "AWS STS did not return a web identity token",
		}
	}

	return token, nil
}

func main() {
	ctx := context.Background()
	audience := os.Getenv("OPENAI_WIF_AUDIENCE")
	if audience == "" {
		log.Fatal("Set OPENAI_WIF_AUDIENCE")
	}

	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}

	client := openai.NewClient(
		option.WithWorkloadIdentity(auth.WorkloadIdentity{
			IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),
			ServiceAccountID:   os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),
			Provider: awsOutboundWebIdentityTokenProvider{
				client:   sts.NewFromConfig(cfg),
				audience: audience,
			},
		}),
	)

	response, err := client.Responses.New(ctx, responses.ResponseNewParams{
		Model: openai.ChatModelGPT4_1Mini,
		Input: responses.ResponseNewParamsInputUnion{
			OfString: openai.String("Say hello from AWS outbound workload identity federation."),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.OutputText())
}
```

```java
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.openai.auth.SubjectTokenProvider;
import com.openai.auth.SubjectTokenType;
import com.openai.auth.WorkloadIdentity;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.HttpClient;
import com.openai.errors.SubjectTokenProviderException;
import com.openai.models.ChatModel;
import com.openai.models.responses.ResponseCreateParams;
import java.util.concurrent.CompletableFuture;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.sts.StsClient;
import software.amazon.awssdk.services.sts.model.GetWebIdentityTokenRequest;

public final class AwsOutboundWorkloadIdentityExample {
    private AwsOutboundWorkloadIdentityExample() {}

    static final class AwsOutboundWebIdentityTokenProvider implements SubjectTokenProvider {
        private final StsClient stsClient;
        private final String audience;

        AwsOutboundWebIdentityTokenProvider(StsClient stsClient, String audience) {
            this.stsClient = stsClient;
            this.audience = audience;
        }

        @Override
        public SubjectTokenType tokenType() {
            return SubjectTokenType.JWT;
        }

        @Override
        public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {
            try {
                String token = stsClient.getWebIdentityToken(GetWebIdentityTokenRequest.builder()
                        .audience(audience)
                        .durationSeconds(300)
                        .signingAlgorithm("ES384")
                        .build()).webIdentityToken();

                if (token == null || token.isEmpty()) {
                    throw new SubjectTokenProviderException(
                            "aws-outbound",
                            "AWS STS did not return a web identity token",
                            null);
                }

                return token;
            } catch (SubjectTokenProviderException e) {
                throw e;
            } catch (Exception e) {
                throw new SubjectTokenProviderException(
                        "aws-outbound",
                        "failed to request AWS web identity token",
                        e);
            }
        }

        @Override
        public CompletableFuture<String> getTokenAsync(
                HttpClient httpClient, JsonMapper jsonMapper) {
            return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));
        }
    }

    public static void main(String[] args) {
        String audience = System.getenv("OPENAI_WIF_AUDIENCE");
        StsClient stsClient = StsClient.builder()
                .region(Region.of(System.getenv("AWS_REGION")))
                .build();

        WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()
                .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))
                .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))
                .provider(new AwsOutboundWebIdentityTokenProvider(stsClient, audience))
                .build();

        OpenAIClient client = OpenAIOkHttpClient.builder()
                .workloadIdentity(workloadIdentity)
                .build();

        ResponseCreateParams params = ResponseCreateParams.builder()
                .model(ChatModel.GPT_4_1_MINI)
                .input("Say hello from AWS outbound workload identity federation.")
                .build();

        client.responses().create(params).output().stream()
                .flatMap(item -> item.message().stream())
                .flatMap(message -> message.content().stream())
                .flatMap(content -> content.outputText().stream())
                .forEach(outputText -> System.out.println(outputText.text()));
    }
}
```

```ruby
require "aws-sdk-sts"
require "openai"

class AwsOutboundWebIdentityTokenProvider
  include OpenAI::Auth::SubjectTokenProvider

  def initialize(audience:, sts_client:)
    @audience = audience
    @sts_client = sts_client
  end

  def token_type
    OpenAI::Auth::TokenType::JWT
  end

  def get_token
    response = @sts_client.get_web_identity_token(
      audience: [@audience],
      signing_algorithm: "ES384",
      duration_seconds: 300
    )
    token = response.web_identity_token.to_s
    if token.empty?
      raise OpenAI::Errors::SubjectTokenProviderError.new(
        message: "AWS STS did not return a web identity token",
        provider: "aws-outbound"
      )
    end
    token
  rescue Aws::STS::Errors::ServiceError => e
    raise OpenAI::Errors::SubjectTokenProviderError.new(
      message: "Failed to request AWS web identity token: #{e.message}",
      provider: "aws-outbound",
      cause: e
    )
  end
end

provider = AwsOutboundWebIdentityTokenProvider.new(
  audience: ENV.fetch("OPENAI_WIF_AUDIENCE"),
  sts_client: Aws::STS::Client.new(region: ENV.fetch("AWS_REGION"))
)

workload_identity = OpenAI::Auth::WorkloadIdentity.new(
  identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),
  service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),
  provider: provider
)

client = OpenAI::Client.new(workload_identity: workload_identity)

response = client.responses.create(
  model: "gpt-5.4-mini",
  input: "Say hello from AWS outbound workload identity federation."
)

puts(response.output_text)
```


  </div>

  <div data-content-switcher-pane data-value="eks" hidden>


The following examples initialize an OpenAI client with a custom subject token provider. The provider reads the projected EKS service account token from the mounted file path and uses it as the subject token for workload identity federation.

Authenticate from an EKS projected service account token

```typescript
import { readFile } from "node:fs/promises";
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const tokenPath = "/var/run/secrets/tokens/token";
const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;

if (!identityProviderId || !serviceAccountId) {
  throw new Error("Set OPENAI_IDENTITY_PROVIDER_ID and OPENAI_SERVICE_ACCOUNT_ID");
}

function mountedEksServiceAccountTokenProvider(path: string): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const token = (await readFile(path, "utf8")).trim();
      if (!token) {
        throw new Error("The mounted EKS service account token file is empty.");
      }
      return token;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: mountedEksServiceAccountTokenProvider(tokenPath),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from AWS workload identity federation.",
});

console.log(response.output_text);
```

```python
import os
from pathlib import Path

from openai import OpenAI
from openai.auth import SubjectTokenProvider

TOKEN_PATH = "/var/run/secrets/tokens/token"


def mounted_eks_service_account_token_provider(token_path: str) -> SubjectTokenProvider:
    def get_token() -> str:
        token = Path(token_path).read_text().strip()
        if not token:
            raise RuntimeError("The mounted EKS service account token file is empty.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": mounted_eks_service_account_token_provider(TOKEN_PATH),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from AWS workload identity federation.",
)

print(response.output_text)
```

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"strings"

	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/auth"
	"github.com/openai/openai-go/v3/option"
	"github.com/openai/openai-go/v3/responses"
)

const tokenPath = "/var/run/secrets/tokens/token"

type mountedEksServiceAccountTokenProvider struct {
	path string
}

func (p mountedEksServiceAccountTokenProvider) TokenType() auth.SubjectTokenType {
	return auth.SubjectTokenTypeJWT
}

func (p mountedEksServiceAccountTokenProvider) GetToken(_ context.Context, _ auth.HTTPDoer) (string, error) {
	data, err := os.ReadFile(p.path)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "aws-eks",
			Message:  "failed to read mounted EKS service account token",
			Cause:    err,
		}
	}

	token := strings.TrimSpace(string(data))
	if token == "" {
		return "", &auth.SubjectTokenProviderError{
			Provider: "aws-eks",
			Message:  "mounted EKS service account token is empty",
		}
	}

	return token, nil
}

func main() {
	client := openai.NewClient(
		option.WithWorkloadIdentity(auth.WorkloadIdentity{
			IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),
			ServiceAccountID:   os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),
			Provider: mountedEksServiceAccountTokenProvider{
				path: tokenPath,
			},
		}),
	)

	response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{
		Model: openai.ChatModelGPT4_1Mini,
		Input: responses.ResponseNewParamsInputUnion{
			OfString: openai.String("Say hello from AWS workload identity federation."),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.OutputText())
}
```

```java
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.openai.auth.SubjectTokenProvider;
import com.openai.auth.SubjectTokenType;
import com.openai.auth.WorkloadIdentity;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.HttpClient;
import com.openai.errors.SubjectTokenProviderException;
import com.openai.models.ChatModel;
import com.openai.models.responses.ResponseCreateParams;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

public final class AwsEksWorkloadIdentityExample {
    private static final String TOKEN_PATH = "/var/run/secrets/tokens/token";

    private AwsEksWorkloadIdentityExample() {}

    static final class MountedEksServiceAccountTokenProvider implements SubjectTokenProvider {
        private final Path tokenPath;

        MountedEksServiceAccountTokenProvider(String tokenPath) {
            this.tokenPath = Path.of(tokenPath);
        }

        @Override
        public SubjectTokenType tokenType() {
            return SubjectTokenType.JWT;
        }

        @Override
        public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {
            String token;
            try {
                token = Files.readString(tokenPath).trim();
            } catch (Exception e) {
                throw new SubjectTokenProviderException(
                        "aws-eks",
                        "failed to read mounted EKS service account token",
                        e);
            }

            if (token.isEmpty()) {
                throw new SubjectTokenProviderException(
                        "aws-eks",
                        "mounted EKS service account token is empty",
                        null);
            }

            return token;
        }

        @Override
        public CompletableFuture<String> getTokenAsync(
                HttpClient httpClient, JsonMapper jsonMapper) {
            return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));
        }
    }

    public static void main(String[] args) {
        WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()
                .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))
                .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))
                .provider(new MountedEksServiceAccountTokenProvider(TOKEN_PATH))
                .build();

        OpenAIClient client = OpenAIOkHttpClient.builder()
                .workloadIdentity(workloadIdentity)
                .build();

        ResponseCreateParams params = ResponseCreateParams.builder()
                .model(ChatModel.GPT_4_1_MINI)
                .input("Say hello from AWS workload identity federation.")
                .build();

        client.responses().create(params).output().stream()
                .flatMap(item -> item.message().stream())
                .flatMap(message -> message.content().stream())
                .flatMap(content -> content.outputText().stream())
                .forEach(outputText -> System.out.println(outputText.text()));
    }
}
```

```ruby
require "openai"

TOKEN_PATH = "/var/run/secrets/tokens/token"

class MountedEksServiceAccountTokenProvider
  include OpenAI::Auth::SubjectTokenProvider

  def initialize(token_path:)
    @token_path = token_path
  end

  def token_type
    OpenAI::Auth::TokenType::JWT
  end

  def get_token
    token = File.read(@token_path).strip
    if token.empty?
      raise OpenAI::Errors::SubjectTokenProviderError.new(
        message: "Mounted EKS service account token is empty",
        provider: "aws-eks"
      )
    end
    token
  rescue SystemCallError => e
    raise OpenAI::Errors::SubjectTokenProviderError.new(
      message: "Failed to read mounted EKS service account token: #{e.message}",
      provider: "aws-eks",
      cause: e
    )
  end
end

provider = MountedEksServiceAccountTokenProvider.new(token_path: TOKEN_PATH)

workload_identity = OpenAI::Auth::WorkloadIdentity.new(
  identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),
  service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),
  provider: provider
)

client = OpenAI::Client.new(workload_identity: workload_identity)

response = client.responses.create(
  model: "gpt-5.4-mini",
  input: "Say hello from AWS workload identity federation."
)

puts(response.output_text)
```


  </div>


guides/workload-identity-federation/github-actions.md +445 −0



The following examples initialize an OpenAI client with a custom subject token provider. The provider requests a GitHub OIDC token for the configured audience and uses it as the subject token for workload identity federation.

Authenticate from a GitHub Actions OIDC token

```typescript
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;
const audience = process.env.OPENAI_WIF_AUDIENCE;
const requestURL = process.env.ACTIONS_ID_TOKEN_REQUEST_URL;
const requestToken = process.env.ACTIONS_ID_TOKEN_REQUEST_TOKEN;

if (
  !identityProviderId ||
  !serviceAccountId ||
  !audience ||
  !requestURL ||
  !requestToken
) {
  throw new Error(
    "Set OPENAI_IDENTITY_PROVIDER_ID, OPENAI_SERVICE_ACCOUNT_ID, OPENAI_WIF_AUDIENCE, and run inside GitHub Actions with id-token: write"
  );
}

function githubActionsOIDCTokenProvider(
  requestURL: string,
  requestToken: string,
  audience: string
): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const url = new URL(requestURL);
      url.searchParams.set("audience", audience);

      const response = await fetch(url, {
        headers: { Authorization: `bearer ${requestToken}` },
      });

      if (!response.ok) {
        throw new Error(
          `Failed to request GitHub OIDC token: ${response.status} ${response.statusText}`
        );
      }

      const body = (await response.json()) as { value?: string };
      if (!body.value) {
        throw new Error("GitHub OIDC token response did not include a value.");
      }

      return body.value;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: githubActionsOIDCTokenProvider(requestURL, requestToken, audience),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from GitHub Actions workload identity federation.",
});

console.log(response.output_text);
```

```python
import json
import os
import urllib.parse
import urllib.request

from openai import OpenAI
from openai.auth import SubjectTokenProvider


def github_actions_oidc_token_provider(audience: str) -> SubjectTokenProvider:
    request_url = os.environ["ACTIONS_ID_TOKEN_REQUEST_URL"]
    request_token = os.environ["ACTIONS_ID_TOKEN_REQUEST_TOKEN"]

    def get_token() -> str:
        parsed_url = urllib.parse.urlparse(request_url)
        query = dict(urllib.parse.parse_qsl(parsed_url.query, keep_blank_values=True))
        query["audience"] = audience
        url = urllib.parse.urlunparse(
            parsed_url._replace(query=urllib.parse.urlencode(query))
        )

        request = urllib.request.Request(
            url,
            headers={"Authorization": f"bearer {request_token}"},
        )
        with urllib.request.urlopen(request) as response:
            payload = json.loads(response.read().decode("utf-8"))

        token = payload.get("value")
        if not token:
            raise RuntimeError("GitHub OIDC token response did not include a value.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": github_actions_oidc_token_provider(
            os.environ["OPENAI_WIF_AUDIENCE"]
        ),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from GitHub Actions workload identity federation.",
)

print(response.output_text)
```

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"os"

	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/auth"
	"github.com/openai/openai-go/v3/option"
	"github.com/openai/openai-go/v3/responses"
)

type githubActionsOIDCTokenProvider struct {
	requestURL   string
	requestToken string
	audience     string
}

func (p githubActionsOIDCTokenProvider) TokenType() auth.SubjectTokenType {
	return auth.SubjectTokenTypeJWT
}

func (p githubActionsOIDCTokenProvider) GetToken(ctx context.Context, httpClient auth.HTTPDoer) (string, error) {
	if httpClient == nil {
		httpClient = http.DefaultClient
	}

	oidcURL, err := url.Parse(p.requestURL)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "github-actions",
			Message:  "failed to parse GitHub OIDC request URL",
			Cause:    err,
		}
	}
	query := oidcURL.Query()
	query.Set("audience", p.audience)
	oidcURL.RawQuery = query.Encode()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, oidcURL.String(), nil)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "github-actions",
			Message:  "failed to create GitHub OIDC token request",
			Cause:    err,
		}
	}
	req.Header.Set("Authorization", "bearer "+p.requestToken)

	resp, err := httpClient.Do(req)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "github-actions",
			Message:  "failed to request GitHub OIDC token",
			Cause:    err,
		}
	}
	defer resp.Body.Close()

	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return "", &auth.SubjectTokenProviderError{
			Provider: "github-actions",
			Message:  fmt.Sprintf("GitHub OIDC token request failed with status %s", resp.Status),
		}
	}

	var body struct {
		Value string `json:"value"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "github-actions",
			Message:  "failed to decode GitHub OIDC token response",
			Cause:    err,
		}
	}
	if body.Value == "" {
		return "", &auth.SubjectTokenProviderError{
			Provider: "github-actions",
			Message:  "GitHub OIDC token response did not include a value",
		}
	}

	return body.Value, nil
}

func main() {
	client := openai.NewClient(
		option.WithWorkloadIdentity(auth.WorkloadIdentity{
			IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),
			ServiceAccountID:   os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),
			Provider: githubActionsOIDCTokenProvider{
				requestURL:   os.Getenv("ACTIONS_ID_TOKEN_REQUEST_URL"),
				requestToken: os.Getenv("ACTIONS_ID_TOKEN_REQUEST_TOKEN"),
				audience:     os.Getenv("OPENAI_WIF_AUDIENCE"),
			},
		}),
	)

	response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{
		Model: openai.ChatModelGPT4_1Mini,
		Input: responses.ResponseNewParamsInputUnion{
			OfString: openai.String("Say hello from GitHub Actions workload identity federation."),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.OutputText())
}
```

407```java

408import com.fasterxml.jackson.databind.JsonNode;

409import com.fasterxml.jackson.databind.json.JsonMapper;

410import com.openai.auth.SubjectTokenProvider;

411import com.openai.errors.SubjectTokenProviderException;

412import com.openai.auth.SubjectTokenType;

413import com.openai.auth.WorkloadIdentity;

414import com.openai.client.OpenAIClient;

415import com.openai.client.okhttp.OpenAIOkHttpClient;

416import com.openai.models.ChatModel;

417import com.openai.models.responses.ResponseCreateParams;

418import java.net.URI;

419import java.net.URLEncoder;

420import java.net.http.HttpRequest;

421import java.net.http.HttpResponse;

422import java.nio.charset.StandardCharsets;

423import java.util.concurrent.CompletableFuture;

424

425public final class GitHubActionsWorkloadIdentityExample {

426 private GitHubActionsWorkloadIdentityExample() {}

427

428 static final class GitHubActionsOidcTokenProvider implements SubjectTokenProvider {

429 private final String requestUrl;

430 private final String requestToken;

431 private final String audience;

432

433 GitHubActionsOidcTokenProvider(String requestUrl, String requestToken, String audience) {

434 this.requestUrl = requestUrl;

435 this.requestToken = requestToken;

436 this.audience = audience;

437 }

438

439 @Override

440 public SubjectTokenType tokenType() {

441 return SubjectTokenType.JWT;

442 }

443

444 @Override

445 public String getToken(

446 com.openai.core.http.HttpClient httpClient, JsonMapper jsonMapper) {

447 try {

448 String separator = requestUrl.contains("?") ? "&" : "?";

449 URI uri = URI.create(

450 requestUrl

451 + separator

452 + "audience="

453 + URLEncoder.encode(audience, StandardCharsets.UTF_8));

454

455 HttpRequest request = HttpRequest.newBuilder(uri)

456 .header("Authorization", "bearer " + requestToken)

457 .GET()

458 .build();

459

460 HttpResponse<String> response = java.net.http.HttpClient.newHttpClient()

461 .send(request, HttpResponse.BodyHandlers.ofString());

462

463 if (response.statusCode() < 200 || response.statusCode() >= 300) {

464 throw new SubjectTokenProviderException(

465 "github-actions",

466 "GitHub OIDC token request failed with status "

467 + response.statusCode(),

468 null);

469 }

470

471 JsonNode payload = jsonMapper.readTree(response.body());

472 String token = payload.path("value").asText("");

473 if (token.isEmpty()) {

474 throw new SubjectTokenProviderException(

475 "github-actions",

476 "GitHub OIDC token response did not include a value",

477 null);

478 }

479

480 return token;

481 } catch (SubjectTokenProviderException e) {

482 throw e;

483 } catch (Exception e) {

484 throw new SubjectTokenProviderException(

485 "github-actions",

486 "failed to request GitHub OIDC token",

487 e);

488 }

489 }

490

491 @Override

492 public CompletableFuture<String> getTokenAsync(

493 com.openai.core.http.HttpClient httpClient, JsonMapper jsonMapper) {

494 return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));

495 }

496 }

497

498 public static void main(String[] args) {

499 WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()

500 .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))

501 .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))

502 .provider(new GitHubActionsOidcTokenProvider(

503 System.getenv("ACTIONS_ID_TOKEN_REQUEST_URL"),

504 System.getenv("ACTIONS_ID_TOKEN_REQUEST_TOKEN"),

505 System.getenv("OPENAI_WIF_AUDIENCE")))

506 .build();

507

508 OpenAIClient client = OpenAIOkHttpClient.builder()

509 .workloadIdentity(workloadIdentity)

510 .build();

511

512 ResponseCreateParams params = ResponseCreateParams.builder()

513 .model(ChatModel.GPT_4_1_MINI)

514 .input("Say hello from GitHub Actions workload identity federation.")

515 .build();

516

517 client.responses().create(params).output().stream()

518 .flatMap(item -> item.message().stream())

519 .flatMap(message -> message.content().stream())

520 .flatMap(content -> content.outputText().stream())

521 .forEach(outputText -> System.out.println(outputText.text()));

522 }

523}

524```

525

526```ruby

527require "json"

528require "net/http"

529require "openai"

530require "uri"

531

532class GitHubActionsOIDCTokenProvider

533 include OpenAI::Auth::SubjectTokenProvider

534

535 def initialize(request_url:, request_token:, audience:)

536 @request_url = request_url

537 @request_token = request_token

538 @audience = audience

539 end

540

541 def token_type

542 OpenAI::Auth::TokenType::JWT

543 end

544

545 def get_token

546 uri = URI(@request_url)

547 params = URI.decode_www_form(uri.query || "")

548 params.reject! { |key, _| key == "audience" }

549 params << ["audience", @audience]

550 uri.query = URI.encode_www_form(params)

551

552 request = Net::HTTP::Get.new(uri)

553 request["Authorization"] = "bearer #{@request_token}"

554

555 response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == "https") do |http|

556 http.request(request)

557 end

558

559 unless response.is_a?(Net::HTTPSuccess)

560 raise OpenAI::Errors::SubjectTokenProviderError.new(

561 message: "GitHub OIDC token request failed with status #{response.code}",

562 provider: "github-actions"

563 )

564 end

565

566 token = JSON.parse(response.body).fetch("value", "").to_s

567 if token.empty?

568 raise OpenAI::Errors::SubjectTokenProviderError.new(

569 message: "GitHub OIDC token response did not include a value",

570 provider: "github-actions"

571 )

572 end

573

574 token

575 rescue JSON::ParserError, SystemCallError => e

576 raise OpenAI::Errors::SubjectTokenProviderError.new(

577 message: "Failed to request GitHub OIDC token: #{e.message}",

578 provider: "github-actions",

579 cause: e

580 )

581 end

582end

583

584provider = GitHubActionsOIDCTokenProvider.new(

585 request_url: ENV.fetch("ACTIONS_ID_TOKEN_REQUEST_URL"),

586 request_token: ENV.fetch("ACTIONS_ID_TOKEN_REQUEST_TOKEN"),

587 audience: ENV.fetch("OPENAI_WIF_AUDIENCE")

588)

589

590workload_identity = OpenAI::Auth::WorkloadIdentity.new(

591 identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),

592 service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),

593 provider: provider

594)

595

596client = OpenAI::Client.new(workload_identity: workload_identity)

597

598response = client.responses.create(

599 model: "gpt-5.4-mini",

600 input: "Say hello from GitHub Actions workload identity federation."

601)

602

603puts(response.output_text)

604```

605

606

## GitHub Actions best practices

- Use environment protections for production deployments. Require approvals or branch restrictions before workflows can access production OpenAI resources.

guides/workload-identity-federation/google-cloud.md +704 −2


Set `OPENAI_WIF_AUDIENCE` to the custom audience configured as the Workload Identity Provider audience. The SDK requests a Google identity token for that audience, exchanges it for an OpenAI-issued access token, and uses the OpenAI token to authenticate API requests.
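If the token exchange is rejected, one thing worth checking is whether the `aud` claim of the Google identity token matches the value in `OPENAI_WIF_AUDIENCE`. The following is a minimal debugging sketch, not part of the OpenAI SDK: `decode_jwt_claims` and `audience_matches` are hypothetical helper names, and the code only decodes the token payload without verifying its signature, so it is suitable for local inspection only.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims of a JWT without verifying its signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded with padding stripped; restore it.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def audience_matches(token: str, expected_audience: str) -> bool:
    """Check whether the token's `aud` claim contains the expected audience."""
    aud = decode_jwt_claims(token).get("aud")
    # Per RFC 7519, `aud` may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    return expected_audience in audiences
```

Under these assumptions, calling `audience_matches(token, os.environ["OPENAI_WIF_AUDIENCE"])` on a token returned by the metadata server shows whether the audience is configured consistently on both sides.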

Authenticate from a Google metadata server identity token

```typescript
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const metadataEndpoint =
  "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity";

const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;
const audience = process.env.OPENAI_WIF_AUDIENCE;

if (!identityProviderId || !serviceAccountId || !audience) {
  throw new Error(
    "Set OPENAI_IDENTITY_PROVIDER_ID, OPENAI_SERVICE_ACCOUNT_ID, and OPENAI_WIF_AUDIENCE"
  );
}

function googleMetadataIdentityTokenProvider(audience: string): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const url = new URL(metadataEndpoint);
      url.searchParams.set("audience", audience);
      url.searchParams.set("format", "full");

      const response = await fetch(url, {
        headers: { "Metadata-Flavor": "Google" },
      });

      if (!response.ok) {
        throw new Error(
          `Google metadata token request failed with status ${response.status}.`
        );
      }

      const token = (await response.text()).trim();
      if (!token) {
        throw new Error("Google metadata server did not return an identity token.");
      }

      return token;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: googleMetadataIdentityTokenProvider(audience),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from Google Cloud workload identity federation.",
});

console.log(response.output_text);
```

```python
import os
from urllib.parse import urlencode
from urllib.request import Request, urlopen

from openai import OpenAI
from openai.auth import SubjectTokenProvider

METADATA_ENDPOINT = (
    "http://metadata.google.internal/computeMetadata/v1/instance/"
    "service-accounts/default/identity"
)


def google_metadata_identity_token_provider(audience: str) -> SubjectTokenProvider:
    def get_token() -> str:
        request = Request(
            f"{METADATA_ENDPOINT}?{urlencode({'audience': audience, 'format': 'full'})}",
            headers={"Metadata-Flavor": "Google"},
        )

        with urlopen(request, timeout=10) as response:
            token = response.read().decode("utf-8").strip()

        if not token:
            raise RuntimeError("Google metadata server did not return an identity token.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": google_metadata_identity_token_provider(
            audience=os.environ["OPENAI_WIF_AUDIENCE"]
        ),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from Google Cloud workload identity federation.",
)

print(response.output_text)
```

224```go

225package main

226

227import (

228 "context"

229 "fmt"

230 "io"

231 "log"

232 "net/http"

233 "net/url"

234 "os"

235 "strings"

236

237 "github.com/openai/openai-go/v3"

238 "github.com/openai/openai-go/v3/auth"

239 "github.com/openai/openai-go/v3/option"

240 "github.com/openai/openai-go/v3/responses"

241)

242

243const googleMetadataEndpoint = "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity"

244

245type googleMetadataIdentityTokenProvider struct {

246 audience string

247}

248

249func (p googleMetadataIdentityTokenProvider) TokenType() auth.SubjectTokenType {

250 return auth.SubjectTokenTypeJWT

251}

252

253func (p googleMetadataIdentityTokenProvider) GetToken(ctx context.Context, httpClient auth.HTTPDoer) (string, error) {

254 values := url.Values{}

255 values.Set("audience", p.audience)

256 values.Set("format", "full")

257

258 req, err := http.NewRequestWithContext(ctx, http.MethodGet, googleMetadataEndpoint+"?"+values.Encode(), nil)

259 if err != nil {

260 return "", &auth.SubjectTokenProviderError{

261 Provider: "google-metadata",

262 Message: "failed to build Google metadata token request",

263 Cause: err,

264 }

265 }

266 req.Header.Set("Metadata-Flavor", "Google")

267

268 resp, err := httpClient.Do(req)

269 if err != nil {

270 return "", &auth.SubjectTokenProviderError{

271 Provider: "google-metadata",

272 Message: "failed to request Google identity token",

273 Cause: err,

274 }

275 }

276 defer resp.Body.Close()

277

278 if resp.StatusCode < 200 || resp.StatusCode >= 300 {

279 return "", &auth.SubjectTokenProviderError{

280 Provider: "google-metadata",

281 Message: fmt.Sprintf("Google metadata token request failed with status %d", resp.StatusCode),

282 }

283 }

284

285 data, err := io.ReadAll(resp.Body)

286 if err != nil {

287 return "", &auth.SubjectTokenProviderError{

288 Provider: "google-metadata",

289 Message: "failed to read Google metadata token response",

290 Cause: err,

291 }

292 }

293

294 token := strings.TrimSpace(string(data))

295 if token == "" {

296 return "", &auth.SubjectTokenProviderError{

297 Provider: "google-metadata",

298 Message: "Google metadata server did not return an identity token",

299 }

300 }

301

302 return token, nil

303}

304

305func main() {

306 audience := os.Getenv("OPENAI_WIF_AUDIENCE")

307 if audience == "" {

308 log.Fatal("Set OPENAI_WIF_AUDIENCE")

309 }

310

311 client := openai.NewClient(

312 option.WithWorkloadIdentity(auth.WorkloadIdentity{

313 IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),

314 ServiceAccountID: os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),

315 Provider: googleMetadataIdentityTokenProvider{

316 audience: audience,

317 },

318 }),

319 )

320

321 response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{

322 Model: openai.ChatModelGPT4_1Mini,

323 Input: responses.ResponseNewParamsInputUnion{

324 OfString: openai.String("Say hello from Google Cloud workload identity federation."),

325 },

326 })

327 if err != nil {

328 log.Fatal(err)

329 }

330

331 fmt.Println(response.OutputText())

332}

333```

334

335```java

336import com.fasterxml.jackson.databind.json.JsonMapper;

337import com.openai.auth.SubjectTokenProvider;

338import com.openai.auth.SubjectTokenType;

339import com.openai.auth.WorkloadIdentity;

340import com.openai.client.OpenAIClient;

341import com.openai.client.okhttp.OpenAIOkHttpClient;

342import com.openai.core.http.HttpClient;

343import com.openai.errors.SubjectTokenProviderException;

344import com.openai.models.ChatModel;

345import com.openai.models.responses.ResponseCreateParams;

346import java.net.URI;

347import java.net.URLEncoder;

348import java.net.http.HttpRequest;

349import java.net.http.HttpResponse;

350import java.nio.charset.StandardCharsets;

351import java.util.concurrent.CompletableFuture;

352

353public final class GoogleWorkloadIdentityExample {

354 private static final String METADATA_ENDPOINT =

355 "http://metadata.google.internal/computeMetadata/v1/instance/"

356 + "service-accounts/default/identity";

357

358 private GoogleWorkloadIdentityExample() {}

359

360 static final class GoogleMetadataIdentityTokenProvider implements SubjectTokenProvider {

361 private final String audience;

362

363 GoogleMetadataIdentityTokenProvider(String audience) {

364 this.audience = audience;

365 }

366

367 @Override

368 public SubjectTokenType tokenType() {

369 return SubjectTokenType.JWT;

370 }

371

372 @Override

373 public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {

374 try {

375 String query = "audience="

376 + URLEncoder.encode(audience, StandardCharsets.UTF_8)

377 + "&format=full";

378 HttpRequest request = HttpRequest.newBuilder()

379 .uri(URI.create(METADATA_ENDPOINT + "?" + query))

380 .header("Metadata-Flavor", "Google")

381 .GET()

382 .build();

383

384 HttpResponse<String> response = java.net.http.HttpClient.newHttpClient()

385 .send(request, HttpResponse.BodyHandlers.ofString());

386 if (response.statusCode() < 200 || response.statusCode() >= 300) {

387 throw new SubjectTokenProviderException(

388 "google-metadata",

389 "Google metadata token request failed with status "

390 + response.statusCode(),

391 null);

392 }

393

394 String token = response.body().trim();

395 if (token.isEmpty()) {

396 throw new SubjectTokenProviderException(

397 "google-metadata",

398 "Google metadata server did not return an identity token",

399 null);

400 }

401

402 return token;

403 } catch (SubjectTokenProviderException e) {

404 throw e;

405 } catch (Exception e) {

406 throw new SubjectTokenProviderException(

407 "google-metadata",

408 "failed to request Google identity token",

409 e);

410 }

411 }

412

413 @Override

414 public CompletableFuture<String> getTokenAsync(

415 HttpClient httpClient, JsonMapper jsonMapper) {

416 return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));

417 }

418 }

419

420 public static void main(String[] args) {

421 WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()

422 .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))

423 .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))

424 .provider(new GoogleMetadataIdentityTokenProvider(

425 System.getenv("OPENAI_WIF_AUDIENCE")))

426 .build();

427

428 OpenAIClient client = OpenAIOkHttpClient.builder()

429 .workloadIdentity(workloadIdentity)

430 .build();

431

432 ResponseCreateParams params = ResponseCreateParams.builder()

433 .model(ChatModel.GPT_4_1_MINI)

434 .input("Say hello from Google Cloud workload identity federation.")

435 .build();

436

437 client.responses().create(params).output().stream()

438 .flatMap(item -> item.message().stream())

439 .flatMap(message -> message.content().stream())

440 .flatMap(content -> content.outputText().stream())

441 .forEach(outputText -> System.out.println(outputText.text()));

442 }

443}

444```

445

446```ruby

447require "net/http"

448require "openai"

449require "uri"

450

451class GoogleMetadataIdentityTokenProvider

452 include OpenAI::Auth::SubjectTokenProvider

453

454 METADATA_ENDPOINT =

455 "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/identity"

456

457 def initialize(audience:)

458 @audience = audience

459 end

460

461 def token_type

462 OpenAI::Auth::TokenType::JWT

463 end

464

465 def get_token

466 uri = URI(METADATA_ENDPOINT)

467 uri.query = URI.encode_www_form(

468 audience: @audience,

469 format: "full"

470 )

471

472 request = Net::HTTP::Get.new(uri)

473 request["Metadata-Flavor"] = "Google"

474

475 response = Net::HTTP.start(uri.hostname, uri.port, read_timeout: 10) do |http|

476 http.request(request)

477 end

478

479 unless response.is_a?(Net::HTTPSuccess)

480 raise OpenAI::Errors::SubjectTokenProviderError.new(

481 message: "Google metadata token request failed with status #{response.code}",

482 provider: "google-metadata"

483 )

484 end

485

486 token = response.body.strip

487 if token.empty?

488 raise OpenAI::Errors::SubjectTokenProviderError.new(

489 message: "Google metadata server did not return an identity token",

490 provider: "google-metadata"

491 )

492 end

493 token

494 rescue SystemCallError => e

495 raise OpenAI::Errors::SubjectTokenProviderError.new(

496 message: "Failed to request Google identity token: #{e.message}",

497 provider: "google-metadata",

498 cause: e

499 )

500 end

501end

502

503provider = GoogleMetadataIdentityTokenProvider.new(

504 audience: ENV.fetch("OPENAI_WIF_AUDIENCE")

505)

506

507workload_identity = OpenAI::Auth::WorkloadIdentity.new(

508 identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),

509 service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),

510 provider: provider

511)

512

513client = OpenAI::Client.new(workload_identity: workload_identity)

514

515response = client.responses.create(

516 model: "gpt-5.4-mini",

517 input: "Say hello from Google Cloud workload identity federation."

518)

519

520puts(response.output_text)

521```

  </div>

  <div data-content-switcher-pane data-value="gke" hidden>

The following examples initialize an OpenAI client with a custom subject token provider. The provider reads the projected GKE service account token from the mounted file path and uses it as the subject token for workload identity federation.

Authenticate from a GKE projected service account token

```typescript
import { readFile } from "node:fs/promises";
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const tokenPath = "/var/run/secrets/tokens/token";
const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;

if (!identityProviderId || !serviceAccountId) {
  throw new Error("Set OPENAI_IDENTITY_PROVIDER_ID and OPENAI_SERVICE_ACCOUNT_ID");
}

function mountedGkeServiceAccountTokenProvider(path: string): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const token = (await readFile(path, "utf8")).trim();
      if (!token) {
        throw new Error("The mounted GKE service account token file is empty.");
      }
      return token;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: mountedGkeServiceAccountTokenProvider(tokenPath),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from Google GKE workload identity federation.",
});

console.log(response.output_text);
```

```python
import os
from pathlib import Path

from openai import OpenAI
from openai.auth import SubjectTokenProvider

TOKEN_PATH = "/var/run/secrets/tokens/token"


def mounted_gke_service_account_token_provider(token_path: str) -> SubjectTokenProvider:
    def get_token() -> str:
        token = Path(token_path).read_text().strip()
        if not token:
            raise RuntimeError("The mounted GKE service account token file is empty.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": mounted_gke_service_account_token_provider(TOKEN_PATH),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from Google GKE workload identity federation.",
)

print(response.output_text)
```

750```go

751package main

752

753import (

754 "context"

755 "fmt"

756 "log"

757 "os"

758 "strings"

759

760 "github.com/openai/openai-go/v3"

761 "github.com/openai/openai-go/v3/auth"

762 "github.com/openai/openai-go/v3/option"

763 "github.com/openai/openai-go/v3/responses"

764)

765

766const tokenPath = "/var/run/secrets/tokens/token"

767

768type mountedGkeServiceAccountTokenProvider struct {

769 path string

770}

771

772func (p mountedGkeServiceAccountTokenProvider) TokenType() auth.SubjectTokenType {

773 return auth.SubjectTokenTypeJWT

774}

775

776func (p mountedGkeServiceAccountTokenProvider) GetToken(_ context.Context, _ auth.HTTPDoer) (string, error) {

777 data, err := os.ReadFile(p.path)

778 if err != nil {

779 return "", &auth.SubjectTokenProviderError{

780 Provider: "google-gke",

781 Message: "failed to read mounted GKE service account token",

782 Cause: err,

783 }

784 }

785

786 token := strings.TrimSpace(string(data))

787 if token == "" {

788 return "", &auth.SubjectTokenProviderError{

789 Provider: "google-gke",

790 Message: "mounted GKE service account token is empty",

791 }

792 }

793

794 return token, nil

795}

796

797func main() {

798 client := openai.NewClient(

799 option.WithWorkloadIdentity(auth.WorkloadIdentity{

800 IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),

801 ServiceAccountID: os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),

802 Provider: mountedGkeServiceAccountTokenProvider{

803 path: tokenPath,

804 },

805 }),

806 )

807

808 response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{

809 Model: openai.ChatModelGPT4_1Mini,

810 Input: responses.ResponseNewParamsInputUnion{

811 OfString: openai.String("Say hello from Google GKE workload identity federation."),

812 },

813 })

814 if err != nil {

815 log.Fatal(err)

816 }

817

818 fmt.Println(response.OutputText())

819}

820```

821

822```java

823import com.fasterxml.jackson.databind.json.JsonMapper;

824import com.openai.auth.SubjectTokenProvider;

825import com.openai.auth.SubjectTokenType;

826import com.openai.auth.WorkloadIdentity;

827import com.openai.client.OpenAIClient;

828import com.openai.client.okhttp.OpenAIOkHttpClient;

829import com.openai.core.http.HttpClient;

830import com.openai.errors.SubjectTokenProviderException;

831import com.openai.models.ChatModel;

832import com.openai.models.responses.ResponseCreateParams;

833import java.nio.file.Files;

834import java.nio.file.Path;

835import java.util.concurrent.CompletableFuture;

836

837public final class GoogleGkeWorkloadIdentityExample {

838 private static final String TOKEN_PATH = "/var/run/secrets/tokens/token";

839

840 private GoogleGkeWorkloadIdentityExample() {}

841

842 static final class MountedGkeServiceAccountTokenProvider implements SubjectTokenProvider {

843 private final Path tokenPath;

844

845 MountedGkeServiceAccountTokenProvider(String tokenPath) {

846 this.tokenPath = Path.of(tokenPath);

847 }

848

849 @Override

850 public SubjectTokenType tokenType() {

851 return SubjectTokenType.JWT;

852 }

853

854 @Override

855 public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {

856 String token;

857 try {

858 token = Files.readString(tokenPath).trim();

859 } catch (Exception e) {

860 throw new SubjectTokenProviderException(

861 "google-gke",

862 "failed to read mounted GKE service account token",

863 e);

864 }

865

866 if (token.isEmpty()) {

867 throw new SubjectTokenProviderException(

868 "google-gke",

869 "mounted GKE service account token is empty",

870 null);

871 }

872

873 return token;

874 }

875

876 @Override

877 public CompletableFuture<String> getTokenAsync(

878 HttpClient httpClient, JsonMapper jsonMapper) {

879 return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));

880 }

881 }

882

883 public static void main(String[] args) {

884 WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()

885 .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))

886 .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))

887 .provider(new MountedGkeServiceAccountTokenProvider(TOKEN_PATH))

888 .build();

889

890 OpenAIClient client = OpenAIOkHttpClient.builder()

891 .workloadIdentity(workloadIdentity)

892 .build();

893

894 ResponseCreateParams params = ResponseCreateParams.builder()

895 .model(ChatModel.GPT_4_1_MINI)

896 .input("Say hello from Google GKE workload identity federation.")

897 .build();

898

899 client.responses().create(params).output().stream()

900 .flatMap(item -> item.message().stream())

901 .flatMap(message -> message.content().stream())

902 .flatMap(content -> content.outputText().stream())

903 .forEach(outputText -> System.out.println(outputText.text()));

904 }

905}

906```

907

908```ruby

909require "openai"

910

911TOKEN_PATH = "/var/run/secrets/tokens/token"

912

913class MountedGkeServiceAccountTokenProvider

914 include OpenAI::Auth::SubjectTokenProvider

915

916 def initialize(token_path:)

917 @token_path = token_path

918 end

919

920 def token_type

921 OpenAI::Auth::TokenType::JWT

922 end

923

924 def get_token

925 token = File.read(@token_path).strip

926 if token.empty?

927 raise OpenAI::Errors::SubjectTokenProviderError.new(

928 message: "Mounted GKE service account token is empty",

929 provider: "google-gke"

930 )

931 end

932 token

933 rescue SystemCallError => e

934 raise OpenAI::Errors::SubjectTokenProviderError.new(

935 message: "Failed to read mounted GKE service account token: #{e.message}",

936 provider: "google-gke",

937 cause: e

938 )

939 end

940end

941

942provider = MountedGkeServiceAccountTokenProvider.new(token_path: TOKEN_PATH)

943

944workload_identity = OpenAI::Auth::WorkloadIdentity.new(

945 identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),

946 service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),

947 provider: provider

948)

949

950client = OpenAI::Client.new(workload_identity: workload_identity)

951

952response = client.responses.create(

953 model: "gpt-5.4-mini",

954 input: "Say hello from Google GKE workload identity federation."

955)

956

957puts(response.output_text)

958```

  </div>

guides/workload-identity-federation/kubernetes.md +291 −0


The following examples initialize an OpenAI client with a custom subject token provider. The provider reads the projected Kubernetes service account token from the mounted file path and uses it as the subject token for workload identity federation.

Authenticate from a Kubernetes projected service account token

```typescript
import { readFile } from "node:fs/promises";
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const tokenPath = "/var/run/secrets/tokens/token";
const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;

if (!identityProviderId || !serviceAccountId) {
  throw new Error("Set OPENAI_IDENTITY_PROVIDER_ID and OPENAI_SERVICE_ACCOUNT_ID");
}

function mountedServiceAccountTokenProvider(path: string): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const token = (await readFile(path, "utf8")).trim();
      if (!token) {
        throw new Error("The mounted service account token file is empty.");
      }
      return token;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: mountedServiceAccountTokenProvider(tokenPath),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from Kubernetes workload identity federation.",
});

console.log(response.output_text);
```

```python
import os
from pathlib import Path

from openai import OpenAI
from openai.auth import SubjectTokenProvider

TOKEN_PATH = "/var/run/secrets/tokens/token"


def mounted_service_account_token_provider(token_path: str) -> SubjectTokenProvider:
    def get_token() -> str:
        token = Path(token_path).read_text().strip()
        if not token:
            raise RuntimeError("The mounted service account token file is empty.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": mounted_service_account_token_provider(TOKEN_PATH),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from Kubernetes workload identity federation.",
)

print(response.output_text)
```

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"strings"

	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/auth"
	"github.com/openai/openai-go/v3/option"
	"github.com/openai/openai-go/v3/responses"
)

const tokenPath = "/var/run/secrets/tokens/token"

type mountedServiceAccountTokenProvider struct {
	path string
}

func (p mountedServiceAccountTokenProvider) TokenType() auth.SubjectTokenType {
	return auth.SubjectTokenTypeJWT
}

func (p mountedServiceAccountTokenProvider) GetToken(ctx context.Context, _ auth.HTTPDoer) (string, error) {
	data, err := os.ReadFile(p.path)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "kubernetes",
			Message:  "failed to read mounted service account token",
			Cause:    err,
		}
	}

	token := strings.TrimSpace(string(data))
	if token == "" {
		return "", &auth.SubjectTokenProviderError{
			Provider: "kubernetes",
			Message:  "mounted service account token is empty",
		}
	}

	return token, nil
}

func main() {
	client := openai.NewClient(
		option.WithWorkloadIdentity(auth.WorkloadIdentity{
			IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),
			ServiceAccountID:   os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),
			Provider: mountedServiceAccountTokenProvider{
				path: tokenPath,
			},
		}),
	)

	response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{
		Model: openai.ChatModelGPT4_1Mini,
		Input: responses.ResponseNewParamsInputUnion{
			OfString: openai.String("Say hello from Kubernetes workload identity federation."),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.OutputText())
}
```

```java
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.openai.auth.SubjectTokenProvider;
import com.openai.auth.SubjectTokenType;
import com.openai.auth.WorkloadIdentity;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.HttpClient;
import com.openai.errors.SubjectTokenProviderException;
import com.openai.models.ChatModel;
import com.openai.models.responses.ResponseCreateParams;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

public final class KubernetesWorkloadIdentityExample {
    private static final String TOKEN_PATH = "/var/run/secrets/tokens/token";

    private KubernetesWorkloadIdentityExample() {}

    static final class MountedServiceAccountTokenProvider implements SubjectTokenProvider {
        private final Path tokenPath;

        MountedServiceAccountTokenProvider(String tokenPath) {
            this.tokenPath = Path.of(tokenPath);
        }

        @Override
        public SubjectTokenType tokenType() {
            return SubjectTokenType.JWT;
        }

        @Override
        public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {
            String token;
            try {
                token = Files.readString(tokenPath).trim();
            } catch (Exception e) {
                throw new SubjectTokenProviderException(
                        "kubernetes",
                        "failed to read mounted service account token",
                        e);
            }

            if (token.isEmpty()) {
                throw new SubjectTokenProviderException(
                        "kubernetes",
                        "mounted service account token is empty",
                        null);
            }

            return token;
        }

        @Override
        public CompletableFuture<String> getTokenAsync(
                HttpClient httpClient, JsonMapper jsonMapper) {
            return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));
        }
    }

    public static void main(String[] args) {
        WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()
                .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))
                .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))
                .provider(new MountedServiceAccountTokenProvider(TOKEN_PATH))
                .build();

        OpenAIClient client = OpenAIOkHttpClient.builder()
                .workloadIdentity(workloadIdentity)
                .build();

        ResponseCreateParams params = ResponseCreateParams.builder()
                .model(ChatModel.GPT_4_1_MINI)
                .input("Say hello from Kubernetes workload identity federation.")
                .build();

        client.responses().create(params).output().stream()
                .flatMap(item -> item.message().stream())
                .flatMap(message -> message.content().stream())
                .flatMap(content -> content.outputText().stream())
                .forEach(outputText -> System.out.println(outputText.text()));
    }
}
```

```ruby
require "openai"

TOKEN_PATH = "/var/run/secrets/tokens/token"

class MountedServiceAccountTokenProvider
  include OpenAI::Auth::SubjectTokenProvider

  def initialize(token_path:)
    @token_path = token_path
  end

  def token_type
    OpenAI::Auth::TokenType::JWT
  end

  def get_token
    token = File.read(@token_path).strip
    if token.empty?
      raise OpenAI::Errors::SubjectTokenProviderError.new(
        message: "Mounted service account token is empty",
        provider: "kubernetes"
      )
    end
    token
  rescue SystemCallError => e
    raise OpenAI::Errors::SubjectTokenProviderError.new(
      message: "Failed to read mounted service account token: #{e.message}",
      provider: "kubernetes",
      cause: e
    )
  end
end

provider = MountedServiceAccountTokenProvider.new(token_path: TOKEN_PATH)

workload_identity = OpenAI::Auth::WorkloadIdentity.new(
  identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),
  service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),
  provider: provider
)

client = OpenAI::Client.new(workload_identity: workload_identity)

response = client.responses.create(
  model: "gpt-5.4-mini",
  input: "Say hello from Kubernetes workload identity federation."
)

puts(response.output_text)
```

## Kubernetes best practices

- Use a stable OIDC issuer. The issuer URL must match the projected service account token `iss` claim and should remain stable across cluster upgrades and maintenance operations.
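
One way to check the issuer requirement above is to read the `iss` claim from a projected token and compare it with the issuer URL registered for the identity provider. The sketch below is a minimal illustration and not part of the OpenAI SDK: `unverified_claims` and `issuer_matches` are hypothetical helper names, the decoding deliberately skips signature verification (it is meant for inspection only), and the exact string comparison is an assumption about how strictly the two values must agree.

```python
import base64
import json


def unverified_claims(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    # JWT segments are base64url-encoded without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def issuer_matches(token: str, expected_issuer: str) -> bool:
    """Return True when the token's `iss` claim equals the expected issuer URL."""
    return unverified_claims(token).get("iss") == expected_issuer
```

Inside a pod, the token argument could be the stripped contents of the mounted token file used in the examples above, and the expected issuer would be whatever URL was configured for the provider.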

guides/workload-identity-federation/microsoft-azure.md +726 −2

Set `OPENAI_WIF_AUDIENCE` to the Microsoft Entra Application ID URI configured as the Workload Identity Provider audience. The SDK requests a managed identity token for that audience, exchanges it for an OpenAI-issued access token, and uses the OpenAI token to authenticate API requests.

Authenticate from an Azure managed identity token

```typescript
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const imdsEndpoint = "http://169.254.169.254/metadata/identity/oauth2/token";

const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;
const audience = process.env.OPENAI_WIF_AUDIENCE;

if (!identityProviderId || !serviceAccountId || !audience) {
  throw new Error(
    "Set OPENAI_IDENTITY_PROVIDER_ID, OPENAI_SERVICE_ACCOUNT_ID, and OPENAI_WIF_AUDIENCE"
  );
}

function azureManagedIdentityTokenProvider(resource: string): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const url = new URL(imdsEndpoint);
      url.searchParams.set("api-version", "2018-02-01");
      url.searchParams.set("resource", resource);

      const clientId = process.env.AZURE_CLIENT_ID;
      if (clientId) {
        url.searchParams.set("client_id", clientId);
      }

      const response = await fetch(url, {
        headers: { Metadata: "true" },
      });

      if (!response.ok) {
        throw new Error(
          `Azure IMDS token request failed with status ${response.status}.`
        );
      }

      const body = (await response.json()) as { access_token?: string };
      if (!body.access_token) {
        throw new Error("Azure IMDS did not return an access token.");
      }

      return body.access_token;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: azureManagedIdentityTokenProvider(audience),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from Azure managed identity workload identity federation.",
});

console.log(response.output_text);
```

```python
import json
import os
from urllib.parse import urlencode
from urllib.request import Request, urlopen

from openai import OpenAI
from openai.auth import SubjectTokenProvider

IMDS_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def azure_managed_identity_token_provider(resource: str) -> SubjectTokenProvider:
    def get_token() -> str:
        params = {
            "api-version": "2018-02-01",
            "resource": resource,
        }

        client_id = os.environ.get("AZURE_CLIENT_ID")
        if client_id:
            params["client_id"] = client_id

        request = Request(
            f"{IMDS_ENDPOINT}?{urlencode(params)}",
            headers={"Metadata": "true"},
        )

        with urlopen(request, timeout=10) as response:
            body = json.loads(response.read().decode("utf-8"))

        token = body.get("access_token", "")
        if not token:
            raise RuntimeError("Azure IMDS did not return an access token.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": azure_managed_identity_token_provider(
            os.environ["OPENAI_WIF_AUDIENCE"]
        ),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from Azure managed identity workload identity federation.",
)

print(response.output_text)
```

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/url"
	"os"

	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/auth"
	"github.com/openai/openai-go/v3/option"
	"github.com/openai/openai-go/v3/responses"
)

const azureIMDSEndpoint = "http://169.254.169.254/metadata/identity/oauth2/token"

type azureManagedIdentityTokenProvider struct {
	resource string
}

func (p azureManagedIdentityTokenProvider) TokenType() auth.SubjectTokenType {
	return auth.SubjectTokenTypeJWT
}

func (p azureManagedIdentityTokenProvider) GetToken(ctx context.Context, httpClient auth.HTTPDoer) (string, error) {
	values := url.Values{}
	values.Set("api-version", "2018-02-01")
	values.Set("resource", p.resource)
	if clientID := os.Getenv("AZURE_CLIENT_ID"); clientID != "" {
		values.Set("client_id", clientID)
	}

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, azureIMDSEndpoint+"?"+values.Encode(), nil)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "azure-managed-identity",
			Message:  "failed to build Azure IMDS token request",
			Cause:    err,
		}
	}
	req.Header.Set("Metadata", "true")

	resp, err := httpClient.Do(req)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "azure-managed-identity",
			Message:  "failed to request Azure managed identity token",
			Cause:    err,
		}
	}
	defer resp.Body.Close()

	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		return "", &auth.SubjectTokenProviderError{
			Provider: "azure-managed-identity",
			Message:  fmt.Sprintf("Azure IMDS token request failed with status %d", resp.StatusCode),
		}
	}

	var body struct {
		AccessToken string `json:"access_token"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "azure-managed-identity",
			Message:  "failed to decode Azure IMDS token response",
			Cause:    err,
		}
	}
	if body.AccessToken == "" {
		return "", &auth.SubjectTokenProviderError{
			Provider: "azure-managed-identity",
			Message:  "Azure IMDS did not return an access token",
		}
	}

	return body.AccessToken, nil
}

func main() {
	audience := os.Getenv("OPENAI_WIF_AUDIENCE")
	if audience == "" {
		log.Fatal("Set OPENAI_WIF_AUDIENCE")
	}

	client := openai.NewClient(
		option.WithWorkloadIdentity(auth.WorkloadIdentity{
			IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),
			ServiceAccountID:   os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),
			Provider: azureManagedIdentityTokenProvider{
				resource: audience,
			},
		}),
	)

	response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{
		Model: openai.ChatModelGPT4_1Mini,
		Input: responses.ResponseNewParamsInputUnion{
			OfString: openai.String("Say hello from Azure managed identity workload identity federation."),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.OutputText())
}
```

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.openai.auth.SubjectTokenProvider;
import com.openai.auth.SubjectTokenType;
import com.openai.auth.WorkloadIdentity;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.HttpClient;
import com.openai.errors.SubjectTokenProviderException;
import com.openai.models.ChatModel;
import com.openai.models.responses.ResponseCreateParams;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;

public final class AzureManagedIdentityWorkloadIdentityExample {
    private static final String IMDS_ENDPOINT =
            "http://169.254.169.254/metadata/identity/oauth2/token";

    private AzureManagedIdentityWorkloadIdentityExample() {}

    static final class AzureManagedIdentityTokenProvider implements SubjectTokenProvider {
        private final String resource;

        AzureManagedIdentityTokenProvider(String resource) {
            this.resource = resource;
        }

        @Override
        public SubjectTokenType tokenType() {
            return SubjectTokenType.JWT;
        }

        @Override
        public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {
            try {
                String query = "api-version=2018-02-01&resource="
                        + URLEncoder.encode(resource, StandardCharsets.UTF_8);
                String clientId = System.getenv("AZURE_CLIENT_ID");
                if (clientId != null && !clientId.isEmpty()) {
                    query += "&client_id="
                            + URLEncoder.encode(clientId, StandardCharsets.UTF_8);
                }

                HttpRequest request = HttpRequest.newBuilder()
                        .uri(URI.create(IMDS_ENDPOINT + "?" + query))
                        .header("Metadata", "true")
                        .GET()
                        .build();

                HttpResponse<String> response = java.net.http.HttpClient.newHttpClient()
                        .send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() < 200 || response.statusCode() >= 300) {
                    throw new SubjectTokenProviderException(
                            "azure-managed-identity",
                            "Azure IMDS token request failed with status "
                                    + response.statusCode(),
                            null);
                }

                JsonNode body = jsonMapper.readTree(response.body());
                String token = body.path("access_token").asText();
                if (token.isEmpty()) {
                    throw new SubjectTokenProviderException(
                            "azure-managed-identity",
                            "Azure IMDS did not return an access token",
                            null);
                }

                return token;
            } catch (SubjectTokenProviderException e) {
                throw e;
            } catch (Exception e) {
                throw new SubjectTokenProviderException(
                        "azure-managed-identity",
                        "failed to request Azure managed identity token",
                        e);
            }
        }

        @Override
        public CompletableFuture<String> getTokenAsync(
                HttpClient httpClient, JsonMapper jsonMapper) {
            return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));
        }
    }

    public static void main(String[] args) {
        WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()
                .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))
                .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))
                .provider(new AzureManagedIdentityTokenProvider(
                        System.getenv("OPENAI_WIF_AUDIENCE")))
                .build();

        OpenAIClient client = OpenAIOkHttpClient.builder()
                .workloadIdentity(workloadIdentity)
                .build();

        ResponseCreateParams params = ResponseCreateParams.builder()
                .model(ChatModel.GPT_4_1_MINI)
                .input("Say hello from Azure managed identity workload identity federation.")
                .build();

        client.responses().create(params).output().stream()
                .flatMap(item -> item.message().stream())
                .flatMap(message -> message.content().stream())
                .flatMap(content -> content.outputText().stream())
                .forEach(outputText -> System.out.println(outputText.text()));
    }
}
```

```ruby
require "json"
require "net/http"
require "openai"
require "uri"

class AzureManagedIdentityTokenProvider
  include OpenAI::Auth::SubjectTokenProvider

  IMDS_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"

  def initialize(resource:)
    @resource = resource
  end

  def token_type
    OpenAI::Auth::TokenType::JWT
  end

  def get_token
    uri = URI(IMDS_ENDPOINT)
    params = {
      "api-version" => "2018-02-01",
      "resource" => @resource
    }
    params["client_id"] = ENV["AZURE_CLIENT_ID"] if ENV["AZURE_CLIENT_ID"]
    uri.query = URI.encode_www_form(params)

    request = Net::HTTP::Get.new(uri)
    request["Metadata"] = "true"

    response = Net::HTTP.start(uri.hostname, uri.port, read_timeout: 10) do |http|
      http.request(request)
    end

    unless response.is_a?(Net::HTTPSuccess)
      raise OpenAI::Errors::SubjectTokenProviderError.new(
        message: "Azure IMDS token request failed with status #{response.code}",
        provider: "azure-managed-identity"
      )
    end

    token = JSON.parse(response.body).fetch("access_token", "")
    if token.empty?
      raise OpenAI::Errors::SubjectTokenProviderError.new(
        message: "Azure IMDS did not return an access token",
        provider: "azure-managed-identity"
      )
    end
    token
  rescue JSON::ParserError, SystemCallError => e
    raise OpenAI::Errors::SubjectTokenProviderError.new(
      message: "Failed to request Azure managed identity token: #{e.message}",
      provider: "azure-managed-identity",
      cause: e
    )
  end
end

provider = AzureManagedIdentityTokenProvider.new(
  resource: ENV.fetch("OPENAI_WIF_AUDIENCE")
)

workload_identity = OpenAI::Auth::WorkloadIdentity.new(
  identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),
  service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),
  provider: provider
)

client = OpenAI::Client.new(workload_identity: workload_identity)

response = client.responses.create(
  model: "gpt-5.4-mini",
  input: "Say hello from Azure managed identity workload identity federation."
)

puts(response.output_text)
```

  </div>

  <div data-content-switcher-pane data-value="aks" hidden>

122 555

271 704

The following examples initialize an OpenAI client with a custom subject token provider. The provider reads the projected AKS service account token from the mounted file path and uses it as the subject token for workload identity federation.

Authenticate from an AKS projected service account token

```typescript
import { readFile } from "node:fs/promises";
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const tokenPath = "/var/run/secrets/tokens/token";
const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;

if (!identityProviderId || !serviceAccountId) {
  throw new Error("Set OPENAI_IDENTITY_PROVIDER_ID and OPENAI_SERVICE_ACCOUNT_ID");
}

function mountedAksServiceAccountTokenProvider(path: string): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const token = (await readFile(path, "utf8")).trim();
      if (!token) {
        throw new Error("The mounted AKS service account token file is empty.");
      }
      return token;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: mountedAksServiceAccountTokenProvider(tokenPath),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from AKS workload identity federation.",
});

console.log(response.output_text);
```

```python
import os
from pathlib import Path

from openai import OpenAI
from openai.auth import SubjectTokenProvider

TOKEN_PATH = "/var/run/secrets/tokens/token"


def mounted_aks_service_account_token_provider(token_path: str) -> SubjectTokenProvider:
    def get_token() -> str:
        token = Path(token_path).read_text().strip()
        if not token:
            raise RuntimeError("The mounted AKS service account token file is empty.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": mounted_aks_service_account_token_provider(TOKEN_PATH),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from AKS workload identity federation.",
)

print(response.output_text)
```

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"strings"

	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/auth"
	"github.com/openai/openai-go/v3/option"
	"github.com/openai/openai-go/v3/responses"
)

const tokenPath = "/var/run/secrets/tokens/token"

type mountedAksServiceAccountTokenProvider struct {
	path string
}

func (p mountedAksServiceAccountTokenProvider) TokenType() auth.SubjectTokenType {
	return auth.SubjectTokenTypeJWT
}

func (p mountedAksServiceAccountTokenProvider) GetToken(_ context.Context, _ auth.HTTPDoer) (string, error) {
	data, err := os.ReadFile(p.path)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "azure-aks",
			Message:  "failed to read mounted AKS service account token",
			Cause:    err,
		}
	}

	token := strings.TrimSpace(string(data))
	if token == "" {
		return "", &auth.SubjectTokenProviderError{
			Provider: "azure-aks",
			Message:  "mounted AKS service account token is empty",
		}
	}

	return token, nil
}

func main() {
	client := openai.NewClient(
		option.WithWorkloadIdentity(auth.WorkloadIdentity{
			IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),
			ServiceAccountID:   os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),
			Provider: mountedAksServiceAccountTokenProvider{
				path: tokenPath,
			},
		}),
	)

	response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{
		Model: openai.ChatModelGPT4_1Mini,
		Input: responses.ResponseNewParamsInputUnion{
			OfString: openai.String("Say hello from AKS workload identity federation."),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.OutputText())
}
```

```java
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.openai.auth.SubjectTokenProvider;
import com.openai.auth.SubjectTokenType;
import com.openai.auth.WorkloadIdentity;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.HttpClient;
import com.openai.errors.SubjectTokenProviderException;
import com.openai.models.ChatModel;
import com.openai.models.responses.ResponseCreateParams;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

public final class AzureAksWorkloadIdentityExample {
    private static final String TOKEN_PATH = "/var/run/secrets/tokens/token";

    private AzureAksWorkloadIdentityExample() {}

    static final class MountedAksServiceAccountTokenProvider implements SubjectTokenProvider {
        private final Path tokenPath;

        MountedAksServiceAccountTokenProvider(String tokenPath) {
            this.tokenPath = Path.of(tokenPath);
        }

        @Override
        public SubjectTokenType tokenType() {
            return SubjectTokenType.JWT;
        }

        @Override
        public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {
            String token;
            try {
                token = Files.readString(tokenPath).trim();
            } catch (Exception e) {
                throw new SubjectTokenProviderException(
                        "azure-aks",
                        "failed to read mounted AKS service account token",
                        e);
            }

            if (token.isEmpty()) {
                throw new SubjectTokenProviderException(
                        "azure-aks",
                        "mounted AKS service account token is empty",
                        null);
            }

            return token;
        }

        @Override
        public CompletableFuture<String> getTokenAsync(
                HttpClient httpClient, JsonMapper jsonMapper) {
            return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));
        }
    }

    public static void main(String[] args) {
        WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()
                .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))
                .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))
                .provider(new MountedAksServiceAccountTokenProvider(TOKEN_PATH))
                .build();

        OpenAIClient client = OpenAIOkHttpClient.builder()
                .workloadIdentity(workloadIdentity)
                .build();

        ResponseCreateParams params = ResponseCreateParams.builder()
                .model(ChatModel.GPT_4_1_MINI)
                .input("Say hello from AKS workload identity federation.")
                .build();

        client.responses().create(params).output().stream()
                .flatMap(item -> item.message().stream())
                .flatMap(message -> message.content().stream())
                .flatMap(content -> content.outputText().stream())
                .forEach(outputText -> System.out.println(outputText.text()));
    }
}
```

```ruby
require "openai"

TOKEN_PATH = "/var/run/secrets/tokens/token"

class MountedAksServiceAccountTokenProvider
  include OpenAI::Auth::SubjectTokenProvider

  def initialize(token_path:)
    @token_path = token_path
  end

  def token_type
    OpenAI::Auth::TokenType::JWT
  end

  def get_token
    token = File.read(@token_path).strip
    if token.empty?
      raise OpenAI::Errors::SubjectTokenProviderError.new(
        message: "Mounted AKS service account token is empty",
        provider: "azure-aks"
      )
    end
    token
  rescue SystemCallError => e
    raise OpenAI::Errors::SubjectTokenProviderError.new(
      message: "Failed to read mounted AKS service account token: #{e.message}",
      provider: "azure-aks",
      cause: e
    )
  end
end

provider = MountedAksServiceAccountTokenProvider.new(token_path: TOKEN_PATH)

workload_identity = OpenAI::Auth::WorkloadIdentity.new(
  identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),
  service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),
  provider: provider
)

client = OpenAI::Client.new(workload_identity: workload_identity)

response = client.responses.create(
  model: "gpt-5.4-mini",
  input: "Say hello from AKS workload identity federation."
)

puts(response.output_text)
```

  </div>

guides/workload-identity-federation/spiffe.md +291 −0

Set `OPENAI_IDENTITY_PROVIDER_ID` and `OPENAI_SERVICE_ACCOUNT_ID` in the workload environment. The token file contains the external subject token. `OPENAI_IDENTITY_PROVIDER_ID` identifies the OpenAI Workload Identity Provider, and `OPENAI_SERVICE_ACCOUNT_ID` identifies the target OpenAI service account. OpenAI then finds a matching mapping for that provider and service account based on the token claims.
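
When a mapping does not match as expected, it can help to look at the claims the JWT-SVID actually carries. The sketch below is illustrative only and not part of the OpenAI SDK: `svid_claims_summary` is a hypothetical helper, it decodes the token without verifying its signature, and it assumes the standard JWT-SVID claims `sub` (the SPIFFE ID), `aud`, and `exp`. Which of these claims a mapping matches on depends on how the provider is configured.

```python
import base64
import json
from datetime import datetime, timezone


def svid_claims_summary(token: str) -> dict:
    """Summarize the identity-related claims of a JWT-SVID (signature not verified)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    # Restore the base64url padding that JWTs omit, then decode the payload.
    payload = parts[1] + "=" * (-len(parts[1]) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload))

    # `aud` may be a single string or a list of strings; normalize to a list.
    audiences = claims.get("aud", [])
    if isinstance(audiences, str):
        audiences = [audiences]

    expires = claims.get("exp")
    return {
        "spiffe_id": claims.get("sub"),
        "audiences": audiences,
        "expires_at": (
            datetime.fromtimestamp(expires, tz=timezone.utc).isoformat()
            if expires is not None
            else None
        ),
    }
```

Calling the helper on the stripped contents of the token file returns a small dictionary that can be compared against the mapping's expected subject and audience.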

163 163

164Authenticate from a SPIFFE JWT-SVID

165

```typescript
import { readFile } from "node:fs/promises";
import OpenAI from "openai";
import type { SubjectTokenProvider } from "openai/auth";

const tokenPath = "/var/run/spiffe/openai.jwt";
const identityProviderId = process.env.OPENAI_IDENTITY_PROVIDER_ID;
const serviceAccountId = process.env.OPENAI_SERVICE_ACCOUNT_ID;

if (!identityProviderId || !serviceAccountId) {
  throw new Error("Set OPENAI_IDENTITY_PROVIDER_ID and OPENAI_SERVICE_ACCOUNT_ID");
}

function spiffeJwtSvidProvider(path: string): SubjectTokenProvider {
  return {
    tokenType: "jwt",
    getToken: async () => {
      const token = (await readFile(path, "utf8")).trim();
      if (!token) {
        throw new Error("The SPIFFE JWT-SVID file is empty.");
      }
      return token;
    },
  };
}

const client = new OpenAI({
  workloadIdentity: {
    identityProviderId,
    serviceAccountId,
    provider: spiffeJwtSvidProvider(tokenPath),
  },
});

const response = await client.responses.create({
  model: "gpt-5.4-mini",
  input: "Say hello from SPIFFE workload identity federation.",
});

console.log(response.output_text);
```

```python
import os
from pathlib import Path

from openai import OpenAI
from openai.auth import SubjectTokenProvider

TOKEN_PATH = "/var/run/spiffe/openai.jwt"


def spiffe_jwt_svid_provider(token_path: str) -> SubjectTokenProvider:
    def get_token() -> str:
        token = Path(token_path).read_text().strip()
        if not token:
            raise RuntimeError("The SPIFFE JWT-SVID file is empty.")
        return token

    return {"token_type": "jwt", "get_token": get_token}


client = OpenAI(
    workload_identity={
        "identity_provider_id": os.environ["OPENAI_IDENTITY_PROVIDER_ID"],
        "service_account_id": os.environ["OPENAI_SERVICE_ACCOUNT_ID"],
        "provider": spiffe_jwt_svid_provider(TOKEN_PATH),
    },
)

response = client.responses.create(
    model="gpt-5.4-mini",
    input="Say hello from SPIFFE workload identity federation.",
)

print(response.output_text)
```

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"strings"

	"github.com/openai/openai-go/v3"
	"github.com/openai/openai-go/v3/auth"
	"github.com/openai/openai-go/v3/option"
	"github.com/openai/openai-go/v3/responses"
)

const tokenPath = "/var/run/spiffe/openai.jwt"

type spiffeJWTSVIDProvider struct {
	path string
}

func (p spiffeJWTSVIDProvider) TokenType() auth.SubjectTokenType {
	return auth.SubjectTokenTypeJWT
}

func (p spiffeJWTSVIDProvider) GetToken(ctx context.Context, _ auth.HTTPDoer) (string, error) {
	data, err := os.ReadFile(p.path)
	if err != nil {
		return "", &auth.SubjectTokenProviderError{
			Provider: "spiffe",
			Message:  "failed to read SPIFFE JWT-SVID",
			Cause:    err,
		}
	}

	token := strings.TrimSpace(string(data))
	if token == "" {
		return "", &auth.SubjectTokenProviderError{
			Provider: "spiffe",
			Message:  "SPIFFE JWT-SVID file is empty",
		}
	}

	return token, nil
}

func main() {
	client := openai.NewClient(
		option.WithWorkloadIdentity(auth.WorkloadIdentity{
			IdentityProviderID: os.Getenv("OPENAI_IDENTITY_PROVIDER_ID"),
			ServiceAccountID:   os.Getenv("OPENAI_SERVICE_ACCOUNT_ID"),
			Provider: spiffeJWTSVIDProvider{
				path: tokenPath,
			},
		}),
	)

	response, err := client.Responses.New(context.Background(), responses.ResponseNewParams{
		Model: openai.ChatModelGPT4_1Mini,
		Input: responses.ResponseNewParamsInputUnion{
			OfString: openai.String("Say hello from SPIFFE workload identity federation."),
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(response.OutputText())
}
```

```java
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.openai.auth.SubjectTokenProvider;
import com.openai.auth.SubjectTokenType;
import com.openai.auth.WorkloadIdentity;
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.HttpClient;
import com.openai.errors.SubjectTokenProviderException;
import com.openai.models.ChatModel;
import com.openai.models.responses.ResponseCreateParams;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

public final class SpiffeWorkloadIdentityExample {
    private static final String TOKEN_PATH = "/var/run/spiffe/openai.jwt";

    private SpiffeWorkloadIdentityExample() {}

    static final class SpiffeJwtSvidProvider implements SubjectTokenProvider {
        private final Path tokenPath;

        SpiffeJwtSvidProvider(String tokenPath) {
            this.tokenPath = Path.of(tokenPath);
        }

        @Override
        public SubjectTokenType tokenType() {
            return SubjectTokenType.JWT;
        }

        @Override
        public String getToken(HttpClient httpClient, JsonMapper jsonMapper) {
            String token;
            try {
                token = Files.readString(tokenPath).trim();
            } catch (Exception e) {
                throw new SubjectTokenProviderException(
                        "spiffe",
                        "failed to read SPIFFE JWT-SVID",
                        e);
            }

            if (token.isEmpty()) {
                throw new SubjectTokenProviderException(
                        "spiffe",
                        "SPIFFE JWT-SVID file is empty",
                        null);
            }

            return token;
        }

        @Override
        public CompletableFuture<String> getTokenAsync(
                HttpClient httpClient, JsonMapper jsonMapper) {
            return CompletableFuture.supplyAsync(() -> getToken(httpClient, jsonMapper));
        }
    }

    public static void main(String[] args) {
        WorkloadIdentity workloadIdentity = WorkloadIdentity.builder()
                .identityProviderId(System.getenv("OPENAI_IDENTITY_PROVIDER_ID"))
                .serviceAccountId(System.getenv("OPENAI_SERVICE_ACCOUNT_ID"))
                .provider(new SpiffeJwtSvidProvider(TOKEN_PATH))
                .build();

        OpenAIClient client = OpenAIOkHttpClient.builder()
                .workloadIdentity(workloadIdentity)
                .build();

        ResponseCreateParams params = ResponseCreateParams.builder()
                .model(ChatModel.GPT_4_1_MINI)
                .input("Say hello from SPIFFE workload identity federation.")
                .build();

        client.responses().create(params).output().stream()
                .flatMap(item -> item.message().stream())
                .flatMap(message -> message.content().stream())
                .flatMap(content -> content.outputText().stream())
                .forEach(outputText -> System.out.println(outputText.text()));
    }
}
```

```ruby
require "openai"

TOKEN_PATH = "/var/run/spiffe/openai.jwt"

class SpiffeJWTSVIDProvider
  include OpenAI::Auth::SubjectTokenProvider

  def initialize(token_path:)
    @token_path = token_path
  end

  def token_type
    OpenAI::Auth::TokenType::JWT
  end

  def get_token
    token = File.read(@token_path).strip
    if token.empty?
      raise OpenAI::Errors::SubjectTokenProviderError.new(
        message: "SPIFFE JWT-SVID file is empty",
        provider: "spiffe"
      )
    end
    token
  rescue SystemCallError => e
    raise OpenAI::Errors::SubjectTokenProviderError.new(
      message: "Failed to read SPIFFE JWT-SVID: #{e.message}",
      provider: "spiffe",
      cause: e
    )
  end
end

provider = SpiffeJWTSVIDProvider.new(token_path: TOKEN_PATH)

workload_identity = OpenAI::Auth::WorkloadIdentity.new(
  identity_provider_id: ENV.fetch("OPENAI_IDENTITY_PROVIDER_ID"),
  service_account_id: ENV.fetch("OPENAI_SERVICE_ACCOUNT_ID"),
  provider: provider
)

client = OpenAI::Client.new(workload_identity: workload_identity)

response = client.responses.create(
  model: "gpt-5.4-mini",
  input: "Say hello from SPIFFE workload identity federation."
)

puts(response.output_text)
```

## SPIFFE best practices

- Use JWT-SVIDs for OpenAI workload identity federation. X.509-SVIDs are useful for mutual TLS but aren't accepted by the OpenAI token exchange endpoint.
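
As a rough illustration of the JWT-SVID recommendation, the sketch below checks that the contents of a token file are shaped like a JWT-SVID before they are handed to a subject token provider. It assumes the SPIFFE convention that a JWT-SVID carries its SPIFFE ID in the `sub` claim. The helper name `looks_like_jwt_svid` is hypothetical and not part of any SDK, and the check decodes the payload without verifying the signature, so it is a local sanity check only.

```python
import base64
import json


def looks_like_jwt_svid(token: str) -> bool:
    """Return True if the token is shaped like a JWT whose `sub` is a SPIFFE ID.

    Hypothetical helper: it does NOT verify the signature or expiry.
    """
    parts = token.strip().split(".")
    if len(parts) != 3:
        # A compact JWT is header.payload.signature; a PEM-encoded
        # X.509-SVID does not have this shape.
        return False

    payload_b64 = parts[1]
    # Restore the base64url padding that compact JWTs omit.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    try:
        claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    except ValueError:
        # Covers invalid base64, invalid UTF-8, and invalid JSON.
        return False

    sub = claims.get("sub") if isinstance(claims, dict) else None
    return isinstance(sub, str) and sub.startswith("spiffe://")
```

A provider's token getter could call this right after reading the file and raise its usual "empty or invalid token" error when the check fails, which might surface a misconfigured X.509-SVID mount earlier than a rejected token exchange would.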

quickstart.md +0 −12


<div data-content-switcher-pane data-value="macOS">
  <div class="hidden">macOS / Linux</div>

~~Export an environment variable on macOS or Linux systems~~

~~```bash~~

~~export OPENAI_API_KEY="your_api_key_here"~~

~~```~~

  </div>
  <div data-content-switcher-pane data-value="windows" hidden>
  <div class="hidden">Windows</div>

~~Export an environment variable in PowerShell~~

~~```bash~~

~~setx OPENAI_API_KEY "your_api_key_here"~~

~~```~~

  </div>

tutorials/web-qa-embeddings.md +1 −1


The newest embeddings model can handle inputs with up to 8191 input tokens so most of the rows would not need any chunking, but this may not be the case for every subpage scraped so the next code chunk will split the longer lines into smaller chunks.

```python
max_tokens = 500

# Function to split the text into chunks of a maximum number of tokens
```

Documentation 2026-06-11 08:59 UTC to 2026-06-12 19:02 UTC