assistants/deep-dive.md +181 −175
# Assistants API deep dive

## Overview

Don't start a new integration on the Assistants API. We've announced plans to deprecate it soon, as the Responses API now provides the same features and a more elegant integration.

There are several concepts involved in building an app with the Assistants API, covered below in case it helps with your [migration to Responses](https://developers.openai.com/api/docs/guides/assistants/migration).

## Creating assistants

We recommend using OpenAI's <a href="/api/docs/models">latest models</a> with the Assistants API for best results and maximum compatibility with tools.

To get started, creating an Assistant only requires specifying the `model` to use. But you can further customize the behavior of the Assistant:

1. Use the `instructions` parameter to guide the personality of the Assistant and define its goals. Instructions are similar to system messages in the Chat Completions API.
2. Use the `tools` parameter to give the Assistant access to up to 128 tools. You can give it access to OpenAI built-in tools like `code_interpreter` and `file_search`, or call third-party tools via `function` calling.
3. Use the `tool_resources` parameter to give tools like `code_interpreter` and `file_search` access to files. Files are uploaded using the `File` [upload endpoint](https://developers.openai.com/api/docs/api-reference/files/create) and must have the `purpose` set to `assistants` to be used with this API.

For example, to create an Assistant that can create data visualizations based on a `.csv` file, first upload a file.

```python
file = client.files.create(
  file=open("revenue-forecast.csv", "rb"),
  purpose='assistants'
)
```

```javascript
const file = await openai.files.create({
  file: fs.createReadStream("revenue-forecast.csv"),
  purpose: "assistants",
});
```

```bash
curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose="assistants" \
  -F file="@revenue-forecast.csv"
```

Then, create the Assistant with the `code_interpreter` tool enabled and provide the file as a resource to the tool.

```python
assistant = client.beta.assistants.create(
  name="Data visualizer",
  description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
  # ...
    }
  }
)
```

```javascript
const assistant = await openai.beta.assistants.create({
  name: "Data visualizer",
  description: "You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
  // ...
    }
  }
});
```

```bash
curl https://api.openai.com/v1/assistants \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "name": "Data visualizer",
    "description": "You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
    ...
      }
    }
  }'
```

You can attach a maximum of 20 files to `code_interpreter` and 10,000 files to `file_search` (using `vector_store` [objects](https://developers.openai.com/api/docs/api-reference/vector-stores/object)). For vector stores created starting in November 2025, the `file_search` limit is 100,000,000 files.

Each file can be at most 512 MB in size and have a maximum of 5,000,000 tokens. By default, each project can store up to 2.5 TB of files total. There is no organization-wide storage limit. You can reach out to our support team to increase this limit.

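As a rough illustration of the limits above, the following sketch checks a batch of files locally before you upload them. The `LocalFile` type and `check_attachments` helper are hypothetical names introduced here, not part of the OpenAI SDK, and the sketch assumes "512 MB" means 512 × 1024 × 1024 bytes and that the 10,000-file `file_search` limit applies.

```python
from dataclasses import dataclass

# Limits quoted in the text above.
MAX_FILES_PER_TOOL = {
    "code_interpreter": 20,
    "file_search": 10_000,  # assumption: a vector store under the 10,000-file limit
}
MAX_FILE_BYTES = 512 * 1024 * 1024  # assumption: binary megabytes
MAX_FILE_TOKENS = 5_000_000


@dataclass
class LocalFile:
    """Hypothetical description of a file you plan to upload."""
    name: str
    size_bytes: int
    token_count: int


def check_attachments(files, tool):
    """Return a list of problems; an empty list means the batch fits the limits."""
    problems = []
    limit = MAX_FILES_PER_TOOL.get(tool)
    if limit is None:
        problems.append(f"unknown tool: {tool}")
    elif len(files) > limit:
        problems.append(f"{tool} accepts at most {limit} files, got {len(files)}")
    for f in files:
        if f.size_bytes > MAX_FILE_BYTES:
            problems.append(f"{f.name} exceeds 512 MB")
        if f.token_count > MAX_FILE_TOKENS:
            problems.append(f"{f.name} exceeds 5,000,000 tokens")
    return problems
```

Running a check like this before calling the upload endpoint avoids discovering a limit violation halfway through a batch.
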
## Managing Threads and Messages

Threads and Messages represent a conversation session between an Assistant and a user. There is a limit of 100,000 Messages per Thread. Once the size of the Messages exceeds the context window of the model, the Thread will attempt to smartly truncate messages, before fully dropping the ones it considers the least important.

You can create a Thread with an initial list of Messages like this:

```python
thread = client.beta.threads.create(
  messages=[
    {
      # ...
    }
  ]
)
```

```javascript
const thread = await openai.beta.threads.create({
  messages: [
    {
      // ...
    }
  ]
});
```

```bash
curl https://api.openai.com/v1/threads \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "messages": [
      {
        ...
      }
    ]
  }'
```

Messages can contain text, images, or file attachments. Message `attachments` are helper methods that add files to a thread's `tool_resources`. You can also choose to add files to the `thread.tool_resources` directly.

### Creating image input content

Message content can contain either external image URLs or File IDs uploaded via the [File API](https://developers.openai.com/api/docs/api-reference/files/create). Only [models](https://developers.openai.com/api/docs/models) with Vision support can accept image input. Supported image content types include png, jpg, gif, and webp. When creating image files, pass `purpose="vision"` to allow you to later download and display the input content. Projects are limited to 2.5 TB total file storage, and there is no organization-wide storage limit. Please contact us to request a limit increase.

Tools cannot access image content unless specified. To pass image files to Code Interpreter, add the file ID in the message `attachments` list to allow the tool to read and analyze the input. Image URLs cannot be downloaded in Code Interpreter today.

```python
file = client.files.create(
  file=open("myimage.png", "rb"),
  purpose="vision"
# ...
    }
  ]
)
```

```javascript
import fs from "fs";
const file = await openai.files.create({
  file: fs.createReadStream("myimage.png"),
  purpose: "vision",
// ...
    }
  ]
});
```

```bash
# Upload a file with a "vision" purpose
curl https://api.openai.com/v1/files \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F purpose="vision" \
  -F file="@/path/to/myimage.png"

# Pass the file ID in the content

curl https://api.openai.com/v1/threads \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "messages": [
      {
        ...
      }
    ]
  }'
```

#### Low or high fidelity image understanding

The `detail` parameter, which accepts `low`, `high`, or `auto`, controls how the model processes the image and generates its textual understanding.

- `low` will enable the "low res" mode. The model will receive a low-res 512px x 512px version of the image, and represent the image with a budget of 85 tokens. This allows the API to return faster responses and consume fewer input tokens for use cases that do not require high detail.
- `high` will enable "high res" mode, which first allows the model to see the low res image and then creates detailed crops of input images based on the input image size. Use the [pricing calculator](https://openai.com/api/pricing/) to see token counts for various image sizes.

```python
thread = client.beta.threads.create(
  messages=[
    {
      # ...
    }
  ]
)
```

```javascript
const thread = await openai.beta.threads.create({
  messages: [
    {
      // ...
    }
  ]
});
```

```bash
curl https://api.openai.com/v1/threads \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "messages": [
      {
        ...
      }
    ]
  }'
```

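To make the `detail` options concrete, here is a minimal sketch of a helper that builds one image content part for a message and rejects unsupported `detail` values. The `image_part` function is a hypothetical name, and the `image_url` / `image_file` dictionary shapes are an assumption about the message content format rather than something spelled out in this section.

```python
ALLOWED_DETAIL = {"low", "high", "auto"}


def image_part(*, url=None, file_id=None, detail="auto"):
    """Build one image content part from either an external URL or a File ID."""
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}")
    if (url is None) == (file_id is None):
        raise ValueError("pass exactly one of url or file_id")
    if url is not None:
        # Assumed shape for an external image URL.
        return {"type": "image_url", "image_url": {"url": url, "detail": detail}}
    # Assumed shape for an image uploaded with purpose="vision".
    return {"type": "image_file", "image_file": {"file_id": file_id, "detail": detail}}
```

For example, `image_part(url="https://example.com/image.png", detail="low")` produces a part you could place in a message's `content` list next to a text part.
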
### Context window management

The Assistants API automatically manages the truncation to ensure it stays within the model's maximum context length. You can customize this behavior by specifying the maximum tokens you'd like a run to utilize and/or the maximum number of recent messages you'd like to include in a run.

#### Max Completion and Max Prompt Tokens

To control the token usage in a single Run, set `max_prompt_tokens` and `max_completion_tokens` when creating the Run. These limits apply to the total number of tokens used in all completions throughout the Run's lifecycle.

For example, initiating a Run with `max_prompt_tokens` set to 500 and `max_completion_tokens` set to 1000 means the first completion will truncate the thread to 500 tokens and cap the output at 1000 tokens. If only 200 prompt tokens and 300 completion tokens are used in the first completion, the second completion will have available limits of 300 prompt tokens and 700 completion tokens.

If a completion reaches the `max_completion_tokens` limit, the Run will terminate with a status of `incomplete`, and details will be provided in the `incomplete_details` field of the Run object.

When using the File Search tool, we recommend setting `max_prompt_tokens` to no less than 20,000. For longer conversations or multiple interactions with File Search, consider increasing this limit to 50,000, or ideally, removing the `max_prompt_tokens` limit altogether to get the highest quality results.

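The budget arithmetic in the example above can be sketched as simple bookkeeping. This is only a model of the accounting described in the text, not how the API implements it; the `RunBudget` class is a hypothetical name.

```python
from dataclasses import dataclass


@dataclass
class RunBudget:
    """Tracks token limits that are shared by every completion in one Run."""
    max_prompt_tokens: int
    max_completion_tokens: int
    prompt_used: int = 0
    completion_used: int = 0

    def record(self, prompt_tokens, completion_tokens):
        """Account for the tokens consumed by one completion."""
        self.prompt_used += prompt_tokens
        self.completion_used += completion_tokens

    def remaining(self):
        """Return (prompt, completion) tokens left for the next completion."""
        return (
            self.max_prompt_tokens - self.prompt_used,
            self.max_completion_tokens - self.completion_used,
        )


budget = RunBudget(max_prompt_tokens=500, max_completion_tokens=1000)
budget.record(prompt_tokens=200, completion_tokens=300)
print(budget.remaining())  # (300, 700)
```
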
#### Truncation Strategy

You may also specify a truncation strategy to control how your thread should be rendered into the model's context window. Using a truncation strategy of type `auto` will use OpenAI's default truncation strategy. Using a truncation strategy of type `last_messages` will allow you to specify the number of the most recent messages to include in the context window.

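The effect of `last_messages` can be illustrated with a small local simulation. This sketch assumes the strategy is passed as a dictionary with a `type` field and, for `last_messages`, a `last_messages` count. `select_messages` is a hypothetical helper, and because the default `auto` strategy is not specified here, the sketch passes every message through for that type.

```python
def select_messages(messages, truncation_strategy):
    """Return the messages that would be rendered into the context window."""
    kind = truncation_strategy.get("type", "auto")
    if kind == "last_messages":
        count = truncation_strategy["last_messages"]
        if count < 1:
            raise ValueError("last_messages must be at least 1")
        # Keep only the most recent `count` messages.
        return messages[-count:]
    if kind == "auto":
        # Placeholder: OpenAI's default strategy is not modeled here.
        return list(messages)
    raise ValueError(f"unknown truncation strategy type: {kind}")


history = ["m1", "m2", "m3", "m4"]
print(select_messages(history, {"type": "last_messages", "last_messages": 2}))
# ['m3', 'm4']
```
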
### Message annotations

Messages created by Assistants may contain [`annotations`](https://developers.openai.com/api/docs/api-reference/messages/object#messages/object-content) within the `content` array of the object. Annotations provide information around how you should annotate the text in the Message.

There are two types of Annotations:

1. `file_citation`: File citations are created by the [`file_search`](https://developers.openai.com/api/docs/assistants/tools/file-search) tool and define references to a specific file that was uploaded and used by the Assistant to generate the response.
2. `file_path`: File path annotations are created by the [`code_interpreter`](https://developers.openai.com/api/docs/assistants/tools/code-interpreter) tool and contain references to the files generated by the tool.

When annotations are present in the Message object, you'll see illegible model-generated substrings in the text that you should replace with the annotations. These strings may look something like `【13†source】` or `sandbox:/mnt/data/file.csv`. Here’s an example Python code snippet that replaces these strings with the annotations.

```python
# Retrieve the message object
message = client.beta.threads.messages.retrieve(
  thread_id="...",
  # ...

# Add footnotes to the end of the message before displaying to user

message_content.value += '\n' + '\n'.join(citations)
```

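The same replacement idea can be tried offline on plain dictionaries. The sketch below is not the snippet above: it assumes each annotation carries a `type`, the `text` substring to replace, and a `file_id` nested under `file_citation` or `file_path`. The `render_with_footnotes` helper, the file IDs, and the filename lookup table are all hypothetical.

```python
def render_with_footnotes(text, annotations, filenames):
    """Replace annotation substrings with [n] markers and append footnotes."""
    citations = []
    for index, annotation in enumerate(annotations):
        # Swap the model-generated substring for a readable marker.
        text = text.replace(annotation["text"], f"[{index}]")
        if annotation["type"] == "file_citation":
            file_id = annotation["file_citation"]["file_id"]
            citations.append(f"[{index}] {filenames.get(file_id, file_id)}")
        elif annotation["type"] == "file_path":
            file_id = annotation["file_path"]["file_id"]
            citations.append(f"[{index}] Download {filenames.get(file_id, file_id)}")
    return text + "\n" + "\n".join(citations)


text = "Revenue grew 12%【13†source】. See sandbox:/mnt/data/file.csv"
annotations = [
    {"type": "file_citation", "text": "【13†source】",
     "file_citation": {"file_id": "file-abc"}},
    {"type": "file_path", "text": "sandbox:/mnt/data/file.csv",
     "file_path": {"file_id": "file-xyz"}},
]
filenames = {"file-abc": "report.pdf", "file-xyz": "file.csv"}
print(render_with_footnotes(text, annotations, filenames))
```

In a real integration you would look up filenames through the Files API instead of a local dictionary.
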
## Runs and Run Steps

When you have all the context you need from your user in the Thread, you can run the Thread with an Assistant of your choice.

```python
run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)
```

```javascript
const run = await openai.beta.threads.runs.create(
  thread.id,
  { assistant_id: assistant.id }
);
```

```bash
curl https://api.openai.com/v1/threads/THREAD_ID/runs \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "assistant_id": "asst_ToSF7Gb04YMj8AMMm50ZLLtY"
  }'
```

By default, a Run will use the `model` and `tools` configuration specified in the Assistant object, but you can override most of these when creating the Run for added flexibility:

```python
run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id,
  # ...
  instructions="New instructions that override the Assistant instructions",
  tools=[{"type": "code_interpreter"}, {"type": "file_search"}]
)
```

```javascript
const run = await openai.beta.threads.runs.create(
  thread.id,
  {
    // ...
    tools: [{"type": "code_interpreter"}, {"type": "file_search"}]
  }
);
```

```bash
curl https://api.openai.com/v1/threads/THREAD_ID/runs \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -H "OpenAI-Beta: assistants=v2" \
  -d '{
    "assistant_id": "ASSISTANT_ID",
    "model": "gpt-4o",
    "instructions": "New instructions that override the Assistant instructions",
    "tools": [{"type": "code_interpreter"}, {"type": "file_search"}]
  }'
```

Note: `tool_resources` associated with the Assistant cannot be overridden during Run creation. You must use the [modify Assistant](https://developers.openai.com/api/docs/api-reference/assistants/modifyAssistant) endpoint to do this.

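One way to picture the override rule is as a merge of the Assistant's defaults with per-Run overrides, with `tool_resources` excluded. The sketch below is a local model of that behavior under the assumption that both configurations are plain dictionaries. `effective_run_config` is a hypothetical helper and the file ID is made up.

```python
def effective_run_config(assistant, overrides):
    """Merge per-Run overrides onto the Assistant's configuration."""
    if "tool_resources" in overrides:
        # Mirrors the note above: this field is fixed at the Assistant level.
        raise ValueError(
            "tool_resources cannot be overridden at Run creation; "
            "modify the Assistant instead"
        )
    config = dict(assistant)  # start from the Assistant's defaults
    config.update(overrides)  # Run-level values win
    return config


assistant = {
    "model": "gpt-4o",
    "instructions": "Be brief.",
    "tools": [{"type": "code_interpreter"}],
    "tool_resources": {"code_interpreter": {"file_ids": ["file-123"]}},
}
run_config = effective_run_config(
    assistant,
    {"instructions": "New instructions that override the Assistant instructions"},
)
print(run_config["model"], "|", run_config["instructions"])
```
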