The Responses API is a unified interface for building powerful, agent-like applications. It contains:

- Built-in tools like [web search](https://developers.openai.com/api/docs/guides/tools-web-search), [file search](https://developers.openai.com/api/docs/guides/tools-file-search), [computer use](https://developers.openai.com/api/docs/guides/tools-computer-use), [code interpreter](https://developers.openai.com/api/docs/guides/tools-code-interpreter), and [remote MCPs](https://developers.openai.com/api/docs/guides/tools-remote-mcp).
- Seamless multi-turn interactions that allow you to pass previous responses for higher accuracy reasoning results.
- Native multimodal support for text and images.


## Migrating from Chat Completions

Treat migration as three related changes: send requests to `/v1/responses`, read output from a typed `output` array, and choose how your application will carry state between turns.

### 1. Update generation endpoints

Start by updating your generation endpoints from `post /v1/chat/completions` to `post /v1/responses`.

If you are not using functions or multimodal inputs, simple message inputs are compatible from one API to the other:

Reuse simple message input

```bash
INPUT='[
```

<div data-content-switcher-pane data-value="chat-completions">
  <div class="hidden">Chat Completions</div>
  With Chat Completions, you create a `messages` array and read the model text
  from `completion.choices[0].message.content`.

  Generate text from a model

```javascript
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const completion = await client.chat.completions.create({
  model: 'gpt-5.5',
  messages: [
    { 'role': 'system', 'content': 'You are a helpful assistant.' },
    { 'role': 'user', 'content': 'Hello!' }
  ]
});
console.log(completion.choices[0].message.content);
```

```python
from openai import OpenAI
client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-5.5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(completion.choices[0].message.content)
```

```bash
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

  </div>
  <div data-content-switcher-pane data-value="responses" hidden>
  <div class="hidden">Responses</div>
  With Responses, you can separate `instructions` and `input` at the top level
  and read generated text from `response.output_text`.

  Generate text from a model

```javascript
import OpenAI from 'openai';
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const response = await client.responses.create({
  model: 'gpt-5.5',
  instructions: 'You are a helpful assistant.',
  input: 'Hello!'
});

console.log(response.output_text);
```

```python
from openai import OpenAI
client = OpenAI()

response = client.responses.create(
    model="gpt-5.5",
    instructions="You are a helpful assistant.",
    input="Hello!"
)
print(response.output_text)
```

```bash
curl https://api.openai.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-5.5",
    "instructions": "You are a helpful assistant.",
    "input": "Hello!"
  }'
```

  </div>

### 2. Map Messages to Items

Chat Completions uses `messages` as both input and output. Responses uses `input` and `output` arrays of typed Items. A `message` is one Item type, alongside Items such as `reasoning`, `function_call`, and `function_call_output`.

| Chat Completions concept      | Responses mapping                                                                                      |
| ----------------------------- | ------------------------------------------------------------------------------------------------------ |
| `messages[]`                  | `input`, as a string or an array of input Items                                                        |
| System or developer guidance  | Top-level `instructions`, or compatible message Items when you need to preserve an existing transcript |
| User message                  | An input message Item with `role: "user"`                                                              |
| Assistant message             | An output message Item in `response.output`; pass it back in `input` if you manually manage state      |
| Tool or function call         | A `function_call` output Item                                                                          |
| Tool or function result       | A `function_call_output` input Item linked to the call with `call_id`                                  |
| Multiple generations with `n` | Not available in Responses; make separate requests if you need multiple candidate outputs              |

When you only need the final text, use the SDK `output_text` helper. When your flow uses reasoning, tools, or multimodal output, iterate over `response.output` and handle each Item by its `type`.
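The mapping above can be sketched with plain dictionaries, without calling the API. This is a minimal illustration, not an official helper: the function names `chat_to_responses_request` and `summarize_output` are hypothetical, the sketch assumes string-only message content, and it assumes output Items shaped like the ones in the sample data (a `message` Item holding `output_text` parts, and a `function_call` Item whose `arguments` field is a JSON string).

```python
import json


def chat_to_responses_request(model, messages):
    """Split a Chat Completions transcript into `instructions` and `input`.

    Hypothetical helper: handles only string content and the
    system/developer/user/assistant roles.
    """
    guidance = [m["content"] for m in messages if m["role"] in ("system", "developer")]
    items = [
        {"role": m["role"], "content": m["content"]}
        for m in messages
        if m["role"] in ("user", "assistant")
    ]
    request = {"model": model, "input": items}
    if guidance:
        request["instructions"] = "\n".join(guidance)
    return request


def summarize_output(output):
    """Branch on each Item's `type` instead of assuming every Item is a message."""
    text_parts, calls = [], []
    for item in output:
        if item["type"] == "message":
            for part in item.get("content", []):
                if part.get("type") == "output_text":
                    text_parts.append(part["text"])
        elif item["type"] == "function_call":
            calls.append((item["call_id"], item["name"], json.loads(item["arguments"])))
        # Other Item types (for example `reasoning`) are not summarized here,
        # but they still belong in the context if you replay Items manually.
    return "".join(text_parts), calls
```

Keeping the dispatch in one place makes it easier to add handling for further Item types later without touching the code that only needs the final text.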

### 3. Update multi-turn conversations

If you have multi-turn conversations in your application, update your context logic. Responses gives you three common state-management options:

- Use `previous_response_id` when you want OpenAI to manage prior response context. Resend stable `instructions` on each request, because `previous_response_id` does not carry over the previous response's top-level `instructions`.
- Pass prior `output` Items back into the next request when you need to manage or trim context yourself.
- Use the [Conversations API](https://developers.openai.com/api/docs/guides/conversation-state?api-mode=responses#using-the-conversations-api) when you need a persistent conversation object.
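The first two options differ mainly in what the next request body contains. The sketch below builds both bodies as plain dictionaries so the difference is visible side by side. The helper names `chained_request` and `replayed_request` are hypothetical, and the response object is assumed to be a dictionary with `id` and `output` keys.

```python
def chained_request(model, instructions, previous_response, user_text):
    """Option 1: reference the prior response and send only the new turn."""
    return {
        "model": model,
        # Resent on every turn: not inherited through `previous_response_id`.
        "instructions": instructions,
        "previous_response_id": previous_response["id"],
        "input": [{"role": "user", "content": user_text}],
    }


def replayed_request(model, instructions, prior_input, previous_response, user_text):
    """Option 2: replay earlier input and output Items yourself."""
    history = list(prior_input) + list(previous_response["output"])
    history.append({"role": "user", "content": user_text})
    return {"model": model, "instructions": instructions, "input": history}
```

The replayed form gives you a place to trim or rewrite context before sending it. If you trim, keep related Items together, such as a function call and its output.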

<div data-content-switcher-pane data-value="chat-completions">
  <div class="hidden">Chat Completions</div>
  In Chat Completions, you store the transcript and send the accumulated
  `messages` array on each request.

  Multi-turn conversation

```javascript
res2 = client.chat.completions.create(model="gpt-5.5", messages=messages)
```

  </div>
  <div data-content-switcher-pane data-value="responses" hidden>
  <div class="hidden">Responses</div>
  With Responses, you can manually pass outputs from one response into the
  input of another.

  Multi-turn conversation

```python
});
```

  You can also use `previous_response_id` to reference the previous response
  and create response chains or forks.

  Multi-turn conversation

```javascript
)
```

  </div>

Even when using `previous_response_id`, all previous input tokens for responses in the chain are billed as input tokens in the API.

### 4. Decide when to use statefulness

Responses are stored by default. Chat Completions are stored by default for new accounts. To disable storage in either API, set `store: false`.

Some organizations, such as those with Zero Data Retention (ZDR) requirements, cannot use the Responses API in a stateful way due to compliance or data retention policies. To support these cases, OpenAI offers encrypted reasoning items, allowing you to keep your workflow stateless while still benefiting from reasoning items.

To disable statefulness but still take advantage of reasoning:

- Set `store: false` in the [store field](https://developers.openai.com/api/docs/api-reference/responses/create#responses_create-store).
- Add `["reasoning.encrypted_content"]` to the [include field](https://developers.openai.com/api/docs/api-reference/responses/create#responses_create-include).

The API will then return an encrypted version of the reasoning tokens, which you can pass back in future requests just like regular reasoning items.
For ZDR organizations, OpenAI enforces `store: false` automatically. When a request includes `encrypted_content`, it is decrypted in memory, used for generating the next response, and then securely discarded. Any new reasoning tokens are immediately encrypted and returned to you, ensuring no intermediate state is persisted.
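A stateless turn can be sketched as two small steps: build a request with the two settings above, then carry every returned Item into the next input. The helpers `stateless_request` and `next_stateless_input` are hypothetical, and the sketch assumes reasoning Items arrive as dictionaries with an `encrypted_content` field when it was requested through `include`.

```python
def stateless_request(model, input_items):
    """Build a request body that stores nothing and asks for encrypted reasoning."""
    return {
        "model": model,
        "input": list(input_items),
        "store": False,
        "include": ["reasoning.encrypted_content"],
    }


def next_stateless_input(sent_input, output, user_text):
    """Carry all output Items forward, including encrypted reasoning Items."""
    for item in output:
        if item.get("type") == "reasoning" and not item.get("encrypted_content"):
            # Without the encrypted payload, reasoning context cannot continue
            # across turns in a stateless flow.
            raise ValueError("reasoning Item has no encrypted_content to replay")
    return list(sent_input) + list(output) + [{"role": "user", "content": user_text}]
```

Raising on a missing payload is a design choice for this sketch: it surfaces a forgotten `include` setting early instead of silently dropping reasoning context.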

### 5. Update function definitions and outputs

There are two minor, but notable, differences in how functions are defined between Chat Completions and Responses.

1. In Chat Completions, function definitions are externally tagged. In Responses, they are internally tagged.
2. In Chat Completions, functions are non-strict by default. In Responses, function schemas are normalized into strict mode by default. To keep non-strict, best-effort function calling in Responses, explicitly set `strict: false`.

The Responses API function example on the right is functionally equivalent to the Chat Completions example on the left.
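Both differences can be seen in a small converter. This is a sketch under stated assumptions: `chat_tool_to_responses_tool` is a hypothetical name, and the input is assumed to be a Chat Completions function tool of the form `{"type": "function", "function": {...}}`. The converter writes `strict` explicitly so that the tool keeps its Chat Completions behavior after the move.

```python
def chat_tool_to_responses_tool(chat_tool):
    """Flatten an externally tagged function tool into the internally tagged form."""
    fn = chat_tool["function"]
    tool = {
        "type": "function",
        "name": fn["name"],
        # Chat Completions is non-strict unless told otherwise, so carry that
        # default over explicitly instead of inheriting the Responses default.
        "strict": fn.get("strict", False),
    }
    for key in ("description", "parameters"):
        if key in fn:
            tool[key] = fn[key]
    return tool
```

To adopt strict mode instead, set `strict` to `True` and check that the schema satisfies the strict-mode requirements described in the function calling guide.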

#### Follow function-calling best practices

In Responses, tool calls and their outputs are two distinct types of Items that are correlated using a `call_id`. See
the [function calling docs](https://developers.openai.com/api/docs/guides/function-calling#function-tool-example) for more detail on how function calling works in Responses.
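The `call_id` correlation can be sketched as a loop that answers each `function_call` Item with a matching `function_call_output` Item. The helper name `run_function_calls` and the `handlers` registry are hypothetical, and the sketch assumes each call's `arguments` field is a JSON string.

```python
import json


def run_function_calls(output, handlers):
    """Return one `function_call_output` Item per `function_call` Item in `output`."""
    results = []
    for item in output:
        if item.get("type") != "function_call":
            continue
        arguments = json.loads(item["arguments"])
        result = handlers[item["name"]](**arguments)
        results.append({
            "type": "function_call_output",
            # Copy the id from the call so the two Items stay linked.
            "call_id": item["call_id"],
            "output": json.dumps(result),
        })
    return results
```

When managing state manually, the returned Items go into the next request's `input` together with the original `function_call` Items.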

### 6. Update Structured Outputs definitions

In the Responses API, Structured Outputs definitions have moved from `response_format` to `text.format`:
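As a rough sketch of that move, the converter below rewrites a Chat Completions `response_format` value into a Responses `text` value. The name `response_format_to_text` is hypothetical. The sketch assumes the Chat Completions schema is nested under a `json_schema` key and that the Responses form places the same fields directly beside `type`.

```python
def response_format_to_text(response_format):
    """Convert a Chat Completions `response_format` into a Responses `text` field."""
    if response_format.get("type") == "json_schema":
        # Lift name, schema, strict, and any description up one level.
        inner = response_format["json_schema"]
        return {"format": {"type": "json_schema", **inner}}
    # Other format types carry no nested payload, so they are copied as-is.
    return {"format": dict(response_format)}
```

In a request body, the result is assigned to the top-level `text` field, and the `response_format` field is removed.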

### 7. Update streaming consumers

Chat Completions streaming returns incremental chunks with a `delta` field. Responses streaming uses typed server-sent events. Update stream consumers to branch on each event's `type` and handle the events your UI or orchestration layer needs.

For text streaming, listen for events such as:

- `response.created`
- `response.output_text.delta`
- `response.completed`
- `error`

Function-calling streams can also emit events such as `response.function_call_arguments.delta` and `response.function_call_arguments.done`. See the [streaming Responses guide](https://developers.openai.com/api/docs/guides/streaming-responses?api-mode=responses) and [Responses streaming events reference](https://developers.openai.com/api/docs/api-reference/responses-streaming).
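A consumer that branches on `type` might look like the sketch below. It works on already-parsed events represented as dictionaries, so it runs without a network connection. The name `consume_events` is hypothetical, and the `delta`, `item_id`, and `message` fields are assumptions about the event payloads that should be checked against the streaming events reference.

```python
def consume_events(events):
    """Accumulate text and function-call arguments from typed stream events."""
    text_parts = []
    argument_buffers = {}
    completed = False
    for event in events:
        kind = event["type"]
        if kind == "response.output_text.delta":
            text_parts.append(event["delta"])
        elif kind == "response.function_call_arguments.delta":
            item_id = event["item_id"]
            argument_buffers[item_id] = argument_buffers.get(item_id, "") + event["delta"]
        elif kind == "response.completed":
            completed = True
        elif kind == "error":
            raise RuntimeError(event.get("message", "stream error"))
        # `response.created` and any unhandled event types are ignored here.
    return {
        "text": "".join(text_parts),
        "function_arguments": argument_buffers,
        "completed": completed,
    }
```

Ignoring unknown event types keeps the consumer working if the stream carries event types that the application does not use.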

### 8. Upgrade to native tools

If your application has use cases that would benefit from OpenAI's native [tools](https://developers.openai.com/api/docs/guides/tools), you can update your tool calls to use OpenAI's tools out of the box.

<div data-content-switcher-pane data-value="chat-completions">
  <div class="hidden">Chat Completions</div>
  With Chat Completions, you cannot use OpenAI-hosted tools natively and have
  to write your own tool integration.

  Web search tool

```javascript
  --data-urlencode "key=$SEARCH_API_KEY"\
```

  </div>
  <div data-content-switcher-pane data-value="responses" hidden>
  <div class="hidden">Responses</div>
  With Responses, you can specify the tools that you want the model to use.

  Web search tool

```javascript
  }'
```

  </div>

### 9. Check common migration errors

Watch for these issues when moving code from Chat Completions to Responses:

- Reading `choices[0].message.content` instead of `response.output_text` or `response.output`.
- Treating every `output` entry as a message. Reasoning, tool, and function calls are separate Item types.
- Dropping reasoning, function call, or function call output Items when manually carrying context into the next response.
- Sending a function result without the matching `call_id`.
- Using `response_format` in a Responses request instead of `text.format`.
- Reusing Chat Completions streaming chunk handlers without handling typed Responses events.
- Assuming `previous_response_id` removes billing for prior context. Previous input tokens in the response chain are still billed as input tokens.
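Some of these mistakes are visible in the request body before it is sent. The sketch below is one possible pre-flight check, not an official validator: `find_migration_issues` is a hypothetical name, it covers only the request-side items from the list above, and it assumes `input` Items are dictionaries.

```python
CHAT_ONLY_FIELDS = {
    "messages": "send `input` (and optionally `instructions`) instead",
    "response_format": "move the schema to `text.format`",
    "n": "not available in Responses; make separate requests",
}


def find_migration_issues(request):
    """Return human-readable warnings for Chat Completions leftovers in a request."""
    issues = [
        f"`{field}`: {hint}"
        for field, hint in CHAT_ONLY_FIELDS.items()
        if field in request
    ]
    items = request.get("input", [])
    if isinstance(items, str):
        items = []  # A plain string input has no Items to inspect.
    known_calls = {i.get("call_id") for i in items if i.get("type") == "function_call"}
    for item in items:
        if item.get("type") != "function_call_output":
            continue
        call_id = item.get("call_id")
        if not call_id:
            issues.append("function_call_output is missing `call_id`")
        elif call_id not in known_calls and "previous_response_id" not in request:
            # With manual replay, the original call must travel with its output.
            issues.append(f"no function_call Item found for call_id {call_id}")
    return issues
```

A check like this fits in a test suite or a request wrapper during rollout, and can be removed once a flow has fully moved to Responses.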

## Incremental rollout checklist

Chat Completions remains supported, so you can migrate one user flow at a time.

- [ ] Start with a simple text-generation flow.
- [ ] Update the endpoint, request body, and output handling.
- [ ] Decide whether the flow uses `previous_response_id`, manual Item replay, or the Conversations API.
- [ ] If the flow is stateless or ZDR, add `store: false` and include encrypted reasoning items when reasoning context must continue across turns.
- [ ] Migrate function definitions and verify function call outputs include the correct `call_id`.
- [ ] Move Structured Outputs schemas from `response_format` to `text.format`.
- [ ] Update streaming consumers to handle typed Responses events.
- [ ] Replace custom orchestration with OpenAI-hosted tools where they fit the workflow.
- [ ] Compare behavior, latency, token usage, and errors before routing more traffic to Responses.

We recommend migrating all flows to the Responses API over time to take advantage of the latest OpenAI features and improvements.


Based on developer feedback from the [Assistants API](https://developers.openai.com/api/docs/api-reference/assistants) beta, we've incorporated key improvements into the Responses API to make it more flexible, faster, and easier to use. The Responses API represents the future direction for building agents on OpenAI.

We now have Assistant-like and Thread-like objects in the Responses API. Learn more in the [migration guide](https://developers.openai.com/api/docs/guides/assistants/migration). As of August 26, 2025, we're deprecating the Assistants API, with a sunset date of August 26, 2026.