Documentation — Spybara

assistants/deep-dive.md +2 −3


389 389

390## Creating assistants390## Creating assistants

391 391

392We recommend using OpenAI's{" "}392We recommend using OpenAI's <a href="/api/docs/models">latest models</a> with

393 <a href="/api/docs/models#gpt-4-turbo-and-gpt-4">latest models</a> with the393 the Assistants API for best results and maximum compatibility with tools.

394 Assistants API for best results and maximum compatibility with tools.

395 394

396To get started, creating an Assistant only requires specifying the `model` to use. But you can further customize the behavior of the Assistant:395To get started, creating an Assistant only requires specifying the `model` to use. But you can further customize the behavior of the Assistant:

397 396

assistants/migration.md +40 −0


278 278

279<div data-content-switcher-pane data-value="assistants">279<div data-content-switcher-pane data-value="assistants">

280 <div class="hidden">Assistants API</div>280 <div class="hidden">Assistants API</div>

281 ```python

282thread = openai.beta.threads.create()

283

284 @app.post("/messages")

285 async def message(message: Message):

286 openai.beta.threads.messages.create(

287 thread_id=thread.id, role="user",

288 content=message.content

289 )

290

291 run = openai.beta.threads.runs.create(

292 assistant_id=os.getenv("ASSISTANT_ID"),

293 thread_id=thread.id

294 )

295 while run.status in ("queued", "in_progress"):

296 await asyncio.sleep(1)

297 run = openai.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

298

299 messages = openai.beta.threads.messages.list(

300 order="desc", limit=1, thread_id=thread.id

301 )

302

303 return { "content": messages.data[0].content }

304```

305

306

281 </div>307 </div>

282 <div data-content-switcher-pane data-value="responses" hidden>308 <div data-content-switcher-pane data-value="responses" hidden>

283 <div class="hidden">Responses API</div>309 <div class="hidden">Responses API</div>

310 ```python

311conversation = openai.conversations.create()

312

313 @app.post("/messages")

314 async def message(message: Message):

315 response = openai.responses.create(

316 prompt={ "id": os.getenv("PROMPT_ID") },

317 input=[{ "role": "user", "content": message.content }]

318 )

319

320 return { "content": response.output_text }

321```

322

323

284 </div>324 </div>

concepts.md +1 −1


1# Key concepts1# Key concepts

2 2

3At OpenAI, protecting user data is fundamental to our mission. We do not train3At OpenAI, protecting user data is fundamental to our mission. We do not train

~~4 our models on inputs and outputs through our API. Learn more on our{" "}~~4 our models on inputs and outputs through our API. Learn more on our

5 <a href="https://openai.com/api-data-privacy">API data privacy page</a>.5 <a href="https://openai.com/api-data-privacy">API data privacy page</a>.

6 6

7## Text generation models7## Text generation models

guides/advanced-usage.md +2 −2


33 33

34As a rough rule of thumb, 1 token is approximately 4 characters or 0.75 words for English text.34As a rough rule of thumb, 1 token is approximately 4 characters or 0.75 words for English text.

35 35

~~36Check out our{" "}~~36Check out our

37 <a37 <a

38 href="https://platform.openai.com/tokenizer"38 href="https://platform.openai.com/tokenizer"

39 target="_blank"39 target="_blank"

40 rel="noreferrer"40 rel="noreferrer"

41 >41 >

42 Tokenizer tool42 Tokenizer tool

~~43 </a>{" "}~~43 </a>

44 to test specific strings and see how they are translated into tokens.44 to test specific strings and see how they are translated into tokens.

45 45

46For example, the string `"ChatGPT is great!"` is encoded into six tokens: `["Chat", "G", "PT", " is", " great", "!"]`.46For example, the string `"ChatGPT is great!"` is encoded into six tokens: `["Chat", "G", "PT", " is", " great", "!"]`.
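The rule of thumb above can be turned into a quick estimator. This is a minimal sketch: `estimate_tokens` is a hypothetical helper, not part of any OpenAI SDK, and it only approximates counts for English text. Real tokenization can differ noticeably on short strings (the example string above is six tokens), so use the Tokenizer tool when you need exact counts.

```python
def estimate_tokens(text: str) -> dict:
    """Rough token estimates for English text (hypothetical helper)."""
    chars = len(text)
    words = len(text.split())
    return {
        "by_chars": round(chars / 4),     # about 4 characters per token
        "by_words": round(words / 0.75),  # about 0.75 words per token
    }


estimate = estimate_tokens("ChatGPT is great!")
print(estimate)
```

Both heuristics are useful for budgeting prompts, not for enforcing hard limits.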

guides/agents/define-agents.md +2 −2


41const agent = new Agent({41const agent = new Agent({

42 name: "Weather bot",42 name: "Weather bot",

43 instructions: "You are a helpful weather bot.",43 instructions: "You are a helpful weather bot.",

~~44 model: "gpt-5.5",~~44 model: "${latestMainlineModelSlug}",

45 tools: [getWeather],45 tools: [getWeather],

46});46});

47```47```

59agent = Agent(59agent = Agent(

60 name="Weather bot",60 name="Weather bot",

61 instructions="You are a helpful weather bot.",61 instructions="You are a helpful weather bot.",

~~62 model="gpt-5.5",~~62 model="${latestMainlineModelSlug}",

63 tools=[get_weather],63 tools=[get_weather],

64)64)

65```65```

guides/agents/models.md +3 −3


27});27});

28 28

29const runner = new Runner({29const runner = new Runner({

~~30 model: "gpt-5.5",~~30 model: "${latestMainlineModelSlug}",

31});31});

32 32

33await runner.run(fastAgent, "Summarize ticket 123.");33await runner.run(fastAgent, "Summarize ticket 123.");

62 result = await Runner.run(62 result = await Runner.run(

63 general_agent,63 general_agent,

64 "Investigate the billing issue on account 456.",64 "Investigate the billing issue on account 456.",

~~65 run_config=RunConfig(model="gpt-5.5"),~~65 run_config=RunConfig(model="${latestMainlineModelSlug}"),

66 )66 )

67 print(result.final_output)67 print(result.final_output)

68 68

72```72```

73 73

74 74

75For most new SDK workflows, start with [`gpt-5.5`](https://developers.openai.com/api/docs/models/gpt-5.5) and move to a smaller variant only when latency or cost matters enough to justify it. Use the platform-wide [Using GPT-5.5](https://developers.openai.com/api/docs/guides/latest-model) guide for current model-selection advice.75For most new SDK workflows, start with [`gpt-5.5`](https://developers.openai.com/api/docs/models/gpt-5.5) and move to a smaller variant only when latency or cost matters enough to justify it. Use the platform-wide <a href="/api/docs/guides/latest-model">Using GPT-5.5</a> guide for current model-selection advice.

76 76

77## Choose the simplest default strategy77## Choose the simplest default strategy

78 78

guides/agents/orchestration.md +1 −2


142 142

143<span slot="icon">143<span slot="icon">

144 </span>144 </span>

145 See how{" "}145 See how

146 {" "}

147 and resumable state affect the next turn.146 and resumable state affect the next turn.

148 147

149 148

guides/agents/quickstart.md +2 −2


36 name: "History tutor",36 name: "History tutor",

37 instructions:37 instructions:

38 "You answer history questions clearly and concisely.",38 "You answer history questions clearly and concisely.",

~~39 model: "gpt-5.5",~~39 model: "${latestMainlineModelSlug}",

40});40});

41 41

42const result = await run(agent, "When did the Roman Empire fall?");42const result = await run(agent, "When did the Roman Empire fall?");

51agent = Agent(51agent = Agent(

52 name="History tutor",52 name="History tutor",

53 instructions="You answer history questions clearly and concisely.",53 instructions="You answer history questions clearly and concisely.",

~~54 model="gpt-5.5",~~54 model="${latestMainlineModelSlug}",

55)55)

56 56

57 57

guides/agents/sandboxes.md +4 −4


294 294

295const agent = new SandboxAgent({295const agent = new SandboxAgent({

296 name: "Renewal Packet Analyst",296 name: "Renewal Packet Analyst",

297 model: "gpt-5.5",297 model: "${latestMainlineModelSlug}",

298 instructions:298 instructions:

299 "Review the workspace before answering. Keep the response concise, " +299 "Review the workspace before answering. Keep the response concise, " +

300 "business-focused, and cite the file names that support each conclusion.",300 "business-focused, and cite the file names that support each conclusion.",

346 346

347agent = SandboxAgent(347agent = SandboxAgent(

348 name="Renewal Packet Analyst",348 name="Renewal Packet Analyst",

349 model="gpt-5.5",349 model="${latestMainlineModelSlug}",

350 instructions=(350 instructions=(

351 "Review the workspace before answering. Keep the response concise, "351 "Review the workspace before answering. Keep the response concise, "

352 "business-focused, and cite the file names that support each conclusion."352 "business-focused, and cite the file names that support each conclusion."

392 392

393const agent = new SandboxAgent({393const agent = new SandboxAgent({

394 name: "Workspace reviewer",394 name: "Workspace reviewer",

395 model: "gpt-5.5",395 model: "${latestMainlineModelSlug}",

396 instructions: "Inspect the sandbox workspace before answering.",396 instructions: "Inspect the sandbox workspace before answering.",

397});397});

398 398

487});487});

488const agent = new SandboxAgent({488const agent = new SandboxAgent({

489 name: "Workspace builder",489 name: "Workspace builder",

490 model: "gpt-5.5",490 model: "${latestMainlineModelSlug}",

491 instructions: "Inspect the sandbox workspace before answering.",491 instructions: "Inspect the sandbox workspace before answering.",

492});492});

493 493

guides/audio.md +8 −8


43 43

44## Add audio to your existing application44## Add audio to your existing application

45 45

~~46Models such as `gpt-realtime` and `gpt-audio` are natively multimodal, meaning they can understand and generate audio and text as input and output.~~46Models such as [`gpt-realtime-2`](https://developers.openai.com/api/docs/models/gpt-realtime-2) and [`gpt-audio-1.5`](https://developers.openai.com/api/docs/models/gpt-audio-1.5) are natively multimodal, meaning they can understand and generate audio and text as input and output.

47 47

48For live browser speech-to-speech interactions, start with a realtime session in the JavaScript SDK:48For live browser speech-to-speech interactions, start with a realtime session in the JavaScript SDK:

49 49

69 69

70This example uses JavaScript because browser voice agents connect with WebRTC from the client. For Python voice workflows, use the [Voice agents guide](https://developers.openai.com/api/docs/guides/voice-agents), which covers chained voice pipelines.70This example uses JavaScript because browser voice agents connect with WebRTC from the client. For Python voice workflows, use the [Voice agents guide](https://developers.openai.com/api/docs/guides/voice-agents), which covers chained voice pipelines.

71 71

72If you already have a text-based LLM application with the [Chat Completions endpoint](https://developers.openai.com/api/docs/api-reference/chat/), you may want to add audio capabilities. For example, if your chat application supports text input, you can add audio input and output: include `audio` in the `modalities` array and use an audio model, like `gpt-audio`.72If you already have a text-based LLM application with the [Chat Completions endpoint](https://developers.openai.com/api/docs/api-reference/chat/), you may want to add audio capabilities. For example, if your chat application supports text input, you can add audio input and output: include `audio` in the `modalities` array and use an audio model, like [`gpt-audio-1.5`](https://developers.openai.com/api/docs/models/gpt-audio-1.5).

73 73

74The [Responses API](https://developers.openai.com/api/docs/api-reference/responses) docs currently describe74The [Responses API](https://developers.openai.com/api/docs/api-reference/responses) docs currently describe

75 text and image inputs with text outputs. For this audio-chat pattern, use Chat75 text and image inputs with text outputs. For this audio-chat pattern, use Chat

89 89

90// Generate an audio response to the given prompt90// Generate an audio response to the given prompt

91const response = await openai.chat.completions.create({91const response = await openai.chat.completions.create({

~~92 model: "gpt-audio",~~92 model: "gpt-audio-1.5",

93 modalities: ["text", "audio"],93 modalities: ["text", "audio"],

94 audio: { voice: "alloy", format: "wav" },94 audio: { voice: "alloy", format: "wav" },

95 messages: [95 messages: [

119client = OpenAI()119client = OpenAI()

120 120

121completion = client.chat.completions.create(121completion = client.chat.completions.create(

122 model="gpt-audio",122 model="gpt-audio-1.5",

123 modalities=["text", "audio"],123 modalities=["text", "audio"],

124 audio={"voice": "alloy", "format": "wav"},124 audio={"voice": "alloy", "format": "wav"},

125 messages=[125 messages=[

142 -H "Content-Type: application/json" \\142 -H "Content-Type: application/json" \\

143 -H "Authorization: Bearer $OPENAI_API_KEY" \\143 -H "Authorization: Bearer $OPENAI_API_KEY" \\

144 -d '{144 -d '{

145 "model": "gpt-audio",145 "model": "gpt-audio-1.5",

146 "modalities": ["text", "audio"],146 "modalities": ["text", "audio"],

147 "audio": { "voice": "alloy", "format": "wav" },147 "audio": { "voice": "alloy", "format": "wav" },

148 "messages": [148 "messages": [

170const base64str = Buffer.from(buffer).toString("base64");170const base64str = Buffer.from(buffer).toString("base64");

171 171

172const response = await openai.chat.completions.create({172const response = await openai.chat.completions.create({

173 model: "gpt-audio",173 model: "gpt-audio-1.5",

174 modalities: ["text", "audio"],174 modalities: ["text", "audio"],

175 audio: { voice: "alloy", format: "wav" },175 audio: { voice: "alloy", format: "wav" },

176 messages: [176 messages: [

203encoded_string = base64.b64encode(wav_data).decode('utf-8')203encoded_string = base64.b64encode(wav_data).decode('utf-8')

204 204

205completion = client.chat.completions.create(205completion = client.chat.completions.create(

206 model="gpt-audio",206 model="gpt-audio-1.5",

207 modalities=["text", "audio"],207 modalities=["text", "audio"],

208 audio={"voice": "alloy", "format": "wav"},208 audio={"voice": "alloy", "format": "wav"},

209 messages=[209 messages=[

234 -H "Content-Type: application/json" \\234 -H "Content-Type: application/json" \\

235 -H "Authorization: Bearer $OPENAI_API_KEY" \\235 -H "Authorization: Bearer $OPENAI_API_KEY" \\

236 -d '{236 -d '{

237 "model": "gpt-audio",237 "model": "gpt-audio-1.5",

238 "modalities": ["text", "audio"],238 "modalities": ["text", "audio"],

239 "audio": { "voice": "alloy", "format": "wav" },239 "audio": { "voice": "alloy", "format": "wav" },

240 "messages": [240 "messages": [

guides/citation-formatting.md +5 −5


244 244

245<strong>Source IDs vs. locators:</strong> A source ID is a stable,245<strong>Source IDs vs. locators:</strong> A source ID is a stable,

246 model-generated identifier such as <code>block1</code>. A locator is the246 model-generated identifier such as <code>block1</code>. A locator is the

247 precise UI-rendered highlight, such as <code>lines L8-L13</code> or{" "}247 precise UI-rendered highlight, such as <code>lines L8-L13</code> or

248 <code>Paragraph 21</code>. In general, the model should emit the source ID,248 <code>Paragraph 21</code>. In general, the model should emit the source ID,

249 while your system resolves or renders the locator. Mixing the two too early249 while your system resolves or renders the locator. Mixing the two too early

250 tends to increase formatting errors.250 tends to increase formatting errors.

271For tool calls, <code>turnN</code> increments once per tool invocation, not271For tool calls, <code>turnN</code> increments once per tool invocation, not

272 once per individual result. Within a single invocation, sources are272 once per individual result. Within a single invocation, sources are

273 distinguished by suffixes such as <code>file0</code>, <code>file1</code>, and273 distinguished by suffixes such as <code>file0</code>, <code>file1</code>, and

274 so on. In a single-response system, all references will be{" "}274 so on. In a single-response system, all references will be

275 <code>turn0...</code> only if the model makes exactly one tool call before275 <code>turn0...</code> only if the model makes exactly one tool call before

276 answering. If it makes multiple tool calls, you may instead see references276 answering. If it makes multiple tool calls, you may instead see references

277 like <code>turn0fileX</code>, <code>turn1fileX</code>, and so on.277 like <code>turn0fileX</code>, <code>turn1fileX</code>, and so on.
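The `turnN` and suffix scheme described above can be illustrated with a small parser. This is a sketch under an assumed ID shape (`turn<N><kind><M>`, for example `turn0file1`); `parse_source_id` and `SourceRef` are hypothetical names, and a real system may emit other suffix kinds or ID forms.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: "turn<N><kind><M>", e.g. "turn0file1".
SOURCE_ID = re.compile(r"^turn(\d+)([a-z]+)(\d+)$")


class SourceRef(NamedTuple):
    turn: int   # which tool invocation produced the source
    kind: str   # suffix kind, such as "file"
    index: int  # position of the source within that invocation


def parse_source_id(source_id: str) -> Optional[SourceRef]:
    """Split a source ID into its parts, or return None if it doesn't match."""
    match = SOURCE_ID.match(source_id)
    if match is None:
        return None
    return SourceRef(int(match.group(1)), match.group(2), int(match.group(3)))


refs = [parse_source_id(s) for s in ["turn0file0", "turn0file1", "turn1file0"]]
# Distinct turn numbers correspond to distinct tool invocations.
tool_calls = {ref.turn for ref in refs if ref is not None}
print(sorted(tool_calls))
```

Grouping by `turn` makes it easy to see how many tool calls contributed sources to a single answer.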

537```537```

538 538

539<strong>Note:</strong> OpenAI-hosted tools such as web search provide539<strong>Note:</strong> OpenAI-hosted tools such as web search provide

540 automatic inline citations. If you want to use hosted tools instead, see the{" "}540 automatic inline citations. If you want to use hosted tools instead, see the

541 <a href="/api/docs/guides/tools">tools overview</a>,{" "}541 <a href="/api/docs/guides/tools">tools overview</a>,

542 <a href="/api/docs/guides/tools-web-search">web search guide</a>, and{" "}542 <a href="/api/docs/guides/tools-web-search">web search guide</a>, and

543 <a href="/api/docs/guides/tools-file-search">file search guide</a>.543 <a href="/api/docs/guides/tools-file-search">file search guide</a>.

guides/code-generation.md +6 −6


17 17

18## Integrate with coding models18## Integrate with coding models

19 19

20For most API-based code generation, start with **`gpt-5.5`**. It handles both general-purpose work and coding, which makes it a strong default when your application needs to write code, reason about requirements, inspect docs, and handle broader workflows in one place.20For most API-based code generation, start with <strong>`gpt-5.5`</strong>. It handles both general-purpose work and coding, which makes it a strong default when your application needs to write code, reason about requirements, inspect docs, and handle broader workflows in one place.

21 21

22This example shows how you can use the [Responses API](https://developers.openai.com/api/docs/api-reference/responses) for a code generation use case:22This example shows how you can use the [Responses API](https://developers.openai.com/api/docs/api-reference/responses) for a code generation use case:

23 23

28const openai = new OpenAI();28const openai = new OpenAI();

29 29

30const result = await openai.responses.create({30const result = await openai.responses.create({

~~31 model: "gpt-5.5",~~31 model: "${latestMainlineModelSlug}",

32 input: "Find the null pointer exception: ...your code here...",32 input: "Find the null pointer exception: ...your code here...",

33 reasoning: { effort: "high" },33 reasoning: { effort: "high" },

34});34});

41client = OpenAI()41client = OpenAI()

42 42

43result = client.responses.create(43result = client.responses.create(

~~44 model="gpt-5.5",~~44 model="${latestMainlineModelSlug}",

45 input="Find the null pointer exception: ...your code here...",45 input="Find the null pointer exception: ...your code here...",

46 reasoning={ "effort": "high" },46 reasoning={ "effort": "high" },

47)47)

54 -H "Content-Type: application/json" \\54 -H "Content-Type: application/json" \\

55 -H "Authorization: Bearer $OPENAI_API_KEY" \\55 -H "Authorization: Bearer $OPENAI_API_KEY" \\

56 -d '{56 -d '{

~~57 "model": "gpt-5.5",~~57 "model": "${latestMainlineModelSlug}",

58 "input": "Find the null pointer exception: ...your code here...",58 "input": "Find the null pointer exception: ...your code here...",

59 "reasoning": { "effort": "high" }59 "reasoning": { "effort": "high" }

60 }'60 }'

70## Next steps70## Next steps

71 71

72- Visit the [Codex docs](https://developers.openai.com/codex) to learn what you can do with Codex, set up Codex in whichever interface you choose, or find more details.72- Visit the [Codex docs](https://developers.openai.com/codex) to learn what you can do with Codex, set up Codex in whichever interface you choose, or find more details.

~~73- Read [Using GPT-5.5](https://developers.openai.com/api/docs/guides/latest-model) for model selection, features, and migration guidance.~~73- Read <a href="/api/docs/guides/latest-model">Using GPT-5.5</a> for model selection, features, and migration guidance.

~~74- See [Prompt guidance for GPT-5.5](https://developers.openai.com/api/docs/guides/prompt-guidance) for prompting patterns that work well on coding and agentic tasks.~~74- See <a href="/api/docs/guides/prompt-guidance">Prompt guidance for GPT-5.5</a> for prompting patterns that work well on coding and agentic tasks.

75- Compare [`gpt-5.5`](https://developers.openai.com/api/docs/models/gpt-5.5) and [`gpt-5.3-codex`](https://developers.openai.com/api/docs/models/gpt-5.3-codex) on the model pages.75- Compare [`gpt-5.5`](https://developers.openai.com/api/docs/models/gpt-5.5) and [`gpt-5.3-codex`](https://developers.openai.com/api/docs/models/gpt-5.3-codex) on the model pages.

guides/image-generation.md +2 −2


1058 <td>1058 <td>

1059 <ul>1059 <ul>

1060 <li>1060 <li>

1061 Maximum edge length must be less than or equal to{" "}1061 Maximum edge length must be less than or equal to

1062 <code>3840px</code>1062 <code>3840px</code>

1063 </li>1063 </li>

1064 <li>1064 <li>

1068 Long edge to short edge ratio must not exceed <code>3:1</code>1068 Long edge to short edge ratio must not exceed <code>3:1</code>

1069 </li>1069 </li>

1070 <li>1070 <li>

1071 Total pixels must be at least <code>655,360</code> and no more than{" "}1071 Total pixels must be at least <code>655,360</code> and no more than

1072 <code>8,294,400</code>1072 <code>8,294,400</code>

1073 </li>1073 </li>

1074 </ul>1074 </ul>
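The size constraints listed above can be checked before sending a request. This is a minimal sketch: `check_image_size` is a hypothetical helper, and it covers only the three constraints visible in this excerpt (maximum edge, edge ratio, total pixels). Any constraint in the elided part of the list is not checked.

```python
from typing import List


def check_image_size(width: int, height: int) -> List[str]:
    """Return the visible constraints that a width x height image violates."""
    problems = []
    long_edge, short_edge = max(width, height), min(width, height)

    if long_edge > 3840:
        problems.append("maximum edge length exceeds 3840px")
    if long_edge > 3 * short_edge:
        problems.append("long edge to short edge ratio exceeds 3:1")

    total_pixels = width * height
    if not 655_360 <= total_pixels <= 8_294_400:
        problems.append("total pixels outside 655,360 to 8,294,400")
    return problems


print(check_image_size(3840, 2160))  # within all three limits
print(check_image_size(4000, 1000))  # too long and too narrow
```

Returning a list of problems rather than a boolean lets the caller report every violated limit at once.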

guides/images-vision.md +2 −2


542 <code>gpt-5-mini</code>, <code>gpt-5-nano</code>, <code>gpt-5.2</code>,542 <code>gpt-5-mini</code>, <code>gpt-5-nano</code>, <code>gpt-5.2</code>,

543 <code>gpt-5.3-codex</code>, <code>gpt-5-codex-mini</code>,543 <code>gpt-5.3-codex</code>, <code>gpt-5-codex-mini</code>,

544 <code>gpt-5.1-codex-mini</code>, <code>gpt-5.2-codex</code>,544 <code>gpt-5.1-codex-mini</code>, <code>gpt-5.2-codex</code>,

545 <code>gpt-5.2-chat-latest</code>, <code>o4-mini</code>, and the{" "}545 <code>gpt-5.2-chat-latest</code>, <code>o4-mini</code>, and the

546 <code>gpt-4.1-mini</code> and <code>gpt-4.1-nano</code> 2025-04-14546 <code>gpt-4.1-mini</code> and <code>gpt-4.1-nano</code> 2025-04-14

547 snapshot variants547 snapshot variants

548 </td>548 </td>

566 <code>low</code>, <code>high</code>, <code>auto</code>566 <code>low</code>, <code>high</code>, <code>auto</code>

567 </td>567 </td>

568 <td>568 <td>

569 Use tile-based resizing behavior. See{" "}569 Use tile-based resizing behavior. See

570 <a href="#gpt-4o-gpt-41-gpt-4o-mini-cua-and-o-series-except-o4-mini">570 <a href="#gpt-4o-gpt-41-gpt-4o-mini-cua-and-o-series-except-o4-mini">

571 the detailed behavior below571 the detailed behavior below

572 </a>572 </a>

guides/latency-optimization.md +5 −5


30 30

31 31

32 32

~~33Other factors that affect inference speed are the amount of{" "}~~33Other factors that affect inference speed are the amount of

~~34 <strong>compute</strong> you have available and any additional{" "}~~34 <strong>compute</strong> you have available and any additional

35 <strong>inference optimizations</strong> you employ. <br /> <br />35 <strong>inference optimizations</strong> you employ. <br /> <br />

36 Most people can't influence these factors directly, but if you're curious, and36 Most people can't influence these factors directly, but if you're curious, and

~~37 have some control over your infra, <strong>faster hardware</strong> or{" "}~~37 have some control over your infra, <strong>faster hardware</strong> or

38 <strong>running engines at a lower saturation</strong> may give you a modest38 <strong>running engines at a lower saturation</strong> may give you a modest

~~39 TPM boost. And if you're down in the trenches, there's a myriad of other{" "}~~39 TPM boost. And if you're down in the trenches, there's a myriad of other

40 <a href="https://lilianweng.github.io/posts/2023-01-10-inference-optimization/">40 <a href="https://lilianweng.github.io/posts/2023-01-10-inference-optimization/">

41 inference optimizations41 inference optimizations

~~42 </a>{" "}~~42 </a>

43 that are a bit beyond the scope of this guide.43 that are a bit beyond the scope of this guide.

44 44

45 45

guides/migrate-to-responses.md +1 −1


81- Structured Outputs API shape is different. Instead of `response_format`, use `text.format` in Responses. Learn more in the [Structured Outputs](https://developers.openai.com/api/docs/guides/structured-outputs) guide.81- Structured Outputs API shape is different. Instead of `response_format`, use `text.format` in Responses. Learn more in the [Structured Outputs](https://developers.openai.com/api/docs/guides/structured-outputs) guide.

82- The function-calling API shape is different, both for the function config on the request, and function calls sent back in the response. See the full difference in the [function calling guide](https://developers.openai.com/api/docs/guides/function-calling).82- The function-calling API shape is different, both for the function config on the request, and function calls sent back in the response. See the full difference in the [function calling guide](https://developers.openai.com/api/docs/guides/function-calling).

83- The Responses SDK has an `output_text` helper, which the Chat Completions SDK does not have.83- The Responses SDK has an `output_text` helper, which the Chat Completions SDK does not have.

84- In Chat Completions, conversation state must be managed manually. The Responses API has compatibility with the [Conversations API](https://developers.openai.com/api/docs/guides/docs/guides/conversation-state?api-mode=responses#using-the-conversations-api) for persistent conversations, or the ability to pass a `previous_response_id` to easily chain Responses together.84- In Chat Completions, conversation state must be managed manually. The Responses API has compatibility with the [Conversations API](https://developers.openai.com/api/docs/guides/conversation-state?api-mode=responses#using-the-conversations-api) for persistent conversations, or the ability to pass a `previous_response_id` to easily chain Responses together.

85 85

86## Migrating from Chat Completions86## Migrating from Chat Completions

87 87

guides/production-best-practices.md +1 −1


86 86

87#### Model87#### Model

88 88

89Our API offers different models with varying levels of complexity and generality. The most capable models, such as `gpt-5`, can generate more complex and diverse completions, but they also take longer to process your query.89Our API offers different models with varying levels of complexity and generality. The most capable models, such as `gpt-5.5`, can generate more complex and diverse completions, but they also take longer to process your query.

90Models such as `gpt-5.4-mini` and `gpt-5.4-nano` can generate faster and cheaper Responses, while `gpt-5.5` is a stronger default when you want more headroom on complex tasks. You can choose the model that best suits your use case and the trade-off between speed, cost, and quality.90Models such as `gpt-5.4-mini` and `gpt-5.4-nano` can generate faster and cheaper Responses, while `gpt-5.5` is a stronger default when you want more headroom on complex tasks. You can choose the model that best suits your use case and the trade-off between speed, cost, and quality.

91 91

92#### Number of completion tokens92#### Number of completion tokens

guides/prompt-engineering.md +17 −19


43- **GPT models** are fast, cost-efficient, and highly intelligent, but benefit from more explicit instructions around how to accomplish tasks.43- **GPT models** are fast, cost-efficient, and highly intelligent, but benefit from more explicit instructions around how to accomplish tasks.

44- **Large and small (mini or nano) models** offer trade-offs for speed, cost, and intelligence. Large models are more effective at understanding prompts and solving problems across domains, while small models are generally faster and cheaper to use.44- **Large and small (mini or nano) models** offer trade-offs for speed, cost, and intelligence. Large models are more effective at understanding prompts and solving problems across domains, while small models are generally faster and cheaper to use.

45 45

~~46When in doubt, [`gpt-4.1`](https://developers.openai.com/api/docs/models/gpt-4.1) offers a solid combination of intelligence, speed, and cost effectiveness.~~46When in doubt, [`gpt-5.5`](https://developers.openai.com/api/docs/models/gpt-5.5) offers a strong default for general-purpose text generation and prompt iteration.

47 47

48## Prompt engineering48## Prompt engineering

49 49

497 497

498Models have different context window sizes from the low 100k range up to one million tokens for newer GPT-4.1 models. [Refer to the model docs](https://developers.openai.com/api/docs/models) for specific context window sizes per model.498Models have different context window sizes from the low 100k range up to one million tokens for newer GPT-4.1 models. [Refer to the model docs](https://developers.openai.com/api/docs/models) for specific context window sizes per model.

499 499

500## Prompting GPT-5 models500## Prompting current GPT-5 series models

501 501

502GPT models like [`gpt-5`](https://developers.openai.com/api/docs/models/gpt-5) benefit from precise instructions that explicitly provide the logic and data required to complete the task in the prompt. GPT-5 in particular is highly steerable and responsive to well-specified prompts. To get the most out of GPT-5, refer to the prompting guide in the cookbook.502GPT models like [`gpt-5.5`](https://developers.openai.com/api/docs/models/gpt-5.5) benefit from precise instructions that explicitly provide the logic and data required to complete the task in the prompt. To get the most out of the latest GPT-5 series model, start with the current prompting guide.

503 503

504<a504<a href="/api/docs/guides/prompt-guidance">

505 href="https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide"

506 target="_blank"

507 rel="noreferrer"

508>

509 505

510 506

511<span slot="icon">507<span slot="icon">

512 </span>508 </span>

513 Get the most out of prompting GPT-5 with the tips and tricks in this509 Get the most out of prompting the latest GPT-5 series model with current

514 prompting guide, extracted from real-world use cases and practical510 guidance, practical examples, and migration notes.

515 experience.

516 511

517 512

518</a>513</a>

519 514

520### GPT-5 prompting best practices515### Prompting best practices for the latest GPT-5 series model

521 516

522While the [cookbook](https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_prompting_guide) has the best and most comprehensive guidance for prompting this model, here are a few best practices to keep in mind.517For the full current treatment, use the [prompt guidance](https://developers.openai.com/api/docs/guides/prompt-guidance) guide. The practical reminders below still apply.

523 518

524Coding519Coding

525 520

526#### Coding521#### Coding

527 522

528Prompting GPT-5 for coding tasks is most effective when following a few best practices: define the agent's role, enforce structured tool use with examples, require thorough testing for correctness, and set Markdown standards for clean output.523Prompting `gpt-5.5` for coding tasks is most effective when following a few best practices: define the agent's role, enforce structured tool use with examples, require thorough testing for correctness, and set Markdown standards for clean output.

529 524

530**Explicit role and workflow guidance**525**Explicit role and workflow guidance**

531Frame the model as a software engineering agent with well-defined responsibilities. Provide clear instructions for using tools like `functions.run` for code tasks, and specify when not to use certain modes—for example, avoid interactive execution unless necessary.526Frame the model as a software engineering agent with well-defined responsibilities. Provide clear instructions for using tools like `functions.run` for code tasks, and specify when not to use certain modes—for example, avoid interactive execution unless necessary.

539**Markdown standards**534**Markdown standards**

540Guide the model to generate clean, semantically correct markdown using inline code, code fences, lists, and tables where appropriate—and to format file paths, functions, and classes with backticks.535Guide the model to generate clean, semantically correct markdown using inline code, code fences, lists, and tables where appropriate—and to format file paths, functions, and classes with backticks.
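
The four coding practices above (role, structured tool use, testing, Markdown standards) can be kept as separate, reviewable pieces and assembled into one system prompt. The following is a minimal sketch, not an official template: `CodingPromptSpec`, `build_coding_prompt`, and the example rule strings are hypothetical names chosen for illustration; only `functions.run` comes from the guidance above.

```python
from dataclasses import dataclass, field


@dataclass
class CodingPromptSpec:
    """Hypothetical container for the four practices described above."""
    role: str
    tool_rules: list = field(default_factory=list)
    testing_rules: list = field(default_factory=list)
    markdown_rules: list = field(default_factory=list)


def _section(title, rules):
    # Render one titled section as a bulleted list.
    lines = [f"# {title}"]
    lines.extend(f"- {rule}" for rule in rules)
    return "\n".join(lines)


def build_coding_prompt(spec):
    # The role always comes first; empty sections are skipped.
    parts = [f"# Role\n{spec.role}"]
    sections = (
        ("Tool use", spec.tool_rules),
        ("Testing", spec.testing_rules),
        ("Markdown", spec.markdown_rules),
    )
    for title, rules in sections:
        if rules:
            parts.append(_section(title, rules))
    return "\n\n".join(parts)


spec = CodingPromptSpec(
    role="You are a software engineering agent responsible for this repository.",
    tool_rules=[
        "Use `functions.run` for code tasks.",
        "Avoid interactive execution unless necessary.",
    ],
    testing_rules=["Run the relevant tests before reporting a task as done."],
    markdown_rules=["Format file paths, functions, and classes with backticks."],
)
prompt = build_coding_prompt(spec)
print(prompt)
```

Keeping each practice in its own list makes it easier to change one rule and compare results without rewriting the whole prompt.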

541 536

542For detailed guidance and prompt samples specific to coding, see our [GPT-5 prompting guide](https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_prompting_guide).537For detailed guidance and prompt samples specific to coding, see our [prompt guidance](https://developers.openai.com/api/docs/guides/prompt-guidance) guide.

543 538

544Front-end engineering539Front-end engineering

545 540

546[GPT-5](https://developers.openai.com/api/docs/guides/latest-model) performs well at building front ends from scratch as well as contributing to large, established codebases. To get the best results, we recommend using the following libraries:541[GPT-5.5](https://developers.openai.com/api/docs/models/gpt-5.5)

542performs well at building front ends from scratch as well as contributing to

543large, established codebases. To get the best results, we recommend using the

544following libraries:

547 545

548- **Styling / UI:** Tailwind CSS, shadcn/ui, Radix Themes546- **Styling / UI:** Tailwind CSS, shadcn/ui, Radix Themes

549- **Icons:** Lucide, Material Symbols, Heroicons547- **Icons:** Lucide, Material Symbols, Heroicons

573- **Pages:** Provide templates for common layouts.571- **Pages:** Provide templates for common layouts.

574- **Agent Instructions:** Ask the model to confirm design assumptions, scaffold projects, enforce standards, integrate APIs, test states, and document code.572- **Agent Instructions:** Ask the model to confirm design assumptions, scaffold projects, enforce standards, integrate APIs, test states, and document code.

575 573

576For detailed guidance and prompt samples specific to frontend development, see our [frontend engineering cookbook.](https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_frontend)574For detailed guidance and prompt samples specific to frontend development, see our [prompt guidance](https://developers.openai.com/api/docs/guides/prompt-guidance) guide.

577 575

578Agentic tasks576Agentic tasks

579 577

580For agentic and long-running rollouts with GPT-5, focus your prompts on three core practices: plan tasks thoroughly to ensure complete resolution, provide clear preambles for major tool usage decisions, and use a TODO tool to track workflow and progress in an organized manner.578For agentic and long-running rollouts with `gpt-5.5`, focus your prompts on three core practices: plan tasks thoroughly to ensure complete resolution, provide clear preambles for major tool usage decisions, and use a TODO tool to track workflow and progress in an organized manner.

581 579

582**Planning and persistence**580**Planning and persistence**

583Instruct the model to resolve the full query before yielding control, decomposing it into sub-tasks and reflecting after each tool call to confirm completeness.581Instruct the model to resolve the full query before yielding control, decomposing it into sub-tasks and reflecting after each tool call to confirm completeness.

611 609

612Use a TODO list tool or rubric to enforce structured planning and avoid missed steps.610Use a TODO list tool or rubric to enforce structured planning and avoid missed steps.
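
As an illustration of the TODO-tool idea, the sketch below shows an in-memory tracker that an agent harness could expose as a function tool. It is an assumption-laden example: the `TodoList` class and its method names are hypothetical and are not part of any OpenAI SDK.

```python
from dataclasses import dataclass, field


@dataclass
class TodoList:
    """Hypothetical TODO tracker; maps each task to a done flag."""
    items: dict = field(default_factory=dict)

    def add(self, task):
        # Adding an existing task leaves its current status untouched.
        self.items.setdefault(task, False)

    def complete(self, task):
        if task not in self.items:
            raise KeyError(f"unknown task: {task}")
        self.items[task] = True

    def pending(self):
        # Tasks are returned in the order they were added.
        return [task for task, done in self.items.items() if not done]

    def may_yield(self):
        # True only when no tracked task is still pending.
        return not self.pending()


todo = TodoList()
todo.add("reproduce the bug")
todo.add("write a failing test")
todo.complete("reproduce the bug")
print(todo.pending())
print(todo.may_yield())
```

A harness could refuse to end the turn while `may_yield()` is false, which matches the instruction to resolve the full query before yielding control.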

613 611

614For detailed guidance and prompt samples specific to building agents with GPT-5 , see the [GPT-5 prompting guide.](https://developers.openai.com/cookbook/examples/gpt-5/gpt-5_prompting_guide)612For detailed guidance and prompt samples specific to building agents, see the [prompt guidance](https://developers.openai.com/api/docs/guides/prompt-guidance) guide.

615 613

616## Prompting reasoning models614## Prompting reasoning models

617 615

guides/prompt-optimizer.md +1 −1

1# Prompt optimizer1# Prompt optimizer

2 2

3The [prompt optimizer](https://platform.openai.com/chat/edit?models=gpt-5&optimize=true) is a chat interface in the dashboard, where you enter a prompt, and we optimize it according to current best practices before returning it to you. Pairing the prompt optimizer with [datasets](https://developers.openai.com/api/docs/guides/evaluation-getting-started) is a powerful way to automatically improve prompts.3The [prompt optimizer](https://platform.openai.com/chat/edit?optimize=true) is a chat interface in the dashboard, where you enter a prompt, and we optimize it according to current best practices before returning it to you. Pairing the prompt optimizer with [datasets](https://developers.openai.com/api/docs/guides/evaluation-getting-started) is a powerful way to automatically improve prompts.

4 4

5## Prepare your data5## Prepare your data

6 6

guides/realtime.md +4 −10

27 <tr>27 <tr>

28 <td>Build a low-latency voice agent</td>28 <td>Build a low-latency voice agent</td>

29 <td className="whitespace-nowrap">29 <td className="whitespace-nowrap">

~~30 <a href="/api/docs/models/gpt-realtime-2">~~30 [`gpt-realtime-2`](https://developers.openai.com/api/docs/models/gpt-realtime-2)

~~31 <code>gpt-realtime-2</code>~~

~~32 </a>~~

33 </td>31 </td>

34 <td>32 <td>

35 <a href="/api/docs/guides/voice-agents">Voice agents</a>33 <a href="/api/docs/guides/voice-agents">Voice agents</a>

38 <tr>36 <tr>

39 <td>Translate live speech into another language</td>37 <td>Translate live speech into another language</td>

40 <td className="whitespace-nowrap">38 <td className="whitespace-nowrap">

~~41 <a href="/api/docs/models/gpt-realtime-translate">~~39 [`gpt-realtime-translate`](https://developers.openai.com/api/docs/models/gpt-realtime-translate)

~~42 <code>gpt-realtime-translate</code>~~

~~43 </a>~~

44 </td>40 </td>

45 <td>41 <td>

46 <a href="/api/docs/guides/realtime-translation">Realtime translation</a>42 <a href="/api/docs/guides/realtime-translation">Realtime translation</a>

49 <tr>45 <tr>

50 <td>Transcribe live audio into streaming text</td>46 <td>Transcribe live audio into streaming text</td>

51 <td className="whitespace-nowrap">47 <td className="whitespace-nowrap">

~~52 <a href="/api/docs/models/gpt-realtime-whisper">~~48 [`gpt-realtime-whisper`](https://developers.openai.com/api/docs/models/gpt-realtime-whisper)

~~53 <code>gpt-realtime-whisper</code>~~

~~54 </a>~~

55 </td>49 </td>

56 <td>50 <td>

57 <a href="/api/docs/guides/realtime-transcription">51 <a href="/api/docs/guides/realtime-transcription">

152 146

153You can transcribe audio in more than one way. Use a realtime transcription session when your application needs live transcript deltas from streaming audio. Use the [Speech to text](https://developers.openai.com/api/docs/guides/speech-to-text) guide for file uploads, request-based transcription, or diarization-focused workflows.147You can transcribe audio in more than one way. Use a realtime transcription session when your application needs live transcript deltas from streaming audio. Use the [Speech to text](https://developers.openai.com/api/docs/guides/speech-to-text) guide for file uploads, request-based transcription, or diarization-focused workflows.

154 148

155For realtime transcription, `gpt-realtime-whisper` gives you controllable latency. Lower delay settings produce earlier partial text, while higher delay settings can improve transcript quality. Test with your real audio conditions, target languages, accents, and domain vocabulary before choosing a production default.149For realtime transcription, [`gpt-realtime-whisper`](https://developers.openai.com/api/docs/models/gpt-realtime-whisper) gives you controllable latency. Lower delay settings produce earlier partial text, while higher delay settings can improve transcript quality. Test with your real audio conditions, target languages, accents, and domain vocabulary before choosing a production default.

156 150

157See [Realtime transcription](https://developers.openai.com/api/docs/guides/realtime-transcription) for session configuration and event handling.151See [Realtime transcription](https://developers.openai.com/api/docs/guides/realtime-transcription) for session configuration and event handling.

158 152

guides/realtime-conversations.md +1 −1

1# Realtime conversations1# Realtime conversations

2 2

3Once you have connected to the Realtime API through either [WebRTC](https://developers.openai.com/api/docs/guides/realtime-webrtc) or [WebSocket](https://developers.openai.com/api/docs/guides/realtime-websocket), you can call a Realtime model (such as [gpt-realtime-2](https://developers.openai.com/api/docs/models/gpt-realtime-2)) to have speech-to-speech conversations. Doing so will require you to **send client events** to initiate actions, and **listen for server events** to respond to actions taken by the Realtime API.3Once you have connected to the Realtime API through either [WebRTC](https://developers.openai.com/api/docs/guides/realtime-webrtc) or [WebSocket](https://developers.openai.com/api/docs/guides/realtime-websocket), you can call a Realtime model (such as [`gpt-realtime-2`](https://developers.openai.com/api/docs/models/gpt-realtime-2)) to have speech-to-speech conversations. Doing so will require you to **send client events** to initiate actions, and **listen for server events** to respond to actions taken by the Realtime API.

4 4

5This guide will walk through the event flows required to use model capabilities like audio and text generation, image input, and function calling, and how to think about the state of a Realtime Session.5This guide will walk through the event flows required to use model capabilities like audio and text generation, image input, and function calling, and how to think about the state of a Realtime Session.

6 6

guides/realtime-mcp.md +1 −1

246```246```

247 247

248 248

249Remote MCP servers{" "}249Remote MCP servers

250 <strong>don't automatically receive the full conversation context</strong>,250 <strong>don't automatically receive the full conversation context</strong>,

251 but <strong>they can see any data the model sends in a tool call</strong>.251 but <strong>they can see any data the model sends in a tool call</strong>.

252 <strong>Keep the tool surface narrow</strong> with <code>allowed_tools</code>,252 <strong>Keep the tool surface narrow</strong> with <code>allowed_tools</code>,

guides/realtime-sip.md +1 −1

67 -H "Content-Type: application/json" \67 -H "Content-Type: application/json" \

68 -d '{68 -d '{

69 "type": "realtime",69 "type": "realtime",

~~70 "model": "gpt-realtime",~~70 "model": "gpt-realtime-2",

71 "instructions": "You are Alex, a friendly concierge for Example Corp."71 "instructions": "You are Alex, a friendly concierge for Example Corp."

72 }'72 }'

73```73```

guides/realtime-transcription.md +1 −1

53 </td>53 </td>

54 <td>Existing Whisper integrations.</td>54 <td>Existing Whisper integrations.</td>

55 <td>55 <td>

~~56 Not natively streaming in the same way as{" "}~~56 Not natively streaming in the same way as

57 <code>gpt-realtime-whisper</code>.57 <code>gpt-realtime-whisper</code>.

58 </td>58 </td>

59 </tr>59 </tr>

guides/realtime-webrtc.md +2 −2

37 37

38const sessionConfig = JSON.stringify({38const sessionConfig = JSON.stringify({

39 type: "realtime",39 type: "realtime",

~~40 model: "gpt-realtime",~~40 model: "gpt-realtime-2",

41 audio: { output: { voice: "marin" } },41 audio: { output: { voice: "marin" } },

42});42});

43 43

139const sessionConfig = JSON.stringify({139const sessionConfig = JSON.stringify({

140 session: {140 session: {

141 type: "realtime",141 type: "realtime",

142 model: "gpt-realtime",142 model: "gpt-realtime-2",

143 audio: {143 audio: {

144 output: {144 output: {

145 voice: "marin",145 voice: "marin",

guides/realtime-websocket.md +1 −1

126```javascript126```javascript

127 127

128 128

129const url = "wss://api.openai.com/v1/realtime?model=gpt-realtime";129const url = "wss://api.openai.com/v1/realtime?model=gpt-realtime-2";

130const ws = new WebSocket(url, {130const ws = new WebSocket(url, {

131 headers: {131 headers: {

132 Authorization: "Bearer " + process.env.OPENAI_API_KEY,132 Authorization: "Bearer " + process.env.OPENAI_API_KEY,

guides/reasoning.md +8 −8

38\`;39\`;

39 40

40const response = await openai.responses.create({41const response = await openai.responses.create({

~~41 model: "gpt-5.5",~~42 model: "${latestMainlineModelSlug}",

42 reasoning: { effort: "low" },43 reasoning: { effort: "low" },

43 input: [44 input: [

44 {45 {

62"""63"""

63 64

64response = client.responses.create(65response = client.responses.create(

~~65 model="gpt-5.5",~~66 model="${latestMainlineModelSlug}",

66 reasoning={"effort": "low"},67 reasoning={"effort": "low"},

67 input=[68 input=[

68 {69 {

80 -H "Content-Type: application/json" \\81 -H "Content-Type: application/json" \\

81 -H "Authorization: Bearer $OPENAI_API_KEY" \\82 -H "Authorization: Bearer $OPENAI_API_KEY" \\

82 -d '{83 -d '{

~~83 "model": "gpt-5.5",~~84 "model": "${latestMainlineModelSlug}",

84 "reasoning": {"effort": "low"},85 "reasoning": {"effort": "low"},

85 "input": [86 "input": [

86 {87 {

172\`;173\`;

173 174

174const response = await openai.responses.create({175const response = await openai.responses.create({

175 model: "gpt-5.5",176 model: "${latestMainlineModelSlug}",

176 reasoning: { effort: "medium" },177 reasoning: { effort: "medium" },

177 input: [178 input: [

178 {179 {

207"""208"""

208 209

209response = client.responses.create(210response = client.responses.create(

210 model="gpt-5.5",211 model="${latestMainlineModelSlug}",

211 reasoning={"effort": "medium"},212 reasoning={"effort": "medium"},

212 input=[213 input=[

213 {214 {

273const openai = new OpenAI();274const openai = new OpenAI();

274 275

275const response = await openai.responses.create({276const response = await openai.responses.create({

276 model: "gpt-5.5",277 model: "${latestMainlineModelSlug}",

277 input: "What is the capital of France?",278 input: "What is the capital of France?",

278 reasoning: {279 reasoning: {

279 effort: "low",280 effort: "low",

289client = OpenAI()290client = OpenAI()

290 291

291response = client.responses.create(292response = client.responses.create(

292 model="gpt-5.5",293 model="${latestMainlineModelSlug}",

293 input="What is the capital of France?",294 input="What is the capital of France?",

294 reasoning={295 reasoning={

295 "effort": "low",296 "effort": "low",

305 -H "Content-Type: application/json" \\306 -H "Content-Type: application/json" \\

306 -H "Authorization: Bearer $OPENAI_API_KEY" \\307 -H "Authorization: Bearer $OPENAI_API_KEY" \\

307 -d '{308 -d '{

308 "model": "gpt-5.5",309 "model": "${latestMainlineModelSlug}",

309 "input": "What is the capital of France?",310 "input": "What is the capital of France?",

310 "reasoning": {311 "reasoning": {

311 "effort": "low",312 "effort": "low",

guides/secure-mcp-tunnels.md +34 −4

2 2

3Secure MCP Tunnel lets you connect private MCP servers to supported OpenAI products without opening inbound firewall ports or exposing those servers to the public internet. Run `tunnel-client` inside the network that can already reach your MCP server; it opens an outbound HTTPS path to OpenAI, pulls queued MCP work, forwards requests locally, and returns responses through the same tunnel.3Secure MCP Tunnel lets you connect private MCP servers to supported OpenAI products without opening inbound firewall ports or exposing those servers to the public internet. Run `tunnel-client` inside the network that can already reach your MCP server; it opens an outbound HTTPS path to OpenAI, pulls queued MCP work, forwards requests locally, and returns responses through the same tunnel.

4 4

5## What is an MCP tunnel?

7An MCP tunnel is an outbound-only connection from a host inside your network to an OpenAI-hosted MCP endpoint. Use it when your MCP server is private, on-premises, or behind a firewall, but ChatGPT, Codex, the Responses API, or another supported OpenAI surface still needs to call it.

9Secure MCP Tunnel keeps the MCP server private while giving supported OpenAI products a normal MCP request path. `tunnel-client` polls OpenAI for work, forwards MCP requests locally, and returns responses through the same tunnel.

5## Use Secure MCP Tunnel when11## Use Secure MCP Tunnel when

6 12

7- Your MCP server runs on a private network, on-premises, on a developer machine, or behind existing access controls.13- Your MCP server runs on a private network, on-premises, on a developer machine, or behind existing access controls.

8- You want ChatGPT, Codex, the Responses API, or another supported OpenAI surface to use that server without making the MCP server public.14- You want ChatGPT, Codex, the Responses API, or another supported OpenAI surface to use that server without making the MCP server public.

~~9- Your network allows the host running `tunnel-client` to make outbound HTTPS requests to OpenAI.~~15- Your network allows the host running `tunnel-client` to make outbound HTTPS requests to `api.openai.com:443` by default, or `mtls.api.openai.com:443` when control-plane mTLS is configured, and reach the private MCP server.

10- Start with the [MCP and Connectors guide](https://developers.openai.com/api/docs/guides/tools-connectors-mcp) for general MCP concepts.16- Start with the [MCP and Connectors guide](https://developers.openai.com/api/docs/guides/tools-connectors-mcp) for general MCP concepts.

11 17

12## How it works18## How it works

36- A tunnel manager with Tunnels **Read** + **Manage** if you need to create or edit tunnel metadata.42- A tunnel manager with Tunnels **Read** + **Manage** if you need to create or edit tunnel metadata.

37- An MCP server that `tunnel-client` can reach over stdio or HTTP from inside your network.43- An MCP server that `tunnel-client` can reach over stdio or HTTP from inside your network.

38 44

45## Network requirements

47`tunnel-client` does not need inbound internet access. It needs outbound HTTPS to OpenAI and local reachability to the private MCP server:

49| From | To | Used for |

50| ---------------------------- | ------------------------------------------------------ | ------------------------------------------------------------------- |

51| Host running `tunnel-client` | `api.openai.com:443` over HTTPS on `/v1/tunnel/*` | Default polling and response posting. |

52| Host running `tunnel-client` | `mtls.api.openai.com:443` over HTTPS on `/v1/tunnel/*` | Polling and response posting when control-plane mTLS is configured. |

53| Host running `tunnel-client` | The configured stdio command or MCP server URL | Forwarding MCP requests from inside your network. |
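
Before starting `tunnel-client`, it can help to confirm that the host can open outbound TCP connections to the endpoints in the table. The sketch below is a rough pre-flight check, not a replacement for `tunnel-client doctor`: it only tests TCP reachability on port 443, does not validate TLS, the `/v1/tunnel/*` path, or proxy settings, and the `check_outbound` helper is a hypothetical name introduced here.

```python
import socket

# Hostnames and ports taken from the network requirements table.
DEFAULT_ENDPOINTS = [("api.openai.com", 443)]
MTLS_ENDPOINTS = [("mtls.api.openai.com", 443)]


def check_outbound(endpoints, connect=socket.create_connection, timeout=5.0):
    """Try a TCP connection to each (host, port) and report the outcome.

    `connect` is injectable so the check can be exercised without a network.
    """
    results = {}
    for host, port in endpoints:
        try:
            conn = connect((host, port), timeout=timeout)
        except OSError as exc:
            # Covers DNS failures, refused connections, and timeouts.
            results[(host, port)] = f"unreachable: {exc}"
        else:
            conn.close()
            results[(host, port)] = "ok"
    return results


# Example (performs real network calls when uncommented):
# for endpoint, status in check_outbound(DEFAULT_ENDPOINTS).items():
#     print(endpoint, status)
```

If the host sits behind an outbound proxy, a direct socket check like this may fail even though `tunnel-client` would work once the proxy is configured.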

39## Set up tunnel-client55## Set up tunnel-client

40 56

41Open [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels), then download the latest public `tunnel-client` release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest). Keep your runbook pointed at the latest-release URL instead of hard-coding a specific release URL.57Open [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels), then use the download link there or the latest public `tunnel-client` release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest). Keep your runbook pointed at the latest-release URL instead of hard-coding a specific release URL.

42 58

~~43For a local stdio MCP server, the shortest profile-based flow is:~~59If you already have a binary, start with `tunnel-client help quickstart`. For a named local stdio profile, use:

44 60

45```bash61```bash

46export CONTROL_PLANE_API_KEY="sk-..."62export CONTROL_PLANE_API_KEY="sk-..."

59 75

60Keep `tunnel-client run ...` healthy while you create or test the connector. Connector discovery and MCP tool calls depend on the running client.76Keep `tunnel-client run ...` healthy while you create or test the connector. Connector discovery and MCP tool calls depend on the running client.

61 77

78## Choose where to run tunnel-client

80Run `tunnel-client` in the same trust boundary that can already reach the private MCP server. Common deployment patterns are:

82- **Kubernetes sidecar:** Run `tunnel-client` beside the MCP server in one Pod and connect over `localhost`.

83- **Dedicated Kubernetes deployment:** Run `tunnel-client` separately when the MCP server is already reachable through a private Service.

84- **VM or systemd service:** Run `tunnel-client` on a host that can reach the MCP server over private networking.

62## Connect from ChatGPT86## Connect from ChatGPT

63 87

64Open [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors), create a custom connector, and choose **Tunnel** under **Connection**. Select an available tunnel when ChatGPT lists it, or paste a valid `tunnel_id` if you already have one.88Open [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors), create a custom connector, and choose **Tunnel** under **Connection**. Select an available tunnel when ChatGPT lists it, or paste a valid `tunnel_id` if you already have one.

79- `tunnel-client` authenticates to the OpenAI tunnel control plane; supported OpenAI products use the OpenAI-hosted tunnel endpoint.103- `tunnel-client` authenticates to the OpenAI tunnel control plane; supported OpenAI products use the OpenAI-hosted tunnel endpoint.

80- Tunnel access follows the existing organization and workspace context instead of introducing a separate public ingress path.104- Tunnel access follows the existing organization and workspace context instead of introducing a separate public ingress path.

81- `tunnel-client` supports enterprise networking requirements such as outbound proxies, custom CA bundles, control-plane client certificates, and MCP-side `mTLS`.105- `tunnel-client` supports enterprise networking requirements such as outbound proxies, custom CA bundles, control-plane client certificates, and MCP-side `mTLS`.

~~82- The embedded Harpoon MCP server is limited to labeled, allowlisted HTTP callouts used by flows such as OAuth metadata handling. It is not a general-purpose outbound proxy.~~106

107## Advanced: allowlisted HTTP callouts

108

109Secure MCP Tunnel can also support narrowly scoped HTTP callouts from supported agent or API flows into a customer network. `tunnel-client` includes an embedded MCP server, Harpoon, that exposes configured HTTP targets by label and lets callers invoke them through the tunnel with bounded request/response limits.

110

111Use this when you need to reach a small set of private REST endpoints without exposing them publicly. Harpoon is not a general-purpose proxy: callers cannot choose arbitrary hosts, and requests are limited to the targets and methods configured by the customer.
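
To make the "not a general-purpose proxy" point concrete, the sketch below models label-based allowlisting in plain Python. It is purely illustrative and does not reflect Harpoon's actual configuration format or implementation: the `ALLOWED_TARGETS` table, the example hostnames, the `MAX_BODY_BYTES` limit, and `resolve_callout` are all assumptions.

```python
# Hypothetical allowlist: label -> (base URL, permitted HTTP methods).
ALLOWED_TARGETS = {
    "inventory": ("https://inventory.internal.example/api", {"GET"}),
    "tickets": ("https://tickets.internal.example/v1", {"GET", "POST"}),
}

MAX_BODY_BYTES = 64 * 1024  # illustrative request-size bound


def resolve_callout(label, method, path, body=b""):
    """Return the full URL for an allowed call, or raise if it is not permitted."""
    if label not in ALLOWED_TARGETS:
        raise PermissionError(f"unknown target label: {label}")
    base, methods = ALLOWED_TARGETS[label]
    if method.upper() not in methods:
        raise PermissionError(f"{method} is not allowed for {label}")
    if len(body) > MAX_BODY_BYTES:
        raise ValueError("request body exceeds the configured limit")
    # Callers supply only a path, never a host, so reject anything that
    # tries to smuggle in a scheme, a host, or a parent-directory hop.
    if "://" in path or path.startswith("//") or ".." in path.split("/"):
        raise PermissionError("path must stay within the configured target")
    return f"{base.rstrip('/')}/{path.lstrip('/')}"


print(resolve_callout("inventory", "GET", "/items/1"))
```

The caller chooses a label and a path under it; the host and the permitted methods come only from the operator's configuration.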

83 112

84## Troubleshooting113## Troubleshooting

85 114

87- **Connector discovery or tool calls fail:** Confirm that `tunnel-client run ...` is still running, then re-run `tunnel-client doctor --profile <name> --explain`.116- **Connector discovery or tool calls fail:** Confirm that `tunnel-client run ...` is still running, then re-run `tunnel-client doctor --profile <name> --explain`.

88- **You can inspect a tunnel but cannot edit it:** The operator likely has Tunnels **Read** but not Tunnels **Manage**.117- **You can inspect a tunnel but cannot edit it:** The operator likely has Tunnels **Read** but not Tunnels **Manage**.

89- `tunnel-client` exposes `/healthz`, `/readyz`, `/metrics`, and a local admin UI at `/ui`.118- `tunnel-client` exposes `/healthz`, `/readyz`, `/metrics`, and a local admin UI at `/ui`.

119- The admin UI is loopback-only by default. Expose it remotely only when you intentionally need an operator network to reach it.

90- Use those surfaces to confirm that the client is healthy, ready, and polling before testing from ChatGPT, Codex, or an API flow.120- Use those surfaces to confirm that the client is healthy, ready, and polling before testing from ChatGPT, Codex, or an API flow.

91- If the client is not connected, requests through the tunnel fail until `tunnel-client` reconnects.121- If the client is not connected, requests through the tunnel fail until `tunnel-client` reconnects.

92- Raw HTTP logging is disabled by default, and support exports are redacted.122- Raw HTTP logging is disabled by default, and support exports are redacted.

guides/speech-to-text.md +1 −75

494 494

495### Streaming the transcription of an ongoing audio recording495### Streaming the transcription of an ongoing audio recording

496 496

497In the Realtime API, you can stream the transcription of an ongoing audio recording. To start a streaming session with the Realtime API, create a WebSocket connection with the following URL:497For live audio from a microphone, call, or media stream, use the [Realtime transcription](https://developers.openai.com/api/docs/guides/realtime-transcription) guide instead of the file-oriented streaming path above. It covers the current transcription-session flow and the recommended realtime path with [`gpt-realtime-whisper`](https://developers.openai.com/api/docs/models/gpt-realtime-whisper).

~~498~~

499```

500wss://api.openai.com/v1/realtime?intent=transcription

501```

~~502~~

503Below is an example payload for setting up a transcription session:

~~504~~

505```json

506{

507 "type": "transcription_session.update",

508 "input_audio_format": "pcm16",

509 "input_audio_transcription": {

510 "model": "gpt-4o-transcribe",

511 "prompt": "",

512 "language": ""

513 },

514 "turn_detection": {

515 "type": "server_vad",

516 "threshold": 0.5,

517 "prefix_padding_ms": 300,

518 "silence_duration_ms": 500

519 },

520 "input_audio_noise_reduction": {

521 "type": "near_field"

522 },

523 "include": ["item.input_audio_transcription.logprobs"]

524}

525```

~~526~~

527To stream audio data to the API, append audio buffers:

~~528~~

529```json

530{

531 "type": "input_audio_buffer.append",

532 "audio": "Base64EncodedAudioData"

533}

534```

~~535~~

536When in VAD mode, the API will respond with `input_audio_buffer.committed` every time a chunk of speech has been detected. Use `input_audio_buffer.committed.item_id` and `input_audio_buffer.committed.previous_item_id` to enforce the ordering.

~~537~~

538The API responds with transcription events indicating speech start, stop, and completed transcriptions.

~~539~~

540The primary resource used by the streaming ASR API is the `TranscriptionSession`:

~~541~~

542```json

543{

544 "object": "realtime.transcription_session",

545 "id": "string",

546 "input_audio_format": "pcm16",

547 "input_audio_transcription": [{

548 "model": "whisper-1" | "gpt-4o-transcribe" | "gpt-4o-mini-transcribe",

549 "prompt": "string",

550 "language": "string"

551 }],

552 "turn_detection": {

553 "type": "server_vad",

554 "threshold": "float",

555 "prefix_padding_ms": "integer",

556 "silence_duration_ms": "integer",

557 } | null,

558 "input_audio_noise_reduction": {

559 "type": "near_field" | "far_field"

560 },

561 "include": ["string"]

562}

563```

~~564~~

565Authenticate directly through the WebSocket connection using your API key or an ephemeral token obtained from:

~~566~~

567```

568POST /v1/realtime/transcription_sessions

569```

~~570~~

571This endpoint returns an ephemeral token (`client_secret`) to securely authenticate WebSocket connections.

572 498

573## Improving reliability499## Improving reliability

574 500

guides/structured-outputs.md +3 −3

232 232

233 233

234 234

235When to use Structured Outputs via function calling vs via{" "}235When to use Structured Outputs via function calling vs via

236 <span className="monospace">text.format</span>236 <span className="monospace">text.format</span>

237 237

238 238

267 267

268 The remainder of this guide will focus on non-function calling use cases in268 The remainder of this guide will focus on non-function calling use cases in

269 the Responses API. To learn more about how to use Structured Outputs with269 the Responses API. To learn more about how to use Structured Outputs with

270 function calling, check out the{" "}270 function calling, check out the

271 [Function Calling](https://developers.openai.com/api/docs/guides/function-calling#function-calling-with-structured-outputs){" "}271 [Function Calling](https://developers.openai.com/api/docs/guides/function-calling#function-calling-with-structured-outputs)

272 guide.272 guide.

273 273

274 274

guides/tools-connectors-mcp.md +17 −3

19 19

20## Secure MCP Tunnel20## Secure MCP Tunnel

21 21

22If your MCP server is private, use [Secure MCP Tunnel](https://developers.openai.com/api/docs/guides/secure-mcp-tunnels) to connect it to supported OpenAI products without exposing the server to the public internet. Download the latest public release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest).22If your MCP server is private, on-premises, or behind a firewall, use [Secure MCP Tunnel](https://developers.openai.com/api/docs/guides/secure-mcp-tunnels) to connect it to supported OpenAI products without exposing the server to the public internet. Download the latest public release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest).

23 23

24## Quickstart24## Quickstart

25 25

124 124

125 It is very important that developers trust any remote MCP server they use with125 It is very important that developers trust any remote MCP server they use with

126 the Responses API. A malicious server can exfiltrate sensitive data from126 the Responses API. A malicious server can exfiltrate sensitive data from

127 anything that enters the model's context. Carefully review the{" "}127 anything that enters the model's context. Carefully review the

128 <strong>Risks and Safety</strong> section below before using this tool.128 <strong>Risks and Safety</strong> section below before using this tool.

129 129

130 </div>130 </div>

1183 1183

1184When you defer loading an MCP server, the model can still use the MCP server's label and description to decide when to search it, but the individual function definitions are loaded only when needed. This can help reduce overall token usage, and it is most useful for MCP servers that expose large numbers of functions.1184When you defer loading an MCP server, the model can still use the MCP server's label and description to decide when to search it, but the individual function definitions are loaded only when needed. This can help reduce overall token usage, and it is most useful for MCP servers that expose large numbers of functions.

1185 1185

1186```json

1187{

1188 "type": "mcp",

1189 "server_label": "dmcp",

1190 "server_description": "A Dungeons and Dragons MCP server to assist with dice rolling.",

1191 "server_url": "https://dmcp-server.deno.dev/sse",

1192// highlight-start:subtle

1193 "defer_loading": true,

1194// highlight-end

1195 "require_approval": "never"

1196}

1197```

1198

1199

1186## Risks and safety1200## Risks and safety

1187 1201

1188The MCP tool permits you to connect OpenAI models to external services. This is a powerful feature that comes with some risks.1202The MCP tool permits you to connect OpenAI models to external services. This is a powerful feature that comes with some risks.

1232<table>1246<table>

1233 <tbody>1247 <tbody>

1234 1248

1235{" "}1249

1236 1250

1237<tr>1251<tr>

1238 <th>API Availability</th>1252 <th>API Availability</th>

guides/tools-image-generation.md +9 −9

383 383

384The following models support the image generation tool:384The following models support the image generation tool:

385 385

386- `gpt-4o`386- `gpt-5.5`

387- `gpt-4o-mini`

388- `gpt-4.1`

389- `gpt-4.1-mini`

390- `gpt-4.1-nano`

391- `o3`

392- `gpt-5`

393- `gpt-5.4-mini`387- `gpt-5.4-mini`

394- `gpt-5.4-nano`388- `gpt-5.4-nano`

395- `gpt-5-nano`

396- `gpt-5.5`

397- `gpt-5.2`389- `gpt-5.2`

390- `gpt-5`

391- `gpt-5-nano`

392- `o3`

393- `gpt-4.1`

394- `gpt-4.1-mini`

395- `gpt-4.1-nano`

396- `gpt-4o`

397- `gpt-4o-mini`

398 398

399The model used for the image generation process is always a GPT Image model, including `gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`, but these models aren't valid values for the `model` field in the Responses API. Use a text-capable mainline model (for example, `gpt-5.5` or `gpt-5`) with the hosted `image_generation` tool.399The model used for the image generation process is always a GPT Image model, including `gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`, but these models aren't valid values for the `model` field in the Responses API. Use a text-capable mainline model (for example, `gpt-5.5` or `gpt-5`) with the hosted `image_generation` tool.
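As a rough illustration of the rule above, the sketch below builds a Responses API request body and rejects GPT Image model names in the `model` field. The helper `build_image_request` and the client-side check are assumptions made for illustration; the API performs its own validation, and the model names are taken from the paragraph above.

```python
# GPT Image models named above; they run behind the tool and are not valid
# values for the top-level `model` field.
GPT_IMAGE_MODELS = {"gpt-image-2", "gpt-image-1.5", "gpt-image-1", "gpt-image-1-mini"}


def build_image_request(model: str, prompt: str) -> dict:
    """Return a request body that uses the hosted image_generation tool.

    Hypothetical helper: it only assembles a dictionary and sends nothing.
    """
    if model in GPT_IMAGE_MODELS:
        raise ValueError(
            f"{model!r} is a GPT Image model; use a mainline model such as 'gpt-5.5'"
        )
    return {
        "model": model,
        "input": prompt,
        "tools": [{"type": "image_generation"}],
    }


request = build_image_request("gpt-5.5", "Draw a lighthouse at dusk.")
print(request)
```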

guides/tools-tool-search.md +36 −0

28 28

29For maximum token savings, we recommend grouping deferred functions into namespaces or MCP servers with clear, high-level descriptions that give the model a strong overview of what is contained within them, so it can effectively search and load only the relevant functions. As a best practice, aim to keep each namespace to fewer than 10 functions for better token efficiency and model performance.29For maximum token savings, we recommend grouping deferred functions into namespaces or MCP servers with clear, high-level descriptions that give the model a strong overview of what is contained within them, so it can effectively search and load only the relevant functions. As a best practice, aim to keep each namespace to fewer than 10 functions for better token efficiency and model performance.

30 30

31```json

32{

33 "tools": [

34 {

35// highlight-start:subtle

36 "type": "namespace",

37// highlight-end

38 "name": "crm",

39 "description": "CRM tools for customer lookup and order management.",

40 "tools": [

41 {

42 "type": "function",

43 "name": "list_open_orders",

44 "description": "List open orders for a customer ID.",

45// highlight-start:subtle

46 "defer_loading": true,

47// highlight-end

48 "parameters": {

49 "type": "object",

50 "properties": {

51 "customer_id": { "type": "string" }

52 },

53 "required": ["customer_id"],

54 "additionalProperties": false

55 }

56 }

57 ]

58 },

59 {

60 "type": "tool_search"

61 }

62 ]

63 }

64```

31Namespaces can have a mix of tools that are deferred and not deferred. Tools without `defer_loading: true` are callable immediately, while deferred tools in the same namespace are loaded through tool search.67Namespaces can have a mix of tools that are deferred and not deferred. Tools without `defer_loading: true` are callable immediately, while deferred tools in the same namespace are loaded through tool search.
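To make the mixed case concrete, the sketch below splits a namespace's tools into those callable immediately and those loaded through tool search. The helper `split_namespace_tools` and the `get_customer` function are hypothetical and exist only for illustration; `list_open_orders` comes from the example above.

```python
def split_namespace_tools(namespace: dict) -> tuple[list[str], list[str]]:
    """Return (immediate, deferred) tool names for one namespace definition."""
    immediate, deferred = [], []
    for tool in namespace.get("tools", []):
        # Only an explicit `defer_loading: true` defers a tool.
        if tool.get("defer_loading") is True:
            deferred.append(tool["name"])
        else:
            immediate.append(tool["name"])
    return immediate, deferred


crm = {
    "type": "namespace",
    "name": "crm",
    "description": "CRM tools for customer lookup and order management.",
    "tools": [
        # Hypothetical tool without defer_loading: callable immediately.
        {"type": "function", "name": "get_customer"},
        # Deferred tool from the example above: loaded through tool search.
        {"type": "function", "name": "list_open_orders", "defer_loading": True},
    ],
}

immediate, deferred = split_namespace_tools(crm)
print("immediate:", immediate)
print("deferred:", deferred)
```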

32 68

33### Tool search types69### Tool search types

tutorials/meeting-minutes.md +5 −5

25 <div className="preview-info">25 <div className="preview-info">

26 <div className="description">26 <div className="description">

27 The first step in transcribing the audio from a meeting is to pass the27 The first step in transcribing the audio from a meeting is to pass the

~~28 audio file of the meeting into our{" "}~~28 audio file of the meeting into our

29 <a href="/api/docs/api-reference/audio">/v1/audio API</a>. Whisper, the29 <a href="/api/docs/api-reference/audio">/v1/audio API</a>. Whisper, the

30 model that powers the audio API, is capable of converting spoken language30 model that powers the audio API, is capable of converting spoken language

~~31 into written text. To start, we will avoid passing a{" "}~~31 into written text. To start, we will avoid passing a

32 <a href="/api/docs/api-reference/audio/createTranscription#audio/createTranscription-prompt">32 <a href="/api/docs/api-reference/audio/createTranscription#audio/createTranscription-prompt">

33 prompt33 prompt

~~34 </a>{" "}~~34 </a>

~~35 or{" "}~~35 or

36 <a href="/api/docs/api-reference/audio/createTranscription#audio/createTranscription-temperature-4">36 <a href="/api/docs/api-reference/audio/createTranscription#audio/createTranscription-temperature-4">

37 temperature37 temperature

~~38 </a>{" "}~~38 </a>

39 (optional parameters to control the model's output) and stick with the39 (optional parameters to control the model's output) and stick with the

40 default values.40 default values.

41 </div>41 </div>

Documentation 2026-05-20 06:35 UTC to 2026-05-21 06:36 UTC