SpyBara

Documentation 2026-05-19 11:58 UTC to 2026-05-20 06:35 UTC

5 files changed +178 −22.

deprecations.md +0 −3

Details

46| Shutdown date | Model snapshot | Substitute model |46| Shutdown date | Model snapshot | Substitute model |

47| ------------- | ---------------------------------------------------------------------- | ---------------------------- |47| ------------- | ---------------------------------------------------------------------- | ---------------------------- |

48| 2026-07-23 | `computer-use-preview-2025-03-11` \| `computer-use-preview` | `gpt-5.4-mini` |48| 2026-07-23 | `computer-use-preview-2025-03-11` \| `computer-use-preview` | `gpt-5.4-mini` |

49| 2026-07-23 | `gpt-4o-audio-preview-2024-12-17` | `gpt-audio-1.5` |

50| 2026-07-23 | `gpt-4o-mini-audio-preview-2024-12-17` | `gpt-audio-mini` |

51| 2026-07-23 | `gpt-4o-mini-realtime-preview-2024-12-17` | `gpt-realtime-mini` |

52| 2026-07-23 | `gpt-4o-mini-search-preview-2025-03-11` | `gpt-5.4-mini` |49| 2026-07-23 | `gpt-4o-mini-search-preview-2025-03-11` | `gpt-5.4-mini` |

53| 2026-07-23 | `gpt-4o-mini-tts-2025-03-20` | `gpt-4o-mini-tts-2025-12-15` |50| 2026-07-23 | `gpt-4o-mini-tts-2025-03-20` | `gpt-4o-mini-tts-2025-12-15` |

54| 2026-07-23 | `gpt-4o-search-preview-2025-03-11` | `gpt-5.4-mini` |51| 2026-07-23 | `gpt-4o-search-preview-2025-03-11` | `gpt-5.4-mini` |

Details

80 "transcription": {80 "transcription": {

81 "model": "gpt-realtime-whisper",81 "model": "gpt-realtime-whisper",

82 "language": "en"82 "language": "en"

83 },

84 "turn_detection": {

85 "type": "server_vad",

86 "threshold": 0.5,

87 "prefix_padding_ms": 300,

88 "silence_duration_ms": 500

89 }83 }

90 }84 }

91 }85 }


99- `audio.input.format`: Input encoding for audio appended to the buffer. Use 24 kHz mono PCM when sending `audio/pcm`.93- `audio.input.format`: Input encoding for audio appended to the buffer. Use 24 kHz mono PCM when sending `audio/pcm`.

100- `audio.input.transcription.model`: Use `gpt-realtime-whisper` for streaming transcription.94- `audio.input.transcription.model`: Use `gpt-realtime-whisper` for streaming transcription.

101- `audio.input.transcription.language`: Optional language hint such as `en`.95- `audio.input.transcription.language`: Optional language hint such as `en`.

102- `audio.input.turn_detection`: Optional voice activity detection. Set it to `null` if you want to commit audio manually.96- `audio.input.transcription.delay`: Optional latency/accuracy tradeoff for `gpt-realtime-whisper`. Supported values are `minimal`, `low`, `medium`, `high`, and `xhigh`.

97- `audio.input.turn_detection`: Optional voice activity detection for models that support it. For `gpt-realtime-whisper`, omit this field or set it to `null`, then commit audio manually.
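As a rough illustration of the fields listed above, a client could assemble the `session.update` payload for `gpt-realtime-whisper` as sketched below. The helper name `build_transcription_session_update` is hypothetical, and the nesting under `session.audio.input` is inferred from the field paths in this guide rather than taken from a confirmed schema.

```python
import json

# Delay levels the guide lists for gpt-realtime-whisper.
SUPPORTED_DELAYS = ("minimal", "low", "medium", "high", "xhigh")


def build_transcription_session_update(delay="medium", language=None):
    """Build a session.update event dict (hypothetical helper, assumed shape)."""
    if delay not in SUPPORTED_DELAYS:
        raise ValueError(f"unsupported delay: {delay!r}")
    transcription = {"model": "gpt-realtime-whisper", "delay": delay}
    if language is not None:
        transcription["language"] = language  # optional hint such as "en"
    return {
        "type": "session.update",
        "session": {
            "audio": {
                "input": {
                    "transcription": transcription,
                    # For gpt-realtime-whisper the guide says to omit turn
                    # detection or set it to null, then commit audio manually.
                    "turn_detection": None,
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_transcription_session_update("low", "en"), indent=2))
```

Serializing with `json.dumps` turns Python's `None` into JSON `null`, which matches the "set it to `null`" wording above.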

103 98 

104## Stream audio99## Stream audio

105 100 


124);119);

125```120```

126 121 

127With server VAD enabled, the session commits audio automatically when it detects a turn boundary.122For models that support server VAD, the session commits audio automatically when it detects a turn boundary.

128 123 

129## Handle transcript events124## Handle transcript events

130 125 


172 167 

173Streaming transcription trades latency for transcript quality. Lower delay settings can produce earlier partial text. Higher delay settings give the model more audio context before emitting text and can improve word error rate.168Streaming transcription trades latency for transcript quality. Lower delay settings can produce earlier partial text. Higher delay settings give the model more audio context before emitting text and can improve word error rate.

174 169 

175Start by testing a few delay targets against your real audio. Useful evaluation points are:170Start by setting `audio.input.transcription.delay` and testing against your real audio. Useful starting points are:

171 

172- `minimal` for the most latency-sensitive interactions;

173- `low` for low-latency live captions;

174- `medium` for a balanced latency/accuracy tradeoff;

175- `high` when accuracy matters more than immediate display;

176- `xhigh` when your workflow can tolerate the most delay for additional context.

176 177 

177- 0.4 seconds for the most latency-sensitive interactions;178The exact delay in milliseconds can vary by model configuration, so benchmark with representative audio instead of assuming a fixed timing per level.

178- 0.8 to 1.2 seconds for balanced live captions;

179- 1.5 to 2.0 seconds when accuracy matters more than immediate display;

180- 3.0 seconds for workflows that can tolerate more delay.

181 179 

182Don't choose a setting from synthetic audio alone. Test with representative microphones, telephony audio, accents, background noise, code-switching, domain vocabulary, and long sessions.180Don't choose a setting from synthetic audio alone. Test with representative microphones, telephony audio, accents, background noise, code-switching, domain vocabulary, and long sessions.
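One way to run that comparison is to transcribe the same representative recordings at each delay level, score each transcript against a human reference, and keep the lowest-delay level that meets your accuracy target. The sketch below assumes you have already collected `(reference, hypothesis)` text pairs per level; the function names and the selection rule are illustrative, not part of the API.

```python
# Delay levels in order from lowest to highest latency, as listed in the guide.
DELAY_LEVELS = ("minimal", "low", "medium", "high", "xhigh")


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def mean_wer(pairs):
    """Average WER over a list of (reference, hypothesis) pairs."""
    return sum(word_error_rate(r, h) for r, h in pairs) / len(pairs)


def pick_delay(transcripts_by_level, max_wer):
    """Return the lowest-delay level whose mean WER is within max_wer."""
    for level in DELAY_LEVELS:
        pairs = transcripts_by_level.get(level)
        if pairs and mean_wer(pairs) <= max_wer:
            return level
    return None  # no tested level met the target
```

This only scores accuracy. Measure time-to-first-text separately, since the guide notes that the delay in milliseconds per level is not fixed.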

183 181 

184## Guide vocabulary and domain terms182## Guide vocabulary and domain terms

185 183 

186If your application depends on exact domain vocabulary, include a language hint and test whether your model and endpoint support prompt or keyword steering before relying on it. Where supported, use short keyword lists rather than long instructions.184If your application depends on exact domain vocabulary, include a language hint and use prompt or keyword steering only when your selected model supports it. For `gpt-realtime-whisper` in GA Realtime sessions, `prompt` is not supported.

185 

186Where prompt steering is available, use short keyword lists rather than long instructions. The model is already instructed to transcribe, so focus prompts on domain vocabulary, spelling, or style rather than re-stating the transcription task.

187 187 

188Example keyword style:188Example keyword style:

189 189 

Details

1# Voice activity detection (VAD)1# Voice activity detection (VAD)

2 2 

3Voice activity detection (VAD) is a Realtime API feature that automatically detects when the user has started or stopped speaking.3Voice activity detection (VAD) is a Realtime API feature that automatically detects when the user has started or stopped speaking.

4It is enabled by default in [speech-to-speech](https://developers.openai.com/api/docs/guides/realtime-conversations) or [transcription](https://developers.openai.com/api/docs/guides/realtime-transcription) Realtime sessions, but is optional and can be turned off.4It is enabled by default in [speech-to-speech](https://developers.openai.com/api/docs/guides/realtime-conversations) Realtime sessions, but is optional and can be turned off.

5In [transcription](https://developers.openai.com/api/docs/guides/realtime-transcription) Realtime sessions, turn detection support depends on the transcription model. Models that support VAD default to `server_vad`, while `gpt-realtime-whisper` requires turn detection to be omitted or set to `null`.

5 6 

6## Overview7## Overview

7 8 


12 13 

13You can use these events to handle speech turns in your application. For example, you can use them to manage conversation state or process transcripts in chunks.14You can use these events to handle speech turns in your application. For example, you can use them to manage conversation state or process transcripts in chunks.

14 15 

15You can use the `turn_detection` property of the `session.update` event to configure how audio is chunked within each speech-to-text sample.16You can configure VAD with the [`session.update`](https://developers.openai.com/api/docs/api-reference/realtime-client-events/session/update) client event by setting `session.audio.input.turn_detection`.

16 17 

17There are two modes for VAD:18There are two modes for VAD:

18 19 

19- `server_vad`: Automatically chunks the audio based on periods of silence.20- `server_vad`: Automatically chunks the audio based on periods of silence.

20- `semantic_vad`: Chunks the audio when the model believes, based on the words the user has said, that they have finished their utterance.21- `semantic_vad`: Chunks the audio when the model believes, based on the words the user has said, that they have finished their utterance.

21 22 

22The default value is `server_vad`.23For sessions and models that support VAD, the default value is `server_vad`.

23 24 

24Read below to learn more about the different modes.25Read below to learn more about the different modes.

25 26 

26## Server VAD27## Server VAD

27 28 

28Server VAD is the default mode for Realtime sessions, and uses periods of silence to automatically chunk the audio.29Server VAD is the default mode for speech-to-speech sessions, and for transcription sessions on models that support turn detection. It uses periods of silence to automatically chunk the audio.

29 30 

30You can adjust the following properties to fine-tune the VAD settings:31You can adjust the following properties to fine-tune the VAD settings:

31 32 


39{40{

40 "type": "session.update",41 "type": "session.update",

41 "session": {42 "session": {

43 "type": "realtime",

44 "audio": {

45 "input": {

42 "turn_detection": {46 "turn_detection": {

43 "type": "server_vad",47 "type": "server_vad",

44 "threshold": 0.5,48 "threshold": 0.5,


48 "interrupt_response": true // only in conversation mode52 "interrupt_response": true // only in conversation mode

49 }53 }

50 }54 }

55 }

56 }

51}57}

52```58```

53 59 

60Use the same `session.audio.input.turn_detection` field in transcription sessions. For `gpt-realtime-whisper`, omit turn detection or set it to `null`.

61 

62The `create_response` and `interrupt_response` fields are only used in speech-to-speech conversations. In transcription sessions, VAD only controls how audio is chunked.
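The rules above can be sketched as a small helper that decides what to send in `session.audio.input.turn_detection`. The function name, the `transcription_model` parameter, and the choice to drop conversation-only fields for transcription sessions are assumptions made for illustration. The guide only states that those fields are unused there, not that they must be removed.

```python
# Fields the guide describes as speech-to-speech only.
CONVERSATION_ONLY_FIELDS = ("create_response", "interrupt_response")

# Transcription models for which turn detection must be omitted or null.
MODELS_WITHOUT_TURN_DETECTION = {"gpt-realtime-whisper"}


def turn_detection_for(settings, transcription_model=None):
    """Return the turn_detection value to send, or None to disable it.

    settings: desired VAD settings, e.g. {"type": "server_vad", ...}.
    transcription_model: None for a speech-to-speech session, otherwise the
        transcription model name (hypothetical parameter).
    """
    if transcription_model is None:
        return dict(settings)  # conversation session: keep every field
    if transcription_model in MODELS_WITHOUT_TURN_DETECTION:
        return None  # commit audio manually instead
    # Transcription session with VAD support: VAD only controls chunking.
    return {k: v for k, v in settings.items()
            if k not in CONVERSATION_ONLY_FIELDS}
```

For example, passing `{"type": "server_vad", "threshold": 0.5, "interrupt_response": True}` with `transcription_model="gpt-realtime-whisper"` yields `None`, while omitting the model returns the settings unchanged.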

63 

54## Semantic VAD64## Semantic VAD

55 65 

56Semantic VAD is a new mode that uses a semantic classifier to detect when the user has finished speaking, based on the words they have uttered.66Semantic VAD is a new mode that uses a semantic classifier to detect when the user has finished speaking, based on the words they have uttered.


59 69 

60With this mode, the model is less likely to interrupt the user during a speech-to-speech conversation, or chunk a transcript before the user is done speaking.70With this mode, the model is less likely to interrupt the user during a speech-to-speech conversation, or chunk a transcript before the user is done speaking.

61 71 

62Semantic VAD can be activated by setting `turn_detection.type` to `semantic_vad` in a [`session.update`](https://developers.openai.com/api/docs/api-reference/realtime-client-events/session/update) event.72Semantic VAD can be activated by setting `session.audio.input.turn_detection.type` to `semantic_vad`.

63 73 

64It can be configured like this:74It can be configured like this:

65 75 


67{77{

68 "type": "session.update",78 "type": "session.update",

69 "session": {79 "session": {

80 "type": "realtime",

81 "audio": {

82 "input": {

70 "turn_detection": {83 "turn_detection": {

71 "type": "semantic_vad",84 "type": "semantic_vad",

72 "eagerness": "low" | "medium" | "high" | "auto", // optional85 "eagerness": "low" | "medium" | "high" | "auto", // optional


74 "interrupt_response": true, // only in conversation mode87 "interrupt_response": true, // only in conversation mode

75 }88 }

76 }89 }

90 }

91 }

77}92}

78```93```

79 94 

95The same `session.audio.input.turn_detection` field applies in transcription sessions. The `create_response` and `interrupt_response` fields are conversation-only.

96 

80The optional `eagerness` property is a way to control how eager the model is to interrupt the user, tuning the maximum wait timeout. In transcription mode, even if the model doesn't reply, it affects how the audio is chunked.97The optional `eagerness` property is a way to control how eager the model is to interrupt the user, tuning the maximum wait timeout. In transcription mode, even if the model doesn't reply, it affects how the audio is chunked.

81 98 

82- `auto` is the default value, and is equivalent to `medium`.99- `auto` is the default value, and is equivalent to `medium`.

guides/secure-mcp-tunnels.md +138 −0 created

Details

# Secure MCP Tunnel

Secure MCP Tunnel lets you connect private MCP servers to supported OpenAI products without opening inbound firewall ports or exposing those servers to the public internet. Run `tunnel-client` inside the network that can already reach your MCP server; it opens an outbound HTTPS path to OpenAI, pulls queued MCP work, forwards requests locally, and returns responses through the same tunnel.

## Use Secure MCP Tunnel when

- Your MCP server runs on a private network, on-premises, on a developer machine, or behind existing access controls.
- You want ChatGPT, Codex, the Responses API, or another supported OpenAI surface to use that server without making the MCP server public.
- Your network allows the host running `tunnel-client` to make outbound HTTPS requests to OpenAI.

For general MCP concepts, start with the [MCP and Connectors guide](https://developers.openai.com/api/docs/guides/tools-connectors-mcp).

## How it works

1. Create or manage an OpenAI-hosted MCP tunnel endpoint in Platform tunnel settings.
2. Run `tunnel-client` inside the network that can reach your private MCP server.
3. Configure `tunnel-client` with the tunnel identity and the private MCP server address.
4. OpenAI products send MCP requests to the OpenAI-hosted tunnel endpoint.
5. `tunnel-client` long-polls for queued work, forwards each `JSON-RPC` request to the private MCP server, and posts the response back through the tunnel.

The private MCP server does not need a public listener. The OpenAI-hosted endpoint gives supported products a normal MCP request path, while the network initiation point stays inside your boundary. When a connector asks for streamed results, the tunnel path can forward intermediate server-sent events.
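To make the poll, forward, and post-back cycle concrete, here is a toy in-memory simulation. It is not how `tunnel-client` is implemented and it makes no network calls: `FakeTunnelEndpoint`, `private_mcp_server`, and `run_once` are all hypothetical stand-ins, with a local queue playing the role of the OpenAI-hosted endpoint.

```python
import queue


class FakeTunnelEndpoint:
    """In-memory stand-in for the hosted tunnel endpoint (illustration only)."""

    def __init__(self):
        self._pending = queue.Queue()
        self.responses = {}

    def enqueue(self, request):
        # A product sends an MCP request to the hosted endpoint.
        self._pending.put(request)

    def poll(self, timeout=0.05):
        # Long-poll: wait up to `timeout` seconds for queued work.
        try:
            return self._pending.get(timeout=timeout)
        except queue.Empty:
            return None

    def post_response(self, request_id, response):
        # The client posts the private server's answer back.
        self.responses[request_id] = response


def private_mcp_server(request):
    """Stand-in for the private MCP server: echoes the JSON-RPC method."""
    return {"jsonrpc": "2.0", "id": request["id"],
            "result": {"echo": request["method"]}}


def run_once(endpoint, forward, timeout=0.05):
    """One client iteration: poll, forward locally, post the response."""
    request = endpoint.poll(timeout)
    if request is None:
        return False  # nothing queued during this poll window
    endpoint.post_response(request["id"], forward(request))
    return True
```

The point of the shape is that every call originates from the client side: the private server is only ever reached by `forward`, which runs inside the same network as the client.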

<figure className="not-prose my-8">
  <figcaption className="mt-3 text-sm text-gray-600 dark:text-gray-400">
    OpenAI products call the OpenAI-hosted tunnel endpoint; `tunnel-client`
    long-polls for queued work and returns the MCP response through the same
    tunnel.
  </figcaption>
</figure>

## Before you start

You need:

- A `tunnel_id` from [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels).
- A runtime API key for `tunnel-client`. The key principal needs Tunnels **Read** + **Use** for the target tunnel.
- A tunnel manager with Tunnels **Read** + **Manage** if you need to create or edit tunnel metadata.
- An MCP server that `tunnel-client` can reach over stdio or HTTP from inside your network.

## Set up tunnel-client

Open [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels), then download the latest public `tunnel-client` release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest). Keep your runbook pointed at the latest-release URL instead of hard-coding a specific release URL.

For a local stdio MCP server, the shortest profile-based flow is:

```bash
export CONTROL_PLANE_API_KEY="sk-..."

# Create a profile for a local stdio MCP server.
tunnel-client init \
  --sample sample_mcp_stdio_local \
  --profile local-stdio \
  --tunnel-id tunnel_0123456789abcdef0123456789abcdef \
  --mcp-command "python /path/to/server.py"

# Validate the profile, then start the client.
tunnel-client doctor --profile local-stdio --explain
tunnel-client run --profile local-stdio
```

For an HTTP MCP server, use `--mcp-server-url https://mcp.internal.example.com/mcp` instead of `--mcp-command`.

Keep `tunnel-client run ...` healthy while you create or test the connector. Connector discovery and MCP tool calls depend on the running client.

## Connect from ChatGPT

Open [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors), create a custom connector, and choose **Tunnel** under **Connection**. Select an available tunnel when ChatGPT lists it, or paste a valid `tunnel_id` if you already have one.

If the tunnel does not appear in ChatGPT, verify that the tunnel is associated with the target workspace and that the connector operator has Tunnels **Read** + **Use**.

## Security and networking

<figure className="not-prose my-8">
  <figcaption className="mt-3 text-sm text-gray-600 dark:text-gray-400">
    The private MCP server stays inside the customer-controlled environment.
    `tunnel-client` reaches OpenAI over outbound HTTPS using the runtime API key
    and, if your environment requires it, control-plane mTLS.
  </figcaption>
</figure>

- The MCP server address stays private and is used only from inside the environment where `tunnel-client` runs.
- `tunnel-client` authenticates to the OpenAI tunnel control plane; supported OpenAI products use the OpenAI-hosted tunnel endpoint.
- Tunnel access follows the existing organization and workspace context instead of introducing a separate public ingress path.
- `tunnel-client` supports enterprise networking requirements such as outbound proxies, custom CA bundles, control-plane client certificates, and MCP-side `mTLS`.
- The embedded Harpoon MCP server is limited to labeled, allowlisted HTTP callouts used by flows such as OAuth metadata handling. It is not a general-purpose outbound proxy.

## Troubleshooting

- **Tunnel not visible in ChatGPT:** Check the tunnel workspace scope and the connector operator's Tunnels **Use** permission.
- **Connector discovery or tool calls fail:** Confirm that `tunnel-client run ...` is still running, then re-run `tunnel-client doctor --profile <name> --explain`.
- **You can inspect a tunnel but cannot edit it:** The operator likely has Tunnels **Read** but not Tunnels **Manage**.
- `tunnel-client` exposes `/healthz`, `/readyz`, `/metrics`, and a local admin UI at `/ui`.
- Use those surfaces to confirm that the client is healthy, ready, and polling before testing from ChatGPT, Codex, or an API flow.
- If the client is not connected, requests through the tunnel fail until `tunnel-client` reconnects.
- Raw HTTP logging is disabled by default, and support exports are redacted.
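A pre-flight check against the `/healthz` and `/readyz` surfaces might look like the sketch below. The guide does not say which address or port the client listens on, so `base_url` must come from your own configuration. Treating HTTP 200 as "healthy" is a common convention that is assumed here, not something this guide confirms.

```python
import urllib.error
import urllib.request

# Endpoints named in the troubleshooting list above.
PROBE_PATHS = ("/healthz", "/readyz")


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered with a non-2xx status
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def probe(base_url, fetch=http_status):
    """Map each probe path to the status code returned by fetch."""
    return {path: fetch(base_url.rstrip("/") + path) for path in PROBE_PATHS}


def is_ready(statuses):
    """Assumed convention: every probe must return HTTP 200."""
    return all(code == 200 for code in statuses.values())
```

The `fetch` parameter exists so the logic can be exercised without a running client, for example `probe("http://localhost:PORT", fetch=lambda url: 200)` with `PORT` replaced by your local listener.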

## OAuth

- OAuth discovery can travel through the tunnel path so the MCP server itself can remain private.
- The tunnel preserves the upstream authorization server metadata needed for browser-facing OAuth flows.
- The authorization server itself is not automatically tunneled. If it is unreachable from the public internet and from the `tunnel-client` host, the OAuth flow can still fail even when the MCP server is reachable.

## Where to configure it

- Manage OpenAI-hosted MCP tunnel endpoints in [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels).
- Use a tunnel when creating a connector from [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors).
- For Codex or API flows, use the tunnel-backed MCP target exposed by the supported product surface.

## Next steps

- Create or manage the tunnel in [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels).
- Validate your `tunnel-client` profile with `tunnel-client doctor --profile <profile> --explain`.
- Connect the tunnel from [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors) or the supported OpenAI surface you are using.

<div class="not-prose my-8 grid gap-4 lg:grid-cols-2">
  <figure>
    <a href="https://platform.openai.com/settings/organization/tunnels">
      <img src="https://developers.openai.com/images/platform/guides/secure-mcp-tunnels/platform-tunnels-settings.png"
        alt="Sanitized OpenAI Platform tunnel settings screenshot."
        loading="lazy"
        class="w-full rounded-md border border-gray-200 dark:border-gray-800"
      />
    </a>
    <figcaption class="mt-3 text-sm text-gray-600 dark:text-gray-400">
      Create and manage OpenAI-hosted MCP tunnel endpoints from Platform tunnel
      settings.
    </figcaption>
  </figure>
  <figure>
    <a href="https://chatgpt.com/#settings/Connectors">
      <img src="https://developers.openai.com/images/platform/guides/secure-mcp-tunnels/chatgpt-connectors-tunnel.png"
        alt="Sanitized ChatGPT connector settings screenshot with Tunnel selected."
        loading="lazy"
        class="w-full rounded-md border border-gray-200 dark:border-gray-800"
      />
    </a>
    <figcaption class="mt-3 text-sm text-gray-600 dark:text-gray-400">
      Select Tunnel when connecting a ChatGPT connector to a private MCP server.
    </figcaption>
  </figure>
</div>

Details

17 17 

18This guide will show how to use both remote MCP servers and connectors to give the model access to new capabilities.18This guide will show how to use both remote MCP servers and connectors to give the model access to new capabilities.

19 19 

20## Secure MCP Tunnel

21 

22If your MCP server is private, use [Secure MCP Tunnel](https://developers.openai.com/api/docs/guides/secure-mcp-tunnels) to connect it to supported OpenAI products without exposing the server to the public internet. Download the latest public release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest).

23 

20## Quickstart24## Quickstart

21 25 

22Check out the examples below to see how remote MCP servers and connectors work through the [Responses API](https://developers.openai.com/api/docs/api-reference/responses/create). Both connectors and remote MCP servers can be used with the `mcp` built-in tool type.26Check out the examples below to see how remote MCP servers and connectors work through the [Responses API](https://developers.openai.com/api/docs/api-reference/responses/create). Both connectors and remote MCP servers can be used with the `mcp` built-in tool type.