Documentation — Spybara

deprecations.md +0 −3


47| ------------- | ---------------------------------------------------------------------- | ---------------------------- |47| ------------- | ---------------------------------------------------------------------- | ---------------------------- |

48| 2026-07-23 | `computer-use-preview-2025-03-11` \| `computer-use-preview` | `gpt-5.4-mini` |48| 2026-07-23 | `computer-use-preview-2025-03-11` \| `computer-use-preview` | `gpt-5.4-mini` |

~~49| 2026-07-23 | `gpt-4o-audio-preview-2024-12-17` | `gpt-audio-1.5` |~~

~~50| 2026-07-23 | `gpt-4o-mini-audio-preview-2024-12-17` | `gpt-audio-mini` |~~

~~51| 2026-07-23 | `gpt-4o-mini-realtime-preview-2024-12-17` | `gpt-realtime-mini` |~~

52| 2026-07-23 | `gpt-4o-mini-search-preview-2025-03-11` | `gpt-5.4-mini` |49| 2026-07-23 | `gpt-4o-mini-search-preview-2025-03-11` | `gpt-5.4-mini` |

53| 2026-07-23 | `gpt-4o-mini-tts-2025-03-20` | `gpt-4o-mini-tts-2025-12-15` |50| 2026-07-23 | `gpt-4o-mini-tts-2025-03-20` | `gpt-4o-mini-tts-2025-12-15` |

54| 2026-07-23 | `gpt-4o-search-preview-2025-03-11` | `gpt-5.4-mini` |51| 2026-07-23 | `gpt-4o-search-preview-2025-03-11` | `gpt-5.4-mini` |

guides/realtime-transcription.md +14 −14


80 "transcription": {80 "transcription": {

81 "model": "gpt-realtime-whisper",81 "model": "gpt-realtime-whisper",

82 "language": "en"82 "language": "en"

~~83 },~~

~~84 "turn_detection": {~~

~~85 "type": "server_vad",~~

~~86 "threshold": 0.5,~~

~~87 "prefix_padding_ms": 300,~~

~~88 "silence_duration_ms": 500~~

89 }83 }

90 }84 }

91 }85 }

99- `audio.input.format`: Input encoding for audio appended to the buffer. Use 24 kHz mono PCM when sending `audio/pcm`.93- `audio.input.format`: Input encoding for audio appended to the buffer. Use 24 kHz mono PCM when sending `audio/pcm`.

100- `audio.input.transcription.model`: Use `gpt-realtime-whisper` for streaming transcription.94- `audio.input.transcription.model`: Use `gpt-realtime-whisper` for streaming transcription.

101- `audio.input.transcription.language`: Optional language hint such as `en`.95- `audio.input.transcription.language`: Optional language hint such as `en`.

102- `audio.input.turn_detection`: Optional voice activity detection. Set it to `null` if you want to commit audio manually.96- `audio.input.transcription.delay`: Optional latency/accuracy tradeoff for `gpt-realtime-whisper`. Supported values are `minimal`, `low`, `medium`, `high`, and `xhigh`.

97- `audio.input.turn_detection`: Optional voice activity detection for models that support it. For `gpt-realtime-whisper`, omit this field or set it to `null`, then commit audio manually.
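The fields listed above can be assembled into a `session.update` payload. The following is a minimal sketch rather than an official client: the helper name `build_transcription_update` is hypothetical, and the `"type": "transcription"` session type is an assumption that this diff does not confirm. Only the field names and the delay values come from the text above. `audio.input.format` is left out because its exact shape is not shown here.

```python
import json

# Delay levels listed in the guide for gpt-realtime-whisper.
DELAY_LEVELS = ("minimal", "low", "medium", "high", "xhigh")


def build_transcription_update(language=None, delay=None):
    """Build a session.update payload for gpt-realtime-whisper (hypothetical helper)."""
    if delay is not None and delay not in DELAY_LEVELS:
        raise ValueError(f"unsupported delay: {delay!r}")

    transcription = {"model": "gpt-realtime-whisper"}
    if language:
        transcription["language"] = language  # optional language hint, e.g. "en"
    if delay:
        transcription["delay"] = delay  # optional latency/accuracy tradeoff

    return {
        "type": "session.update",
        "session": {
            "type": "transcription",  # assumption: session type for transcription sessions
            "audio": {
                "input": {
                    "transcription": transcription,
                    # gpt-realtime-whisper: omit turn detection or set it to null,
                    # then commit audio manually.
                    "turn_detection": None,
                }
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_transcription_update(language="en", delay="low"), indent=2))
```

Serializing the payload with `json.dumps` turns the Python `None` into JSON `null`, which matches the "set it to `null`" guidance for this model.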

103 98

104## Stream audio99## Stream audio

105 100

124);119);

125```120```

126 121

127With server VAD enabled, the session commits audio automatically when it detects a turn boundary.122For models that support server VAD, the session commits audio automatically when it detects a turn boundary.

128 123

129## Handle transcript events124## Handle transcript events

130 125

172 167

173Streaming transcription trades latency for transcript quality. Lower delay settings can produce earlier partial text. Higher delay settings give the model more audio context before emitting text and can improve word error rate.168Streaming transcription trades latency for transcript quality. Lower delay settings can produce earlier partial text. Higher delay settings give the model more audio context before emitting text and can improve word error rate.

174 169

175Start by testing a few delay targets against your real audio. Useful evaluation points are:170Start by setting `audio.input.transcription.delay` and testing against your real audio. Useful starting points are:

171

172- `minimal` for the most latency-sensitive interactions;

173- `low` for low-latency live captions;

174- `medium` for a balanced latency/accuracy tradeoff;

175- `high` when accuracy matters more than immediate display;

176- `xhigh` when your workflow can tolerate the most delay for additional context.

176 177

177- 0.4 seconds for the most latency-sensitive interactions;178The exact delay in milliseconds can vary by model configuration, so benchmark with representative audio instead of assuming a fixed timing per level.

178- 0.8 to 1.2 seconds for balanced live captions;

179- 1.5 to 2.0 seconds when accuracy matters more than immediate display;

180- 3.0 seconds for workflows that can tolerate more delay.

181 179

182Don't choose a setting from synthetic audio alone. Test with representative microphones, telephony audio, accents, background noise, code-switching, domain vocabulary, and long sessions.180Don't choose a setting from synthetic audio alone. Test with representative microphones, telephony audio, accents, background noise, code-switching, domain vocabulary, and long sessions.
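One way to compare delay levels on your own recordings is to score each transcript against a human reference using word error rate. The sketch below assumes you have already collected one final transcript per delay level; the function names and the sample sentences are illustrative only and are not part of the API.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, substitution))
        prev = curr
    return prev[-1] / len(ref)


def rank_delay_levels(reference: str, transcripts_by_delay: dict) -> list:
    """Return (delay, WER) pairs sorted from most to least accurate."""
    scores = {
        delay: word_error_rate(reference, text)
        for delay, text in transcripts_by_delay.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Illustrative data only; substitute transcripts from your own audio.
    reference = "turn on the kitchen lights"
    transcripts = {
        "minimal": "turn on the kitchen light",
        "high": "turn on the kitchen lights",
    }
    for delay, wer in rank_delay_levels(reference, transcripts):
        print(f"{delay}: WER={wer:.2f}")
```

Word error rate captures only accuracy, so record the observed latency for each level alongside it before choosing a setting.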

183 181

184## Guide vocabulary and domain terms182## Guide vocabulary and domain terms

185 183

186If your application depends on exact domain vocabulary, include a language hint and test whether your model and endpoint support prompt or keyword steering before relying on it. Where supported, use short keyword lists rather than long instructions.184If your application depends on exact domain vocabulary, include a language hint and use prompt or keyword steering only when your selected model supports it. For `gpt-realtime-whisper` in GA Realtime sessions, `prompt` is not supported.

185

186Where prompt steering is available, use short keyword lists rather than long instructions. The model is already instructed to transcribe, so focus prompts on domain vocabulary, spelling, or style rather than re-stating the transcription task.

187 187

188Example keyword style:188Example keyword style:

189 189

guides/realtime-vad.md +22 −5


1# Voice activity detection (VAD)1# Voice activity detection (VAD)

2 2

3Voice activity detection (VAD) is a Realtime API feature that automatically detects when the user has started or stopped speaking.3Voice activity detection (VAD) is a Realtime API feature that automatically detects when the user has started or stopped speaking.

4It is enabled by default in [speech-to-speech](https://developers.openai.com/api/docs/guides/realtime-conversations) or [transcription](https://developers.openai.com/api/docs/guides/realtime-transcription) Realtime sessions, but is optional and can be turned off.4It is enabled by default in [speech-to-speech](https://developers.openai.com/api/docs/guides/realtime-conversations) Realtime sessions, but is optional and can be turned off.

5In [transcription](https://developers.openai.com/api/docs/guides/realtime-transcription) Realtime sessions, turn detection support depends on the transcription model. Models that support VAD default to `server_vad`, while `gpt-realtime-whisper` requires turn detection to be omitted or set to `null`.

5 6

6## Overview7## Overview

7 8

12 13

13You can use these events to handle speech turns in your application. For example, you can use them to manage conversation state or process transcripts in chunks.14You can use these events to handle speech turns in your application. For example, you can use them to manage conversation state or process transcripts in chunks.

14 15

~~15You can use the `turn_detection` property of the `session.update` event to configure how audio is chunked within each speech-to-text sample.~~16You can configure VAD with the [`session.update`](https://developers.openai.com/api/docs/api-reference/realtime-client-events/session/update) client event by setting `session.audio.input.turn_detection`.

16 17

17There are two modes for VAD:18There are two modes for VAD:

18 19

19- `server_vad`: Automatically chunks the audio based on periods of silence.20- `server_vad`: Automatically chunks the audio based on periods of silence.

20- `semantic_vad`: Chunks the audio when the model believes based on the words said by the user that they have completed their utterance.21- `semantic_vad`: Chunks the audio when the model believes based on the words said by the user that they have completed their utterance.

21 22

~~22The default value is `server_vad`.~~23For sessions and models that support VAD, the default value is `server_vad`.

23 24

24Read below to learn more about the different modes.25Read below to learn more about the different modes.

25 26

26## Server VAD27## Server VAD

27 28

~~28Server VAD is the default mode for Realtime sessions, and uses periods of silence to automatically chunk the audio.~~29Server VAD is the default mode for speech-to-speech sessions, and for transcription sessions on models that support turn detection. It uses periods of silence to automatically chunk the audio.

29 30

30You can adjust the following properties to fine-tune the VAD settings:31You can adjust the following properties to fine-tune the VAD settings:

31 32

39{40{

40 "type": "session.update",41 "type": "session.update",

41 "session": {42 "session": {

43 "type": "realtime",

44 "audio": {

45 "input": {

42 "turn_detection": {46 "turn_detection": {

43 "type": "server_vad",47 "type": "server_vad",

44 "threshold": 0.5,48 "threshold": 0.5,

48 "interrupt_response": true // only in conversation mode52 "interrupt_response": true // only in conversation mode

49 }53 }

50 }54 }

55 }

56 }

51}57}

52```58```

53 59

60Use the same `session.audio.input.turn_detection` field in transcription sessions. For `gpt-realtime-whisper`, omit turn detection or set it to `null`.

62The `create_response` and `interrupt_response` fields are only used in speech-to-speech conversations. In transcription sessions, VAD only controls how audio is chunked.
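The split between conversation-only fields and chunking fields can be made explicit in client code. This is a sketch under stated assumptions: `build_turn_detection_update` is a hypothetical helper, and the `"transcription"` session type is assumed rather than taken from this diff. The threshold, padding, and silence values mirror the server VAD example shown earlier in the guide.

```python
# Fields the guide describes as used only in speech-to-speech conversations.
CONVERSATION_ONLY_FIELDS = ("create_response", "interrupt_response")


def build_turn_detection_update(turn_detection, session_type="realtime"):
    """Build a session.update that sets session.audio.input.turn_detection.

    For non-conversation sessions, conversation-only fields are dropped,
    because VAD there only controls how audio is chunked.
    """
    if turn_detection is not None and session_type != "realtime":
        turn_detection = {
            key: value
            for key, value in turn_detection.items()
            if key not in CONVERSATION_ONLY_FIELDS
        }
    return {
        "type": "session.update",
        "session": {
            "type": session_type,
            "audio": {"input": {"turn_detection": turn_detection}},
        },
    }


SERVER_VAD = {
    "type": "server_vad",
    "threshold": 0.5,
    "prefix_padding_ms": 300,
    "silence_duration_ms": 500,
    "create_response": True,
    "interrupt_response": True,
}

if __name__ == "__main__":
    conversation = build_turn_detection_update(SERVER_VAD)
    # "transcription" as a session type is an assumption, not confirmed here.
    transcription = build_turn_detection_update(SERVER_VAD, session_type="transcription")
    print(conversation["session"]["audio"]["input"]["turn_detection"])
    print(transcription["session"]["audio"]["input"]["turn_detection"])
```

Passing `None` as `turn_detection` produces the `null` value that `gpt-realtime-whisper` requires when the field is not omitted.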

54## Semantic VAD64## Semantic VAD

55 65

56Semantic VAD is a new mode that uses a semantic classifier to detect when the user has finished speaking, based on the words they have uttered.66Semantic VAD is a new mode that uses a semantic classifier to detect when the user has finished speaking, based on the words they have uttered.

59 69

60With this mode, the model is less likely to interrupt the user during a speech-to-speech conversation, or chunk a transcript before the user is done speaking.70With this mode, the model is less likely to interrupt the user during a speech-to-speech conversation, or chunk a transcript before the user is done speaking.

61 71

62Semantic VAD can be activated by setting `turn_detection.type` to `semantic_vad` in a [`session.update`](https://developers.openai.com/api/docs/api-reference/realtime-client-events/session/update) event.72Semantic VAD can be activated by setting `session.audio.input.turn_detection.type` to `semantic_vad`.

63 73

64It can be configured like this:74It can be configured like this:

65 75

67{77{

68 "type": "session.update",78 "type": "session.update",

69 "session": {79 "session": {

80 "type": "realtime",

81 "audio": {

82 "input": {

70 "turn_detection": {83 "turn_detection": {

71 "type": "semantic_vad",84 "type": "semantic_vad",

74 "interrupt_response": true, // only in conversation mode87 "interrupt_response": true, // only in conversation mode

75 }88 }

76 }89 }

90 }

91 }

77}92}

78```93```

79 94

95The same `session.audio.input.turn_detection` field applies in transcription sessions. The `create_response` and `interrupt_response` fields are conversation-only.

80The optional `eagerness` property is a way to control how eager the model is to interrupt the user, tuning the maximum wait timeout. In transcription mode, even if the model doesn't reply, it affects how the audio is chunked.97The optional `eagerness` property is a way to control how eager the model is to interrupt the user, tuning the maximum wait timeout. In transcription mode, even if the model doesn't reply, it affects how the audio is chunked.

81 98

82- `auto` is the default value, and is equivalent to `medium`.99- `auto` is the default value, and is equivalent to `medium`.

guides/secure-mcp-tunnels.md +138 −0 created


1# Secure MCP Tunnel

3Secure MCP Tunnel lets you connect private MCP servers to supported OpenAI products without opening inbound firewall ports or exposing those servers to the public internet. Run `tunnel-client` inside the network that can already reach your MCP server; it opens an outbound HTTPS path to OpenAI, pulls queued MCP work, forwards requests locally, and returns responses through the same tunnel.

5## Use Secure MCP Tunnel when

7- Your MCP server runs on a private network, on-premises, on a developer machine, or behind existing access controls.

8- You want ChatGPT, Codex, the Responses API, or another supported OpenAI surface to use that server without making the MCP server public.

9- Your network allows the host running `tunnel-client` to make outbound HTTPS requests to OpenAI.

10- Start with the [MCP and Connectors guide](https://developers.openai.com/api/docs/guides/tools-connectors-mcp) for general MCP concepts.

12## How it works

14 1. Create or manage an OpenAI-hosted MCP tunnel endpoint in Platform tunnel settings.

15 2. Run `tunnel-client` inside the network that can reach your private MCP server.

16 3. Configure `tunnel-client` with the tunnel identity and the private MCP server address.

17 4. OpenAI products send MCP requests to the OpenAI-hosted tunnel endpoint.

18 5. `tunnel-client` long-polls for queued work, forwards each `JSON-RPC` request to the private MCP server, and posts the response back through the tunnel.
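The poll, forward, and post-back cycle in step 5 can be illustrated with in-memory stand-ins. This is a conceptual sketch only and not how `tunnel-client` is implemented: the queue stands in for the OpenAI-hosted endpoint, and `poll_once` and `echo_mcp_server` are hypothetical names invented for the example.

```python
import queue


def poll_once(work_queue, forward, post_response, timeout=0.1):
    """Run one iteration of the loop: wait for work, forward it, return the result.

    Returns True if a request was handled and False if the wait timed out.
    """
    try:
        # Stand-in for a long-poll: block until work arrives or the wait expires.
        request = work_queue.get(timeout=timeout)
    except queue.Empty:
        return False
    response = forward(request)   # forward the JSON-RPC request to the private server
    post_response(response)       # send the response back through the tunnel
    return True


def echo_mcp_server(request):
    """Hypothetical private MCP server that echoes the requested method."""
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"echo": request["method"]},
    }


if __name__ == "__main__":
    pending = queue.Queue()
    pending.put({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
    posted = []
    while poll_once(pending, echo_mcp_server, posted.append):
        pass
    print(posted)
```

The point of the pattern is that every network call originates from the client side, so the private server never needs an inbound listener.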

20The private MCP server does not need a public listener. The OpenAI-hosted endpoint gives supported products a normal MCP request path, while the network initiation point stays inside your boundary. When a connector asks for streamed results, the tunnel path can forward intermediate server-sent events.

22<figure className="not-prose my-8">

23 <figcaption className="mt-3 text-sm text-gray-600 dark:text-gray-400">

24 OpenAI products call the OpenAI-hosted tunnel endpoint; `tunnel-client`

25 long-polls for queued work and returns the MCP response through the same

26 tunnel.

27 </figcaption>

28</figure>

30## Before you start

32You need:

34- A `tunnel_id` from [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels).

35- A runtime API key for `tunnel-client`. The key principal needs Tunnels **Read** + **Use** for the target tunnel.

36- A tunnel manager with Tunnels **Read** + **Manage** if you need to create or edit tunnel metadata.

37- An MCP server that `tunnel-client` can reach over stdio or HTTP from inside your network.

39## Set up tunnel-client

41Open [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels), then download the latest public `tunnel-client` release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest). Keep your runbook pointed at the latest-release URL instead of hard-coding a specific release URL.

43For a local stdio MCP server, the shortest profile-based flow is:

45```bash

46export CONTROL_PLANE_API_KEY="sk-..."

48tunnel-client init \

49 --sample sample_mcp_stdio_local \

50 --profile local-stdio \

51 --tunnel-id tunnel_0123456789abcdef0123456789abcdef \

52 --mcp-command "python /path/to/server.py"

54tunnel-client doctor --profile local-stdio --explain

55tunnel-client run --profile local-stdio

56```

58For an HTTP MCP server, use `--mcp-server-url https://mcp.internal.example.com/mcp` instead of `--mcp-command`.

60Keep `tunnel-client run ...` healthy while you create or test the connector. Connector discovery and MCP tool calls depend on the running client.

62## Connect from ChatGPT

64Open [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors), create a custom connector, and choose **Tunnel** under **Connection**. Select an available tunnel when ChatGPT lists it, or paste a valid `tunnel_id` if you already have one.

66If the tunnel does not appear in ChatGPT, verify that the tunnel is associated with the target workspace and that the connector operator has Tunnels **Read** + **Use**.

68## Security and networking

70<figure className="not-prose my-8">

71 <figcaption className="mt-3 text-sm text-gray-600 dark:text-gray-400">

72 The private MCP server stays inside the customer-controlled environment.

73 `tunnel-client` reaches OpenAI over outbound HTTPS using the runtime API key

74 and, when required, optional control-plane mTLS.

75 </figcaption>

76</figure>

78- The MCP server address stays private and is used only from inside the environment where `tunnel-client` runs.

79- `tunnel-client` authenticates to the OpenAI tunnel control plane; supported OpenAI products use the OpenAI-hosted tunnel endpoint.

80- Tunnel access follows the existing organization and workspace context instead of introducing a separate public ingress path.

81- `tunnel-client` supports enterprise networking requirements such as outbound proxies, custom CA bundles, control-plane client certificates, and MCP-side `mTLS`.

82- The embedded Harpoon MCP server is limited to labeled, allowlisted HTTP callouts used by flows such as OAuth metadata handling. It is not a general-purpose outbound proxy.

84## Troubleshooting

86- **Tunnel not visible in ChatGPT:** Check the tunnel workspace scope and the connector operator's Tunnels **Use** permission.

87- **Connector discovery or tool calls fail:** Confirm that `tunnel-client run ...` is still running, then re-run `tunnel-client doctor --profile <name> --explain`.

88- **You can inspect a tunnel but cannot edit it:** The operator likely has Tunnels **Read** but not Tunnels **Manage**.

89- `tunnel-client` exposes `/healthz`, `/readyz`, `/metrics`, and a local admin UI at `/ui`.

90- Use those surfaces to confirm that the client is healthy, ready, and polling before testing from ChatGPT, Codex, or an API flow.

91- If the client is not connected, requests through the tunnel fail until `tunnel-client` reconnects.

92- Raw HTTP logging is disabled by default, and support exports are redacted.
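A quick pre-flight check can poll the health endpoints before you test a connector. This sketch makes several assumptions: the base URL and port are hypothetical and must be replaced with wherever your client listens, and treating HTTP 200 as "healthy" is an assumption about the endpoints rather than documented behavior. Only the `/healthz` and `/readyz` paths come from the list above.

```python
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

HEALTH_PATHS = ("/healthz", "/readyz")


def http_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the host is unreachable."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        return error.code  # the server answered, but with an error status
    except (URLError, OSError):
        return None  # connection refused, timeout, DNS failure, etc.


def check_client(base_url, fetch_status=http_status):
    """Map each health path to True when it answers with HTTP 200 (assumed healthy)."""
    base = base_url.rstrip("/")
    return {path: fetch_status(base + path) == 200 for path in HEALTH_PATHS}


if __name__ == "__main__":
    # Hypothetical address; use the host and port your tunnel-client exposes.
    print(check_client("http://127.0.0.1:8080"))
```

If either path reports `False`, re-run `tunnel-client doctor --profile <name> --explain` before retrying from ChatGPT, Codex, or an API flow.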

94## OAuth

96- OAuth discovery can travel through the tunnel path so the MCP server itself can remain private.

97- The tunnel preserves the upstream authorization server metadata needed for browser-facing OAuth flows.

98- The authorization server itself is not automatically tunneled. If it is unreachable from the public internet and from the `tunnel-client` host, the OAuth flow can still fail even when the MCP server is reachable.

100## Where to configure it

101

102- Manage OpenAI-hosted MCP tunnel endpoints in [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels).

103- Use a tunnel when creating a connector from [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors).

104- For Codex or API flows, use the tunnel-backed MCP target exposed by the supported product surface.

105

106## Next steps

107

108- Create or manage the tunnel in [Platform tunnel settings](https://platform.openai.com/settings/organization/tunnels).

109- Validate your `tunnel-client` profile with `tunnel-client doctor --profile <profile> --explain`.

110- Connect the tunnel from [ChatGPT connector settings](https://chatgpt.com/#settings/Connectors) or the supported OpenAI surface you are using.

111

112<div class="not-prose my-8 grid gap-4 lg:grid-cols-2">

113 <figure>

114 <a href="https://platform.openai.com/settings/organization/tunnels">

115 <img src="https://developers.openai.com/images/platform/guides/secure-mcp-tunnels/platform-tunnels-settings.png"

116 alt="Sanitized OpenAI Platform tunnel settings screenshot."

117 loading="lazy"

118 class="w-full rounded-md border border-gray-200 dark:border-gray-800"

119 />

120 </a>

121 <figcaption class="mt-3 text-sm text-gray-600 dark:text-gray-400">

122 Create and manage OpenAI-hosted MCP tunnel endpoints from Platform tunnel

123 settings.

124 </figcaption>

125 </figure>

126 <figure>

127 <a href="https://chatgpt.com/#settings/Connectors">

128 <img src="https://developers.openai.com/images/platform/guides/secure-mcp-tunnels/chatgpt-connectors-tunnel.png"

129 alt="Sanitized ChatGPT connector settings screenshot with Tunnel selected."

130 loading="lazy"

131 class="w-full rounded-md border border-gray-200 dark:border-gray-800"

132 />

133 </a>

134 <figcaption class="mt-3 text-sm text-gray-600 dark:text-gray-400">

135 Select Tunnel when connecting a ChatGPT connector to a private MCP server.

136 </figcaption>

137 </figure>

138</div>

guides/tools-connectors-mcp.md +4 −0


17 17

18This guide will show how to use both remote MCP servers and connectors to give the model access to new capabilities.18This guide will show how to use both remote MCP servers and connectors to give the model access to new capabilities.

19 19

20## Secure MCP Tunnel

22If your MCP server is private, use [Secure MCP Tunnel](https://developers.openai.com/api/docs/guides/secure-mcp-tunnels) to connect it to supported OpenAI products without exposing the server to the public internet. Download the latest public release from [openai/tunnel-client](https://github.com/openai/tunnel-client/releases/latest).

20## Quickstart24## Quickstart

21 25

22Check out the examples below to see how remote MCP servers and connectors work through the [Responses API](https://developers.openai.com/api/docs/api-reference/responses/create). Both connectors and remote MCP servers can be used with the `mcp` built-in tool type.26Check out the examples below to see how remote MCP servers and connectors work through the [Responses API](https://developers.openai.com/api/docs/api-reference/responses/create). Both connectors and remote MCP servers can be used with the `mcp` built-in tool type.

Documentation 2026-05-19 11:58 UTC to 2026-05-20 06:35 UTC