python/resources/realtime/subresources/calls/index.md +0 −1685 deleted
# Calls

## Accept call

`realtime.calls.accept(call_id: str, **kwargs: CallAcceptParams)`

**post** `/realtime/calls/{call_id}/accept`

Accept an incoming SIP call and configure the realtime session that will
handle it.

### Parameters

- `call_id: str`
15
16- `type: Literal["realtime"]`
17
18 The type of session to create. Always `realtime` for the Realtime API.
19
20 - `"realtime"`
21
22- `audio: Optional[RealtimeAudioConfigParam]`
23
24 Configuration for input and output audio.
25
26 - `input: Optional[RealtimeAudioConfigInput]`
27
28 - `format: Optional[RealtimeAudioFormats]`
29
30 The format of the input audio.
31
32 - `class AudioPCM: …`
33
34 The PCM audio format. Only a 24kHz sample rate is supported.
35
36 - `rate: Optional[Literal[24000]]`
37
38 The sample rate of the audio. Always `24000`.
39
40 - `24000`
41
42 - `type: Optional[Literal["audio/pcm"]]`
43
44 The audio format. Always `audio/pcm`.
45
46 - `"audio/pcm"`
47
48 - `class AudioPCMU: …`
49
50 The G.711 μ-law format.
51
52 - `type: Optional[Literal["audio/pcmu"]]`
53
54 The audio format. Always `audio/pcmu`.
55
56 - `"audio/pcmu"`
57
58 - `class AudioPCMA: …`
59
60 The G.711 A-law format.
61
62 - `type: Optional[Literal["audio/pcma"]]`
63
64 The audio format. Always `audio/pcma`.
65
66 - `"audio/pcma"`
67
68 - `noise_reduction: Optional[NoiseReduction]`
69
70 Configuration for input audio noise reduction. This can be set to `null` to turn off.
71 Noise reduction filters audio added to the input audio buffer before it is sent to VAD and the model.
72 Filtering the audio can improve VAD and turn detection accuracy (reducing false positives) and model performance by improving perception of the input audio.
73
74 - `type: Optional[NoiseReductionType]`
75
76 Type of noise reduction. `near_field` is for close-talking microphones such as headphones, `far_field` is for far-field microphones such as laptop or conference room microphones.
77
78 - `"near_field"`
79
80 - `"far_field"`
81
82 - `transcription: Optional[AudioTranscription]`
83
 Configuration for input audio transcription. Defaults to off and can be set to `null` to turn off once on. Input audio transcription is not native to the model, since the model consumes audio directly. Transcription runs asynchronously through [the /audio/transcriptions endpoint](https://platform.openai.com/docs/api-reference/audio/createTranscription) and should be treated as guidance on the input audio content rather than precisely what the model heard. The client can optionally set the language and prompt for transcription; these offer additional guidance to the transcription service.
85
86 - `delay: Optional[Literal["minimal", "low", "medium", 2 more]]`
87
88 Controls how long the model waits before emitting transcription text.
89 Higher values can improve transcription accuracy at the cost of latency.
90 Only supported with `gpt-realtime-whisper` in GA Realtime sessions.
91
92 - `"minimal"`
93
94 - `"low"`
95
96 - `"medium"`
97
98 - `"high"`
99
100 - `"xhigh"`
101
102 - `language: Optional[str]`
103
104 The language of the input audio. Supplying the input language in
105 [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) (e.g. `en`) format
106 will improve accuracy and latency.
107
108 - `model: Optional[Union[str, Literal["whisper-1", "gpt-4o-mini-transcribe", "gpt-4o-mini-transcribe-2025-12-15", 3 more], null]]`
109
110 The model to use for transcription. Current options are `whisper-1`, `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `gpt-4o-transcribe`, `gpt-4o-transcribe-diarize`, and `gpt-realtime-whisper`. Use `gpt-4o-transcribe-diarize` when you need diarization with speaker labels.
111
112 - `str`
113
114 - `Literal["whisper-1", "gpt-4o-mini-transcribe", "gpt-4o-mini-transcribe-2025-12-15", 3 more]`
115
116 The model to use for transcription. Current options are `whisper-1`, `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `gpt-4o-transcribe`, `gpt-4o-transcribe-diarize`, and `gpt-realtime-whisper`. Use `gpt-4o-transcribe-diarize` when you need diarization with speaker labels.
117
118 - `"whisper-1"`
119
120 - `"gpt-4o-mini-transcribe"`
121
122 - `"gpt-4o-mini-transcribe-2025-12-15"`
123
124 - `"gpt-4o-transcribe"`
125
126 - `"gpt-4o-transcribe-diarize"`
127
128 - `"gpt-realtime-whisper"`
129
130 - `prompt: Optional[str]`
131
132 An optional text to guide the model's style or continue a previous audio
133 segment.
134 For `whisper-1`, the [prompt is a list of keywords](https://platform.openai.com/docs/guides/speech-to-text#prompting).
135 For `gpt-4o-transcribe` models (excluding `gpt-4o-transcribe-diarize`), the prompt is a free text string, for example "expect words related to technology".
136 Prompt is not supported with `gpt-realtime-whisper` in GA Realtime sessions.
137
138 - `turn_detection: Optional[RealtimeAudioInputTurnDetection]`
139
 Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger a model response.
141
142 Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech.
143
144 Semantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with "uhhm", the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.
145
146 For `gpt-realtime-whisper` transcription sessions, turn detection must be
147 set to `null`; VAD is not supported.
148
149 - `class ServerVad: …`
150
151 Server-side voice activity detection (VAD) which flips on when user speech is detected and off after a period of silence.
152
153 - `type: Literal["server_vad"]`
154
155 Type of turn detection, `server_vad` to turn on simple Server VAD.
156
157 - `"server_vad"`
158
159 - `create_response: Optional[bool]`
160
161 Whether or not to automatically generate a response when a VAD stop event occurs. If `interrupt_response` is set to `false` this may fail to create a response if the model is already responding.
162
163 If both `create_response` and `interrupt_response` are set to `false`, the model will never respond automatically but VAD events will still be emitted.
164
165 - `idle_timeout_ms: Optional[int]`
166
167 Optional timeout after which a model response will be triggered automatically. This is
168 useful for situations in which a long pause from the user is unexpected, such as a phone
169 call. The model will effectively prompt the user to continue the conversation based
170 on the current context.
171
172 The timeout value will be applied after the last model response's audio has finished playing,
173 i.e. it's set to the `response.done` time plus audio playback duration.
174
175 An `input_audio_buffer.timeout_triggered` event (plus events
176 associated with the Response) will be emitted when the timeout is reached.
177 Idle timeout is currently only supported for `server_vad` mode.
178
179 - `interrupt_response: Optional[bool]`
180
181 Whether or not to automatically interrupt (cancel) any ongoing response with output to the default
182 conversation (i.e. `conversation` of `auto`) when a VAD start event occurs. If `true` then the response will be cancelled, otherwise it will continue until complete.
183
184 If both `create_response` and `interrupt_response` are set to `false`, the model will never respond automatically but VAD events will still be emitted.
185
186 - `prefix_padding_ms: Optional[int]`
187
188 Used only for `server_vad` mode. Amount of audio to include before the VAD detected speech (in
189 milliseconds). Defaults to 300ms.
190
191 - `silence_duration_ms: Optional[int]`
192
193 Used only for `server_vad` mode. Duration of silence to detect speech stop (in milliseconds). Defaults
194 to 500ms. With shorter values the model will respond more quickly,
195 but may jump in on short pauses from the user.
196
197 - `threshold: Optional[float]`
198
 Used only for `server_vad` mode. Activation threshold for VAD (0.0 to 1.0). Defaults to 0.5. A
 higher threshold will require louder audio to activate the model, and
 thus might perform better in noisy environments.
202
203 - `class SemanticVad: …`
204
205 Server-side semantic turn detection which uses a model to determine when the user has finished speaking.
206
207 - `type: Literal["semantic_vad"]`
208
209 Type of turn detection, `semantic_vad` to turn on Semantic VAD.
210
211 - `"semantic_vad"`
212
213 - `create_response: Optional[bool]`
214
215 Whether or not to automatically generate a response when a VAD stop event occurs.
216
217 - `eagerness: Optional[Literal["low", "medium", "high", "auto"]]`
218
219 Used only for `semantic_vad` mode. The eagerness of the model to respond. `low` will wait longer for the user to continue speaking, `high` will respond more quickly. `auto` is the default and is equivalent to `medium`. `low`, `medium`, and `high` have max timeouts of 8s, 4s, and 2s respectively.
220
221 - `"low"`
222
223 - `"medium"`
224
225 - `"high"`
226
227 - `"auto"`
228
229 - `interrupt_response: Optional[bool]`
230
231 Whether or not to automatically interrupt any ongoing response with output to the default
232 conversation (i.e. `conversation` of `auto`) when a VAD start event occurs.
233
234 - `output: Optional[RealtimeAudioConfigOutput]`
235
236 - `format: Optional[RealtimeAudioFormats]`
237
238 The format of the output audio.
239
240 - `speed: Optional[float]`
241
242 The speed of the model's spoken response as a multiple of the original speed.
243 1.0 is the default speed. 0.25 is the minimum speed. 1.5 is the maximum speed. This value can only be changed in between model turns, not while a response is in progress.
244
 This parameter is a post-processing adjustment to the audio after it is generated. It is
 also possible to prompt the model to speak faster or slower.
247
248 - `voice: Optional[Voice]`
249
250 The voice the model uses to respond. Supported built-in voices are
251 `alloy`, `ash`, `ballad`, `coral`, `echo`, `sage`, `shimmer`, `verse`,
252 `marin`, and `cedar`. You may also provide a custom voice object with
253 an `id`, for example `{ "id": "voice_1234" }`. Voice cannot be changed
254 during the session once the model has responded with audio at least once.
255 We recommend `marin` and `cedar` for best quality.
256
257 - `str`
258
259 - `Literal["alloy", "ash", "ballad", 7 more]`
260
261 - `"alloy"`
262
263 - `"ash"`
264
265 - `"ballad"`
266
267 - `"coral"`
268
269 - `"echo"`
270
271 - `"sage"`
272
273 - `"shimmer"`
274
275 - `"verse"`
276
277 - `"marin"`
278
279 - `"cedar"`
280
281 - `class VoiceID: …`
282
283 Custom voice reference.
284
285 - `id: str`
286
287 The custom voice ID, e.g. `voice_1234`.
288
289- `include: Optional[List[Literal["item.input_audio_transcription.logprobs"]]]`
290
291 Additional fields to include in server outputs.
292
293 `item.input_audio_transcription.logprobs`: Include logprobs for input audio transcription.
294
295 - `"item.input_audio_transcription.logprobs"`
296
297- `instructions: Optional[str]`
298
 The default system instructions (i.e. system message) prepended to model calls. This field allows the client to guide the model on desired responses. The model can be instructed on response content and format (e.g. "be extremely succinct", "act friendly", "here are examples of good responses") and on audio behavior (e.g. "talk quickly", "inject emotion into your voice", "laugh frequently"). The instructions are not guaranteed to be followed by the model, but they provide guidance to the model on the desired behavior.
300
301 Note that the server sets default instructions which will be used if this field is not set and are visible in the `session.created` event at the start of the session.
302
303- `max_output_tokens: Optional[Union[int, Literal["inf"]]]`
304
305 Maximum number of output tokens for a single assistant response,
306 inclusive of tool calls. Provide an integer between 1 and 4096 to
307 limit output tokens, or `inf` for the maximum available tokens for a
308 given model. Defaults to `inf`.
309
310 - `int`
311
312 - `Literal["inf"]`
313
314 - `"inf"`
315
316- `model: Optional[Union[str, Literal["gpt-realtime", "gpt-realtime-1.5", "gpt-realtime-2", 14 more]]]`
317
318 The Realtime model used for this session.
319
320 - `str`
321
322 - `Literal["gpt-realtime", "gpt-realtime-1.5", "gpt-realtime-2", 14 more]`
323
324 The Realtime model used for this session.
325
326 - `"gpt-realtime"`
327
328 - `"gpt-realtime-1.5"`
329
330 - `"gpt-realtime-2"`
331
332 - `"gpt-realtime-2025-08-28"`
333
334 - `"gpt-4o-realtime-preview"`
335
336 - `"gpt-4o-realtime-preview-2024-10-01"`
337
338 - `"gpt-4o-realtime-preview-2024-12-17"`
339
340 - `"gpt-4o-realtime-preview-2025-06-03"`
341
342 - `"gpt-4o-mini-realtime-preview"`
343
344 - `"gpt-4o-mini-realtime-preview-2024-12-17"`
345
346 - `"gpt-realtime-mini"`
347
348 - `"gpt-realtime-mini-2025-10-06"`
349
350 - `"gpt-realtime-mini-2025-12-15"`
351
352 - `"gpt-audio-1.5"`
353
354 - `"gpt-audio-mini"`
355
356 - `"gpt-audio-mini-2025-10-06"`
357
358 - `"gpt-audio-mini-2025-12-15"`
359
360- `output_modalities: Optional[List[Literal["text", "audio"]]]`
361
362 The set of modalities the model can respond with. It defaults to `["audio"]`, indicating
363 that the model will respond with audio plus a transcript. `["text"]` can be used to make
364 the model respond with text only. It is not possible to request both `text` and `audio` at the same time.
365
366 - `"text"`
367
368 - `"audio"`
369
370- `parallel_tool_calls: Optional[bool]`
371
372 Whether the model may call multiple tools in parallel. Only supported by
373 reasoning Realtime models such as `gpt-realtime-2`.
374
375- `prompt: Optional[ResponsePromptParam]`
376
377 Reference to a prompt template and its variables.
378 [Learn more](https://platform.openai.com/docs/guides/text?api-mode=responses#reusable-prompts).
379
380 - `id: str`
381
382 The unique identifier of the prompt template to use.
383
384 - `variables: Optional[Dict[str, Variables]]`
385
386 Optional map of values to substitute in for variables in your
387 prompt. The substitution values can either be strings, or other
388 Response input types like images or files.
389
390 - `str`
391
392 - `class ResponseInputText: …`
393
394 A text input to the model.
395
396 - `text: str`
397
398 The text input to the model.
399
400 - `type: Literal["input_text"]`
401
402 The type of the input item. Always `input_text`.
403
404 - `"input_text"`
405
406 - `class ResponseInputImage: …`
407
408 An image input to the model. Learn about [image inputs](https://platform.openai.com/docs/guides/vision).
409
410 - `detail: Literal["low", "high", "auto", "original"]`
411
412 The detail level of the image to be sent to the model. One of `high`, `low`, `auto`, or `original`. Defaults to `auto`.
413
414 - `"low"`
415
416 - `"high"`
417
418 - `"auto"`
419
420 - `"original"`
421
422 - `type: Literal["input_image"]`
423
424 The type of the input item. Always `input_image`.
425
426 - `"input_image"`
427
428 - `file_id: Optional[str]`
429
430 The ID of the file to be sent to the model.
431
432 - `image_url: Optional[str]`
433
434 The URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
435
436 - `class ResponseInputFile: …`
437
438 A file input to the model.
439
440 - `type: Literal["input_file"]`
441
442 The type of the input item. Always `input_file`.
443
444 - `"input_file"`
445
446 - `detail: Optional[Literal["low", "high"]]`
447
448 The detail level of the file to be sent to the model. Use `low` for the default rendering behavior, or `high` to render the file at higher quality. Defaults to `low`.
449
450 - `"low"`
451
452 - `"high"`
453
454 - `file_data: Optional[str]`
455
456 The content of the file to be sent to the model.
457
458 - `file_id: Optional[str]`
459
460 The ID of the file to be sent to the model.
461
462 - `file_url: Optional[str]`
463
464 The URL of the file to be sent to the model.
465
466 - `filename: Optional[str]`
467
468 The name of the file to be sent to the model.
469
470 - `version: Optional[str]`
471
472 Optional version of the prompt template.
473
474- `reasoning: Optional[RealtimeReasoningParam]`
475
476 Configuration for reasoning-capable Realtime models such as `gpt-realtime-2`.
477
478 - `effort: Optional[RealtimeReasoningEffort]`
479
480 Constrains effort on reasoning for reasoning-capable Realtime models such as
481 `gpt-realtime-2`.
482
483 - `"minimal"`
484
485 - `"low"`
486
487 - `"medium"`
488
489 - `"high"`
490
491 - `"xhigh"`
492
493- `tool_choice: Optional[RealtimeToolChoiceConfigParam]`
494
495 How the model chooses tools. Provide one of the string modes or force a specific
496 function/MCP tool.
497
498 - `Literal["none", "auto", "required"]`
499
500 - `"none"`
501
502 - `"auto"`
503
504 - `"required"`
505
506 - `class ToolChoiceFunction: …`
507
508 Use this option to force the model to call a specific function.
509
510 - `name: str`
511
512 The name of the function to call.
513
514 - `type: Literal["function"]`
515
516 For function calling, the type is always `function`.
517
518 - `"function"`
519
520 - `class ToolChoiceMcp: …`
521
522 Use this option to force the model to call a specific tool on a remote MCP server.
523
524 - `server_label: str`
525
526 The label of the MCP server to use.
527
528 - `type: Literal["mcp"]`
529
530 For MCP tools, the type is always `mcp`.
531
532 - `"mcp"`
533
534 - `name: Optional[str]`
535
536 The name of the tool to call on the server.
537
538- `tools: Optional[RealtimeToolsConfigParam]`
539
540 Tools available to the model.
541
542 - `class RealtimeFunctionTool: …`
543
544 - `description: Optional[str]`
545
546 The description of the function, including guidance on when and how
547 to call it, and guidance about what to tell the user when calling
548 (if anything).
549
550 - `name: Optional[str]`
551
552 The name of the function.
553
554 - `parameters: Optional[object]`
555
556 Parameters of the function in JSON Schema.
557
558 - `type: Optional[Literal["function"]]`
559
560 The type of the tool, i.e. `function`.
561
562 - `"function"`
563
564 - `class Mcp: …`
565
566 Give the model access to additional tools via remote Model Context Protocol
567 (MCP) servers. [Learn more about MCP](https://platform.openai.com/docs/guides/tools-remote-mcp).
568
569 - `server_label: str`
570
571 A label for this MCP server, used to identify it in tool calls.
572
573 - `type: Literal["mcp"]`
574
575 The type of the MCP tool. Always `mcp`.
576
577 - `"mcp"`
578
579 - `allowed_tools: Optional[McpAllowedTools]`
580
581 List of allowed tool names or a filter object.
582
583 - `List[str]`
584
 A string array of allowed tool names.
586
587 - `class McpAllowedToolsMcpToolFilter: …`
588
589 A filter object to specify which tools are allowed.
590
591 - `read_only: Optional[bool]`
592
593 Indicates whether or not a tool modifies data or is read-only. If an
594 MCP server is [annotated with `readOnlyHint`](https://modelcontextprotocol.io/specification/2025-06-18/schema#toolannotations-readonlyhint),
595 it will match this filter.
596
597 - `tool_names: Optional[List[str]]`
598
599 List of allowed tool names.
600
601 - `authorization: Optional[str]`
602
603 An OAuth access token that can be used with a remote MCP server, either
604 with a custom MCP server URL or a service connector. Your application
605 must handle the OAuth authorization flow and provide the token here.
606
607 - `connector_id: Optional[Literal["connector_dropbox", "connector_gmail", "connector_googlecalendar", 5 more]]`
608
609 Identifier for service connectors, like those available in ChatGPT. One of
610 `server_url` or `connector_id` must be provided. Learn more about service
611 connectors [here](https://platform.openai.com/docs/guides/tools-remote-mcp#connectors).
612
613 Currently supported `connector_id` values are:
614
615 - Dropbox: `connector_dropbox`
616 - Gmail: `connector_gmail`
617 - Google Calendar: `connector_googlecalendar`
618 - Google Drive: `connector_googledrive`
619 - Microsoft Teams: `connector_microsoftteams`
620 - Outlook Calendar: `connector_outlookcalendar`
621 - Outlook Email: `connector_outlookemail`
622 - SharePoint: `connector_sharepoint`
623
624 - `"connector_dropbox"`
625
626 - `"connector_gmail"`
627
628 - `"connector_googlecalendar"`
629
630 - `"connector_googledrive"`
631
632 - `"connector_microsoftteams"`
633
634 - `"connector_outlookcalendar"`
635
636 - `"connector_outlookemail"`
637
638 - `"connector_sharepoint"`
639
640 - `defer_loading: Optional[bool]`
641
642 Whether this MCP tool is deferred and discovered via tool search.
643
644 - `headers: Optional[Dict[str, str]]`
645
646 Optional HTTP headers to send to the MCP server. Use for authentication
647 or other purposes.
648
649 - `require_approval: Optional[McpRequireApproval]`
650
651 Specify which of the MCP server's tools require approval.
652
653 - `class McpRequireApprovalMcpToolApprovalFilter: …`
654
655 Specify which of the MCP server's tools require approval. Can be
656 `always`, `never`, or a filter object associated with tools
657 that require approval.
658
659 - `always: Optional[McpRequireApprovalMcpToolApprovalFilterAlways]`
660
661 A filter object to specify which tools are allowed.
662
663 - `read_only: Optional[bool]`
664
665 Indicates whether or not a tool modifies data or is read-only. If an
666 MCP server is [annotated with `readOnlyHint`](https://modelcontextprotocol.io/specification/2025-06-18/schema#toolannotations-readonlyhint),
667 it will match this filter.
668
669 - `tool_names: Optional[List[str]]`
670
671 List of allowed tool names.
672
673 - `never: Optional[McpRequireApprovalMcpToolApprovalFilterNever]`
674
675 A filter object to specify which tools are allowed.
676
677 - `read_only: Optional[bool]`
678
679 Indicates whether or not a tool modifies data or is read-only. If an
680 MCP server is [annotated with `readOnlyHint`](https://modelcontextprotocol.io/specification/2025-06-18/schema#toolannotations-readonlyhint),
681 it will match this filter.
682
683 - `tool_names: Optional[List[str]]`
684
685 List of allowed tool names.
686
687 - `Literal["always", "never"]`
688
 Specify a single approval policy for all tools. One of `always` or
 `never`. When set to `always`, all tools will require approval. When
 set to `never`, no tools will require approval.
692
693 - `"always"`
694
695 - `"never"`
696
697 - `server_description: Optional[str]`
698
699 Optional description of the MCP server, used to provide more context.
700
701 - `server_url: Optional[str]`
702
703 The URL for the MCP server. One of `server_url` or `connector_id` must be
704 provided.
705
706- `tracing: Optional[RealtimeTracingConfigParam]`
707
708 Realtime API can write session traces to the [Traces Dashboard](https://platform.openai.com/logs?api=traces). Set to null to disable tracing. Once
709 tracing is enabled for a session, the configuration cannot be modified.
710
711 `auto` will create a trace for the session with default values for the
712 workflow name, group id, and metadata.
713
714 - `Literal["auto"]`
715
716 Enables tracing and sets default values for tracing configuration options. Always `auto`.
717
718 - `"auto"`
719
720 - `class TracingConfiguration: …`
721
722 Granular configuration for tracing.
723
724 - `group_id: Optional[str]`
725
726 The group id to attach to this trace to enable filtering and
727 grouping in the Traces Dashboard.
728
729 - `metadata: Optional[object]`
730
731 The arbitrary metadata to attach to this trace to enable
732 filtering in the Traces Dashboard.
733
734 - `workflow_name: Optional[str]`
735
736 The name of the workflow to attach to this trace. This is used to
737 name the trace in the Traces Dashboard.
738
739- `truncation: Optional[RealtimeTruncationParam]`
740
 When the number of tokens in a conversation exceeds the model's input token limit, the conversation will be truncated, meaning messages (starting from the oldest) will not be included in the model's context. A 32k context model with 4,096 max output tokens can only include 28,224 tokens in the context before truncation occurs.
742
743 Clients can configure truncation behavior to truncate with a lower max token limit, which is an effective way to control token usage and cost.
744
745 Truncation will reduce the number of cached tokens on the next turn (busting the cache), since messages are dropped from the beginning of the context. However, clients can also configure truncation to retain messages up to a fraction of the maximum context size, which will reduce the need for future truncations and thus improve the cache rate.
746
747 Truncation can be disabled entirely, which means the server will never truncate but would instead return an error if the conversation exceeds the model's input token limit.
748
749 - `Literal["auto", "disabled"]`
750
751 The truncation strategy to use for the session. `auto` is the default truncation strategy. `disabled` will disable truncation and emit errors when the conversation exceeds the input token limit.
752
753 - `"auto"`
754
755 - `"disabled"`
756
757 - `class RealtimeTruncationRetentionRatio: …`
758
759 Retain a fraction of the conversation tokens when the conversation exceeds the input token limit. This allows you to amortize truncations across multiple turns, which can help improve cached token usage.
760
761 - `retention_ratio: float`
762
763 Fraction of post-instruction conversation tokens to retain (`0.0` - `1.0`) when the conversation exceeds the input token limit. Setting this to `0.8` means that messages will be dropped until 80% of the maximum allowed tokens are used. This helps reduce the frequency of truncations and improve cache rates.
764
765 - `type: Literal["retention_ratio"]`
766
767 Use retention ratio truncation.
768
769 - `"retention_ratio"`
770
771 - `token_limits: Optional[TokenLimits]`
772
773 Optional custom token limits for this truncation strategy. If not provided, the model's default token limits will be used.
774
775 - `post_instructions: Optional[int]`
776
 Maximum tokens allowed in the conversation after instructions (including tool definitions). For example, setting this to 5,000 would mean that truncation would occur when the conversation exceeds 5,000 tokens after instructions. This cannot be higher than the model's context window size minus the maximum output tokens.
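
The retention-ratio truncation described above can be illustrated with a small local simulation. This is only a sketch under stated assumptions: the server's actual truncation algorithm is not specified in this reference, and `truncate_oldest`, the per-message token counts, and the 1,000-token limit are all hypothetical names and values chosen for illustration.

```python
from typing import List


def truncate_oldest(
    message_tokens: List[int], token_limit: int, retention_ratio: float
) -> List[int]:
    """Drop the oldest messages once the limit is exceeded (illustrative only).

    Assumption: truncation triggers only when the total exceeds
    `token_limit`, and then removes messages from the front until the
    total is at or below `retention_ratio * token_limit`.
    """
    if not 0.0 <= retention_ratio <= 1.0:
        raise ValueError("retention_ratio must be between 0.0 and 1.0")
    total = sum(message_tokens)
    if total <= token_limit:
        return list(message_tokens)  # under the limit, nothing is dropped
    target = retention_ratio * token_limit
    kept = list(message_tokens)
    while kept and total > target:
        total -= kept.pop(0)  # the oldest message is dropped first
    return kept


# Hypothetical conversation: five messages of 300 tokens each (1,500 total).
history = [300, 300, 300, 300, 300]
kept = truncate_oldest(history, token_limit=1000, retention_ratio=0.8)
print(kept, sum(kept))
```

In this simulation the conversation is cut back to 600 tokens, leaving headroom for further turns before the next truncation, which is the amortization effect the `retention_ratio` description refers to. The corresponding session setting would be a dict such as `{"type": "retention_ratio", "retention_ratio": 0.8}`.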
778
### Example

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),  # This is the default and can be omitted
)
client.realtime.calls.accept(
    call_id="call_id",
    type="realtime",
)
```
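
A fuller configuration might combine several of the parameters documented above. The following is a sketch, not a recommended setup: `build_accept_kwargs` is a hypothetical helper, the instruction text is invented, and the specific values (audio format, voice, thresholds) are picked arbitrarily from the documented options.

```python
from typing import Any, Dict


def build_accept_kwargs() -> Dict[str, Any]:
    """Assemble keyword arguments for `realtime.calls.accept` (illustrative)."""
    return {
        "type": "realtime",
        "model": "gpt-realtime",
        "instructions": "You are a concise phone assistant.",  # hypothetical text
        "audio": {
            "input": {
                "format": {"type": "audio/pcmu"},  # G.711 μ-law
                "noise_reduction": {"type": "near_field"},
                "transcription": {"model": "gpt-4o-mini-transcribe", "language": "en"},
                "turn_detection": {
                    "type": "server_vad",
                    "threshold": 0.5,  # documented default
                    "prefix_padding_ms": 300,  # documented default
                    "silence_duration_ms": 500,  # documented default
                    "create_response": True,
                    "interrupt_response": True,
                },
            },
            "output": {"format": {"type": "audio/pcmu"}, "voice": "marin", "speed": 1.0},
        },
        "output_modalities": ["audio"],
        "max_output_tokens": 1024,
        "truncation": {"type": "retention_ratio", "retention_ratio": 0.8},
        "tracing": "auto",
    }


kwargs = build_accept_kwargs()
# With a configured client and a real call ID (neither is created here):
# client.realtime.calls.accept(call_id="call_id", **kwargs)
```

Building the arguments as a plain dict keeps the configuration inspectable and testable without a network call; the commented line shows where it would be passed to the client.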
793
## Hang up call

`realtime.calls.hangup(call_id: str)`

**post** `/realtime/calls/{call_id}/hangup`

End an active Realtime API call, whether it was initiated over SIP or
WebRTC.

### Parameters

- `call_id: str`

### Example

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),  # This is the default and can be omitted
)
client.realtime.calls.hangup(
    "call_id",
)
```
820
## Refer call

`realtime.calls.refer(call_id: str, **kwargs: CallReferParams)`

**post** `/realtime/calls/{call_id}/refer`

Transfer an active SIP call to a new destination using the SIP REFER verb.

### Parameters

- `call_id: str`

- `target_uri: str`

  URI that should appear in the SIP Refer-To header. Supports values like
  `tel:+14155550123` or `sip:agent@example.com`.

### Example

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),  # This is the default and can be omitted
)
client.realtime.calls.refer(
    call_id="call_id",
    target_uri="tel:+14155550123",
)
```
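
Because `target_uri` ends up in the SIP Refer-To header, it can be useful to check its shape locally before making the request. The sketch below assumes that only the two schemes shown in the parameter description (`tel:` and `sip:`) are wanted; the API may accept others, and both `is_supported_target_uri` and `refer_call` are hypothetical helpers, not part of the SDK.

```python
def is_supported_target_uri(target_uri: str) -> bool:
    """Return True for `tel:` or `sip:` URIs with a non-empty remainder.

    Assumption: only the two schemes from the documented examples are allowed.
    """
    scheme, sep, rest = target_uri.partition(":")
    return sep == ":" and scheme.lower() in ("tel", "sip") and bool(rest)


def refer_call(client, call_id: str, target_uri: str):
    """Validate the URI locally, then forward to `realtime.calls.refer`."""
    if not is_supported_target_uri(target_uri):
        raise ValueError(f"unsupported target_uri: {target_uri!r}")
    return client.realtime.calls.refer(call_id=call_id, target_uri=target_uri)


print(is_supported_target_uri("tel:+14155550123"))
print(is_supported_target_uri("https://example.com"))
```

`refer_call` takes the client as an argument, so it can be exercised with a stub in tests and with a real `OpenAI` client in production.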
852
## Reject call

`realtime.calls.reject(call_id: str, **kwargs: CallRejectParams)`

**post** `/realtime/calls/{call_id}/reject`

Decline an incoming SIP call by returning a SIP status code to the caller.

### Parameters

- `call_id: str`

- `status_code: Optional[int]`

  SIP response code to send back to the caller. Defaults to `603` (Decline)
  when omitted.

### Example

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),  # This is the default and can be omitted
)
client.realtime.calls.reject(
    call_id="call_id",
)
```
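
The example above relies on the default `603` response. To send a different SIP code, pass `status_code` explicitly. The sketch below is illustrative: `reject_call` is a hypothetical wrapper, `486` (Busy Here, from RFC 3261) is just one plausible choice, and a stand-in client records the calls so the snippet runs without network access.

```python
from types import SimpleNamespace

BUSY_HERE = 486  # standard SIP "Busy Here" response code (RFC 3261)


def reject_call(client, call_id: str, busy: bool = False):
    """Reject a call, sending 486 when busy, else the API default (603)."""
    if busy:
        return client.realtime.calls.reject(call_id=call_id, status_code=BUSY_HERE)
    return client.realtime.calls.reject(call_id=call_id)


# Stand-in client that records keyword arguments instead of calling the API.
recorded = []
fake_calls = SimpleNamespace(reject=lambda **kw: recorded.append(kw))
fake_client = SimpleNamespace(realtime=SimpleNamespace(calls=fake_calls))

reject_call(fake_client, "call_id", busy=True)
reject_call(fake_client, "call_id")
print(recorded)
```

With a real `OpenAI` client in place of `fake_client`, the first call would send `status_code=486` and the second would omit it, leaving the server to apply its default.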
883
## Create call

`realtime.calls.create(**kwargs: CallCreateParams) -> BinaryResponseContent`

**post** `/realtime/calls`

Create a new Realtime API call over WebRTC and receive the SDP answer needed
to complete the peer connection.

### Parameters

- `sdp: str`

  WebRTC Session Description Protocol (SDP) offer generated by the caller.
898
899- `session: Optional[RealtimeSessionCreateRequestParam]`
900
901 Realtime session object configuration.
902
903 - `type: Literal["realtime"]`
904
905 The type of session to create. Always `realtime` for the Realtime API.
906
907 - `"realtime"`
908
909 - `audio: Optional[RealtimeAudioConfig]`
910
911 Configuration for input and output audio.
912
913 - `input: Optional[RealtimeAudioConfigInput]`
914
915 - `format: Optional[RealtimeAudioFormats]`
916
917 The format of the input audio.
918
919 - `class AudioPCM: …`
920
921 The PCM audio format. Only a 24kHz sample rate is supported.
922
923 - `rate: Optional[Literal[24000]]`
924
925 The sample rate of the audio. Always `24000`.
926
927 - `24000`
928
929 - `type: Optional[Literal["audio/pcm"]]`
930
931 The audio format. Always `audio/pcm`.
932
933 - `"audio/pcm"`
934
935 - `class AudioPCMU: …`
936
937 The G.711 μ-law format.
938
939 - `type: Optional[Literal["audio/pcmu"]]`
940
941 The audio format. Always `audio/pcmu`.
942
943 - `"audio/pcmu"`
944
945 - `class AudioPCMA: …`
946
947 The G.711 A-law format.
948
949 - `type: Optional[Literal["audio/pcma"]]`
950
951 The audio format. Always `audio/pcma`.
952
953 - `"audio/pcma"`
954
955 - `noise_reduction: Optional[NoiseReduction]`
956
957 Configuration for input audio noise reduction. This can be set to `null` to turn off.
958 Noise reduction filters audio added to the input audio buffer before it is sent to VAD and the model.
959 Filtering the audio can improve VAD and turn detection accuracy (reducing false positives) and model performance by improving perception of the input audio.
960
961 - `type: Optional[NoiseReductionType]`
962
963 Type of noise reduction. `near_field` is for close-talking microphones such as headphones, `far_field` is for far-field microphones such as laptop or conference room microphones.
964
965 - `"near_field"`
966
967 - `"far_field"`
968
969 - `transcription: Optional[AudioTranscription]`
970
 Configuration for input audio transcription. Defaults to off and can be set to `null` to turn off once on. Input audio transcription is not native to the model, since the model consumes audio directly. Transcription runs asynchronously through [the /audio/transcriptions endpoint](https://platform.openai.com/docs/api-reference/audio/createTranscription) and should be treated as guidance on the input audio content rather than precisely what the model heard. The client can optionally set the language and prompt for transcription; these offer additional guidance to the transcription service.
972
973 - `delay: Optional[Literal["minimal", "low", "medium", 2 more]]`
974
975 Controls how long the model waits before emitting transcription text.
976 Higher values can improve transcription accuracy at the cost of latency.
977 Only supported with `gpt-realtime-whisper` in GA Realtime sessions.
978
979 - `"minimal"`
980
981 - `"low"`
982
983 - `"medium"`
984
985 - `"high"`
986
987 - `"xhigh"`
988
989 - `language: Optional[str]`
990
991 The language of the input audio. Supplying the input language in
992 [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) (e.g. `en`) format
993 will improve accuracy and latency.
994
995 - `model: Optional[Union[str, Literal["whisper-1", "gpt-4o-mini-transcribe", "gpt-4o-mini-transcribe-2025-12-15", 3 more], null]]`
996
997 The model to use for transcription. Current options are `whisper-1`, `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `gpt-4o-transcribe`, `gpt-4o-transcribe-diarize`, and `gpt-realtime-whisper`. Use `gpt-4o-transcribe-diarize` when you need diarization with speaker labels.
998
999 - `str`
1000
1001 - `Literal["whisper-1", "gpt-4o-mini-transcribe", "gpt-4o-mini-transcribe-2025-12-15", 3 more]`
1002
1003 The model to use for transcription. Current options are `whisper-1`, `gpt-4o-mini-transcribe`, `gpt-4o-mini-transcribe-2025-12-15`, `gpt-4o-transcribe`, `gpt-4o-transcribe-diarize`, and `gpt-realtime-whisper`. Use `gpt-4o-transcribe-diarize` when you need diarization with speaker labels.
1004
1005 - `"whisper-1"`
1006
1007 - `"gpt-4o-mini-transcribe"`
1008
1009 - `"gpt-4o-mini-transcribe-2025-12-15"`
1010
1011 - `"gpt-4o-transcribe"`
1012
1013 - `"gpt-4o-transcribe-diarize"`
1014
1015 - `"gpt-realtime-whisper"`
1016
1017 - `prompt: Optional[str]`
1018
1019 An optional text to guide the model's style or continue a previous audio
1020 segment.
1021 For `whisper-1`, the [prompt is a list of keywords](https://platform.openai.com/docs/guides/speech-to-text#prompting).
1022 For `gpt-4o-transcribe` models (excluding `gpt-4o-transcribe-diarize`), the prompt is a free text string, for example "expect words related to technology".
1023 Prompt is not supported with `gpt-realtime-whisper` in GA Realtime sessions.
1024
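The transcription constraints above (no `prompt` with `gpt-realtime-whisper`, `delay` only with `gpt-realtime-whisper`) can be checked client-side before sending the request. The following is a minimal sketch that builds the configuration as a plain dict; `build_transcription` is a hypothetical helper, not part of the SDK, and the server remains the authority on what is accepted.

```python
from typing import Any, Dict, Optional

# Delay levels listed for the `delay` field above.
DELAY_LEVELS = {"minimal", "low", "medium", "high", "xhigh"}


def build_transcription(
    model: str,
    language: Optional[str] = None,
    prompt: Optional[str] = None,
    delay: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble a transcription config as a plain dict."""
    is_realtime_whisper = model == "gpt-realtime-whisper"
    if prompt is not None and is_realtime_whisper:
        raise ValueError("prompt is not supported with gpt-realtime-whisper")
    if delay is not None:
        if not is_realtime_whisper:
            raise ValueError("delay is only supported with gpt-realtime-whisper")
        if delay not in DELAY_LEVELS:
            raise ValueError(f"unknown delay level: {delay!r}")

    # Only include the fields the caller actually set.
    config: Dict[str, Any] = {"model": model}
    if language is not None:
        config["language"] = language
    if prompt is not None:
        config["prompt"] = prompt
    if delay is not None:
        config["delay"] = delay
    return config


transcription = build_transcription(
    "gpt-4o-transcribe",
    language="en",
    prompt="expect words related to technology",
)
```

The resulting dict is intended for the `transcription` field of the input audio configuration.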
1025 - `turn_detection: Optional[RealtimeAudioInputTurnDetection]`
1026
1027 Configuration for turn detection, either Server VAD or Semantic VAD. This can be set to `null` to turn off, in which case the client must manually trigger a model response.
1028
1029 Server VAD means that the model will detect the start and end of speech based on audio volume and respond at the end of user speech.
1030
1031 Semantic VAD is more advanced and uses a turn detection model (in conjunction with VAD) to semantically estimate whether the user has finished speaking, then dynamically sets a timeout based on this probability. For example, if user audio trails off with "uhhm", the model will score a low probability of turn end and wait longer for the user to continue speaking. This can be useful for more natural conversations, but may have a higher latency.
1032
1033 For `gpt-realtime-whisper` transcription sessions, turn detection must be
1034 set to `null`; VAD is not supported.
1035
1036 - `class ServerVad: …`
1037
1038 Server-side voice activity detection (VAD) which flips on when user speech is detected and off after a period of silence.
1039
1040 - `type: Literal["server_vad"]`
1041
1042 Type of turn detection, `server_vad` to turn on simple Server VAD.
1043
1044 - `"server_vad"`
1045
1046 - `create_response: Optional[bool]`
1047
1048 Whether or not to automatically generate a response when a VAD stop event occurs. If `interrupt_response` is set to `false` this may fail to create a response if the model is already responding.
1049
1050 If both `create_response` and `interrupt_response` are set to `false`, the model will never respond automatically but VAD events will still be emitted.
1051
1052 - `idle_timeout_ms: Optional[int]`
1053
1054 Optional timeout after which a model response will be triggered automatically. This is
1055 useful for situations in which a long pause from the user is unexpected, such as a phone
1056 call. The model will effectively prompt the user to continue the conversation based
1057 on the current context.
1058
1059 The timeout value will be applied after the last model response's audio has finished playing,
1060 i.e. it's set to the `response.done` time plus audio playback duration.
1061
1062 An `input_audio_buffer.timeout_triggered` event (plus events
1063 associated with the Response) will be emitted when the timeout is reached.
1064 Idle timeout is currently only supported for `server_vad` mode.
1065
1066 - `interrupt_response: Optional[bool]`
1067
1068 Whether or not to automatically interrupt (cancel) any ongoing response with output to the default
1069 conversation (i.e. `conversation` of `auto`) when a VAD start event occurs. If `true` then the response will be cancelled, otherwise it will continue until complete.
1070
1071 If both `create_response` and `interrupt_response` are set to `false`, the model will never respond automatically but VAD events will still be emitted.
1072
1073 - `prefix_padding_ms: Optional[int]`
1074
1075 Used only for `server_vad` mode. Amount of audio to include before the VAD detected speech (in
1076 milliseconds). Defaults to 300ms.
1077
1078 - `silence_duration_ms: Optional[int]`
1079
1080 Used only for `server_vad` mode. Duration of silence to detect speech stop (in milliseconds). Defaults
1081 to 500ms. With shorter values the model will respond more quickly,
1082 but may jump in on short pauses from the user.
1083
1084 - `threshold: Optional[float]`
1085
1086 Used only for `server_vad` mode. Activation threshold for VAD (0.0 to 1.0); defaults to 0.5. A
1087 higher threshold will require louder audio to activate the model, and
1088 thus might perform better in noisy environments.
1089
1090 - `class SemanticVad: …`
1091
1092 Server-side semantic turn detection which uses a model to determine when the user has finished speaking.
1093
1094 - `type: Literal["semantic_vad"]`
1095
1096 Type of turn detection, `semantic_vad` to turn on Semantic VAD.
1097
1098 - `"semantic_vad"`
1099
1100 - `create_response: Optional[bool]`
1101
1102 Whether or not to automatically generate a response when a VAD stop event occurs.
1103
1104 - `eagerness: Optional[Literal["low", "medium", "high", "auto"]]`
1105
1106 Used only for `semantic_vad` mode. The eagerness of the model to respond. `low` will wait longer for the user to continue speaking, `high` will respond more quickly. `auto` is the default and is equivalent to `medium`. `low`, `medium`, and `high` have max timeouts of 8s, 4s, and 2s respectively.
1107
1108 - `"low"`
1109
1110 - `"medium"`
1111
1112 - `"high"`
1113
1114 - `"auto"`
1115
1116 - `interrupt_response: Optional[bool]`
1117
1118 Whether or not to automatically interrupt any ongoing response with output to the default
1119 conversation (i.e. `conversation` of `auto`) when a VAD start event occurs.
1120
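The two turn detection modes above can be written as plain dicts. This is a sketch under stated assumptions: `server_vad` and `semantic_vad` are hypothetical helper names, the numeric defaults are the ones documented above, and fields whose defaults are not documented here (`create_response`, `interrupt_response`) are left out so the server applies its own.

```python
from typing import Any, Dict, Optional

# Maximum timeouts documented for each eagerness level, in seconds.
# "auto" is documented as equivalent to "medium".
MAX_TIMEOUT_S = {"low": 8, "medium": 4, "high": 2, "auto": 4}


def server_vad(
    threshold: float = 0.5,
    prefix_padding_ms: int = 300,
    silence_duration_ms: int = 500,
    idle_timeout_ms: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: Server VAD config using the documented defaults."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    config: Dict[str, Any] = {
        "type": "server_vad",
        "threshold": threshold,
        "prefix_padding_ms": prefix_padding_ms,
        "silence_duration_ms": silence_duration_ms,
    }
    if idle_timeout_ms is not None:
        config["idle_timeout_ms"] = idle_timeout_ms
    return config


def semantic_vad(eagerness: str = "auto") -> Dict[str, Any]:
    """Hypothetical helper: Semantic VAD config."""
    if eagerness not in MAX_TIMEOUT_S:
        raise ValueError(f"unknown eagerness: {eagerness!r}")
    return {"type": "semantic_vad", "eagerness": eagerness}


# A noisier line might use a higher threshold and a longer silence window.
noisy_line = server_vad(threshold=0.7, silence_duration_ms=800)
patient = semantic_vad("low")
```

Passing `None` for `turn_detection` instead of either dict corresponds to the `null` setting, where the client triggers responses manually.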
1121 - `output: Optional[RealtimeAudioConfigOutput]`
1122
1123 - `format: Optional[RealtimeAudioFormats]`
1124
1125 The format of the output audio.
1126
1127 - `speed: Optional[float]`
1128
1129 The speed of the model's spoken response as a multiple of the original speed.
1130 1.0 is the default speed. 0.25 is the minimum speed. 1.5 is the maximum speed. This value can only be changed in between model turns, not while a response is in progress.
1131
1132 This parameter is a post-processing adjustment to the audio after it is generated; it's
1133 also possible to prompt the model to speak faster or slower.
1134
1135 - `voice: Optional[Voice]`
1136
1137 The voice the model uses to respond. Supported built-in voices are
1138 `alloy`, `ash`, `ballad`, `coral`, `echo`, `sage`, `shimmer`, `verse`,
1139 `marin`, and `cedar`. You may also provide a custom voice object with
1140 an `id`, for example `{ "id": "voice_1234" }`. Voice cannot be changed
1141 during the session once the model has responded with audio at least once.
1142 We recommend `marin` and `cedar` for best quality.
1143
1144 - `str`
1145
1146 - `Literal["alloy", "ash", "ballad", 7 more]`
1147
1148 - `"alloy"`
1149
1150 - `"ash"`
1151
1152 - `"ballad"`
1153
1154 - `"coral"`
1155
1156 - `"echo"`
1157
1158 - `"sage"`
1159
1160 - `"shimmer"`
1161
1162 - `"verse"`
1163
1164 - `"marin"`
1165
1166 - `"cedar"`
1167
1168 - `class VoiceID: …`
1169
1170 Custom voice reference.
1171
1172 - `id: str`
1173
1174 The custom voice ID, e.g. `voice_1234`.
1175
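As an illustration of the output settings above, the sketch below assembles an audio output configuration and enforces the documented speed range (0.25 to 1.5). `build_audio_output` is a hypothetical helper; the voice may be a name string or a custom voice object with an `id`, as described above.

```python
from typing import Any, Dict, Union

MIN_SPEED = 0.25
MAX_SPEED = 1.5

# Built-in voices listed above.
BUILT_IN_VOICES = {
    "alloy", "ash", "ballad", "coral", "echo",
    "sage", "shimmer", "verse", "marin", "cedar",
}


def build_audio_output(
    voice: Union[str, Dict[str, str]],
    speed: float = 1.0,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the audio output config as a plain dict."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(f"speed must be between {MIN_SPEED} and {MAX_SPEED}")
    # A custom voice is referenced by an object carrying its `id`.
    if isinstance(voice, dict) and "id" not in voice:
        raise ValueError("a custom voice object needs an 'id'")
    return {"voice": voice, "speed": speed}


built_in = build_audio_output("marin", speed=1.1)
custom = build_audio_output({"id": "voice_1234"})
```

Because the voice cannot be changed once the model has responded with audio, it is usually simplest to fix it when the session is first configured.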
1176 - `include: Optional[List[Literal["item.input_audio_transcription.logprobs"]]]`
1177
1178 Additional fields to include in server outputs.
1179
1180 `item.input_audio_transcription.logprobs`: Include logprobs for input audio transcription.
1181
1182 - `"item.input_audio_transcription.logprobs"`
1183
1184 - `instructions: Optional[str]`
1185
1186 The default system instructions (i.e. system message) prepended to model calls. This field allows the client to guide the model on desired responses. The model can be instructed on response content and format (e.g. "be extremely succinct", "act friendly", "here are examples of good responses") and on audio behavior (e.g. "talk quickly", "inject emotion into your voice", "laugh frequently"). The instructions are not guaranteed to be followed by the model, but they provide guidance to the model on the desired behavior.
1187
1188 Note that the server sets default instructions which will be used if this field is not set and are visible in the `session.created` event at the start of the session.
1189
1190 - `max_output_tokens: Optional[Union[int, Literal["inf"], null]]`
1191
1192 Maximum number of output tokens for a single assistant response,
1193 inclusive of tool calls. Provide an integer between 1 and 4096 to
1194 limit output tokens, or `inf` for the maximum available tokens for a
1195 given model. Defaults to `inf`.
1196
1197 - `int`
1198
1199 - `Literal["inf"]`
1200
1201 - `"inf"`
1202
1203 - `model: Optional[Union[str, Literal["gpt-realtime", "gpt-realtime-1.5", "gpt-realtime-2", 14 more], null]]`
1204
1205 The Realtime model used for this session.
1206
1207 - `str`
1208
1209 - `Literal["gpt-realtime", "gpt-realtime-1.5", "gpt-realtime-2", 14 more]`
1210
1211 The Realtime model used for this session.
1212
1213 - `"gpt-realtime"`
1214
1215 - `"gpt-realtime-1.5"`
1216
1217 - `"gpt-realtime-2"`
1218
1219 - `"gpt-realtime-2025-08-28"`
1220
1221 - `"gpt-4o-realtime-preview"`
1222
1223 - `"gpt-4o-realtime-preview-2024-10-01"`
1224
1225 - `"gpt-4o-realtime-preview-2024-12-17"`
1226
1227 - `"gpt-4o-realtime-preview-2025-06-03"`
1228
1229 - `"gpt-4o-mini-realtime-preview"`
1230
1231 - `"gpt-4o-mini-realtime-preview-2024-12-17"`
1232
1233 - `"gpt-realtime-mini"`
1234
1235 - `"gpt-realtime-mini-2025-10-06"`
1236
1237 - `"gpt-realtime-mini-2025-12-15"`
1238
1239 - `"gpt-audio-1.5"`
1240
1241 - `"gpt-audio-mini"`
1242
1243 - `"gpt-audio-mini-2025-10-06"`
1244
1245 - `"gpt-audio-mini-2025-12-15"`
1246
1247 - `output_modalities: Optional[List[Literal["text", "audio"]]]`
1248
1249 The set of modalities the model can respond with. It defaults to `["audio"]`, indicating
1250 that the model will respond with audio plus a transcript. `["text"]` can be used to make
1251 the model respond with text only. It is not possible to request both `text` and `audio` at the same time.
1252
1253 - `"text"`
1254
1255 - `"audio"`
1256
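The limits stated above for `max_output_tokens` (an integer from 1 to 4096, or `inf`) and `output_modalities` (either `["audio"]` or `["text"]`, never both) lend themselves to a quick client-side check. A minimal sketch, with hypothetical function names:

```python
from typing import List, Union


def check_max_output_tokens(value: Union[int, str]) -> Union[int, str]:
    """Accept the string "inf" or an integer between 1 and 4096."""
    if value == "inf":
        return value
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, int) and not isinstance(value, bool) and 1 <= value <= 4096:
        return value
    raise ValueError("max_output_tokens must be 1..4096 or 'inf'")


def check_output_modalities(modalities: List[str]) -> List[str]:
    """Accept exactly one modality: audio (with transcript) or text only."""
    if modalities not in (["audio"], ["text"]):
        raise ValueError("output_modalities must be ['audio'] or ['text']")
    return modalities


session_fragment = {
    "max_output_tokens": check_max_output_tokens(1024),
    "output_modalities": check_output_modalities(["text"]),
}
```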
1257 - `parallel_tool_calls: Optional[bool]`
1258
1259 Whether the model may call multiple tools in parallel. Only supported by
1260 reasoning Realtime models such as `gpt-realtime-2`.
1261
1262 - `prompt: Optional[ResponsePrompt]`
1263
1264 Reference to a prompt template and its variables.
1265 [Learn more](https://platform.openai.com/docs/guides/text?api-mode=responses#reusable-prompts).
1266
1267 - `id: str`
1268
1269 The unique identifier of the prompt template to use.
1270
1271 - `variables: Optional[Dict[str, Variables]]`
1272
1273 Optional map of values to substitute in for variables in your
1274 prompt. The substitution values can either be strings, or other
1275 Response input types like images or files.
1276
1277 - `str`
1278
1279 - `class ResponseInputText: …`
1280
1281 A text input to the model.
1282
1283 - `text: str`
1284
1285 The text input to the model.
1286
1287 - `type: Literal["input_text"]`
1288
1289 The type of the input item. Always `input_text`.
1290
1291 - `"input_text"`
1292
1293 - `class ResponseInputImage: …`
1294
1295 An image input to the model. Learn about [image inputs](https://platform.openai.com/docs/guides/vision).
1296
1297 - `detail: Literal["low", "high", "auto", "original"]`
1298
1299 The detail level of the image to be sent to the model. One of `high`, `low`, `auto`, or `original`. Defaults to `auto`.
1300
1301 - `"low"`
1302
1303 - `"high"`
1304
1305 - `"auto"`
1306
1307 - `"original"`
1308
1309 - `type: Literal["input_image"]`
1310
1311 The type of the input item. Always `input_image`.
1312
1313 - `"input_image"`
1314
1315 - `file_id: Optional[str]`
1316
1317 The ID of the file to be sent to the model.
1318
1319 - `image_url: Optional[str]`
1320
1321 The URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
1322
1323 - `class ResponseInputFile: …`
1324
1325 A file input to the model.
1326
1327 - `type: Literal["input_file"]`
1328
1329 The type of the input item. Always `input_file`.
1330
1331 - `"input_file"`
1332
1333 - `detail: Optional[Literal["low", "high"]]`
1334
1335 The detail level of the file to be sent to the model. Use `low` for the default rendering behavior, or `high` to render the file at higher quality. Defaults to `low`.
1336
1337 - `"low"`
1338
1339 - `"high"`
1340
1341 - `file_data: Optional[str]`
1342
1343 The content of the file to be sent to the model.
1344
1345 - `file_id: Optional[str]`
1346
1347 The ID of the file to be sent to the model.
1348
1349 - `file_url: Optional[str]`
1350
1351 The URL of the file to be sent to the model.
1352
1353 - `filename: Optional[str]`
1354
1355 The name of the file to be sent to the model.
1356
1357 - `version: Optional[str]`
1358
1359 Optional version of the prompt template.
1360
1361 - `reasoning: Optional[RealtimeReasoning]`
1362
1363 Configuration for reasoning-capable Realtime models such as `gpt-realtime-2`.
1364
1365 - `effort: Optional[RealtimeReasoningEffort]`
1366
1367 Constrains effort on reasoning for reasoning-capable Realtime models such as
1368 `gpt-realtime-2`.
1369
1370 - `"minimal"`
1371
1372 - `"low"`
1373
1374 - `"medium"`
1375
1376 - `"high"`
1377
1378 - `"xhigh"`
1379
1380 - `tool_choice: Optional[RealtimeToolChoiceConfig]`
1381
1382 How the model chooses tools. Provide one of the string modes or force a specific
1383 function/MCP tool.
1384
1385 - `Literal["none", "auto", "required"]`
1386
1387 - `"none"`
1388
1389 - `"auto"`
1390
1391 - `"required"`
1392
1393 - `class ToolChoiceFunction: …`
1394
1395 Use this option to force the model to call a specific function.
1396
1397 - `name: str`
1398
1399 The name of the function to call.
1400
1401 - `type: Literal["function"]`
1402
1403 For function calling, the type is always `function`.
1404
1405 - `"function"`
1406
1407 - `class ToolChoiceMcp: …`
1408
1409 Use this option to force the model to call a specific tool on a remote MCP server.
1410
1411 - `server_label: str`
1412
1413 The label of the MCP server to use.
1414
1415 - `type: Literal["mcp"]`
1416
1417 For MCP tools, the type is always `mcp`.
1418
1419 - `"mcp"`
1420
1421 - `name: Optional[str]`
1422
1423 The name of the tool to call on the server.
1424
1425 - `tools: Optional[RealtimeToolsConfig]`
1426
1427 Tools available to the model.
1428
1429 - `class RealtimeFunctionTool: …`
1430
1431 - `description: Optional[str]`
1432
1433 The description of the function, including guidance on when and how
1434 to call it, and guidance about what to tell the user when calling
1435 (if anything).
1436
1437 - `name: Optional[str]`
1438
1439 The name of the function.
1440
1441 - `parameters: Optional[object]`
1442
1443 Parameters of the function in JSON Schema.
1444
1445 - `type: Optional[Literal["function"]]`
1446
1447 The type of the tool, i.e. `function`.
1448
1449 - `"function"`
1450
1451 - `class Mcp: …`
1452
1453 Give the model access to additional tools via remote Model Context Protocol
1454 (MCP) servers. [Learn more about MCP](https://platform.openai.com/docs/guides/tools-remote-mcp).
1455
1456 - `server_label: str`
1457
1458 A label for this MCP server, used to identify it in tool calls.
1459
1460 - `type: Literal["mcp"]`
1461
1462 The type of the MCP tool. Always `mcp`.
1463
1464 - `"mcp"`
1465
1466 - `allowed_tools: Optional[McpAllowedTools]`
1467
1468 List of allowed tool names or a filter object.
1469
1470 - `List[str]`
1471
1472 A string array of allowed tool names.
1473
1474 - `class McpAllowedToolsMcpToolFilter: …`
1475
1476 A filter object to specify which tools are allowed.
1477
1478 - `read_only: Optional[bool]`
1479
1480 Indicates whether or not a tool modifies data or is read-only. If an
1481 MCP server is [annotated with `readOnlyHint`](https://modelcontextprotocol.io/specification/2025-06-18/schema#toolannotations-readonlyhint),
1482 it will match this filter.
1483
1484 - `tool_names: Optional[List[str]]`
1485
1486 List of allowed tool names.
1487
1488 - `authorization: Optional[str]`
1489
1490 An OAuth access token that can be used with a remote MCP server, either
1491 with a custom MCP server URL or a service connector. Your application
1492 must handle the OAuth authorization flow and provide the token here.
1493
1494 - `connector_id: Optional[Literal["connector_dropbox", "connector_gmail", "connector_googlecalendar", 5 more]]`
1495
1496 Identifier for service connectors, like those available in ChatGPT. One of
1497 `server_url` or `connector_id` must be provided. Learn more about service
1498 connectors [here](https://platform.openai.com/docs/guides/tools-remote-mcp#connectors).
1499
1500 Currently supported `connector_id` values are:
1501
1502 - Dropbox: `connector_dropbox`
1503 - Gmail: `connector_gmail`
1504 - Google Calendar: `connector_googlecalendar`
1505 - Google Drive: `connector_googledrive`
1506 - Microsoft Teams: `connector_microsoftteams`
1507 - Outlook Calendar: `connector_outlookcalendar`
1508 - Outlook Email: `connector_outlookemail`
1509 - SharePoint: `connector_sharepoint`
1510
1511 - `"connector_dropbox"`
1512
1513 - `"connector_gmail"`
1514
1515 - `"connector_googlecalendar"`
1516
1517 - `"connector_googledrive"`
1518
1519 - `"connector_microsoftteams"`
1520
1521 - `"connector_outlookcalendar"`
1522
1523 - `"connector_outlookemail"`
1524
1525 - `"connector_sharepoint"`
1526
1527 - `defer_loading: Optional[bool]`
1528
1529 Whether this MCP tool is deferred and discovered via tool search.
1530
1531 - `headers: Optional[Dict[str, str]]`
1532
1533 Optional HTTP headers to send to the MCP server. Use for authentication
1534 or other purposes.
1535
1536 - `require_approval: Optional[McpRequireApproval]`
1537
1538 Specify which of the MCP server's tools require approval.
1539
1540 - `class McpRequireApprovalMcpToolApprovalFilter: …`
1541
1542 Specify which of the MCP server's tools require approval. Can be
1543 `always`, `never`, or a filter object associated with tools
1544 that require approval.
1545
1546 - `always: Optional[McpRequireApprovalMcpToolApprovalFilterAlways]`
1547
1548 A filter object to specify which tools are allowed.
1549
1550 - `read_only: Optional[bool]`
1551
1552 Indicates whether or not a tool modifies data or is read-only. If an
1553 MCP server is [annotated with `readOnlyHint`](https://modelcontextprotocol.io/specification/2025-06-18/schema#toolannotations-readonlyhint),
1554 it will match this filter.
1555
1556 - `tool_names: Optional[List[str]]`
1557
1558 List of allowed tool names.
1559
1560 - `never: Optional[McpRequireApprovalMcpToolApprovalFilterNever]`
1561
1562 A filter object to specify which tools are allowed.
1563
1564 - `read_only: Optional[bool]`
1565
1566 Indicates whether or not a tool modifies data or is read-only. If an
1567 MCP server is [annotated with `readOnlyHint`](https://modelcontextprotocol.io/specification/2025-06-18/schema#toolannotations-readonlyhint),
1568 it will match this filter.
1569
1570 - `tool_names: Optional[List[str]]`
1571
1572 List of allowed tool names.
1573
1574 - `Literal["always", "never"]`
1575
1576 Specify a single approval policy for all tools. One of `always` or
1577 `never`. When set to `always`, all tools will require approval. When
1578 set to `never`, no tools will require approval.
1579
1580 - `"always"`
1581
1582 - `"never"`
1583
1584 - `server_description: Optional[str]`
1585
1586 Optional description of the MCP server, used to provide more context.
1587
1588 - `server_url: Optional[str]`
1589
1590 The URL for the MCP server. One of `server_url` or `connector_id` must be
1591 provided.
1592
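To make the tool shapes above concrete, here is a sketch of a function tool, an MCP tool, and a `tool_choice` that forces the function. The helper names, the `get_weather` function, the tool names, and the `https://mcp.example.com` URL are all hypothetical. The check on `server_url` and `connector_id` treats "one of" as "at least one", which is an assumption about the documented rule.

```python
from typing import Any, Dict, List, Optional, Union


def function_tool(name: str, description: str, parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: a function tool with JSON Schema parameters."""
    return {
        "type": "function",
        "name": name,
        "description": description,
        "parameters": parameters,
    }


def mcp_tool(
    server_label: str,
    server_url: Optional[str] = None,
    connector_id: Optional[str] = None,
    allowed_tools: Optional[List[str]] = None,
    require_approval: Optional[Union[str, Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: an MCP tool pointing at a server URL or a connector."""
    if server_url is None and connector_id is None:
        raise ValueError("one of server_url or connector_id must be provided")
    tool: Dict[str, Any] = {"type": "mcp", "server_label": server_label}
    if server_url is not None:
        tool["server_url"] = server_url
    if connector_id is not None:
        tool["connector_id"] = connector_id
    if allowed_tools is not None:
        tool["allowed_tools"] = allowed_tools
    if require_approval is not None:
        tool["require_approval"] = require_approval
    return tool


weather = function_tool(
    "get_weather",
    "Look up the current weather for a city.",
    {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
)
docs = mcp_tool(
    "docs",
    server_url="https://mcp.example.com",
    allowed_tools=["search"],
    # Filter form: only the named tool requires approval.
    require_approval={"always": {"tool_names": ["delete_page"]}},
)

session_fragment = {
    "tools": [weather, docs],
    # Force the model to call the function tool defined above.
    "tool_choice": {"type": "function", "name": "get_weather"},
}
```

Replacing the `require_approval` filter with the string `"always"` or `"never"` applies a single policy to every tool on the server.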
1593 - `tracing: Optional[RealtimeTracingConfig]`
1594
1595 Realtime API can write session traces to the [Traces Dashboard](https://platform.openai.com/logs?api=traces). Set to null to disable tracing. Once
1596 tracing is enabled for a session, the configuration cannot be modified.
1597
1598 `auto` will create a trace for the session with default values for the
1599 workflow name, group id, and metadata.
1600
1601 - `Literal["auto"]`
1602
1603 Enables tracing and sets default values for tracing configuration options. Always `auto`.
1604
1605 - `"auto"`
1606
1607 - `class TracingConfiguration: …`
1608
1609 Granular configuration for tracing.
1610
1611 - `group_id: Optional[str]`
1612
1613 The group id to attach to this trace to enable filtering and
1614 grouping in the Traces Dashboard.
1615
1616 - `metadata: Optional[object]`
1617
1618 The arbitrary metadata to attach to this trace to enable
1619 filtering in the Traces Dashboard.
1620
1621 - `workflow_name: Optional[str]`
1622
1623 The name of the workflow to attach to this trace. This is used to
1624 name the trace in the Traces Dashboard.
1625
1626 - `truncation: Optional[RealtimeTruncation]`
1627
1628 When the number of tokens in a conversation exceeds the model's input token limit, the conversation will be truncated, meaning messages (starting from the oldest) will not be included in the model's context. A 32k context model with 4,096 max output tokens can only include 28,224 tokens in the context before truncation occurs.
1629
1630 Clients can configure truncation behavior to truncate with a lower max token limit, which is an effective way to control token usage and cost.
1631
1632 Truncation will reduce the number of cached tokens on the next turn (busting the cache), since messages are dropped from the beginning of the context. However, clients can also configure truncation to retain messages up to a fraction of the maximum context size, which will reduce the need for future truncations and thus improve the cache rate.
1633
1634 Truncation can be disabled entirely, which means the server will never truncate but would instead return an error if the conversation exceeds the model's input token limit.
1635
1636 - `Literal["auto", "disabled"]`
1637
1638 The truncation strategy to use for the session. `auto` is the default truncation strategy. `disabled` will disable truncation and emit errors when the conversation exceeds the input token limit.
1639
1640 - `"auto"`
1641
1642 - `"disabled"`
1643
1644 - `class RealtimeTruncationRetentionRatio: …`
1645
1646 Retain a fraction of the conversation tokens when the conversation exceeds the input token limit. This allows you to amortize truncations across multiple turns, which can help improve cached token usage.
1647
1648 - `retention_ratio: float`
1649
1650 Fraction of post-instruction conversation tokens to retain (`0.0` - `1.0`) when the conversation exceeds the input token limit. Setting this to `0.8` means that messages will be dropped until 80% of the maximum allowed tokens are used. This helps reduce the frequency of truncations and improve cache rates.
1651
1652 - `type: Literal["retention_ratio"]`
1653
1654 Use retention ratio truncation.
1655
1656 - `"retention_ratio"`
1657
1658 - `token_limits: Optional[TokenLimits]`
1659
1660 Optional custom token limits for this truncation strategy. If not provided, the model's default token limits will be used.
1661
1662 - `post_instructions: Optional[int]`
1663
1664 Maximum number of tokens allowed in the conversation after instructions (which include tool definitions). For example, setting this to 5,000 would mean that truncation would occur when the conversation exceeds 5,000 tokens after instructions. This cannot be higher than the model's context window size minus the maximum output tokens.
1665
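The retention-ratio behavior described above can be illustrated with a small simulation. This is only a sketch of the documented idea (once the limit is exceeded, drop the oldest messages until the conversation fits within `retention_ratio` times the limit); the per-message token counts are made up, and the server's actual algorithm may differ in detail.

```python
from typing import List


def truncate_oldest(message_tokens: List[int], limit: int, retention_ratio: float) -> List[int]:
    """Drop oldest messages once the total exceeds `limit`,
    until the total is at most `retention_ratio * limit`."""
    if not 0.0 <= retention_ratio <= 1.0:
        raise ValueError("retention_ratio must be between 0.0 and 1.0")
    if sum(message_tokens) <= limit:
        return list(message_tokens)  # under the limit: nothing is dropped
    target = retention_ratio * limit
    kept = list(message_tokens)
    while kept and sum(kept) > target:
        kept.pop(0)  # the oldest message goes first
    return kept


# Configuration in the shape described above.
truncation = {
    "type": "retention_ratio",
    "retention_ratio": 0.8,
    "token_limits": {"post_instructions": 5000},
}

# Hypothetical token counts per message, oldest first (total 5,400 > 5,000).
history = [1200, 900, 1500, 1000, 800]
kept = truncate_oldest(
    history,
    limit=truncation["token_limits"]["post_instructions"],
    retention_ratio=truncation["retention_ratio"],
)
# kept == [1500, 1000, 800]: 3,300 tokens, under the 4,000-token target
```

Dropping down to 80% rather than just below the limit leaves headroom for several more turns before the next truncation, which is why the cached prefix stays stable for longer.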
### Returns

- `BinaryResponseContent`

### Example

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),  # This is the default and can be omitted
)
# "sdp" is a placeholder value for the SDP payload.
call = client.realtime.calls.create(
    sdp="sdp",
)
print(call)
# Read the binary response body.
content = call.read()
print(content)
```