Documentation — Spybara

deprecations.md +13 −0

Upcoming deprecations are listed below, with the most recent announcements at the top.

### 2026-06-11: GPT-5 and o3 model deprecations

On June 11, 2026, we notified developers using older GPT-5 and o3 model snapshots of their deprecation and removal from the API on December 11, 2026.

| Shutdown date | Model / system          | Recommended replacement |
| ------------- | ----------------------- | ----------------------- |
| Dec 11, 2026  | `gpt-5-2025-08-07`      | `gpt-5.5`               |
| Dec 11, 2026  | `gpt-5-mini-2025-08-07` | `gpt-5.4-mini`          |
| Dec 11, 2026  | `gpt-5-nano-2025-08-07` | `gpt-5.4-nano`          |
| Dec 11, 2026  | `gpt-5-pro-2025-10-06`  | `gpt-5.5-pro`           |
| Dec 11, 2026  | `o3-2025-04-16`         | `gpt-5.5`               |
| Dec 11, 2026  | `o3-pro-2025-06-10`     | `gpt-5.5-pro`           |
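
One way to act on this table is to remap deprecated snapshot names before sending a request. The sketch below is only an illustration: the mapping and shutdown date are copied from the table, while `REPLACEMENTS`, `SHUTDOWN_DATE`, and `resolve_model` are hypothetical names and not part of any SDK.

```python
from datetime import date

# Shutdown date and replacements copied from the deprecation table.
SHUTDOWN_DATE = date(2026, 12, 11)
REPLACEMENTS = {
    "gpt-5-2025-08-07": "gpt-5.5",
    "gpt-5-mini-2025-08-07": "gpt-5.4-mini",
    "gpt-5-nano-2025-08-07": "gpt-5.4-nano",
    "gpt-5-pro-2025-10-06": "gpt-5.5-pro",
    "o3-2025-04-16": "gpt-5.5",
    "o3-pro-2025-06-10": "gpt-5.5-pro",
}


def resolve_model(model: str) -> str:
    """Return the recommended replacement for a deprecated snapshot.

    Models that are not listed in the table are returned unchanged.
    """
    return REPLACEMENTS.get(model, model)


def is_deprecated(model: str) -> bool:
    """True when the model name appears in the deprecation table."""
    return model in REPLACEMENTS
```

A replacement model may behave differently from the snapshot it replaces, so a remap like this is best treated as a starting point for re-evaluation rather than a drop-in switch.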

### 2026-06-03: Reusable prompts

On June 3, 2026, we notified developers using reusable prompts in the dashboard and API that reusable prompt objects are being deprecated.

guides/reasoning.md +3 −2

### Controlling costs

To manage costs with reasoning models, you can limit the total number of tokens the model generates, including reasoning tokens, visible output tokens, and non-visible formatting tokens, by using the [`max_output_tokens`](https://developers.openai.com/api/docs/api-reference/responses/create#responses-create-max_output_tokens) parameter. See [output token counts](https://developers.openai.com/api/docs/guides/token-counting#understand-output-token-counts) for details about how generated tokens are reflected in usage and output limits.
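
Because the limit covers every generated token, a response can stop before any visible text is produced. The sketch below shows one way to detect that case. It assumes the response has been converted to a plain dictionary and that a truncated response carries `status` set to `"incomplete"` with `incomplete_details.reason` set to `"max_output_tokens"`; the helper name `hit_output_limit` and the sample payloads are hypothetical.

```python
def hit_output_limit(response: dict) -> bool:
    """True when generation stopped because max_output_tokens was reached.

    Assumes a dict-shaped response with optional "status" and
    "incomplete_details" keys; missing keys are treated as "not truncated".
    """
    details = response.get("incomplete_details") or {}
    return (
        response.get("status") == "incomplete"
        and details.get("reason") == "max_output_tokens"
    )


# Illustrative payloads, not captured from a real API call.
truncated = {
    "status": "incomplete",
    "incomplete_details": {"reason": "max_output_tokens"},
}
finished = {"status": "completed", "incomplete_details": None}
```

When the check returns true, a caller might retry with a larger limit or surface the partial output, depending on the application.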

### Allocating space for reasoning

guides/token-counting.md +10 −0

The [input token count endpoint](https://developers.openai.com/api/reference/python/resources/responses/subresources/input_tokens/methods/count) accepts the same input format as the [Responses API](https://developers.openai.com/api/docs/api-reference/responses/create). Pass text, messages, images, files, tools, or conversations; the API returns the exact count the model will receive.

The count includes formatting tokens used to represent request structure, such as message roles and boundaries. These tokens might not appear in the text or fields you tokenize locally.

## Why use the token counting API?

Local tokenizers like [tiktoken](https://github.com/openai/tiktoken) work for plain text, but they have limitations:

[File inputs](https://developers.openai.com/api/docs/guides/pdf-files) (currently PDFs) are supported. Pass `file_id`, `file_url`, or `file_data` as you would for `responses.create`. The token count reflects the model’s full processed input.

## Understand output token counts

Reported output token usage includes all tokens generated by the model, not only the text visible in a response. The Responses API reports this total as `output_tokens`, while the Chat Completions API reports it as `completion_tokens`.

Some models, including GPT-5 models, generate tokens used to format or delimit response channels, tool calls, and other message structure. These formatting tokens don't appear in message content or `logprobs`, and they aren't necessarily itemized separately in usage. As a result, the reported output or completion token count can be higher than the number of visible tokens or tokens included in `logprobs`, even when the reported `reasoning_tokens` value is `0`.
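
The relationship described above can be written as simple arithmetic. The sketch below is a rough illustration: `unaccounted_output_tokens` is a hypothetical helper, the visible token count is assumed to come from your own local tokenization of the message content, and the numbers in the usage comment are made up.

```python
def unaccounted_output_tokens(
    output_tokens: int,
    visible_tokens: int,
    reasoning_tokens: int = 0,
) -> int:
    """Output tokens that are neither visible text nor reported reasoning.

    output_tokens:    total reported by the API (output_tokens or
                      completion_tokens, depending on the endpoint).
    visible_tokens:   tokens you counted locally in the visible content
                      (an assumption; the API does not report this value).
    reasoning_tokens: reasoning tokens reported in usage, if any.
    """
    if min(output_tokens, visible_tokens, reasoning_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    return output_tokens - reasoning_tokens - visible_tokens


# Illustrative numbers only: 58 reported, 50 visible, 0 reasoning.
gap = unaccounted_output_tokens(output_tokens=58, visible_tokens=50)
```

A positive result is consistent with formatting tokens being present, but local tokenization may not match the model's exactly, so the value is an estimate rather than an itemized count.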

The `max_output_tokens` and `max_completion_tokens` parameters limit all tokens generated by the model, including non-visible tokens. The number of non-visible tokens varies by model and response shape, so don't assume a fixed difference between reported usage and visible output. Leave headroom in these limits when you need a specific amount of visible output.
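
A minimal sketch of leaving headroom when choosing a limit follows. The helper `output_limit` and its default of 64 headroom tokens are hypothetical; since the number of non-visible tokens varies by model and response shape, the headroom is a value to tune from observed usage, not a known constant.

```python
def output_limit(
    visible_budget: int,
    reasoning_budget: int = 0,
    headroom_tokens: int = 64,
) -> int:
    """Suggest a value for max_output_tokens or max_completion_tokens.

    visible_budget:   visible output tokens you want room for.
    reasoning_budget: tokens to reserve for reasoning, if the model reasons.
    headroom_tokens:  extra allowance for non-visible formatting tokens
                      (an arbitrary default chosen for this sketch).
    """
    if min(visible_budget, reasoning_budget, headroom_tokens) < 0:
        raise ValueError("budgets must be non-negative")
    return visible_budget + reasoning_budget + headroom_tokens


# Room for about 500 visible tokens plus 2,000 reasoning tokens.
limit = output_limit(visible_budget=500, reasoning_budget=2000)
```

Comparing reported usage against visible output over a sample of real responses is one way to decide whether the chosen headroom is large enough.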

## API reference

For full parameters and response shape, see the [Count input tokens API reference](https://developers.openai.com/api/reference/python/resources/responses/subresources/input_tokens/methods/count). The endpoint is:

Documentation 2026-06-12 19:02 UTC to 2026-06-15 23:02 UTC