Documentation — Spybara

guides/prompt-caching.md +6 −1


61 61

62 62 ### Configure per request

63 63

~~64 If you don’t specify a retention policy, for most models the default is `in_memory`. For `gpt-5.5`, `gpt-5.5-pro`, and all future models, the default is `24h` and `in_memory` is not supported. Allowed values are `in_memory` and `24h`.~~

64 For `gpt-5.5`, `gpt-5.5-pro`, and future models, only `24h` is supported.

66 For older models that support both `in_memory` and `24h`, the default depends on your organization's data retention policy:

68 - Organizations without ZDR enabled default to `24h`.

69 - Organizations with ZDR enabled default to `in_memory` when `prompt_cache_retention` is not specified.
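
The default rules in the added lines above can be summarized in a few lines of code. This is a minimal sketch, not an official client helper: the function name `resolve_prompt_cache_retention`, the `zdr_enabled` flag, and the `EXTENDED_ONLY_MODELS` set are hypothetical names chosen for illustration. The set lists only the two models the diff names and does not try to cover "future models". The sketch also assumes that every other model supports both values.

```python
from typing import Optional

# Hypothetical set for illustration: the models the diff names as supporting only `24h`.
EXTENDED_ONLY_MODELS = {"gpt-5.5", "gpt-5.5-pro"}
ALLOWED_VALUES = {"in_memory", "24h"}


def resolve_prompt_cache_retention(
    model: str, zdr_enabled: bool, requested: Optional[str] = None
) -> str:
    """Return the retention policy that would apply under the rules described above."""
    if requested is not None and requested not in ALLOWED_VALUES:
        raise ValueError(f"unsupported prompt_cache_retention: {requested!r}")

    if model in EXTENDED_ONLY_MODELS:
        # Only `24h` is supported for these models, so reject `in_memory` early.
        if requested == "in_memory":
            raise ValueError(f"{model} does not support in_memory retention")
        return "24h"

    # Assumption: any other model supports both values, so an explicit choice wins.
    if requested is not None:
        return requested

    # No explicit value: the default depends on Zero Data Retention (ZDR).
    return "in_memory" if zdr_enabled else "24h"
```

Under these assumptions, `resolve_prompt_cache_retention("gpt-5.5", zdr_enabled=True)` returns `"24h"`, while the same call for an older model such as `gpt-4o-2024-08-06` returns `"in_memory"`.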

65 70

66 71 ```json

67 72 {

guides/structured-outputs.md +8 −138


1 1 # Structured model outputs

2 2

3 import {

4 snippetRefusalsChatCompletionsApi,

5 snippetRefusalsResponsesApi,

6 } from "./inline-examples";

~~3 export const snippetRefusalsChatCompletionsApi = {~~

~~4 python: `~~

~~5 class Step(BaseModel):~~

~~6 explanation: str~~

~~7 output: str~~

~~9 class MathReasoning(BaseModel):~~

~~10 steps: list[Step]~~

~~11 final_answer: str~~

~~13 completion = client.chat.completions.parse(~~

~~14 model="gpt-4o-2024-08-06",~~

~~15 messages=[~~

~~16 {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},~~

~~17 {"role": "user", "content": "how can I solve 8x + 7 = -23"},~~

~~18 ],~~

~~19 response_format=MathReasoning,~~

~~20 )~~

~~22 math_reasoning = completion.choices[0].message~~

~~24 # If the model refuses to respond, you will get a refusal message~~

~~26 if math_reasoning.refusal:~~

~~27 print(math_reasoning.refusal)~~

~~28 else:~~

~~29 print(math_reasoning.parsed)~~

~~30 `.trim(),~~

~~31 "javascript": `~~

~~32 const Step = z.object({~~

~~33 explanation: z.string(),~~

~~34 output: z.string(),~~

~~35 });~~

~~37 const MathReasoning = z.object({~~

~~38 steps: z.array(Step),~~

~~39 final_answer: z.string(),~~

~~40 });~~

~~42 const completion = await openai.chat.completions.parse({~~

~~43 model: "gpt-4o-2024-08-06",~~

~~44 messages: [~~

~~45 { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },~~

~~46 { role: "user", content: "how can I solve 8x + 7 = -23" },~~

~~47 ],~~

~~48 response_format: zodResponseFormat(MathReasoning, "math_reasoning"),~~

~~49 });~~

~~51 const math_reasoning = completion.choices[0].message~~

~~53 // If the model refuses to respond, you will get a refusal message~~

~~54 if (math_reasoning.refusal) {~~

~~55 console.log(math_reasoning.refusal);~~

~~56 } else {~~

~~57 console.log(math_reasoning.parsed);~~

~~58 }~~

~~59 `.trim(),~~

~~60 };~~

~~61 export const snippetRefusalsResponsesApi = {~~

~~62 python: `~~

~~63 class Step(BaseModel):~~

~~64 explanation: str~~

~~65 output: str~~

~~67 class MathReasoning(BaseModel):~~

~~68 steps: list[Step]~~

~~69 final_answer: str~~

~~71 response = client.responses.parse(~~

~~72 model="gpt-4o-2024-08-06",~~

~~73 input=[~~

~~74 {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},~~

~~75 {"role": "user", "content": "how can I solve 8x + 7 = -23"},~~

~~76 ],~~

~~77 text_format=MathReasoning,~~

~~78 )~~

~~80 for output in response.output:~~

~~81 if output.type != "message":~~

~~82 raise Exception("Unexpected non message")~~

~~84 for item in output.content:~~

~~85 if item.type == "refusal":~~

~~86 # If the model refuses to respond, you will get a refusal message~~

~~87 print(item.refusal)~~

~~88 continue~~

~~90 if not item.parsed:~~

~~91 raise Exception("Could not parse response")~~

~~93 print(item.parsed)~~

~~95 `.trim(),~~

~~96 "javascript": `~~

~~97 const Step = z.object({~~

~~98 explanation: z.string(),~~

~~99 output: z.string(),~~

~~100 });~~

~~101~~

~~102 const MathReasoning = z.object({~~

~~103 steps: z.array(Step),~~

~~104 final_answer: z.string(),~~

~~105 });~~

~~106~~

~~107 const response = await openai.responses.parse({~~

~~108 model: "gpt-4o-2024-08-06",~~

~~109 input: [~~

~~110 { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },~~

~~111 { role: "user", content: "how can I solve 8x + 7 = -23" }~~

~~112 ],~~

~~113 text: {~~

~~114 format: zodTextFormat(MathReasoning, "math_response"),~~

~~115 },~~

~~116 });~~

~~117~~

~~118 for (const output of response.output) {~~

~~119 if (output.type != "message") {~~

~~120 throw new Error("Unexpected non message");~~

~~121 }~~

~~122~~

~~123 for (const item of output.content) {~~

~~124 if (item.type == "refusal") {~~

~~125 // If the model refuses to respond, you will get a refusal message~~

~~126 console.log(item.refusal);~~

~~127 continue;~~

~~128 }~~

~~129~~

~~130 if (!item.parsed) {~~

~~131 throw new Error("Could not parse response");~~

~~132 }~~

~~133~~

~~134 console.log(item.parsed);~~

~~135 }~~

~~136~~

~~137 }~~

~~138 `.trim(),~~

~~139 };~~

140 7

141 8 export const snippetRefusalApiResponseChatCompletionsApi = {

142 9 json: `

326 193

327 194 When the `refusal` property appears in your output object, you might present the refusal in your UI, or include conditional logic in code that consumes the response to handle the case of a refused request.
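
The conditional logic mentioned in this paragraph might look like the following. This is a minimal sketch over a plain dictionary, assuming the Responses API shape used in the snippet removed earlier in this diff: an `output` list of `message` items whose `content` entries carry a `type` of `refusal`. The helper name `extract_refusals` and the sample payload are hypothetical and are not copied from a real API response.

```python
from typing import Any, Dict, List


def extract_refusals(response: Dict[str, Any]) -> List[str]:
    """Collect refusal messages from a response-shaped dictionary."""
    refusals: List[str] = []
    for output in response.get("output", []):
        if output.get("type") != "message":
            continue  # skip non-message items instead of raising
        for item in output.get("content", []):
            if item.get("type") == "refusal":
                refusals.append(item.get("refusal", ""))
    return refusals


# Hypothetical payload for illustration only.
sample = {
    "output": [
        {
            "type": "message",
            "content": [{"type": "refusal", "refusal": "I can't help with that."}],
        }
    ]
}

for message in extract_refusals(sample):
    # A UI could surface this text instead of the structured result.
    print(f"Request refused: {message}")
```

Returning a list rather than raising on the first refusal leaves it to the caller to decide whether to show the refusal to the user or to branch in code.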

328 195

~~329 The API response from a refusal will look something like this:~~

196

197

198

199 The API response from a refusal will look something like this:

330 200

331 201

332 202

guides/your-data.md +1 −0


94 94 - MCP servers (used with the [remote MCP server tool](https://developers.openai.com/api/docs/guides/tools-remote-mcp)) are third-party services, and data sent to an MCP server is subject to their data retention policies.

95 95 - Hosted containers used by [Hosted Shell](https://developers.openai.com/api/docs/guides/tools-shell#hosted-shell-quickstart) and [Code Interpreter](https://developers.openai.com/api/docs/guides/tools-code-interpreter) may write temporary application state to the container filesystem (backed by ephemeral block storage) while the container is active. Container data is deleted when the container expires or is explicitly deleted.

96 96 - Extended prompt caching requires storing encrypted key/value tensors to GPU-local storage as application state. This data is stored on the local GPU machines and is not retained after the 24 hour data expiration. Requests to gpt-5.5, gpt-5.5-pro, and all future models require extended prompt caching, and setting a prompt_cache_retention value to in_memory will cause a request error. To learn more, see the [prompt caching guide](https://developers.openai.com/api/docs/guides/prompt-caching#prompt-cache-retention).

97 - When Zero Data Retention is not enabled for an organization, all queries use extended prompt caching for all supported models.

97 98 - For server-side compaction, no data is retained when `store="false"`.

98 99 - We support [Skills](https://developers.openai.com/api/docs/guides/tools-skills) in two form factors, both local execution and hosted container-based execution. Hosted skills follow the same container lifecycle as hosted shell: mounted skills and container files remain available while the container is active and are discarded when the container expires or is deleted.

99 100 - Data transmitted to third-party services over network connections is subject to their data retention policies.

Documentation 2026-05-29 06:38 UTC to 2026-06-01 06:53 UTC