SpyBara

Documentation 2026-06-10 15:48 UTC to 2026-06-11 08:59 UTC

65 files changed +4,125 −1,062.
Details

1# Assistants API tools1# Assistants API tools

2 2 

3import {

4 Code,

5 File,

6 Plugin,

7} from "@components/react/oai/platform/ui/Icon.react";

8 

9 

10 

11## Overview3## Overview

12 4 

13Assistants created using the Assistants API can be equipped with tools that allow them to perform more complex tasks or interact with your application.5Assistants created using the Assistants API can be equipped with tools that allow them to perform more complex tasks or interact with your application.

Details

153 `.trim(),153 `.trim(),

154 "node.js": `154 "node.js": `

155 155 

156 156import OpenAI from "openai";\n

157const openai = new OpenAI();\n157const openai = new OpenAI();\n

158async function main() {158async function main() {

159 const response = await openai.files.content("file-abc123");\n159 const response = await openai.files.content("file-abc123");\n

Details

1# Admin APIs1# Admin APIs

2 2 

3import {

4 adminClientExamples,

5 auditLogExamples,

6 dataRetentionExamples,

7 inviteUserExamples,

8 modelPermissionsExamples,

9 spendAlertExamples,

10} from "./admin-apis-examples";

11 

12Admin APIs let you automate organization management workflows such as user invitations, audit log review, project administration, API key management, spend alerts, data retention, and rate limit operations. Use them for back-office automation, security workflows, and operational tooling that should run outside the dashboard.3Admin APIs let you automate organization management workflows such as user invitations, audit log review, project administration, API key management, spend alerts, data retention, and rate limit operations. Use them for back-office automation, security workflows, and operational tooling that should run outside the dashboard.

13 4 

14For endpoint details, see the [Administration API reference](https://developers.openai.com/api/reference/administration/overview), including [Admin API keys](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/admin_api_keys), [Invites](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/invites), [Users](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/users), [Projects](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/projects), and [Audit logs](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/audit_logs).5For endpoint details, see the [Administration API reference](https://developers.openai.com/api/reference/administration/overview), including [Admin API keys](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/admin_api_keys), [Invites](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/invites), [Users](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/users), [Projects](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/projects), and [Audit logs](https://developers.openai.com/api/reference/resources/admin/subresources/organization/subresources/audit_logs).

Details

34 description: "Return the weather for a given city.",34 description: "Return the weather for a given city.",

35 parameters: z.object({ city: z.string() }),35 parameters: z.object({ city: z.string() }),

36 async execute({ city }) {36 async execute({ city }) {

37 return \`The weather in \${city} is sunny.\`;37 return `The weather in ${city} is sunny.`;

38 },38 },

39});39});

40 40 

41const agent = new Agent({41const agent = new Agent({

42 name: "Weather bot",42 name: "Weather bot",

43 instructions: "You are a helpful weather bot.",43 instructions: "You are a helpful weather bot.",

44 model: "${latestMainlineModelSlug}",44 model: "gpt-5.5",

45 tools: [getWeather],45 tools: [getWeather],

46});46});

47```47```


59agent = Agent(59agent = Agent(

60 name="Weather bot",60 name="Weather bot",

61 instructions="You are a helpful weather bot.",61 instructions="You are a helpful weather bot.",

62 model="${latestMainlineModelSlug}",62 model="gpt-5.5",

63 tools=[get_weather],63 tools=[get_weather],

64)64)

65```65```


155 description: "Return the age of the current user.",155 description: "Return the age of the current user.",

156 parameters: z.object({}),156 parameters: z.object({}),

157 async execute(_args, runContext?: RunContext<UserInfo>) {157 async execute(_args, runContext?: RunContext<UserInfo>) {

158 return \`User \${runContext?.context.name} is 47 years old\`;158 return `User ${runContext?.context.name} is 47 years old`;

159 },159 },

160});160});

161 161 

Details

142 parameters: z.object({ orderId: z.number() }),142 parameters: z.object({ orderId: z.number() }),

143 needsApproval: true,143 needsApproval: true,

144 async execute({ orderId }) {144 async execute({ orderId }) {

145 return \`Cancelled order \${orderId}\`;145 return `Cancelled order ${orderId}`;

146 },146 },

147});147});

148 148 

Details

147 147 

148await withTrace("Joke workflow", async () => {148await withTrace("Joke workflow", async () => {

149 const first = await run(agent, "Tell me a joke");149 const first = await run(agent, "Tell me a joke");

150 const second = await run(agent, \`Rate this joke: \${first.finalOutput}\`);150 const second = await run(agent, `Rate this joke: ${first.finalOutput}`);

151 console.log(first.finalOutput);151 console.log(first.finalOutput);

152 console.log(second.finalOutput);152 console.log(second.finalOutput);

153});153});

Details

27});27});

28 28 

29const runner = new Runner({29const runner = new Runner({

30 model: "${latestMainlineModelSlug}",30 model: "gpt-5.5",

31});31});

32 32 

33await runner.run(fastAgent, "Summarize ticket 123.");33await runner.run(fastAgent, "Summarize ticket 123.");


62 result = await Runner.run(62 result = await Runner.run(

63 general_agent,63 general_agent,

64 "Investigate the billing issue on account 456.",64 "Investigate the billing issue on account 456.",

65 run_config=RunConfig(model="${latestMainlineModelSlug}"),65 run_config=RunConfig(model="gpt-5.5"),

66 )66 )

67 print(result.final_output)67 print(result.final_output)

68 68 

Details

36 name: "History tutor",36 name: "History tutor",

37 instructions:37 instructions:

38 "You answer history questions clearly and concisely.",38 "You answer history questions clearly and concisely.",

39 model: "${latestMainlineModelSlug}",39 model: "gpt-5.5",

40});40});

41 41 

42const result = await run(agent, "When did the Roman Empire fall?");42const result = await run(agent, "When did the Roman Empire fall?");


51agent = Agent(51agent = Agent(

52 name="History tutor",52 name="History tutor",

53 instructions="You answer history questions clearly and concisely.",53 instructions="You answer history questions clearly and concisely.",

54 model="${latestMainlineModelSlug}",54 model="gpt-5.5",

55)55)

56 56 

57 57 

Details

172}172}

173 173 

174await stream.completed;174await stream.completed;

175console.log("\\nFinal:", stream.finalOutput);175console.log("\nFinal:", stream.finalOutput);

176```176```

177 177 

178```python178```python


201 ):201 ):

202 print(event.data.delta, end="", flush=True)202 print(event.data.delta, end="", flush=True)

203 203 

204 print(f"\\nFinal: {stream.final_output}")204 print(f"\nFinal: {stream.final_output}")

205 205 

206 206 

207if __name__ == "__main__":207if __name__ == "__main__":

Details

279 entries: {279 entries: {

280 "account_brief.md": file({280 "account_brief.md": file({

281 content:281 content:

282 "# Northwind Health\\n\\n" +282 "# Northwind Health\n\n" +

283 "- Segment: Mid-market healthcare analytics provider.\\n" +283 "- Segment: Mid-market healthcare analytics provider.\n" +

284 "- Renewal date: 2026-04-15.\\n",284 "- Renewal date: 2026-04-15.\n",

285 }),285 }),

286 "implementation_risks.md": file({286 "implementation_risks.md": file({

287 content:287 content:

288 "# Delivery risks\\n\\n" +288 "# Delivery risks\n\n" +

289 "- Security questionnaire is not complete.\\n" +289 "- Security questionnaire is not complete.\n" +

290 "- Procurement requires final legal language by April 1.\\n",290 "- Procurement requires final legal language by April 1.\n",

291 }),291 }),

292 },292 },

293});293});

294 294 

295const agent = new SandboxAgent({295const agent = new SandboxAgent({

296 name: "Renewal Packet Analyst",296 name: "Renewal Packet Analyst",

297 model: "${latestMainlineModelSlug}",297 model: "gpt-5.5",

298 instructions:298 instructions:

299 "Review the workspace before answering. Keep the response concise, " +299 "Review the workspace before answering. Keep the response concise, " +

300 "business-focused, and cite the file names that support each conclusion.",300 "business-focused, and cite the file names that support each conclusion.",


329 entries={329 entries={

330 "account_brief.md": File(330 "account_brief.md": File(

331 content=(331 content=(

332 b"# Northwind Health\\n\\n"332 b"# Northwind Health\n\n"

333 b"- Segment: Mid-market healthcare analytics provider.\\n"333 b"- Segment: Mid-market healthcare analytics provider.\n"

334 b"- Renewal date: 2026-04-15.\\n"334 b"- Renewal date: 2026-04-15.\n"

335 )335 )

336 ),336 ),

337 "implementation_risks.md": File(337 "implementation_risks.md": File(

338 content=(338 content=(

339 b"# Delivery risks\\n\\n"339 b"# Delivery risks\n\n"

340 b"- Security questionnaire is not complete.\\n"340 b"- Security questionnaire is not complete.\n"

341 b"- Procurement requires final legal language by April 1.\\n"341 b"- Procurement requires final legal language by April 1.\n"

342 )342 )

343 ),343 ),

344 }344 }


346 346 

347agent = SandboxAgent(347agent = SandboxAgent(

348 name="Renewal Packet Analyst",348 name="Renewal Packet Analyst",

349 model="${latestMainlineModelSlug}",349 model="gpt-5.5",

350 instructions=(350 instructions=(

351 "Review the workspace before answering. Keep the response concise, "351 "Review the workspace before answering. Keep the response concise, "

352 "business-focused, and cite the file names that support each conclusion."352 "business-focused, and cite the file names that support each conclusion."


392 392 

393const agent = new SandboxAgent({393const agent = new SandboxAgent({

394 name: "Workspace reviewer",394 name: "Workspace reviewer",

395 model: "${latestMainlineModelSlug}",395 model: "gpt-5.5",

396 instructions: "Inspect the sandbox workspace before answering.",396 instructions: "Inspect the sandbox workspace before answering.",

397});397});

398 398 


487});487});

488const agent = new SandboxAgent({488const agent = new SandboxAgent({

489 name: "Workspace builder",489 name: "Workspace builder",

490 model: "${latestMainlineModelSlug}",490 model: "gpt-5.5",

491 instructions: "Inspect the sandbox workspace before answering.",491 instructions: "Inspect the sandbox workspace before answering.",

492});492});

493 493 

Details

80```80```

81 81 

82```bash82```bash

83curl "https://bedrock-mantle.us-east-2.api.aws/openai/v1/responses" \\83curl "https://bedrock-mantle.us-east-2.api.aws/openai/v1/responses" \

84 -H "Content-Type: application/json" \\84 -H "Content-Type: application/json" \

85 -H "Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK" \\85 -H "Authorization: Bearer $AWS_BEARER_TOKEN_BEDROCK" \

86 -d '{86 -d '{

87 "model": "openai.gpt-5.5",87 "model": "openai.gpt-5.5",

88 "input": "Write a haiku about cloud infrastructure."88 "input": "Write a haiku about cloud infrastructure."

guides/audio.md +6 −6

Details

138```138```

139 139 

140```bash140```bash

141curl "https://api.openai.com/v1/chat/completions" \\141curl "https://api.openai.com/v1/chat/completions" \

142 -H "Content-Type: application/json" \\142 -H "Content-Type: application/json" \

143 -H "Authorization: Bearer $OPENAI_API_KEY" \\143 -H "Authorization: Bearer $OPENAI_API_KEY" \

144 -d '{144 -d '{

145 "model": "gpt-audio-1.5",145 "model": "gpt-audio-1.5",

146 "modalities": ["text", "audio"],146 "modalities": ["text", "audio"],


230```230```

231 231 

232```bash232```bash

233curl "https://api.openai.com/v1/chat/completions" \\233curl "https://api.openai.com/v1/chat/completions" \

234 -H "Content-Type: application/json" \\234 -H "Content-Type: application/json" \

235 -H "Authorization: Bearer $OPENAI_API_KEY" \\235 -H "Authorization: Bearer $OPENAI_API_KEY" \

236 -d '{236 -d '{

237 "model": "gpt-audio-1.5",237 "model": "gpt-audio-1.5",

238 "modalities": ["text", "audio"],238 "modalities": ["text", "audio"],

Details

13Generate a response in the background13Generate a response in the background

14 14 

15```bash15```bash

16curl https://api.openai.com/v1/responses \\16curl https://api.openai.com/v1/responses \

17-H "Content-Type: application/json" \\17-H "Content-Type: application/json" \

18-H "Authorization: Bearer $OPENAI_API_KEY" \\18-H "Authorization: Bearer $OPENAI_API_KEY" \

19-d '{19-d '{

20 "model": "gpt-5.5",20 "model": "gpt-5.5",

21 "input": "Write a very long novel about otters in space.",21 "input": "Write a very long novel about otters in space.",


58Retrieve a response executing in the background58Retrieve a response executing in the background

59 59 

60```bash60```bash

61curl https://api.openai.com/v1/responses/resp_123 \\61curl https://api.openai.com/v1/responses/resp_123 \

62 -H "Content-Type: application/json" \\62 -H "Content-Type: application/json" \

63 -H "Authorization: Bearer $OPENAI_API_KEY"63 -H "Authorization: Bearer $OPENAI_API_KEY"

64```64```

65 65 


79resp = await client.responses.retrieve(resp.id);79resp = await client.responses.retrieve(resp.id);

80}80}

81 81 

82console.log("Final status: " + resp.status + "\\nOutput:\\n" + resp.output_text);82console.log("Final status: " + resp.status + "\nOutput:\n" + resp.output_text);

83```83```

84 84 

85```python85```python


99 sleep(2)99 sleep(2)

100 resp = client.responses.retrieve(resp.id)100 resp = client.responses.retrieve(resp.id)

101 101 

102print(f"Final status: {resp.status}\\nOutput:\\n{resp.output_text}")102print(f"Final status: {resp.status}\nOutput:\n{resp.output_text}")

103```103```

104 104 

105 105 
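The polling loop in this hunk (sleep, then `responses.retrieve`) can be factored into a small helper. This is a minimal sketch, not the documented client API: `poll_until_done`, `PENDING_STATUSES`, and the `max_polls` cap are hypothetical names introduced here, and the set of in-flight statuses is an assumption that should be checked against the API reference.

```python
import time
from typing import Any, Callable

# Assumption: these are the statuses that mean "still running".
PENDING_STATUSES = {"queued", "in_progress"}


def poll_until_done(
    retrieve: Callable[[str], Any],
    response_id: str,
    interval_s: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call retrieve(response_id) until the status leaves PENDING_STATUSES.

    `retrieve` is any callable returning an object with a `.status`
    attribute, for example `client.responses.retrieve`.
    """
    resp = retrieve(response_id)
    polls = 1
    while resp.status in PENDING_STATUSES:
        if polls >= max_polls:
            # Give up instead of looping forever on a stuck response.
            raise TimeoutError(
                f"{response_id} still {resp.status} after {polls} polls"
            )
        sleep(interval_s)
        resp = retrieve(response_id)
        polls += 1
    return resp
```

With a real client this would presumably be called as `poll_until_done(client.responses.retrieve, resp.id)`. Passing `retrieve` and `sleep` as parameters keeps the loop testable without network access.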


110Cancel an ongoing response110Cancel an ongoing response

111 111 

112```bash112```bash

113curl -X POST https://api.openai.com/v1/responses/resp_123/cancel \\113curl -X POST https://api.openai.com/v1/responses/resp_123/cancel \

114 -H "Content-Type: application/json" \\114 -H "Content-Type: application/json" \

115 -H "Authorization: Bearer $OPENAI_API_KEY"115 -H "Authorization: Bearer $OPENAI_API_KEY"

116```116```

117 117 


147Generate and stream a background response147Generate and stream a background response

148 148 

149```bash149```bash

150curl https://api.openai.com/v1/responses \\150curl https://api.openai.com/v1/responses \

151-H "Content-Type: application/json" \\151-H "Content-Type: application/json" \

152-H "Authorization: Bearer $OPENAI_API_KEY" \\152-H "Authorization: Bearer $OPENAI_API_KEY" \

153-d '{153-d '{

154 "model": "gpt-5.5",154 "model": "gpt-5.5",

155 "input": "Write a very long novel about otters in space.",155 "input": "Write a very long novel about otters in space.",


158}'158}'

159 159 

160# To resume:160# To resume:

161curl "https://api.openai.com/v1/responses/resp_123?stream=true&starting_after=42" \\161curl "https://api.openai.com/v1/responses/resp_123?stream=true&starting_after=42" \

162-H "Content-Type: application/json" \\162-H "Content-Type: application/json" \

163-H "Authorization: Bearer $OPENAI_API_KEY"163-H "Authorization: Bearer $OPENAI_API_KEY"

164```164```

165 165 
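The resume request above differs from the original call only in its query string. A minimal sketch of building that URL follows. `resume_stream_url` and `API_BASE` are hypothetical names, and reading `starting_after` as the position of the last event already received is an assumption, not something this hunk states.

```python
from urllib.parse import urlencode

API_BASE = "https://api.openai.com/v1"


def resume_stream_url(response_id: str, starting_after: int) -> str:
    """Build the URL used to resume streaming a background response."""
    # Parameter names are taken from the curl example in the hunk above.
    query = urlencode({"stream": "true", "starting_after": starting_after})
    return f"{API_BASE}/responses/{response_id}?{query}"


if __name__ == "__main__":
    print(resume_stream_url("resp_123", 42))
```

Using `urlencode` rather than string concatenation keeps the query well-formed if more parameters are added later.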

guides/batch.md +24 −24

Details

129```129```

130 130 

131```bash131```bash

132curl https://api.openai.com/v1/files \\132curl https://api.openai.com/v1/files \

133 -H "Authorization: Bearer $OPENAI_API_KEY" \\133 -H "Authorization: Bearer $OPENAI_API_KEY" \

134 -F purpose="batch" \\134 -F purpose="batch" \

135 -F file="@batchinput.jsonl"135 -F file="@batchinput.jsonl"

136```136```

137 137 

138```cli138```cli

139openai files create \\139openai files create \

140 --file batchinput.jsonl \\140 --file batchinput.jsonl \

141 --purpose batch141 --purpose batch

142```142```

143 143 
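The upload commands above expect a `batchinput.jsonl` file, whose contents this hunk does not show. The sketch below assumes the usual Batch input shape of one JSON object per line with `custom_id`, `method`, `url`, and `body`. `batch_request_line` and `build_batch_jsonl` are hypothetical helper names, and the model name is copied from other examples in this diff.

```python
import json
from typing import Iterable


def batch_request_line(
    custom_id: str,
    user_message: str,
    model: str = "gpt-5.5",
    endpoint: str = "/v1/chat/completions",
) -> str:
    """Serialize one batch request as a single JSON line."""
    request = {
        "custom_id": custom_id,  # used to match results back to inputs
        "method": "POST",
        "url": endpoint,  # should match the endpoint given at batch creation
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": user_message}],
        },
    }
    return json.dumps(request)


def build_batch_jsonl(messages: Iterable[str]) -> str:
    """Join one request per message into JSONL text."""
    lines = [
        batch_request_line(f"request-{i}", message)
        for i, message in enumerate(messages, start=1)
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Write the result to batchinput.jsonl before uploading it.
    print(build_batch_jsonl(["Hello world!", "Summarize ticket 123."]), end="")
```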


177```177```

178 178 

179```bash179```bash

180curl https://api.openai.com/v1/batches \\180curl https://api.openai.com/v1/batches \

181 -H "Authorization: Bearer $OPENAI_API_KEY" \\181 -H "Authorization: Bearer $OPENAI_API_KEY" \

182 -H "Content-Type: application/json" \\182 -H "Content-Type: application/json" \

183 -d '{183 -d '{

184 "input_file_id": "file-abc123",184 "input_file_id": "file-abc123",

185 "endpoint": "/v1/chat/completions",185 "endpoint": "/v1/chat/completions",


188```188```

189 189 

190```cli190```cli

191openai batches create \\191openai batches create \

192 --input-file-id file-abc123 \\192 --input-file-id file-abc123 \

193 --endpoint /v1/chat/completions \\193 --endpoint /v1/chat/completions \

194 --completion-window 24h194 --completion-window 24h

195```195```

196 196 


246```246```

247 247 

248```bash248```bash

249curl https://api.openai.com/v1/batches/batch_abc123 \\249curl https://api.openai.com/v1/batches/batch_abc123 \

250 -H "Authorization: Bearer $OPENAI_API_KEY" \\250 -H "Authorization: Bearer $OPENAI_API_KEY" \

251 -H "Content-Type: application/json"251 -H "Content-Type: application/json"

252```252```

253 253 

254```cli254```cli

255openai batches retrieve \\255openai batches retrieve \

256 --batch-id batch_abc123256 --batch-id batch_abc123

257```257```

258 258 


295```295```

296 296 

297```bash297```bash

298curl https://api.openai.com/v1/files/file-xyz123/content \\298curl https://api.openai.com/v1/files/file-xyz123/content \

299 -H "Authorization: Bearer $OPENAI_API_KEY" > batch_output.jsonl299 -H "Authorization: Bearer $OPENAI_API_KEY" > batch_output.jsonl

300```300```

301 301 

302```cli302```cli

303openai files content \\303openai files content \

304 --file-id file-xyz123 \\304 --file-id file-xyz123 \

305 --output batch_output.jsonl305 --output batch_output.jsonl

306```306```

307 307 


344```344```

345 345 

346```bash346```bash

347curl https://api.openai.com/v1/batches/batch_abc123/cancel \\347curl https://api.openai.com/v1/batches/batch_abc123/cancel \

348 -H "Authorization: Bearer $OPENAI_API_KEY" \\348 -H "Authorization: Bearer $OPENAI_API_KEY" \

349 -H "Content-Type: application/json" \\349 -H "Content-Type: application/json" \

350 -X POST350 -X POST

351```351```

352 352 

353```cli353```cli

354openai batches cancel \\354openai batches cancel \

355 --batch-id batch_abc123355 --batch-id batch_abc123

356```356```

357 357 


381```381```

382 382 

383```bash383```bash

384curl https://api.openai.com/v1/batches?limit=10 \\384curl https://api.openai.com/v1/batches?limit=10 \

385 -H "Authorization: Bearer $OPENAI_API_KEY" \\385 -H "Authorization: Bearer $OPENAI_API_KEY" \

386 -H "Content-Type: application/json"386 -H "Content-Type: application/json"

387```387```

388 388 

389```cli389```cli

390openai batches list \\390openai batches list \

391 --limit 10391 --limit 10

392```392```

393 393 

Details

1# ChatKit1# ChatKit

2 2 

3import {

4 BookBookmark,

5 Code,

6 Cube,

7 Inpaint,

8 Globe,

9 Playground,

10 Sparkles,

11} from "@components/react/oai/platform/ui/Icon.react";

12 

13 

14 

15ChatKit is the best way to build agentic chat experiences. Whether you’re building an internal knowledge base assistant, HR onboarding helper, research companion, shopping or scheduling assistant, troubleshooting bot, financial planning advisor, or support agent, ChatKit provides a customizable chat embed to handle all user experience details.3ChatKit is the best way to build agentic chat experiences. Whether you’re building an internal knowledge base assistant, HR onboarding helper, research companion, shopping or scheduling assistant, troubleshooting bot, financial planning advisor, or support agent, ChatKit provides a customizable chat embed to handle all user experience details.

16 4 

17Use ChatKit's embeddable UI widgets, customizable prompts, tool‑invocation support, file attachments, and chain‑of‑thought visualizations to build agents without reinventing the chat UI.5Use ChatKit's embeddable UI widgets, customizable prompts, tool‑invocation support, file attachments, and chain‑of‑thought visualizations to build agents without reinventing the chat UI.

Details

1# Theming and customization in ChatKit1# Theming and customization in ChatKit

2 2 

3import {

4 Inpaint,

5 Globe,

6 Playground,

7 Github,

8 Sparkles,

9} from "@components/react/oai/platform/ui/Icon.react";

10 

11After following the [ChatKit quickstart](https://developers.openai.com/api/docs/guides/chatkit), learn how to change themes and add customization to your chat embed. Match your app’s aesthetic with light and dark themes, setting an accent color, controlling the density, and rounded corners.3After following the [ChatKit quickstart](https://developers.openai.com/api/docs/guides/chatkit), learn how to change themes and add customization to your chat embed. Match your app’s aesthetic with light and dark themes, setting an accent color, controlling the density, and rounded corners.

12 4 

13## Overview5## Overview

Details

28const openai = new OpenAI();28const openai = new OpenAI();

29 29 

30const result = await openai.responses.create({30const result = await openai.responses.create({

31 model: "${latestMainlineModelSlug}",31 model: "gpt-5.5",

32 input: "Find the null pointer exception: ...your code here...",32 input: "Find the null pointer exception: ...your code here...",

33 reasoning: { effort: "high" },33 reasoning: { effort: "high" },

34});34});


41client = OpenAI()41client = OpenAI()

42 42 

43result = client.responses.create(43result = client.responses.create(

44 model="${latestMainlineModelSlug}",44 model="gpt-5.5",

45 input="Find the null pointer exception: ...your code here...",45 input="Find the null pointer exception: ...your code here...",

46 reasoning={ "effort": "high" },46 reasoning={ "effort": "high" },

47)47)


50```50```

51 51 

52```bash52```bash

53curl https://api.openai.com/v1/responses \\53curl https://api.openai.com/v1/responses \

54 -H "Content-Type: application/json" \\54 -H "Content-Type: application/json" \

55 -H "Authorization: Bearer $OPENAI_API_KEY" \\55 -H "Authorization: Bearer $OPENAI_API_KEY" \

56 -d '{56 -d '{

57 "model": "${latestMainlineModelSlug}",57 "model": "gpt-5.5",

58 "input": "Find the null pointer exception: ...your code here...",58 "input": "Find the null pointer exception: ...your code here...",

59 "reasoning": { "effort": "high" }59 "reasoning": { "effort": "high" }

60 }'60 }'

Details

1# Deep research1# Deep research

2 2 

3import {

4 deepResearchBasic,

5 deepResearchClarification,

6 deepResearchPromptEnrichment,

7 deepResearchRemoteMCP,

8} from "./deep-research-examples";

9 

10 

11 

12 

13The [`o3-deep-research`](https://developers.openai.com/api/docs/models/o3-deep-research) and [`o4-mini-deep-research`](https://developers.openai.com/api/docs/models/o4-mini-deep-research) models can find, analyze, and synthesize hundreds of sources to create a comprehensive report at the level of a research analyst. These models are optimized for browsing and data analysis, and can use [web search](https://developers.openai.com/api/docs/guides/tools-web-search), [remote MCP](https://developers.openai.com/api/docs/guides/tools-remote-mcp) servers, and [file search](https://developers.openai.com/api/docs/guides/tools-file-search) over internal [vector stores](https://developers.openai.com/api/docs/api-reference/vector-stores) to generate detailed reports, ideal for use cases like:3The [`o3-deep-research`](https://developers.openai.com/api/docs/models/o3-deep-research) and [`o4-mini-deep-research`](https://developers.openai.com/api/docs/models/o4-mini-deep-research) models can find, analyze, and synthesize hundreds of sources to create a comprehensive report at the level of a research analyst. These models are optimized for browsing and data analysis, and can use [web search](https://developers.openai.com/api/docs/guides/tools-web-search), [remote MCP](https://developers.openai.com/api/docs/guides/tools-remote-mcp) servers, and [file search](https://developers.openai.com/api/docs/guides/tools-file-search) over internal [vector stores](https://developers.openai.com/api/docs/api-reference/vector-stores) to generate detailed reports, ideal for use cases like:

14 4 

15- Legal or scientific research5- Legal or scientific research

Details

47```47```

48 48 

49```bash49```bash

50curl https://api.openai.com/v1/embeddings \\50curl https://api.openai.com/v1/embeddings \

51 -H "Content-Type: application/json" \\51 -H "Content-Type: application/json" \

52 -H "Authorization: Bearer $OPENAI_API_KEY" \\52 -H "Authorization: Bearer $OPENAI_API_KEY" \

53 -d '{53 -d '{

54 "input": "Your text string goes here",54 "input": "Your text string goes here",

55 "model": "text-embedding-3-small"55 "model": "text-embedding-3-small"

guides/evals.md +19 −19

Details

33 Categorize IT support tickets33 Categorize IT support tickets

34 34 

35```bash35```bash

36curl https://api.openai.com/v1/responses \\36curl https://api.openai.com/v1/responses \

37 -H "Authorization: Bearer $OPENAI_API_KEY" \\37 -H "Authorization: Bearer $OPENAI_API_KEY" \

38 -H "Content-Type: application/json" \\38 -H "Content-Type: application/json" \

39 -d '{39 -d '{

40 "model": "gpt-5.5",40 "model": "gpt-5.5",

41 "input": [41 "input": [


55import OpenAI from "openai";55import OpenAI from "openai";

56const client = new OpenAI();56const client = new OpenAI();

57 57 

58const instructions = \`58const instructions = `

59You are an expert in categorizing IT support tickets. Given the support59You are an expert in categorizing IT support tickets. Given the support

60ticket below, categorize the request into one of "Hardware", "Software",60ticket below, categorize the request into one of "Hardware", "Software",

61or "Other". Respond with only one of those words.61or "Other". Respond with only one of those words.

62\`;62`;

63 63 

64const ticket = "My monitor won't turn on - help!";64const ticket = "My monitor won't turn on - help!";

65 65 


109Create an eval109Create an eval

110 110 

111```bash111```bash

112curl https://api.openai.com/v1/evals \\112curl https://api.openai.com/v1/evals \

113 -H "Authorization: Bearer $OPENAI_API_KEY" \\113 -H "Authorization: Bearer $OPENAI_API_KEY" \

114 -H "Content-Type: application/json" \\114 -H "Content-Type: application/json" \

115 -d '{115 -d '{

116 "name": "IT Ticket Categorization",116 "name": "IT Ticket Categorization",

117 "data_source_config": {117 "data_source_config": {


295Upload a test data file295Upload a test data file

296 296 

297```bash297```bash

298curl https://api.openai.com/v1/files \\298curl https://api.openai.com/v1/files \

299 -H "Authorization: Bearer $OPENAI_API_KEY" \\299 -H "Authorization: Bearer $OPENAI_API_KEY" \

300 -F purpose="evals" \\300 -F purpose="evals" \

301 -F file="@tickets.jsonl"301 -F file="@tickets.jsonl"

302```302```

303 303 


354 Create an eval run354 Create an eval run

355 355 

356```bash356```bash

357curl https://api.openai.com/v1/evals/YOUR_EVAL_ID/runs \\357curl https://api.openai.com/v1/evals/YOUR_EVAL_ID/runs \

358 -H "Authorization: Bearer $OPENAI_API_KEY" \\358 -H "Authorization: Bearer $OPENAI_API_KEY" \

359 -H "Content-Type: application/json" \\359 -H "Content-Type: application/json" \

360 -d '{360 -d '{

361 "name": "Categorization text run",361 "name": "Categorization text run",

362 "data_source": {362 "data_source": {

363 "type": "responses",363 "type": "responses",

364 "model": "gpt-4.1",364 "model": "gpt-5.5",

365 "input_messages": {365 "input_messages": {

366 "type": "template",366 "type": "template",

367 "template": [367 "template": [


382 name: "Categorization text run",382 name: "Categorization text run",

383 data_source: {383 data_source: {

384 type: "responses",384 type: "responses",

385 model: "gpt-4.1",385 model: "gpt-5.5",

386 input_messages: {386 input_messages: {

387 type: "template",387 type: "template",

388 template: [388 template: [


406 name="Categorization text run",406 name="Categorization text run",

407 data_source={407 data_source={

408 "type": "responses",408 "type": "responses",

409 "model": "gpt-4.1",409 "model": "gpt-5.5",

410 "input_messages": {410 "input_messages": {

411 "type": "template",411 "type": "template",

412 "template": [412 "template": [


492Retrieve eval run status492Retrieve eval run status

493 493 

494```bash494```bash

495curl https://api.openai.com/v1/evals/YOUR_EVAL_ID/runs/YOUR_RUN_ID \\495curl https://api.openai.com/v1/evals/YOUR_EVAL_ID/runs/YOUR_RUN_ID \

496 -H "Authorization: Bearer $OPENAI_API_KEY" \\496 -H "Authorization: Bearer $OPENAI_API_KEY" \

497 -H "Content-Type: application/json"497 -H "Content-Type: application/json"

498```498```

499 499 

Details

54Use an external file URL54Use an external file URL

55 55 

56```bash56```bash

57curl "https://api.openai.com/v1/responses" \\57curl "https://api.openai.com/v1/responses" \

58 -H "Content-Type: application/json" \\58 -H "Content-Type: application/json" \

59 -H "Authorization: Bearer $OPENAI_API_KEY" \\59 -H "Authorization: Bearer $OPENAI_API_KEY" \

60 -d '{60 -d '{

61 "model": "gpt-5.5",61 "model": "gpt-5.5",

62 "input": [62 "input": [


165Upload a file165Upload a file

166 166 

167```bash167```bash

168curl https://api.openai.com/v1/files \\168curl https://api.openai.com/v1/files \

169 -H "Authorization: Bearer $OPENAI_API_KEY" \\169 -H "Authorization: Bearer $OPENAI_API_KEY" \

170 -F purpose="user_data" \\170 -F purpose="user_data" \

171 -F file="@draconomicon.pdf"171 -F file="@draconomicon.pdf"

172 172 

173curl "https://api.openai.com/v1/responses" \\173curl "https://api.openai.com/v1/responses" \

174 -H "Content-Type: application/json" \\174 -H "Content-Type: application/json" \

175 -H "Authorization: Bearer $OPENAI_API_KEY" \\175 -H "Authorization: Bearer $OPENAI_API_KEY" \

176 -d '{176 -d '{

177 "model": "gpt-5.5",177 "model": "gpt-5.5",

178 "input": [178 "input": [


290Send a Base64-encoded file290Send a Base64-encoded file

291 291 

292```bash292```bash

293curl "https://api.openai.com/v1/responses" \\293curl "https://api.openai.com/v1/responses" \

294 -H "Content-Type: application/json" \\294 -H "Content-Type: application/json" \

295 -H "Authorization: Bearer $OPENAI_API_KEY" \\295 -H "Authorization: Bearer $OPENAI_API_KEY" \

296 -d '{296 -d '{

297 "model": "gpt-5.5",297 "model": "gpt-5.5",

298 "input": [298 "input": [


331 {331 {

332 type: "input_file",332 type: "input_file",

333 filename: "draconomicon.pdf",333 filename: "draconomicon.pdf",

334 file_data: \`data:application/pdf;base64,\${base64String}\`,334 file_data: `data:application/pdf;base64,${base64String}`,

335 },335 },

336 {336 {

337 type: "input_text",337 type: "input_text",
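The `file_data` value in this hunk is a data URL with the file's Base64 text appended. A minimal Python sketch of building the same `input_file` part follows. `pdf_data_url` and `input_file_part` are hypothetical helper names, and the field names are copied from the JavaScript snippet above.

```python
import base64


def pdf_data_url(pdf_bytes: bytes) -> str:
    """Encode raw PDF bytes as a data URL."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return f"data:application/pdf;base64,{encoded}"


def input_file_part(filename: str, pdf_bytes: bytes) -> dict:
    """Build the input_file content part shown in the example above."""
    return {
        "type": "input_file",
        "filename": filename,
        "file_data": pdf_data_url(pdf_bytes),
    }


if __name__ == "__main__":
    # Placeholder bytes; a real call would read the PDF from disk.
    part = input_file_part("draconomicon.pdf", b"%PDF-1.4 placeholder")
    print(part["file_data"][:40])
```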

Details

49```49```

50 50 

51```bash51```bash

52curl https://api.openai.com/v1/responses \\52curl https://api.openai.com/v1/responses \

53 -H "Authorization: Bearer $OPENAI_API_KEY" \\53 -H "Authorization: Bearer $OPENAI_API_KEY" \

54 -H "Content-Type: application/json" \\54 -H "Content-Type: application/json" \

55 -d '{55 -d '{

56 "model": "gpt-5.5",56 "model": "gpt-5.5",

57 "instructions": "List and describe all the metaphors used in this book.",57 "instructions": "List and describe all the metaphors used in this book.",

Details

72import fs from "fs";72import fs from "fs";

73const openai = new OpenAI();73const openai = new OpenAI();

74 74 

75const prompt = \`75const prompt = `

76A children's book drawing of a veterinarian using a stethoscope to 76A children's book drawing of a veterinarian using a stethoscope to

77listen to the heartbeat of a baby otter.77listen to the heartbeat of a baby otter.

78\`;78`;

79 79 

80const result = await openai.images.generate({80const result = await openai.images.generate({

81 model: "gpt-image-2",81 model: "gpt-image-2",


112```112```

113 113 

114```bash114```bash

115curl -X POST "https://api.openai.com/v1/images/generations" \\115curl -X POST "https://api.openai.com/v1/images/generations" \

116 -H "Authorization: Bearer $OPENAI_API_KEY" \\116 -H "Authorization: Bearer $OPENAI_API_KEY" \

117 -H "Content-type: application/json" \\117 -H "Content-type: application/json" \

118 -d '{118 -d '{

119 "model": "gpt-image-2",119 "model": "gpt-image-2",

120 "prompt": "A children'\\''s book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter."120 "prompt": "A children'\''s book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter."

121 }' | jq -r '.data[0].b64_json' | base64 --decode > otter.png121 }' | jq -r '.data[0].b64_json' | base64 --decode > otter.png

122```122```

123 123 

124```cli124```cli

125openai images generate \\125openai images generate \

126 --model gpt-image-2 \\126 --model gpt-image-2 \

127 --prompt "A children's book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter." \\127 --prompt "A children's book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter." \

128 --raw-output \\128 --raw-output \

129 --transform 'data.0.b64_json' | base64 --decode > otter.png129 --transform 'data.0.b64_json' | base64 --decode > otter.png

130```130```

131 131 


531for await (const event of stream) {531for await (const event of stream) {

532 if (event.type === "response.image_generation_call.partial_image") {532 if (event.type === "response.image_generation_call.partial_image") {

533 const idx = event.partial_image_index;533 const idx = event.partial_image_index;

534 saveBase64Image(\`river-partial-\${idx}.png\`, event.partial_image_b64);534 saveBase64Image(`river-partial-${idx}.png`, event.partial_image_b64);

535 } else if (event.type === "response.completed") {535 } else if (event.type === "response.completed") {

536 const imageData = event.response.output536 const imageData = event.response.output

537 .filter((output) => output.type === "image_generation_call")537 .filter((output) => output.type === "image_generation_call")


602 const idx = event.partial_image_index;602 const idx = event.partial_image_index;

603 const imageBase64 = event.b64_json;603 const imageBase64 = event.b64_json;

604 const imageBuffer = Buffer.from(imageBase64, "base64");604 const imageBuffer = Buffer.from(imageBase64, "base64");

605 fs.writeFileSync(\`river\${idx}.png\`, imageBuffer);605 fs.writeFileSync(`river${idx}.png`, imageBuffer);

606 }606 }

607}607}

608```608```


724 724 

725const client = new OpenAI();725const client = new OpenAI();

726 726 

727const prompt = \`727const prompt = `

728Generate a photorealistic image of a gift basket on a white background 728Generate a photorealistic image of a gift basket on a white background

729labeled 'Relax & Unwind' with a ribbon and handwriting-like font, 729labeled 'Relax & Unwind' with a ribbon and handwriting-like font,

730containing all the items in the reference pictures.730containing all the items in the reference pictures.

731\`;731`;

732 732 

733const imageFiles = [733const imageFiles = [

734 "bath-bomb.png",734 "bath-bomb.png",


758```758```

759 759 

760```bash760```bash

761curl -s -D >(grep -i x-request-id >&2) \\761curl -s -D >(grep -i x-request-id >&2) \

762 -o >(jq -r '.data[0].b64_json' | base64 --decode > gift-basket.png) \\762 -o >(jq -r '.data[0].b64_json' | base64 --decode > gift-basket.png) \

763 -X POST "https://api.openai.com/v1/images/edits" \\763 -X POST "https://api.openai.com/v1/images/edits" \

764 -H "Authorization: Bearer $OPENAI_API_KEY" \\764 -H "Authorization: Bearer $OPENAI_API_KEY" \

765 -F "model=gpt-image-2" \\765 -F "model=gpt-image-2" \

766 -F "image[]=@body-lotion.png" \\766 -F "image[]=@body-lotion.png" \

767 -F "image[]=@bath-bomb.png" \\767 -F "image[]=@bath-bomb.png" \

768 -F "image[]=@incense-kit.png" \\768 -F "image[]=@incense-kit.png" \

769 -F "image[]=@soap.png" \\769 -F "image[]=@soap.png" \

770 -F 'prompt=Generate a photorealistic image of a gift basket on a white background labeled "Relax & Unwind" with a ribbon and handwriting-like font, containing all the items in the reference pictures'770 -F 'prompt=Generate a photorealistic image of a gift basket on a white background labeled "Relax & Unwind" with a ribbon and handwriting-like font, containing all the items in the reference pictures'

771```771```

772 772 

773```cli773```cli

774openai images edit \\774openai images edit \

775 --model gpt-image-2 \\775 --model gpt-image-2 \

776 --image body-lotion.png \\776 --image body-lotion.png \

777 --image bath-bomb.png \\777 --image bath-bomb.png \

778 --image incense-kit.png \\778 --image incense-kit.png \

779 --image soap.png \\779 --image soap.png \

780 --prompt 'Generate a photorealistic image of a gift basket on a white background labeled "Relax & Unwind" with a ribbon and handwriting-like font, containing all the items in the reference pictures' \\780 --prompt 'Generate a photorealistic image of a gift basket on a white background labeled "Relax & Unwind" with a ribbon and handwriting-like font, containing all the items in the reference pictures' \

781 --raw-output \\781 --raw-output \

782 --transform 'data.0.b64_json' | base64 --decode > gift-basket.png782 --transform 'data.0.b64_json' | base64 --decode > gift-basket.png

783```783```

784 784 


962```962```

963 963 

964```bash964```bash

965curl -s -D >(grep -i x-request-id >&2) \\965curl -s -D >(grep -i x-request-id >&2) \

966 -o >(jq -r '.data[0].b64_json' | base64 --decode > lounge.png) \\966 -o >(jq -r '.data[0].b64_json' | base64 --decode > lounge.png) \

967 -X POST "https://api.openai.com/v1/images/edits" \\967 -X POST "https://api.openai.com/v1/images/edits" \

968 -H "Authorization: Bearer $OPENAI_API_KEY" \\968 -H "Authorization: Bearer $OPENAI_API_KEY" \

969 -F "model=gpt-image-2" \\969 -F "model=gpt-image-2" \

970 -F "mask=@mask.png" \\970 -F "mask=@mask.png" \

971 -F "image[]=@sunlit_lounge.png" \\971 -F "image[]=@sunlit_lounge.png" \

972 -F 'prompt=A sunlit indoor lounge area with a pool containing a flamingo'972 -F 'prompt=A sunlit indoor lounge area with a pool containing a flamingo'

973```973```

974 974 

975```cli975```cli

976openai images edit \\976openai images edit \

977 --model gpt-image-2 \\977 --model gpt-image-2 \

978 --image sunlit_lounge.png \\978 --image sunlit_lounge.png \

979 --mask mask.png \\979 --mask mask.png \

980 --prompt "A sunlit indoor lounge area with a pool containing a flamingo" \\980 --prompt "A sunlit indoor lounge area with a pool containing a flamingo" \

981 --raw-output \\981 --raw-output \

982 --transform 'data.0.b64_json' | base64 --decode > out.png982 --transform 'data.0.b64_json' | base64 --decode > out.png

983```983```

984 984 

Details

66 tools=[{"type": "image_generation"}],66 tools=[{"type": "image_generation"}],

67)67)

68 68 

69// Save the image to a file69# Save the image to a file

70image_data = [70image_data = [

71 output.result71 output.result

72 for output in response.output72 for output in response.output


80```80```

81 81 

82```cli82```cli

83openai responses create \\83openai responses create \

84 --model gpt-5.5 \\84 --model gpt-5.5 \

85 --raw-output \\85 --raw-output \

86 --transform 'output.#(type=="image_generation_call").result' <<'YAML' | base64 --decode > cat_and_otter.png86 --transform 'output.#(type=="image_generation_call").result' <<'YAML' | base64 --decode > cat_and_otter.png

87tools:87tools:

88 - type: image_generation88 - type: image_generation


189```189```

190 190 

191```bash191```bash

192curl https://api.openai.com/v1/responses \\192curl https://api.openai.com/v1/responses \

193 -H "Content-Type: application/json" \\193 -H "Content-Type: application/json" \

194 -H "Authorization: Bearer $OPENAI_API_KEY" \\194 -H "Authorization: Bearer $OPENAI_API_KEY" \

195 -d '{195 -d '{

196 "model": "gpt-5.5",196 "model": "gpt-5.5",

197 "input": [197 "input": [


210```210```

211 211 

212```cli212```cli

213openai responses create \\213openai responses create \

214 --model gpt-5.5 \\214 --model gpt-5.5 \

215 --raw-output \\215 --raw-output \

216 --transform 'output.#(type=="message").content.0.text' <<'YAML'216 --transform 'output.#(type=="message").content.0.text' <<'YAML'

217input:217input:

218 - role: user218 - role: user


247 { type: "input_text", text: "what's in this image?" },247 { type: "input_text", text: "what's in this image?" },

248 {248 {

249 type: "input_image",249 type: "input_image",

250 image_url: \`data:image/jpeg;base64,\${base64Image}\`,250 image_url: `data:image/jpeg;base64,${base64Image}`,

251 },251 },

252 ],252 ],

253 },253 },

Details

1# Migrate to the Responses API1# Migrate to the Responses API

2 2 

3import {

4 CheckCircleFilled,

5 XCircle,

6} from "@components/react/oai/platform/ui/Icon.react";

7 

8 

9 

10 

11 

12The [Responses API](https://developers.openai.com/api/docs/api-reference/responses) is our new API primitive, an evolution of [Chat Completions](https://developers.openai.com/api/docs/api-reference/chat) which brings added simplicity and powerful agentic primitives to your integrations.3The [Responses API](https://developers.openai.com/api/docs/api-reference/responses) is our new API primitive, an evolution of [Chat Completions](https://developers.openai.com/api/docs/api-reference/chat) which brings added simplicity and powerful agentic primitives to your integrations.

13 4 

14**While Chat Completions remains supported, Responses is recommended for all new projects.**5**While Chat Completions remains supported, Responses is recommended for all new projects.**


100 { "role": "user", "content": "Hello!" }91 { "role": "user", "content": "Hello!" }

101]'92]'

102 93 

103curl -s https://api.openai.com/v1/chat/completions \\94curl -s https://api.openai.com/v1/chat/completions \

104 -H "Content-Type: application/json" \\95 -H "Content-Type: application/json" \

105 -H "Authorization: Bearer $OPENAI_API_KEY" \\96 -H "Authorization: Bearer $OPENAI_API_KEY" \

106 -d "{97 -d "{

107 \\"model\\": \\"gpt-5.5\\",98 \"model\": \"gpt-5.5\",

108 \\"messages\\": $INPUT99 \"messages\": $INPUT

109 }"100 }"

110 101 

111curl -s https://api.openai.com/v1/responses \\102curl -s https://api.openai.com/v1/responses \

112 -H "Content-Type: application/json" \\103 -H "Content-Type: application/json" \

113 -H "Authorization: Bearer $OPENAI_API_KEY" \\104 -H "Authorization: Bearer $OPENAI_API_KEY" \

114 -d "{105 -d "{

115 \\"model\\": \\"gpt-5.5\\",106 \"model\": \"gpt-5.5\",

116 \\"input\\": $INPUT107 \"input\": $INPUT

117 }"108 }"

118```109```

119 110 


189```180```

190 181 

191```bash182```bash

192curl https://api.openai.com/v1/chat/completions \\183curl https://api.openai.com/v1/chat/completions \

193 -H "Content-Type: application/json" \\184 -H "Content-Type: application/json" \

194 -H "Authorization: Bearer $OPENAI_API_KEY" \\185 -H "Authorization: Bearer $OPENAI_API_KEY" \

195 -d '{186 -d '{

196 "model": "gpt-5.5",187 "model": "gpt-5.5",

197 "messages": [188 "messages": [


235```226```

236 227 

237```bash228```bash

238curl https://api.openai.com/v1/responses \\229curl https://api.openai.com/v1/responses \

239 -H "Content-Type: application/json" \\230 -H "Content-Type: application/json" \

240 -H "Authorization: Bearer $OPENAI_API_KEY" \\231 -H "Authorization: Bearer $OPENAI_API_KEY" \

241 -d '{232 -d '{

242 "model": "gpt-5.5",233 "model": "gpt-5.5",

243 "instructions": "You are a helpful assistant.",234 "instructions": "You are a helpful assistant.",


447 Structured Outputs438 Structured Outputs

448 439 

449```bash440```bash

450curl https://api.openai.com/v1/chat/completions \\441curl https://api.openai.com/v1/chat/completions \

451 -H "Content-Type: application/json" \\442 -H "Content-Type: application/json" \

452 -H "Authorization: Bearer $OPENAI_API_KEY" \\443 -H "Authorization: Bearer $OPENAI_API_KEY" \

453 -d '{444 -d '{

454 "model": "gpt-5.5",445 "model": "gpt-5.5",

455 "messages": [446 "messages": [


575 Structured Outputs566 Structured Outputs

576 567 

577```bash568```bash

578curl https://api.openai.com/v1/responses \\569curl https://api.openai.com/v1/responses \

579 -H "Content-Type: application/json" \\570 -H "Content-Type: application/json" \

580 -H "Authorization: Bearer $OPENAI_API_KEY" \\571 -H "Authorization: Bearer $OPENAI_API_KEY" \

581 -d '{572 -d '{

582 "model": "gpt-5.5",573 "model": "gpt-5.5",

583 "input": "Jane, 54 years old",574 "input": "Jane, 54 years old",


708```javascript699```javascript

709async function web_search(query) {700async function web_search(query) {

710 const fetch = (await import('node-fetch')).default;701 const fetch = (await import('node-fetch')).default;

711 const res = await fetch(\`https://api.example.com/search?q=\${query}\`);702 const res = await fetch(`https://api.example.com/search?q=${query}`);

712 const data = await res.json();703 const data = await res.json();

713 return data.results;704 return data.results;

714}705}


761```752```

762 753 

763```bash754```bash

764curl https://api.example.com/search \\755curl https://api.example.com/search \

765 -G \\756 -G \

766 --data-urlencode "q=your+search+term" \\757 --data-urlencode "q=your+search+term" \

767 --data-urlencode "key=$SEARCH_API_KEY"758 --data-urlencode "key=$SEARCH_API_KEY"

768```759```

769 760 


794```785```

795 786 

796```bash787```bash

797curl https://api.openai.com/v1/responses \\788curl https://api.openai.com/v1/responses \

798 -H "Content-Type: application/json" \\789 -H "Content-Type: application/json" \

799 -H "Authorization: Bearer $OPENAI_API_KEY" \\790 -H "Authorization: Bearer $OPENAI_API_KEY" \

800 -d '{791 -d '{

801 "model": "gpt-5.5",792 "model": "gpt-5.5",

802 "input": "Who is the current president of France?",793 "input": "Who is the current president of France?",

Details

1# Model optimization1# Model optimization

2 2 

3import {

4 Report,

5 Code,

6 Tools,

7} from "@components/react/oai/platform/ui/Icon.react";

8import {

9 evalsIcon,

10 promptIcon,

11 fineTuneIcon,

12} from "./model-optimization-icons";

13 

14 

15 

16 

17 

18 

19LLM output is non-deterministic, and model behavior changes between model snapshots and families. Developers must constantly measure and tune the performance of LLM applications to ensure they're getting the best results. In this guide, we explore the techniques and OpenAI platform tools you can use to ensure high quality outputs from the model.3LLM output is non-deterministic, and model behavior changes between model snapshots and families. Developers must constantly measure and tune the performance of LLM applications to ensure they're getting the best results. In this guide, we explore the techniques and OpenAI platform tools you can use to ensure high quality outputs from the model.

20 4 

21This guide covers evals and fine-tuning workflows that are being moved into5This guide covers evals and fine-tuning workflows that are being moved into

Details

27```javascript27```javascript

28import OpenAI from "openai";28import OpenAI from "openai";

29 29 

30const code = \`30const code = `

31class User {31class User {

32 firstName: string = "";32 firstName: string = "";

33 lastName: string = "";33 lastName: string = "";


35}35}

36 36 

37export default User;37export default User;

38\`.trim();38`.trim();

39 39 

40const openai = new OpenAI();40const openai = new OpenAI();

41 41 

42const refactorPrompt = \`42const refactorPrompt = `

43Replace the "username" property with an "email" property. Respond only 43Replace the "username" property with an "email" property. Respond only

44with code, and with no markdown formatting.44with code, and with no markdown formatting.

45\`;45`;

46 46 

47const completion = await openai.chat.completions.create({47const completion = await openai.chat.completions.create({

48 model: "gpt-4.1",48 model: "gpt-4.1",


111```111```

112 112 

113```bash113```bash

114curl https://api.openai.com/v1/chat/completions \\114curl https://api.openai.com/v1/chat/completions \

115 -H "Content-Type: application/json" \\115 -H "Content-Type: application/json" \

116 -H "Authorization: Bearer $OPENAI_API_KEY" \\116 -H "Authorization: Bearer $OPENAI_API_KEY" \

117 -d '{117 -d '{

118 "model": "gpt-4.1",118 "model": "gpt-4.1",

119 "messages": [119 "messages": [


174```javascript174```javascript

175import OpenAI from "openai";175import OpenAI from "openai";

176 176 

177const code = \`177const code = `

178class User {178class User {

179 firstName: string = "";179 firstName: string = "";

180 lastName: string = "";180 lastName: string = "";


182}182}

183 183 

184export default User;184export default User;

185\`.trim();185`.trim();

186 186 

187const openai = new OpenAI();187const openai = new OpenAI();

188 188 

189const refactorPrompt = \`189const refactorPrompt = `

190Replace the "username" property with an "email" property. Respond only 190Replace the "username" property with an "email" property. Respond only

191with code, and with no markdown formatting.191with code, and with no markdown formatting.

192\`;192`;

193 193 

194const completion = await openai.chat.completions.create({194const completion = await openai.chat.completions.create({

195 model: "gpt-4.1",195 model: "gpt-4.1",

Details

97```97```

98 98 

99```bash99```bash

100curl "https://api.openai.com/v1/responses" \\100curl "https://api.openai.com/v1/responses" \

101 -H "Content-Type: application/json" \\101 -H "Content-Type: application/json" \

102 -H "Authorization: Bearer $OPENAI_API_KEY" \\102 -H "Authorization: Bearer $OPENAI_API_KEY" \

103 -d '{103 -d '{

104 "model": "gpt-5.5",104 "model": "gpt-5.5",

105 "reasoning": {"effort": "low"},105 "reasoning": {"effort": "low"},


158```158```

159 159 

160```bash160```bash

161curl "https://api.openai.com/v1/responses" \\161curl "https://api.openai.com/v1/responses" \

162 -H "Content-Type: application/json" \\162 -H "Content-Type: application/json" \

163 -H "Authorization: Bearer $OPENAI_API_KEY" \\163 -H "Authorization: Bearer $OPENAI_API_KEY" \

164 -d '{164 -d '{

165 "model": "gpt-5.5",165 "model": "gpt-5.5",

166 "reasoning": {"effort": "low"},166 "reasoning": {"effort": "low"},


236 236 

237<div data-content-switcher-pane data-value="prompt">237<div data-content-switcher-pane data-value="prompt">

238 <div class="hidden">Example prompt</div>238 <div class="hidden">Example prompt</div>

239 A developer message for code generation

240 

241```text

242# Identity

243 

244You are a coding assistant that helps enforce the use of snake case

245variables in JavaScript code, and writing code that will run in

246Internet Explorer version 6.

247 

248# Instructions

249 

250* When defining variables, use snake case names (e.g. my_variable)

251 instead of camel case names (e.g. myVariable).

252* To support old browsers, declare variables using the older

253 "var" keyword.

254* Do not give responses with Markdown formatting, just return

255 the code as requested.

256 

257# Examples

258 

259<user_query>

260How do I declare a string variable for a first name?

261</user_query>

262 

263<assistant_response>

264var first_name = "Anna";

265</assistant_response>

266```

267 

239 </div>268 </div>

240 <div data-content-switcher-pane data-value="code" hidden>269 <div data-content-switcher-pane data-value="code" hidden>

241 <div class="hidden">API request</div>270 <div class="hidden">API request</div>


274```303```

275 304 

276```bash305```bash

277curl https://api.openai.com/v1/responses \\306curl https://api.openai.com/v1/responses \

278 -H "Authorization: Bearer $OPENAI_API_KEY" \\307 -H "Authorization: Bearer $OPENAI_API_KEY" \

279 -H "Content-Type: application/json" \\308 -H "Content-Type: application/json" \

280 -d '{309 -d '{

281 "model": "gpt-5.5",310 "model": "gpt-5.5",

282 "instructions": "'"$(< prompt.txt)"'",311 "instructions": "'"$(< prompt.txt)"'",

Details

1# Prompt generation1# Prompt generation

2 2 

3import {

4 FUNCTION_META_SCHEMA,

5 FUNCTION_META_SCHEMA_PROMPT,

6 GENERAL_META_PROMPT,

7 GENERAL_META_PROMPT_EDIT,

8 META_SCHEMA,

9 META_SCHEMA_PROMPT,

10 REALTIME_META_PROMPT,

11 REALTIME_META_PROMPT_EDIT,

12} from "./prompts";

13 

14The **Generate** button in the [Playground](https://platform.openai.com/chat/edit) lets you generate prompts, [functions](https://developers.openai.com/api/docs/guides/function-calling), and [schemas](https://developers.openai.com/api/docs/guides/structured-outputs#supported-schemas) from just a description of your task. This guide will walk through exactly how it works.3The **Generate** button in the [Playground](https://platform.openai.com/chat/edit) lets you generate prompts, [functions](https://developers.openai.com/api/docs/guides/function-calling), and [schemas](https://developers.openai.com/api/docs/guides/structured-outputs#supported-schemas) from just a description of your task. This guide will walk through exactly how it works.

15 4 

16## Overview5## Overview


30 19 

31### Meta-prompts20### Meta-prompts

32 21 

33export const textMeta = {

34 python:`

35from openai import OpenAI

36 

37client = OpenAI()

38 

39META_PROMPT = """\n`+

40GENERAL_META_PROMPT + "\n" +`""".strip()

41 

42def generate_prompt(task_or_prompt: str):

43completion = client.chat.completions.create(

44model="gpt-4o",

45messages=[

46{

47"role": "system",

48"content": META_PROMPT,

49},

50{

51"role": "user",

52"content": "Task, Goal, or Current Prompt:\\n" + task_or_prompt,

53},

54],

55)

56 

57 return completion.choices[0].message.content

58 

59`.trim(),

60};

61 

62export const audioMeta = {

63 python:`

64from openai import OpenAI

65 

66client = OpenAI()

67 

68META_PROMPT = """\n`+

69REALTIME_META_PROMPT + "\n" +`""".strip()

70 

71def generate_prompt(task_or_prompt: str):

72completion = client.chat.completions.create(

73model="gpt-4o",

74messages=[

75{

76"role": "system",

77"content": META_PROMPT,

78},

79{

80"role": "user",

81"content": "Task, Goal, or Current Prompt:\\n" + task_or_prompt,

82},

83],

84)

85 

86 return completion.choices[0].message.content

87 

88`.trim(),

89};

90 

91 22 

92 23 

93<div data-content-switcher-pane data-value="text-out">24<div data-content-switcher-pane data-value="text-out">


103 34 

104To edit prompts, we use a slightly modified meta-prompt. While direct edits are straightforward to apply, identifying necessary changes for more open-ended revisions can be challenging. To address this, we include a **reasoning section** at the beginning of the response. This section helps guide the model in determining what changes are needed by evaluating the existing prompt's clarity, chain-of-thought ordering, overall structure, and specificity, among other factors. The reasoning section makes suggestions for improvements and is then parsed out from the final response.35To edit prompts, we use a slightly modified meta-prompt. While direct edits are straightforward to apply, identifying necessary changes for more open-ended revisions can be challenging. To address this, we include a **reasoning section** at the beginning of the response. This section helps guide the model in determining what changes are needed by evaluating the existing prompt's clarity, chain-of-thought ordering, overall structure, and specificity, among other factors. The reasoning section makes suggestions for improvements and is then parsed out from the final response.
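
The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the Playground's actual implementation: it assumes the reasoning section is wrapped in `<reasoning>...</reasoning>` tags at the start of the response, and the helper name `strip_reasoning` is hypothetical. Adjust the pattern if your meta-prompt uses a different delimiter.

```python
import re

# Assumption: the model wraps its reasoning in <reasoning>...</reasoning> tags.
# The delimiter is illustrative; match it to whatever your meta-prompt requests.
_REASONING_BLOCK = re.compile(r"<reasoning>.*?</reasoning>", re.DOTALL)


def strip_reasoning(response_text: str) -> str:
    """Remove the first reasoning section and return the remaining prompt text."""
    # count=1 removes only the first block, so any later literal mention of the
    # tags inside the generated prompt is left untouched.
    return _REASONING_BLOCK.sub("", response_text, count=1).strip()


if __name__ == "__main__":
    raw = (
        "<reasoning>\n"
        "- Clarity: the task is underspecified.\n"
        "</reasoning>\n"
        "Classify the sentiment of the review as positive or negative."
    )
    print(strip_reasoning(raw))
```

A response that contains no reasoning section passes through unchanged apart from surrounding whitespace, so the helper can be applied to every response.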

105 36 

106export const textMetaEdits = {

107 python:`

108from openai import OpenAI

109 

110client = OpenAI()

111 

112META_PROMPT = """\n`+

113GENERAL_META_PROMPT_EDIT + "\n" +`""".strip()

114 

115def generate_prompt(task_or_prompt: str):

116completion = client.chat.completions.create(

117model="gpt-4o",

118messages=[

119{

120"role": "system",

121"content": META_PROMPT,

122},

123{

124"role": "user",

125"content": "Task, Goal, or Current Prompt:\\n" + task_or_prompt,

126},

127],

128)

129 

130 return completion.choices[0].message.content

131 

132`.trim(),

133};

134 

135export const audioMetaEdits = {

136 python:`

137from openai import OpenAI

138 

139client = OpenAI()

140 

141META_PROMPT = """\n`+

142REALTIME_META_PROMPT_EDIT + "\n" +`""".strip()

143 

144def generate_prompt(task_or_prompt: str):

145completion = client.chat.completions.create(

146model="gpt-4o",

147messages=[

148{

149"role": "system",

150"content": META_PROMPT,

151},

152{

153"role": "user",

154"content": "Task, Goal, or Current Prompt:\\n" + task_or_prompt,

155},

156],

157)

158 

159 return completion.choices[0].message.content

160 

161`.trim(),

162};

163 

164 37 

165 38 

166<div data-content-switcher-pane data-value="text-out">39<div data-content-switcher-pane data-value="text-out">


224 97 

225Each meta-schema has a corresponding prompt which includes few-shot examples. When combined with the reliability of Structured Outputs — even without strict mode — we were able to generate schemas.98Each meta-schema has a corresponding prompt which includes few-shot examples. When combined with the reliability of Structured Outputs — even without strict mode — we were able to generate schemas.

226 99 

227export const soMetaSchema = {

228 python:`

229from openai import OpenAI

230import json

231 

232client = OpenAI()

233 

234META_SCHEMA = ` +

235JSON.stringify(META_SCHEMA, null, 2).replaceAll("false", "False") + "\n" +

236 

237`

238META_PROMPT = """\n` +

239META_SCHEMA_PROMPT + "\n" + `""".strip()

240 

241def generate_schema(description: str):

242completion = client.chat.completions.create(

243model="gpt-5.4-mini",

244response_format={"type": "json_schema", "json_schema": META_SCHEMA},

245messages=[

246{

247"role": "system",

248"content": META_PROMPT,

249},

250{

251"role": "user",

252"content": "Description:\\n" + description,

253},

254],

255)

256 

257 return json.loads(completion.choices[0].message.content)

258 

259`.trim(),

260};

261 

262export const soFunctionSchema = {

263 python:`

264from openai import OpenAI

265import json

266 

267client = OpenAI()

268 

269META_SCHEMA = ` +

270JSON.stringify(FUNCTION_META_SCHEMA, null, 2).replaceAll("false", "False") + "\n" +

271 

272`

273META_PROMPT = """\n` +

274FUNCTION_META_SCHEMA_PROMPT + "\n" + `""".strip()

275 

276def generate_function_schema(description: str):

277completion = client.chat.completions.create(

278model="gpt-5.4-mini",

279response_format={"type": "json_schema", "json_schema": META_SCHEMA},

280messages=[

281{

282"role": "system",

283"content": META_PROMPT,

284},

285{

286"role": "user",

287"content": "Description:\\n" + description,

288},

289],

290)

291 

292 return json.loads(completion.choices[0].message.content)

293 

294`.trim(),

295};

296 

297 100 

298 101 

299<div data-content-switcher-pane data-value="structured-output">102<div data-content-switcher-pane data-value="structured-output">

Details

47```47```

48 48 

49```bash49```bash

50curl https://api.openai.com/v1/responses \\50curl https://api.openai.com/v1/responses \

51 -H "Content-Type: application/json" \\51 -H "Content-Type: application/json" \

52 -H "Authorization: Bearer $OPENAI_API_KEY" \\52 -H "Authorization: Bearer $OPENAI_API_KEY" \

53 -d '{53 -d '{

54 "prompt": {54 "prompt": {

55 "prompt_id": "pmpt_123",55 "prompt_id": "pmpt_123",


73const client = new OpenAI();73const client = new OpenAI();

74 74 

75const response = await client.responses.create({75const response = await client.responses.create({

76 model: "gpt-5.1",76 model: "gpt-5.5",

77 input: [77 input: [

78 {78 {

79 role: "system",79 role: "system",


97client = OpenAI()97client = OpenAI()

98 98 

99response = client.responses.create(99response = client.responses.create(

100 model="gpt-5.1",100 model="gpt-5.5",

101 input=[101 input=[

102 {102 {

103 "role": "system",103 "role": "system",


114```114```

115 115 

116```bash116```bash

117curl https://api.openai.com/v1/responses \\117curl https://api.openai.com/v1/responses \

118 -H "Content-Type: application/json" \\118 -H "Content-Type: application/json" \

119 -H "Authorization: Bearer $OPENAI_API_KEY" \\119 -H "Authorization: Bearer $OPENAI_API_KEY" \

120 -d '{120 -d '{

121 "model": "gpt-5.1",121 "model": "gpt-5.5",

122 "input": [122 "input": [

123 {123 {

124 "role": "system",124 "role": "system",


169 },169 },

170 {170 {

171 role: "user",171 role: "user",

172 content: \`Customer name: \${customerName}. Issue: \${issue}. Write a response to the customer.\`,172 content: `Customer name: ${customerName}. Issue: ${issue}. Write a response to the customer.`,

173 },173 },

174 ];174 ];

175}175}

176 176 

177const response = await client.responses.create({177const response = await client.responses.create({

178 model: "gpt-5.1",178 model: "gpt-5.5",

179 input: buildSupportPrompt({179 input: buildSupportPrompt({

180 customerName: "Acme",180 customerName: "Acme",

181 issue: "billing question",181 issue: "billing question",


201 ]201 ]

202 202 

203response = client.responses.create(203response = client.responses.create(

204 model="gpt-5.1",204 model="gpt-5.5",

205 input=build_support_prompt(205 input=build_support_prompt(

206 customer_name="Acme",206 customer_name="Acme",

207 issue="billing question",207 issue="billing question",

Details

1# Realtime and audio1# Realtime and audio

2 2 

3import {

4 Cube,

5 Desktop,

6 Phone,

7} from "@components/react/oai/platform/ui/Icon.react";

8 

9Start with the outcome you want to build. Realtime sessions are best for live audio that needs low latency. Request-based audio APIs are best for files, bounded requests, or generated speech that doesn't need a live session.3Start with the outcome you want to build. Realtime sessions are best for live audio that needs low latency. Request-based audio APIs are best for files, bounded requests, or generated speech that doesn't need a live session.

10 4 

11## Common use cases5## Common use cases

Details

617 content: [617 content: [

618 {618 {

619 type: "input_image",619 type: "input_image",

620 image_url: \`data:image/{format};base64,\${base64Image}\`,620 image_url: `data:image/{format};base64,${base64Image}`,

621 },621 },

622 ],622 ],

623 },623 },


661Create an out-of-band model response661Create an out-of-band model response

662 662 

663```javascript663```javascript

664const prompt = \`664const prompt = `

665Analyze the conversation so far. If it is related to support, output665Analyze the conversation so far. If it is related to support, output

666"support". If it is related to sales, output "sales".666"support". If it is related to sales, output "sales".

667\`;667`;

668 668 

669const event = {669const event = {

670 type: "response.create",670 type: "response.create",


835Insert no-context model responses into the default conversation835Insert no-context model responses into the default conversation

836 836 

837```javascript837```javascript

838const prompt = \`838const prompt = `

839Say exactly the following:839Say exactly the following:

840I'm a little teapot, short and stout!840I'm a little teapot, short and stout!

841This is my handle, this is my spout!841This is my handle, this is my spout!

842\`;842`;

843 843 

844const event = {844const event = {

845 type: "response.create",845 type: "response.create",

Details

311 case "conversation.item.done":311 case "conversation.item.done":

312 if (event.item.type === "mcp_list_tools") {312 if (event.item.type === "mcp_list_tools") {

313 const names = event.item.tools.map((tool) => tool.name).join(", ");313 const names = event.item.tools.map((tool) => tool.name).join(", ");

314 console.log(\`MCP tools ready on \${event.item.server_label}: \${names}\`);314 console.log(`MCP tools ready on ${event.item.server_label}: ${names}`);

315 }315 }

316 316 

317 if (event.item.type === "mcp_approval_request") {317 if (event.item.type === "mcp_approval_request") {


334 case "response.output_item.done":334 case "response.output_item.done":

335 if (event.item.type === "mcp_call") {335 if (event.item.type === "mcp_call") {

336 console.log(336 console.log(

337 \`MCP output from \${event.item.server_label}.\${event.item.name}:\`,337 `MCP output from ${event.item.server_label}.${event.item.name}:`,

338 event.item.output338 event.item.output

339 );339 );

340 }340 }


436 const event = {436 const event = {

437 type: "conversation.item.create",437 type: "conversation.item.create",

438 item: {438 item: {

439 id: \`mcp_approval_\${approvalRequestId}\`,439 id: `mcp_approval_${approvalRequestId}`,

440 type: "mcp_approval_response",440 type: "mcp_approval_response",

441 approval_request_id: approvalRequestId,441 approval_request_id: approvalRequestId,

442 approve: true,442 approve: true,

Details

1# Realtime transcription1# Realtime transcription

2 2 

3import {

4 Bolt,

5 Cube,

6 Desktop,

7 Phone,

8} from "@components/react/oai/platform/ui/Icon.react";

9 

10Use realtime transcription when your application needs live speech-to-text without a spoken assistant response. Realtime transcription sessions stream transcript deltas as audio arrives, so users can see text before the full utterance is complete.3Use realtime transcription when your application needs live speech-to-text without a spoken assistant response. Realtime transcription sessions stream transcript deltas as audio arrives, so users can see text before the full utterance is complete.

11 4 

12For the lowest-latency streaming transcription path, use [`gpt-realtime-whisper`](https://developers.openai.com/api/docs/models/gpt-realtime-whisper). For offline files or workflows that don't need streaming deltas, use the standard speech-to-text models in the Audio API.5For the lowest-latency streaming transcription path, use [`gpt-realtime-whisper`](https://developers.openai.com/api/docs/models/gpt-realtime-whisper). For offline files or workflows that don't need streaming deltas, use the standard speech-to-text models in the Audio API.

Details

1# Realtime translation1# Realtime translation

2 2 

3import {

4 Bolt,

5 Cube,

6 Desktop,

7 Phone,

8} from "@components/react/oai/platform/ui/Icon.react";

9 

10 

11Realtime translation lets you stream source audio into a dedicated translation session and receive translated audio plus transcript deltas while the speaker is still talking. Use it for live interpretation, multilingual calls, broadcasts, meetings, lessons, and video rooms.3Realtime translation lets you stream source audio into a dedicated translation session and receive translated audio plus transcript deltas while the speaker is still talking. Use it for live interpretation, multilingual calls, broadcasts, meetings, lessons, and video rooms.

12 4 

13Use [`gpt-realtime-translate`](https://developers.openai.com/api/docs/models/gpt-realtime-translate) when your application should translate what a human says. If you need an assistant that answers questions, calls tools, and manages a conversation, use [`gpt-realtime-2`](https://developers.openai.com/api/docs/models/gpt-realtime-2) with a standard Realtime session instead.5Use [`gpt-realtime-translate`](https://developers.openai.com/api/docs/models/gpt-realtime-translate) when your application should translate what a human says. If you need an assistant that answers questions, calls tools, and manages a conversation, use [`gpt-realtime-2`](https://developers.openai.com/api/docs/models/gpt-realtime-2) with a standard Realtime session instead.


36 28 

37For browser apps, create a short-lived client secret on your server. Don't expose your standard API key in the browser.29For browser apps, create a short-lived client secret on your server. Don't expose your standard API key in the browser.

38 30 

31Create a translation client secret

32 

33```javascript

34app.post("/session", async (req, res) => {

35 const language = req.body.targetLanguage ?? "es";

36 

37 const response = await fetch(

38 "https://api.openai.com/v1/realtime/translations/client_secrets",

39 {

40 method: "POST",

41 headers: {

42 Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,

43 "Content-Type": "application/json",

44 "OpenAI-Safety-Identifier": "hashed-user-id",

45 },

46 body: JSON.stringify({

47 session: {

48 model: "gpt-realtime-translate",

49 audio: {

50 output: { language },

51 },

52 },

53 }),

54 }

55 );

56 

57 res.status(response.status).json(await response.json());

58});

59```

60 

61 

39In the browser, capture audio, create a peer connection, and post the SDP offer to the translation calls endpoint:62In the browser, capture audio, create a peer connection, and post the SDP offer to the translation calls endpoint:

40 63 

64Connect a browser translation call

65 

66```javascript

67const { value: clientSecret } = await fetch("/session", {

68 method: "POST",

69 headers: { "Content-Type": "application/json" },

70 body: JSON.stringify({ targetLanguage: "es" }),

71}).then((response) => response.json());

72 

73const sourceStream = await navigator.mediaDevices.getUserMedia({

74 audio: true,

75});

76 

77const pc = new RTCPeerConnection();

78pc.addTrack(sourceStream.getAudioTracks()[0], sourceStream);

79 

80const translatedAudio = new Audio();

81translatedAudio.autoplay = true;

82pc.ontrack = ({ streams }) => {

83 translatedAudio.srcObject = streams[0];

84};

85 

86const events = pc.createDataChannel("oai-events");

87events.onmessage = ({ data }) => {

88 const event = JSON.parse(data);

89 if (event.type === "session.output_transcript.delta") {

90 subtitles.textContent += event.delta;

91 }

92};

93 

94const offer = await pc.createOffer();

95await pc.setLocalDescription(offer);

96 

97const sdpResponse = await fetch(

98 "https://api.openai.com/v1/realtime/translations/calls",

99 {

100 method: "POST",

101 headers: {

102 Authorization: `Bearer ${clientSecret}`,

103 "Content-Type": "application/sdp",

104 },

105 body: offer.sdp,

106 }

107);

108 

109if (!sdpResponse.ok) {

110 throw new Error(await sdpResponse.text());

111}

112 

113await pc.setRemoteDescription({

114 type: "answer",

115 sdp: await sdpResponse.text(),

116});

117```

118 

119 
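The `events.onmessage` handler above appends each `session.output_transcript.delta` payload to the subtitles. As a minimal sketch of that accumulation logic outside the browser, the Python function below joins the `delta` fields from a list of raw JSON messages. The function name and the sample messages are hypothetical; only the event type and the `delta` field are taken from the snippet above.

```python
import json


def accumulate_transcript(messages):
    """Join the `delta` text of every output-transcript delta event.

    `messages` is an iterable of JSON strings, as they would arrive on the
    data channel. Events of any other type are ignored.
    """
    parts = []
    for raw in messages:
        event = json.loads(raw)
        if event.get("type") == "session.output_transcript.delta":
            parts.append(event.get("delta", ""))
    return "".join(parts)


# Hypothetical sample messages, for illustration only.
sample = [
    json.dumps({"type": "session.output_transcript.delta", "delta": "Hola"}),
    json.dumps({"type": "some.other.event"}),
    json.dumps({"type": "session.output_transcript.delta", "delta": " mundo"}),
]
print(accumulate_transcript(sample))  # Hola mundo
```

Keeping the accumulation in a pure function like this makes it easy to unit test separately from the WebRTC connection code.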

41## Create a WebSocket session120## Create a WebSocket session

42 121 

43Connect to the dedicated translation endpoint and select the model in the URL:122Connect to the dedicated translation endpoint and select the model in the URL:


53 "wss://api.openai.com/v1/realtime/translations?model=gpt-realtime-translate",132 "wss://api.openai.com/v1/realtime/translations?model=gpt-realtime-translate",

54 {133 {

55 headers: {134 headers: {

56 Authorization: \`Bearer \${process.env.OPENAI_API_KEY}\`,135 Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,

57 "OpenAI-Safety-Identifier": "hashed-user-id",136 "OpenAI-Safety-Identifier": "hashed-user-id",

58 },137 },

59 }138 }

Details

28Below is an example of a simple Node.js [express](https://expressjs.com/) server which creates a Realtime API session:28Below is an example of a simple Node.js [express](https://expressjs.com/) server which creates a Realtime API session:

29 29 

30```javascript30```javascript

31 31import express from "express";

32 32 

33const app = express();33const app = express();

34 34 


132Below is an example of a simple Node.js [express](https://expressjs.com/) server which mints an ephemeral API key using the REST API:133Below is an example of a simple Node.js [express](https://expressjs.com/) server which mints an ephemeral API key using the REST API:

133 134 

134```javascript135```javascript

135 136import express from "express";

136 137 

137const app = express();138const app = express();

138 139 

Details

124Over a WebSocket, you will both send and receive JSON-serialized events as strings of text, as in the Node.js example below (the same principles apply for other WebSocket libraries):124Over a WebSocket, you will both send and receive JSON-serialized events as strings of text, as in the Node.js example below (the same principles apply for other WebSocket libraries):

125 125 

126```javascript126```javascript

127 127import WebSocket from "ws";

128 128 

129const url = "wss://api.openai.com/v1/realtime?model=gpt-realtime-2";129const url = "wss://api.openai.com/v1/realtime?model=gpt-realtime-2";

130const ws = new WebSocket(url, {130const ws = new WebSocket(url, {

Details

1# Reasoning models1# Reasoning models

2 2 

3import {

4 Question,

5 Storage,

6} from "@components/react/oai/platform/ui/Icon.react";

7 

8 

9 

10 

11 

12 

13 

14 

15 

16**Reasoning models** like [GPT-5.5](https://developers.openai.com/api/docs/models/gpt-5.5) use internal reasoning tokens before producing a response. This helps the model plan, use tools effectively, inspect alternatives, recover from ambiguity, and solve harder multi-step tasks. Reasoning models work especially well for complex problem solving, coding, scientific reasoning, and multi-step agentic workflows. They're also the best models for [Codex CLI](https://github.com/openai/codex), our lightweight coding agent.3**Reasoning models** like [GPT-5.5](https://developers.openai.com/api/docs/models/gpt-5.5) use internal reasoning tokens before producing a response. This helps the model plan, use tools effectively, inspect alternatives, recover from ambiguity, and solve harder multi-step tasks. Reasoning models work especially well for complex problem solving, coding, scientific reasoning, and multi-step agentic workflows. They're also the best models for [Codex CLI](https://github.com/openai/codex), our lightweight coding agent.

17 4 

18Start with `gpt-5.5` for most reasoning workloads. If you need the highest-intelligence API option for more challenging problems that can tolerate more latency, use [`gpt-5.5-pro`](https://developers.openai.com/api/docs/models/gpt-5.5-pro). For lower cost, consider `gpt-5.4`; for lower cost and latency, consider `gpt-5.4-mini`.5Start with `gpt-5.5` for most reasoning workloads. If you need the highest-intelligence API option for more challenging problems that can tolerate more latency, use [`gpt-5.5-pro`](https://developers.openai.com/api/docs/models/gpt-5.5-pro). For lower cost, consider `gpt-5.4`; for lower cost and latency, consider `gpt-5.4-mini`.


33 20 

34const openai = new OpenAI();21const openai = new OpenAI();

35 22 

36const prompt = \`23const prompt = `

37Write a bash script that takes a matrix represented as a string with 24Write a bash script that takes a matrix represented as a string with

38format '[1,2],[3,4],[5,6]' and prints the transpose in the same format.25format '[1,2],[3,4],[5,6]' and prints the transpose in the same format.

39\`;26`;

40 27 

41const response = await openai.responses.create({28const response = await openai.responses.create({

42 model: "${latestMainlineModelSlug}",29 model: "gpt-5.5",

43 reasoning: { effort: "low" },30 reasoning: { effort: "low" },

44 input: [31 input: [

45 {32 {


63"""50"""

64 51 

65response = client.responses.create(52response = client.responses.create(

66 model="${latestMainlineModelSlug}",53 model="gpt-5.5",

67 reasoning={"effort": "low"},54 reasoning={"effort": "low"},

68 input=[55 input=[

69 {56 {


77```64```

78 65 

79```bash66```bash

80curl https://api.openai.com/v1/responses \\67curl https://api.openai.com/v1/responses \

81 -H "Content-Type: application/json" \\68 -H "Content-Type: application/json" \

82 -H "Authorization: Bearer $OPENAI_API_KEY" \\69 -H "Authorization: Bearer $OPENAI_API_KEY" \

83 -d '{70 -d '{

84 "model": "${latestMainlineModelSlug}",71 "model": "gpt-5.5",

85 "reasoning": {"effort": "low"},72 "reasoning": {"effort": "low"},

86 "input": [73 "input": [

87 {74 {

88 "role": "user",75 "role": "user",

89 "content": "Write a bash script that takes a matrix represented as a string with format \\"[1,2],[3,4],[5,6]\\" and prints the transpose in the same format."76 "content": "Write a bash script that takes a matrix represented as a string with format \"[1,2],[3,4],[5,6]\" and prints the transpose in the same format."

90 }77 }

91 ]78 ]

92 }'79 }'


167 154 

168const openai = new OpenAI();155const openai = new OpenAI();

169 156 

170const prompt = \`157const prompt = `

171Write a bash script that takes a matrix represented as a string with 158Write a bash script that takes a matrix represented as a string with

172format '[1,2],[3,4],[5,6]' and prints the transpose in the same format.159format '[1,2],[3,4],[5,6]' and prints the transpose in the same format.

173\`;160`;

174 161 

175const response = await openai.responses.create({162const response = await openai.responses.create({

176 model: "${latestMainlineModelSlug}",163 model: "gpt-5.5",

177 reasoning: { effort: "medium" },164 reasoning: { effort: "medium" },

178 input: [165 input: [

179 {166 {


208"""195"""

209 196 

210response = client.responses.create(197response = client.responses.create(

211 model="${latestMainlineModelSlug}",198 model="gpt-5.5",

212 reasoning={"effort": "medium"},199 reasoning={"effort": "medium"},

213 input=[200 input=[

214 {201 {


274const openai = new OpenAI();262const openai = new OpenAI();

275 263 

276const response = await openai.responses.create({264const response = await openai.responses.create({

277 model: "${latestMainlineModelSlug}",265 model: "gpt-5.5",

278 input: "What is the capital of France?",266 input: "What is the capital of France?",

279 reasoning: {267 reasoning: {

280 effort: "low",268 effort: "low",


290client = OpenAI()278client = OpenAI()

291 279 

292response = client.responses.create(280response = client.responses.create(

293 model="${latestMainlineModelSlug}",281 model="gpt-5.5",

294 input="What is the capital of France?",282 input="What is the capital of France?",

295 reasoning={283 reasoning={

296 "effort": "low",284 "effort": "low",


302```290```

303 291 

304```bash292```bash

305curl https://api.openai.com/v1/responses \\293curl https://api.openai.com/v1/responses \

306 -H "Content-Type: application/json" \\294 -H "Content-Type: application/json" \

307 -H "Authorization: Bearer $OPENAI_API_KEY" \\295 -H "Authorization: Bearer $OPENAI_API_KEY" \

308 -d '{296 -d '{

309 "model": "${latestMainlineModelSlug}",297 "model": "gpt-5.5",

310 "input": "What is the capital of France?",298 "input": "What is the capital of France?",

311 "reasoning": {299 "reasoning": {

312 "effort": "low",300 "effort": "low",

Details

116 116 

117<div data-content-switcher-pane data-value="grader">117<div data-content-switcher-pane data-value="grader">

118 <div class="hidden">Grader configuration</div>118 <div class="hidden">Grader configuration</div>

119 Multi-grader configuration object

120 

121```json

122{

123 "type": "multi",

124 "graders": {

125 "explanation": {

126 "name": "Explanation text grader",

127 "type": "score_model",

128 "input": [

129 {

130 "role": "user",

131 "type": "message",

132 "content": "...see other tab for the full prompt..."

133 }

134 ],

135 "model": "gpt-4o-2024-08-06"

136 },

137 "compliant": {

138 "name": "compliant",

139 "type": "string_check",

140 "reference": "{{item.compliant}}",

141 "operation": "eq",

142 "input": "{{sample.output_json.compliant}}"

143 }

144 },

145 "calculate_output": "0.5 * compliant + 0.5 * explanation"

146}

147```

148 
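To make the `calculate_output` formula concrete, here is a minimal Python sketch of the arithmetic. It assumes the `string_check` grader contributes 1.0 on an exact match and 0.0 otherwise, and that the `score_model` grader returns a value between 0 and 1, as the grading prompt requests. The function names and sample values are hypothetical and not part of any SDK.

```python
def string_check_eq(reference: str, candidate: str) -> float:
    """Mimic an `eq` string check: 1.0 on exact match, else 0.0 (assumed scoring)."""
    return 1.0 if reference == candidate else 0.0


def combined_score(compliant: float, explanation: float) -> float:
    """Apply the formula from `calculate_output` in the config above."""
    return 0.5 * compliant + 0.5 * explanation


# Hypothetical sample: the `compliant` field matches, and the model grader
# scored the explanation 0.75 ("mostly correct" in the grading criteria).
compliant = string_check_eq("yes", "yes")
explanation = 0.75
score = combined_score(compliant, explanation)
print(score)  # 0.875
```

With equal weights, a wrong `compliant` value caps the combined score at 0.5 even when the explanation is graded as fully correct.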

119 </div>149 </div>

120 <div data-content-switcher-pane data-value="grader_json" hidden>150 <div data-content-switcher-pane data-value="grader_json" hidden>

121 <div class="hidden">Grading prompt</div>151 <div class="hidden">Grading prompt</div>

152 Grading prompt in the grader config

153 

154```markdown

155# Overview

156 

157Evaluate the accuracy of the model-generated answer based on the

158Copernicus Product Security Policy and an example answer. The response

159should align with the policy, cover key details, and avoid speculative

160or fabricated claims.

161 

162Always respond with a single floating point number 0 through 1,

163using the grading criteria below.

164 

165## Grading Criteria:

166- **1.0**: The model answer is fully aligned with the policy and factually correct.

167- **0.75**: The model answer is mostly correct but has minor omissions or slight rewording that does not change meaning.

168- **0.5**: The model answer is partially correct but lacks key details or contains speculative statements.

169- **0.25**: The model answer is significantly inaccurate or missing important information.

170- **0.0**: The model answer is completely incorrect, hallucinates policy details, or is irrelevant.

171 

172## Copernicus Product Security Policy

173 

174### Introduction

175Protecting customer data is a top priority for Copernicus. Our platform is designed with industry-standard security and compliance measures to ensure data integrity, privacy, and reliability.

176 

177### Data Classification

178Copernicus safeguards customer data, which includes prompts, responses, file uploads, user preferences, and authentication configurations. Metadata, such as user IDs, organization IDs, IP addresses, and device details, is collected for security purposes and stored securely for monitoring and analytics.

179 

180### Data Management

181Copernicus utilizes cloud-based storage with strong encryption (AES-256) and strict access controls. Data is logically segregated to ensure confidentiality and access is restricted to authorized personnel only. Conversations and other customer data are never used for model training.

182 

183### Data Retention

184Customer data is retained only for providing core functionalities like conversation history and team collaboration. Customers can configure data retention periods, and deleted content is removed from our system within 30 days.

185 

186### User Authentication & Access Control

187Users authenticate via Single Sign-On (SSO) using an Identity Provider (IdP). Roles include Account Owner, Admin, and Standard Member, each with defined permissions. User provisioning can be automated through SCIM integration.

188 

189### Compliance & Security Monitoring

190- **Compliance API**: Logs interactions, enabling data export and deletion.

191- **Audit Logging**: Ensures transparency for security audits.

192- **HIPAA Support**: Business Associate Agreements (BAAs) available for customers needing healthcare compliance.

193- **Security Monitoring**: 24/7 monitoring for threats and suspicious activity.

194- **Incident Response**: A dedicated security team follows strict protocols for handling incidents.

195 

196### Infrastructure Security

197- **Access Controls**: Role-based authentication with multi-factor security.

198- **Source Code Security**: Controlled code access with mandatory reviews before deployment.

199- **Network Security**: Web application firewalls and strict ingress/egress controls to prevent unauthorized access.

200- **Physical Security**: Data centers have controlled access, surveillance, and environmental risk management.

201 

202### Bug Bounty Program

203Security researchers are encouraged to report vulnerabilities through our Bug Bounty Program for responsible disclosure and rewards.

204 

205### Compliance & Certifications

206Copernicus maintains compliance with industry standards, including SOC 2 and GDPR. Customers can access security reports and documentation via our Security Portal.

207 

208### Conclusion

209Copernicus prioritizes security, privacy, and compliance. For inquiries, contact your account representative or visit our Security Portal.

210 

211## Examples

212 

213### Example 1: GDPR Compliance

214**Reference Answer**: 'Copernicus maintains compliance with industry standards, including SOC 2 and GDPR. Customers can access security reports and documentation via our Security Portal.'

215 

216**Model Answer 1**: 'Yes, Copernicus is GDPR compliant and provides compliance documentation via the Security Portal.'

217**Score: 1.0** (fully correct)

218 

219**Model Answer 2**: 'Yes, Copernicus follows GDPR standards.'

220**Score: 0.75** (mostly correct but lacks detail about compliance reports)

221 

222**Model Answer 3**: 'Copernicus may comply with GDPR but does not provide documentation.'

223**Score: 0.5** (partially correct, speculative about compliance reports)

224 

225**Model Answer 4**: 'Copernicus does not follow GDPR standards.'

226**Score: 0.0** (factually incorrect)

227 

228### Example 2: Encryption in Transit

229**Reference Answer**: 'The Copernicus Product Security Policy states that data is stored with strong encryption (AES-256) and that network security measures include web application firewalls and strict ingress/egress controls. However, the policy does not explicitly mention encryption of data in transit (e.g., TLS encryption). A review is needed to confirm whether data transmission is encrypted.'

230 

231**Model Answer 1**: 'Data is encrypted at rest using AES-256, but a review is needed to confirm encryption in transit.'

232**Score: 1.0** (fully correct)

233 

234**Model Answer 2**: 'Yes, Copernicus encrypts data in transit and at rest.'

235**Score: 0.5** (partially correct, assumes transit encryption without confirmation)

236 

237**Model Answer 3**: 'All data is protected with encryption.'

238**Score: 0.25** (vague and lacks clarity on encryption specifics)

239 

240**Model Answer 4**: 'Data is not encrypted in transit.'

241**Score: 0.0** (factually incorrect)

242 

243Reference Answer: {{item.explanation}}

244Model Answer: {{sample.output_json.explanation}}

245```

246 

122 </div>247 </div>

123 248 

124 249 

Details

1# Safety best practices1# Safety best practices

2 2 

3export const snippetExampleProvidingUserIdentifier = {

4 python: `

5from openai import OpenAI

6client = OpenAI()

7 

8response = client.chat.completions.create(

9model="gpt-5.5",

10messages=[

11{"role": "user", "content": "This is a test"}

12],

13max_completion_tokens=5,

14safety_identifier="user_123456"

15)

16`.trim(),

17 curl: `

18curl https://api.openai.com/v1/chat/completions \\

19-H "Content-Type: application/json" \\

20-H "Authorization: Bearer $OPENAI_API_KEY" \\

21-d '{

22"model": "gpt-5.5",

23"messages": [

24{"role": "user", "content": "This is a test"}

25],

26"max_completion_tokens": 5,

27"safety_identifier": "user123456"

28}'

29`.trim(),

30};

31 

32### Use our free Moderation API3### Use our free Moderation API

33 4 

34OpenAI's [Moderation API](https://developers.openai.com/api/docs/guides/moderation) is free to use and can help reduce the frequency of unsafe content in your completions. Alternatively, you may wish to develop your own content filtration system tailored to your use case.5OpenAI's [Moderation API](https://developers.openai.com/api/docs/guides/moderation) is free to use and can help reduce the frequency of unsafe content in your completions. Alternatively, you may wish to develop your own content filtration system tailored to your use case.


87with a model, but they are not required. Include safety identifiers in your API58with a model, but they are not required. Include safety identifiers in your API

88requests with the `safety_identifier` parameter:59requests with the `safety_identifier` parameter:

89 60 

61Example: Providing a safety identifier

62 

63```python

64from openai import OpenAI

65client = OpenAI()

66 

67response = client.chat.completions.create(

68model="gpt-5.5",

69messages=[

70{"role": "user", "content": "This is a test"}

71],

72max_completion_tokens=5,

73safety_identifier="user_123456"

74)

75```

76 

77```bash

78curl https://api.openai.com/v1/chat/completions \

79-H "Content-Type: application/json" \

80-H "Authorization: Bearer $OPENAI_API_KEY" \

81-d '{

82"model": "gpt-5.5",

83"messages": [

84{"role": "user", "content": "This is a test"}

85],

86"max_completion_tokens": 5,

87"safety_identifier": "user123456"

88}'

89```

90 

91 
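The examples above pass a literal identifier. One way to derive a stable, privacy-preserving value is to hash an internal user ID on your server before sending it. The sketch below uses a keyed SHA-256 hash (HMAC) from the Python standard library. The helper name, the `user_` prefix, the truncation length, and the secret are assumptions for illustration, not part of the API.

```python
import hashlib
import hmac


def make_safety_identifier(user_id: str, secret: bytes) -> str:
    """Return a stable, non-reversible identifier for `user_id`.

    The same (user_id, secret) pair always yields the same value, so the
    identifier stays consistent across requests without exposing the raw ID.
    """
    digest = hmac.new(secret, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncate to 32 hex characters to keep the identifier short (arbitrary choice).
    return f"user_{digest[:32]}"


# Hypothetical secret for illustration; in practice, load it from secure
# server-side configuration rather than hard-coding it.
EXAMPLE_SECRET = b"replace-with-a-server-side-secret"

identifier = make_safety_identifier("alice@example.com", EXAMPLE_SECRET)
print(identifier)
```

The resulting string can then be passed as the `safety_identifier` argument in the requests shown above.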

90For Realtime API requests, provide the same stable, privacy-preserving identifier92For Realtime API requests, provide the same stable, privacy-preserving identifier

91with the `OpenAI-Safety-Identifier` header. When you create an ephemeral Realtime93with the `OpenAI-Safety-Identifier` header. When you create an ephemeral Realtime

92client secret, include the header on the server-side request that creates the94client secret, include the header on the server-side request that creates the

Details

1# Safety checks1# Safety checks

2 2 

3export const snippetExampleProvidingUserIdentifier = {

4 python: `

5from openai import OpenAI

6client = OpenAI()

7 

8response = client.chat.completions.create(

9model="gpt-5.4-mini",

10messages=[

11{"role": "user", "content": "This is a test"}

12],

13safety_identifier="user_123456"

14)

15`.trim(),

16 curl: `

17curl https://api.openai.com/v1/chat/completions \\

18-H "Content-Type: application/json" \\

19-H "Authorization: Bearer $OPENAI_API_KEY" \\

20-d '{

21"model": "gpt-5.4-mini",

22"messages": [

23{"role": "user", "content": "This is a test"}

24],

25"safety_identifier": "user_123456"

26}'

27`.trim(),

28};

29 

30export const snippetExampleProvidingUserIdentifierResponses = {

31 python: `

32from openai import OpenAI

33client = OpenAI()

34 

35response = client.responses.create(

36model="gpt-5.4-mini",

37input="This is a test",

38safety_identifier="user_123456",

39)

40`.trim(),

41 curl: `

42curl https://api.openai.com/v1/responses \\

43-H "Content-Type: application/json" \\

44-H "Authorization: Bearer $OPENAI_API_KEY" \\

45-d '{

46"model": "gpt-5.4-mini",

47"input": "This is a test",

48"safety_identifier": "user_123456"

49}'

50`.trim(),

51};

52 

53export const snippetExampleProvidingUserIdentifierRealtime = {

54 curl: `

55curl https://api.openai.com/v1/realtime/client_secrets \\

56-H "Content-Type: application/json" \\

57-H "Authorization: Bearer $OPENAI_API_KEY" \\

58-H "OpenAI-Safety-Identifier: user_123456" \\

59-d '{

60"session": {

61"type": "realtime",

62"model": "gpt-realtime-2"

63}

64}'

65`.trim(),

66};

67 

68We run several types of evaluations on our models and how they're being used. This guide covers how we test for safety and what you can do to avoid violations.3We run several types of evaluations on our models and how they're being used. This guide covers how we test for safety and what you can do to avoid violations.

69 4 

70## Safety classifiers for GPT-5 and forward5## Safety classifiers for GPT-5 and forward


94 29 

95<div data-content-switcher-pane data-value="responses">30<div data-content-switcher-pane data-value="responses">

96 <div class="hidden">Responses API</div>31 <div class="hidden">Responses API</div>

32 Providing a safety identifier with the Responses API

33 

34```python

35from openai import OpenAI

36client = OpenAI()

37 

38response = client.responses.create(

39model="gpt-5.4-mini",

40input="This is a test",

41safety_identifier="user_123456",

42)

43```

44 

45```bash

46curl https://api.openai.com/v1/responses \

47-H "Content-Type: application/json" \

48-H "Authorization: Bearer $OPENAI_API_KEY" \

49-d '{

50"model": "gpt-5.4-mini",

51"input": "This is a test",

52"safety_identifier": "user_123456"

53}'

54```

55 

97 </div>56 </div>

98 <div data-content-switcher-pane data-value="chat" hidden>57 <div data-content-switcher-pane data-value="chat" hidden>

99 <div class="hidden">Chat Completions API</div>58 <div class="hidden">Chat Completions API</div>

59 Providing a safety identifier with the Chat Completions API

60 

61```python

62from openai import OpenAI

63client = OpenAI()

64 

65response = client.chat.completions.create(

66model="gpt-5.4-mini",

67messages=[

68{"role": "user", "content": "This is a test"}

69],

70safety_identifier="user_123456"

71)

72```

73 

74```bash

75curl https://api.openai.com/v1/chat/completions \

76-H "Content-Type: application/json" \

77-H "Authorization: Bearer $OPENAI_API_KEY" \

78-d '{

79"model": "gpt-5.4-mini",

80"messages": [

81{"role": "user", "content": "This is a test"}

82],

83"safety_identifier": "user_123456"

84}'

85```

86 

100 </div>87 </div>

101 <div data-content-switcher-pane data-value="realtime" hidden>88 <div data-content-switcher-pane data-value="realtime" hidden>

102 <div class="hidden">Realtime API</div>89 <div class="hidden">Realtime API</div>

90 Providing a safety identifier with the Realtime API

91 

92```bash

93curl https://api.openai.com/v1/realtime/client_secrets \

94-H "Content-Type: application/json" \

95-H "Authorization: Bearer $OPENAI_API_KEY" \

96-H "OpenAI-Safety-Identifier: user_123456" \

97-d '{

98"session": {

99"type": "realtime",

100"model": "gpt-realtime-2"

101}

102}'

103```

104 

103 </div>105 </div>

104 106 

105 107 

Details

64```64```

65 65 

66```cli66```cli

67openai audio:transcriptions create \\67openai audio:transcriptions create \

68 --model gpt-4o-transcribe \\68 --model gpt-4o-transcribe \

69 --file /path/to/file/audio.mp3 \\69 --file /path/to/file/audio.mp3 \

70 --raw-output \\70 --raw-output \

71 --transform text71 --transform text

72```72```

73 73 

74```bash74```bash

75curl --request POST \\75curl --request POST \

76 --url https://api.openai.com/v1/audio/transcriptions \\76 --url https://api.openai.com/v1/audio/transcriptions \

77 --header "Authorization: Bearer $OPENAI_API_KEY" \\77 --header "Authorization: Bearer $OPENAI_API_KEY" \

78 --header 'Content-Type: multipart/form-data' \\78 --header 'Content-Type: multipart/form-data' \

79 --form file=@/path/to/file/audio.mp3 \\79 --form file=@/path/to/file/audio.mp3 \

80 --form model=gpt-4o-transcribe80 --form model=gpt-4o-transcribe

81```81```

82 82 


125```125```

126 126 

127```bash127```bash

128curl --request POST \\128curl --request POST \

129 --url https://api.openai.com/v1/audio/transcriptions \\129 --url https://api.openai.com/v1/audio/transcriptions \

130 --header "Authorization: Bearer $OPENAI_API_KEY" \\130 --header "Authorization: Bearer $OPENAI_API_KEY" \

131 --header 'Content-Type: multipart/form-data' \\131 --header 'Content-Type: multipart/form-data' \

132 --form file=@/path/to/file/speech.mp3 \\132 --form file=@/path/to/file/speech.mp3 \

133 --form model=gpt-4o-transcribe \\133 --form model=gpt-4o-transcribe \

134 --form response_format=text134 --form response_format=text

135```135```

136 136 


171});171});

172 172 

173for (const segment of transcript.segments) {173for (const segment of transcript.segments) {

174 console.log(\`\${segment.speaker}: \${segment.text}\`, segment.start, segment.end);174 console.log(`${segment.speaker}: ${segment.text}`, segment.start, segment.end);

175}175}

176```176```

177 177 


202```202```

203 203 

204```bash204```bash

205curl --request POST \\205curl --request POST \

206 --url https://api.openai.com/v1/audio/transcriptions \\206 --url https://api.openai.com/v1/audio/transcriptions \

207 --header "Authorization: Bearer $OPENAI_API_KEY" \\207 --header "Authorization: Bearer $OPENAI_API_KEY" \

208 --header 'Content-Type: multipart/form-data' \\208 --header 'Content-Type: multipart/form-data' \

209 --form file=@/path/to/file/meeting.wav \\209 --form file=@/path/to/file/meeting.wav \

210 --form model=gpt-4o-transcribe-diarize \\210 --form model=gpt-4o-transcribe-diarize \

211 --form response_format=diarized_json \\211 --form response_format=diarized_json \

212 --form chunking_strategy=auto \\212 --form chunking_strategy=auto \

213 --form 'known_speaker_names[]=agent' \\213 --form 'known_speaker_names[]=agent' \

214 --form 'known_speaker_references[]=data:audio/wav;base64,AAA...'214 --form 'known_speaker_references[]=data:audio/wav;base64,AAA...'

215```215```

216 216 


255```255```

256 256 

257```bash257```bash

258curl --request POST \\258curl --request POST \

259 --url https://api.openai.com/v1/audio/translations \\259 --url https://api.openai.com/v1/audio/translations \

260 --header "Authorization: Bearer $OPENAI_API_KEY" \\260 --header "Authorization: Bearer $OPENAI_API_KEY" \

261 --header 'Content-Type: multipart/form-data' \\261 --header 'Content-Type: multipart/form-data' \

262 --form file=@/path/to/file/german.mp3 \\262 --form file=@/path/to/file/german.mp3 \

263 --form model=whisper-1 \\263 --form model=whisper-1

264```264```

265 265 

266 266 


321```321```

322 322 

323```bash323```bash

324curl https://api.openai.com/v1/audio/transcriptions \\324curl https://api.openai.com/v1/audio/transcriptions \

325 -H "Authorization: Bearer $OPENAI_API_KEY" \\325 -H "Authorization: Bearer $OPENAI_API_KEY" \

326 -H "Content-Type: multipart/form-data" \\326 -H "Content-Type: multipart/form-data" \

327 -F file="@/path/to/file/audio.mp3" \\327 -F file="@/path/to/file/audio.mp3" \

328 -F "timestamp_granularities[]=word" \\328 -F "timestamp_granularities[]=word" \

329 -F model="whisper-1" \\329 -F model="whisper-1" \

330 -F response_format="verbose_json"330 -F response_format="verbose_json"

331```331```

332 332 


393```393```

394 394 

395```bash395```bash

396curl --request POST \\396curl --request POST \

397 --url https://api.openai.com/v1/audio/transcriptions \\397 --url https://api.openai.com/v1/audio/transcriptions \

398 --header "Authorization: Bearer $OPENAI_API_KEY" \\398 --header "Authorization: Bearer $OPENAI_API_KEY" \

399 --header 'Content-Type: multipart/form-data' \\399 --header 'Content-Type: multipart/form-data' \

400 --form file=@/path/to/file/speech.mp3 \\400 --form file=@/path/to/file/speech.mp3 \

401 --form model=gpt-4o-transcribe \\401 --form model=gpt-4o-transcribe \

402 --form prompt="The following conversation is a lecture about the recent developments around OpenAI, GPT-4.5 and the future of AI."402 --form prompt="The following conversation is a lecture about the recent developments around OpenAI, GPT-4.5 and the future of AI."

403```403```

404 404 


475```475```

476 476 

477```bash477```bash

478curl --request POST \\478curl --request POST \

479 --url https://api.openai.com/v1/audio/transcriptions \\479 --url https://api.openai.com/v1/audio/transcriptions \

480 --header "Authorization: Bearer $OPENAI_API_KEY" \\480 --header "Authorization: Bearer $OPENAI_API_KEY" \

481 --header 'Content-Type: multipart/form-data' \\481 --header 'Content-Type: multipart/form-data' \

482 --form file=@example.wav \\482 --form file=@example.wav \

483 --form model=whisper-1 \\483 --form model=whisper-1 \

485 --form stream=True485 --form stream=True

486```486```


541```541```

542 542 

543```bash543```bash

544curl --request POST \\544curl --request POST \

545 --url https://api.openai.com/v1/audio/transcriptions \\545 --url https://api.openai.com/v1/audio/transcriptions \

546 --header "Authorization: Bearer $OPENAI_API_KEY" \\546 --header "Authorization: Bearer $OPENAI_API_KEY" \

547 --header 'Content-Type: multipart/form-data' \\547 --header 'Content-Type: multipart/form-data' \

548 --form file=@/path/to/file/speech.mp3 \\548 --form file=@/path/to/file/speech.mp3 \

549 --form model=whisper-1 \\549 --form model=whisper-1 \

550 --form prompt="ZyntriQix, Digique Plus, CynapseFive, VortiQore V8, EchoNix Array, OrbitalLink Seven, DigiFractal Matrix, PULSE, RAPT, B.R.I.C.K., Q.U.A.R.T.Z., F.L.I.N.T."550 --form prompt="ZyntriQix, Digique Plus, CynapseFive, VortiQore V8, EchoNix Array, OrbitalLink Seven, DigiFractal Matrix, PULSE, RAPT, B.R.I.C.K., Q.U.A.R.T.Z., F.L.I.N.T."

551```551```

552 552 


562Post-processing562Post-processing

563 563 

564```javascript564```javascript

565const systemPrompt = \`565const systemPrompt = `

566You are a helpful assistant for the company ZyntriQix. Your task is 566You are a helpful assistant for the company ZyntriQix. Your task is

567to correct any spelling discrepancies in the transcribed text. Make 567to correct any spelling discrepancies in the transcribed text. Make

568sure that the names of the following products are spelled correctly: 568sure that the names of the following products are spelled correctly:


570OrbitalLink Seven, DigiFractal Matrix, PULSE, RAPT, B.R.I.C.K., 570OrbitalLink Seven, DigiFractal Matrix, PULSE, RAPT, B.R.I.C.K.,

571Q.U.A.R.T.Z., F.L.I.N.T. Only add necessary punctuation such as 571Q.U.A.R.T.Z., F.L.I.N.T. Only add necessary punctuation such as

572periods, commas, and capitalization, and use only the context provided.572periods, commas, and capitalization, and use only the context provided.

573\`;573`;

574 574 

575const transcript = await transcribe(audioFile);575const transcript = await transcribe(audioFile);

576const completion = await openai.chat.completions.create({576const completion = await openai.chat.completions.create({

Details

1# Structured model outputs1# Structured model outputs

2 2 

3import {

4 snippetRefusalsChatCompletionsApi,

5 snippetRefusalsResponsesApi,

6} from "./inline-examples";

7 

8export const snippetRefusalApiResponseChatCompletionsApi = {

9 json: `

10{

11 "id": "chatcmpl-9nYAG9LPNonX8DAyrkwYfemr3C8HC",

12 "object": "chat.completion",

13 "created": 1721596428,

14 "model": "gpt-4o-2024-08-06",

15 "choices": [

16 {

17 "index": 0,

18 "message": {

19 "role": "assistant",

20 // highlight-start

21 "refusal": "I'm sorry, I cannot assist with that request."

22 // highlight-end

23 },

24 "logprobs": null,

25 "finish_reason": "stop"

26 }

27 ],

28 "usage": {

29 "prompt_tokens": 81,

30 "completion_tokens": 11,

31 "total_tokens": 92,

32 "completion_tokens_details": {

33 "reasoning_tokens": 0,

34 "accepted_prediction_tokens": 0,

35 "rejected_prediction_tokens": 0

36 }

37 },

38 "system_fingerprint": "fp_3407719c7f"

39}

40 `.trim(),

41};

42export const snippetRefusalApiResponseResponsesApi = {

43 json: `

44{

45 "id": "resp_1234567890",

46 "object": "response",

47 "created_at": 1721596428,

48 "status": "completed",

49 "completed_at": 1721596429,

50 "error": null,

51 "incomplete_details": null,

52 "input": [],

53 "instructions": null,

54 "max_output_tokens": null,

55 "model": "gpt-4o-2024-08-06",

56 "output": [{

57 "id": "msg_1234567890",

58 "type": "message",

59 "role": "assistant",

60 "content": [

61 // highlight-start

62 {

63 "type": "refusal",

64 "refusal": "I'm sorry, I cannot assist with that request."

65 }

66 // highlight-end

67 ]

68 }],

69 "usage": {

70 "input_tokens": 81,

71 "output_tokens": 11,

72 "total_tokens": 92,

73 "output_tokens_details": {

74 "reasoning_tokens": 0,

75 }

76 },

77}

78 `.trim(),

79};

80 

81JSON is one of the most widely used formats in the world for applications to exchange data.3JSON is one of the most widely used formats in the world for applications to exchange data.

82 4 

83Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied [JSON Schema](https://json-schema.org/overview/what-is-jsonschema), so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value.5Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied [JSON Schema](https://json-schema.org/overview/what-is-jsonschema), so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value.
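To make the two failure modes named above concrete, here is a minimal sketch of the kind of hand-written check that schema adherence makes unnecessary. It uses only the standard library. The schema fragment, the `check_against_schema` helper, and the sample payloads are hypothetical illustrations and are not part of the API.

```python
import json

# Hypothetical schema fragment: two required keys, one restricted to an enum.
SCHEMA = {
    "required": ["city", "unit"],
    "properties": {
        "city": {"type": "string"},
        "unit": {"enum": ["celsius", "fahrenheit"]},
    },
}


def check_against_schema(raw: str, schema: dict) -> list[str]:
    """Return a list of problems found in a JSON string (empty if none)."""
    data = json.loads(raw)
    problems = []
    # A missing required key is the first failure mode.
    for key in schema["required"]:
        if key not in data:
            problems.append(f"missing required key: {key}")
    # A value outside the allowed enum is the second failure mode.
    for key, rules in schema["properties"].items():
        if key in data and "enum" in rules and data[key] not in rules["enum"]:
            problems.append(f"invalid enum value for {key}: {data[key]!r}")
    return problems


good = '{"city": "Paris", "unit": "celsius"}'
missing = '{"city": "Paris"}'
bad_enum = '{"city": "Paris", "unit": "kelvin"}'

print(check_against_schema(good, SCHEMA))
print(check_against_schema(missing, SCHEMA))
print(check_against_schema(bad_enum, SCHEMA))
```

This covers only required keys and enum values. A real validator would also check types, nesting, and the other JSON Schema keywords.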


92 14 

93### Supported models15### Supported models

94 16 

95Structured Outputs is available in our [latest large language models](https://developers.openai.com/api/docs/models), starting with GPT-4o. Older models like `gpt-4-turbo` and earlier may use [JSON mode](#json-mode) instead.17Structured Outputs is available in our [latest large language models](https://developers.openai.com/api/docs/models), starting with GPT-4o. For new projects, start with [`gpt-5.5`](https://developers.openai.com/api/docs/models/gpt-5.5). Older models like `gpt-4-turbo` and earlier may use [JSON mode](#json-mode) instead.

96 18 

97 19 

98 20 

guides/text.md +6 −6


85```85```

86 86 

87```bash87```bash

88curl "https://api.openai.com/v1/responses" \\88curl "https://api.openai.com/v1/responses" \

89 -H "Content-Type: application/json" \\89 -H "Content-Type: application/json" \

90 -H "Authorization: Bearer $OPENAI_API_KEY" \\90 -H "Authorization: Bearer $OPENAI_API_KEY" \

91 -d '{91 -d '{

92 "model": "gpt-5.5",92 "model": "gpt-5.5",

93 "reasoning": {"effort": "low"},93 "reasoning": {"effort": "low"},


146```146```

147 147 

148```bash148```bash

149curl "https://api.openai.com/v1/responses" \\149curl "https://api.openai.com/v1/responses" \

150 -H "Content-Type: application/json" \\150 -H "Content-Type: application/json" \

151 -H "Authorization: Bearer $OPENAI_API_KEY" \\151 -H "Authorization: Bearer $OPENAI_API_KEY" \

152 -d '{152 -d '{

153 "model": "gpt-5.5",153 "model": "gpt-5.5",

154 "reasoning": {"effort": "low"},154 "reasoning": {"effort": "low"},

Details

60```60```

61 61 

62```bash62```bash

63curl https://api.openai.com/v1/audio/speech \\63curl https://api.openai.com/v1/audio/speech \

64 -H "Authorization: Bearer $OPENAI_API_KEY" \\64 -H "Authorization: Bearer $OPENAI_API_KEY" \

65 -H "Content-Type: application/json" \\65 -H "Content-Type: application/json" \

66 -d '{66 -d '{

67 "model": "gpt-4o-mini-tts",67 "model": "gpt-4o-mini-tts",

68 "input": "Today is a wonderful day to build something people love!",68 "input": "Today is a wonderful day to build something people love!",

69 "voice": "coral",69 "voice": "coral",

70 "instructions": "Speak in a cheerful and positive tone."70 "instructions": "Speak in a cheerful and positive tone."

71 }' \\71 }' \

72 --output speech.mp372 --output speech.mp3

73```73```

74 74 

75```cli75```cli

76openai audio:speech create \\76openai audio:speech create \

77 --model gpt-4o-mini-tts \\77 --model gpt-4o-mini-tts \

78 --voice coral \\78 --voice coral \

79 --instructions "Speak in a cheerful and positive tone." \\79 --instructions "Speak in a cheerful and positive tone." \

80 --input "Today is a wonderful day to build something people love!" \\80 --input "Today is a wonderful day to build something people love!" \

81 --output speech.mp381 --output speech.mp3

82```82```

83 83 


168```168```

169 169 

170```bash170```bash

171curl https://api.openai.com/v1/audio/speech \\171curl https://api.openai.com/v1/audio/speech \

172 -H "Authorization: Bearer $OPENAI_API_KEY" \\172 -H "Authorization: Bearer $OPENAI_API_KEY" \

173 -H "Content-Type: application/json" \\173 -H "Content-Type: application/json" \

174 -d '{174 -d '{

175 "model": "gpt-4o-mini-tts",175 "model": "gpt-4o-mini-tts",

176 "input": "Today is a wonderful day to build something people love!",176 "input": "Today is a wonderful day to build something people love!",

guides/tools.md +4 −20


1# Using tools1# Using tools

2 2 

3import {

4 File,

5 Functions,

6 ImageSquare,

7 Code,

8} from "@components/react/oai/platform/ui/Icon.react";

9 

10 

11 

12 

13 

14 

15 

16 

17 

18 

19When generating model responses or building agents, you can extend capabilities using built‑in tools, function calling, tool search, and remote MCP servers. These enable the model to search the web, retrieve from your files, load deferred tool definitions at runtime, call your own functions, or access third‑party services. Only `gpt-5.4` and later models support `tool_search`.3When generating model responses or building agents, you can extend capabilities using built‑in tools, function calling, tool search, and remote MCP servers. These enable the model to search the web, retrieve from your files, load deferred tool definitions at runtime, call your own functions, or access third‑party services. Only `gpt-5.4` and later models support `tool_search`.

20 4 

21 5 


89 Call a remote MCP server73 Call a remote MCP server

90 74 

91```bash75```bash

92curl https://api.openai.com/v1/responses \\ 76curl https://api.openai.com/v1/responses \

93-H "Content-Type: application/json" \\ 77-H "Content-Type: application/json" \

94-H "Authorization: Bearer $OPENAI_API_KEY" \\ 78-H "Authorization: Bearer $OPENAI_API_KEY" \

95-d '{79-d '{

96 "model": "gpt-5.5",80 "model": "gpt-5.5",

97 "tools": [81 "tools": [


304 description: "Get the weather for a given city.",288 description: "Get the weather for a given city.",

305 parameters: z.object({ city: z.string() }),289 parameters: z.object({ city: z.string() }),

306 async execute({ city }) {290 async execute({ city }) {

307 return \`The weather in \${city} is sunny.\`;291 return `The weather in ${city} is sunny.`;

308 },292 },

309});293});

310```294```

Details

1# Apply Patch1# Apply Patch

2 2 

3import {

4 CheckCircleFilled,

5 XCircle,

6} from "@components/react/oai/platform/ui/Icon.react";

7 

8 

9 

10The `apply_patch` tool lets GPT-5.1 create, update, and delete files in your codebase using structured diffs. Instead of just suggesting edits, the model emits patch operations that your application applies and then reports back on, enabling iterative, multi-step code editing workflows.3The `apply_patch` tool lets GPT-5.1 create, update, and delete files in your codebase using structured diffs. Instead of just suggesting edits, the model emits patch operations that your application applies and then reports back on, enabling iterative, multi-step code editing workflows.

11 4 

12## When to use5## When to use


47 40 

48**Step 1: Ask the model to plan and emit patches**41**Step 1: Ask the model to plan and emit patches**

49 42 

43Ask the model to plan and emit patches

44 

45```python

46from openai import OpenAI

47 

48client = OpenAI()

49 

50# For brevity, we are including file context in the example input.

51# Most agentic use cases should instead equip the model with tools

52# for exploring file system state.

53RESPONSE_INPUT = """

54The user has the following files:

55<BEGIN_FILES>

56===== lib/fib.py

57def fib(n):

58 if n <= 1:

59 return n

60 return fib(n-1) + fib(n-2)

61 

62===== run.py

63from lib.fib import fib

64 

65def main():

66 print(fib(42))

67<END_FILES>

68 

69You are a helpful coding assistant that should assist the user with whatever they

70ask.

71 

72User query:

73Help me rename the fib() function to fibonacci()

74"""

75 

76response = client.responses.create(

77 model="gpt-5.5",

78 input=RESPONSE_INPUT,

79 tools=[{"type": "apply_patch"}],

80)

81 

82# response.output may contain multiple apply_patch_call entries, e.g.:

83# - update lib/fib.py

84# - update run.py

85patch_calls = [

86 item for item in response.output

87 if item.type == "apply_patch_call"

88]

89```

90 

91 

50**Example `apply_patch_call` object**92**Example `apply_patch_call` object**

51 93 

94Example apply_patch_call object

95 

96```json

97{

98 "id": "apc_08f3d96c87a585390069118b594f7481a088b16cda7d9415fe",

99 "type": "apply_patch_call",

100 "status": "completed",

101 "call_id": "call_Rjsqzz96C5xzPb0jUWJFRTNW",

102 "operation": {

103 "type": "update_file",

104 "diff": "\n@@\n-def fib(n):\n+def fibonacci(n):\n     if n <= 1:\n         return n\n-    return fib(n-1) + fib(n-2)\n+    return fibonacci(n-1) + fibonacci(n-2)\n",

112 "path": "lib/fib.py"

113 }

114}

115```

116 

117 

52**Step 2: Apply the patch and send results back**118**Step 2: Apply the patch and send results back**

53 119 

120Apply the patch and return results

121 

122```python

123from apply_patch_harness import apply_operation # your implementation

124 

125results = []

126for call in patch_calls:

127 op = call.operation

128 success, maybe_log_output = apply_operation(op)

129 

130 results.append({

131 "type": "apply_patch_call_output",

132 "call_id": call.call_id,

133 "status": "completed" if success else "failed",

134 "output": maybe_log_output,

135 })

136 

137followup = client.responses.create(

138 model="gpt-5.5",

139 previous_response_id=response.id,

140 input=results,

141 tools=[{"type": "apply_patch"}],

142)

143```

144 

145 

54If a patch fails (for example, file not found), set `status: "failed"` and include a helpful `output` string so the model can recover:146If a patch fails (for example, file not found), set `status: "failed"` and include a helpful `output` string so the model can recover:

55 147 

148Report a failed apply_patch call

149 

150```json

151{

152 "type": "apply_patch_call_output",

153 "call_id": "call_cNWm41dB3RyQcLNOVTIPBWZU",

154 "status": "failed",

155 "output": "Could not apply patch to lib/foo.py — file not found on disk"

156}

157```
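One way to produce that failure report consistently is to wrap each patch attempt in a small helper that converts exceptions into an `apply_patch_call_output` item. The sketch below assumes your harness exposes each operation as a zero-argument callable that returns a log string. `run_patch` and `delete_missing_file` are hypothetical names used only for illustration.

```python
from typing import Callable


def run_patch(call_id: str, apply: Callable[[], str]) -> dict:
    """Run a patch-applying callable and wrap the outcome as an output item."""
    try:
        log = apply()
        status = "completed"
    except FileNotFoundError as exc:
        # Give the model the exact path so it can correct the next patch.
        status = "failed"
        log = f"Error: File not found at path '{exc.filename}'"
    except Exception as exc:
        status = "failed"
        log = f"Error: {exc}"
    return {
        "type": "apply_patch_call_output",
        "call_id": call_id,
        "status": status,
        "output": log,
    }


def delete_missing_file() -> str:
    # Simulate an operation against a file that does not exist on disk.
    raise FileNotFoundError(2, "No such file or directory", "lib/baz.py")


print(run_patch("call_abc", delete_missing_file))
print(run_patch("call_ok", lambda: "Updated lib/fib.py"))
```

Catching the broad `Exception` last keeps the loop running when one patch fails, so every call still gets an output item.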

158 

159 

56## Apply patch operations160## Apply patch operations

57 161 

58| Operation Type | Purpose | Payload |162| Operation Type | Purpose | Payload |


128 232 

129<div data-content-switcher-pane data-value="file-missing">233<div data-content-switcher-pane data-value="file-missing">

130 <div class="hidden">File not found</div>234 <div class="hidden">File not found</div>

235 File not found error

236 

237```json

238{

239 "type": "apply_patch_call_output",

240 "call_id": "call_abc",

241 "status": "failed",

242 "output": "Error: File not found at path 'lib/baz.py'"

243}

244```

245 

131 </div>246 </div>

132 <div data-content-switcher-pane data-value="patch-conflict" hidden>247 <div data-content-switcher-pane data-value="patch-conflict" hidden>

133 <div class="hidden">Patch conflict</div>248 <div class="hidden">Patch conflict</div>

249 Patch conflict error

250 

251```json

252{

253 "type": "apply_patch_call_output",

254 "call_id": "call_abc",

255 "status": "failed",

256 "output": "Error: Invalid Context:\n@@ def fib(n):"

257}

258```

259 

134 </div>260 </div>

135 261 

136 262 

Details

1# Code Interpreter1# Code Interpreter

2 2 

3import {

4 CheckCircleFilled,

5 XCircle,

6} from "@components/react/oai/platform/ui/Icon.react";

7 

8 

9 

10 

11The Code Interpreter tool allows models to write and run Python code in a sandboxed environment to solve complex problems in domains like data analysis, coding, and math. Use it for:3The Code Interpreter tool allows models to write and run Python code in a sandboxed environment to solve complex problems in domains like data analysis, coding, and math. Use it for:

12 4 

13- Processing files with diverse data and formatting5- Processing files with diverse data and formatting


20Use the Responses API with Code Interpreter12Use the Responses API with Code Interpreter

21 13 

22```bash14```bash

23curl https://api.openai.com/v1/responses \\15curl https://api.openai.com/v1/responses \

24 -H "Content-Type: application/json" \\16 -H "Content-Type: application/json" \

25 -H "Authorization: Bearer $OPENAI_API_KEY" \\17 -H "Authorization: Bearer $OPENAI_API_KEY" \

26 -d '{18 -d '{

27 "model": "gpt-5.5",19 "model": "gpt-5.5",

28 "tools": [{20 "tools": [{


38import OpenAI from "openai";30import OpenAI from "openai";

39const client = new OpenAI();31const client = new OpenAI();

40 32 

41const instructions = \`33const instructions = `

42You are a personal math tutor. When asked a math question,34You are a personal math tutor. When asked a math question,

43write and run code using the python tool to answer the question.35write and run code using the python tool to answer the question.

44\`;36`;

45 37 

46const resp = await client.responses.create({38const resp = await client.responses.create({

47 model: "gpt-5.5",39 model: "gpt-5.5",


101Use explicit container creation93Use explicit container creation

102 94 

103```bash95```bash

104curl https://api.openai.com/v1/containers \\96curl https://api.openai.com/v1/containers \

105 -H "Authorization: Bearer $OPENAI_API_KEY" \\97 -H "Authorization: Bearer $OPENAI_API_KEY" \

106 -H "Content-Type: application/json" \\98 -H "Content-Type: application/json" \

107 -d '{99 -d '{

108 "name": "My Container",100 "name": "My Container",

109 "memory_limit": "4g"101 "memory_limit": "4g"

110 }'102 }'

111 103 

112# Use the returned container id in the next call:104# Use the returned container id in the next call:

113curl https://api.openai.com/v1/responses \\105curl https://api.openai.com/v1/responses \

114 -H "Authorization: Bearer $OPENAI_API_KEY" \\106 -H "Authorization: Bearer $OPENAI_API_KEY" \

115 -H "Content-Type: application/json" \\107 -H "Content-Type: application/json" \

116 -d '{108 -d '{

117 "model": "gpt-5.5",109 "model": "gpt-5.5",

118 "tools": [{110 "tools": [{

Details

1# Computer use1# Computer use

2 2 

3import {

4 batchedComputerTurn,

5 captureScreenshotDocker,

6 captureScreenshotPlaywright,

7 codeExecutionHarnessExample,

8 computerLoop,

9 dockerfile,

10 handleActionsDocker,

11 handleActionsPlaywright,

12 handleActionsWithModifiersDocker,

13 handleActionsWithModifiersPlaywright,

14 legacyPreviewRequest,

15 firstComputerTurn,

16 modifierBatchedComputerTurn,

17 normalizeKeysDocker,

18 normalizeKeysPlaywright,

19 sendComputerRequest,

20 sendComputerScreenshot,

21 setupDocker,

22 setupPlaywright,

23} from "./cua-examples.js";

24 

25 

26 

27 

28 

29Computer use lets a model operate software through the user interface. It can inspect screenshots, return interface actions for your code to execute, or work through a custom harness that mixes visual and programmatic interaction with the UI.3Computer use lets a model operate software through the user interface. It can inspect screenshots, return interface actions for your code to execute, or work through a custom harness that mixes visual and programmatic interaction with the UI.

30 4 

31`gpt-5.4` includes new training for this kind of work, and future models will build on the same pattern. The model is designed to operate flexibly across a range of harness shapes, including the built-in Responses API `computer` tool, custom tools layered on top of existing automation harnesses, and code-execution environments that expose browser or desktop controls.5`gpt-5.4` includes new training for this kind of work, and future models will build on the same pattern. The model is designed to operate flexibly across a range of harness shapes, including the built-in Responses API `computer` tool, custom tools layered on top of existing automation harnesses, and code-execution environments that expose browser or desktop controls.


55 29 

56Then launch a browser instance:30Then launch a browser instance:

57 31 

32Start a browser instance

33 

34```javascript

35import { chromium } from "playwright";

36 

37const browser = await chromium.launch({

38 headless: false,

39 chromiumSandbox: true,

40 env: {},

41 args: ["--disable-extensions", "--disable-file-system"],

42});

43const page = await browser.newPage({

44 viewport: { width: 1280, height: 720 },

45});

46```

47 

48```python

49from playwright.sync_api import sync_playwright

50 

51 

52with sync_playwright() as p:

53 browser = p.chromium.launch(

54 headless=False,

55 chromium_sandbox=True,

56 env={},

57 args=["--disable-extensions", "--disable-file-system"],

58 )

59 page = browser.new_page(viewport={"width": 1280, "height": 720})

60```

61 

62 

58Set up a local virtual machine63Set up a local virtual machine

59 64 

60If you need a fuller desktop environment, run the model against a local VM or container and translate actions into OS-level input events.65If you need a fuller desktop environment, run the model against a local VM or container and translate actions into OS-level input events.


63 68 

64The following Dockerfile starts an Ubuntu desktop with Xvfb, `x11vnc`, and Firefox:69The following Dockerfile starts an Ubuntu desktop with Xvfb, `x11vnc`, and Firefox:

65 70 

71Dockerfile

72 

73```dockerfile

74FROM ubuntu:22.04

75ENV DEBIAN_FRONTEND=noninteractive

76 

77RUN apt-get update && apt-get install -y \

78 xfce4 \

79 xfce4-goodies \

80 x11vnc \

81 xvfb \

82 xdotool \

83 imagemagick \

84 x11-apps \

85 sudo \

86 software-properties-common \

87 firefox-esr \

88 && apt-get remove -y light-locker xfce4-screensaver xfce4-power-manager || true \

89 && apt-get clean && rm -rf /var/lib/apt/lists/*

90 

91RUN useradd -ms /bin/bash myuser \

92 && echo "myuser ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers

93USER myuser

94WORKDIR /home/myuser

95 

96RUN x11vnc -storepasswd secret /home/myuser/.vncpass

97 

98EXPOSE 5900

99CMD ["/bin/sh", "-c", "\

100 Xvfb :99 -screen 0 1280x800x24 >/dev/null 2>&1 & \

101 x11vnc -display :99 -forever -rfbauth /home/myuser/.vncpass -listen 0.0.0.0 -rfbport 5900 >/dev/null 2>&1 & \

102 export DISPLAY=:99 && \

103 startxfce4 >/dev/null 2>&1 & \

104 sleep 2 && echo 'Container running!' && \

105 tail -f /dev/null \

106"]

107```

108 

109 

66Build the image:110Build the image:

67 111 

68```bash112```bash


77 121 

78Create a helper for shelling into the container:122Create a helper for shelling into the container:

79 123 

124Execute commands on the container

125 

126```python

127import subprocess

128 

129 

130def docker_exec(cmd: str, container_name: str, decode: bool = True):

131 safe_cmd = cmd.replace('"', '\\"')

132 docker_cmd = f'docker exec {container_name} sh -c "{safe_cmd}"'

133 output = subprocess.check_output(docker_cmd, shell=True)

134 if decode:

135 return output.decode("utf-8", errors="ignore")

136 return output

137 

138 

139class VM:

140 def __init__(self, display: str, container_name: str):

141 self.display = display

142 self.container_name = container_name

143 

144 

145vm = VM(display=":99", container_name="cua-image")

146```

147 

148```javascript

149import { exec } from "node:child_process";

150import { promisify } from "node:util";

151 

152const execAsync = promisify(exec);

153 

154async function dockerExec(cmd, containerName, decode = true) {

155 const safeCmd = cmd.replace(/"/g, '\\"');

156 const dockerCmd = `docker exec ${containerName} sh -c "${safeCmd}"`;

157 const output = await execAsync(dockerCmd, {

158 encoding: decode ? "utf8" : "buffer",

159 });

160 return output.stdout;

161}

162 

163const vm = {

164 display: ":99",

165 containerName: "cua-image",

166};

167```

168 

169 

80Whether you use a browser or VM, treat screenshots, page text, tool outputs, PDFs, emails, chats, and other third-party content as untrusted input. Only direct instructions from the user count as permission.170Whether you use a browser or VM, treat screenshots, page text, tool outputs, PDFs, emails, chats, and other third-party content as untrusted input. Only direct instructions from the user count as permission.

81 171 

82## Choose an integration path172## Choose an integration path


109 199 

110Send the task in plain language and tell the model to use the computer tool for UI interaction.200Send the task in plain language and tell the model to use the computer tool for UI interaction.

111 201 

202Send a computer request

203 

204```javascript

205import OpenAI from "openai";

206 

207const client = new OpenAI();

208 

209const response = await client.responses.create({

210 model: "gpt-5.5",

211 tools: [{ type: "computer" }],

212 input:

213 "Check whether the Filters panel is open. If it is not open, click Show filters. Then type penguin in the search box. Use the computer tool for UI interaction.",

214});

215 

216console.log(JSON.stringify(response.output, null, 2));

217```

218 

219```python

220from openai import OpenAI

221 

222client = OpenAI()

223 

224response = client.responses.create(

225 model="gpt-5.5",

226 tools=[{"type": "computer"}],

227 input="Check whether the Filters panel is open. If it is not open, click Show filters. Then type penguin in the search box. Use the computer tool for UI interaction.",

228)

229 

230print(response.output)

231```

232 

233 

112The first turn often asks for a screenshot before the model commits to UI actions. That's normal.234The first turn often asks for a screenshot before the model commits to UI actions. That's normal.

113 235 

114### 2. Handle screenshot-first turns236### 2. Handle screenshot-first turns

115 237 

116When the model needs visual context, it returns a `computer_call` whose `actions[]` array contains a `screenshot` request:238When the model needs visual context, it returns a `computer_call` whose `actions[]` array contains a `screenshot` request:

117 239 

240Screenshot request

241 

242```json

243{

244 "output": [

245 {

246 "type": "computer_call",

247 "call_id": "call_001",

248 "actions": [

249 { "type": "screenshot" }

250 ],

251 "status": "completed"

252 }

253 ]

254}

255```
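A harness needs to recognize this shape before it does anything else. The sketch below assumes the response output has already been parsed into plain dictionaries that mirror the JSON above. `screenshot_call_ids` is a hypothetical helper name.

```python
def screenshot_call_ids(output: list[dict]) -> list[str]:
    """Return the call_ids of computer_call items that request a screenshot."""
    call_ids = []
    for item in output:
        # Skip messages and any other non-computer items.
        if item.get("type") != "computer_call":
            continue
        actions = item.get("actions", [])
        if any(action.get("type") == "screenshot" for action in actions):
            call_ids.append(item["call_id"])
    return call_ids


response_output = [
    {
        "type": "computer_call",
        "call_id": "call_001",
        "actions": [{"type": "screenshot"}],
        "status": "completed",
    }
]

print(screenshot_call_ids(response_output))
```

Each returned `call_id` is the one to echo back when you send the captured screenshot.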

256 

257 

118### 3. Run every returned action258### 3. Run every returned action

119 259 

120Later turns can batch actions into the same `computer_call`. Run them in order before taking the next screenshot.260Later turns can batch actions into the same `computer_call`. Run them in order before taking the next screenshot.
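The ordering rule can be sketched as a simple dispatcher. The action field names used here (`x`, `y`, `text`) and the `RecordingPage` class are assumptions made for illustration. Substitute your real browser or VM driver and the action shapes your responses actually contain.

```python
class RecordingPage:
    """Hypothetical stand-in for a browser page; it only records calls."""

    def __init__(self):
        self.log = []

    def click(self, x, y):
        self.log.append(("click", x, y))

    def type_text(self, text):
        self.log.append(("type", text))

    def screenshot(self):
        self.log.append(("screenshot",))


def run_actions(page, actions):
    """Execute each action in the order returned, then capture one screenshot."""
    for action in actions:
        kind = action["type"]
        if kind == "click":
            page.click(action["x"], action["y"])
        elif kind == "type":
            page.type_text(action["text"])
        elif kind == "screenshot":
            page.screenshot()
        else:
            # Fail loudly instead of silently skipping an unknown action.
            raise ValueError(f"unsupported action type: {kind}")
    # Only after the whole batch has run do we capture the next screenshot.
    page.screenshot()


page = RecordingPage()
run_actions(page, [
    {"type": "click", "x": 405, "y": 157},
    {"type": "type", "text": "penguin"},
])
print(page.log)
```

Raising on an unknown action type is a deliberate choice in this sketch: skipping an action would leave the UI in a state the model does not expect.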


127 267 

<div data-content-switcher-pane data-value="playwright">
  <div class="hidden">Playwright</div>
  Normalization helpers

```javascript
// Map model-emitted key names to the names Playwright expects.
const normalizeKey = (key) => {
  switch (key) {
    case "ENTER":
    case "RETURN":
      return "Enter";
    case "ESC":
    case "ESCAPE":
      return "Escape";
    case "TAB":
      return "Tab";
    case "SPACE":
      return "Space";
    case "BACKSPACE":
      return "Backspace";
    case "DELETE":
    case "DEL":
      return "Delete";
    case "HOME":
      return "Home";
    case "END":
      return "End";
    case "PAGEUP":
      return "PageUp";
    case "PAGEDOWN":
      return "PageDown";
    case "UP":
    case "ARROWUP":
      return "ArrowUp";
    case "DOWN":
    case "ARROWDOWN":
      return "ArrowDown";
    case "LEFT":
    case "ARROWLEFT":
      return "ArrowLeft";
    case "RIGHT":
    case "ARROWRIGHT":
      return "ArrowRight";
    case "CTRL":
    case "CONTROL":
      return "Control";
    case "SHIFT":
      return "Shift";
    case "OPTION":
    case "ALT":
      return "Alt";
    case "META":
    case "CMD":
    case "COMMAND":
      return "Meta";
    default:
      return key;
  }
};

// Accept drag paths as either [x, y] pairs or {x, y} objects.
const normalizeDragPath = (path) => {
  if (!Array.isArray(path)) {
    throw new Error("drag action requires a path array");
  }

  return path.map((point) => {
    if (Array.isArray(point) && point.length >= 2) {
      return [point[0], point[1]];
    }
    if (point && typeof point === "object" && "x" in point && "y" in point) {
      return [point.x, point.y];
    }
    throw new Error("drag path entries must be coordinate pairs or {x, y} objects");
  });
};
```

```python
def normalize_key(key):
    """Map model-emitted key names to the names Playwright expects."""
    key_map = {
        "ENTER": "Enter",
        "RETURN": "Enter",
        "ESC": "Escape",
        "ESCAPE": "Escape",
        "TAB": "Tab",
        "SPACE": "Space",
        "BACKSPACE": "Backspace",
        "DELETE": "Delete",
        "DEL": "Delete",
        "HOME": "Home",
        "END": "End",
        "PAGEUP": "PageUp",
        "PAGEDOWN": "PageDown",
        "UP": "ArrowUp",
        "DOWN": "ArrowDown",
        "LEFT": "ArrowLeft",
        "RIGHT": "ArrowRight",
        "ARROWUP": "ArrowUp",
        "ARROWDOWN": "ArrowDown",
        "ARROWLEFT": "ArrowLeft",
        "ARROWRIGHT": "ArrowRight",
        "CTRL": "Control",
        "CONTROL": "Control",
        "SHIFT": "Shift",
        "OPTION": "Alt",
        "ALT": "Alt",
        "META": "Meta",
        "CMD": "Meta",
        "COMMAND": "Meta",
    }
    return key_map.get(key, key)


def normalize_drag_path(path):
    """Accept drag paths as either [x, y] pairs or {x, y} objects."""
    if not isinstance(path, list):
        raise ValueError("drag action requires a path array")

    normalized = []
    for point in path:
        if isinstance(point, (list, tuple)) and len(point) >= 2:
            normalized.append((point[0], point[1]))
        elif isinstance(point, dict) and "x" in point and "y" in point:
            normalized.append((point["x"], point["y"]))
        else:
            raise ValueError(
                "drag path entries must be coordinate pairs or {x, y} objects"
            )
    return normalized
```

  </div>

  <div data-content-switcher-pane data-value="docker" hidden>
  <div class="hidden">Docker</div>
  Normalization helpers

```javascript
// Map model-emitted key names to the names xdotool expects.
const normalizeXdotoolKey = (key) => {
  switch (key) {
    case "ENTER":
    case "RETURN":
      return "Return";
    case "ESC":
    case "ESCAPE":
      return "Escape";
    case "TAB":
      return "Tab";
    case "SPACE":
      return "space";
    case "BACKSPACE":
      return "BackSpace";
    case "DELETE":
    case "DEL":
      return "Delete";
    case "HOME":
      return "Home";
    case "END":
      return "End";
    case "PAGEUP":
      return "Page_Up";
    case "PAGEDOWN":
      return "Page_Down";
    case "UP":
    case "ARROWUP":
      return "Up";
    case "DOWN":
    case "ARROWDOWN":
      return "Down";
    case "LEFT":
    case "ARROWLEFT":
      return "Left";
    case "RIGHT":
    case "ARROWRIGHT":
      return "Right";
    case "CTRL":
    case "CONTROL":
      return "ctrl";
    case "SHIFT":
      return "shift";
    case "OPTION":
    case "ALT":
      return "alt";
    case "META":
    case "CMD":
    case "COMMAND":
      return "super";
    default:
      return key;
  }
};

// Accept drag paths as either [x, y] pairs or {x, y} objects.
const normalizeDragPath = (path) => {
  if (!Array.isArray(path)) {
    throw new Error("drag action requires a path array");
  }

  return path.map((point) => {
    if (Array.isArray(point) && point.length >= 2) {
      return [point[0], point[1]];
    }
    if (point && typeof point === "object" && "x" in point && "y" in point) {
      return [point.x, point.y];
    }
    throw new Error("drag path entries must be coordinate pairs or {x, y} objects");
  });
};
```

```python
def normalize_xdotool_key(key):
    """Map model-emitted key names to the names xdotool expects."""
    key_map = {
        "ENTER": "Return",
        "RETURN": "Return",
        "ESC": "Escape",
        "ESCAPE": "Escape",
        "TAB": "Tab",
        "SPACE": "space",
        "BACKSPACE": "BackSpace",
        "DELETE": "Delete",
        "DEL": "Delete",
        "HOME": "Home",
        "END": "End",
        "PAGEUP": "Page_Up",
        "PAGEDOWN": "Page_Down",
        "UP": "Up",
        "DOWN": "Down",
        "LEFT": "Left",
        "RIGHT": "Right",
        "ARROWUP": "Up",
        "ARROWDOWN": "Down",
        "ARROWLEFT": "Left",
        "ARROWRIGHT": "Right",
        "CTRL": "ctrl",
        "CONTROL": "ctrl",
        "SHIFT": "shift",
        "OPTION": "alt",
        "ALT": "alt",
        "META": "super",
        "CMD": "super",
        "COMMAND": "super",
    }
    return key_map.get(key, key)


def normalize_drag_path(path):
    """Accept drag paths as either [x, y] pairs or {x, y} objects."""
    if not isinstance(path, list):
        raise ValueError("drag action requires a path array")

    normalized = []
    for point in path:
        if isinstance(point, (list, tuple)) and len(point) >= 2:
            normalized.append((point[0], point[1]))
        elif isinstance(point, dict) and "x" in point and "y" in point:
            normalized.append((point["x"], point["y"]))
        else:
            raise ValueError(
                "drag path entries must be coordinate pairs or {x, y} objects"
            )
    return normalized
```

  </div>

Batched actions in one turn

```json
{
  "output": [
    {
      "type": "computer_call",
      "call_id": "call_002",
      "actions": [
        { "type": "click", "button": "left", "x": 405, "y": 157 },
        { "type": "type", "text": "penguin" }
      ],
      "status": "completed"
    }
  ]
}
```

The following helpers show how to run a batch of actions in either environment:

<div data-content-switcher-pane data-value="playwright">
  <div class="hidden">Playwright</div>
  Execute Computer use actions

```javascript
// Reuse normalizeKey from the helper above.
// Reuse normalizeDragPath from the helper above.

async function handleComputerActions(page, actions) {
  for (const action of actions) {
    switch (action.type) {
      case "click":
        await page.mouse.click(action.x, action.y, {
          button: action.button ?? "left",
        });
        break;
      case "double_click":
        await page.mouse.dblclick(action.x, action.y, {
          button: action.button ?? "left",
        });
        break;
      case "drag": {
        const path = normalizeDragPath(action.path);
        if (path.length < 2) {
          throw new Error("drag action requires at least two path points");
        }
        const [[startX, startY], ...rest] = path;
        await page.mouse.move(startX, startY);
        await page.mouse.down();
        for (const [x, y] of rest) {
          await page.mouse.move(x, y);
        }
        await page.mouse.up();
        break;
      }
      case "move":
        await page.mouse.move(action.x, action.y);
        break;
      case "scroll":
        await page.mouse.move(action.x, action.y);
        await page.mouse.wheel(action.scrollX ?? 0, action.scrollY ?? 0);
        break;
      case "keypress":
        for (const key of action.keys) {
          await page.keyboard.press(normalizeKey(key));
        }
        break;
      case "type":
        await page.keyboard.type(action.text);
        break;
      case "wait":
        // Pause briefly so the page can settle before the next screenshot.
        await new Promise((resolve) => setTimeout(resolve, 2000));
        break;
      case "screenshot":
        // Nothing to do here; the caller captures a screenshot after the batch.
        break;
      default:
        throw new Error(`Unsupported action: ${action.type}`);
    }
  }
}
```

```python
import time

# Reuse normalize_key from the helper above.
# Reuse normalize_drag_path from the helper above.


def handle_computer_actions(page, actions):
    for action in actions:
        match action.type:
            case "click":
                page.mouse.click(
                    action.x,
                    action.y,
                    button=getattr(action, "button", "left"),
                )
            case "double_click":
                page.mouse.dblclick(
                    action.x,
                    action.y,
                    button=getattr(action, "button", "left"),
                )
            case "drag":
                path = normalize_drag_path(action.path)
                if len(path) < 2:
                    raise ValueError("drag action requires at least two path points")
                start_x, start_y = path[0]
                page.mouse.move(start_x, start_y)
                page.mouse.down()
                for x, y in path[1:]:
                    page.mouse.move(x, y)
                page.mouse.up()
            case "move":
                page.mouse.move(action.x, action.y)
            case "scroll":
                page.mouse.move(action.x, action.y)
                page.mouse.wheel(
                    getattr(action, "scrollX", 0),
                    getattr(action, "scrollY", 0),
                )
            case "keypress":
                for key in action.keys:
                    page.keyboard.press(normalize_key(key))
            case "type":
                page.keyboard.type(action.text)
            case "wait":
                # Pause briefly so the page can settle before the next screenshot.
                time.sleep(2)
            case "screenshot":
                # Nothing to do here; the caller captures a screenshot after the batch.
                pass
            case _:
                raise ValueError(f"Unsupported action: {action.type}")
```

  </div>

  <div data-content-switcher-pane data-value="docker" hidden>
  <div class="hidden">Docker</div>
  Execute Computer use actions

```javascript
// Reuse normalizeXdotoolKey from the helper above.
// Reuse normalizeDragPath from the helper above.

// Quote a value for the shell so apostrophes in typed text cannot break the
// command (assumes dockerExec runs the command through a shell).
const shellQuote = (value) => "'" + String(value).replace(/'/g, "'\\''") + "'";

async function handleComputerActions(vm, actions) {
  const buttonMap = { left: 1, middle: 2, right: 3 };

  for (const action of actions) {
    switch (action.type) {
      case "click": {
        const button = buttonMap[action.button ?? "left"] ?? 1;
        await dockerExec(
          `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y} click ${button}`,
          vm.containerName
        );
        break;
      }
      case "double_click": {
        const button = buttonMap[action.button ?? "left"] ?? 1;
        await dockerExec(
          `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y} click --repeat 2 ${button}`,
          vm.containerName
        );
        break;
      }
      case "drag": {
        const path = normalizeDragPath(action.path);
        if (path.length < 2) {
          throw new Error("drag action requires at least two path points");
        }
        const [[startX, startY], ...rest] = path;
        await dockerExec(
          `DISPLAY=${vm.display} xdotool mousemove ${startX} ${startY} mousedown 1`,
          vm.containerName
        );
        for (const [x, y] of rest) {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool mousemove ${x} ${y}`,
            vm.containerName
          );
        }
        await dockerExec(
          `DISPLAY=${vm.display} xdotool mouseup 1`,
          vm.containerName
        );
        break;
      }
      case "move":
        await dockerExec(
          `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y}`,
          vm.containerName
        );
        break;
      case "scroll": {
        // Default to 0 so a missing scrollY does not produce NaN clicks.
        const scrollY = action.scrollY ?? 0;
        const button = scrollY < 0 ? 4 : 5;
        const clicks = Math.max(1, Math.abs(Math.round(scrollY / 100)));
        await dockerExec(
          `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y}`,
          vm.containerName
        );
        for (let i = 0; i < clicks; i += 1) {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool click ${button}`,
            vm.containerName
          );
        }
        break;
      }
      case "keypress":
        for (const key of action.keys) {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool key '${normalizeXdotoolKey(key)}'`,
            vm.containerName
          );
        }
        break;
      case "type":
        await dockerExec(
          `DISPLAY=${vm.display} xdotool type --delay 0 ${shellQuote(action.text)}`,
          vm.containerName
        );
        break;
      case "wait":
        // Pause briefly so the desktop can settle before the next screenshot.
        await new Promise((resolve) => setTimeout(resolve, 2000));
        break;
      case "screenshot":
        // Nothing to do here; the caller captures a screenshot after the batch.
        break;
      default:
        throw new Error(`Unsupported action: ${action.type}`);
    }
  }
}
```

```python
import shlex
import time

# Reuse normalize_xdotool_key from the helper above.
# Reuse normalize_drag_path from the helper above.


def handle_computer_actions(vm, actions):
    button_map = {"left": 1, "middle": 2, "right": 3}

    for action in actions:
        match action.type:
            case "click":
                button = button_map.get(getattr(action, "button", "left"), 1)
                docker_exec(
                    f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y} click {button}",
                    vm.container_name,
                )
            case "double_click":
                button = button_map.get(getattr(action, "button", "left"), 1)
                docker_exec(
                    f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y} click --repeat 2 {button}",
                    vm.container_name,
                )
            case "drag":
                path = normalize_drag_path(action.path)
                if len(path) < 2:
                    raise ValueError("drag action requires at least two path points")
                start_x, start_y = path[0]
                docker_exec(
                    f"DISPLAY={vm.display} xdotool mousemove {start_x} {start_y} mousedown 1",
                    vm.container_name,
                )
                for x, y in path[1:]:
                    docker_exec(
                        f"DISPLAY={vm.display} xdotool mousemove {x} {y}",
                        vm.container_name,
                    )
                docker_exec(
                    f"DISPLAY={vm.display} xdotool mouseup 1",
                    vm.container_name,
                )
            case "move":
                docker_exec(
                    f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y}",
                    vm.container_name,
                )
            case "scroll":
                button = 4 if getattr(action, "scrollY", 0) < 0 else 5
                clicks = max(1, abs(round(getattr(action, "scrollY", 0) / 100)))

                docker_exec(
                    f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y}",
                    vm.container_name,
                )
                for _ in range(clicks):
                    docker_exec(
                        f"DISPLAY={vm.display} xdotool click {button}",
                        vm.container_name,
                    )
            case "keypress":
                for key in action.keys:
                    docker_exec(
                        f"DISPLAY={vm.display} xdotool key '{normalize_xdotool_key(key)}'",
                        vm.container_name,
                    )
            case "type":
                # shlex.quote keeps apostrophes in the text from breaking the
                # command (assumes docker_exec runs it through a shell).
                docker_exec(
                    f"DISPLAY={vm.display} xdotool type --delay 0 {shlex.quote(action.text)}",
                    vm.container_name,
                )
            case "wait":
                time.sleep(2)
            case "screenshot":
                pass
            case _:
                raise ValueError(f"Unsupported action: {action.type}")
```

  </div>
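The Docker helpers above turn `scrollY` into wheel clicks at roughly 100 pixels per click and do not act on `scrollX`. The sketch below shows one way to plan clicks for both axes. It is an illustration under stated assumptions, not part of the documented helpers: `scroll_to_wheel_clicks` and `pixels_per_click` are hypothetical names, and the use of buttons 6 and 7 for horizontal scrolling follows a common X11 convention that may not hold in every container image.

```python
def scroll_to_wheel_clicks(scroll_x=0, scroll_y=0, pixels_per_click=100):
    """Plan xdotool wheel clicks as (button, clicks) pairs.

    Assumed button convention: 4 = up, 5 = down, 6 = left, 7 = right.
    Axes with a zero delta are skipped entirely.
    """
    plan = []
    for delta, negative_button, positive_button in (
        (scroll_y, 4, 5),  # vertical axis
        (scroll_x, 6, 7),  # horizontal axis (assumed convention)
    ):
        if delta:
            # Any non-zero delta produces at least one click.
            clicks = max(1, abs(round(delta / pixels_per_click)))
            button = negative_button if delta < 0 else positive_button
            plan.append((button, clicks))
    return plan
```

Each pair can then be sent as `xdotool click --repeat <clicks> <button>`, or as repeated single clicks as in the helpers above. Skipping zero deltas avoids scrolling down by one click when the model only asked for horizontal movement.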

You may also need to map model-emitted key names such as `CTRL`, `ALT`, `META`, and `ARROWLEFT` to the names your runtime expects.
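As a minimal sketch, assuming key names might not always arrive in upper case (for example `ctrl` or `Ctrl` instead of `CTRL`), a lookup can upper-case the name before consulting the mapping and pass anything unknown through unchanged. `KEY_ALIASES` and `normalize_key_any_case` are hypothetical names, and the table is a small subset chosen for illustration with Playwright-style targets.

```python
# Hypothetical subset of a key-name mapping; extend it for your runtime.
KEY_ALIASES = {
    "CTRL": "Control",
    "CONTROL": "Control",
    "ALT": "Alt",
    "OPTION": "Alt",
    "META": "Meta",
    "CMD": "Meta",
    "LEFT": "ArrowLeft",
    "ARROWLEFT": "ArrowLeft",
}


def normalize_key_any_case(key: str) -> str:
    """Look up a key name case-insensitively; return unknown keys unchanged."""
    return KEY_ALIASES.get(key.upper(), key)
```

Unknown names, including single characters such as `a`, fall through untouched, so the runtime still receives them exactly as the model emitted them.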

Modifier-assisted action

```json
{
  "output": [
    {
      "type": "computer_call",
      "call_id": "call_003",
      "actions": [
        {
          "type": "click",
          "button": "left",
          "x": 405,
          "y": 157,
          "keys": ["SHIFT"]
        }
      ],
      "status": "completed"
    }
  ]
}
```

<div data-content-switcher-pane data-value="playwright">
  <div class="hidden">Playwright</div>
  Execute modifier-assisted Computer use actions

```javascript
// Reuse normalizeKey from the helper above.
// Reuse normalizeDragPath from the helper above.

// Hold the given modifier keys while the callback runs, then release them
// in reverse order even if the callback throws.
async function withModifiers(page, keys, callback) {
  const normalizedKeys = (keys ?? []).map(normalizeKey);
  const pressedKeys = [];

  try {
    for (const key of normalizedKeys) {
      await page.keyboard.down(key);
      pressedKeys.push(key);
    }

    await callback();
  } finally {
    for (const key of [...pressedKeys].reverse()) {
      await page.keyboard.up(key);
    }
  }
}

async function handleComputerActions(page, actions) {
  for (const action of actions) {
    switch (action.type) {
      case "click":
        await withModifiers(page, action.keys, async () => {
          await page.mouse.click(action.x, action.y, {
            button: action.button ?? "left",
          });
        });
        break;
      case "double_click":
        await withModifiers(page, action.keys, async () => {
          await page.mouse.dblclick(action.x, action.y, {
            button: action.button ?? "left",
          });
        });
        break;
      case "drag": {
        const path = normalizeDragPath(action.path);
        if (path.length < 2) {
          throw new Error("drag action requires at least two path points");
        }
        await withModifiers(page, action.keys, async () => {
          const [[startX, startY], ...rest] = path;
          await page.mouse.move(startX, startY);
          await page.mouse.down();
          for (const [x, y] of rest) {
            await page.mouse.move(x, y);
          }
          await page.mouse.up();
        });
        break;
      }
      case "move":
        await withModifiers(page, action.keys, async () => {
          await page.mouse.move(action.x, action.y);
        });
        break;
      case "scroll":
        await withModifiers(page, action.keys, async () => {
          await page.mouse.move(action.x, action.y);
          await page.mouse.wheel(action.scrollX ?? 0, action.scrollY ?? 0);
        });
        break;
      case "keypress":
        for (const key of action.keys) {
          await page.keyboard.press(normalizeKey(key));
        }
        break;
      case "type":
        await page.keyboard.type(action.text);
        break;
      case "wait":
        // Pause briefly so the page can settle before the next screenshot.
        await new Promise((resolve) => setTimeout(resolve, 2000));
        break;
      case "screenshot":
        break;
      default:
        throw new Error(`Unsupported action: ${action.type}`);
    }
  }
}
```

```python
import time

# Reuse normalize_key from the helper above.
# Reuse normalize_drag_path from the helper above.


def with_modifiers(page, keys, callback):
    """Hold modifier keys while callback runs; always release them afterwards."""
    normalized_keys = [normalize_key(key) for key in (keys or [])]
    pressed_keys = []

    try:
        for key in normalized_keys:
            page.keyboard.down(key)
            pressed_keys.append(key)

        callback()
    finally:
        for key in reversed(pressed_keys):
            page.keyboard.up(key)


def handle_computer_actions(page, actions):
    for action in actions:
        match action.type:
            case "click":
                with_modifiers(
                    page,
                    getattr(action, "keys", None),
                    lambda: page.mouse.click(
                        action.x,
                        action.y,
                        button=getattr(action, "button", "left"),
                    ),
                )
            case "double_click":
                with_modifiers(
                    page,
                    getattr(action, "keys", None),
                    lambda: page.mouse.dblclick(
                        action.x,
                        action.y,
                        button=getattr(action, "button", "left"),
                    ),
                )
            case "drag":
                path = normalize_drag_path(action.path)
                if len(path) < 2:
                    raise ValueError("drag action requires at least two path points")

                def do_drag():
                    start_x, start_y = path[0]
                    page.mouse.move(start_x, start_y)
                    page.mouse.down()
                    for x, y in path[1:]:
                        page.mouse.move(x, y)
                    page.mouse.up()

                with_modifiers(
                    page,
                    getattr(action, "keys", None),
                    do_drag,
                )
            case "move":
                with_modifiers(
                    page,
                    getattr(action, "keys", None),
                    lambda: page.mouse.move(action.x, action.y),
                )
            case "scroll":
                with_modifiers(
                    page,
                    getattr(action, "keys", None),
                    lambda: (
                        page.mouse.move(action.x, action.y),
                        page.mouse.wheel(
                            getattr(action, "scrollX", 0),
                            getattr(action, "scrollY", 0),
                        ),
                    ),
                )
            case "keypress":
                for key in action.keys:
                    page.keyboard.press(normalize_key(key))
            case "type":
                page.keyboard.type(action.text)
            case "wait":
                time.sleep(2)
            case "screenshot":
                pass
            case _:
                raise ValueError(f"Unsupported action: {action.type}")
```

  </div>

  <div data-content-switcher-pane data-value="docker" hidden>
  <div class="hidden">Docker</div>
  Execute modifier-assisted Computer use actions

```javascript
// Reuse normalizeXdotoolKey from the helper above.
// Reuse normalizeDragPath from the helper above.

// Quote a value for the shell so apostrophes in typed text cannot break the
// command (assumes dockerExec runs the command through a shell).
const shellQuote = (value) => "'" + String(value).replace(/'/g, "'\\''") + "'";

// Hold the given modifier keys while the callback runs, then release them
// in reverse order even if the callback throws.
async function withModifiers(vm, keys, callback) {
  const normalizedKeys = (keys ?? []).map(normalizeXdotoolKey);
  const pressedKeys = [];

  try {
    for (const key of normalizedKeys) {
      await dockerExec(
        `DISPLAY=${vm.display} xdotool keydown '${key}'`,
        vm.containerName
      );
      pressedKeys.push(key);
    }

    await callback();
  } finally {
    for (const key of [...pressedKeys].reverse()) {
      await dockerExec(
        `DISPLAY=${vm.display} xdotool keyup '${key}'`,
        vm.containerName
      );
    }
  }
}

async function handleComputerActions(vm, actions) {
  const buttonMap = { left: 1, middle: 2, right: 3 };

  for (const action of actions) {
    switch (action.type) {
      case "click": {
        const button = buttonMap[action.button ?? "left"] ?? 1;
        await withModifiers(vm, action.keys, async () => {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y} click ${button}`,
            vm.containerName
          );
        });
        break;
      }
      case "double_click": {
        const button = buttonMap[action.button ?? "left"] ?? 1;
        await withModifiers(vm, action.keys, async () => {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y} click --repeat 2 ${button}`,
            vm.containerName
          );
        });
        break;
      }
      case "drag": {
        const path = normalizeDragPath(action.path);
        if (path.length < 2) {
          throw new Error("drag action requires at least two path points");
        }
        await withModifiers(vm, action.keys, async () => {
          const [[startX, startY], ...rest] = path;
          await dockerExec(
            `DISPLAY=${vm.display} xdotool mousemove ${startX} ${startY} mousedown 1`,
            vm.containerName
          );
          for (const [x, y] of rest) {
            await dockerExec(
              `DISPLAY=${vm.display} xdotool mousemove ${x} ${y}`,
              vm.containerName
            );
          }
          await dockerExec(
            `DISPLAY=${vm.display} xdotool mouseup 1`,
            vm.containerName
          );
        });
        break;
      }
      case "move": {
        await withModifiers(vm, action.keys, async () => {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y}`,
            vm.containerName
          );
        });
        break;
      }
      case "scroll": {
        // Default to 0 so a missing scrollY does not produce NaN clicks.
        const scrollY = action.scrollY ?? 0;
        const button = scrollY < 0 ? 4 : 5;
        const clicks = Math.max(1, Math.abs(Math.round(scrollY / 100)));
        await withModifiers(vm, action.keys, async () => {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool mousemove ${action.x} ${action.y}`,
            vm.containerName
          );
          for (let i = 0; i < clicks; i += 1) {
            await dockerExec(
              `DISPLAY=${vm.display} xdotool click ${button}`,
              vm.containerName
            );
          }
        });
        break;
      }
      case "keypress":
        for (const key of action.keys) {
          await dockerExec(
            `DISPLAY=${vm.display} xdotool key '${normalizeXdotoolKey(key)}'`,
            vm.containerName
          );
        }
        break;
      case "type":
        await dockerExec(
          `DISPLAY=${vm.display} xdotool type --delay 0 ${shellQuote(action.text)}`,
          vm.containerName
        );
        break;
      case "wait":
        // Pause briefly so the desktop can settle before the next screenshot.
        await new Promise((resolve) => setTimeout(resolve, 2000));
        break;
      case "screenshot":
        break;
      default:
        throw new Error(`Unsupported action: ${action.type}`);
    }
  }
}
```

```python
import shlex
import time

# Reuse normalize_xdotool_key from the helper above.
# Reuse normalize_drag_path from the helper above.


def with_modifiers(vm, keys, callback):
    """Hold modifier keys while callback runs; always release them afterwards."""
    normalized_keys = [normalize_xdotool_key(key) for key in (keys or [])]
    pressed_keys = []

    try:
        for key in normalized_keys:
            docker_exec(
                f"DISPLAY={vm.display} xdotool keydown '{key}'",
                vm.container_name,
            )
            pressed_keys.append(key)

        callback()
    finally:
        for key in reversed(pressed_keys):
            docker_exec(
                f"DISPLAY={vm.display} xdotool keyup '{key}'",
                vm.container_name,
            )


def handle_computer_actions(vm, actions):
    button_map = {"left": 1, "middle": 2, "right": 3}

    for action in actions:
        match action.type:
            case "click":
                button = button_map.get(getattr(action, "button", "left"), 1)
                with_modifiers(
                    vm,
                    getattr(action, "keys", None),
                    lambda: docker_exec(
                        f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y} click {button}",
                        vm.container_name,
                    ),
                )
            case "double_click":
                button = button_map.get(getattr(action, "button", "left"), 1)
                with_modifiers(
                    vm,
                    getattr(action, "keys", None),
                    lambda: docker_exec(
                        f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y} click --repeat 2 {button}",
                        vm.container_name,
                    ),
                )
            case "drag":
                path = normalize_drag_path(action.path)
                if len(path) < 2:
                    raise ValueError("drag action requires at least two path points")

                def do_drag():
                    start_x, start_y = path[0]
                    docker_exec(
                        f"DISPLAY={vm.display} xdotool mousemove {start_x} {start_y} mousedown 1",
                        vm.container_name,
                    )
                    for x, y in path[1:]:
                        docker_exec(
                            f"DISPLAY={vm.display} xdotool mousemove {x} {y}",
                            vm.container_name,
                        )
                    docker_exec(
                        f"DISPLAY={vm.display} xdotool mouseup 1",
                        vm.container_name,
                    )

                with_modifiers(vm, getattr(action, "keys", None), do_drag)
            case "move":
                with_modifiers(
                    vm,
                    getattr(action, "keys", None),
                    lambda: docker_exec(
                        f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y}",
                        vm.container_name,
                    ),
                )
            case "scroll":
                button = 4 if getattr(action, "scrollY", 0) < 0 else 5
                clicks = max(1, abs(round(getattr(action, "scrollY", 0) / 100)))

                def do_scroll():
                    docker_exec(
                        f"DISPLAY={vm.display} xdotool mousemove {action.x} {action.y}",
                        vm.container_name,
                    )
                    for _ in range(clicks):
                        docker_exec(
                            f"DISPLAY={vm.display} xdotool click {button}",
                            vm.container_name,
                        )

                with_modifiers(vm, getattr(action, "keys", None), do_scroll)
            case "keypress":
                for key in action.keys:
                    docker_exec(
                        f"DISPLAY={vm.display} xdotool key '{normalize_xdotool_key(key)}'",
                        vm.container_name,
                    )
            case "type":
                # shlex.quote keeps apostrophes in the text from breaking the
                # command (assumes docker_exec runs it through a shell).
                docker_exec(
                    f"DISPLAY={vm.display} xdotool type --delay 0 {shlex.quote(action.text)}",
                    vm.container_name,
                )
            case "wait":
                time.sleep(2)
            case "screenshot":
                pass
            case _:
                raise ValueError(f"Unsupported action: {action.type}")
```

  </div>
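The handlers above send each entry of a `keypress` action's `keys` array as a separate key press. If your integration instead treats the array as one combination, such as Ctrl+A, xdotool accepts a chord written with `+` between key names. The following is a minimal sketch under that assumption: `xdotool_chord` is a hypothetical helper, and `XDOTOOL_KEYS` is a small illustrative subset of the mapping shown earlier.

```python
# Hypothetical subset of the xdotool key mapping, for illustration only.
XDOTOOL_KEYS = {
    "CTRL": "ctrl",
    "CONTROL": "ctrl",
    "SHIFT": "shift",
    "ALT": "alt",
    "OPTION": "alt",
    "META": "super",
    "CMD": "super",
    "ENTER": "Return",
}


def xdotool_chord(keys):
    """Join key names into one xdotool chord such as 'ctrl+shift+Return'."""
    if not keys:
        raise ValueError("keypress action requires at least one key")
    # Unknown names pass through unchanged, as in the helpers above.
    return "+".join(XDOTOOL_KEYS.get(key, key) for key in keys)
```

The result can be passed to a single `xdotool key` call, for example `xdotool key ctrl+a`. Whether a given `keys` array is meant as a chord or as a sequence is an assumption your integration has to settle.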

<div data-content-switcher-pane data-value="playwright">
  <div class="hidden">Playwright</div>
  Capture a screenshot

```javascript
async function captureScreenshot(page) {
  return await page.screenshot({ type: "png" });
}
```

```python
def capture_screenshot(page):
    return page.screenshot(type="png")
```

  </div>

  <div data-content-switcher-pane data-value="docker" hidden>
  <div class="hidden">Docker</div>
  Capture a screenshot

```javascript
async function captureScreenshot(vm) {
  return await dockerExec(
    `export DISPLAY=${vm.display} && import -window root png:-`,
    vm.containerName,
    false
  );
}
```

```python
def capture_screenshot(vm):
    return docker_exec(
        f"export DISPLAY={vm.display} && import -window root png:-",
        vm.container_name,
        decode=False,
    )
```

  </div>

For Computer use, prefer `detail: "original"` on screenshot inputs. This preserves the full screenshot resolution, up to 10.24M pixels, and improves click accuracy. If `detail: "original"` uses too many tokens, you can downscale the image before sending it to the API; if you do, remap model-generated coordinates from the downscaled coordinate space back to the original image's coordinate space. Avoid using `high` or `low` image detail for Computer use tasks. When downscaling, we observe strong performance with 1440x900 and 1600x900 desktop resolutions. See the [Images and Vision guide](https://developers.openai.com/api/docs/guides/images-vision) for more details on image input detail levels.
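The coordinate remapping mentioned above can be sketched as a simple proportional scale. This is an illustration, not a documented helper: `remap_to_original` and its parameters are hypothetical names, and it assumes the screenshot was resized without cropping or letterboxing.

```python
def remap_to_original(x, y, scaled_size, original_size):
    """Map a point from the downscaled screenshot back to original pixels.

    scaled_size and original_size are (width, height) tuples. The result is
    clamped so it always lands inside the original image.
    """
    scaled_w, scaled_h = scaled_size
    original_w, original_h = original_size
    mapped_x = round(x * original_w / scaled_w)
    mapped_y = round(y * original_h / scaled_h)
    return (
        min(max(mapped_x, 0), original_w - 1),
        min(max(mapped_y, 0), original_h - 1),
    )
```

For example, with a 2880x1800 display downscaled to 1440x900, a click the model places at (720, 450) maps to (1440, 900) on the real display. Apply the same mapping to every coordinate-bearing field, including each point in a drag path.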

Send the updated screenshot

```javascript
import OpenAI from "openai";

const client = new OpenAI();

async function sendComputerScreenshot(response, callId, screenshotBase64) {
  return await client.responses.create({
    model: "gpt-5.5",
    tools: [{ type: "computer" }],
    previous_response_id: response.id,
    input: [
      {
        type: "computer_call_output",
        call_id: callId,
        output: {
          type: "computer_screenshot",
          image_url: `data:image/png;base64,${screenshotBase64}`,
          detail: "original",
        },
      },
    ],
  });
}
```

```python
from openai import OpenAI

client = OpenAI()


def send_computer_screenshot(response, call_id, screenshot_base64):
    return client.responses.create(
        model="gpt-5.5",
        tools=[{"type": "computer"}],
        previous_response_id=response.id,
        input=[
            {
                "type": "computer_call_output",
                "call_id": call_id,
                "output": {
                    "type": "computer_screenshot",
                    "image_url": f"data:image/png;base64,{screenshot_base64}",
                    "detail": "original",
                },
            }
        ],
    )
```

### 5. Repeat until the tool stops calling

The easiest way to continue the loop is to send `previous_response_id` on each follow-up turn and keep reusing the same tool definition.

1436Repeat the Computer use loop

1437 

```javascript
import OpenAI from "openai";

const client = new OpenAI();

// handleComputerActions and captureScreenshot are assumed to be defined
// elsewhere in your harness.
async function computerUseLoop(target, response) {
  while (true) {
    const computerCall = response.output.find((item) => item.type === "computer_call");
    if (!computerCall) {
      // No more computer calls: the response holds the final answer or handoff.
      return response;
    }

    await handleComputerActions(target, computerCall.actions);

    const screenshot = await captureScreenshot(target);
    const screenshotBase64 = Buffer.from(screenshot).toString("base64");

    response = await client.responses.create({
      model: "gpt-5.5",
      tools: [{ type: "computer" }],
      previous_response_id: response.id,
      input: [
        {
          type: "computer_call_output",
          call_id: computerCall.call_id,
          output: {
            type: "computer_screenshot",
            image_url: `data:image/png;base64,${screenshotBase64}`,
            detail: "original",
          },
        },
      ],
    });
  }
}
```

```python
import base64

from openai import OpenAI

client = OpenAI()


# handle_computer_actions and capture_screenshot are assumed to be defined
# elsewhere in your harness.
def computer_use_loop(target, response):
    while True:
        computer_call = next(
            (item for item in response.output if item.type == "computer_call"),
            None,
        )
        if computer_call is None:
            # No more computer calls: the response holds the final answer or handoff.
            return response

        handle_computer_actions(target, computer_call.actions)

        screenshot = capture_screenshot(target)
        screenshot_base64 = base64.b64encode(screenshot).decode("utf-8")

        response = client.responses.create(
            model="gpt-5.5",
            tools=[{"type": "computer"}],
            previous_response_id=response.id,
            input=[
                {
                    "type": "computer_call_output",
                    "call_id": computer_call.call_id,
                    "output": {
                        "type": "computer_screenshot",
                        "image_url": f"data:image/png;base64,{screenshot_base64}",
                        "detail": "original",
                    },
                }
            ],
        )
```
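The loops above run until the model stops calling the tool. A harness may also want an upper bound on turns, much as the code-execution harness later in this guide does with `max_steps`. The following is a minimal sketch of that idea, not part of the API: `run_capped_loop`, `take_turn`, and `fake_response` are hypothetical names, and the responses are offline stand-ins so the snippet runs without network access.

```python
from types import SimpleNamespace


def run_capped_loop(response, take_turn, max_steps=25):
    """Repeat turns until no computer_call remains or max_steps turns were taken.

    take_turn(response, computer_call) is a hypothetical callback that runs the
    actions, captures a screenshot, and returns the next response.
    """
    turns = 0
    while True:
        computer_call = next(
            (item for item in response.output if item.type == "computer_call"),
            None,
        )
        if computer_call is None:
            return response
        if turns >= max_steps:
            raise RuntimeError(f"Still receiving computer_call after {max_steps} turns")
        response = take_turn(response, computer_call)
        turns += 1


def fake_response(*types):
    """Offline stand-in for an API response (illustration only)."""
    return SimpleNamespace(output=[SimpleNamespace(type=t) for t in types])


# Two scripted follow-up responses: one more call, then a plain message.
script = iter([fake_response("computer_call"), fake_response("message")])
final = run_capped_loop(fake_response("computer_call"), lambda r, c: next(script))
print([item.type for item in final.output])  # ['message']
```

Raising on the cap, rather than returning the last response, keeps a stuck run from being mistaken for a finished one.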

When the response no longer contains a `computer_call`, read the remaining output items as the model's final answer or handoff.
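As a rough illustration of reading those remaining items, the sketch below joins the text parts of every `message` item and ignores other item types. `final_text` is a hypothetical helper, and the sample response is an offline stand-in built with `SimpleNamespace`; the attribute names (`type`, `content`, `text`) follow the way the harness code in this guide reads message items.

```python
from types import SimpleNamespace


def final_text(response):
    """Join the text parts of all message items in a response."""
    chunks = []
    for item in response.output:
        if getattr(item, "type", None) != "message":
            continue  # skip reasoning items, tool calls, and so on
        for part in getattr(item, "content", None) or []:
            text = getattr(part, "text", None)
            if isinstance(text, str) and text:
                chunks.append(text)
    return "\n".join(chunks)


# Offline stand-in for a final response (illustration only).
sample = SimpleNamespace(
    output=[
        SimpleNamespace(type="reasoning", content=None),
        SimpleNamespace(
            type="message",
            content=[SimpleNamespace(type="output_text", text="The Filters panel is open.")],
        ),
    ]
)
print(final_text(sample))  # The Filters panel is open.
```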


### Possible Computer use actions



<div data-content-switcher-pane data-value="javascript">
  <div class="hidden">JavaScript</div>
  Code-execution harness

```typescript
// Run with:
//   bun run -i cua_code_mode.ts
// Override the user prompt with:
//   bun run -i cua_code_mode.ts --prompt "Go to example.com and summarize the page."
// Note: this script intentionally leaves the Playwright browser open after the
// model reaches a final answer. Because the browser/context are not closed,
// Bun stays alive until you close the browser or stop the process manually.

import OpenAI from "openai";
import readline from "node:readline/promises";
import vm from "node:vm";
import { chromium } from "playwright";
import util from "node:util";

async function main(
  prompt: string = "Go to Hacker News, click on the most interesting link (be prepared to justify your choice), take a screenshot, and give me a critique of the visual layout.",
  max_steps: number = 50,
  model: string = "gpt-5.5"
) {
  type Phase = null | "commentary" | "final_answer";
  const client = new OpenAI();
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout,
  });
  const browser = await chromium.launch({
    headless: false,
    args: ["--window-size=1440,900"],
  });
  const context = await browser.newContext({
    viewport: { width: 1440, height: 900 },
  });
  const page = await context.newPage();

  const conversation: any[] = [];
  // Collects text and image output produced by one exec_js call.
  const js_output: any[] = [];
  // Everything the model's code can reach lives in this sandbox object.
  const sandbox: Record<string, any> = {
    console: {
      log: (...xs: any[]) => {
        js_output.push({
          type: "input_text",
          text: util.formatWithOptions(
            { showHidden: false, getters: false, maxStringLength: 2000 },
            ...xs
          ),
        });
      },
    },
    browser: browser,
    context: context,
    page: page,
    display: (base64_image: string) => {
      js_output.push({
        type: "input_image",
        image_url: `data:image/png;base64,${base64_image}`,
        detail: "original",
      });
    },
  };
  const ctx = vm.createContext(sandbox);

  conversation.push({
    role: "user",
    content: prompt,
  });

  for (let i = 0; i < max_steps; i++) {
    const resp = await client.responses.create({
      model,
      tools: [
        {
          type: "function" as const,
          name: "exec_js",
          description:
            "Execute provided interactive JavaScript in a persistent REPL context.",
          parameters: {
            type: "object",
            properties: {
              code: {
                type: "string",
                description: `
JavaScript to execute. Write small snippets of interactive code. To persist variables or functions across tool calls, you must save them to globalThis. Code is executed in an async node:vm context, so you can use await. You have access to ONLY the following:
- console.log(x): Use this to read contents back to you. But be minimal: otherwise the output may be too long. Avoid using console.log() for large base64 payloads like screenshots or buffer. If you create an image or screenshot, pass the base64 string to display().
- display(base64_image_string): Use this to view a base64-encoded image.
- Do not write screenshots or image data to temporary files or disk just to pass them back. Keep image data in memory and send it directly to display().
- Do not assume package globals like Bun.file are available unless they are explicitly provided.
- browser: A playwright chromium browser instance.
- context: A playwright browser context with viewport 1440x900.
- page: A playwright page already created in that context.
`,
              },
            },
            required: ["code"],
            additionalProperties: false,
          },
        },
        {
          type: "function" as const,
          name: "ask_user",
          description:
            "Ask the user a clarification question and wait for their response.",
          parameters: {
            type: "object",
            properties: {
              question: {
                type: "string",
                description:
                  "The exact question to show the human. Use this instead of answering with a freeform clarifying question in a final answer.",
              },
            },
            required: ["question"],
            additionalProperties: false,
          },
        },
      ],
      input: conversation,
      reasoning: {
        effort: "low",
      },
    });

    // Save model outputs into the running conversation
    conversation.push(...resp.output);

    let hadToolCall = false;
    let latestPhase: Phase = null;

    // Handle tool calls
    for (const item of resp.output) {
      if (item.type === "function_call" && item.name === "exec_js") {
        hadToolCall = true;
        const parsed = JSON.parse(item.arguments ?? "{}") as {
          code?: string;
        };
        const code = parsed.code ?? "";
        console.log(code);
        console.log("----");
        // Wrap the snippet in an async IIFE so it can use await.
        const wrappedCode = `
            (async () => {
              ${code}
            })();
          `;

        try {
          await new vm.Script(wrappedCode, {
            filename: "exec_js.js",
          }).runInContext(ctx);
        } catch (e: any) {
          // Report errors to the model through the same output channel.
          sandbox.console.log(e, e?.message, e?.stack);
        }

        // Send tool output back to the model, keyed by call_id
        conversation.push({
          type: "function_call_output",
          call_id: item.call_id,
          output: js_output.slice(),
        });

        for (const out of js_output) {
          if (out.type === "input_text") {
            console.log("JS LOG:", out.text);
          } else if (out.type === "input_image") {
            console.log("JS IMAGE: [base64 string omitted]");
          }
        }
        console.log("=====");

        js_output.length = 0;
      } else if (item.type === "function_call" && item.name === "ask_user") {
        hadToolCall = true;
        const parsed = JSON.parse(item.arguments ?? "{}") as {
          question?: string;
        };
        const question = parsed.question ?? "Please provide more information.";
        console.log(`MODEL QUESTION: ${question}`);
        const answer = await rl.question("> ");
        conversation.push({
          type: "function_call_output",
          call_id: item.call_id,
          output: answer,
        });
      } else if (item.type === "message") {
        console.log(item.content[0]?.text ?? item.content);
        if ("phase" in item) {
          latestPhase = (item.phase as Phase) ?? null;
        }
      } else if (item.type === "output_item.done" && "phase" in item) {
        latestPhase = (item.phase as Phase) ?? null;
      }
    }

    // Stop only when the model explicitly marks the turn as a final answer
    // and there were no tool calls in the same turn.
    if (!hadToolCall && latestPhase === "final_answer") return;
  }
}

function getCliPrompt(): string | undefined {
  const args = Bun.argv.slice(2);
  for (let i = 0; i < args.length; i++) {
    if (args[i] === "--prompt") {
      return args[i + 1];
    }
  }
  return undefined;
}

main(getCliPrompt());
```


  </div>
  <div data-content-switcher-pane data-value="python" hidden>
    <div class="hidden">Python</div>
    Code-execution harness


```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#   "openai",
#   "playwright",
# ]
# ///
# Run with: `uv run cua_code_mode_py_async.py`
# Override the user prompt with:
#   `uv run cua_code_mode_py_async.py --prompt "Go to example.com and summarize the page."`
# Install Chromium once first: `uv run --with playwright python -m playwright install chromium`
# Requires `OPENAI_API_KEY` in the environment.

"""Async Python analogue of cua_code_mode.ts.

Runs a Responses API loop with one persistent Playwright browser/context/page,
and tools that let the model execute short async Python snippets and ask the
user clarifying questions.

The model can return visual observations by calling:
    display(base64_png_string)
"""

from __future__ import annotations

import argparse
import asyncio
import json
import traceback
from typing import Any

from openai import OpenAI
from playwright.async_api import async_playwright

Phase = str | None


def _message_text(item: Any) -> str:
    try:
        parts = getattr(item, "content", None)
        if isinstance(parts, list) and parts:
            out: list[str] = []
            for p in parts:
                t = getattr(p, "text", None)
                if isinstance(t, str) and t:
                    out.append(t)
            if out:
                return "\n".join(out)
    except Exception:
        pass
    return str(item)


async def _ainput(prompt: str) -> str:
    return await asyncio.to_thread(input, prompt)


async def main(
    prompt: str = "Go to Hacker News, click on the most interesting link (be prepared to justify your choice), take a screenshot, and give me a critique of the visual layout.",
    max_steps: int = 20,
    model: str = "gpt-5.5",
) -> None:
    client = OpenAI()

    async with async_playwright() as p:
        browser = await p.chromium.launch(
            headless=False,
            args=["--window-size=1440,900"],
        )
        context = await browser.new_context(viewport={"width": 1440, "height": 900})
        page = await context.new_page()

        conversation: list[dict[str, Any]] = [{"role": "user", "content": prompt}]
        py_output: list[dict[str, Any]] = []

        def log(*xs: Any) -> None:
            text = " ".join(str(x) for x in xs)
            py_output.append({"type": "input_text", "text": text[:5000]})

        def display(base64_image: str) -> None:
            py_output.append(
                {
                    "type": "input_image",
                    "image_url": f"data:image/png;base64,{base64_image}",
                    "detail": "original",
                }
            )

        runtime_globals: dict[str, Any] = {
            "__builtins__": __builtins__,
            "asyncio": asyncio,
            "browser": browser,
            "context": context,
            "page": page,
            "display": display,
            "log": log,
        }

        for _ in range(max_steps):
            resp = client.responses.create(
                model=model,
                tools=[
                    {
                        "type": "function",
                        "name": "exec_py",
                        "description": "Execute provided interactive async Python in a persistent runtime context.",
                        "parameters": {
                            "type": "object",
                            "properties": {
                                "code": {
                                    "type": "string",
                                    "description": (
                                        "Python code to execute. Write small snippets. "
                                        "State persists across tool calls via globals(). "
                                        "This runtime uses Playwright's async Python API, so you may use await directly. "
                                        "Do not call asyncio.run(...), loop.run_until_complete(...), or manage the event loop yourself. "
                                        "You can use ONLY these prebound objects/helpers: "
                                        "log(x) for text output, display(base64_png_string) for image output, "
                                        "browser (async Playwright browser), context (viewport 1440x900), page (already created), "
                                        "asyncio (module). "
                                        "Be concise with log(x): do not send large base64 payloads, screenshots, buffers, page HTML, "
                                        "or other large blobs through log(). If you create an image or screenshot, pass the base64 PNG "
                                        "string to display(). Do not write screenshots or image data to temporary files or disk just "
                                        "to pass them back; keep image data in memory and send it directly to display(). "
                                        "Do not assume extra globals or helpers are available unless they are explicitly listed here. "
                                        "Do not close browser/context/page unless explicitly asked."
                                    ),
                                }
                            },
                            "required": ["code"],
                            "additionalProperties": False,
                        },
                    },
                    {
                        "type": "function",
                        "name": "ask_user",
                        "description": "Ask the user a clarification question and wait for their response.",
                        "parameters": {
                            "type": "object",
                            "properties": {
                                "question": {
                                    "type": "string",
                                    "description": "The exact question to show the user. Use this instead of asking a freeform clarifying question in a final answer.",
                                }
                            },
                            "required": ["question"],
                            "additionalProperties": False,
                        },
                    },
                ],
                input=conversation,
            )

            conversation.extend(resp.output)

            had_tool_call = False
            latest_phase: Phase = None

            for item in resp.output:
                item_type = getattr(item, "type", None)

                if item_type == "function_call" and getattr(item, "name", None) == "exec_py":
                    had_tool_call = True
                    raw_args = getattr(item, "arguments", "{}") or "{}"
                    try:
                        args = json.loads(raw_args)
                    except json.JSONDecodeError:
                        args = {}
                    code = args.get("code", "") if isinstance(args, dict) else ""

                    print(code)
                    print("----")

                    wrapped = (
                        "async def __codex_exec__():\n"
                        + "".join(
                            f"    {line}\n" if line else "    \n"
                            for line in (code or "pass").splitlines()
                        )
                    )

                    try:
                        exec(wrapped, runtime_globals, runtime_globals)
                        await runtime_globals["__codex_exec__"]()
                    except Exception:
                        log(traceback.format_exc())

                    conversation.append(
                        {
                            "type": "function_call_output",
                            "call_id": getattr(item, "call_id", None),
                            "output": py_output[:],
                        }
                    )

                    for out in py_output:
                        if out.get("type") == "input_text":
                            print("PY LOG:", out.get("text", ""))
                        elif out.get("type") == "input_image":
                            print("PY IMAGE: [base64 string omitted]")
                    print("=====")

                    py_output.clear()

                elif item_type == "function_call" and getattr(item, "name", None) == "ask_user":
                    had_tool_call = True
                    raw_args = getattr(item, "arguments", "{}") or "{}"
                    try:
                        args = json.loads(raw_args)
                    except json.JSONDecodeError:
                        args = {}
                    question = (
                        args.get("question", "Please provide more information.")
                        if isinstance(args, dict)
                        else "Please provide more information."
                    )

                    print(f"MODEL QUESTION: {question}")
                    answer = await _ainput("> ")

                    conversation.append(
                        {
                            "type": "function_call_output",
                            "call_id": getattr(item, "call_id", None),
                            "output": answer,
                        }
                    )

                elif item_type == "message":
                    print(_message_text(item))
                    phase = getattr(item, "phase", None)
                    if isinstance(phase, str) or phase is None:
                        latest_phase = phase
                elif item_type == "output_item.done":
                    phase = getattr(item, "phase", None)
                    if isinstance(phase, str) or phase is None:
                        latest_phase = phase

            if not had_tool_call and latest_phase == "final_answer":
                return


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--prompt", help="Override the default user prompt.")
    args = parser.parse_args()
    asyncio.run(main(prompt=args.prompt) if args.prompt is not None else main())
```

  </div>



380 2632 

381The older request shape looked like this:2633The older request shape looked like this:

382 2634 

2635Legacy preview request

2636 

2637```javascript

2638import OpenAI from "openai";

2639 

2640const client = new OpenAI();

2641 

2642const response = await client.responses.create({

2643 model: "computer-use-preview",

2644 tools: [

2645 {

2646 type: "computer_use_preview",

2647 display_width: 1024,

2648 display_height: 768,

2649 environment: "browser",

2650 },

2651 ],

2652 input: "Check whether the Filters panel is open.",

2653 truncation: "auto",

2654});

2655```

2656 

2657```python

2658from openai import OpenAI

2659 

2660client = OpenAI()

2661 

2662response = client.responses.create(

2663 model="computer-use-preview",

2664 tools=[

2665 {

2666 "type": "computer_use_preview",

2667 "display_width": 1024,

2668 "display_height": 768,

2669 "environment": "browser",

2670 }

2671 ],

2672 input="Check whether the Filters panel is open.",

2673 truncation="auto",

2674)

2675```

2676 

2677 

383Keep the preview path only to maintain older integrations. For new implementations, use the GA flow described above.2678Keep the preview path only to maintain older integrations. For new implementations, use the GA flow described above.

384 2679 

385## Keep a human in the loop2680## Keep a human in the loop

Details

1# MCP and Connectors1# MCP and Connectors

2 2 

3import {

4 CheckCircleFilled,

5 XCircle,

6} from "@components/react/oai/platform/ui/Icon.react";

7 

8 

9 

10 

11 

12 

13In addition to tools you make available to the model with [function calling](https://developers.openai.com/api/docs/guides/function-calling), you can give models new capabilities using **connectors** and **remote MCP servers**. These tools give the model the ability to connect to and control external services when needed to respond to a user's prompt. These tool calls can either be allowed automatically, or restricted with explicit approval required by you as the developer.3In addition to tools you make available to the model with [function calling](https://developers.openai.com/api/docs/guides/function-calling), you can give models new capabilities using **connectors** and **remote MCP servers**. These tools give the model the ability to connect to and control external services when needed to respond to a user's prompt. These tool calls can either be allowed automatically, or restricted with explicit approval required by you as the developer.

14 4 

15- **Connectors** are OpenAI-maintained MCP wrappers for popular services like Google Workspace or Dropbox, like the connectors available in [ChatGPT](https://chatgpt.com).5- **Connectors** are OpenAI-maintained MCP wrappers for popular services like Google Workspace or Dropbox, like the connectors available in [ChatGPT](https://chatgpt.com).


38 Using a remote MCP server in the Responses API28 Using a remote MCP server in the Responses API

39 29 

40```bash30```bash

41curl https://api.openai.com/v1/responses \\ 31curl https://api.openai.com/v1/responses \

42-H "Content-Type: application/json" \\ 32-H "Content-Type: application/json" \

43-H "Authorization: Bearer $OPENAI_API_KEY" \\ 33-H "Authorization: Bearer $OPENAI_API_KEY" \

44-d '{34-d '{

45 "model": "gpt-5.5",35 "model": "gpt-5.5",

46 "tools": [36 "tools": [


138 Using connectors in the Responses API128 Using connectors in the Responses API

139 129 

140```bash130```bash

141curl https://api.openai.com/v1/responses \\131curl https://api.openai.com/v1/responses \

142-H "Content-Type: application/json" \\132-H "Content-Type: application/json" \

143-H "Authorization: Bearer $OPENAI_API_KEY" \\133-H "Authorization: Bearer $OPENAI_API_KEY" \

144-d '{134-d '{

145 "model": "gpt-5.5",135 "model": "gpt-5.5",

146 "tools": [136 "tools": [


324Constrain allowed tools314Constrain allowed tools

325 315 

326```bash316```bash

327curl https://api.openai.com/v1/responses \\317curl https://api.openai.com/v1/responses \

328-H "Content-Type: application/json" \\318-H "Content-Type: application/json" \

329-H "Authorization: Bearer $OPENAI_API_KEY" \\319-H "Authorization: Bearer $OPENAI_API_KEY" \

330-d '{320-d '{

331 "model": "gpt-5.5",321 "model": "gpt-5.5",

332 "tools": [322 "tools": [


448Approving the use of tools in an API request438Approving the use of tools in an API request

449 439 

450```bash440```bash

451curl https://api.openai.com/v1/responses \\441curl https://api.openai.com/v1/responses \

452-H "Content-Type: application/json" \\442-H "Content-Type: application/json" \

453-H "Authorization: Bearer $OPENAI_API_KEY" \\443-H "Authorization: Bearer $OPENAI_API_KEY" \

454-d '{444-d '{

455 "model": "gpt-5.5",445 "model": "gpt-5.5",

456 "tools": [446 "tools": [


559Never require approval for some tools549Never require approval for some tools

560 550 

561```bash551```bash

562curl https://api.openai.com/v1/responses \\552curl https://api.openai.com/v1/responses \

563-H "Content-Type: application/json" \\553-H "Content-Type: application/json" \

564-H "Authorization: Bearer $OPENAI_API_KEY" \\554-H "Authorization: Bearer $OPENAI_API_KEY" \

565-d '{555-d '{

566 "model": "gpt-5.5",556 "model": "gpt-5.5",

567 "tools": [557 "tools": [


660Use Stripe MCP tool650Use Stripe MCP tool

661 651 

662```bash652```bash

663curl https://api.openai.com/v1/responses \\653curl https://api.openai.com/v1/responses \

664-H "Content-Type: application/json" \\654-H "Content-Type: application/json" \

665-H "Authorization: Bearer $OPENAI_API_KEY" \\655-H "Authorization: Bearer $OPENAI_API_KEY" \

666-d '{656-d '{

667 "model": "gpt-5.5",657 "model": "gpt-5.5",

668 "input": "Create a payment link for $20",658 "input": "Create a payment link for $20",


782Use the Google Calendar connector772Use the Google Calendar connector

783 773 

784```bash774```bash

785curl https://api.openai.com/v1/responses \\775curl https://api.openai.com/v1/responses \

786 -H "Content-Type: application/json" \\776 -H "Content-Type: application/json" \

787 -H "Authorization: Bearer $OPENAI_API_KEY" \\777 -H "Authorization: Bearer $OPENAI_API_KEY" \

788 -d '{778 -d '{

789 "model": "gpt-5.5",779 "model": "gpt-5.5",

790 "tools": [780 "tools": [

Details

352for await (const event of stream) {352for await (const event of stream) {

353 if (event.type === "response.image_generation_call.partial_image") {353 if (event.type === "response.image_generation_call.partial_image") {

354 const idx = event.partial_image_index;354 const idx = event.partial_image_index;

355 saveBase64Image(\`river-partial-\${idx}.png\`, event.partial_image_b64);355 saveBase64Image(`river-partial-${idx}.png`, event.partial_image_b64);

356 } else if (event.type === "response.completed") {356 } else if (event.type === "response.completed") {

357 const imageData = event.response.output357 const imageData = event.response.output

358 .filter((output) => output.type === "image_generation_call")358 .filter((output) => output.type === "image_generation_call")

Details

366 {366 {

367 type: "input_file",367 type: "input_file",

368 filename: "report.csv",368 filename: "report.csv",

369 file_data: \`data:text/csv;base64,\${reportCsv}\`,369 file_data: `data:text/csv;base64,${reportCsv}`,

370 },370 },

371 {371 {

372 type: "input_text",372 type: "input_text",


628- `shell_call`: commands requested by the model.628- `shell_call`: commands requested by the model.

629- `shell_call_output`: command output and exit outcomes.629- `shell_call_output`: command output and exit outcomes.

630 630 

631Example shell_call item

632 

633```json

634{

635 "type": "shell_call",

636 "call_id": "call_9d14ac6f2b73485e91c0f4da6e1b27c8",

637 "action": {

638 "commands": ["ls -l"],

639 "timeout_ms": 120000,

640 "max_output_length": 4096

641 },

642 "status": "in_progress"

643}

644```

645 

646 

631## Local shell mode647## Local shell mode

632 648 

633You can also run shell commands in your own local runtime by executing `shell_call` actions and sending `shell_call_output` back to the model.649You can also run shell commands in your own local runtime by executing `shell_call` actions and sending `shell_call_output` back to the model.


640- Capture `stdout`, `stderr`, and outcome.656- Capture `stdout`, `stderr`, and outcome.

641- Return results as `shell_call_output` in the next request.657- Return results as `shell_call_output` in the next request.

642 658 

659Example shell_call_output payload

660 

661```json

662{

663 "type": "shell_call_output",

664 "call_id": "call_3ef1b8c79a4d6520f9e3ab7d41c68f25",

665 "max_output_length": 4096,

666 "output": [

667 {

668 "stdout": "...",

669 "stderr": "...",

670 "outcome": {

671 "type": "exit",

672 "exit_code": 0

673 }

674 },

675 {

676 "stdout": "...",

677 "stderr": "...",

678 "outcome": {

679 "type": "timeout"

680 }

681 }

682 ]

683}

684```

685 

686 

643For legacy migration details, see the older [Local shell guide](https://developers.openai.com/api/docs/guides/tools-local-shell).687For legacy migration details, see the older [Local shell guide](https://developers.openai.com/api/docs/guides/tools-local-shell).

644 688 

645## Use local shell with Agents SDK689## Use local shell with Agents SDK

Details

12 12 

13Skills are compatible with the open [Agent Skills standard](https://agentskills.io/home).13Skills are compatible with the open [Agent Skills standard](https://agentskills.io/home).

14 14 

15Example SKILL.md

16 

17```markdown

18---

19name: basic-math

20description: Add or multiply numbers.

21---

22 

23Use this skill when you need a quick sum or product of numbers.

24```

25 

26 
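Before uploading, it can help to check that a `SKILL.md` file carries the `name` and `description` fields shown in the example. The Python sketch below is only an illustration: `read_skill_front_matter` is a hypothetical helper, and it assumes flat `key: value` front matter like the example above. Anything more complex would need a real YAML parser.

```python
def read_skill_front_matter(text: str) -> dict:
    """Hypothetical helper: read flat key/value front matter from SKILL.md text."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' front matter line")

    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing delimiter reached: the front matter is complete.
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("front matter is missing its closing '---' line")


example = """---
name: basic-math
description: Add or multiply numbers.
---

Use this skill when you need a quick sum or product of numbers.
"""

fields = read_skill_front_matter(example)
missing = [key for key in ("name", "description") if key not in fields]
print(fields, "missing:", missing)
```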

15## Create a skill27## Create a skill

16 28 

17You can upload a directory as multipart form data or upload a `.zip` that contains a single top-level folder.29You can upload a directory as multipart form data or upload a `.zip` that contains a single top-level folder.


20 32 

21Upload multiple `files[]` parts. Each part includes the path inside a single top-level folder.33Upload multiple `files[]` parts. Each part includes the path inside a single top-level folder.

22 34 

35Create a skill (multipart)

36 

37```bash

38curl -X POST 'https://api.openai.com/v1/skills' \

39 -H "Authorization: Bearer $OPENAI_API_KEY" \

40 -F 'files[]=@./basic_math/SKILL.md;filename=basic_math/SKILL.md;type=text/markdown' \

41 -F 'files[]=@./basic_math/calculate.py;filename=basic_math/calculate.py;type=text/plain'

42```

43 

44 

23### Option 2: Zip upload45### Option 2: Zip upload

24 46 

25Zip the top-level folder and upload the zip file.47Zip the top-level folder and upload the zip file.

26 48 

49Create a skill (zip)

50 

51```bash

52curl -X POST 'https://api.openai.com/v1/skills' \

53 -H "Authorization: Bearer $OPENAI_API_KEY" \

54 -F 'files=@./basic_math.zip;type=application/zip'

55```

56 

57 
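Both upload options expect every file path to sit under a single top-level folder. As a hedged sketch of how such an archive could be produced, the Python snippet below zips a skill directory so that each entry is prefixed with the folder name. The helper `zip_skill_folder` is hypothetical, and the sketch assumes the output zip is written outside the folder being archived.

```python
import zipfile
from pathlib import Path


def zip_skill_folder(folder: str, zip_path: str) -> list[str]:
    """Hypothetical helper: zip `folder` so all entries share one top-level directory."""
    root = Path(folder).resolve()
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Prefix each entry with the folder name, e.g. basic_math/SKILL.md.
                arcname = Path(root.name, path.relative_to(root)).as_posix()
                archive.write(path, arcname)
                names.append(arcname)
    return names
```

For example, `zip_skill_folder("./basic_math", "./basic_math.zip")` would write entries such as `basic_math/SKILL.md`, and the resulting file could be passed to the zip upload request shown above.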

27## Use skills with hosted shell58## Use skills with hosted shell

28 59 

29To mount skills in a hosted shell environment, attach them via `tools[].environment.skills` when calling the shell tool.60To mount skills in a hosted shell environment, attach them via `tools[].environment.skills` when calling the shell tool.


187 218 

188### Create a new version219### Create a new version

189 220 

221Create a new skill version

222 

223```bash

224curl -X POST 'https://api.openai.com/v1/skills/<skill_id>/versions' \

225 -H "Authorization: Bearer $OPENAI_API_KEY" \

226 -F 'files=@./geometry.zip;type=application/zip'

227```

228 

229 

190### Set default version230### Set default version

191 231 

232Set a skill's default version

233 

234```bash

235curl -X POST 'https://api.openai.com/v1/skills/<skill_id>' \

236 -H "Content-Type: application/json" \

237 -H "Authorization: Bearer $OPENAI_API_KEY" \

238 -d '{"default_version": 2}'

239```

240 

241 

192### Delete rules242### Delete rules

193 243 

194- You can't delete the default version; set another default first.244- You can't delete the default version; set another default first.


199 249 

200OpenAI maintains a set of first-party skills that can be referenced by id (for example, `openai-spreadsheets`).250OpenAI maintains a set of first-party skills that can be referenced by id (for example, `openai-spreadsheets`).

201 251 

252Reference a curated skill

253 

254```json

255{ "type": "skill_reference", "skill_id": "openai-spreadsheets", "version": "latest" }

256```

257 

258 

202## Inline skills259## Inline skills

203 260 

204If you don't want to create a hosted skill, you can inline a zip bundle (base64) in the environment's `skills` array.261If you don't want to create a hosted skill, you can inline a zip bundle (base64) in the environment's `skills` array.

205 262 

263Inline a skill bundle

264 

265```bash

266INLINE_ZIP=$(base64 < ./basic_math.zip | tr -d '\n')

267 

268curl -L 'https://api.openai.com/v1/containers' \

269 -H "Content-Type: application/json" \

270 -H "Authorization: Bearer $OPENAI_API_KEY" \

271 -d '{

272 "name": "inline-skill-container",

273 "skills": [

274 {

275 "type": "inline",

276 "name": "basic_math",

277 "description": "Add or multiply numbers.",

278 "source": {

279 "type": "base64",

280 "media_type": "application/zip",

281 "data": "'"$INLINE_ZIP"'"

282 }

283 }

284 ]

285 }'

286```

287 

288 
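The same inline entry can be assembled in code, which avoids shell quoting around the base64 string. The Python sketch below builds one element of the `skills` array in the shape used by the request above. The function name `inline_skill_entry` is hypothetical, and the sketch only constructs the JSON; it does not send a request.

```python
import base64
import json


def inline_skill_entry(zip_bytes: bytes, name: str, description: str) -> dict:
    """Hypothetical helper: build an inline skill entry from raw zip bytes."""
    return {
        "type": "inline",
        "name": name,
        "description": description,
        "source": {
            "type": "base64",
            "media_type": "application/zip",
            # b64encode emits a single line, so the value is safe inside JSON.
            "data": base64.b64encode(zip_bytes).decode("ascii"),
        },
    }


# Placeholder bytes stand in for the contents of ./basic_math.zip.
entry = inline_skill_entry(b"PK\x05\x06" + bytes(18), "basic_math", "Add or multiply numbers.")
body = json.dumps({"name": "inline-skill-container", "skills": [entry]})
```

In practice the bytes would come from reading the zip file in binary mode, for example `Path("basic_math.zip").read_bytes()`.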

206## Risks and safety289## Risks and safety

207 290 

208It's important to inspect any Skill used with the Responses API. Skills introduce security risks such as prompt injection-driven data exfiltration.291It's important to inspect any Skill used with the Responses API. Skills introduce security risks such as prompt injection-driven data exfiltration.

Details

69```69```

70 70 

71```bash71```bash

72curl -X POST "https://api.openai.com/v1/videos" \\72curl -X POST "https://api.openai.com/v1/videos" \

73 -H "Authorization: Bearer $OPENAI_API_KEY" \\73 -H "Authorization: Bearer $OPENAI_API_KEY" \

74 -H "Content-Type: multipart/form-data" \\74 -H "Content-Type: multipart/form-data" \

75 -F prompt="Wide tracking shot of a teal coupe driving through a desert highway, heat ripples visible, hard sun overhead." \\75 -F prompt="Wide tracking shot of a teal coupe driving through a desert highway, heat ripples visible, hard sun overhead." \

76 -F model="sora-2-pro" \\76 -F model="sora-2-pro" \

77 -F size="1280x720" \\77 -F size="1280x720" \

78 -F seconds="8" \\78 -F seconds="8" \

79```79```

80 80 

81 81 


251 const bar = '='.repeat(filledLength) + '-'.repeat(barLength - filledLength);251 const bar = '='.repeat(filledLength) + '-'.repeat(barLength - filledLength);

252 const statusText = video.status === 'queued' ? 'Queued' : 'Processing';252 const statusText = video.status === 'queued' ? 'Queued' : 'Processing';

253 253 

254 process.stdout.write(\`\${statusText}: [\${bar}] \${progress.toFixed(1)}%\`);254 process.stdout.write(`${statusText}: [${bar}] ${progress.toFixed(1)}%`);

255 255 

256 await new Promise((resolve) => setTimeout(resolve, 2000));256 await new Promise((resolve) => setTimeout(resolve, 2000));

257}257}

258 258 

259// Clear the progress line and show completion259// Clear the progress line and show completion

260process.stdout.write('\\n');260process.stdout.write('\n');

261 261 

262if (video.status === 'failed') {262if (video.status === 'failed') {

263 console.error('Video generation failed');263 console.error('Video generation failed');


279```279```

280 280 

281```bash281```bash

282curl -L "https://api.openai.com/v1/videos/video_abc123/content" \\282curl -L "https://api.openai.com/v1/videos/video_abc123/content" \

283 -H "Authorization: Bearer $OPENAI_API_KEY" \\283 -H "Authorization: Bearer $OPENAI_API_KEY" \

284 --output video.mp4284 --output video.mp4

285```285```

286 286 

Details

1# Voice agents1# Voice agents

2 2 

3import {

4 Bolt,

5 Cube,

6 Desktop,

7 Phone,

8} from "@components/react/oai/platform/ui/Icon.react";

9 

10 

11Voice agents turn the same agent concepts into spoken, low-latency interactions. The key design choice is deciding whether the model should work directly with live audio or whether your application should explicitly chain speech-to-text, text reasoning, and text-to-speech.3Voice agents turn the same agent concepts into spoken, low-latency interactions. The key design choice is deciding whether the model should work directly with live audio or whether your application should explicitly chain speech-to-text, text reasoning, and text-to-speech.

12 4 

13## Choose the right architecture5## Choose the right architecture

Details

92Generate a background response92Generate a background response

93 93 

94```bash94```bash

95curl https://api.openai.com/v1/responses \\95curl https://api.openai.com/v1/responses \

96-H "Content-Type: application/json" \\96-H "Content-Type: application/json" \

97-H "Authorization: Bearer $OPENAI_API_KEY" \\97-H "Authorization: Bearer $OPENAI_API_KEY" \

98-d '{98-d '{

99 "model": "gpt-5.5",99 "model": "gpt-5.5",

100 "input": "Write a very long novel about otters in space.",100 "input": "Write a very long novel about otters in space.",


229 229 

230```php230```php

231$webhook_secret = getenv("OPENAI_WEBHOOK_SECRET");231$webhook_secret = getenv("OPENAI_WEBHOOK_SECRET");

232$wh = new \\StandardWebhooks\\Webhook($webhook_secret);232$wh = new \StandardWebhooks\Webhook($webhook_secret);

233$wh->verify($webhook_payload, $webhook_headers);233$wh->verify($webhook_payload, $webhook_headers);

234```234```

235 235 

Details

1# Configuring workload identity federation for AWS1# Configuring workload identity federation for AWS

2 2 

3import {

4 awsOutboundWorkloadIdentitySdk,

5 awsWorkloadIdentitySdk,

6} from "./examples";

7 

8Use AWS as a Workload Identity Provider in either of these scenarios:3Use AWS as a Workload Identity Provider in either of these scenarios:

9 4 

10- **AWS outbound identity federation:** Exchange an AWS STS-issued OIDC JWT from `GetWebIdentityToken` for a short-lived OpenAI access token.5- **AWS outbound identity federation:** Exchange an AWS STS-issued OIDC JWT from `GetWebIdentityToken` for a short-lived OpenAI access token.

Details

1# Configuring workload identity federation for Google Cloud1# Configuring workload identity federation for Google Cloud

2 2 

3import {

4 googleGkeWorkloadIdentitySdk,

5 googleWorkloadIdentitySdk,

6} from "./examples";

7 

8Use Google Cloud as a Workload Identity Provider in either of these scenarios:3Use Google Cloud as a Workload Identity Provider in either of these scenarios:

9 4 

10- **Google workload identity:** Exchange a Google-signed OIDC token issued to an attached Google service account for a short-lived OpenAI access token.5- **Google workload identity:** Exchange a Google-signed OIDC token issued to an attached Google service account for a short-lived OpenAI access token.

Details

1# Configuring workload identity federation for Microsoft Azure1# Configuring workload identity federation for Microsoft Azure

2 2 

3import {

4 azureAksWorkloadIdentitySdk,

5 azureManagedIdentityWorkloadIdentitySdk,

6} from "./examples";

7 

8Use Microsoft Azure as a Workload Identity Provider in either of these scenarios:3Use Microsoft Azure as a Workload Identity Provider in either of these scenarios:

9 4 

10- **Azure managed identity:** Exchange a Microsoft Entra ID access token issued for a managed identity for a short-lived OpenAI access token.5- **Azure managed identity:** Exchange a Microsoft Entra ID access token issued for a managed identity for a short-lived OpenAI access token.

Details

1# Data controls in the OpenAI platform1# Data controls in the OpenAI platform

2 2 

3import {

4 dataResidencyRegions,

5 dataResidencyServices,

6} from "./your-data-support";

7 

8Understand how OpenAI uses your data, and how you can control it.3Understand how OpenAI uses your data, and how you can control it.

9 4 

10Your data is your data. As of March 1, 2023, data sent to the OpenAI API is not used to train or improve OpenAI models (unless you explicitly opt in to share data with us).5Your data is your data. As of March 1, 2023, data sent to the OpenAI API is not used to train or improve OpenAI models (unless you explicitly opt in to share data with us).

quickstart.md +0 −28

Details

1# Developer quickstart1# Developer quickstart

2 2 

3import {

4 Assistant,

5 Camera,

6 ChatTripleDots,

7 Code,

8 Bolt,

9 Speed,

10 SquarePlus,

11} from "@components/react/oai/platform/ui/Icon.react";

12 

13 

14 

15 

16 

17 

18 

19 

20 

21 

22 

23 

24 

25 

26 

27 

28 

29 

30 

31The OpenAI API provides a simple interface to state-of-the-art AI [models](https://developers.openai.com/api/docs/models) for text generation, natural language processing, computer vision, and more. Get started by creating an API Key and running your first API call. Discover how to generate text, analyze images, build agents, and more.3The OpenAI API provides a simple interface to state-of-the-art AI [models](https://developers.openai.com/api/docs/models) for text generation, natural language processing, computer vision, and more. Get started by creating an API Key and running your first API call. Discover how to generate text, analyze images, build agents, and more.

32 4 

33## Create and export an API key5## Create and export an API key

ui-kit-demo.md +0 −21

Details

1# UI Kit Demo1# UI Kit Demo

2 2 

3import {

4 CodeComparisonDemo,

5 CodeGalleryDemo,

6 CodeSampleDemo,

7 ContentModeSwitchDemo,

8 ContentSwitcherDemo,

9 DeepDiveDemo,

10 DocCardDemo,

11 DocsMarkdownDemo,

12 DocsTipDemo,

13 ExpanderDemo,

14 GalleryDocsDemo,

15 IconDemo,

16 IconItemDemo,

17 ImageDemo,

18 ImageGalleryDemo,

19 LatencyExampleDemo,

20 VideoGalleryDemo,

21 WaveformComponentDemo,

22} from "@components/react/demo/ApiDocsComponentDemos.react";

23 

24# UI Kit Demo3# UI Kit Demo

25 4 

26## hello5## hello