SpyBara

Documentation 2026-04-16 05:55 UTC to 2026-04-22 05:55 UTC

6 files changed +199 −724. View all changes and history on the product overview
Details

110across local, Docker, and hosted clients.110across local, Docker, and hosted clients.

111 111 

112| Manifest input | Use it for |112| Manifest input | Use it for |

113| ------------------------------------------------------------------ | ------------------------------------------------------------------------------------- |113| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------- |

114| `File`, `Dir` | Small synthetic inputs, helper files, or output directories. |114| `File`, `Dir` | Small synthetic inputs, helper files, or output directories. |

115| `LocalFile`, `LocalDir` | Host files or directories to materialize into the sandbox. |115| `LocalFile`, `LocalDir` | Host files or directories to materialize into the sandbox. |

116| `GitRepo` | A repository to fetch into the workspace. |116| `GitRepo` | A repository to fetch into the workspace. |

117| `S3Mount`, `GCSMount`, `R2Mount`, `AzureBlobMount`, `S3FilesMount` | External storage to make available inside the sandbox. |

117| `S3Mount`, `GCSMount`, `R2Mount`, `AzureBlobMount`, `BoxMount`, `S3FilesMount` | External storage to make available inside the sandbox. |

118| `environment` | Environment variables the sandbox needs when it starts. |118| `environment` | Environment variables the sandbox needs when it starts. |

119| `users` and `groups` | Sandbox-local OS accounts and groups for providers that support account provisioning. |119| `users` and `groups` | Sandbox-local OS accounts and groups for providers that support account provisioning. |

120 120 

Details

2 2 

3## Overview3## Overview

4 4 

5The OpenAI API lets you generate and edit images from text prompts, using GPT Image or DALL·E models. You can access image generation capabilities through two APIs:

5The OpenAI API lets you generate and edit images from text prompts using GPT Image models, including our latest, `gpt-image-2`. You can access image generation capabilities through two APIs:

6 6 

7### Image API7### Image API

8 8 

9The [Image API](https://developers.openai.com/api/docs/api-reference/images) provides three endpoints, each with distinct capabilities:

9Starting with `gpt-image-1` and later models, the [Image API](https://developers.openai.com/api/docs/api-reference/images) provides two endpoints, each with distinct capabilities:

10 10 

11- **Generations**: [Generate images](#generate-images) from scratch based on a text prompt11- **Generations**: [Generate images](#generate-images) from scratch based on a text prompt

12- **Edits**: [Modify existing images](#edit-images) using a new prompt, either partially or entirely12- **Edits**: [Modify existing images](#edit-images) using a new prompt, either partially or entirely

13- **Variations**: [Generate variations](#image-variations) of an existing image (available with DALL·E 2 only)

14 13 

15This API supports GPT Image models (`gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`) as well as `dall-e-2` and `dall-e-3`.

14The Image API also includes a variations endpoint for models that support it, such as DALL·E 2.

16 15 

17### Responses API16### Responses API

18 17 


23- **Multi-turn editing**: Iteratively make high fidelity edits to images with prompting22- **Multi-turn editing**: Iteratively make high fidelity edits to images with prompting

24- **Flexible inputs**: Accept image [File](https://developers.openai.com/api/docs/api-reference/files) IDs as input images, not just bytes23- **Flexible inputs**: Accept image [File](https://developers.openai.com/api/docs/api-reference/files) IDs as input images, not just bytes

25 24 

26The image generation tool in responses uses GPT Image models (`gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`).

25The Responses API image generation tool uses its own GPT Image model selection. For details on mainline models that support calling this tool, refer to the [supported models](#supported-models) below.

27When using `gpt-image-1.5` and `chatgpt-image-latest` with the Responses API, you can optionally set the `action` parameter, detailed below.

28For a list of mainline models that support calling this tool, refer to the [supported models](#supported-models) below.

29 26 

30### Choosing the right API27### Choosing the right API

31 28 

32- If you only need to generate or edit a single image from one prompt, the Image API is your best choice.29- If you only need to generate or edit a single image from one prompt, the Image API is your best choice.

33- If you want to build conversational, editable image experiences with GPT Image, go with the Responses API.30- If you want to build conversational, editable image experiences with GPT Image, go with the Responses API.

34 31 

35Both APIs let you [customize output](#customize-image-output) adjust quality, size, format, compression, and enable transparent backgrounds.

32Both APIs let you [customize output](#customize-image-output) by adjusting quality, size, format, and compression. Transparent backgrounds depend on model support.
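
The output options named above can be gathered into one set of keyword arguments before a request is made. The following is a minimal sketch, not the documented method: `build_image_options` is a hypothetical helper, the `size`, `quality`, and `output_format` names follow the fields shown in this guide's sample response, and `output_compression` (with its 0-100 range and its restriction to `jpeg` or `webp`) is an assumption to verify against the API reference.

```python
def build_image_options(size="1024x1024", quality="medium",
                        output_format="jpeg", output_compression=None):
    """Collect output options into keyword arguments for an image request.

    Hypothetical helper: the parameter names are assumptions based on the
    fields shown in this guide, not a confirmed API contract.
    """
    options = {"size": size, "quality": quality, "output_format": output_format}
    if output_compression is not None:
        # Assumption: compression is only meaningful for lossy formats.
        if output_format not in ("jpeg", "webp"):
            raise ValueError("output_compression assumes jpeg or webp output")
        # Assumption: compression is expressed as a percentage.
        if not 0 <= output_compression <= 100:
            raise ValueError("output_compression must be between 0 and 100")
        options["output_compression"] = output_compression
    return options


options = build_image_options(output_compression=80)
```

The resulting dictionary could then be unpacked into a call such as `client.images.generate(model="gpt-image-2", prompt=prompt, **options)`.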

36 33 

34This guide focuses on GPT Image.

37 35 

38 36To ensure these models are used responsibly, you may need to complete the [API

39 

40 

41### Model comparison

42 

43Our latest and most advanced model for image generation is `gpt-image-1.5`, a natively multimodal language model, part of the GPT Image family.

44 

45GPT Image models include `gpt-image-1.5` (state of the art), `gpt-image-1`, and `gpt-image-1-mini`. They share the same API surface, with `gpt-image-1.5` offering the best overall quality.

46 

47We recommend using `gpt-image-1.5` for the best experience, but if you are looking for a more cost-effective option and image quality isn't a priority, you can use `gpt-image-1-mini`.

48 

49You can also use specialized image generation models—DALL·E 2 and DALL·E 3—with the Image API, but please note these models are now deprecated and we will stop supporting them on 05/12, 2026.

50 

51| Model | Endpoints | Use case |

52| --------- | ------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------- |

53| DALL·E 2 | Image API: Generations, Edits, Variations | Lower cost, concurrent requests, inpainting (image editing with a mask) |

54| DALL·E 3 | Image API: Generations only | Higher image quality than DALL·E 2, support for larger resolutions |

55| GPT Image | Image API: Generations, Edits – Responses API (as part of the image generation tool) | Superior instruction following, text rendering, detailed editing, real-world knowledge |

56 

57 

58This guide focuses on GPT Image. To view the DALL·E model-specific content in this same guide, switch to the [DALL·E 2 view](https://developers.openai.com/api/docs/guides/image-generation?image-generation-model=dall-e-2) or [DALL·E 3 view](https://developers.openai.com/api/docs/guides/image-generation?image-generation-model=dall-e-3).

59 

60To ensure this model is used responsibly, you may need to complete the [API

61 Organization37 Organization

62 Verification](https://help.openai.com/en/articles/10910291-api-organization-verification)38 Verification](https://help.openai.com/en/articles/10910291-api-organization-verification)

63 from your [developer39 from your [developer

64 console](https://platform.openai.com/settings/organization/general) before40 console](https://platform.openai.com/settings/organization/general) before

65 using GPT Image models, including `gpt-image-1.5`, `gpt-image-1`, and41 using GPT Image models, including `gpt-image-2`, `gpt-image-1.5`,

66 `gpt-image-1-mini`.42 `gpt-image-1`, and `gpt-image-1-mini`.

67 

68 

69 

70 

71 

72 

73 43 

74<div44<div

75 className="not-prose"45 className="not-prose"


83 53 

84## Generate Images54## Generate Images

85 55 

86 

87You can use the [image generation endpoint](https://developers.openai.com/api/docs/api-reference/images/create) to create images based on text prompts, or the [image generation tool](https://developers.openai.com/api/docs/guides/tools?api-mode=responses) in the Responses API to generate images as part of a conversation.56You can use the [image generation endpoint](https://developers.openai.com/api/docs/api-reference/images/create) to create images based on text prompts, or the [image generation tool](https://developers.openai.com/api/docs/guides/tools?api-mode=responses) in the Responses API to generate images as part of a conversation.

88 57 

89To learn more about customizing the output (size, quality, format, transparency), refer to the [customize image output](#customize-image-output) section below.

58To learn more about customizing the output (size, quality, format, compression), refer to the [customize image output](#customize-image-output) section below.

90 59 

91You can set the `n` parameter to generate multiple images at once in a single request (by default, the API returns a single image).60You can set the `n` parameter to generate multiple images at once in a single request (by default, the API returns a single image).
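
As a rough illustration of the `n` parameter, the sketch below requests several images in one call and decodes each `b64_json` entry. `generate_images` is a hypothetical helper, and the stand-in client exists only so the sketch runs without network access or an API key; with the real SDK you would pass an `OpenAI()` client instead.

```python
import base64
from types import SimpleNamespace


def generate_images(client, prompt, n=1, model="gpt-image-2"):
    """Request n images in a single call and return them as decoded bytes."""
    result = client.images.generate(model=model, prompt=prompt, n=n)
    return [base64.b64decode(item.b64_json) for item in result.data]


class _StubImages:
    """Stand-in for client.images (an assumption for offline use only)."""

    def generate(self, model, prompt, n=1):
        data = [
            SimpleNamespace(b64_json=base64.b64encode(f"image-{i}".encode()).decode())
            for i in range(n)
        ]
        return SimpleNamespace(data=data)


stub_client = SimpleNamespace(images=_StubImages())
images = generate_images(stub_client, "a gray tabby cat hugging an otter", n=3)
print(len(images))  # one entry per requested image
```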

92 61 


101const openai = new OpenAI();70const openai = new OpenAI();

102 71 

103const response = await openai.responses.create({72const response = await openai.responses.create({

104 model: "gpt-5",73 model: "gpt-5.4",

105 input: "Generate an image of gray tabby cat hugging an otter with an orange scarf",74 input: "Generate an image of gray tabby cat hugging an otter with an orange scarf",

106 tools: [{type: "image_generation"}],75 tools: [{type: "image_generation"}],

107});76});


125client = OpenAI() 94client = OpenAI()

126 95 

127response = client.responses.create(96response = client.responses.create(

128 model="gpt-5",97 model="gpt-5.4",

129 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",98 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",

130 tools=[{"type": "image_generation"}],99 tools=[{"type": "image_generation"}],

131)100)


159\`;128\`;

160 129 

161const result = await openai.images.generate({130const result = await openai.images.generate({

162 model: "gpt-image-1.5",131 model: "gpt-image-2",

163 prompt,132 prompt,

164});133});

165 134 


180"""149"""

181 150 

182result = client.images.generate(151result = client.images.generate(

183 model="gpt-image-1.5",152 model="gpt-image-2",

184 prompt=prompt153 prompt=prompt

185)154)

186 155 


197 -H "Authorization: Bearer $OPENAI_API_KEY" \\166 -H "Authorization: Bearer $OPENAI_API_KEY" \\

198 -H "Content-type: application/json" \\167 -H "Content-type: application/json" \\

199 -d '{168 -d '{

200 "model": "gpt-image-1.5",169 "model": "gpt-image-2",

201 "prompt": "A childrens book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter."170 "prompt": "A childrens book drawing of a veterinarian using a stethoscope to listen to the heartbeat of a baby otter."

202 }' | jq -r '.data[0].b64_json' | base64 --decode > otter.png171 }' | jq -r '.data[0].b64_json' | base64 --decode > otter.png

203```172```


209### Multi-turn image generation178### Multi-turn image generation

210 179 

211With the Responses API, you can build multi-turn conversations involving image generation either by providing image generation calls outputs within context (you can also just use the image ID), or by using the [`previous_response_id` parameter](https://developers.openai.com/api/docs/guides/conversation-state?api-mode=responses#openai-apis-for-conversation-state).180With the Responses API, you can build multi-turn conversations involving image generation either by providing image generation calls outputs within context (you can also just use the image ID), or by using the [`previous_response_id` parameter](https://developers.openai.com/api/docs/guides/conversation-state?api-mode=responses#openai-apis-for-conversation-state).

212This makes it easy to iterate on images across multiple turns—refining prompts, applying new instructions, and evolving the visual output as the conversation progresses.181This lets you iterate on images across multiple turns—refining prompts, applying new instructions, and evolving the visual output as the conversation progresses.

213 182 

214### Generate vs Edit

183With the Responses API image generation tool, supported tool models can choose whether to generate a new image or edit one already in the conversation. The optional `action` parameter controls this behavior: keep `action: "auto"` to let the model decide, set `action: "generate"` to always create a new image, or set `action: "edit"` to force editing when an image is in context.

215 

216With the Responses API you can choose whether to generate a new image or edit one already in the conversation.

217The optional `action` parameter (supported on `gpt-image-1.5` and `chatgpt-image-latest`) controls this behavior: keep `action: "auto"` to let the model decide (recommended), set `action: "generate"` to always create a new image, or set `action: "edit"` to force editing (requires an image in context).

218 184 

219Force image creation with action185Force image creation with action

220 186 


223const openai = new OpenAI();189const openai = new OpenAI();

224 190 

225const response = await openai.responses.create({191const response = await openai.responses.create({

226 model: "gpt-5",192 model: "gpt-5.4",

227 input: "Generate an image of gray tabby cat hugging an otter with an orange scarf",193 input: "Generate an image of gray tabby cat hugging an otter with an orange scarf",

228 tools: [{type: "image_generation", action: "generate"}],194 tools: [{type: "image_generation", action: "generate"}],

229});195});


247client = OpenAI() 213client = OpenAI()

248 214 

249response = client.responses.create(215response = client.responses.create(

250 model="gpt-5",216 model="gpt-5.4",

251 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",217 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",

252 tools=[{"type": "image_generation", "action": "generate"}],218 tools=[{"type": "image_generation", "action": "generate"}],

253)219)


266```232```

267 233 

268 234 

269If you force `edit` without providing an image in context, the call will

270 return an error. Leave `action` at `auto` to have the model decide when to

271 generate or edit.

235If you force `edit` without providing an image in context, the call will return an error. Leave `action` at `auto` to have the model decide when to generate or edit.
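
One way to avoid the forced-edit error is to check for it before sending the request. This is a sketch under stated assumptions: `image_generation_tool` and its `has_image_in_context` flag are hypothetical, and the caller is responsible for knowing whether an image is actually in the conversation.

```python
VALID_ACTIONS = ("auto", "generate", "edit")


def image_generation_tool(action="auto", has_image_in_context=False):
    """Build the tool entry for a Responses API request.

    Hypothetical helper: it rejects locally the combination that the guide
    says the API rejects (forcing an edit with no image in context).
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"action must be one of {VALID_ACTIONS}")
    if action == "edit" and not has_image_in_context:
        raise ValueError("action 'edit' needs an image in context")
    return {"type": "image_generation", "action": action}


tool = image_generation_tool("generate")
```

The returned dictionary matches the shape used in the examples in this guide, for example `tools=[image_generation_tool("generate")]`.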

272 

273When `action` is set to `auto`, the `image_generation_call` result includes an `action` field so you can see whether the model generated a new image or edited one already in context:

274 

275```json

276{

277 "id": "ig_123...",

278 "type": "image_generation_call",

279 "status": "completed",

280 "background": "opaque",

281 "output_format": "jpeg",

282 "quality": "medium",

283 "result": "/9j/4...",

284 "revised_prompt": "...",

285 "size": "1024x1024",

286 "action": "generate"

287}

288```
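
Reading that field back could look like the following sketch. `image_call_actions` is a hypothetical helper, and the stand-in response only mirrors the JSON shape shown above; whether the SDK exposes `action` as an attribute on real output items is an assumption, so the helper falls back to `None` when the field is absent.

```python
from types import SimpleNamespace


def image_call_actions(response):
    """Return the action reported by each image_generation_call item."""
    return [
        getattr(item, "action", None)  # None when the field is missing
        for item in response.output
        if item.type == "image_generation_call"
    ]


# Stand-in response mirroring the JSON above (assumption for offline use).
fake_response = SimpleNamespace(
    output=[
        SimpleNamespace(type="message"),
        SimpleNamespace(type="image_generation_call", action="generate"),
    ]
)
actions = image_call_actions(fake_response)
```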

289 236 

290 237 

291 238 


298const openai = new OpenAI();245const openai = new OpenAI();

299 246 

300const response = await openai.responses.create({247const response = await openai.responses.create({

301 model: "gpt-5",248 model: "gpt-5.4",

302 input:249 input:

303 "Generate an image of gray tabby cat hugging an otter with an orange scarf",250 "Generate an image of gray tabby cat hugging an otter with an orange scarf",

304 tools: [{ type: "image_generation" }],251 tools: [{ type: "image_generation" }],


317// Follow up264// Follow up

318 265 

319const response_fwup = await openai.responses.create({266const response_fwup = await openai.responses.create({

320 model: "gpt-5",267 model: "gpt-5.4",

321 previous_response_id: response.id,268 previous_response_id: response.id,

322 input: "Now make it look realistic",269 input: "Now make it look realistic",

323 tools: [{ type: "image_generation" }],270 tools: [{ type: "image_generation" }],


344client = OpenAI()291client = OpenAI()

345 292 

346response = client.responses.create(293response = client.responses.create(

347 model="gpt-5",294 model="gpt-5.4",

348 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",295 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",

349 tools=[{"type": "image_generation"}],296 tools=[{"type": "image_generation"}],

350)297)


365# Follow up312# Follow up

366 313 

367response_fwup = client.responses.create(314response_fwup = client.responses.create(

368 model="gpt-5",315 model="gpt-5.4",

369 previous_response_id=response.id,316 previous_response_id=response.id,

370 input="Now make it look realistic",317 input="Now make it look realistic",

371 tools=[{"type": "image_generation"}],318 tools=[{"type": "image_generation"}],


393const openai = new OpenAI();340const openai = new OpenAI();

394 341 

395const response = await openai.responses.create({342const response = await openai.responses.create({

396 model: "gpt-5",343 model: "gpt-5.4",

397 input:344 input:

398 "Generate an image of gray tabby cat hugging an otter with an orange scarf",345 "Generate an image of gray tabby cat hugging an otter with an orange scarf",

399 tools: [{ type: "image_generation" }],346 tools: [{ type: "image_generation" }],


414// Follow up361// Follow up

415 362 

416const response_fwup = await openai.responses.create({363const response_fwup = await openai.responses.create({

417 model: "gpt-5",364 model: "gpt-5.4",

418 input: [365 input: [

419 {366 {

420 role: "user",367 role: "user",


447import base64394import base64

448 395 

449response = openai.responses.create(396response = openai.responses.create(

450 model="gpt-5",397 model="gpt-5.4",

451 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",398 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",

452 tools=[{"type": "image_generation"}],399 tools=[{"type": "image_generation"}],

453)400)


470# Follow up417# Follow up

471 418 

472response_fwup = openai.responses.create(419response_fwup = openai.responses.create(

473 model="gpt-5",420 model="gpt-5.4",

474 input=[421 input=[

475 {422 {

476 "role": "user",423 "role": "user",


540 487 

541### Streaming488### Streaming

542 489 

543The Responses API and Image API support streaming image generation. This allows you to stream partial images as they are generated, providing a more interactive experience.

490The Responses API and Image API support streaming image generation. You can stream partial images as the APIs generate them, providing a more interactive experience.

544 491 

545You can adjust the `partial_images` parameter to receive 0-3 partial images.492You can adjust the `partial_images` parameter to receive 0-3 partial images.

546 493 


559const openai = new OpenAI();506const openai = new OpenAI();

560 507 

561const stream = await openai.responses.create({508const stream = await openai.responses.create({

562 model: "gpt-4.1",509 model: "gpt-5.4",

563 input:510 input:

564 "Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",511 "Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",

565 stream: true,512 stream: true,


583client = OpenAI()530client = OpenAI()

584 531 

585stream = client.responses.create(532stream = client.responses.create(

586 model="gpt-4.1",533 model="gpt-5.4",

587 input="Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",534 input="Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",

588 stream=True,535 stream=True,

589 tools=[{"type": "image_generation", "partial_images": 2}],536 tools=[{"type": "image_generation", "partial_images": 2}],


613 "Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape";560 "Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape";

614const stream = await openai.images.generate({561const stream = await openai.images.generate({

615 prompt: prompt,562 prompt: prompt,

616 model: "gpt-image-1.5",563 model: "gpt-image-2",

617 stream: true,564 stream: true,

618 partial_images: 2,565 partial_images: 2,

619});566});


636 583 

637stream = client.images.generate(584stream = client.images.generate(

638 prompt="Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",585 prompt="Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",

639 model="gpt-image-1.5",586 model="gpt-image-2",

640 stream=True,587 stream=True,

641 partial_images=2,588 partial_images=2,

642)589)


671 618 

672### Revised prompt619### Revised prompt

673 620 

674When using the image generation tool in the Responses API, the mainline model (e.g. `gpt-4.1`) will automatically revise your prompt for improved performance.

621When using the image generation tool in the Responses API, the mainline model (for example, `gpt-5.4`) will automatically revise your prompt for improved performance.

675 622 

676You can access the revised prompt in the `revised_prompt` field of the image generation call:623You can access the revised prompt in the `revised_prompt` field of the image generation call:

677 624 

625Revised prompt response

626 

678```json627```json

679{628{

680 "id": "ig_123",629 "id": "ig_123",


701 641 

702- Edit existing images642- Edit existing images

703- Generate new images using other images as a reference643- Generate new images using other images as a reference

704- Edit parts of an image by uploading an image and mask indicating which areas should be replaced (a process known as **inpainting**)

644- Edit parts of an image by uploading an image and mask that identifies the areas to replace

705 645 

706### Create a new image using image references646### Create a new image using image references

707 647 


728"""668"""

729 669 

730result = client.images.edit(670result = client.images.edit(

731 model="gpt-image-1.5",671 model="gpt-image-2",

732 image=[672 image=[

733 open("body-lotion.png", "rb"),673 open("body-lotion.png", "rb"),

734 open("bath-bomb.png", "rb"),674 open("bath-bomb.png", "rb"),


774);714);

775 715 

776const response = await client.images.edit({716const response = await client.images.edit({

777 model: "gpt-image-1.5",717 model: "gpt-image-2",

778 image: images,718 image: images,

779 prompt,719 prompt,

780});720});


790 -o >(jq -r '.data[0].b64_json' | base64 --decode > gift-basket.png) \\730 -o >(jq -r '.data[0].b64_json' | base64 --decode > gift-basket.png) \\

791 -X POST "https://api.openai.com/v1/images/edits" \\731 -X POST "https://api.openai.com/v1/images/edits" \\

792 -H "Authorization: Bearer $OPENAI_API_KEY" \\732 -H "Authorization: Bearer $OPENAI_API_KEY" \\

793 -F "model=gpt-image-1.5" \\733 -F "model=gpt-image-2" \\

794 -F "image[]=@body-lotion.png" \\734 -F "image[]=@body-lotion.png" \\

795 -F "image[]=@bath-bomb.png" \\735 -F "image[]=@bath-bomb.png" \\

796 -F "image[]=@incense-kit.png" \\736 -F "image[]=@incense-kit.png" \\


802 742 

803 743 

804 744 

805### Edit an image using a mask (inpainting)

745### Edit an image using a mask

806 746 

807You can provide a mask to indicate which part of the image should be edited.747You can provide a mask to indicate which part of the image should be edited.

808 748 

809When using a mask with GPT Image, additional instructions are sent to the model to help guide the editing process accordingly.749When using a mask with GPT Image, additional instructions are sent to the model to help guide the editing process accordingly.

810 750 

811Unlike with DALL·E 2, masking with GPT Image is entirely prompt-based. This

812 means the model uses the mask as guidance, but may not follow its exact shape

813 with complete precision.

751Masking with GPT Image is entirely prompt-based. The model uses the mask as

752 guidance, but may not follow its exact shape with complete precision.

814 753 

815If you provide multiple input images, the mask will be applied to the first image.754If you provide multiple input images, the mask will be applied to the first image.

816 755 


828maskId = create_file("mask.png")767maskId = create_file("mask.png")

829 768 

830response = client.responses.create(769response = client.responses.create(

831 model="gpt-4o",770 model="gpt-5.4",

832 input=[771 input=[

833 {772 {

834 "role": "user",773 "role": "user",


875const maskId = await createFile("mask.png");814const maskId = await createFile("mask.png");

876 815 

877const response = await openai.responses.create({816const response = await openai.responses.create({

878 model: "gpt-4o",817 model: "gpt-5.4",

879 input: [818 input: [

880 {819 {

881 role: "user",820 role: "user",


923client = OpenAI()862client = OpenAI()

924 863 

925result = client.images.edit(864result = client.images.edit(

926 model="gpt-image-1.5",865 model="gpt-image-2",

927 image=open("sunlit_lounge.png", "rb"),866 image=open("sunlit_lounge.png", "rb"),

928 mask=open("mask.png", "rb"),867 mask=open("mask.png", "rb"),

929 prompt="A sunlit indoor lounge area with a pool containing a flamingo"868 prompt="A sunlit indoor lounge area with a pool containing a flamingo"


944const client = new OpenAI();883const client = new OpenAI();

945 884 

946const rsp = await client.images.edit({885const rsp = await client.images.edit({

947 model: "gpt-image-1.5",886 model: "gpt-image-2",

948 image: await toFile(fs.createReadStream("sunlit_lounge.png"), null, {887 image: await toFile(fs.createReadStream("sunlit_lounge.png"), null, {

949 type: "image/png",888 type: "image/png",

950 }),889 }),


965 -o >(jq -r '.data[0].b64_json' | base64 --decode > lounge.png) \\904 -o >(jq -r '.data[0].b64_json' | base64 --decode > lounge.png) \\

966 -X POST "https://api.openai.com/v1/images/edits" \\905 -X POST "https://api.openai.com/v1/images/edits" \\

967 -H "Authorization: Bearer $OPENAI_API_KEY" \\906 -H "Authorization: Bearer $OPENAI_API_KEY" \\

968 -F "model=gpt-image-1.5" \\907 -F "model=gpt-image-2" \\

969 -F "mask=@mask.png" \\908 -F "mask=@mask.png" \\

970 -F "image[]=@sunlit_lounge.png" \\909 -F "image[]=@sunlit_lounge.png" \\

971 -F 'prompt=A sunlit indoor lounge area with a pool containing a flamingo'910 -F 'prompt=A sunlit indoor lounge area with a pool containing a flamingo'


993 932 

994The mask image must also contain an alpha channel. If you're using an image editing tool to create the mask, make sure to save the mask with an alpha channel.933The mask image must also contain an alpha channel. If you're using an image editing tool to create the mask, make sure to save the mask with an alpha channel.

995 934 

996Add an alpha channel to a black and white mask

997 

998You can modify a black and white image programmatically to add an alpha channel.935You can modify a black and white image programmatically to add an alpha channel.

999 936 

1000Add an alpha channel to a black and white mask937Add an alpha channel to a black and white mask


1024```961```

1025 962 

1026 963 

964### Image input fidelity

1027 965 

966The `input_fidelity` parameter controls how strongly a model preserves details from input images during edits and reference-image workflows. For `gpt-image-2`, omit this parameter; the API doesn't allow changing it because the model processes every image input at high fidelity automatically.
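
A request builder can apply that rule so callers don't send the parameter by mistake. The sketch below is illustrative only: `build_edit_kwargs` is a hypothetical helper, it silently drops `input_fidelity` for `gpt-image-2` rather than raising, and the `low` and `high` values are taken from the earlier description of this parameter for other GPT Image models.

```python
def build_edit_kwargs(model, prompt, input_fidelity=None):
    """Build keyword arguments for an image edit request.

    Hypothetical helper: omits input_fidelity for gpt-image-2, which the
    guide says always processes image inputs at high fidelity.
    """
    kwargs = {"model": model, "prompt": prompt}
    if model == "gpt-image-2":
        # The parameter cannot be changed for this model, so leave it out.
        return kwargs
    if input_fidelity is not None:
        if input_fidelity not in ("low", "high"):
            raise ValueError("input_fidelity must be 'low' or 'high'")
        kwargs["input_fidelity"] = input_fidelity
    return kwargs


new_model_kwargs = build_edit_kwargs("gpt-image-2", "Add the logo", "high")
older_model_kwargs = build_edit_kwargs("gpt-image-1.5", "Add the logo", "high")
```

The image files themselves would still be passed separately, for example `client.images.edit(image=[...], **new_model_kwargs)`.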

1028 967 

1029 968Because `gpt-image-2` always processes image inputs at high fidelity, image

1030 969 input tokens can be higher for edit requests that include reference images. To

1031 970 understand the cost implications, refer to the [vision

1032 

1033 

1034 

1035 

1036 

1037### Input fidelity

1038 

1039GPT Image models (`gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`) support high input fidelity, which allows you to better preserve details from the input images in the output.

1040This is especially useful when using images that contain elements like faces or logos that require accurate preservation in the generated image.

1041 

1042You can provide multiple input images that will all be preserved with high fidelity, but keep in mind that if using `gpt-image-1` or `gpt-image-1-mini`, the first image will be preserved with richer textures and finer details, so if you include elements such as faces, consider placing them in the first image.

1043 

1044If you are using `gpt-image-1.5`, the first **5** input images will be preserved with higher fidelity.

1045 

1046To enable high input fidelity, set the `input_fidelity` parameter to `high`. The default value is `low`.

1047 

1048 

1049 

1050<div data-content-switcher-pane data-value="responses">

1051 <div class="hidden">Responses API</div>

1052 Generate an image with high input fidelity

1053 

1054```javascript

1055import fs from "fs";

1056import OpenAI from "openai";

1057 

1058const openai = new OpenAI();

1059const response = await openai.responses.create({

1060 model: "gpt-4.1",

1061 input: [

1062 {

1063 role: "user",

1064 content: [

1065 { type: "input_text", text: "Add the logo to the woman's top, as if stamped into the fabric." },

1066 {

1067 type: "input_image",

1068 image_url: "https://cdn.openai.com/API/docs/images/woman_futuristic.jpg",

1069 },

1070 {

1071 type: "input_image",

1072 image_url: "https://cdn.openai.com/API/docs/images/brain_logo.png",

1073 },

1074 ],

1075 },

1076 ],

1077 tools: [{type: "image_generation", input_fidelity: "high", action: "edit"}],

1078});

1079 

1080// Extract the edited image

1081const imageBase64 = response.output.find(

1082 (o) => o.type === "image_generation_call"

1083)?.result;

1084 

1085if (imageBase64) {

1086 const imageBuffer = Buffer.from(imageBase64, "base64");

1087 fs.writeFileSync("woman_with_logo.png", imageBuffer);

1088}

1089```

1090 

1091```python

1092from openai import OpenAI

1093import base64

1094 

1095client = OpenAI()

1096 

1097response = client.responses.create(

1098 model="gpt-4.1",

1099 input=[

1100 {

1101 "role": "user",

1102 "content": [

1103 {"type": "input_text", "text": "Add the logo to the woman's top, as if stamped into the fabric."},

1104 {

1105 "type": "input_image",

1106 "image_url": "https://cdn.openai.com/API/docs/images/woman_futuristic.jpg",

1107 },

1108 {

1109 "type": "input_image",

1110 "image_url": "https://cdn.openai.com/API/docs/images/brain_logo.png",

1111 },

1112 ],

1113 }

1114 ],

1115 tools=[{"type": "image_generation", "input_fidelity": "high", "action": "edit"}],

1116)

1117 

1118# Extract the edited image

1119image_data = [

1120 output.result

1121 for output in response.output

1122 if output.type == "image_generation_call"

1123]

1124 

1125if image_data:

1126 image_base64 = image_data[0]

1127 with open("woman_with_logo.png", "wb") as f:

1128 f.write(base64.b64decode(image_base64))

1129```

1130 

1131 </div>

1132 <div data-content-switcher-pane data-value="image" hidden>

1133 <div class="hidden">Image API</div>

1134 Generate an image with high input fidelity

1135 

1136```javascript

1137import fs from "fs";

1138import OpenAI from "openai";

1139 

1140const openai = new OpenAI();

1141const prompt = "Add the logo to the woman's top, as if stamped into the fabric.";

1142const result = await openai.images.edit({

1143 model: "gpt-image-1.5",

1144 image: [

1145 fs.createReadStream("woman.jpg"),

1146 fs.createReadStream("logo.png")

1147 ],

1148 prompt,

1149 input_fidelity: "high"

1150});

1151 

1152// Save the image to a file

1153const image_base64 = result.data[0].b64_json;

1154const image_bytes = Buffer.from(image_base64, "base64");

1155fs.writeFileSync("woman_with_logo.png", image_bytes);

1156```

1157 

1158```python

1159from openai import OpenAI

1160import base64

1161 

1162client = OpenAI()

1163 

1164result = client.images.edit(

1165 model="gpt-image-1.5",

1166 image=[open("woman.jpg", "rb"), open("logo.png", "rb")],

1167 prompt="Add the logo to the woman's top, as if stamped into the fabric.",

1168 input_fidelity="high"

1169)

1170 

1171image_base64 = result.data[0].b64_json

1172image_bytes = base64.b64decode(image_base64)

1173 

1174# Save the image to a file

1175with open("woman_with_logo.png", "wb") as f:

1176 f.write(image_bytes)

1177```

1178 

1179 </div>

1180 

1181 

1182 

1183<div className="images-examples">

1184 

1185| Input 1 | Input 2 | Output |

1186| ------------------------------------------------------------------------------------------------------------------------ | ----------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------ |

1187| <img className="images-example-image" src="https://cdn.openai.com/API/docs/images/woman_futuristic.jpg" alt="A woman" /> | <img className="images-example-image" src="https://cdn.openai.com/API/docs/images/brain_logo.png" alt="A brain logo" /> | <img className="images-example-image" src="https://cdn.openai.com/API/docs/images/woman_with_logo.jpg" alt="The woman with a brain logo on her top" /> |

1188 

1189</div>

1190 

1191<div className="images-edit-prompt body-small">

1192 Prompt: Add the logo to the woman's top, as if stamped into the fabric.

1193</div>

1194 

1195Keep in mind that when using high input fidelity, more image input tokens will

1196 be used per request. To understand the cost implications, refer to our

1197 [vision

1198 costs](https://developers.openai.com/api/docs/guides/images-vision?api-mode=responses#calculating-costs)971 costs](https://developers.openai.com/api/docs/guides/images-vision?api-mode=responses#calculating-costs)

1199 section.972 section.

1200 973 


1204 975 

1205You can configure the following output options:976You can configure the following output options:

1206 977 

1207 978- **Size**: Image dimensions (for example, `1024x1024`, `1024x1536`)

1208 979- **Quality**: Rendering quality (for example, `low`, `medium`, `high`)

1209- **Size**: Image dimensions (e.g., `1024x1024`, `1024x1536`)

1210- **Quality**: Rendering quality (e.g. `low`, `medium`, `high`)

1211- **Format**: File output format980- **Format**: File output format

1212- **Compression**: Compression level (0-100%) for JPEG and WebP formats981- **Compression**: Compression level (0-100%) for JPEG and WebP formats

1213- **Background**: Transparent or opaque982- **Background**: Opaque or automatic

1214 983 

1215`size`, `quality`, and `background` support the `auto` option, where the model will automatically select the best option based on the prompt.984`size`, `quality`, and `background` support the `auto` option, where the model will automatically select the best option based on the prompt.

1216 985 

1217 986`gpt-image-2` doesn't currently support transparent backgrounds. Requests with

1218 987 `background: "transparent"` aren't supported for this model.
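The option rules above can be checked before a request is sent. The following is a minimal sketch, not part of the SDK: `check_output_options` is a hypothetical helper, the parameter names `output_format` and `output_compression` and the `png` default are assumptions based on the Image API, and the `png`/`webp` requirement for transparency is the rule this guide states for earlier GPT Image models.

```python
def check_output_options(model, output_format="png", background="auto",
                         output_compression=None):
    """Return a list of problems found in the requested output options.

    Hypothetical helper: it only encodes the rules described in this guide.
    """
    problems = []

    if background == "transparent":
        # gpt-image-2 does not currently accept transparent backgrounds.
        if model == "gpt-image-2":
            problems.append("gpt-image-2 does not support transparent backgrounds")
        # Transparency is only available with png and webp output.
        if output_format not in ("png", "webp"):
            problems.append("transparent backgrounds require png or webp output")

    if output_compression is not None:
        # Compression (0-100%) applies to JPEG and WebP only.
        if output_format not in ("jpeg", "webp"):
            problems.append("compression applies only to jpeg and webp output")
        elif not 0 <= output_compression <= 100:
            problems.append("compression must be between 0 and 100")

    return problems


if __name__ == "__main__":
    print(check_output_options("gpt-image-2", background="transparent"))
```

An empty list means the combination passes these checks; the API remains the final authority on what it accepts.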

1219 

1220 

1221 

1222 988 

1223### Size and quality options989### Size and quality options

1224 990 

1225Square images with standard quality are the fastest to generate. The default size is 1024x1024 pixels.991`gpt-image-2` accepts any resolution in the `size` parameter when it satisfies the constraints below. Square images are typically fastest to generate.

1226 

1227 

1228 992 

1229<table>993<table>

1230 <tbody>994 <tbody>

1231 <tr>995 <tr>

1232 <td>Available sizes</td>996 <td>Popular sizes</td>

1233 <td>997 <td>

1234 - `1024x1024` (square) - `1536x1024` (landscape) - `1024x1536`998 <ul>

1235 (portrait) - `auto` (default)999 <li>

1000 <code>1024x1024</code> (square)

1001 </li>

1002 <li>

1003 <code>1536x1024</code> (landscape)

1004 </li>

1005 <li>

1006 <code>1024x1536</code> (portrait)

1007 </li>

1008 <li>

1009 <code>2048x2048</code> (2K square)

1010 </li>

1011 <li>

1012 <code>2048x1152</code> (2K landscape)

1013 </li>

1014 <li>

1015 <code>3840x2160</code> (4K landscape)

1016 </li>

1017 <li>

1018 <code>2160x3840</code> (4K portrait)

1019 </li>

1020 <li>

1021 <code>auto</code> (default)

1022 </li>

1023 </ul>

1024 </td>

1025 </tr>

1026 <tr>

1027 <td>Size constraints</td>

1028 <td>

1029 <ul>

1030 <li>

1031 Maximum edge length must be less than or equal to{" "}

1032 <code>3840px</code>

1033 </li>

1034 <li>

1035 Both edges must be multiples of <code>16px</code>

1036 </li>

1037 <li>

1038 Long edge to short edge ratio must not exceed <code>3:1</code>

1039 </li>

1040 <li>

1041 Total pixels must be at least <code>655,360</code> and no more than{" "}

1042 <code>8,294,400</code>

1043 </li>

1044 </ul>

1236 </td>1045 </td>

1237 </tr>1046 </tr>

1238 <tr>1047 <tr>

1239 <td>Quality options</td>1048 <td>Quality options</td>

1240 <td>- `low` - `medium` - `high` - `auto` (default)</td>1049 <td>

1050 <ul>

1051 <li>

1052 <code>low</code>

1053 </li>

1054 <li>

1055 <code>medium</code>

1056 </li>

1057 <li>

1058 <code>high</code>

1059 </li>

1060 <li>

1061 <code>auto</code> (default)

1062 </li>

1063 </ul>

1064 </td>

1241 </tr>1065 </tr>

1242 </tbody>1066 </tbody>

1243</table>1067</table>
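The size constraints in the table can be expressed as a small local check. This is a sketch under the assumption that sizes are passed as `WIDTHxHEIGHT` strings or `auto`; `validate_size` and the constant names are hypothetical and not part of any SDK.

```python
import re

MAX_EDGE = 3840          # maximum edge length in pixels
EDGE_MULTIPLE = 16       # both edges must be multiples of this
MAX_RATIO = 3            # long edge : short edge must not exceed 3:1
MIN_PIXELS = 655_360     # minimum total pixels
MAX_PIXELS = 8_294_400   # maximum total pixels


def validate_size(size: str) -> list[str]:
    """Return the constraints that a size string violates (empty if valid)."""
    if size == "auto":
        return []

    match = re.fullmatch(r"([0-9]+)x([0-9]+)", size)
    if match is None:
        return [f"expected 'auto' or WIDTHxHEIGHT, got {size!r}"]

    width, height = int(match.group(1)), int(match.group(2))
    long_edge, short_edge = max(width, height), min(width, height)
    problems = []

    if long_edge > MAX_EDGE:
        problems.append(f"edges must be at most {MAX_EDGE}px")
    if width % EDGE_MULTIPLE or height % EDGE_MULTIPLE:
        problems.append(f"both edges must be multiples of {EDGE_MULTIPLE}px")
    # Compare by multiplication to avoid dividing by a zero-length edge.
    if long_edge > MAX_RATIO * short_edge:
        problems.append(f"aspect ratio must not exceed {MAX_RATIO}:1")
    if not MIN_PIXELS <= width * height <= MAX_PIXELS:
        problems.append(f"total pixels must be between {MIN_PIXELS} and {MAX_PIXELS}")

    return problems


if __name__ == "__main__":
    for candidate in ("2048x1152", "1000x1000", "4096x1024"):
        print(candidate, validate_size(candidate))
```

All of the popular sizes listed in the table pass this check, including `3840x2160`, which sits exactly at the maximum pixel count.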

1244 1068 

1069Use `quality: "low"` for fast drafts, thumbnails, and quick iterations. It is

1070 the fastest option and works well for many common use cases before you move to

1071 `medium` or `high` for final assets.

1245 1072 

1246 1073Outputs that contain more than `2560x1440` (`3,686,400`) total pixels,

1247 1074 typically referred to as 2K, are considered experimental.
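As a quick illustration of the experimental threshold, the sketch below compares a requested resolution against `2560x1440` total pixels. The function name `is_experimental` is hypothetical.

```python
# Threshold from the guide: more than 2560x1440 total pixels is experimental.
EXPERIMENTAL_PIXELS = 2560 * 1440  # 3,686,400


def is_experimental(width: int, height: int) -> bool:
    """Return True when the output exceeds the 2560x1440 pixel budget."""
    return width * height > EXPERIMENTAL_PIXELS


if __name__ == "__main__":
    for width, height in ((2048, 1152), (2560, 1440), (2048, 2048), (3840, 2160)):
        print(f"{width}x{height}: experimental={is_experimental(width, height)}")
```

Note that the comparison uses total pixels, so a square `2048x2048` output exceeds the threshold even though neither edge is longer than 2560px.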

1248 

1249 

1250 1075 

1251### Output format1076### Output format

1252 1077 


1260Using `jpeg` is faster than `png`, so you should prioritize this format if1083Using `jpeg` is faster than `png`, so you should prioritize this format if

1261 latency is a concern.1084 latency is a concern.

1262 1085 

1263 

1264 

1265 

1266 

1267 

1268 

1269 

1270### Transparency

1271 

1272GPT Image models (`gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`) support transparent backgrounds.

1273To enable transparency, set the `background` parameter to `transparent`.

1274 

1275It is only supported with the `png` and `webp` output formats.

1276 

1277Transparency works best when setting the quality to `medium` or `high`.

1278 

1279 

1280 

1281<div data-content-switcher-pane data-value="responses">

1282 <div class="hidden">Responses API</div>

1283 Generate an image with a transparent background

1284 

1285```python

1286import openai

1287import base64

1288 

1289response = openai.responses.create(

1290 model="gpt-5",

1291 input="Draw a 2D pixel art style sprite sheet of a tabby gray cat",

1292 tools=[

1293 {

1294 "type": "image_generation",

1295 "background": "transparent",

1296 "quality": "high",

1297 }

1298 ],

1299)

1300 

1301image_data = [

1302 output.result

1303 for output in response.output

1304 if output.type == "image_generation_call"

1305]

1306 

1307if image_data:

1308 image_base64 = image_data[0]

1309 

1310 with open("sprite.png", "wb") as f:

1311 f.write(base64.b64decode(image_base64))

1312```

1313 

1314```javascript

1315import fs from "fs";

1316import OpenAI from "openai";

1317 

1318const client = new OpenAI();

1319 

1320const response = await client.responses.create({

1321 model: "gpt-5",

1322 input: "Draw a 2D pixel art style sprite sheet of a tabby gray cat",

1323 tools: [

1324 {

1325 type: "image_generation",

1326 background: "transparent",

1327 quality: "high",

1328 },

1329 ],

1330});

1331 

1332const imageData = response.output

1333 .filter((output) => output.type === "image_generation_call")

1334 .map((output) => output.result);

1335 

1336if (imageData.length > 0) {

1337 const imageBase64 = imageData[0];

1338 const imageBuffer = Buffer.from(imageBase64, "base64");

1339 fs.writeFileSync("sprite.png", imageBuffer);

1340}

1341```

1342 

1343 </div>

1344 <div data-content-switcher-pane data-value="image" hidden>

1345 <div class="hidden">Image API</div>

1346 Generate an image with a transparent background

1347 

1348```javascript

1349import OpenAI from "openai";

1350import fs from "fs";

1351const openai = new OpenAI();

1352 

1353const result = await openai.images.generate({

1354 model: "gpt-image-1.5",

1355 prompt: "Draw a 2D pixel art style sprite sheet of a tabby gray cat",

1356 size: "1024x1024",

1357 background: "transparent",

1358 quality: "high",

1359});

1360 

1361// Save the image to a file

1362const image_base64 = result.data[0].b64_json;

1363const image_bytes = Buffer.from(image_base64, "base64");

1364fs.writeFileSync("sprite.png", image_bytes);

1365```

1366 

1367```python

1368from openai import OpenAI

1369import base64

1370client = OpenAI()

1371 

1372result = client.images.generate(

1373 model="gpt-image-1.5",

1374 prompt="Draw a 2D pixel art style sprite sheet of a tabby gray cat",

1375 size="1024x1024",

1376 background="transparent",

1377 quality="high",

1378)

1379 

1380image_base64 = result.data[0].b64_json

1381image_bytes = base64.b64decode(image_base64)

1382 

1383# Save the image to a file

1384with open("sprite.png", "wb") as f:

1385 f.write(image_bytes)

1386```

1387 

1388```bash

1389curl -X POST "https://api.openai.com/v1/images/generations" \

1390 -H "Authorization: Bearer $OPENAI_API_KEY" \

1391 -H "Content-type: application/json" \

1392 -d '{

1393 "prompt": "Draw a 2D pixel art style sprite sheet of a tabby gray cat",

1394 "quality": "high",

1395 "size": "1024x1024",

1396 "background": "transparent"

1397 }' | jq -r '.data[0].b64_json' | base64 --decode > sprite.png

1398```

1399 

1400 </div>

1401 

1402 

1403 

1404 

1405 

1406## Limitations1086## Limitations

1407 1087 

1408 1088GPT Image models (`gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`) are powerful and versatile image generation models, but they still have some limitations to be aware of:

1409 

1410GPT Image models (`gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`) are powerful and versatile image generation models, but they still have some limitations to be aware of:

1411 1089 

1412- **Latency:** Complex prompts may take up to 2 minutes to process.1090- **Latency:** Complex prompts may take up to 2 minutes to process.

1413- **Text Rendering:** Although significantly improved over the DALL·E series, the model can still struggle with precise text placement and clarity.1091- **Text Rendering:** Although significantly improved, the model can still struggle with precise text placement and clarity.

1414- **Consistency:** While capable of producing consistent imagery, the model may occasionally struggle to maintain visual consistency for recurring characters or brand elements across multiple generations.1092- **Consistency:** While capable of producing consistent imagery, the model may occasionally struggle to maintain visual consistency for recurring characters or brand elements across multiple generations.

1415- **Composition Control:** Despite improved instruction following, the model may have difficulty placing elements precisely in structured or layout-sensitive compositions.1093- **Composition Control:** Despite improved instruction following, the model may have difficulty placing elements precisely in structured or layout-sensitive compositions.

1416 1094 


1418 1096 

1419All prompts and generated images are filtered in accordance with our [content policy](https://openai.com/policies/usage-policies/).1097All prompts and generated images are filtered in accordance with our [content policy](https://openai.com/policies/usage-policies/).

1420 1098 

1421For image generation using GPT Image models (`gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`), you can control moderation strictness with the `moderation` parameter. This parameter supports two values:1099For image generation using GPT Image models (`gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`), you can control moderation strictness with the `moderation` parameter. This parameter supports two values:

1422 1100 

1423- `auto` (default): Standard filtering that seeks to limit creating certain categories of potentially age-inappropriate content.1101- `auto` (default): Standard filtering that seeks to limit creating certain categories of potentially age-inappropriate content.

1424- `low`: Less restrictive filtering.1102- `low`: Less restrictive filtering.
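One way to keep the `moderation` setting explicit is to build the request arguments in a single place. This is a minimal sketch: `build_generate_kwargs` is a hypothetical helper, and passing the result to `client.images.generate(**kwargs)` assumes the Image API accepts `moderation` as a top-level parameter.

```python
ALLOWED_MODERATION = ("auto", "low")


def build_generate_kwargs(prompt, model="gpt-image-2", moderation="auto"):
    """Build keyword arguments for an image generation request.

    Hypothetical helper: it rejects moderation values other than the
    two documented ones before any request is made.
    """
    if moderation not in ALLOWED_MODERATION:
        raise ValueError(f"moderation must be one of {ALLOWED_MODERATION}")
    return {"model": model, "prompt": prompt, "moderation": moderation}


if __name__ == "__main__":
    kwargs = build_generate_kwargs("A watercolor lighthouse", moderation="low")
    print(kwargs)
    # With the OpenAI Python SDK this could then be sent as:
    # result = client.images.generate(**kwargs)
```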

1425 1103 

1426### Supported models1104### Supported models

1427 1105 

1428When using image generation in the Responses API, most modern models starting with `gpt-4o` and newer should support the image generation tool. [Check the model detail page for your model](https://developers.openai.com/api/docs/models) to confirm if your desired model can use the image generation tool.1106When using image generation in the Responses API, `gpt-5` and newer models should support the image generation tool. [Check the model detail page for your model](https://developers.openai.com/api/docs/models) to confirm if your desired model can use the image generation tool.

1429 

1430 

1431 

1432 

1433 

1434 

1435 1107 

1436## Cost and latency1108## Cost and latency

1437 1109 

1110### `gpt-image-2` output tokens

1111 

1112For `gpt-image-2`, use the calculator to estimate output tokens from the requested `quality` and `size`:

1438 1113 

1114### Models prior to `gpt-image-2`

1439 1115 

1440This model generates images by first producing specialized image tokens. Both latency and eventual cost are proportional to the number of tokens required to render an image—larger image sizes and higher quality settings result in more tokens.1116GPT Image models prior to `gpt-image-2` generate images by first producing specialized image tokens. Both latency and eventual cost are proportional to the number of tokens required to render an image—larger image sizes and higher quality settings result in more tokens.

1441 1117 

1442The number of tokens generated depends on image dimensions and quality:1118The number of tokens generated depends on image dimensions and quality:

1443 1119 


1448| High | 4160 tokens | 6240 tokens | 6208 tokens |1124| High | 4160 tokens | 6240 tokens | 6208 tokens |

1449 1125 

1450Note that you will also need to account for [input tokens](https://developers.openai.com/api/docs/guides/images-vision?api-mode=responses#calculating-costs): text tokens for the prompt and image tokens for the input images if editing images.1126Note that you will also need to account for [input tokens](https://developers.openai.com/api/docs/guides/images-vision?api-mode=responses#calculating-costs): text tokens for the prompt and image tokens for the input images if editing images.

1451If you are using high input fidelity, the number of input tokens will be higher.1127Because `gpt-image-2` always processes image inputs at high fidelity, edit requests that include reference images can use more input tokens.

1452 1128 

1453Refer to the [Calculating costs](#calculating-costs) section below for more1129Refer to the [pricing page](https://developers.openai.com/api/docs/pricing#image-generation) for current

1454information about price per text and image tokens.1130text and image token prices, and use the [Calculating costs](#calculating-costs)

1131section below to estimate request costs.

1455 1132 

1456So the final cost is the sum of:1133The final cost is the sum of:

1457 1134 

1458- input text tokens1135- input text tokens

1459- input image tokens if using the edits endpoint1136- input image tokens if using the edits endpoint


1461 1138 

1462### Calculating costs1139### Calculating costs

1463 1140 

1464Per-image output pricing is listed below. These tables cover output image1141Use the pricing calculator below to estimate request costs for GPT Image models.

1465generation only. You should still account for text and image input tokens when1142`gpt-image-2` supports thousands of valid resolutions; the table below lists the

1143same sizes used for previous GPT Image models for comparison. For GPT Image 1.5,

1144GPT Image 1, and GPT Image 1 Mini, the legacy per-image output pricing table is

1145also listed below. You should still account for text and image input tokens when

1466estimating the total cost of a request.1146estimating the total cost of a request.

1467 1147 

1148A larger non-square resolution can sometimes produce fewer output tokens than

1149 a smaller or square resolution at the same quality setting.

1150 

1468<table1151<table

1469 style={{ borderCollapse: "collapse", tableLayout: "fixed", width: "100%" }}1152 style={{ borderCollapse: "collapse", tableLayout: "fixed", width: "100%" }}

1470>1153>


1482 <tbody>1165 <tbody>

1483 <tr>1166 <tr>

1484 <td rowSpan="3" style={{ padding: "8px", width: "28%" }}>1167 <td rowSpan="3" style={{ padding: "8px", width: "28%" }}>

1485 GPT Image 1.51168 GPT Image 2

1169 <br />

1170 <span style={{ fontSize: "0.875em" }}>Additional sizes available</span>

1486 </td>1171 </td>

1487 <td style={{ padding: "8px" }}>Low</td>1172 <td style={{ padding: "8px" }}>Low</td>

1488 <td style={{ padding: "8px" }}>$0.009</td>1173 <td style={{ padding: "8px" }}>$0.006</td>

1489 <td style={{ padding: "8px" }}>$0.013</td>1174 <td style={{ padding: "8px" }}>$0.005</td>

1490 <td style={{ padding: "8px" }}>$0.013</td>1175 <td style={{ padding: "8px" }}>$0.005</td>

1491 </tr>1176 </tr>

1492 <tr>1177 <tr>

1493 <td style={{ padding: "8px" }}>Medium</td>1178 <td style={{ padding: "8px" }}>Medium</td>

1494 <td style={{ padding: "8px" }}>$0.034</td>1179 <td style={{ padding: "8px" }}>$0.053</td>

1495 <td style={{ padding: "8px" }}>$0.05</td>1180 <td style={{ padding: "8px" }}>$0.041</td>

1496 <td style={{ padding: "8px" }}>$0.05</td>1181 <td style={{ padding: "8px" }}>$0.041</td>

1497 </tr>1182 </tr>

1498 <tr>1183 <tr>

1499 <td style={{ padding: "8px" }}>High</td>1184 <td style={{ padding: "8px" }}>High</td>

1500 <td style={{ padding: "8px" }}>$0.133</td>1185 <td style={{ padding: "8px" }}>$0.211</td>

1501 <td style={{ padding: "8px" }}>$0.2</td>1186 <td style={{ padding: "8px" }}>$0.165</td>

1502 <td style={{ padding: "8px" }}>$0.2</td>1187 <td style={{ padding: "8px" }}>$0.165</td>

1503 </tr>1188 </tr>

1504 1189 

1505 <tr>1190 <tr>

1506 <td rowSpan="3" style={{ padding: "8px", width: "28%" }}>1191 <td rowSpan="3" style={{ padding: "8px", width: "28%" }}>

1507 GPT Image Latest1192 GPT Image 1.5

1508 </td>1193 </td>

1509 <td style={{ padding: "8px" }}>Low</td>1194 <td style={{ padding: "8px" }}>Low</td>

1510 <td style={{ padding: "8px" }}>$0.009</td>1195 <td style={{ padding: "8px" }}>$0.009</td>


1571 </tbody>1256 </tbody>

1572</table>1257</table>

1573 1258 

1574<table

1575 style={{ borderCollapse: "collapse", tableLayout: "fixed", width: "100%" }}

1576>

1577 <thead>

1578 <tr>

1579 <th style={{ width: "28%" }}>Model</th>

1580 <th

1581 style={{

1582 textAlign: "left",

1583 paddingLeft: "0.5rem",

1584 paddingRight: "0.5rem",

1585 width: "14%",

1586 }}

1587 >

1588 Quality

1589 </th>

1590 <th

1591 style={{

1592 textAlign: "left",

1593 paddingLeft: "0.5rem",

1594 paddingRight: "0.5rem",

1595 width: "19.33%",

1596 }}

1597 >

1598 1024 x 1024

1599 </th>

1600 <th

1601 style={{

1602 textAlign: "left",

1603 paddingLeft: "0.5rem",

1604 paddingRight: "0.5rem",

1605 width: "19.33%",

1606 }}

1607 >

1608 1024 x 1792

1609 </th>

1610 <th

1611 style={{

1612 textAlign: "left",

1613 paddingLeft: "0.5rem",

1614 paddingRight: "0.5rem",

1615 width: "19.34%",

1616 }}

1617 >

1618 1792 x 1024

1619 </th>

1620 </tr>

1621 </thead>

1622 <tbody>

1623 <tr>

1624 <td rowSpan="2" style={{ width: "28%" }}>

1625 DALL·E 3

1626 </td>

1627 <td

1628 style={{

1629 textAlign: "left",

1630 paddingLeft: "0.5rem",

1631 paddingRight: "0.5rem",

1632 }}

1633 >

1634 Standard

1635 </td>

1636 <td

1637 style={{

1638 textAlign: "left",

1639 paddingLeft: "0.5rem",

1640 paddingRight: "0.5rem",

1641 }}

1642 >

1643 $0.04

1644 </td>

1645 <td

1646 style={{

1647 textAlign: "left",

1648 paddingLeft: "0.5rem",

1649 paddingRight: "0.5rem",

1650 }}

1651 >

1652 $0.08

1653 </td>

1654 <td

1655 style={{

1656 textAlign: "left",

1657 paddingLeft: "0.5rem",

1658 paddingRight: "0.5rem",

1659 }}

1660 >

1661 $0.08

1662 </td>

1663 </tr>

1664 <tr>

1665 <td

1666 style={{

1667 textAlign: "left",

1668 paddingLeft: "0.5rem",

1669 paddingRight: "0.5rem",

1670 }}

1671 >

1672 HD

1673 </td>

1674 <td

1675 style={{

1676 textAlign: "left",

1677 paddingLeft: "0.5rem",

1678 paddingRight: "0.5rem",

1679 }}

1680 >

1681 $0.08

1682 </td>

1683 <td

1684 style={{

1685 textAlign: "left",

1686 paddingLeft: "0.5rem",

1687 paddingRight: "0.5rem",

1688 }}

1689 >

1690 $0.12

1691 </td>

1692 <td

1693 style={{

1694 textAlign: "left",

1695 paddingLeft: "0.5rem",

1696 paddingRight: "0.5rem",

1697 }}

1698 >

1699 $0.12

1700 </td>

1701 </tr>

1702 </tbody>

1703</table>

1704 

1705<table

1706 style={{ borderCollapse: "collapse", tableLayout: "fixed", width: "100%" }}

1707>

1708 <thead>

1709 <tr>

1710 <th style={{ width: "28%" }}>Model</th>

1711 <th

1712 style={{

1713 textAlign: "left",

1714 paddingLeft: "0.5rem",

1715 paddingRight: "0.5rem",

1716 width: "14%",

1717 }}

1718 >

1719 Quality

1720 </th>

1721 <th

1722 style={{

1723 textAlign: "left",

1724 paddingLeft: "0.5rem",

1725 paddingRight: "0.5rem",

1726 width: "19.33%",

1727 }}

1728 >

1729 256 x 256

1730 </th>

1731 <th

1732 style={{

1733 textAlign: "left",

1734 paddingLeft: "0.5rem",

1735 paddingRight: "0.5rem",

1736 width: "19.33%",

1737 }}

1738 >

1739 512 x 512

1740 </th>

1741 <th

1742 style={{

1743 textAlign: "left",

1744 paddingLeft: "0.5rem",

1745 paddingRight: "0.5rem",

1746 width: "19.34%",

1747 }}

1748 >

1749 1024 x 1024

1750 </th>

1751 </tr>

1752 </thead>

1753 <tbody>

1754 <tr>

1755 <td style={{ width: "28%" }}>DALL·E 2</td>

1756 <td

1757 style={{

1758 textAlign: "left",

1759 paddingLeft: "0.5rem",

1760 paddingRight: "0.5rem",

1761 }}

1762 >

1763 Standard

1764 </td>

1765 <td

1766 style={{

1767 textAlign: "left",

1768 paddingLeft: "0.5rem",

1769 paddingRight: "0.5rem",

1770 }}

1771 >

1772 $0.016

1773 </td>

1774 <td

1775 style={{

1776 textAlign: "left",

1777 paddingLeft: "0.5rem",

1778 paddingRight: "0.5rem",

1779 }}

1780 >

1781 $0.018

1782 </td>

1783 <td

1784 style={{

1785 textAlign: "left",

1786 paddingLeft: "0.5rem",

1787 paddingRight: "0.5rem",

1788 }}

1789 >

1790 $0.02

1791 </td>

1792 </tr>

1793 </tbody>

1794</table>

1795 

1796### Partial images cost1259### Partial images cost

1797 1260 

1798If you want to [stream image generation](#streaming) using the `partial_images` parameter, each partial image will incur an additional 100 image output tokens.1261If you want to [stream image generation](#streaming) using the `partial_images` parameter, each partial image will incur an additional 100 image output tokens.
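The surcharge is simple to estimate. The sketch below adds 100 output tokens per partial image to a base token count; the function name is hypothetical, and the base value of 4160 is taken from the High-quality row of the token table above purely as an example.

```python
PARTIAL_IMAGE_TOKENS = 100  # additional image output tokens per partial image


def output_tokens_with_partials(base_output_tokens: int, partial_images: int) -> int:
    """Estimate image output tokens when streaming partial images."""
    if partial_images < 0:
        raise ValueError("partial_images must not be negative")
    return base_output_tokens + PARTIAL_IMAGE_TOKENS * partial_images


if __name__ == "__main__":
    # 4160 base tokens plus three partial images.
    print(output_tokens_with_partials(4160, 3))  # prints 4460
```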

Details

10 10 

11### A tour of image-related use cases11### A tour of image-related use cases

12 12 

13Recent language models can process image inputs and analyze them a capability known as **vision**. With `gpt-image-1`, they can both analyze visual inputs and create images.13Recent language models can process image inputs and analyze them—a capability known as **vision**. GPT Image models can use text and image inputs to create new images or edit existing ones.

14 14 

15The OpenAI API offers several endpoints to process images as input or generate them as output, enabling you to build powerful multimodal applications.15The OpenAI API offers several endpoints to process images as input or generate them as output, enabling you to build powerful multimodal applications.

16 16 


26 26 

27You can generate or edit images using the Image API or the Responses API.27You can generate or edit images using the Image API or the Responses API.

28 28 

29Our latest image generation model, `gpt-image-1`, is a natively multimodal large language model.29The state-of-the-art image generation model, `gpt-image-2`, can understand text and images and use broad world knowledge to generate images with strong instruction following and contextual awareness.

30It can understand text and images and leverage its broad world knowledge to generate images with better instruction following and contextual awareness.

31 

32In contrast, we also offer specialized image generation models - DALL·E 2 and 3 - which don't have the same inherent understanding of the world as GPT Image.

33 30 

34 31 

35 32 


89 86 

90### Using world knowledge for image generation87### Using world knowledge for image generation

91 88 

92The difference between DALL·E models and GPT Image is that a natively multimodal language model can use its visual understanding of the world to generate lifelike images including real-life details without a reference.89GPT Image models can use visual understanding of the world to generate lifelike images including real-life details without a reference.

93 90 

94For example, if you prompt GPT Image to generate an image of a glass cabinet with the most popular semi-precious stones, the model knows enough to select gemstones like amethyst, rose quartz, jade, etc, and depict them in a realistic way.91For example, if you prompt GPT Image to generate an image of a glass cabinet with the most popular semi-precious stones, the model knows enough to select gemstones like amethyst, rose quartz, jade, etc, and depict them in a realistic way.

95 92 

Details

857 857 

858```javascript858```javascript

859const answer = await client.responses.create({859const answer = await client.responses.create({

860 model: 'gpt-5',860 model: 'gpt-5.4',

861 input: 'Who is the current president of France?',861 input: 'Who is the current president of France?',

862 tools: [{ type: 'web_search' }]862 tools: [{ type: 'web_search' }]

863});863});


867 867 

868```python868```python

869answer = client.responses.create(869answer = client.responses.create(

870 model="gpt-5",870 model="gpt-5.4",

871 input="Who is the current president of France?",871 input="Who is the current president of France?",

872 tools=[{"type": "web_search_preview"}]872 tools=[{"type": "web_search"}]

873)873)

874 874 

875print(answer.output_text)875print(answer.output_text)


880 -H "Content-Type: application/json" \\880 -H "Content-Type: application/json" \\

881 -H "Authorization: Bearer $OPENAI_API_KEY" \\881 -H "Authorization: Bearer $OPENAI_API_KEY" \\

882 -d '{882 -d '{

883 "model": "gpt-5",883 "model": "gpt-5.4",

884 "input": "Who is the current president of France?",884 "input": "Who is the current president of France?",

885 "tools": [{"type": "web_search"}]885 "tools": [{"type": "web_search"}]

886 }'886 }'

Details

1# Image generation1# Image generation

2 2 

3The image generation tool allows you to generate images using a text prompt, and optionally image inputs. It leverages GPT Image models (`gpt-image-1`, `gpt-image-1-mini`, and `gpt-image-1.5`), and automatically optimizes text inputs for improved performance.3The image generation tool allows you to generate images using a text prompt, and optionally image inputs. It uses GPT Image models, including `gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`, and automatically optimizes text inputs for improved performance.

4 4 

5To learn more about image generation, refer to our dedicated [image generation5To learn more about image generation, refer to our dedicated [image generation

6 guide](https://developers.openai.com/api/docs/guides/image-generation?image-generation-model=gpt-image&api=responses).6 guide](https://developers.openai.com/api/docs/guides/image-generation?api=responses).

7 7 

8## Usage8## Usage

9 9 


18const openai = new OpenAI();18const openai = new OpenAI();

19 19 

20const response = await openai.responses.create({20const response = await openai.responses.create({

21 model: "gpt-5",21 model: "gpt-5.4",

22 input: "Generate an image of gray tabby cat hugging an otter with an orange scarf",22 input: "Generate an image of gray tabby cat hugging an otter with an orange scarf",

23 tools: [{type: "image_generation"}],23 tools: [{type: "image_generation"}],

24});24});


42client = OpenAI() 42client = OpenAI()

43 43 

44response = client.responses.create(44response = client.responses.create(

45 model="gpt-5",45 model="gpt-5.4",

46 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",46 input="Generate an image of gray tabby cat hugging an otter with an orange scarf",

47 tools=[{"type": "image_generation"}],47 tools=[{"type": "image_generation"}],

48)48)


69 69 

70You can configure the following output options as parameters for the [image generation tool](https://developers.openai.com/api/docs/api-reference/responses/create#responses-create-tools):70You can configure the following output options as parameters for the [image generation tool](https://developers.openai.com/api/docs/api-reference/responses/create#responses-create-tools):

71 71 

72- Size: Image dimensions (e.g., 1024x1024, 1024x1536)72- Size: Image dimensions, for example, 1024 × 1024 or 1024 × 1536

73- Quality: Rendering quality (e.g. low, medium, high)73- Quality: Rendering quality, for example, low, medium, or high

74- Format: File output format74- Format: File output format

75- Compression: Compression level (0-100%) for JPEG and WebP formats75- Compression: Compression level (0-100%) for JPEG and WebP formats

76- Background: Transparent or opaque76- Background: Transparent or opaque


78 78 

79`size`, `quality`, and `background` support the `auto` option, where the model will automatically select the best option based on the prompt.79`size`, `quality`, and `background` support the `auto` option, where the model will automatically select the best option based on the prompt.

80 80 

81`gpt-image-2` supports flexible `size` values that meet its [resolution constraints](https://developers.openai.com/api/docs/guides/image-generation#size-and-quality-options). It doesn't currently support transparent backgrounds, so requests with `background: "transparent"` fail.

82 

81For more details on available options, refer to the [image generation guide](https://developers.openai.com/api/docs/guides/image-generation#customize-image-output).83For more details on available options, refer to the [image generation guide](https://developers.openai.com/api/docs/guides/image-generation#customize-image-output).

82 84 

83For `gpt-image-1.5` and `chatgpt-image-latest` when used with the Responses API, you can optionally set the `action` parameter (`auto`, `generate`, or `edit`) to control whether the request performs image generation or editing. We recommend leaving it at `auto` so the model chooses whether to generate a new image or edit one already in context, but if your use case requires always editing or always creating images, you can force the behavior by setting `action`. If not specified, the default is `auto`.85When using the Responses API image generation tool, supported GPT Image models can choose whether to generate a new image or edit one already in the conversation. The optional `action` parameter controls this behavior: keep `action` set to `auto` so the model chooses whether to generate or edit, or set it to `generate` or `edit` to force that behavior. If not specified, the default is `auto`.

  ### Revised prompt

− When using the image generation tool, the mainline model (e.g. `gpt-4.1`) will automatically revise your prompt for improved performance.
+ When using the image generation tool, the mainline model, for example, `gpt-5.4`, will automatically revise your prompt for improved performance.

  You can access the revised prompt in the `revised_prompt` field of the image generation call:
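As a rough sketch of reading that field, the Python snippet below collects `revised_prompt` values from a list of output items. It works on plain dictionaries with hand-written sample data. The item type string `image_generation_call` and the helper name `revised_prompts` are assumptions for illustration, and the SDK returns objects rather than dictionaries.

```python
def revised_prompts(output_items):
    """Collect revised_prompt values from image generation call items."""
    return [
        item["revised_prompt"]
        for item in output_items
        # Assumption: image generation results use this item type.
        if item.get("type") == "image_generation_call"
        and item.get("revised_prompt")
    ]


# Hand-written sample shaped like a response's output list (not real API output).
sample_output = [
    {"type": "message", "content": []},
    {
        "type": "image_generation_call",
        "revised_prompt": "A gray tabby cat hugging an otter in an orange scarf",
        "result": "<base64 image data>",
    },
]

print(revised_prompts(sample_output))
```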



  ### Prompting tips

− Image generation works best when you use terms like "draw" or "edit" in your prompt.
+ Image generation works best when you use terms like `draw` or `edit` in your prompt.

− For example, if you want to combine images, instead of saying "combine" or "merge", you can say something like "edit the first image by adding this element from the second image".
+ For example, if you want to combine images, instead of saying `combine` or `merge`, you can say something like "edit the first image by adding this element from the second image."

  ## Multi-turn editing

− You can iteratively edit images by referencing previous response or image IDs. This allows you to refine images across multiple turns in a conversation.
+ You can iteratively edit images by referencing previous response or image IDs. This allows you to refine images across conversation turns.


  const openai = new OpenAI();

  const response = await openai.responses.create({
−   model: "gpt-5",
+   model: "gpt-5.4",
    input:
      "Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools: [{ type: "image_generation" }],


  // Follow up

  const response_fwup = await openai.responses.create({
−   model: "gpt-5",
+   model: "gpt-5.4",
    previous_response_id: response.id,
    input: "Now make it look realistic",
    tools: [{ type: "image_generation" }],


  client = OpenAI()

  response = client.responses.create(
−     model="gpt-5",
+     model="gpt-5.4",
      input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
      tools=[{"type": "image_generation"}],
  )


  # Follow up

  response_fwup = client.responses.create(
−     model="gpt-5",
+     model="gpt-5.4",
      previous_response_id=response.id,
      input="Now make it look realistic",
      tools=[{"type": "image_generation"}],


  const openai = new OpenAI();

  const response = await openai.responses.create({
−   model: "gpt-5",
+   model: "gpt-5.4",
    input:
      "Generate an image of gray tabby cat hugging an otter with an orange scarf",
    tools: [{ type: "image_generation" }],


  // Follow up

  const response_fwup = await openai.responses.create({
−   model: "gpt-5",
+   model: "gpt-5.4",
    input: [
      {
        role: "user",


  import base64

  response = openai.responses.create(
−     model="gpt-5",
+     model="gpt-5.4",
      input="Generate an image of gray tabby cat hugging an otter with an orange scarf",
      tools=[{"type": "image_generation"}],
  )


  # Follow up

  response_fwup = openai.responses.create(
−     model="gpt-5",
+     model="gpt-5.4",
      input=[
          {
              "role": "user",


  ## Streaming

− The image generation tool supports streaming partial images as the final result is being generated. This provides faster visual feedback for users and improves perceived latency.
+ The image generation tool supports streaming partial images while it generates the final result. This provides faster visual feedback for users and improves perceived latency.

  You can set the number of partial images (1-3) with the `partial_images` parameter.
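The 1-3 range can be enforced before the request is made. The Python sketch below uses a hypothetical helper, `streaming_options`, that validates the count and returns keyword arguments matching the `stream` and `partial_images` parameters described here. The client-side check itself is an assumption, not something the API requires.

```python
def streaming_options(partial_images):
    """Return streaming kwargs after checking the documented 1-3 range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(partial_images, bool) or not isinstance(partial_images, int):
        raise TypeError("partial_images must be an integer")
    if not 1 <= partial_images <= 3:
        raise ValueError("partial_images must be between 1 and 3")
    return {"stream": True, "partial_images": partial_images}


options = streaming_options(2)
```

The returned dictionary can be unpacked into an image generation call, for example `client.images.generate(prompt=..., model=..., **options)`.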



    "Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape";
  const stream = await openai.images.generate({
    prompt: prompt,
−   model: "gpt-image-1.5",
+   model: "gpt-image-2",
    stream: true,
    partial_images: 2,
  });


  stream = client.images.generate(
      prompt="Draw a gorgeous image of a river made of white owl feathers, snaking its way through a serene winter landscape",
−     model="gpt-image-1.5",
+     model="gpt-image-2",
      stream=True,
      partial_images=2,
  )


  ## Supported models

− The image generation tool is supported for the following models:
+ The following models support the image generation tool:

  - `gpt-4o`
  - `gpt-4o-mini`


  - `gpt-5.4`
  - `gpt-5.2`

− The model used for the image generation process is always a GPT Image model (`gpt-image-1.5`, `gpt-image-1`, or `gpt-image-1-mini`), but these models are not valid values for the `model` field in the Responses API. Use a text-capable mainline model (for example, `gpt-4.1` or `gpt-5`) with the hosted `image_generation` tool.
+ The model used for the image generation process is always a GPT Image model, including `gpt-image-2`, `gpt-image-1.5`, `gpt-image-1`, and `gpt-image-1-mini`, but these models aren't valid values for the `model` field in the Responses API. Use a text-capable mainline model (for example, `gpt-5.4` or `gpt-5`) with the hosted `image_generation` tool.
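The rule in the last paragraph can be expressed as a small guard. In this Python sketch the helper name `check_responses_model` is hypothetical. The set contains only the GPT Image models named above, and because the text says "including", it may not be exhaustive.

```python
# GPT Image models named in the text above; the list may be incomplete.
GPT_IMAGE_MODELS = {
    "gpt-image-2",
    "gpt-image-1.5",
    "gpt-image-1",
    "gpt-image-1-mini",
}


def check_responses_model(model):
    """Reject GPT Image models as the Responses API `model` value."""
    if model in GPT_IMAGE_MODELS:
        raise ValueError(
            f"{model!r} is a GPT Image model; use a mainline model with the "
            "image_generation tool instead"
        )
    return model


model = check_responses_model("gpt-5.4")
```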