
approvals_reviewer = "auto_review"
```

For the full reviewer lifecycle, trigger conditions, configuration precedence,
and failure behavior, see
[Auto-review](https://developers.openai.com/codex/concepts/sandboxing/auto-review).

The reviewer evaluates only actions that already need approval, such as sandbox
escalations, blocked network requests, `request_permissions` prompts, or
side-effecting app and MCP tool calls. Actions that stay inside the sandbox
continue without an extra review step.

The reviewer policy checks for data exfiltration, credential probing, persistent
security weakening, and destructive actions. Low-risk and medium-risk actions
can proceed when policy allows them. The policy denies critical-risk actions.
High-risk actions require enough user authorization and no matching deny rule.
Prompt-build, review-session, and parse failures fail closed. Timeouts are
surfaced separately, but the action still does not run.

The [default reviewer policy](https://github.com/openai/codex/blob/main/codex-rs/core/src/guardian/policy.md)
is in the open-source Codex repository. Enterprises can replace its
tenant-specific section with `guardian_policy_config` in managed requirements.
Individual users can also set a local `[auto_review].policy`, but managed requirements
take precedence. For setup details, see
[Managed configuration](https://developers.openai.com/codex/enterprise/managed-configuration#configure-automatic-review-policy).

In the Codex app, these reviews appear as automatic review items with a status
such as Reviewing, Approved, Denied, Aborted, or Timed out. They can also
include a risk level and user-authorization assessment for the reviewed
request.

Automatic review uses extra model calls, so it can add to Codex usage. Admins
can constrain it with `allowed_approvals_reviewers`.

### Common sandbox and approval combinations

| Intent | Flags / config | Effect |
| --- | --- | --- |
| Auto (preset) | _no flags needed_ or `--sandbox workspace-write --ask-for-approval on-request` | Codex can read files, make edits, and run commands in the workspace. Codex requires approval to edit outside the workspace or to access network. |
| Safe read-only browsing | `--sandbox read-only --ask-for-approval on-request` | Codex can read files and answer questions. Codex requires approval to make edits, run commands, or access network. |
| Read-only non-interactive (CI) | `--sandbox read-only --ask-for-approval never` | Codex can only read files; never asks for approval. |
| Automatically edit but ask for approval to run untrusted commands | `--sandbox workspace-write --ask-for-approval untrusted` | Codex can read and edit files but asks for approval before running untrusted commands. |
| Auto-review mode | `--sandbox workspace-write --ask-for-approval on-request -c approvals_reviewer=auto_review` or `approvals_reviewer = "auto_review"` | Same sandbox boundary as standard on-request mode, but eligible approval requests are reviewed by Auto-review instead of surfacing to the user. |
| Dangerous full access | `--dangerously-bypass-approvals-and-sandbox` (alias: `--yolo`) | <ElevatedRiskBadge /> No sandbox; no approvals _(not recommended)_ |

For non-interactive runs, use `codex exec --sandbox workspace-write`; Codex keeps older `codex exec --full-auto` invocations as a deprecated compatibility path and prints a warning.
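The Auto-review row above can also be expressed as `config.toml` defaults rather than flags. A minimal sketch, using only keys documented on this page:

```toml
# Lower-risk local automation preset, with eligible approval
# requests routed to the Auto-review reviewer agent.
sandbox_mode = "workspace-write"
approval_policy = "on-request"
approvals_reviewer = "auto_review"
```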

app.md +7 −0


Open rendered pages, leave comments, or let Codex operate local browser flows.

 </BentoContent>
 <BentoContent href="/codex/app/chrome-extension">

### Chrome extension

Add the Chrome plugin so Codex can use Chrome for signed-in browser tasks while you manage website approvals.

 </BentoContent>
 <BentoContent href="/codex/app/features#image-generation">

app/browser.md +5 −1


Use it for local development servers, file-backed previews, and public pages
that don't require sign-in. For anything that depends on login state or browser
extensions, use your regular browser or the
[Codex Chrome extension](https://developers.openai.com/codex/app/chrome-extension).

Open the in-app browser from the toolbar, by clicking a URL, by navigating
manually in the browser, or by pressing <kbd>Cmd</kbd>+<kbd>Shift</kbd>+<kbd>B</kbd>

the allowed list means Codex asks again before using it; removing a site from the
blocked list means Codex can ask again instead of treating it as blocked.

For signed-in websites in Chrome, see
[Codex Chrome extension](https://developers.openai.com/codex/app/chrome-extension).

## Preview a page

1. Start your app's development server in the [integrated terminal](https://developers.openai.com/codex/app/features#integrated-terminal) or with a [local environment action](https://developers.openai.com/codex/app/local-environments#actions).

app/chrome-extension.md +171 −0 added


# Codex Chrome extension

The Codex Chrome extension lets Codex use Chrome for browser tasks that need
your signed-in browser state. Use it when Codex needs to read or act on sites
such as LinkedIn, Salesforce, Gmail, or internal tools.

For local development servers, file-backed previews, and public pages that do
not require sign-in, use the [in-app browser](https://developers.openai.com/codex/app/browser) first. The
in-app browser keeps preview and verification work inside Codex without using
your Chrome profile.

Codex can also switch between tools as a task requires, using plugins when a
dedicated integration is available, Chrome when it needs logged-in browser
context, and the in-app browser for localhost.

<div className="not-prose my-4">
  <Alert
    client:load
    color="warning"
    variant="soft"
    description="Treat page content as untrusted context, and review the website before allowing Codex to continue."
  />
</div>

## Set up Chrome from Plugins

Set up the extension from Codex:

1. Open Codex and go to **Plugins**.
2. Add the **Chrome** plugin.
3. Follow the setup flow. It guides you through installing or connecting the
   Chrome extension and approving Chrome's permission prompts.
4. Open Chrome and confirm the Codex extension shows **Connected**.

After the plugin setup is complete, start a new Codex thread. Codex can suggest
Chrome when a task needs a signed-in website. You can also invoke it directly in
a prompt:

```text
@Chrome open Salesforce and update the account from these call notes.
```

If Chrome isn't already open, Codex can open it. Chrome browser tasks run in
Chrome tab groups so the work for a thread stays grouped together.

## Control website access

By default, Codex asks before it interacts with each new website. Codex bases
the prompt on the website host, such as `example.com`.

When Codex asks to use a website, you can choose the option that matches the
task and your risk tolerance:

- Allow the website for the current chat.
- Always allow the host so Codex can use that website again without asking.
- Decline the website.

### Manage the allowlist and blocklist

In Computer Use settings, you can manage an allowlist and blocklist for
domains. The allowlist contains domains Codex can use without asking again. The
blocklist contains domains Codex shouldn't use.

Removing a domain from the allowlist means Codex asks again before using it.
Removing a domain from the blocklist means Codex can ask again instead of
treating the domain as blocked.
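The ask/allow/deny behavior described above can be sketched as a small helper. This is purely illustrative: `decide_host` and the list names are hypothetical, not Codex's actual implementation, and the blocklist-first ordering is an assumption.

```python
def decide_host(host: str, allowlist: set[str], blocklist: set[str]) -> str:
    """Illustrative decision for a website host such as "example.com".

    Returns "deny" for blocklisted hosts, "allow" for allowlisted hosts,
    and "ask" otherwise, mirroring the prompt behavior described above.
    """
    if host in blocklist:
        return "deny"   # blocklisted domains shouldn't be used
    if host in allowlist:
        return "allow"  # no prompt for always-allowed domains
    return "ask"        # default: ask before first use


allow = {"example.com"}
block = {"tracker.bad"}
print(decide_host("example.com", allow, block))  # allow
print(decide_host("tracker.bad", allow, block))  # deny
print(decide_host("new-site.io", allow, block))  # ask
```

Removing a host from either set restores the default "ask" outcome, which matches the list-removal behavior described above.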

#### Always allow browser content <ElevatedRiskBadge class="ml-2" />

If you turn on always allow browser content, Codex no longer asks for
confirmation before using websites.

#### Browser history <ElevatedRiskBadge class="ml-2" />

Browser history can include sensitive telemetry, internal URLs, search terms,
and activity from Chrome sessions on signed-in devices. If you allow Codex to
access browser history, relevant history entries can become part of the context
Codex uses for the task. Malicious or misleading page content can increase the
risk that Codex copies this data somewhere unintended.

Codex asks when it wants to use browser history. Codex scopes history access to
the request, and history doesn't have an always-allow option.

## Data and security

### Chrome extension permissions

Chrome asks you to accept extension permissions when you install the extension.
The permission prompt may include:

- Access the page debugger
- Read and change all your data on all websites
- Read and change your browsing history on all your signed-in devices
- Display notifications
- Read and change your bookmarks
- Manage your downloads
- Communicate with cooperating native applications
- View and manage your tab groups

These Chrome permissions make the extension capable of operating browser
workflows. Codex still uses its own confirmations, settings, allowlists, and
blocklists before using websites or browser history during a task.

### Memories

Browser use follows your Codex Memories setting. If Memories is on, Codex can
use relevant saved memories while working in Chrome. If Memories is off, browser
use doesn't use memories.

### What OpenAI stores from browsing

OpenAI doesn't store a separate complete record of your Chrome actions from the
extension. OpenAI stores browser activity only when it becomes part of the Codex
context, such as text Codex reads from a page, screenshots, tool calls,
summaries, messages, or other content included in the thread.

Your ChatGPT and Codex data controls apply to content processed in context.
Avoid sending secrets or highly sensitive data through browser tasks unless
they're required and you are present to review each prompt.

## Troubleshooting

If Codex can't connect to Chrome, first confirm the website Codex is trying to
access isn't in the blocklist in Settings. If the website isn't blocked, work
through these checks:

1. Open the Codex extension from the Chrome toolbar or Chrome's extensions
   menu. Make sure it shows **Connected**. If it shows disconnected or mentions
   a missing native host, remove and re-add the Chrome plugin from **Plugins**
   in Codex, then follow the setup flow again.
2. In Codex, open **Plugins** and confirm that the Chrome plugin is on. If the
   plugin is off, turn it on and try the task again.
3. Make sure you are using the same Chrome profile where the Codex extension is
   installed. If you use more than one Chrome profile, install and enable the
   extension in the active profile.
4. Start a new Codex thread and try the Chrome task again. This can clear
   thread-specific connection state.
5. Restart Chrome and Codex, then try again. If the extension still doesn't
   connect, uninstall the Codex Chrome extension, remove and re-add the Chrome
   plugin from **Plugins**, and follow the setup flow again.
6. If the extension shows **Connected** but Codex still can't use Chrome, run
   `/feedback` in the Codex app and include the thread ID when you contact
   support.

<CodexScreenshot
  alt="Codex Chrome extension showing connected status"
  lightSrc="/images/codex/app/chrome-connected-light.png"
  darkSrc="/images/codex/app/chrome-connected-dark.png"
  maxHeight="300px"
  class="mt-4"
/>

### Upload files

If a Chrome task needs to upload a file from your computer, allow the Codex
extension to access file URLs in Chrome:

1. In Chrome, open the extensions icon in the toolbar, then click **Manage
   Extensions**.
2. On the Codex extension card, click **Details**.
3. Turn on **Allow access to file URLs**.

After you change the setting, start the Chrome task again.

<CodexScreenshot
  alt="Chrome extension settings showing Allow access to file URLs enabled for Codex"
  lightSrc="/images/codex/app/chrome-file-url-access-light.webp"
  darkSrc="/images/codex/app/chrome-file-url-access-dark.webp"
  maxHeight="420px"
  class="mt-4"
/>

app/settings.md +5 −4


## Browser use

Use these settings to install or enable the bundled Browser plugin, set up the
[Codex Chrome extension](https://developers.openai.com/codex/app/chrome-extension), and manage allowlisted
and blocklisted websites. Codex asks before using a website unless you've
allowlisted it. Removing a site from the blocklist lets Codex ask again before
using it in the browser.

See [In-app browser](https://developers.openai.com/codex/app/browser) for browser preview, comment, and
browser use workflows.


the composer or chat input. That selector lets you rely on Codex's default
permissions, switch to full access, or use your custom configuration.

<PermissionModeSelectorDemo client:load />

In the CLI, use [`/permissions`](https://developers.openai.com/codex/cli/slash-commands#update-permissions-with-permissions)
to switch modes during a session.

configuration. Codex stores those defaults in `config.toml`, its local settings
file. [Config basics](https://developers.openai.com/codex/config-basic) explains how it works, and the
[Configuration reference](https://developers.openai.com/codex/config-reference) documents the exact keys for
`sandbox_mode`, `approval_policy`, `approvals_reviewer`, and
`sandbox_workspace_write.writable_roots`. Use those settings to decide how much
autonomy Codex gets by default, which directories it can write to, when it
should pause for approval, and who reviews eligible approval requests.

At a high level, the common sandbox modes are:

  needs to go beyond that boundary.
- `never`: Codex doesn't stop for approval prompts.

When approvals are interactive, you can also choose who reviews them with
`approvals_reviewer`:

- `user`: approval prompts surface to the user. This is the default.
- `auto_review`: eligible approval prompts go to a reviewer agent (see
  [Auto-review](https://developers.openai.com/codex/concepts/sandboxing/auto-review)).

Full access means using `sandbox_mode = "danger-full-access"` together with
`approval_policy = "never"`. By contrast, the lower-risk local automation
preset is `sandbox_mode = "workspace-write"` together with
`approval_policy = "on-request"`, or the matching CLI flags
`--sandbox workspace-write --ask-for-approval on-request`. You can then keep
`approvals_reviewer = "user"` for manual approvals or set
`approvals_reviewer = "auto_review"` for automatic approval review.
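The contrast described above can be written out as a `config.toml` sketch. The keys come from this page; the comments are illustrative annotations, not official guidance:

```toml
# Full access (elevated risk, not recommended):
# sandbox_mode = "danger-full-access"
# approval_policy = "never"

# Lower-risk local automation preset:
sandbox_mode = "workspace-write"
approval_policy = "on-request"
approvals_reviewer = "user"  # or "auto_review" to route eligible approvals to the reviewer
```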

If you need Codex to work across more than one directory, writable roots let
you extend the places it can modify without removing the sandbox entirely. If

[Codex app features](https://developers.openai.com/codex/app/features#approvals-and-sandboxing), and for the
IDE-specific settings entry points, see [Codex IDE extension settings](https://developers.openai.com/codex/ide/settings).

Automatic review, when available, does not change the sandbox boundary. It is
one possible `approvals_reviewer` for approval requests at that boundary, such
as sandbox escalations, blocked network access, or side-effecting tool calls
that still need approval. Actions already allowed inside the sandbox run
without extra review. For the reviewer lifecycle, trigger types, denial
semantics, and configuration details, see
[Auto-review](https://developers.openai.com/codex/concepts/sandboxing/auto-review).

Platform details live in the platform-specific docs. For native Windows setup,
behavior, and troubleshooting, see [Windows](https://developers.openai.com/codex/windows). For admin


# Auto-review

Auto-review replaces manual approval at the sandbox boundary with a separate
reviewer agent. The main Codex agent still runs inside the same sandbox, with
the same approval policy and the same network and filesystem limits. The
difference is who reviews eligible escalation requests.

Auto-review only applies when approvals are interactive. In practice, that
means `approval_policy = "on-request"` or a granular approval policy that
still surfaces the relevant prompt category. With `approval_policy = "never"`,
there is nothing to review.

## How auto-review works

At a high level, the flow is:

1. The main agent works inside `read-only` or `workspace-write`.
2. When it needs to cross the sandbox boundary, it requests approval.
3. If `approvals_reviewer = "auto_review"`, Codex routes that approval request
   to a separate reviewer agent instead of stopping for a person.
4. The reviewer decides whether the action should run and returns a rationale.
5. If the action is approved, execution continues. If it is denied, the main
   agent is instructed to find a materially safer path or stop and ask the
   user.

Auto-review is a reviewer swap, not a permission grant. It does not expand
`writable_roots`, enable network access, or weaken protected paths. It only
changes how Codex handles actions that already need approval.

## When it triggers

Auto-review evaluates approval requests that would otherwise pause for a human.
These include:

- Shell or exec tool calls that request escalated sandbox permissions.
- Network requests blocked by the current sandbox or policy.
- File edits outside the allowed writable roots.
- MCP or app tool calls that require approval based on their tool annotations
  or configured approval mode.
- Browser Use access to a new website or domain.

Auto-review does not run for routine actions already allowed inside the
sandbox. If a command can run under the active `sandbox_mode`, or a tool call
stays within the allowed policy, the main agent continues without review.

Computer Use is a separate case. App approvals for Computer Use still surface
directly to the user, so Auto-review does not replace those app-level prompts.

## What auto-review blocks

At a high level, Auto-review is designed to block actions such as:

- sending private data, secrets, or credentials to untrusted destinations
- probing for credentials, tokens, cookies, or session material
- broad or persistent security weakening
- destructive actions with significant risk of irreversible damage

The exact policy lives in the open-source Codex repository:
[policy_template.md](https://github.com/openai/codex/blob/main/codex-rs/core/src/guardian/policy_template.md)
and
[policy.md](https://github.com/openai/codex/blob/main/codex-rs/core/src/guardian/policy.md).
That policy can be customized per enterprise with `guardian_policy_config` or
per user with a local [`[auto_review].policy`](/codex/config-advanced#approval-policies-and-sandbox-modes).

## What the reviewer sees

The reviewer is itself a Codex agent with a narrower job than the main agent:
decide whether a specific boundary-crossing action should run.

The reviewer sees a compact transcript plus the exact approval request. That
typically includes user messages, surfaced assistant updates, relevant tool
calls and tool outputs, and the action now being proposed for approval. It can
also perform read-only checks to gather missing context, but it does so rarely.

Hidden assistant reasoning is not included. Auto-review sees retained
conversation items and tool evidence, not private chain-of-thought.

## Denials and failure behavior

An explicit denial is not treated like an ordinary sandbox error. Codex returns
the review rationale to the main agent and adds a stronger instruction:

- Do not pursue the same outcome via workaround, indirect execution, or policy
  circumvention.
- Continue only with a materially safer alternative.
- Otherwise, stop and ask the user.

Codex also applies a rejection circuit breaker per turn. In the current
open-source implementation, Auto-review interrupts the turn after `3`
consecutive denials or `10` denials within a rolling window of the last `50`
reviews in the same turn.

Any non-denial resets the consecutive-denial counter. When the breaker trips,
Codex emits a warning and aborts the current turn with an interrupt rather than
letting the agent loop on more escalation attempts.
Timeouts are surfaced separately from explicit denials, and the main agent is
informed that a timeout alone is not proof that the action is unsafe.

There is also an explicit override path for denied actions. In the current
open-source TUI, run `/approve` to open the **Auto-review Denials** picker, then
select one recent denied action to approve for one retry. Codex records up to 10
recent denials per thread. That approval is narrow: it applies to the exact
denied action, not similar future actions; it is recorded for one retry in the
same context; and the retry still goes through Auto-review. Under the hood,
Codex injects a developer-scoped approval marker for that exact action. The
reviewer then sees that explicit user override as context, but it still follows
policy and can deny again if policy says the user cannot override that class of
denial.

## Configuration

For setup details, see
[Managed configuration](https://developers.openai.com/codex/enterprise/managed-configuration#configure-automatic-review-policy).

The default reviewer policy is in the open-source Codex repository:
[core/src/guardian/policy.md](https://github.com/openai/codex/blob/main/codex-rs/core/src/guardian/policy.md).
Enterprises can replace its tenant-specific section with
`guardian_policy_config` in managed requirements. Individual users can also set
a local
[`[auto_review].policy`](/codex/config-advanced#approval-policies-and-sandbox-modes)
in their `config.toml`, but managed requirements take precedence:

```toml
[auto_review]
policy = """
YOUR POLICY GOES HERE
"""
```

To customize the policy, copy the whole default policy wording first, then
iterate based on your own risk profile.

## Reduce review volume without weakening security

Auto-review works best when the sandbox already covers your common safe
workflows. If too many mundane actions need review, fix the boundary first
instead of teaching the reviewer to approve noisy escalations forever.

In practice, the highest-leverage changes are:

- Add narrow
  [`writable_roots`](https://developers.openai.com/codex/config-advanced#approval-policies-and-sandbox-modes)
  for scratch directories or neighboring repos you intentionally use.
- Add narrowly scoped [prefix rules](https://developers.openai.com/codex/rules). Prefer precise command
  prefixes such as `["cargo", "test"]` or `["pnpm", "run", "lint"]` over broad
  patterns such as `["python"]` or `["curl"]`. Broad rules often erase the very
  boundary Auto-review is meant to guard.
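For example, a narrow `writable_roots` addition might look like this in `config.toml`; the paths are illustrative:

```toml
# Allow writes to a scratch directory and a neighboring repo you
# intentionally use, without widening the rest of the sandbox.
[sandbox_workspace_write]
writable_roots = ["/home/me/scratch", "/home/me/neighbor-repo"]
```

Keep each root as specific as possible; a broad root has the same erasing effect on the boundary as a broad prefix rule.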

Auto-review session transcripts are retained under `~/.codex/sessions` by
default, so you can ask Codex to analyze past traffic there before changing
policy or permissions.

## Limits

Auto-review improves the default operating point for long-running agentic work,
but it is not a deterministic security guarantee.

- It only evaluates actions that ask to cross a boundary.
- It can still make mistakes, especially in adversarial or unusual contexts.
- It should complement, not replace, good sandbox design, monitoring, and
  organization-specific policy.

For the research rationale and published evaluation results, see the
[Alignment Research post on Auto-review](https://alignment.openai.com/auto-review/).

If you don't pin a model or `model_reasoning_effort`, Codex can choose a setup
that balances intelligence, speed, and price for the task. It may favor
`gpt-5.4-mini` for fast scans or a higher-effort `gpt-5.5` configuration for
more demanding reasoning. When you want finer control, steer that choice in
your prompt or set `model` and `model_reasoning_effort` directly in the agent
file.

For most tasks in Codex, start with `gpt-5.5`. Use `gpt-5.4-mini` when you
want a faster, lower-cost option for lighter subagent work. If you have
ChatGPT Pro and want near-instant text-only iteration, `gpt-5.3-codex-spark`
remains available in research preview.

### Model choice

- **`gpt-5.5`**: Start here for demanding agents. It is strongest for ambiguous, multi-step work that needs planning, tool use, validation, and follow-through across a larger context.
- **`gpt-5.4`**: Use this when a workflow is pinned to GPT-5.4. It combines strong coding, reasoning, tool use, and broader workflows.
- **`gpt-5.4-mini`**: Use for agents that favor speed and efficiency over depth, such as exploration, read-heavy scans, large-file review, or processing supporting documents. It works well for parallel workers that return distilled results to the main agent.
- **`gpt-5.3-codex-spark`**: If you have ChatGPT Pro, use this research preview model for near-instant, text-only iteration when latency matters more than broader capability.
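Pinning as described above might look like the following fragment; the keys come from this section, while the surrounding agent-file layout is an assumption:

```toml
# Pin a fast, low-cost setup for a read-heavy subagent.
model = "gpt-5.4-mini"
model_reasoning_effort = "low"
```

Leave both keys unset to let Codex pick the balance of intelligence, speed, and price itself.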

config-basic.md +1 −0

| Key | Default | Maturity | Description |
| -------------------- | :-------------------: | ------------ | ---------------------------------------------------------------------------------------- |
| `apps` | false | Experimental | Enable ChatGPT Apps/connectors support |
| `codex_git_commit` | false | Experimental | Enable Codex-generated git commits and commit attribution trailers |
| `codex_hooks` | true | Stable | Enable lifecycle hooks from `hooks.json` or inline `[hooks]`. See [Hooks](https://developers.openai.com/codex/hooks). |
| `fast_mode` | true | Stable | Enable Fast mode selection and the `service_tier = "fast"` path |
| `memories` | false | Stable | Enable [Memories](https://developers.openai.com/codex/memories) |

    key: "commit_attribution",
    type: "string",
    description:
      'Commit co-author trailer used when `[features].codex_git_commit` is enabled. Defaults to `Codex <noreply@openai.com>`; set `""` to disable.',
  },
  {
    key: "model_instructions_file",

    description:
      "Enable lifecycle hooks loaded from `hooks.json` or inline `[hooks]` config.",
  },
  {
    key: "features.codex_git_commit",
    type: "boolean",
    description:
      "Enable Codex-generated git commits. When enabled, Codex uses `commit_attribution` to append a `Co-authored-by:` trailer to generated commit messages.",
  },
  {
    key: "hooks",
    type: "table",

# Inline override for the history compaction prompt. Default: unset.
# compact_prompt = ""

# Override the default commit co-author trailer. This only takes effect when
# [features].codex_git_commit is enabled. When enabled and unset, Codex uses
# "Codex <noreply@openai.com>". Set to "" to disable it.
# commit_attribution = "Jane Doe <jane@example.com>"

# Override built-in base instructions with a file path. Default: unset.

# Leave this table empty to accept defaults. Set explicit booleans to opt in/out.
# shell_tool = true
# apps = false
# codex_git_commit = false
# codex_hooks = false
# unified_exec = true
# shell_snapshot = true

## Analytics Dashboard

<div class="max-w-1xl mx-auto">
  <img src="https://developers.openai.com/images/codex/enterprise/analytics-dashboard.png"
    alt="Codex analytics dashboard"
    class="block w-full mx-auto rounded-lg !border-0"
  />
</div>

### Dashboard views

The [analytics dashboard](https://chatgpt.com/codex/cloud/settings/analytics#usage) allows ChatGPT workspace administrators and analytics viewers to track Codex adoption, usage, and Code Review feedback. Usage data can lag by up to 12 hours.

Codex provides date-range controls for daily and weekly views. Key charts include:

- Active users by product surface, including CLI, IDE extension, cloud, desktop, and Code Review
- Workspace and personal usage breakdowns, including credit and token usage by product surface
- Product activity for threads and turns by client
- User ranking table, with filters for client and sort options such as credits, threads, turns, text tokens, and current streak
- Code Review activity, including PRs reviewed, issues by priority, comments, replies, reactions, and feedback sentiment
- Skill invocations and agent identity usage when your workspace has those features

### Data export

Administrators can also export Codex analytics data in CSV or JSON format. Codex provides the following export options:

- Workspace usage, including daily active users, threads, turns, and credits by surface
- Usage per user, including daily threads, turns, and credits across surfaces, with optional email addresses when allowed
- Code Review details, including daily comments, reactions, replies, and priority-level findings

## Analytics API

Use the [Analytics API](https://chatgpt.com/codex/cloud/settings/apireference) when you want to automate reporting, build internal dashboards, or join Codex metrics with your existing engineering data.

### What it measures

The enterprise Analytics API returns daily or weekly UTC buckets for a workspace. It supports workspace-level and per-user usage, per-client breakdowns, Code Review throughput, Code Review comment priority, and user engagement with Code Review comments.

### Endpoints

The base URL is `https://api.chatgpt.com/v1/analytics/codex`. All endpoints return paginated `page` objects with `has_more` and `next_page`.

Use `start_time` for the inclusive Unix timestamp at the beginning of the reporting window, `end_time` for the exclusive Unix timestamp at the end of the reporting window, `group_by` for `day` or `week` buckets, `limit` for page size, and `page` to continue from a previous response. Requests can look back up to 90 days.

#### Usage

`GET /workspaces/{workspace_id}/usage`

- Returns totals for threads, turns, credits, and per-client usage in daily or weekly buckets.
- Omit `group` to return per-user rows.
- Set `group=workspace` to return workspace-wide rows.
- Includes text input, cached input, and output token fields.

#### Code review activity

`GET /workspaces/{workspace_id}/code_reviews`

- Returns pull request reviews completed by Codex.
- Returns total comments generated by Codex.
- Breaks comments down by P0, P1, and P2 priority.

#### User engagement with code review

`GET /workspaces/{workspace_id}/code_review_responses`

- Returns replies and reactions to Codex comments.
- Breaks reactions down into positive, negative, and other reactions.
- Counts comments that received reactions, replies, or either form of engagement.

### How it works

Analytics uses time windows and supports day or week grouping. Results are time-ordered and returned in pages with cursor-based pagination. Use an API key scoped to `codex.enterprise.analytics.read`.
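The pagination contract (`page` objects with `has_more` and `next_page`, continued via the `page` parameter) can be walked with a small helper. This is a sketch: the `fetch` callable and the `data` field name for the rows are assumptions, while the base URL and cursor fields come from this section:

```python
BASE = "https://api.chatgpt.com/v1/analytics/codex"

def iter_pages(fetch, **params):
    """Yield every row from a paginated analytics endpoint.

    `fetch(params)` performs one GET and returns the decoded page
    object. Rows are yielded until `has_more` is false; `next_page`
    is fed back as the `page` query parameter for the next request.
    """
    while True:
        body = fetch(dict(params))
        yield from body.get("data", [])
        if not body.get("has_more"):
            return
        params["page"] = body["next_page"]
```

A real fetcher would issue the GET against, for example, `{BASE}/workspaces/{workspace_id}/usage` with `start_time`, `end_time`, and `group_by`, authenticated with a key scoped to `codex.enterprise.analytics.read` (the exact auth header shape is an assumption; check the API reference).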

81 87 

82### Common use cases88### Common use cases

83 89 

models.md +5 −7

      value: false,
    },
    { title: "ChatGPT Credits", value: true },
    { title: "API Access", value: true },
  ],
}}
/>

For most tasks in Codex, start with `gpt-5.5` when it appears in your model
picker. It is strongest for complex coding, computer use, knowledge work, and
research workflows. GPT-5.5 is currently available in Codex when you sign in
with ChatGPT or API-key authentication. Use `gpt-5.4-mini` when you want a
faster, lower-cost option for lighter coding tasks or subagents. The
`gpt-5.3-codex-spark` model is available in research preview for ChatGPT Pro
subscribers and is optimized for near-instant, real-time coding iteration.

## Alternative models

use-cases/ai-app-evals.md +123 −0 added

---
name: Add evals to your AI application
tagline: Use Codex to turn expected behavior into a Promptfoo eval suite.
summary: Ask Codex to inspect your AI application, identify the behavior you
  want to evaluate, and add a runnable Promptfoo eval suite.
skills:
  - token: promptfoo
    url: https://github.com/promptfoo/promptfoo/tree/main/plugins/promptfoo
    description: Plugin that includes `$promptfoo-evals` and
      `$promptfoo-provider-setup` for creating, connecting, running, and QAing
      eval suites.
bestFor:
  - AI applications that already have prompts, model calls, tools, retrieval,
    agents, or product requirements but no repeatable eval suite.
  - Teams preparing a model, prompt, retrieval, or agent change and wanting
    regression tests before the pull request merges.
  - Quality reviews where repeated manual checks should become committed eval
    cases.
starterPrompt:
  title: Add Evals Before You Change Behavior
  body: >-
    Use $promptfoo-evals to add a Promptfoo eval suite for this AI application.
    If there is not already a working Promptfoo provider or target adapter, use
    $promptfoo-provider-setup first.


    Behavior to evaluate: [support answer quality / tool-call correctness /
    retrieval grounding / business rules / agent task completion]


    Before editing:

    - Inspect the app path users hit and any existing evals or tests.

    - Propose the smallest useful eval plan: target adapter, seed cases,
      assertions, files, commands, and required env vars or local services.

    - Do not change production prompts, model settings, or app behavior until
      the baseline eval exists and has been run.


    Requirements:

    - Exercise the application path users hit when possible, not only the raw
      model prompt.

    - Keep fixtures free of secrets, customer data, and sensitive personal data.

    - Add a local eval command such as `npm run evals` or document the exact
      command to run.


    Finish with:

    - Files changed

    - Eval commands run

    - Passing and failing cases

    - Recommended next evals to add
  suggestedEffort: medium
relatedLinks:
  - label: Promptfoo configuration
    url: https://www.promptfoo.dev/docs/configuration/guide/
  - label: Evaluation best practices
    url: /api/docs/guides/evaluation-best-practices
---

## Introduction

When you are building an AI application, or making changes to an existing one, you want to make sure it behaves as expected. Evals are a way to systematically test a set of scenarios and catch regressions before they ship.

You can use Promptfoo to run evals on your AI application, and Codex to help you create and maintain the evals.

## How to use

Use Codex with the Promptfoo plugin's `$promptfoo-evals` skill to turn one AI app behavior into a repeatable eval suite. When the app does not already have a working Promptfoo target, `$promptfoo-provider-setup` helps connect the suite to the application path you want to test.

Codex can inspect the app, propose high-signal cases, add the Promptfoo config and test data, run the suite locally, and give you a command to keep using.

This use case works best when the behavior is concrete: support answer quality, retrieval grounding, classifier labels, tool calls, JSON shape, business rules, or prompt and model migration confidence.

A strong first pass should be reviewable code and test data: a `promptfooconfig.yaml` or equivalent config, a small `evals/` directory, test cases, any target adapter needed to call the app, and a local command such as `npm run evals`.

## Choose what to evaluate

Start with one user-visible promise. Avoid asking Codex to evaluate the entire AI system in one pass. A smaller suite is easier to trust, review, and keep running.

Good first targets include:

- **Correctness:** classification, extraction, summarization, routing, or transformation.
- **Grounding:** answers that should stay tied to retrieved documents or cited sources.
- **Tool use:** choosing the right tool, passing valid arguments, and handling tool errors.
- **Format or business rules:** JSON schemas, field names, business-rule limits, or UI-facing copy contracts.
- **Prompt or model migration:** making sure a new prompt, model, system message, or retrieval setting does not break important cases.

Start from product requirements, bug reports, support escalations, or sanitized examples your team is comfortable committing to the repo.

## Ask for an eval plan

Codex should inspect before it edits. Ask for a plan that names the target path, fixtures, assertions, adapter, and commands. This gives you a chance to catch the wrong target or weak test cases before files are added.

Review the plan before implementation. It should name the app path or endpoint Promptfoo will call, the first seed cases, the assertions, the files Codex will create, the local command, and any required secrets or services. If the plan tests the raw model instead of the application path users hit, ask Codex whether that is intentional.

## Implement, run, and iterate

Once the plan is correct, ask Codex to implement it. The first implementation should be boring: config, cases, fixtures, a target adapter if needed, a command, and proof that the command ran.

A small app-backed suite might look like this:

```text
evals/
  promptfooconfig.yaml
  tests/
    cases.yaml
  providers/
    provider.js   # only if the built-in provider cannot call the app directly
```
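A minimal `promptfooconfig.yaml` for a suite like this might look as follows; the prompt, provider path, and assertion values are illustrative, not part of the plugin's output:

```yaml
# Illustrative sketch; adapt the provider and assertions to your app.
providers:
  - file://providers/provider.js   # custom adapter that calls the app path users hit
prompts:
  - "{{question}}"
tests:
  - vars:
      question: "How do I reset my password?"
    assert:
      - type: contains
        value: "reset"
      - type: llm-rubric
        value: "Answer is grounded in the retrieved help-center docs"
```

Deterministic assertions such as `contains` keep the suite cheap; model-graded assertions such as `llm-rubric` cover qualities that are hard to match literally.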

Run the suite before changing behavior. The baseline tells you whether the app already fails the cases, whether the assertions need tuning, or whether the target adapter is wrong. Tune assertions when they are too brittle or vague, but keep real product failures visible.

After the first run, use the suite to compare app changes before they ship. Add new cases whenever a bug, launch requirement, or product review shows behavior you want to keep stable. Once the local command is stable, ask Codex to add it to CI or your release checklist.

---
name: Review budget vs. actuals
tagline: Turn plan, actuals, and close notes into a variance workbook.
summary: Give Codex a budget, actuals export, and close notes, then ask it to
  map actuals to plan, calculate variances, flag reconciliation issues, and
  separate supported explanations from open finance questions.
skills:
  - token: $spreadsheets
    description: Inspect spreadsheet inputs, clean and map rows, create variance
      tables, and produce reviewable workbook outputs.
bestFor:
  - Month-end reviews that compare budget plans with actual spend exports.
  - Finance teams preparing leadership commentary from GL, spend, or department
    actuals.
  - Workbooks where category mapping, tie-outs, and unsupported explanations
    need review.
starterPrompt:
  title: Review budget vs. actuals
  body: >-
    Use $spreadsheets to update the budget vs. actuals review from the attached
    files.


    Compare actuals to plan, map actuals to the right budget categories,
    summarize the major variances, and prepare a clean review view as an
    editable .xlsx workbook.


    Preserve the raw inputs, use formulas for dollar and percentage variance
    calculations, and flag categories that do not map cleanly instead of forcing
    a match. Use account type to determine favorable or unfavorable variance:
    revenue above plan is favorable, while expense above plan is unfavorable.
  suggestedEffort: medium
relatedLinks:
  - label: Agent skills
    url: /codex/skills
---

## Introduction

If you're working on a budget and want to review the variances or inspect any issues, Codex can help you create a fully functional review workbook you can work with.

Attach the budget plan, actuals export, and close notes, then ask Codex for an editable review workbook. Codex can preserve the raw inputs, map actuals to plan, calculate variances, and create a summary view you can inspect in the thread.

## Create the review workbook

1. Attach the budget plan, actuals export, and close notes, or provide exact file references along with the source.
2. Run the starter prompt and ask for an editable `.xlsx` workbook.
3. Open the workbook in Codex. Expand it into the full-screen view to inspect the raw inputs, mappings, variance formulas, and summary tab.
4. Continue in the same thread to fix category mappings, add department cuts, or draft the finance summary.

If the source files are in a connected app, mention the exact files or folder. Avoid asking Codex to search a broad Drive or workspace when the review should use specific finance sources. When the workbook appears in the thread, open it in Codex and expand it full-screen to review the raw inputs, mappings, variance formulas, and summary tab before asking for revisions.

## Check the variances

Before sharing the workbook, ask Codex to audit the categories, formulas, and variance explanations.
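The favorable/unfavorable convention in the starter prompt reduces to a small sign rule, which is worth checking when auditing the workbook's formulas. A sketch (the column and account-type names are illustrative):

```python
def variance(plan, actual, account_type):
    """Return (dollar variance, favorable?) using the sign convention
    from the starter prompt: revenue above plan is favorable, while
    expense above plan is unfavorable."""
    delta = actual - plan
    favorable = delta >= 0 if account_type == "revenue" else delta <= 0
    return delta, favorable
```

The same rule, expressed as a spreadsheet formula, is what you should expect to see in the variance column rather than a hard-coded favorable/unfavorable label.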

---
name: Forecast cash flow
tagline: Find the liquidity low point in an editable forecast workbook.
summary: Give Codex cash-flow inputs and model constraints, then ask it to
  create an editable workbook that preserves the source cadence, flags
  safety-balance breaches, and shows which assumptions drive cash pressure.
skills:
  - token: $spreadsheets
    description: Build editable forecast workbooks, wire formulas to assumptions,
      and add checks for scenarios and input gaps.
bestFor:
  - Finance and operations teams building a 13-week or monthly cash forecast.
  - Forecasts that need receipts, payroll, vendor payments, and working-capital
    assumptions in one workbook.
  - Teams reviewing runway, safety-balance breaches, and scenario drivers before
    a planning meeting.
starterPrompt:
  title: Forecast cash flow
  body: >-
    Use $spreadsheets to build an editable cash-flow forecast workbook from the
    attached source files.


    Use beginning cash, expected receipts, payroll, vendor payments, debt, tax,
    capex, working-capital items, and timing assumptions where available.
    Preserve the source cadence, whether weekly or monthly.


    Include a summary view that flags the liquidity low point, the minimum
    ending cash balance, and any breach of the safety cash threshold. Use
    formulas so I can change assumptions later, and call out missing timing
    assumptions before using placeholders.
  suggestedEffort: medium
relatedLinks:
  - label: Agent skills
    url: /codex/skills
---

## Introduction

When you are building a cash-flow forecast, you want to make sure it is accurate and reflects the reality of your business. You can use Codex to help you create a forecast workbook that you can inspect and revise in Codex. Attach the cash-flow inputs, operating assumptions, and model constraints. You can also use file references when the inputs live in Google Drive or another connected source.

## Make the forecast

1. Attach the cash-flow inputs, operating assumptions, and model constraints.
2. Run the starter prompt and ask for an editable `.xlsx` workbook.
3. Open the workbook in Codex. Expand it into the full-screen view to inspect assumptions, formulas, scenarios, and the summary tab.
4. Continue in the same thread to change collections, payroll, vendor payment, growth, or safety-balance assumptions.

When the workbook appears in the thread, open it in Codex and expand it full-screen. Review the timing assumptions, formulas, scenarios, and summary tab, then ask Codex to revise the same workbook from there.

## Review cash pressure

Before using the forecast, ask Codex to identify the low point, tie the workbook back to the source inputs, and list assumptions that need review.

## Run a scenario

After reviewing the workbook in Codex, use follow-up prompts to change one scenario driver at a time.
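The liquidity low point and safety-threshold checks the summary view should flag are a simple rolling calculation, which you can use to spot-check the workbook's formulas. A sketch (the input shapes are illustrative):

```python
def low_point(beginning_cash, net_flows, safety=0.0):
    """Roll ending cash forward one period at a time, then report the
    liquidity low point (minimum ending cash) and the indices of any
    periods that breach the safety cash threshold."""
    cash, trail = beginning_cash, []
    for flow in net_flows:
        cash += flow          # net flow = receipts minus disbursements
        trail.append(cash)
    breaches = [i for i, c in enumerate(trail) if c < safety]
    return min(trail), breaches
```

In the workbook, the same logic should appear as an ending-cash row driven by formulas, so changing one timing assumption moves the low point and breach flags automatically.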

Details

55relatedLinks:55relatedLinks:

56 - label: Modernizing your Codebase with Codex56 - label: Modernizing your Codebase with Codex

57 url: /cookbook/examples/codex/code_modernization57 url: /cookbook/examples/codex/code_modernization

58 - label: Follow a goal

59 url: /codex/use-cases/follow-goals

58 - label: Worktrees in the Codex app60 - label: Worktrees in the Codex app

59 url: /codex/app/worktrees61 url: /codex/app/worktrees

60---62---

61 63 

62## Introduction64## Introduction

63 65 

64When you are moving from one stack to another, you can leverage codex to map and execute a controlled migration: routing, data models, configuration, auth, background jobs, build tooling, deployment, tests, or even the language and framework conventions themselves.66When you are moving from one stack to another, you can leverage Codex to map and execute a controlled migration: routing, data models, configuration, auth, background jobs, build tooling, deployment, tests, or even the language and framework conventions themselves.

65 67 

66Codex is useful here because it can inventory the legacy system, map old concepts to new ones, and land the change in checkpoints instead of one giant rewrite. That matters when you are moving off a legacy framework, porting to a new runtime, or incrementally replacing one stack with another while the product still has to keep working.68Codex is useful here because it can inventory the legacy system, map old concepts to new ones, and land the change in checkpoints instead of one giant rewrite. That matters when you are moving off a legacy framework, porting to a new runtime, or incrementally replacing one stack with another while the product still has to keep working.

67 69 


84 

85In our [code modernization cookbook](https://developers.openai.com/cookbook/examples/codex/code_modernization), we introduce ExecPlans: documents that let Codex keep an overview of the cleanup, spell out the intended end state, and log validation after each pass.

86When you ask Codex to run a complex migration, ask it to create an ExecPlan for each part of the system to make sure every decision and tech stack choice is recorded and can be reviewed later.

87 

88## Combine with a goal

89 

90For long-running migration slices, use a [goal](https://developers.openai.com/codex/use-cases/follow-goals) to guide Codex through the work. Set the goal with a clear end state, parity checks, rollback expectations, and a stopping condition.


26 

27- https://developers.openai.com/codex/use-cases/reusable-codex-skills

28 

29## Keep documentation current

30 

31Ask Codex to compare source changes with existing docs, update the smallest useful docs surface, and verify the changes.

32 

33- https://developers.openai.com/codex/use-cases/update-documentation

34 

35## Maintain system health

36 

37Let Codex pick up feature requests and bug fixes automatically by using it from Slack and connecting it to your alerting, issue tracking, and daily bug sweeps.


5Codex works great with existing design systems, taking into account constraints and visual inputs to produce a responsive UI.

6These use cases are helpful when you are building web apps and need to iterate on frontend designs.

7 

8## Get from idea to prototype

9 

10Use Codex to turn a rough idea into a visual direction and implement a first prototype.

11 

12- https://developers.openai.com/codex/use-cases/idea-to-proof-of-concept

13 

14## Build from Figma

15 

16Use Codex to pull design context from Figma and turn it into code that follows the repo's components, styling, and design system.

use-cases/dcf-model.md +70 −0 added


1---

2name: Model a DCF valuation

3tagline: Turn financial inputs into an editable valuation workbook.

4summary: Attach historical financials, valuation assumptions, and modeling

5 notes, then ask Codex for an editable DCF workbook you can inspect and revise

6 in Codex.

7skills:

8 - token: $spreadsheets

9 description: Create editable spreadsheet workbooks from attached inputs,

10 formulas, and assumptions.

11bestFor:

12 - Analysts turning historical financials and assumptions into a DCF workbook.

13 - Finance teams that want to inspect and iterate on the workbook in Codex.

14 - Teams preparing a valuation model from source files.

15starterPrompt:

16 title: Model a DCF valuation

17 body: >-

18 Use $spreadsheets to build a DCF workbook for the company in the attached

19 source files.

20 

21 

22 Include explicit operating drivers for revenue growth, margins, capex, and

23 working capital. Calculate unlevered free cash flow, WACC, terminal value,

24 and enterprise value. If capital structure and diluted share count are

25 provided, bridge to implied equity value and implied equity value per share.

26 

27 

28 Use any assumptions included in the source files. If an assumption is

29 missing, add a clearly labeled placeholder in the assumptions tab instead of

30 hiding it in a formula. If full balance sheet or cash-flow statement inputs

31 are missing, create the operating forecast needed for unlevered free cash

32 flow and flag the missing statement inputs.

33 

34 

35 Generate the result as an editable .xlsx workbook.

36 suggestedEffort: medium

37relatedLinks:

38 - label: Agent skills

39 url: /codex/skills

40 - label: File inputs

41 url: /api/docs/guides/file-inputs

42---

43 

44## Introduction

45 

46Codex can help you create a fully functional DCF workbook that you can inspect and revise.

47 

48It can use multiple files as context, including the historical financials, valuation assumptions, and any modeling notes.

49You can provide these files directly, or use file references when the inputs live in Google Drive or another connected source. In that case, provide the exact file references; this is more effective than asking Codex to search through all of your files.

50 

51## Create the workbook

52 

53 

54 

551. Attach the historical financials, valuation assumptions, and any modeling notes, or provide exact file references to the connected source.

562. Run the starter prompt and ask for an editable `.xlsx` workbook.

573. Open the generated workbook in Codex. Expand it into the full-screen view to inspect the model tabs, formulas, assumptions, and valuation summary.

584. Continue in the same thread to check formula links, change assumptions, add scenarios, or tighten the model.

59 

60 

61 

62When the workbook appears in the thread, open it in Codex and expand it full-screen. Review the source inputs, forecast drivers, valuation outputs, and sensitivity tables, then ask Codex to revise the same workbook from there.

63 

64## Check the valuation

65 

66Before using the workbook, ask Codex to review the model like a finance teammate would: source tie-outs, formulas, hardcoded assumptions, and valuation outputs.
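A quick way to sanity-check the headline number is to recompute the core DCF chain outside the workbook. A minimal sketch with hypothetical assumption values standing in for the cells the real model should drive from the source files:

```python
# Hypothetical assumptions -- the real workbook drives these from source files.
wacc = 0.09                  # discount rate
terminal_growth = 0.025      # long-run growth for the Gordon terminal value
ufcf = [120.0, 132.0, 143.0, 152.0, 160.0]   # unlevered FCF forecast ($M)

# Present value of the explicit forecast period.
pv_fcf = sum(cf / (1 + wacc) ** t for t, cf in enumerate(ufcf, start=1))

# Terminal value at the end of the forecast, discounted back to today.
tv = ufcf[-1] * (1 + terminal_growth) / (wacc - terminal_growth)
pv_tv = tv / (1 + wacc) ** len(ufcf)

enterprise_value = pv_fcf + pv_tv
```

If the workbook's enterprise value diverges materially from a hand check like this, ask Codex to trace which tab introduces the difference.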

67 

68## Revise one assumption

69 

70After reviewing the workbook in Codex, ask for targeted revisions in the same thread. Change one driver at a time so the impact is easy to inspect.


1---

2name: Draft PRDs from internal context

3tagline: Create product requirements documents from Linear, Slack, source

4 documents, and meeting notes.

5summary: Use Codex with the $documents skill and connected apps such as Linear,

6 Slack, Notion or Google Drive to create a reviewable PRD with the expected

7 sections, a timeline, decisions, open questions, and a source appendix.

8skills:

9 - token: $documents

10 description: Create, edit, and verify a DOCX when the PRD should become a

11 polished file instead of chat text.

12 - token: slack

13 url: https://github.com/openai/plugins/tree/main/plugins/slack

14 description: Read product discussions, launch threads, decision notes, and

15 follow-up questions from approved channels or thread links.

16 - token: linear

17 url: https://github.com/openai/plugins/tree/main/plugins/linear

18 description: Read projects, issues, priorities, acceptance criteria, and open

19 work that should shape the PRD.

20 - token: google-drive

21 url: https://github.com/openai/plugins/tree/main/plugins/google-drive

22 description: Read planning docs, research notes, specs, exported meeting notes,

23 and source folders.

24 - token: notion

25 url: https://github.com/openai/plugins/tree/main/plugins/notion

26 description: Read roadmap pages, project notes, meeting notes, and team wikis

27 that should shape the PRD.

28bestFor:

29 - Product teams turning planning context into a PRD, proposal, launch brief,

30 or decision memo.

31 - PMs who need to draft a PRD quickly after aligning with the team in internal

32 discussions.

33starterPrompt:

34 title: Draft the PRD

35 body: >-

36 Use $documents to create a PRD for [feature or product area] from @linear

37 [project or milestone], @slack [channel or thread], and @google-drive or

38 @notion [planning docs, research notes, meeting notes, or source folder].

39 

40 

41 Include the problem, users, goals/non-goals, requirements, UX, technical

42 considerations, metrics, launch plan, risks, open questions, decisions,

43 timeline, and source appendix.

44 

45 

46 Cite the sources behind requirement-level claims. If sources disagree, call

47 out the conflict instead of choosing silently. Draft only. Do not post,

48 update Linear, or share the document until I approve it.

49 suggestedEffort: medium

50relatedLinks:

51 - label: Codex plugins

52 url: /codex/plugins

53 - label: Agent skills

54 url: /codex/skills

55 - label: Codex app

56 url: /codex/app

57---

58 

59## Introduction

60 

61Before working on a new product or feature, it's common to draft a product requirements document (PRD) to align on the scope and requirements. More often than not, the context needed to write that PRD is already available in the team's internal systems: tickets on Linear, discussions on Slack, drafts in Notion or Google Drive, etc. Codex can gather this context and draft a PRD that you can review and iterate on, while keeping the source trail visible.

62 

63## Choose the sources

64 

65Start with the sources you want Codex to use: the Linear project, the Slack planning channel or thread, and any Drive docs, Notion pages, meeting notes, or local files that should be cited in the PRD.

66You should also clearly outline the PRD sections you expect, such as the problem, users, requirements, UX, tech, launch plan, timeline, or decisions.

67 

68 

69 

701. Start with `$documents` when the output should be a real DOCX.

712. Name the sources directly: the Linear project or milestone, the Slack channel or thread, and the docs or notes Codex should cite.

723. Give Codex the PRD section contract.

734. Review the source appendix first, then the requirements and open questions.

745. Use the same thread to resolve gaps, tighten scope, and prepare the handoff.

75 

76 

77 

78## Refine in the same thread

79 

80Use the starter prompt on this page for the first draft. If something is missing, point Codex at the missing source instead of starting over.

81 

82## Check the source trail

83 

84Before sharing the PRD, ask Codex to list the claims with weak or missing support, the unresolved questions, and the decisions it treated as confirmed. If the source appendix does not make those easy to audit, keep refining the same thread before exporting or posting anything.

85 

86### Suggested prompt

87 

88**Check the Source Trail**
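An illustrative version of that prompt (adapt the wording to your PRD):

```
List every requirement-level claim with weak or missing support, the open
questions that are still unresolved, and any decision you treated as
confirmed without a source. Do not export or post anything until I review
the list.
```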

use-cases/follow-goals.md +80 −0 added


1---

2name: Follow a goal

3tagline: Give Codex a durable objective for long-running work.

4summary: Use `/goal` when a task needs Codex to keep working across turns toward

5 a verifiable stopping condition.

6bestFor:

7 - Long-running coding work with a clear success condition and validation loop.

8 - Code migrations, large refactors, deployment retry loops, experiments,

9 games, and side projects where Codex can keep making scoped progress.

10 - Teams that need to run long experiments with clear success criteria.

11starterPrompt:

12 title: Set a Long-Running Goal

13 body: /goal Complete [objective] without stopping until [verifiable end state].

14relatedLinks:

15 - label: "`/goal` in CLI slash commands"

16 url: /codex/cli/slash-commands#set-an-experimental-goal-with-goal

17 - label: Codex workflows

18 url: /codex/workflows

19 - label: Run code migrations

20 url: /codex/use-cases/code-migrations

21 - label: Iterate on difficult problems

22 url: /codex/use-cases/iterate-on-difficult-problems

23---

24 

25## Introduction

26 

27Use `/goal` when you want Codex to keep working toward one durable objective instead of stopping after one normal turn. It is useful for work that has a clear target, a validation loop, and enough room for Codex to make progress without asking you to steer every step. When you use `/goal`, Codex can work independently for multiple hours without needing your input.

28 

29`/goal` is an experimental Codex CLI feature. Enable it from `/experimental`, or add `goals = true` under `[features]` in `config.toml`. Then set a goal with `/goal <objective>`, check the current goal with `/goal`, and use `/goal pause`, `/goal resume`, or `/goal clear` when you need to control the run.
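The configuration route described above amounts to this fragment in `config.toml`:

```toml
[features]
goals = true
```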

30 

31## Choose the right work

32 

33A good goal is bigger than one prompt but smaller than an open-ended backlog. It should define what Codex should achieve, what it should not change, how it should validate progress, and when it should stop.

34 

35This works well for:

36 

37- code migration where the target stack, parity checks, and constraints are clear

38- large refactors where Codex can run tests after each checkpoint

39- experiments, games, or prototypes where Codex can keep improving a working artifact

40 

41Avoid using a goal for a loose list of unrelated work.

42 

43## Set up the loop

44 

45 

46 

471. Name one objective and one stopping condition.

482. Point Codex at the files, docs, issue, logs, or plan it must read first.

493. Define the commands or artifacts that prove progress.

504. Tell Codex to work in checkpoints and keep a short progress log.

515. Use `/goal` to inspect status while it runs.

526. Pause, resume, or clear the goal when the run is done, blocked, or changing direction.

53 

54 

55 

56The important part is the contract. Codex should know what "done" means before it starts. If the goal is a migration, "done" might mean the new path passes contract tests and the legacy path still has a rollback. If the goal is a game or prototype, "done" might mean the app builds, launches, and matches the input reference or expected behavior.

57 

58Ask Codex to help: start by having a conversation about what you want to

59 build, then ask it to directly set a goal and start working.

60 

61## Let Codex work independently

62 

63During a goal, ask for compact progress reports that make the run easy to trust. A useful status update names the current checkpoint, what was verified, what remains, and whether Codex is blocked.

64If the status becomes vague, tighten the goal rather than adding more ad hoc instructions. Tell Codex exactly which checkpoint matters next, which command proves it, and what should cause it to pause.

65 

66When Codex follows a goal, it can work independently for many hours without you having to check in. It will stop running when it is fairly confident it has reached the stopping condition, so you should think of `/goal` as a background task you don't need to monitor.

67 

68## Example goals

69 

70### Migrations

71 

72Whether you're migrating games to a new stack, mobile apps to a new platform, or a codebase to a new framework, you can use `/goal` to have Codex run the migration:
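A hypothetical migration goal, with the end state and rollback expectation spelled out:

```
/goal Migrate the reporting endpoints to the new framework without changing
response shapes. Work endpoint by endpoint, run the contract tests after
each one, and stop when every reporting route passes and the legacy path
can still be re-enabled.
```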

73 

74### Prototype creation

75 

76Whether you're creating a new app from scratch, a new game, or a new feature, you can use `/goal` to have Codex complete a polished first version. You can use a PLAN.md file to guide the creation of the first version, describing precisely what you want to build.
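An illustrative PLAN.md skeleton for that first version (sections are suggestions, not a required format):

```md
# PLAN

## What we are building
One paragraph on the app, the target user, and the main workflow.

## First version scope
The single end-to-end flow that must work before anything else.

## Out of scope
Things Codex should not attempt in this pass.

## How to verify
The build and run commands, and what a successful check looks like.
```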

77 

78### Prompt optimization

79 

80When you have an eval suite, you can use `/goal` to optimize prompts against the eval results. Codex can inspect failures, update the prompt, rerun the evals, and keep iterating until the score improves or it reaches your stopping condition.
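The loop described above can be sketched as follows. `run_evals` and `revise` are hypothetical stand-ins for your eval suite and prompt-editing step, not a Codex API:

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    score: float        # fraction of eval cases passing
    failures: list      # names of failing cases

def optimize(prompt, run_evals, revise, target=0.95, max_rounds=20):
    """Iterate until the score meets the target or rounds run out."""
    best = run_evals(prompt)
    for _ in range(max_rounds):
        if best.score >= target:          # verifiable stopping condition
            break
        candidate = revise(prompt, best.failures)
        result = run_evals(candidate)
        if result.score > best.score:     # keep only measured improvements
            prompt, best = candidate, result
    return prompt, best

# Toy demo: each revision "fixes" one failing case by naming it in the prompt.
cases = {"x": False, "y": False, "z": True}

def run_evals(prompt):
    passing = {k: (k in prompt or v) for k, v in cases.items()}
    return EvalResult(sum(passing.values()) / len(cases),
                      [k for k, ok in passing.items() if not ok])

def revise(prompt, failures):
    return prompt + " " + failures[0]

final_prompt, final = optimize("seed", run_evals, revise, target=1.0)
```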


1---

2name: Get from idea to proof of concept

3tagline: Explore the concept visually with ImageGen and build a first version of

4 your idea.

5summary: Use Codex with ImageGen to turn a rough idea into a visual direction,

6 implement the smallest useful prototype, and verify it in a browser.

7skills:

8 - token: $imagegen

9 description: Generate visual concepts, UI mockups, asset directions, and

10 variants with `gpt-image-2` before Codex implements the selected

11 direction.

12 - token: $playwright

13 url: https://github.com/openai/skills/tree/main/skills/.curated/playwright-interactive

14 description: Open the running app in a real browser, inspect the changed route,

15 and verify each small UI adjustment before the next iteration.

16 - token: build-web-apps

17 url: https://github.com/openai/plugins/tree/main/plugins/build-web-apps

18 description: Use the concept-first workflow for new web apps, dashboards, sites,

19 and frontend prototypes, then verify the implementation in the browser.

20 - token: game-studio

21 url: https://github.com/openai/plugins/tree/main/plugins/game-studio

22 description: Use Game Studio when the proof of concept is a browser game and

23 needs a playable loop, asset workflow, HUD, engine choice, and playtest

24 pass.

25bestFor:

26 - Early product ideas where a working prototype will answer more than a

27 written plan.

28 - Web apps, dashboards, and tools that need visual exploration before

29 implementation.

30 - Teams that want to validate a product idea with a working prototype before

31 investing further.

32starterPrompt:

33 title: Build the Proof of Concept

34 body: >-

35 Use ImageGen to generate a high quality UI mockup for the following idea,

36 then use the [Build Web Apps plugin/Game studio plugin] to implement it:

37 

38 

39 [describe the idea, target user, and the main workflow]

40 suggestedEffort: high

41relatedLinks:

42 - label: Image generation guide

43 url: /api/docs/guides/image-generation

44 - label: Codex plugins

45 url: /codex/plugins

46---

47 

48## Start with a visual direction

49 

50GPT Image 2 is great at generating high quality UI mockups. Instead of starting from scratch when exploring new ideas, you can leverage image generation to get a visual direction.

51 

52You can do this in two ways:

53 

54- Iterate on the visual direction using the ImageGen skill; once you are satisfied with the proposed UI, ask Codex to build a prototype matching the visuals. In that case, make sure to attach the final image you want to implement in a new turn rather than continuing the conversation directly – Codex does better when it can reference a user attachment.

55- Use a plugin and simply describe your idea: the plugin will generate the visual direction for you and handle next steps.

56 

57## Leverage a plugin

58 

59If you do not need to iterate on the visual direction before starting the implementation, you can use a plugin and describe your idea.

60 

61Use the [Build Web Apps plugin](https://github.com/openai/plugins/tree/main/plugins/build-web-apps)

62for web apps, dashboards, creative websites, and frontend-heavy tools. Its

63workflow pushes Codex to generate a design first, match it in code, and use the

64browser to compare the result back to the concept.

65 

66Use the [Game Studio plugin](https://github.com/openai/plugins/tree/main/plugins/game-studio)

67when the proof of concept is a browser game. That path should define the player

68verbs, first playable loop, engine, asset workflow, HUD, controls, and browser

69test before expanding the game.

70 

71## Iteration workflow

72 

73A good proof of concept is scoped to an MVP that can be implemented quickly and validated with the team.

74If you want to make sure the MVP is working as expected, you can use Playwright interactive to let Codex verify its work.

75 

76Once you have a first version working, you can iterate on it by asking for scoped changes in the same conversation:
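A scoped follow-up might look like (illustrative):

```
Keep the layout from the mockup, but make the sidebar collapsible and add an
empty state to the projects list. Verify both changes in the browser before
reporting back.
```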


1---

2name: Build React Native apps with Expo

3tagline: Go from a mobile-app idea to a working Expo app with the dedicated plugin.

4summary: Use Codex with the Expo plugin to scaffold React Native apps, stay

5 inside Expo Router and Expo-native package conventions, test quickly with Expo

6 Go, and move to dev clients or EAS builds only when the app needs them.

7skills:

8 - token: expo

9 url: https://docs.expo.dev/skills/

10 description: Use Expo-authored skills for Expo Router UI, native-feeling

11 components, data fetching, dev clients, deployment, upgrades, modules, and

12 Codex Run action wiring.

13bestFor:

14 - Developers who want to prototype or ship a React Native app with Expo before

15 reaching for native IDE workflows.

16 - Expo Router projects where Codex should follow Expo conventions for routing,

17 UI, package installs, builds, and deployment.

18 - Developers who need to migrate a web app to a mobile app.

19starterPrompt:

20 title: Build the Expo App

21 body: >-

22 Use the Expo plugin to build a React Native app with Expo for this idea:

23 

24 

25 [describe the app idea, target users, and the main workflow]

26 

27 

28 Requirements:

29 

30 - Start with Expo Router and Expo-native project conventions.

31 

32 - Try `npx expo start` and Expo Go first before creating a custom build.

33 

34 - Use `npx expo install` for Expo packages so dependencies stay compatible.

35 

36 - Use native-feeling UI patterns for navigation, forms, lists, empty states,

37 and loading states.

38 

39 

40 Deliver:

41 

42 - the working app slice

43 

44 - the run command

45 

46 - the verification path you used, including Expo Go, device, simulator, dev

47 client, or EAS

48 suggestedEffort: medium

49relatedLinks:

50 - label: Expo plugin

51 url: https://docs.expo.dev/skills/

52 - label: Expo MCP Server setup

53 url: https://docs.expo.dev/eas/ai/mcp/

54techStack:

55 - need: Mobile framework

56 goodDefault: "[Expo](https://expo.dev/) and [React Native](https://reactnative.dev/)"

57 why: Expo gives Codex a managed React Native path with fast iteration,

58 compatible packages, and deployment tooling.

59 - need: Routing

60 goodDefault: "[Expo Router](https://docs.expo.dev/router/introduction/)"

61 why: Expo Router keeps navigation file-based and predictable, which helps Codex

62 add screens and flows without inventing a custom routing layer.

63---

64 

65## Start with Expo Go

66 

67Expo is a strong default when you want Codex to move from a mobile-app idea to a

68tested React Native app. The useful loop is `expo start` first, Expo Go

69on a device next, and then a dev client or EAS build only when the app needs

70custom native code, store distribution, or a capability that Expo Go can't run.

71 

72That keeps Codex focused on the app workflow instead of spending the first pass

73on native IDE setup, simulator setup, provisioning, or build configuration.

74 

75## Use the Expo plugin

76 

77Expo published an [Expo plugin](https://docs.expo.dev/skills/) that gives Codex Expo-native guidance for Expo Router, native UI, forms,

78navigation, animations, data fetching, NativeWind setup, Expo modules, dev

79clients, deployment, upgrades, and Codex Run action wiring.

80 

81Use it when Codex is building new Expo screens, adding packages, wiring API

82calls, preparing a dev client, or getting an app ready for TestFlight, App

83Store, Play Store, or EAS Hosting.

84 

85Optionally, add the [Expo MCP Server](https://docs.expo.dev/eas/ai/mcp/) when the task needs current

86Expo documentation lookup, compatible package installation, EAS build and

87workflow operations, screenshots, simulator interaction, React Native DevTools,

88or TestFlight data.

89 

90## Iteration process

91 

92 

93 

941. Ask Codex to inspect the repo and confirm whether it is a new Expo app or an

95 existing Expo project.

962. Start with Expo Router and Expo Go, and use `npx expo install` when adding

97 Expo packages.

983. Ask Codex to build one complete workflow with native-feeling navigation,

99 loading states, empty states, and error states.

1004. Verify on the fastest available path, such as Expo Go on a device or a

101 simulator, then move to a dev client or EAS only when needed.

102 

103 

104 

105## Suggested follow-up prompt
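An illustrative follow-up that stays inside the conventions above:

```
Add a settings screen with Expo Router, install any new packages with
`npx expo install`, and verify the flow in Expo Go on a device before
suggesting a dev client or EAS build.
```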


1---

2name: Prioritize Slack action items

3tagline: Turn Slack threads and DMs into a ranked queue of next steps.

4summary: Use Codex with Slack and the tools where work happens to find direct

5 asks, implicit follow-ups, resolved items, and the highest-impact next actions

6 before drafting replies or handoffs.

7skills:

8 - token: slack

9 url: https://github.com/openai/plugins/tree/main/plugins/slack

10 description: Search DMs, channels, thread replies, mentions, and shared context

11 before deciding what still needs attention.

12 - token: gmail

13 url: https://github.com/openai/plugins/tree/main/plugins/gmail

14 description: Cross-check email when a Slack thread refers to an outreach, intro,

15 or sent follow-up.

16 - token: google-drive

17 url: https://github.com/openai/plugins/tree/main/plugins/google-drive

18 description: Read linked docs, decks, sheets, or source material when the Slack

19 thread depends on an artifact.

20 - token: google-calendar

21 url: https://github.com/openai/plugins/tree/main/plugins/google-calendar

22 description: Check event timing when a thread depends on a meeting, launch,

23 webinar, or deadline.

24bestFor:

25 - People who get work through Slack and need Codex to separate live asks from

26 already-handled chatter.

27 - Launch, community, support, product, and operations workstreams where

28 context is split across DMs, channels, and threads.

29 - Teams that want a ranked action queue before drafting replies, handoffs,

30 docs changes, or follow-up tasks.

31starterPrompt:

32 title: Find What Needs Attention in Slack

33 body: >-

34 Can you check @slack for messages to me about [workstream] from [time

35 window] and return a ranked action queue?

36 

37 

38 Look across DMs, group DMs, channel mentions, and threads.

39 

40 

41 For each item, include:

42 

43 - source link or thread

44 

45 - what is being asked

46 

47 - whether it needs my reply, a person or lead, a docs or code change, or

48 just a decision

49 

50 - why it matters

51 

52 - the recommended next step

53 

54 

55 Before calling anything unresolved, read the latest thread replies and skip

56 items that were already handled.

57 

58 

59 Do not post messages directly but suggest drafts for my review.

60 suggestedEffort: low

61relatedLinks:

62 - label: Codex plugins

63 url: /codex/plugins

64 - label: Use Codex in Slack

65 url: /codex/integrations/slack

66 - label: Codex automations

67 url: /codex/app/automations

68---

69 

70## Find the work hidden in Slack

71 

72Slack is often where a request starts, but not where the full context lives. A teammate might ask for a reply in a DM, clarify the real action in a thread, link a doc in a channel, and resolve the issue later without mentioning you again.

73 

74Use this workflow when you want Codex to read the Slack context, check whether the ask is still live, and return the few items that actually need your attention. The goal is to get a ranked action queue: what needs a reply, a decision, a person to contact, a doc update, or a handoff.

75 

76## Run the triage pass

77 

78 

79 

801. Give Codex a time window, workstream, person, channel, or topic.

812. Ask it to search DMs, group DMs, channel mentions, and relevant thread replies.

823. Have Codex read the latest thread tail before calling an item unresolved.

834. Ask for a ranked queue sorted by urgency and impact.

845. Ask Codex to draft the reply, handoff, or follow-up task.

85 

86 

87 

88After trying this and tweaking the flow to match your needs, you can turn it into a [thread automation](https://developers.openai.com/codex/app/automations#thread-automations) by asking Codex to do the same thing on a schedule.

89 

90## Ask for the right output

91 

92A useful triage result should explain why each item is still live. It should also skip old asks that someone answered later in the thread.

93 

94You should expect to see something like this:

95 

96 

97 

98<p>

99 <strong>Top action item:</strong> Priya is asking for concrete customer

100 examples, not just more ideas.

101 </p>

102 <p>

103 <strong>Why it matters:</strong> the launch update needs real people the

104 team can contact this week.

105 </p>

106 <p>

107 <strong>Evidence:</strong> the original channel message asked for use cases,

108 but the thread later says "please DM me if you have leads."

109 </p>

110 <p>

111 <strong>Next step:</strong> reply with two named leads, or say you can be

112 the example if that is more useful.

113 </p>

114 

115 

116 

117Good output makes the distinction explicit: an idea is different from a lead, a live ask is different from an FYI, and a request you already answered shouldn't stay in the queue.

118 

119If you get too much noise or too few actionable items, tweak the prompt and, if needed, mention the specific Slack channels you want Codex to pay attention to.

120 

121## Draft the follow-up

122 

123Once the queue is right, keep the action in the same thread. Ask Codex to draft a reply or handoff from the evidence it already gathered:
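For the example above, an illustrative ask might be:

```
Draft a reply to Priya's thread naming the two leads from the evidence you
gathered, link the original ask, and keep it under five sentences. Show me
the draft before anything is posted.
```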


1---

2name: Keep documentation up-to-date

3tagline: Use code and other sources to automate docs updates.

4summary: Use Codex to compare source code changes, public docs, release notes,

5 and PR context, then draft focused documentation updates with verification

6 steps before publishing.

7skills:

8 - token: github

9 url: https://github.com/openai/plugins/tree/main/plugins/github

10 description: Read issues, pull requests, comments, review threads, and failed

11 checks when GitHub is part of your bug intake.

12bestFor:

13 - Developer docs, READMEs, runbooks, examples, and migration notes that need

14 to track behavior that changes frequently.

15 - Teams that maintain documentation for a technical product.

16starterPrompt:

17 title: Update Docs From Source Changes

18 body: >-

19 Update the [product/feature] documentation based on the following sources:

20 

21 - the changed source files in [this repo/source linked repo]

22 

23 - the existing docs pages that mention a new behavior

24 

25 - any linked issue, PR, release note, or public reference I provide below

26 

27 

28 Then:

29 

30 - identify what is user-facing

31 

32 - update only the docs that need to change

33 

34 - keep unpublished roadmap, private customer details, and internal-only

35 context out of public docs

36 

37 - preserve the existing docs structure, terminology, and cross-links

38 

39 - run the docs checks that fit the change

40 

41 

42 Before finalizing, summarize what changed, what you verified, and any claims

43 you could not prove from trusted sources.

44 

45 

46 [link release notes or other references here]

47relatedLinks:

48 - label: Workflows

49 url: /codex/workflows

50---

51 

## Introduction

Documentation is easiest to keep current when it is updated alongside source changes, not weeks later. Codex can inspect changed code, tests, release notes, linked issues, and pull request context, then draft a scoped docs update that matches the existing structure.

Use this workflow for developer docs, README updates, changelog drafts, migration notes, runbooks, or anything else that needs to track behavior that changes frequently.

## How to use

1. Start from the change you need to document.

   Share the branch, pull request, commit, issue, or files. If the docs are public, say explicitly that unpublished roadmap, private customer details, and internal-only context should stay out.

2. Ask Codex to map the affected docs.

   Have it search existing docs for feature names, config keys, commands, examples, and related terms before drafting.

3. Update the smallest useful docs surface.

   Codex should preserve the current page structure, terminology, cross-links, and frontmatter. It should avoid broad rewrites when a precise note, example, or section update is enough.

4. Verify the changes.

   Ask Codex to run formatting and docs checks that fit the repo, then summarize the evidence behind each user-facing claim.
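
The verification turn can be a short follow-up prompt. The checks named here are generic placeholders; substitute whatever your repo actually defines:

```md
Run the repo's docs checks (formatting, linting, link checking, build) and
report the results. For each user-facing claim you added or changed, list the
source file, test, or release note that supports it, and flag anything you
could not verify.
```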

77 

## What to give Codex

| Source | Why it helps |
| ------------------------------------ | -------------------------------------------------------------------------- |
| Changed code and tests | Lets Codex analyze actual behavior to draft focused documentation updates. |
| Public release notes or product docs | Helps Codex match public terminology, availability, and feature status. |
| Pull request or issue context | Explains why the change happened and which user-facing behavior matters. |
| Local docs checks | Gives Codex a concrete definition of done before the docs are published. |

Adding more context, such as public release notes, helps Codex avoid including private details or updates that are not yet public.

88 

## Make the workflow repeatable

For a repo-wide convention, add documentation expectations to [AGENTS.md](https://developers.openai.com/codex/guides/agents-md). For example:

```md
## Documentation

- When user-facing behavior changes, check whether docs, examples, or changelogs need updates.
- Public docs must only include public information or behavior visible in this repo.
- Preserve existing terminology and frontmatter.
- Run the docs formatting and build checks before final handoff.
```

101 

If the process has more steps, turn it into a [skill](https://developers.openai.com/codex/skills) so future Codex threads can follow the same source-checking, drafting, and verification loop. See [Save workflows as skills](https://developers.openai.com/codex/use-cases/reusable-codex-skills) for more details on this pattern.

103 

You can also turn this workflow into a [thread automation](https://developers.openai.com/codex/app/automations#thread-automations) that runs on a schedule, for example weekly, asking Codex to fetch the recently merged PRs from GitHub and keep the docs up to date automatically.
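
A weekly automation prompt might look like the following; the repo name and cadence are placeholders to adapt:

```md
Every Monday, list the pull requests merged into [repo] over the past week,
identify which ones change user-facing behavior, and draft scoped updates to
the affected docs pages. Keep unpublished roadmap and internal-only context
out of public docs, run the docs checks, and summarize what changed and what
you verified.
```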


---
name: Turn user stories into UI mocks
tagline: Convert product feedback, issue threads, and design context into
  mockups your team can react to and implement.
summary: Use Codex to gather product feedback from Slack, Linear, Google Drive,
  normalize it into user stories and constraints, then generate UI mockups with
  ImageGen. When the direction is chosen, turn the mock into a working
  prototype.
skills:
  - token: slack
    url: https://github.com/openai/plugins/tree/main/plugins/slack
    description: Search approved feedback channels and threads for user stories,
      pain points, quotes, and open questions.
  - token: linear
    url: https://github.com/openai/plugins/tree/main/plugins/linear
    description: Pull feature requests, bug reports, labels, priorities, and project
      context into the mock brief.
  - token: google-drive
    url: https://github.com/openai/plugins/tree/main/plugins/google-drive
    description: Read research notes, call summaries, docs, sheets, and slides that
      contain product feedback or design requirements.
  - token: figma
    url: https://github.com/openai/plugins/tree/main/plugins/figma
    description: Fetch design context, screenshots, and design-system references so
      mocks do not drift away from the product's visual language.
  - token: $imagegen
    description: Generate UI mockups, variations, and visual truth from the
      synthesized stories and design constraints.
  - token: build-web-apps
    url: https://github.com/openai/plugins/tree/main/plugins/build-web-apps
    description: Turn the selected mock into a working web prototype and verify the
      implementation against the mock.
bestFor:
  - Product teams turning scattered feedback into a visual direction for a
    feature.
  - Design and engineering teams that want mockups grounded in source material
    before building.
  - Teams that want to iterate fast based on user feedback.
starterPrompt:
  title: Create Mocks from User Stories
  body: >-
    Turn this [user story/set of user feedback] into a UI mock for a feature
    that would solve the problem, using these sources as context:

    - @slack [channels or thread links]

    - @linear [issue links, project, team, or view]

    - @google-drive [research notes, survey export, doc, sheet, or slide deck]


    Do that while respecting the current design system and existing UI [provide
    Figma file or screenshot as reference].
  suggestedEffort: medium
relatedLinks:
  - label: Codex plugins
    url: /codex/plugins
---

59 

## Introduction

Product teams often collect feedback from many sources: Slack threads, Linear issues, Google Drive docs or sheets, and customer-call notes. Sometimes they have clear user stories illustrating the problem they want to solve; sometimes the context still lives scattered across those sources.

Codex can gather this context and turn it into a UI mock for a feature that would solve the problem; once the direction is validated, the mock can be implemented in the product.

65 

## Generate visual truth

If you have a clear user story, start with that. If not, have a discussion with Codex first, gathering context from the different sources and synthesizing it into a user story.

Then ask Codex to use ImageGen to create a few mock directions. The mocks should preserve the product's information architecture and design-system constraints.

If helpful, provide screenshots of the current UI or a Figma file as reference.

Iterate until you are satisfied with the mock. The more scoped the changes are, the more likely Codex is to generate a mock that can be implemented directly.
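
A mock-generation turn can be as simple as this; the bracketed references are placeholders:

```md
Using the user story above and the attached screenshots of the current UI,
generate three mock directions for this feature with ImageGen. Keep the
existing navigation, spacing, and component styles from [Figma file or
screenshots]; vary only the new feature's layout and entry point.
```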

75 

## Move from mock to prototype

Pick the final mock image you want Codex to implement, and re-attach it in a new turn rather than continuing the iteration thread directly. You can then ask Codex to implement the mock, optionally using the [Build Web Apps plugin](https://developers.openai.com/codex/plugins/build-web-apps) if you're building a web app, to turn it into a working prototype.
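
For example:

```md
Implement the attached mock as a working prototype, using the Build Web Apps
plugin. Match the layout and components in the image, reuse the existing
design system, and point out anywhere the prototype deviates from the mock.
```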


---
name: Run verified operations
tagline: Run repeatable workflows and verify the result.
summary: Use Codex to normalize inputs, run approved scripts or APIs, retry
  bounded failures, and verify the result from logs or artifacts before
  reporting back.
bestFor:
  - Operations tasks with structured inputs, explicit approval, and a result
    that should be auditable.
  - Repeated workflows such as access updates, invite batches, quota changes,
    customer setup tasks, routing checks, and migration follow-ups.
  - Teams that need Codex to run a narrow scope and report exactly what
    succeeded, failed, or needs a human decision.
starterPrompt:
  title: Run an Approved Workflow
  body: >-
    I need to run this workflow:


    Goal: [what should happen]

    Inputs: [CSV, Google Sheet, list, ticket, or file path]

    Approval or policy source: [Slack thread, doc, ticket, or none]

    Runner: [script, API, CLI, skill, or manual app workflow]

    Verification artifact: [result CSV, log, dashboard, screenshot, or other
    proof]


    Please:

    - inspect the inputs and ask only for missing required fields

    - normalize dates, amounts, owners, and IDs before running the workflow

    - run a dry run first when the workflow supports it

    - run only the approved scope

    - record one success or failure row per item

    - retry transient failures once without restarting successful rows

    - summarize totals, failures, retries, and verification artifacts


    Pause before irreversible actions or scope changes.
  suggestedEffort: medium
relatedLinks:
  - label: Codex plugins
    url: /codex/plugins
  - label: Codex automations
    url: /codex/app/automations
  - label: Agent skills
    url: /codex/skills
---

59 

## Run operations you can audit

If you have repeatable operations to run regularly, such as granting a user access, applying a batch update, or calling a script with different parameters, you can use Codex to automate them and produce an auditable output.

Use this workflow when Codex should run a repeatable operation and show you what happened with an artifact that counts as verification.

65 

## Describe the task and inputs

1. Give Codex the input table, files, tickets, or other list of items it should run the process on.
2. Point it to the approval source or policy that defines the allowed scope, if applicable.
3. Tell Codex which script, API, skill, CLI, or app workflow should do the work.
4. Optionally, ask for a dry run when the workflow supports one.
5. Ask Codex to run the batch operation and record one success or failure row per item.

Keep the scope narrow, and instruct Codex to run the operation only when it has all the required inputs. If a row is missing a required field, Codex should flag that row instead of guessing.
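
The per-item bookkeeping described above can be sketched as a small script. This is a hypothetical example, not a Codex API: `run_item` stands in for whatever approved script or API call does the real work, and the CSV layout is just one reasonable shape for the audit artifact:

```python
import csv

def run_batch(items, run_item, out_path="results.csv"):
    """Run one operation per item, retry each failure once,
    and record one success/failure row per item for auditing."""
    rows = []
    for item in items:
        # Flag rows with missing required fields instead of guessing.
        if not item.get("id"):
            rows.append({"id": "", "status": "skipped", "detail": "missing id"})
            continue
        status, detail = "ok", ""
        for attempt in (1, 2):  # at most one retry per item
            try:
                run_item(item)
                status, detail = "ok", f"attempt {attempt}"
                break
            except Exception as exc:  # treat the first error as transient
                status, detail = "failed", str(exc)
        rows.append({"id": item["id"], "status": status, "detail": detail})
    # The result CSV is the verification artifact for the run.
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "status", "detail"])
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Retries never restart already-successful rows, and skipped or failed rows stay visible in the artifact rather than being silently dropped.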

80 

Connect the tools you use to run the operation with [plugins](https://developers.openai.com/codex/plugins), for example your ticketing system or the spreadsheet that holds the item list.

82 

## Require proof to verify the result

A useful operations run includes an artifact that you or a teammate can inspect, such as a result CSV, a log file, a dashboard link, a screenshot, a PR check, or any other proof that the operation was successful. When using the Codex app, you can inspect this [artifact](https://developers.openai.com/codex/app/artifacts) directly in the artifact viewer after the run to verify the result.

86 

## Turn the run into a reusable workflow

After the first successful run, ask Codex to capture the repeatable parts. For common workflows, this can become a [skill](https://developers.openai.com/codex/skills), or an [automation](https://developers.openai.com/codex/app/automations) that runs on a schedule.
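
The capture step can be a single follow-up prompt:

```md
Capture this run as a reusable skill: document the expected inputs, the
normalization rules, the runner command, the retry policy, and the format of
the verification artifact, so future threads can follow the same loop.
```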

90 

For scheduled operations, use an automation only after the manual run produces reliable output. Keep sensitive actions that might permanently affect access or data draft-only unless you explicitly want Codex to take them.