subagents diff — Spybara

subagents.md +340 −0 added

Details

1# Subagents

3Codex can run subagent workflows by spawning specialized agents in parallel and then collecting their results in one response. This can be particularly helpful for complex tasks that are highly parallel, such as codebase exploration or implementing a multi-step feature plan.

5With subagent workflows, you can also define your own custom agents with different model configurations and instructions depending on the task.

7For the concepts and tradeoffs behind subagent workflows, including context pollution, context rot, and model-selection guidance, see [Subagent concepts](https://developers.openai.com/codex/concepts/subagents).

9## Availability

11Current Codex releases enable subagent workflows by default.

13Subagent activity is currently surfaced in the Codex app and CLI. Visibility

14 in the IDE Extension is coming soon.

16Codex only spawns subagents when you explicitly ask it to. Because each

17subagent does its own model and tool work, subagent workflows consume more

18tokens than comparable single-agent runs.

20## Typical workflow

22Codex handles orchestration across agents, including spawning new subagents,

23routing follow-up instructions, waiting for results, and closing agent

24threads.

26When many agents are running, Codex waits until all requested results are

27available, then returns a consolidated response.

29Codex only spawns a new agent when you explicitly ask it to do so.

31To see it in action, try the following prompt on your project:

33```text

34I would like to review the following points on the current PR (this branch vs main). Spawn one agent per point, wait for all of them, and summarize the result for each point.

351. Security issue

362. Code quality

373. Bugs

384. Race

395. Test flakiness

406. Maintainability of the code

41```

43## Managing subagents

45- Use `/agent` in the CLI to switch between active agent threads and inspect the ongoing thread.

46- Ask Codex directly to steer a running subagent, stop it, or close completed agent threads.

48## Approvals and sandbox controls

50Subagents inherit your current sandbox policy.

52In interactive CLI sessions, approval requests can surface from inactive agent

53threads even while you are looking at the main thread. The approval overlay

54shows the source thread label, and you can press `o` to open that thread before

55you approve, reject, or answer the request.

57In non-interactive flows, or whenever a run can't surface a fresh approval, an

58action that needs new approval fails and Codex surfaces the error back to the

59parent workflow.

61Codex also reapplies the parent turn's live runtime overrides when it spawns a

62child. That includes sandbox and approval choices you set interactively during

63the session, such as `/approvals` changes or `--yolo`, even if the selected

64custom agent file sets different defaults.

66You can also override the sandbox configuration for individual [custom agents](#custom-agents), such as explicitly marking one to work in read-only mode.

68## Custom agents

70Codex ships with built-in agents:

72- `default`: general-purpose fallback agent.

73- `worker`: execution-focused agent for implementation and fixes.

74- `explorer`: read-heavy codebase exploration agent.

76To define your own custom agents, add standalone TOML files under

77`~/.codex/agents/` for personal agents or `.codex/agents/` for project-scoped

78agents.

80Each file defines one custom agent. Codex loads these files as configuration

81layers for spawned sessions, so custom agents can override the same settings as

82a normal Codex session config. That can feel heavier than a dedicated agent

83manifest, and the format may evolve as authoring and sharing mature.

85Every standalone custom agent file must define:

87- `name`

88- `description`

89- `developer_instructions`

91Optional fields such as `nickname_candidates`, `model`,

92`model_reasoning_effort`, `sandbox_mode`, `mcp_servers`, and `skills.config`

93inherit from the parent session when you omit them.

95### Global settings

97Global subagent settings still live under `[agents]` in your [configuration](https://developers.openai.com/codex/config-basic#configuration-precedence).

100| --- | --- | --- | --- |

104

105**Notes:**

106

107- `agents.max_threads` defaults to `6` when you leave it unset.

108- `agents.max_depth` defaults to `1`, which allows a direct child agent to spawn but prevents deeper nesting. Keep the default unless you specifically need recursive delegation. Raising this value can turn broad delegation instructions into repeated fan-out, which increases token usage, latency, and local resource consumption. `agents.max_threads` still caps concurrent open threads, but it doesn't remove the cost and predictability risks of deeper recursion.

109- `agents.job_max_runtime_seconds` is optional. When you leave it unset, `spawn_agents_on_csv` falls back to its per-call default timeout of 1800 seconds per worker.

110- If a custom agent name matches a built-in agent such as `explorer`, your custom agent takes precedence.

111

112### Custom agent file schema

113

115| --- | --- | --- | --- |

120

121You can also include other supported `config.toml` keys in a custom agent file, such as `model`, `model_reasoning_effort`, `sandbox_mode`, `mcp_servers`, and `skills.config`.

122

123Codex identifies the custom agent by its `name` field. Matching the filename to

124the agent name is the simplest convention, but the `name` field is the source

125of truth.

126

127### Display nicknames

128

129Use `nickname_candidates` when you want Codex to assign more readable display

130names to spawned agents. This is especially helpful when you run many

131instances of the same custom agent and want the UI to show distinct labels

132instead of repeating the same agent name.

133

134Nicknames are presentation-only. Codex still identifies and spawns the agent by

135its `name`.

136

137Nickname candidates must be a non-empty list of unique names. Each nickname can

138use ASCII letters, digits, spaces, hyphens, and underscores.

139

140Example:

141

142```toml

143name = "reviewer"

144description = "PR reviewer focused on correctness, security, and missing tests."

145developer_instructions = """

146Review code like an owner.

147Prioritize correctness, security, behavior regressions, and missing test coverage.

148"""

149nickname_candidates = ["Atlas", "Delta", "Echo"]

150```

151

152In practice, the Codex app and CLI can show the nicknames where agent activity

153appears, while the underlying agent type stays

154`reviewer`.

155

156### Example custom agents

157

158The best custom agents are narrow and opinionated. Give each one clear job, a

159tool surface that matches that job, and instructions that keep it from

160drifting into adjacent work.

161

162#### Example 1: PR review

163

164This pattern splits review across three focused custom agents:

165

166- `pr_explorer` maps the codebase and gathers evidence.

167- `reviewer` looks for correctness, security, and test risks.

168- `docs_researcher` checks framework or API documentation through a dedicated MCP server.

169

170Project config (`.codex/config.toml`):

171

172```toml

173[agents]

174max_threads = 6

175max_depth = 1

176```

177

178`.codex/agents/pr-explorer.toml`:

179

180```toml

181name = "pr_explorer"

182description = "Read-only codebase explorer for gathering evidence before changes are proposed."

183model = "gpt-5.3-codex-spark"

184model_reasoning_effort = "medium"

185sandbox_mode = "read-only"

186developer_instructions = """

187Stay in exploration mode.

188Trace the real execution path, cite files and symbols, and avoid proposing fixes unless the parent agent asks for them.

189Prefer fast search and targeted file reads over broad scans.

190"""

191```

192

193`.codex/agents/reviewer.toml`:

194

195```toml

196name = "reviewer"

197description = "PR reviewer focused on correctness, security, and missing tests."

198model = "gpt-5.4"

199model_reasoning_effort = "high"

200sandbox_mode = "read-only"

201developer_instructions = """

202Review code like an owner.

203Prioritize correctness, security, behavior regressions, and missing test coverage.

204Lead with concrete findings, include reproduction steps when possible, and avoid style-only comments unless they hide a real bug.

205"""

206```

207

208`.codex/agents/docs-researcher.toml`:

209

210```toml

211name = "docs_researcher"

212description = "Documentation specialist that uses the docs MCP server to verify APIs and framework behavior."

213model = "gpt-5.4-mini"

214model_reasoning_effort = "medium"

215sandbox_mode = "read-only"

216developer_instructions = """

217Use the docs MCP server to confirm APIs, options, and version-specific behavior.

218Return concise answers with links or exact references when available.

219Do not make code changes.

220"""

221

222[mcp_servers.openaiDeveloperDocs]

223url = "https://developers.openai.com/mcp"

224```

225

226This setup works well for prompts like:

227

228```text

229Review this branch against main. Have pr_explorer map the affected code paths, reviewer find real risks, and docs_researcher verify the framework APIs that the patch relies on.

230```

231

232## Process CSV batches with subagents (experimental)

233

234This workflow is experimental and may change as subagent support evolves.

235Use `spawn_agents_on_csv` when you have many similar tasks that map to one row per work item. Codex reads the CSV, spawns one worker subagent per row, waits for the full batch to finish, and exports the combined results to CSV.

236

237This works well for repeated audits such as:

238

239- reviewing one file, package, or service per row

240- checking a list of incidents, PRs, or migration targets

241- generating structured summaries for many similar inputs

242

243The tool accepts:

244

245- `csv_path` for the source CSV

246- `instruction` for the worker prompt template, using `{column_name}` placeholders

247- `id_column` when you want stable item ids from a specific column

248- `output_schema` when each worker should return a JSON object with a fixed shape

249- `output_csv_path`, `max_concurrency`, and `max_runtime_seconds` for job control

250

251Each worker must call `report_agent_job_result` exactly once. If a worker exits without reporting a result, Codex marks that row with an error in the exported CSV.

252

253Example prompt:

254

255```text

256Create /tmp/components.csv with columns path,owner and one row per frontend component.

257

258Then call spawn_agents_on_csv with:

259- csv_path: /tmp/components.csv

260- id_column: path

261- instruction: "Review {path} owned by {owner}. Return JSON with keys path, risk, summary, and follow_up via report_agent_job_result."

262- output_csv_path: /tmp/components-review.csv

263- output_schema: an object with required string fields path, risk, summary, and follow_up

264```

265

266When you run this through `codex exec`, Codex shows a single-line progress update on `stderr` while the batch is running. The exported CSV includes the original row data plus metadata such as `job_id`, `item_id`, `status`, `last_error`, and `result_json`.

267

268Related runtime settings:

269

270- `agents.max_threads` caps how many agent threads can stay open concurrently.

271- `agents.job_max_runtime_seconds` sets the default per-worker timeout for CSV fan-out jobs. A per-call `max_runtime_seconds` override takes precedence.

272- `sqlite_home` controls where Codex stores the SQLite-backed state used for agent jobs and their exported results.

273

274#### Example 2: Frontend integration debugging

275

276This pattern is useful for UI regressions, flaky browser flows, or integration bugs that cross application code and the running product.

277

278Project config (`.codex/config.toml`):

279

280```toml

281[agents]

282max_threads = 6

283max_depth = 1

284```

285

286`.codex/agents/code-mapper.toml`:

287

288```toml

289name = "code_mapper"

290description = "Read-only codebase explorer for locating the relevant frontend and backend code paths."

291model = "gpt-5.4-mini"

292model_reasoning_effort = "medium"

293sandbox_mode = "read-only"

294developer_instructions = """

295Map the code that owns the failing UI flow.

296Identify entry points, state transitions, and likely files before the worker starts editing.

297"""

298```

299

300`.codex/agents/browser-debugger.toml`:

301

302```toml

303name = "browser_debugger"

304description = "UI debugger that uses browser tooling to reproduce issues and capture evidence."

305model = "gpt-5.4"

306model_reasoning_effort = "high"

307sandbox_mode = "workspace-write"

308developer_instructions = """

309Reproduce the issue in the browser, capture exact steps, and report what the UI actually does.

310Use browser tooling for screenshots, console output, and network evidence.

311Do not edit application code.

312"""

313

314[mcp_servers.chrome_devtools]

315url = "http://localhost:3000/mcp"

316startup_timeout_sec = 20

317```

318

319`.codex/agents/ui-fixer.toml`:

320

321```toml

322name = "ui_fixer"

323description = "Implementation-focused agent for small, targeted fixes after the issue is understood."

324model = "gpt-5.3-codex-spark"

325model_reasoning_effort = "medium"

326developer_instructions = """

327Own the fix once the issue is reproduced.

328Make the smallest defensible change, keep unrelated files untouched, and validate only the behavior you changed.

329"""

330

331[[skills.config]]

332path = "/Users/me/.agents/skills/docs-editor/SKILL.md"

333enabled = false

334```

335

336This setup works well for prompts like:

337

338```text

339Investigate why the settings modal fails to save. Have browser_debugger reproduce it, code_mapper trace the responsible code path, and ui_fixer implement the smallest fix once the failure mode is clear.

340```

subagents.md Codex Docs, 2026-03-06 00:38 UTC → 2026-04-23 12:28 UTC