# Subagents

Codex can run subagent workflows by spawning specialized agents in parallel and then collecting their results in one response. This can be particularly helpful for complex tasks that are highly parallel, such as codebase exploration or implementing a multi-step feature plan.

With subagent workflows, you can also define your own custom agents with different model configurations and instructions depending on the task.

For the concepts and tradeoffs behind subagent workflows, including context pollution, context rot, and model-selection guidance, see [Subagent concepts](https://developers.openai.com/codex/concepts/subagents).

## Availability

Current Codex releases enable subagent workflows by default.

Subagent activity is currently surfaced in the Codex app and CLI. Visibility
in the IDE Extension is coming soon.

Codex only spawns subagents when you explicitly ask it to. Because each
subagent does its own model and tool work, subagent workflows consume more
tokens than comparable single-agent runs.

## Typical workflow

Codex handles orchestration across agents, including spawning new subagents,
routing follow-up instructions, waiting for results, and closing agent
threads.

When many agents are running, Codex waits until all requested results are
available, then returns a consolidated response.

Codex only spawns a new agent when you explicitly ask it to do so.

To see it in action, try the following prompt on your project:

```text
I would like to review the following points on the current PR (this branch vs main). Spawn one agent per point, wait for all of them, and summarize the result for each point.
1. Security issue
2. Code quality
3. Bugs
4. Race
5. Test flakiness
6. Maintainability of the code
```

## Managing subagents

- Use `/agent` in the CLI to switch between active agent threads and inspect the ongoing thread.
- Ask Codex directly to steer a running subagent, stop it, or close completed agent threads.

## Approvals and sandbox controls

Subagents inherit your current sandbox policy.

In interactive CLI sessions, approval requests can surface from inactive agent
threads even while you are looking at the main thread. The approval overlay
shows the source thread label, and you can press `o` to open that thread before
you approve, reject, or answer the request.

In non-interactive flows, or whenever a run can't surface a fresh approval, an
action that needs new approval fails and Codex surfaces the error back to the
parent workflow.

Codex also reapplies the parent turn's live runtime overrides when it spawns a
child. That includes sandbox and approval choices you set interactively during
the session, such as `/approvals` changes or `--yolo`, even if the selected
custom agent file sets different defaults.

You can also override the sandbox configuration for individual [custom agents](#custom-agents), such as explicitly marking one to work in read-only mode.
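As a minimal sketch of such a per-agent override, the custom agent file below sets `sandbox_mode` to `read-only`. The file path, agent name, description, and instructions are hypothetical and only illustrate the shape of the file.

```toml
# .codex/agents/auditor.toml (hypothetical file and agent name)
name = "auditor"
description = "Read-only agent for inspecting code without modifying it."

# Default sandbox for this agent, instead of the one inherited from the parent.
sandbox_mode = "read-only"

developer_instructions = """
Inspect the code and report findings.
Do not modify files.
"""
```

Keep in mind that, as described above, live runtime overrides from the parent turn are reapplied to spawned children, so the value in the file acts as a default rather than a guarantee.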

## Custom agents

Codex ships with built-in agents:

- `default`: general-purpose fallback agent.
- `worker`: execution-focused agent for implementation and fixes.
- `explorer`: read-heavy codebase exploration agent.

To define your own custom agents, add standalone TOML files under
`~/.codex/agents/` for personal agents or `.codex/agents/` for project-scoped
agents.

Each file defines one custom agent. Codex loads these files as configuration
layers for spawned sessions, so custom agents can override the same settings as
a normal Codex session config. That can feel heavier than a dedicated agent
manifest, and the format may evolve as authoring and sharing mature.

Every standalone custom agent file must define:

- `name`
- `description`
- `developer_instructions`

Optional fields such as `nickname_candidates`, `model`,
`model_reasoning_effort`, `sandbox_mode`, `mcp_servers`, and `skills.config`
inherit from the parent session when you omit them.
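For illustration, a custom agent file that defines only the three required fields might look like the sketch below. The file path, agent name, and wording are hypothetical.

```toml
# .codex/agents/changelog-writer.toml (hypothetical file and agent name)
name = "changelog_writer"
description = "Drafts changelog entries from the changes on the current branch."
developer_instructions = """
Summarize user-visible changes as short changelog entries.
Do not modify source files.
"""

# No model, model_reasoning_effort, sandbox_mode, mcp_servers, or
# skills.config keys are set here, so those settings come from the
# parent session.
```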

### Global settings

Global subagent settings still live under `[agents]` in your [configuration](https://developers.openai.com/codex/config-basic#configuration-precedence).

| Field | Type | Required | Purpose |
| --- | --- | --- | --- |
| `agents.max_threads` | number | No | Concurrent open agent thread cap. |
| `agents.max_depth` | number | No | Spawned agent nesting depth (root session starts at 0). |
| `agents.job_max_runtime_seconds` | number | No | Default timeout per worker for `spawn_agents_on_csv` jobs. |

**Notes:**

- `agents.max_threads` defaults to `6` when you leave it unset.
- `agents.max_depth` defaults to `1`, which allows a direct child agent to spawn but prevents deeper nesting. Keep the default unless you specifically need recursive delegation. Raising this value can turn broad delegation instructions into repeated fan-out, which increases token usage, latency, and local resource consumption. `agents.max_threads` still caps concurrent open threads, but it doesn't remove the cost and predictability risks of deeper recursion.
- `agents.job_max_runtime_seconds` is optional. When you leave it unset, `spawn_agents_on_csv` falls back to its per-call default timeout of 1800 seconds per worker.
- If a custom agent name matches a built-in agent such as `explorer`, your custom agent takes precedence.
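As a rough sketch, the three global settings could be written out explicitly as shown below. The values restate the defaults described in the notes above, so this fragment should behave much like leaving `[agents]` unset. It is meant as a starting point for tuning, not a recommended configuration.

```toml
[agents]
# Cap on concurrently open agent threads (documented default: 6).
max_threads = 6

# Nesting depth for spawned agents (documented default: 1, so direct
# children can spawn but deeper nesting is prevented).
max_depth = 1

# Default per-worker timeout for spawn_agents_on_csv jobs, in seconds.
# 1800 mirrors the per-call fallback used when this key is unset.
job_max_runtime_seconds = 1800
```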

### Custom agent file schema

| Field | Type | Required | Purpose |
| --- | --- | --- | --- |
| `name` | string | Yes | Agent name Codex uses when spawning or referring to this agent. |
| `description` | string | Yes | Human-facing guidance for when Codex should use this agent. |
| `developer_instructions` | string | Yes | Core instructions that define the agent's behavior. |
| `nickname_candidates` | string[] | No | Optional pool of display nicknames for spawned agents. |

You can also include other supported `config.toml` keys in a custom agent file, such as `model`, `model_reasoning_effort`, `sandbox_mode`, `mcp_servers`, and `skills.config`.

Codex identifies the custom agent by its `name` field. Matching the filename to
the agent name is the simplest convention, but the `name` field is the source
of truth.
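To illustrate that the `name` field, not the filename, identifies an agent, here is a sketch of a file whose hypothetical filename differs from its `name`. Because the `name` matches the built-in `explorer`, this definition would take precedence over the built-in agent, per the notes under Global settings. The description and instructions are illustrative.

```toml
# .codex/agents/fast-explorer.toml (hypothetical filename)
# Codex refers to this agent as "explorer", not "fast-explorer".
name = "explorer"
description = "Project-specific explorer that replaces the built-in explorer agent."
developer_instructions = """
Explore the codebase with targeted searches and file reads.
Cite the files and symbols you rely on.
"""
```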

### Display nicknames

Use `nickname_candidates` when you want Codex to assign more readable display
names to spawned agents. This is especially helpful when you run many
instances of the same custom agent and want the UI to show distinct labels
instead of repeating the same agent name.

Nicknames are presentation-only. Codex still identifies and spawns the agent by
its `name`.

Nickname candidates must be a non-empty list of unique names. Each nickname can
use ASCII letters, digits, spaces, hyphens, and underscores.

Example:

```toml
name = "reviewer"
description = "PR reviewer focused on correctness, security, and missing tests."
developer_instructions = """
Review code like an owner.
Prioritize correctness, security, behavior regressions, and missing test coverage.
"""
nickname_candidates = ["Atlas", "Delta", "Echo"]
```

In practice, the Codex app and CLI can show the nicknames where agent activity
appears, while the underlying agent type stays `reviewer`.

### Example custom agents

The best custom agents are narrow and opinionated. Give each one a clear job, a
tool surface that matches that job, and instructions that keep it from
drifting into adjacent work.

#### Example 1: PR review

This pattern splits review across three focused custom agents:

- `pr_explorer` maps the codebase and gathers evidence.
- `reviewer` looks for correctness, security, and test risks.
- `docs_researcher` checks framework or API documentation through a dedicated MCP server.

Project config (`.codex/config.toml`):

```toml
[agents]
max_threads = 6
max_depth = 1
```

`.codex/agents/pr-explorer.toml`:

```toml
name = "pr_explorer"
description = "Read-only codebase explorer for gathering evidence before changes are proposed."
model = "gpt-5.3-codex-spark"
model_reasoning_effort = "medium"
sandbox_mode = "read-only"
developer_instructions = """
Stay in exploration mode.
Trace the real execution path, cite files and symbols, and avoid proposing fixes unless the parent agent asks for them.
Prefer fast search and targeted file reads over broad scans.
"""
```

`.codex/agents/reviewer.toml`:

```toml
name = "reviewer"
description = "PR reviewer focused on correctness, security, and missing tests."
model = "gpt-5.4"
model_reasoning_effort = "high"
sandbox_mode = "read-only"
developer_instructions = """
Review code like an owner.
Prioritize correctness, security, behavior regressions, and missing test coverage.
Lead with concrete findings, include reproduction steps when possible, and avoid style-only comments unless they hide a real bug.
"""
```

`.codex/agents/docs-researcher.toml`:

```toml
name = "docs_researcher"
description = "Documentation specialist that uses the docs MCP server to verify APIs and framework behavior."
model = "gpt-5.4-mini"
model_reasoning_effort = "medium"
sandbox_mode = "read-only"
developer_instructions = """
Use the docs MCP server to confirm APIs, options, and version-specific behavior.
Return concise answers with links or exact references when available.
Do not make code changes.
"""

[mcp_servers.openaiDeveloperDocs]
url = "https://developers.openai.com/mcp"
```

This setup works well for prompts like:

```text
Review this branch against main. Have pr_explorer map the affected code paths, reviewer find real risks, and docs_researcher verify the framework APIs that the patch relies on.
```

## Process CSV batches with subagents (experimental)

This workflow is experimental and may change as subagent support evolves.

Use `spawn_agents_on_csv` when you have many similar tasks that map to one row per work item. Codex reads the CSV, spawns one worker subagent per row, waits for the full batch to finish, and exports the combined results to CSV.

This works well for repeated audits such as:

- reviewing one file, package, or service per row
- checking a list of incidents, PRs, or migration targets
- generating structured summaries for many similar inputs

The tool accepts:

- `csv_path` for the source CSV
- `instruction` for the worker prompt template, using `{column_name}` placeholders
- `id_column` when you want stable item ids from a specific column
- `output_schema` when each worker should return a JSON object with a fixed shape
- `output_csv_path`, `max_concurrency`, and `max_runtime_seconds` for job control

Each worker must call `report_agent_job_result` exactly once. If a worker exits without reporting a result, Codex marks that row with an error in the exported CSV.

Example prompt:

```text
Create /tmp/components.csv with columns path,owner and one row per frontend component.

Then call spawn_agents_on_csv with:
- csv_path: /tmp/components.csv
- id_column: path
- instruction: "Review {path} owned by {owner}. Return JSON with keys path, risk, summary, and follow_up via report_agent_job_result."
- output_csv_path: /tmp/components-review.csv
- output_schema: an object with required string fields path, risk, summary, and follow_up
```

When you run this through `codex exec`, Codex shows a single-line progress update on `stderr` while the batch is running. The exported CSV includes the original row data plus metadata such as `job_id`, `item_id`, `status`, `last_error`, and `result_json`.

Related runtime settings:

- `agents.max_threads` caps how many agent threads can stay open concurrently.
- `agents.job_max_runtime_seconds` sets the default per-worker timeout for CSV fan-out jobs. A per-call `max_runtime_seconds` override takes precedence.
- `sqlite_home` controls where Codex stores the SQLite-backed state used for agent jobs and their exported results.
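The runtime settings above might be combined in a config file roughly as follows. This is a sketch under assumptions: the text above does not say where `sqlite_home` is placed, so treating it as a top-level key is an assumption, and both the directory path and the 900-second timeout are hypothetical values chosen for illustration.

```toml
# Assumption: sqlite_home is a top-level key. The path is hypothetical.
sqlite_home = "/tmp/codex-state"

[agents]
# Cap on concurrently open agent threads, including CSV workers.
max_threads = 6

# Default per-worker timeout for CSV fan-out jobs, in seconds.
# A per-call max_runtime_seconds still takes precedence over this value.
job_max_runtime_seconds = 900
```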

#### Example 2: Frontend integration debugging

This pattern is useful for UI regressions, flaky browser flows, or integration bugs that cross application code and the running product.

Project config (`.codex/config.toml`):

```toml
[agents]
max_threads = 6
max_depth = 1
```

`.codex/agents/code-mapper.toml`:

```toml
name = "code_mapper"
description = "Read-only codebase explorer for locating the relevant frontend and backend code paths."
model = "gpt-5.4-mini"
model_reasoning_effort = "medium"
sandbox_mode = "read-only"
developer_instructions = """
Map the code that owns the failing UI flow.
Identify entry points, state transitions, and likely files before the worker starts editing.
"""
```

`.codex/agents/browser-debugger.toml`:

```toml
name = "browser_debugger"
description = "UI debugger that uses browser tooling to reproduce issues and capture evidence."
model = "gpt-5.4"
model_reasoning_effort = "high"
sandbox_mode = "workspace-write"
developer_instructions = """
Reproduce the issue in the browser, capture exact steps, and report what the UI actually does.
Use browser tooling for screenshots, console output, and network evidence.
Do not edit application code.
"""

[mcp_servers.chrome_devtools]
url = "http://localhost:3000/mcp"
startup_timeout_sec = 20
```

`.codex/agents/ui-fixer.toml`:

```toml
name = "ui_fixer"
description = "Implementation-focused agent for small, targeted fixes after the issue is understood."
model = "gpt-5.3-codex-spark"
model_reasoning_effort = "medium"
developer_instructions = """
Own the fix once the issue is reproduced.
Make the smallest defensible change, keep unrelated files untouched, and validate only the behavior you changed.
"""

[[skills.config]]
path = "/Users/me/.agents/skills/docs-editor/SKILL.md"
enabled = false
```

This setup works well for prompts like:

```text
Investigate why the settings modal fails to save. Have browser_debugger reproduce it, code_mapper trace the responsible code path, and ui_fixer implement the smallest fix once the failure mode is clear.
```