SpyBara
Go Premium Account
2026
7 Apr 2026, 00:40
14 May 2026, 21:00 14 May 2026, 07:00 13 May 2026, 00:57 12 May 2026, 01:59 11 May 2026, 18:00 7 May 2026, 20:02 7 May 2026, 17:08 5 May 2026, 23:00 2 May 2026, 06:45 2 May 2026, 00:48 1 May 2026, 18:29 30 Apr 2026, 18:36 29 Apr 2026, 12:40 29 Apr 2026, 00:50 25 Apr 2026, 06:37 25 Apr 2026, 00:42 24 Apr 2026, 18:20 24 Apr 2026, 12:28 23 Apr 2026, 18:31 23 Apr 2026, 12:28 23 Apr 2026, 00:46 22 Apr 2026, 18:29 22 Apr 2026, 00:42 21 Apr 2026, 18:29 21 Apr 2026, 12:30 21 Apr 2026, 06:45 20 Apr 2026, 18:26 20 Apr 2026, 06:53 18 Apr 2026, 18:18 17 Apr 2026, 00:44 16 Apr 2026, 18:31 16 Apr 2026, 00:46 15 Apr 2026, 18:31 15 Apr 2026, 06:44 14 Apr 2026, 18:31 14 Apr 2026, 12:29 13 Apr 2026, 18:37 13 Apr 2026, 00:44 12 Apr 2026, 06:38 10 Apr 2026, 18:23 9 Apr 2026, 00:33 8 Apr 2026, 18:32 8 Apr 2026, 00:40 7 Apr 2026, 00:40 2 Apr 2026, 18:23 31 Mar 2026, 06:35 31 Mar 2026, 00:39 28 Mar 2026, 06:26 28 Mar 2026, 00:36 27 Mar 2026, 18:23 27 Mar 2026, 00:39 26 Mar 2026, 18:27 25 Mar 2026, 18:24 23 Mar 2026, 18:22 20 Mar 2026, 00:35 18 Mar 2026, 12:23 18 Mar 2026, 00:36 17 Mar 2026, 18:24 17 Mar 2026, 00:33 16 Mar 2026, 18:25 16 Mar 2026, 12:23 14 Mar 2026, 00:32 13 Mar 2026, 18:15 13 Mar 2026, 00:34 11 Mar 2026, 00:31 9 Mar 2026, 00:34 8 Mar 2026, 18:10 8 Mar 2026, 00:35 7 Mar 2026, 18:10 7 Mar 2026, 06:14 7 Mar 2026, 00:33 6 Mar 2026, 00:38 5 Mar 2026, 18:41 5 Mar 2026, 06:22 5 Mar 2026, 00:34 4 Mar 2026, 18:18 4 Mar 2026, 06:20 3 Mar 2026, 18:20 3 Mar 2026, 00:35 27 Feb 2026, 18:15 24 Feb 2026, 06:27 24 Feb 2026, 00:33 23 Feb 2026, 18:27 21 Feb 2026, 00:33 20 Feb 2026, 12:16 19 Feb 2026, 20:53 19 Feb 2026, 20:37
30 Apr 2026, 18:36
14 May 2026, 21:00 14 May 2026, 07:00 13 May 2026, 00:57 12 May 2026, 01:59 11 May 2026, 18:00 7 May 2026, 20:02 7 May 2026, 17:08 5 May 2026, 23:00 2 May 2026, 06:45 2 May 2026, 00:48 1 May 2026, 18:29 30 Apr 2026, 18:36 29 Apr 2026, 12:40 29 Apr 2026, 00:50 25 Apr 2026, 06:37 25 Apr 2026, 00:42 24 Apr 2026, 18:20 24 Apr 2026, 12:28 23 Apr 2026, 18:31 23 Apr 2026, 12:28 23 Apr 2026, 00:46 22 Apr 2026, 18:29 22 Apr 2026, 00:42 21 Apr 2026, 18:29 21 Apr 2026, 12:30 21 Apr 2026, 06:45 20 Apr 2026, 18:26 20 Apr 2026, 06:53 18 Apr 2026, 18:18 17 Apr 2026, 00:44 16 Apr 2026, 18:31 16 Apr 2026, 00:46 15 Apr 2026, 18:31 15 Apr 2026, 06:44 14 Apr 2026, 18:31 14 Apr 2026, 12:29 13 Apr 2026, 18:37 13 Apr 2026, 00:44 12 Apr 2026, 06:38 10 Apr 2026, 18:23 9 Apr 2026, 00:33 8 Apr 2026, 18:32 8 Apr 2026, 00:40 7 Apr 2026, 00:40 2 Apr 2026, 18:23 31 Mar 2026, 06:35 31 Mar 2026, 00:39 28 Mar 2026, 06:26 28 Mar 2026, 00:36 27 Mar 2026, 18:23 27 Mar 2026, 00:39 26 Mar 2026, 18:27 25 Mar 2026, 18:24 23 Mar 2026, 18:22 20 Mar 2026, 00:35 18 Mar 2026, 12:23 18 Mar 2026, 00:36 17 Mar 2026, 18:24 17 Mar 2026, 00:33 16 Mar 2026, 18:25 16 Mar 2026, 12:23 14 Mar 2026, 00:32 13 Mar 2026, 18:15 13 Mar 2026, 00:34 11 Mar 2026, 00:31 9 Mar 2026, 00:34 8 Mar 2026, 18:10 8 Mar 2026, 00:35 7 Mar 2026, 18:10 7 Mar 2026, 06:14 7 Mar 2026, 00:33 6 Mar 2026, 00:38 5 Mar 2026, 18:41 5 Mar 2026, 06:22 5 Mar 2026, 00:34 4 Mar 2026, 18:18 4 Mar 2026, 06:20 3 Mar 2026, 18:20 3 Mar 2026, 00:35 27 Feb 2026, 18:15 24 Feb 2026, 06:27 24 Feb 2026, 00:33 23 Feb 2026, 18:27 21 Feb 2026, 00:33 20 Feb 2026, 12:16 19 Feb 2026, 20:53 19 Feb 2026, 20:37
Thu 2 18:23 Tue 7 00:40 Wed 8 00:40 Wed 8 18:32 Thu 9 00:33 Fri 10 18:23 Sun 12 06:38 Mon 13 00:44 Mon 13 18:37 Tue 14 12:29 Tue 14 18:31 Wed 15 06:44 Wed 15 18:31 Thu 16 00:46 Thu 16 18:31 Fri 17 00:44 Sat 18 18:18 Mon 20 06:53 Mon 20 18:26 Tue 21 06:45 Tue 21 12:30 Tue 21 18:29 Wed 22 00:42 Wed 22 18:29 Thu 23 00:46 Thu 23 12:28 Thu 23 18:31 Fri 24 12:28 Fri 24 18:20 Sat 25 00:42 Sat 25 06:37 Wed 29 00:50 Wed 29 12:40 Thu 30 18:36
Details

1# Iterate on difficult problems | Codex use cases1# Iterate on difficult problems | Codex use cases

2 2 

3Codex use cases

4 

5![](/assets/OpenAI-black-wordmark.svg)

6 

7![Codex](/assets/OAI_Codex-Lockup_Fallback_Black.svg)

8 

9Codex use case

10 

11# Iterate on difficult problems

12 

13Use Codex as a scored improvement loop to solve hard tasks.

14 

15Difficulty **Advanced**

16 

17Time horizon **Long-running**

18 

19Give Codex an evaluation system, such as scripts and reviewable artifacts, so it can keep improving a hard task until the scores are good enough.

20 

21## Best for

22 

23- Problems where each iteration can be scored, but the best result usually takes many passes

24- Tasks with visual or subjective outputs that need both deterministic checks and an LLM-as-a-judge score

25- Long-running Codex sessions where you want progress tracked clearly instead of relying on context

26 

27# Contents

28 

3[← All use cases](https://developers.openai.com/codex/use-cases)29[← All use cases](https://developers.openai.com/codex/use-cases)

4 30 

31Copy page [Export as PDF](https://developers.openai.com/codex/use-cases/iterate-on-difficult-problems/?export=pdf)

32 

5Give Codex an evaluation system, such as scripts and reviewable artifacts, so it can keep improving a hard task until the scores are good enough.33Give Codex an evaluation system, such as scripts and reviewable artifacts, so it can keep improving a hard task until the scores are good enough.

6 34 

7Advanced35Advanced


20 48 

21## Starter prompt49## Starter prompt

22 50 

51I have a difficult task in this workspace and I want you to run it as an eval-driven improvement loop.

52 Before changing anything:

53 - Read `AGENTS.md`.

54 - Find the script or command that scores the current output.

55 Iteration loop:

56 - Make one focused improvement at a time.

57 - Re-run the eval command after each meaningful change.

58 - Log the scores and what changed.

59- Inspect generated artifacts directly. If the output is visual, use `view\_image`.

60 - Keep going until both the overall score and the LLM average are above 90%.

61 Constraints:

62 - Do not stop at the first acceptable result.

63- Do not revert to an earlier version unless the new result is clearly worse in scores or artifacts.

64- If the eval improves but is still below target, explain the bottleneck and continue.

65 Output:

66 - current best scores

67 - log of major iterations

68 - remaining risks or weak spots

69 

70[Open in the Codex app](codex://new?prompt=I+have+a+difficult+task+in+this+workspace+and+I+want+you+to+run+it+as+an+eval-driven+improvement+loop.%0A%0ABefore+changing+anything%3A%0A-+Read+%60AGENTS.md%60.%0A-+Find+the+script+or+command+that+scores+the+current+output.%0A%0AIteration+loop%3A%0A-+Make+one+focused+improvement+at+a+time.%0A-+Re-run+the+eval+command+after+each+meaningful+change.%0A-+Log+the+scores+and+what+changed.%0A-+Inspect+generated+artifacts+directly.+If+the+output+is+visual%2C+use+%60view_image%60.%0A-+Keep+going+until+both+the+overall+score+and+the+LLM+average+are+above+90%25.%0A%0AConstraints%3A%0A-+Do+not+stop+at+the+first+acceptable+result.%0A-+Do+not+revert+to+an+earlier+version+unless+the+new+result+is+clearly+worse+in+scores+or+artifacts.%0A-+If+the+eval+improves+but+is+still+below+target%2C+explain+the+bottleneck+and+continue.%0A%0AOutput%3A%0A-+current+best+scores%0A-+log+of+major+iterations%0A-+remaining+risks+or+weak+spots "Open in the Codex app")

71 

23I have a difficult task in this workspace and I want you to run it as an eval-driven improvement loop.72I have a difficult task in this workspace and I want you to run it as an eval-driven improvement loop.

24 Before changing anything:73 Before changing anything:

25 - Read `AGENTS.md`.74 - Read `AGENTS.md`.


127 176 

128Use Codex to turn a game brief into first a well-defined plan, and then a real browser-based...177Use Codex to turn a game brief into first a well-defined plan, and then a real browser-based...

129 178 

130Engineering Code](https://developers.openai.com/codex/use-cases/browser-games)[![](/images/codex/codex-wallpaper-2.webp)179Engineering Code](https://developers.openai.com/codex/use-cases/browser-games)[![](/images/codex/codex-wallpaper-1.webp)

131 180 

132### Analyze datasets and ship reports181### Learn a new concept

133 182 

134Use Codex to clean data, join sources, explore hypotheses, model results, and package the...183Use Codex to study material such as research papers or courses, split the reading across...

135 184 

136Data Analysis](https://developers.openai.com/codex/use-cases/datasets-and-reports)185Knowledge Work Data](https://developers.openai.com/codex/use-cases/learn-a-new-concept)