# Discover protein folding architectures

Turn protein-folding hypotheses into benchmarked experiment loops.

Difficulty **Advanced**

Time horizon **Long-running**

Use Codex with Goal Mode to research and implement novel architectural modifications to AlphaFold2 for improved protein folding performance.

## Best for

- Computational biologists exploring architecture, loss, or curriculum changes against an automatically scorable benchmark.
- Researchers who have a scientifically motivated hypothesis and want to compress the path from idea to working experimental fork.
- ML engineers running long-lived autoresearch loops that require persistent experiment tracking and iterative debugging.

Related links

- [Follow a goal](https://developers.openai.com/codex/use-cases/follow-goals)
- [SimplexFold repository](https://github.com/ChrisHayduk/SimplexFold)
- [SimplexFold benchmark plan](https://github.com/ChrisHayduk/SimplexFold/blob/main/BENCHMARK_PLAN.md)
- [NanoFold competition](https://github.com/ChrisHayduk/nanoFold-Competition)

## Starter prompt

Use Goal Mode to improve the validation lDDT-Cα score of this AlphaFold2-style protein-structure model on the NanoFold public benchmark.

The scientific hypothesis is that persistent higher-order geometric states may help the model learn protein geometry more efficiently from limited data:

- retain the standard MSA and pairwise representations;
- add sparse learned 2-simplex face states for selected residue triplets;
- add sparse learned 3-simplex tetrahedral states for selected residue quadruplets;
- construct topology only from official benchmark inputs and model-generated recycled geometry;
- keep the implementation computationally practical under NanoFold constraints.

Maintain durable tracking files for:

1. The current strategy, status, and proposed next steps in PLAN.md
2. A structured log of experiments and results in EXPERIMENTS.md
3. A running scratchpad of notes and thoughts in EXPERIMENT_NOTES.md

For each iteration:

1. state the hypothesis being tested;
2. make the smallest coherent code or configuration change;
3. run the relevant tests and benchmark slice;
4. record metrics, latency, memory, and failure modes;
5. decide whether to keep, revert, or refine the change;
6. periodically reassess the architecture-level search direction rather than only tuning local hyperparameters.

Do not claim generalization gains from smoke tests or single-chain overfit diagnostics. Prefer matched comparisons and preserve the evidence boundary.

## Explore a protein-folding architecture hypothesis

Use Codex Goal Mode when you have a protein-folding hypothesis that needs more
than one implementation pass. Give Codex a bounded scientific direction, a
working baseline, and an automatically scorable benchmark. Codex can implement
the architecture fork, track experiments, diagnose failures, and continue
iterating while you review the evidence.

This example started with a specific question: could an AlphaFold2-style model
learn useful protein geometry more efficiently if its trunk represented not
only residues and residue pairs, but also explicit higher-order topological
objects?

## Define a bounded experiment

AlphaFold2 already uses powerful pairwise and triangle-style reasoning inside
the Evoformer. Its triangle operations improve edge representations, but still
write back into a pair tensor. The scientist proposed testing whether persistent
learned representations for triangular faces and tetrahedral cells could
provide a useful inductive bias in a data-limited setting.

The resulting public repository, [SimplexFold](https://github.com/ChrisHayduk/SimplexFold),
adds sparse face states `F_ijk` and tetrahedral states `U_ijkl` alongside the
conventional pair representation `Z_ij`.

```
MSA representation M
 <-> pair / edge tensor Z_ij
 <-> sparse face tensor F_ijk
 <-> sparse tetra tensor U_ijkl
 -> structure module
 -> recycled geometry
 loops back into the next pass
```
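
One way to picture how the sparse index sets for `F_ijk` and `U_ijkl` might be chosen is to keep only residue triplets and quadruplets whose Cα atoms are mutually close in the recycled geometry. The sketch below illustrates that idea under stated assumptions and is not SimplexFold's actual selection rule: the function name `select_sparse_simplices`, the plain-tuple coordinate format, and the 8 Å cutoff are all hypothetical.

```python
import itertools
import math


def select_sparse_simplices(ca_coords, cutoff=8.0):
    """Pick residue triplets and quadruplets whose C-alpha atoms are mutually close.

    ca_coords: list of (x, y, z) tuples, one per residue (hypothetical input,
    for example recycled geometry from the previous pass).
    cutoff: contact distance in angstroms (illustrative value, an assumption).
    Returns (faces, tetras) as lists of ascending index tuples.
    """
    n = len(ca_coords)

    # Build the contact graph: residues i and j are neighbors within the cutoff.
    close = [set() for _ in range(n)]
    for i, j in itertools.combinations(range(n), 2):
        if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
            close[i].add(j)
            close[j].add(i)

    # A face (2-simplex) is a triangle in the contact graph.
    faces = []
    for i in range(n):
        for j in sorted(m for m in close[i] if m > i):
            for k in sorted(m for m in close[i] & close[j] if m > j):
                faces.append((i, j, k))

    # A tetra (3-simplex) extends a face by a residue adjacent to all three corners.
    tetras = []
    for i, j, k in faces:
        for l in sorted(m for m in close[i] & close[j] & close[k] if m > k):
            tetras.append((i, j, k, l))

    return faces, tetras
```

Because every selected simplex is a clique in the contact graph, the number of face and tetra states in this sketch scales with local packing density rather than with the full N^3 or N^4 index space.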

Start with the starter prompt on this page, a minimal AlphaFold2-style baseline,
and the public NanoFold benchmark. The benchmark provides a small, curated,
fixed-data substrate for structural-biology experimentation that can be scored
automatically. Keep the first implementation small enough to verify with
targeted unit tests and microbenchmarks before launching expensive training
runs.
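
A microbenchmark at this stage can be as small as a timing and peak-memory wrapper around one function call. The sketch below uses only the Python standard library; `microbenchmark` is a hypothetical helper name, and `tracemalloc` only sees Python-heap allocations, so accelerator memory would need a framework-specific counter instead.

```python
import time
import tracemalloc


def microbenchmark(fn, *args, repeats=5, **kwargs):
    """Return the best wall-clock time and the peak Python-heap usage of fn.

    Timing and memory tracing run in separate passes because tracing
    slows execution and would distort the latency measurement.
    """
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        timings.append(time.perf_counter() - start)

    # One extra traced call to capture peak allocation.
    tracemalloc.start()
    fn(*args, **kwargs)
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"best_seconds": min(timings), "peak_bytes": peak_bytes}
```

Reporting the best of several repeats reduces noise from other processes; recording both numbers next to each experiment matches the prompt's request to track latency and memory.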

## Run the search with Goal Mode

1. Supply a falsifiable, high-level scientific hypothesis instead of asking the model to invent an entire research agenda from scratch.
2. Use GPT-5.5 Pro in ChatGPT to convert that direction into an implementation plan with explicit constraints and ablations.
3. Ask Codex to implement the smallest runnable [SimplexFold](https://github.com/ChrisHayduk/SimplexFold) baseline, then verify it with targeted unit tests and microbenchmarks.
4. Give the resulting repository to Codex Goal Mode and instruct it to hill-climb validation `lDDT-Cα` on the NanoFold benchmark while preserving experiment logs, plans, and artifact references.
5. Run Goal Mode continuously while it uses benchmark feedback to iterate on the architecture, training recipe, and experimental harness. In this example, the loop ran for more than 150 hours.
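
Step 4 optimizes `lDDT-Cα`, so it helps to know what that score measures. The sketch below follows the commonly used lDDT definition on Cα atoms only: a 15 Å inclusion radius in the reference structure and distance-difference thresholds of 0.5, 1, 2, and 4 Å. It pools all residue pairs into one global score; the official NanoFold scorer may differ in details such as masking or per-residue averaging, and the function name `lddt_ca` is an assumption.

```python
import math


def lddt_ca(pred, ref, radius=15.0, thresholds=(0.5, 1.0, 2.0, 4.0)):
    """Global lDDT over C-alpha atoms, returned as a fraction in [0, 1].

    pred, ref: equal-length lists of (x, y, z) tuples in angstroms.
    Only pairs closer than `radius` in the reference are scored.
    """
    if len(pred) != len(ref):
        raise ValueError("pred and ref must contain the same number of residues")

    preserved = 0
    total = 0
    n = len(ref)
    for i in range(n):
        for j in range(i + 1, n):
            d_ref = math.dist(ref[i], ref[j])
            if d_ref >= radius:
                continue  # pair is outside the local neighborhood
            diff = abs(math.dist(pred[i], pred[j]) - d_ref)
            # Count how many tolerance thresholds this pair satisfies.
            preserved += sum(diff < t for t in thresholds)
            total += len(thresholds)

    return preserved / total if total else 0.0
```

The score depends only on inter-residue distances, so it needs no structural superposition and is unchanged by rotating or translating the prediction.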

Use `PLAN.md` for the current strategy and next steps, `EXPERIMENTS.md` for a
structured log of results, and `EXPERIMENT_NOTES.md` for the running scratchpad.
These artifacts make a long-running search auditable and give you a stable
place to steer the next iteration.
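
To make "structured log" concrete, here is a minimal sketch of a helper that appends one entry per iteration to `EXPERIMENTS.md`. The entry layout, the field names, and the function `log_experiment` are assumptions for illustration; the SimplexFold repository may organize its log differently.

```python
from datetime import datetime, timezone
from pathlib import Path

ALLOWED_DECISIONS = {"keep", "revert", "refine"}


def log_experiment(path, hypothesis, change, metrics, decision):
    """Append one Markdown entry to an experiment log and return the entry text.

    metrics: dict mapping a metric name to its recorded value.
    decision: one of "keep", "revert", or "refine".
    """
    # Validate before touching the file so a bad call leaves the log unchanged.
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"decision must be one of {sorted(ALLOWED_DECISIONS)}")

    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    lines = [f"## {stamp}", "", f"- Hypothesis: {hypothesis}", f"- Change: {change}"]
    lines += [f"- {name}: {value}" for name, value in sorted(metrics.items())]
    lines += [f"- Decision: {decision}", ""]
    entry = "\n".join(lines) + "\n"

    # Append-only writes keep earlier entries intact for later audits.
    with Path(path).open("a", encoding="utf-8") as handle:
        handle.write(entry)
    return entry
```

An append-only format keeps reverted and failed experiments in the record, which matters when you later need to see why a search direction was abandoned.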

Goal Mode is useful here because the search requires repeated implementation,
testing, experiment tracking, failure diagnosis, and benchmark-driven
iteration. In this example, unguided autoresearch often drifted toward familiar
local changes such as losses, optimizers, and hyperparameters. A compact,
scientist-supplied architecture hypothesis gave Codex a more meaningful search
space while still leaving room to test, diagnose, and refine the
implementation.

This workflow is also useful for teams evaluating how scientist-in-the-loop
steering changes the quality of agentic scientific search.

## Example result

The result of this workflow was [SimplexFold](https://github.com/ChrisHayduk/SimplexFold),
an experimental architecture with explicit higher-order simplex states. Review
the topology alongside the benchmark logs to confirm that each iteration still
tests the original scientific idea.

The useful lesson is not that Codex autonomously solved protein folding. The
workflow shows how Goal Mode can act as a persistent scientific engineering
loop: a scientist contributes the conceptual move, and Codex compresses the
implementation, experimentation, debugging, and follow-up search cycle.

Treat promising diagnostics as evidence that the implementation path works,
not as proof of generalization. Review the agent’s trajectory periodically,
steer it back toward scientifically meaningful architecture questions if it
collapses into local hyperparameter tuning, and promote claims only after
matched public-validation comparisons and appropriate replicates.
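
A matched comparison can be summarized with paired differences: run the baseline and the candidate under the same seeds and data, then examine the per-seed score gaps. The sketch below is one simple way to do that with the standard library. The helper name `paired_summary` and the choice of summary statistics are assumptions, and a small number of replicates still leaves wide uncertainty.

```python
import math
import statistics


def paired_summary(baseline, candidate):
    """Summarize per-replicate score differences (candidate minus baseline).

    baseline, candidate: equal-length lists of scores from matched runs,
    for example the same seed and the same validation split for each pair.
    """
    if len(baseline) != len(candidate):
        raise ValueError("runs must be matched one-to-one")
    if len(baseline) < 2:
        raise ValueError("need at least two matched replicates")

    diffs = [c - b for b, c in zip(baseline, candidate)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, from the sample standard deviation.
    sem = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return {
        "n": len(diffs),
        "mean_diff": mean_diff,
        "sem": sem,
        "wins": sum(d > 0 for d in diffs),
    }
```

Pairing removes seed-to-seed variance that both arms share, so a small but consistent gain is easier to distinguish from noise than it would be when comparing two unpaired averages.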

## Resources

- [SimplexFold repository](https://github.com/ChrisHayduk/SimplexFold)
- [SimplexFold benchmark plan](https://github.com/ChrisHayduk/SimplexFold/blob/main/BENCHMARK_PLAN.md)
- [NanoFold competition](https://github.com/ChrisHayduk/nanoFold-Competition)
- [NanoFold competition rules](https://github.com/ChrisHayduk/nanoFold-Competition/blob/main/docs/COMPETITION.md)
- [Goal Mode running for more than 150 hours](https://x.com/ChrisHayduk/status/2055757345506877759?s=20)
- [Goal Mode article](https://x.com/ChrisHayduk/status/2053807198870880743?s=20)

## Related use cases

- [Annotate scRNA-seq data](https://developers.openai.com/codex/use-cases/scrna-seq-post-count-qc): Use Codex with the NGS Analysis plugin to turn a 10x-style matrix bundle into QC-filtered...
- [Validate bulk RNA-seq inputs](https://developers.openai.com/codex/use-cases/bulk-rna-seq-fastq-qc): Use Codex with the NGS Analysis plugin to validate sample sheets, FASTQs, and references...
- [Prioritize drug targets](https://developers.openai.com/codex/use-cases/target-prioritization): Use Codex with the Life Science Research plugin to normalize entities, retrieve genetics...