cli/resources/fine_tuning/subresources/methods/index.md +0 −722 deleted
# Methods

## Domain Types

### Dpo Hyperparameters

- `dpo_hyperparameters: object { batch_size, beta, learning_rate_multiplier, n_epochs }`

  The hyperparameters used for the DPO fine-tuning job.

  - `batch_size: optional "auto" or number`

    Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `beta: optional "auto" or number`

    The beta value for the DPO method. A higher beta value will increase the weight of the penalty between the policy and reference model.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `learning_rate_multiplier: optional "auto" or number`

    Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `n_epochs: optional "auto" or number`

    The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.

    - `union_member_0: "auto"`

    - `union_member_1: number`

### Dpo Method

- `dpo_method: object { hyperparameters }`

  Configuration for the DPO fine-tuning method.

  - `hyperparameters: optional object { batch_size, beta, learning_rate_multiplier, n_epochs }`

    The hyperparameters used for the DPO fine-tuning job.

    - `batch_size: optional "auto" or number`

      Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.

      - `union_member_0: "auto"`

      - `union_member_1: number`

    - `beta: optional "auto" or number`

      The beta value for the DPO method. A higher beta value will increase the weight of the penalty between the policy and reference model.

      - `union_member_0: "auto"`

      - `union_member_1: number`

    - `learning_rate_multiplier: optional "auto" or number`

      Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.

      - `union_member_0: "auto"`

      - `union_member_1: number`

    - `n_epochs: optional "auto" or number`

      The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.

      - `union_member_0: "auto"`

      - `union_member_1: number`

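As a rough illustration of the `dpo_method` shape described above, the following sketch builds the object as a plain dictionary and checks that each hyperparameter is either the literal `"auto"` or a number. The helper names `check_auto_or_number` and `build_dpo_method` are hypothetical and not part of any CLI or SDK; the sketch only mirrors the field names listed in this section.

```python
import json
from typing import Union

AutoOrNumber = Union[str, int, float]

# Field names taken from the dpo_method.hyperparameters listing above.
DPO_FIELDS = {"batch_size", "beta", "learning_rate_multiplier", "n_epochs"}


def check_auto_or_number(name: str, value: AutoOrNumber) -> AutoOrNumber:
    """Accept the literal "auto" or a number (hypothetical helper)."""
    if value == "auto":
        return value
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value
    raise ValueError(f'{name} must be "auto" or a number, got {value!r}')


def build_dpo_method(**hyperparameters: AutoOrNumber) -> dict:
    """Build a dpo_method dict; hyperparameters is optional, so omit it when empty."""
    unknown = set(hyperparameters) - DPO_FIELDS
    if unknown:
        raise ValueError(f"unknown DPO hyperparameters: {sorted(unknown)}")
    checked = {k: check_auto_or_number(k, v) for k, v in hyperparameters.items()}
    return {"hyperparameters": checked} if checked else {}


dpo_method = build_dpo_method(beta=0.1, n_epochs="auto")
print(json.dumps(dpo_method, indent=2))
```

Fields that are not passed are simply left out of the dictionary, which matches their `optional` marking; how the service treats an omitted field is not stated in this reference.
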
### Reinforcement Hyperparameters

- `reinforcement_hyperparameters: object { batch_size, compute_multiplier, eval_interval, 4 more }`

  The hyperparameters used for the reinforcement fine-tuning job.

  - `batch_size: optional "auto" or number`

    Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `compute_multiplier: optional "auto" or number`

    Multiplier on the amount of compute used for exploring the search space during training.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `eval_interval: optional "auto" or number`

    The number of training steps between evaluation runs.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `eval_samples: optional "auto" or number`

    Number of evaluation samples to generate per training step.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `learning_rate_multiplier: optional "auto" or number`

    Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `n_epochs: optional "auto" or number`

    The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `reasoning_effort: optional "default" or "low" or "medium" or "high"`

    Level of reasoning effort.

    - `"default"`

    - `"low"`

    - `"medium"`

    - `"high"`

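One way to model the seven optional fields listed above is a small dataclass that leaves unset fields out of the serialized payload. This is only a sketch: the class name and the `to_payload` method are hypothetical, and treating `None` as "not set" is an assumption made for the example rather than documented behavior.

```python
from dataclasses import asdict, dataclass
from typing import Literal, Optional, Union, get_args

AutoOrNumber = Union[Literal["auto"], int, float]
ReasoningEffort = Literal["default", "low", "medium", "high"]


@dataclass
class ReinforcementHyperparameters:
    """Mirrors the field list above; None means the field was not set."""

    batch_size: Optional[AutoOrNumber] = None
    compute_multiplier: Optional[AutoOrNumber] = None
    eval_interval: Optional[AutoOrNumber] = None
    eval_samples: Optional[AutoOrNumber] = None
    learning_rate_multiplier: Optional[AutoOrNumber] = None
    n_epochs: Optional[AutoOrNumber] = None
    reasoning_effort: Optional[ReasoningEffort] = None

    def __post_init__(self) -> None:
        # Type hints are not enforced at runtime, so check the enum by hand.
        allowed = get_args(ReasoningEffort)
        if self.reasoning_effort is not None and self.reasoning_effort not in allowed:
            raise ValueError(f"reasoning_effort must be one of {allowed}")

    def to_payload(self) -> dict:
        """Drop unset fields so only explicit choices are serialized."""
        return {k: v for k, v in asdict(self).items() if v is not None}


hp = ReinforcementHyperparameters(eval_interval=10, reasoning_effort="low")
print(hp.to_payload())
```
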
### Reinforcement Method
152
153- `reinforcement_method: object { grader, hyperparameters }`
154
155 Configuration for the reinforcement fine-tuning method.
156
157 - `grader: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more`
158
159 The grader used for the fine-tuning job.
160
161 - `string_check_grader: object { input, name, operation, 2 more }`
162
163 A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
164
165 - `input: string`
166
167 The input text. This may include template strings.
168
169 - `name: string`
170
171 The name of the grader.
172
173 - `operation: "eq" or "ne" or "like" or "ilike"`
174
175 The string check operation to perform. One of `eq`, `ne`, `like`, or `ilike`.
176
177 - `"eq"`
178
179 - `"ne"`
180
181 - `"like"`
182
183 - `"ilike"`
184
185 - `reference: string`
186
187 The reference text. This may include template strings.
188
189 - `type: "string_check"`
190
191 The object type, which is always `string_check`.
192
193 - `text_similarity_grader: object { evaluation_metric, input, name, 2 more }`
194
195 A TextSimilarityGrader object which grades text based on similarity metrics.
196
197 - `evaluation_metric: "cosine" or "fuzzy_match" or "bleu" or 8 more`
198
199 The evaluation metric to use. One of `cosine`, `fuzzy_match`, `bleu`,
200 `gleu`, `meteor`, `rouge_1`, `rouge_2`, `rouge_3`, `rouge_4`, `rouge_5`,
201 or `rouge_l`.
202
203 - `"cosine"`
204
205 - `"fuzzy_match"`
206
207 - `"bleu"`
208
209 - `"gleu"`
210
211 - `"meteor"`
212
213 - `"rouge_1"`
214
215 - `"rouge_2"`
216
217 - `"rouge_3"`
218
219 - `"rouge_4"`
220
221 - `"rouge_5"`
222
223 - `"rouge_l"`
224
225 - `input: string`
226
227 The text being graded.
228
229 - `name: string`
230
231 The name of the grader.
232
233 - `reference: string`
234
235 The text being graded against.
236
237 - `type: "text_similarity"`
238
239 The type of grader.
240
241 - `python_grader: object { name, source, type, image_tag }`
242
 A PythonGrader object that runs a Python script on the input.
244
245 - `name: string`
246
247 The name of the grader.
248
249 - `source: string`
250
251 The source code of the python script.
252
253 - `type: "python"`
254
255 The object type, which is always `python`.
256
257 - `image_tag: optional string`
258
259 The image tag to use for the python script.
260
261 - `score_model_grader: object { input, model, name, 3 more }`
262
263 A ScoreModelGrader object that uses a model to assign a score to the input.
264
265 - `input: array of object { content, role, type }`
266
267 The input messages evaluated by the grader. Supports text, output text, input image, and input audio content blocks, and may include template strings.
268
269 - `content: string or ResponseInputText or object { text, type } or 3 more`
270
271 Inputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
272
273 - `Text input: string`
274
275 A text input to the model.
276
277 - `response_input_text: object { text, type }`
278
279 A text input to the model.
280
281 - `text: string`
282
283 The text input to the model.
284
285 - `type: "input_text"`
286
287 The type of the input item. Always `input_text`.
288
289 - `Output text: object { text, type }`
290
291 A text output from the model.
292
293 - `text: string`
294
295 The text output from the model.
296
297 - `type: "output_text"`
298
299 The type of the output text. Always `output_text`.
300
301 - `Input image: object { image_url, type, detail }`
302
303 An image input block used within EvalItem content arrays.
304
305 - `image_url: string`
306
307 The URL of the image input.
308
309 - `type: "input_image"`
310
311 The type of the image input. Always `input_image`.
312
313 - `detail: optional string`
314
315 The detail level of the image to be sent to the model. One of `high`, `low`, or `auto`. Defaults to `auto`.
316
317 - `response_input_audio: object { input_audio, type }`
318
319 An audio input to the model.
320
321 - `input_audio: object { data, format }`
322
323 - `data: string`
324
325 Base64-encoded audio data.
326
327 - `format: "mp3" or "wav"`
328
329 The format of the audio data. Currently supported formats are `mp3` and
330 `wav`.
331
332 - `"mp3"`
333
334 - `"wav"`
335
336 - `type: "input_audio"`
337
338 The type of the input item. Always `input_audio`.
339
340 - `grader_inputs: array of string or ResponseInputText or object { text, type } or 2 more`
341
342 A list of inputs, each of which may be either an input text, output text, input
343 image, or input audio object.
344
345 - `Text input: string`
346
347 A text input to the model.
348
349 - `response_input_text: object { text, type }`
350
351 A text input to the model.
352
353 - `Output text: object { text, type }`
354
355 A text output from the model.
356
357 - `text: string`
358
359 The text output from the model.
360
361 - `type: "output_text"`
362
363 The type of the output text. Always `output_text`.
364
365 - `Input image: object { image_url, type, detail }`
366
367 An image input block used within EvalItem content arrays.
368
369 - `image_url: string`
370
371 The URL of the image input.
372
373 - `type: "input_image"`
374
375 The type of the image input. Always `input_image`.
376
377 - `detail: optional string`
378
379 The detail level of the image to be sent to the model. One of `high`, `low`, or `auto`. Defaults to `auto`.
380
381 - `response_input_audio: object { input_audio, type }`
382
383 An audio input to the model.
384
385 - `role: "user" or "assistant" or "system" or "developer"`
386
387 The role of the message input. One of `user`, `assistant`, `system`, or
388 `developer`.
389
390 - `"user"`
391
392 - `"assistant"`
393
394 - `"system"`
395
396 - `"developer"`
397
398 - `type: optional "message"`
399
400 The type of the message input. Always `message`.
401
402 - `"message"`
403
404 - `model: string`
405
406 The model to use for the evaluation.
407
408 - `name: string`
409
410 The name of the grader.
411
412 - `type: "score_model"`
413
414 The object type, which is always `score_model`.
415
416 - `range: optional array of number`
417
418 The range of the score. Defaults to `[0, 1]`.
419
420 - `sampling_params: optional object { max_completions_tokens, reasoning_effort, seed, 2 more }`
421
422 The sampling parameters for the model.
423
424 - `max_completions_tokens: optional number`
425
426 The maximum number of tokens the grader model may generate in its response.
427
428 - `reasoning_effort: optional "none" or "minimal" or "low" or 3 more`
429
430 Constrains effort on reasoning for
431 [reasoning models](https://platform.openai.com/docs/guides/reasoning).
432 Currently supported values are `none`, `minimal`, `low`, `medium`, `high`, and `xhigh`. Reducing
433 reasoning effort can result in faster responses and fewer tokens used
434 on reasoning in a response.
435
436 - `gpt-5.1` defaults to `none`, which does not perform reasoning. The supported reasoning values for `gpt-5.1` are `none`, `low`, `medium`, and `high`. Tool calls are supported for all reasoning values in gpt-5.1.
437 - All models before `gpt-5.1` default to `medium` reasoning effort, and do not support `none`.
438 - The `gpt-5-pro` model defaults to (and only supports) `high` reasoning effort.
439 - `xhigh` is supported for all models after `gpt-5.1-codex-max`.
440
441 - `"none"`
442
443 - `"minimal"`
444
445 - `"low"`
446
447 - `"medium"`
448
449 - `"high"`
450
451 - `"xhigh"`
452
453 - `seed: optional number`
454
 A seed value used to initialize randomness during sampling.
456
457 - `temperature: optional number`
458
459 A higher temperature increases randomness in the outputs.
460
461 - `top_p: optional number`
462
463 An alternative to temperature for nucleus sampling; 1.0 includes all tokens.
464
465 - `multi_grader: object { calculate_output, graders, name, type }`
466
467 A MultiGrader object combines the output of multiple graders to produce a single score.
468
469 - `calculate_output: string`
470
471 A formula to calculate the output based on grader results.
472
473 - `graders: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more`
474
475 A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
476
477 - `string_check_grader: object { input, name, operation, 2 more }`
478
479 A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
480
481 - `text_similarity_grader: object { evaluation_metric, input, name, 2 more }`
482
483 A TextSimilarityGrader object which grades text based on similarity metrics.
484
485 - `python_grader: object { name, source, type, image_tag }`
486
 A PythonGrader object that runs a Python script on the input.
488
489 - `score_model_grader: object { input, model, name, 3 more }`
490
491 A ScoreModelGrader object that uses a model to assign a score to the input.
492
493 - `label_model_grader: object { input, labels, model, 3 more }`
494
495 A LabelModelGrader object which uses a model to assign labels to each item
496 in the evaluation.
497
498 - `input: array of object { content, role, type }`
499
500 - `content: string or ResponseInputText or object { text, type } or 3 more`
501
502 Inputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
503
504 - `Text input: string`
505
506 A text input to the model.
507
508 - `response_input_text: object { text, type }`
509
510 A text input to the model.
511
512 - `Output text: object { text, type }`
513
514 A text output from the model.
515
516 - `text: string`
517
518 The text output from the model.
519
520 - `type: "output_text"`
521
522 The type of the output text. Always `output_text`.
523
524 - `Input image: object { image_url, type, detail }`
525
526 An image input block used within EvalItem content arrays.
527
528 - `image_url: string`
529
530 The URL of the image input.
531
532 - `type: "input_image"`
533
534 The type of the image input. Always `input_image`.
535
536 - `detail: optional string`
537
538 The detail level of the image to be sent to the model. One of `high`, `low`, or `auto`. Defaults to `auto`.
539
540 - `response_input_audio: object { input_audio, type }`
541
542 An audio input to the model.
543
544 - `grader_inputs: array of string or ResponseInputText or object { text, type } or 2 more`
545
546 A list of inputs, each of which may be either an input text, output text, input
547 image, or input audio object.
548
549 - `role: "user" or "assistant" or "system" or "developer"`
550
551 The role of the message input. One of `user`, `assistant`, `system`, or
552 `developer`.
553
554 - `"user"`
555
556 - `"assistant"`
557
558 - `"system"`
559
560 - `"developer"`
561
562 - `type: optional "message"`
563
564 The type of the message input. Always `message`.
565
566 - `"message"`
567
568 - `labels: array of string`
569
570 The labels to assign to each item in the evaluation.
571
572 - `model: string`
573
574 The model to use for the evaluation. Must support structured outputs.
575
576 - `name: string`
577
578 The name of the grader.
579
580 - `passing_labels: array of string`
581
582 The labels that indicate a passing result. Must be a subset of labels.
583
584 - `type: "label_model"`
585
586 The object type, which is always `label_model`.
587
588 - `name: string`
589
590 The name of the grader.
591
592 - `type: "multi"`
593
594 The object type, which is always `multi`.
595
596 - `hyperparameters: optional object { batch_size, compute_multiplier, eval_interval, 4 more }`
597
598 The hyperparameters used for the reinforcement fine-tuning job.
599
600 - `batch_size: optional "auto" or number`
601
602 Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
603
604 - `union_member_0: "auto"`
605
606 - `union_member_1: number`
607
608 - `compute_multiplier: optional "auto" or number`
609
 Multiplier on the amount of compute used for exploring the search space during training.
611
612 - `union_member_0: "auto"`
613
614 - `union_member_1: number`
615
616 - `eval_interval: optional "auto" or number`
617
618 The number of training steps between evaluation runs.
619
620 - `union_member_0: "auto"`
621
622 - `union_member_1: number`
623
624 - `eval_samples: optional "auto" or number`
625
626 Number of evaluation samples to generate per training step.
627
628 - `union_member_0: "auto"`
629
630 - `union_member_1: number`
631
632 - `learning_rate_multiplier: optional "auto" or number`
633
634 Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
635
636 - `union_member_0: "auto"`
637
638 - `union_member_1: number`
639
640 - `n_epochs: optional "auto" or number`
641
642 The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
643
644 - `union_member_0: "auto"`
645
646 - `union_member_1: number`
647
648 - `reasoning_effort: optional "default" or "low" or "medium" or "high"`
649
650 Level of reasoning effort.
651
652 - `"default"`
653
654 - `"low"`
655
656 - `"medium"`
657
658 - `"high"`
659
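To make the `reinforcement_method` shape more concrete, the sketch below assembles one with a `string_check` grader and optional hyperparameters as plain dictionaries. The builder functions are hypothetical helpers. The template placeholders in `input` and `reference` are also illustrative assumptions: this reference only says that template strings are allowed, not what their syntax or variable names are.

```python
import json
from typing import Optional

# Operations listed for string_check_grader.operation above.
STRING_CHECK_OPERATIONS = ("eq", "ne", "like", "ilike")


def build_string_check_grader(name: str, input_text: str, reference: str, operation: str) -> dict:
    """Build a string_check grader dict with all five fields from the listing."""
    if operation not in STRING_CHECK_OPERATIONS:
        raise ValueError(f"operation must be one of {STRING_CHECK_OPERATIONS}")
    return {
        "type": "string_check",  # always "string_check" for this grader
        "name": name,
        "input": input_text,
        "reference": reference,
        "operation": operation,
    }


def build_reinforcement_method(grader: dict, hyperparameters: Optional[dict] = None) -> dict:
    """grader is listed without the optional marker; hyperparameters is optional."""
    method = {"grader": grader}
    if hyperparameters:
        method["hyperparameters"] = hyperparameters
    return method


grader = build_string_check_grader(
    name="exact_answer",
    input_text="{{sample.output_text}}",  # hypothetical template placeholder
    reference="{{item.answer}}",          # hypothetical template placeholder
    operation="eq",
)
method = build_reinforcement_method(
    grader, {"n_epochs": "auto", "reasoning_effort": "medium"}
)
print(json.dumps(method, indent=2))
```

The other grader types (`text_similarity`, `python`, `score_model`, `multi`) would slot into the same `grader` key with their own fields as listed above.
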
### Supervised Hyperparameters

- `supervised_hyperparameters: object { batch_size, learning_rate_multiplier, n_epochs }`

  The hyperparameters used for the fine-tuning job.

  - `batch_size: optional "auto" or number`

    Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `learning_rate_multiplier: optional "auto" or number`

    Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.

    - `union_member_0: "auto"`

    - `union_member_1: number`

  - `n_epochs: optional "auto" or number`

    The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.

    - `union_member_0: "auto"`

    - `union_member_1: number`

### Supervised Method

- `supervised_method: object { hyperparameters }`

  Configuration for the supervised fine-tuning method.

  - `hyperparameters: optional object { batch_size, learning_rate_multiplier, n_epochs }`

    The hyperparameters used for the fine-tuning job.

    - `batch_size: optional "auto" or number`

      Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.

      - `union_member_0: "auto"`

      - `union_member_1: number`

    - `learning_rate_multiplier: optional "auto" or number`

      Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.

      - `union_member_0: "auto"`

      - `union_member_1: number`

    - `n_epochs: optional "auto" or number`

      The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.

      - `union_member_0: "auto"`

      - `union_member_1: number`
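
The relationship between `batch_size` and `n_epochs` can be made concrete with a little arithmetic. The sketch below assumes numeric values for both fields and assumes that a final partial batch still counts as one update step. The service's actual batching, and whatever `"auto"` resolves to, are not specified in this reference. The function name `optimizer_steps` is hypothetical.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, n_epochs: int) -> int:
    """Estimate parameter updates: one per batch, repeated for every epoch."""
    if min(num_examples, batch_size, n_epochs) < 1:
        raise ValueError("all arguments must be positive integers")
    # Assumption: a trailing partial batch still triggers an update.
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * n_epochs


supervised_method = {
    "hyperparameters": {
        "batch_size": 8,
        "learning_rate_multiplier": "auto",
        "n_epochs": 3,
    }
}
hp = supervised_method["hyperparameters"]
print(optimizer_steps(1000, hp["batch_size"], hp["n_epochs"]))  # 375
```

Doubling `batch_size` to 16 in this example would roughly halve the number of updates, which is the "updated less frequently" trade-off the `batch_size` description refers to.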