Methods
Domain Types
Dpo Hyperparameters
-
dpo_hyperparameters: object { batch_size, beta, learning_rate_multiplier, n_epochs }The hyperparameters used for the DPO fine-tuning job.
-
batch_size: optional "auto" or numberNumber of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
union_member_0: "auto" -
union_member_1: number
-
-
beta: optional "auto" or numberThe beta value for the DPO method. A higher beta value will increase the weight of the penalty between the policy and reference model.
-
union_member_0: "auto" -
union_member_1: number
-
-
learning_rate_multiplier: optional "auto" or numberScaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
union_member_0: "auto" -
union_member_1: number
-
-
n_epochs: optional "auto" or numberThe number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
-
union_member_0: "auto" -
union_member_1: number
-
-
Dpo Method
-
dpo_method: object { hyperparameters }Configuration for the DPO fine-tuning method.
-
hyperparameters: optional object { batch_size, beta, learning_rate_multiplier, n_epochs }The hyperparameters used for the DPO fine-tuning job.
-
batch_size: optional "auto" or numberNumber of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
union_member_0: "auto" -
union_member_1: number
-
-
beta: optional "auto" or numberThe beta value for the DPO method. A higher beta value will increase the weight of the penalty between the policy and reference model.
-
union_member_0: "auto" -
union_member_1: number
-
-
learning_rate_multiplier: optional "auto" or numberScaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
union_member_0: "auto" -
union_member_1: number
-
-
n_epochs: optional "auto" or numberThe number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
-
union_member_0: "auto" -
union_member_1: number
-
-
-
Reinforcement Hyperparameters
-
reinforcement_hyperparameters: object { batch_size, compute_multiplier, eval_interval, 4 more }The hyperparameters used for the reinforcement fine-tuning job.
-
batch_size: optional "auto" or numberNumber of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
union_member_0: "auto" -
union_member_1: number
-
-
compute_multiplier: optional "auto" or numberMultiplier on amount of compute used for exploring search space during training.
-
union_member_0: "auto" -
union_member_1: number
-
-
eval_interval: optional "auto" or numberThe number of training steps between evaluation runs.
-
union_member_0: "auto" -
union_member_1: number
-
-
eval_samples: optional "auto" or numberNumber of evaluation samples to generate per training step.
-
union_member_0: "auto" -
union_member_1: number
-
-
learning_rate_multiplier: optional "auto" or numberScaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
union_member_0: "auto" -
union_member_1: number
-
-
n_epochs: optional "auto" or numberThe number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
-
union_member_0: "auto" -
union_member_1: number
-
-
reasoning_effort: optional "default" or "low" or "medium" or "high"Level of reasoning effort.
-
"default" -
"low" -
"medium" -
"high"
-
-
Reinforcement Method
-
reinforcement_method: object { grader, hyperparameters }Configuration for the reinforcement fine-tuning method.
-
grader: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 moreThe grader used for the fine-tuning job.
-
string_check_grader: object { input, name, operation, 2 more }A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
input: stringThe input text. This may include template strings.
-
name: stringThe name of the grader.
-
operation: "eq" or "ne" or "like" or "ilike"The string check operation to perform. One of
eq,ne,like, orilike.-
"eq" -
"ne" -
"like" -
"ilike"
-
-
reference: stringThe reference text. This may include template strings.
-
type: "string_check"The object type, which is always
string_check.
-
-
text_similarity_grader: object { evaluation_metric, input, name, 2 more }A TextSimilarityGrader object which grades text based on similarity metrics.
-
evaluation_metric: "cosine" or "fuzzy_match" or "bleu" or 8 moreThe evaluation metric to use. One of
cosine,fuzzy_match,bleu,gleu,meteor,rouge_1,rouge_2,rouge_3,rouge_4,rouge_5, orrouge_l.-
"cosine" -
"fuzzy_match" -
"bleu" -
"gleu" -
"meteor" -
"rouge_1" -
"rouge_2" -
"rouge_3" -
"rouge_4" -
"rouge_5" -
"rouge_l"
-
-
input: stringThe text being graded.
-
name: stringThe name of the grader.
-
reference: stringThe text being graded against.
-
type: "text_similarity"The type of grader.
-
-
python_grader: object { name, source, type, image_tag }A PythonGrader object that runs a python script on the input.
-
name: stringThe name of the grader.
-
source: stringThe source code of the python script.
-
type: "python"The object type, which is always
python. -
image_tag: optional stringThe image tag to use for the python script.
-
-
score_model_grader: object { input, model, name, 3 more }A ScoreModelGrader object that uses a model to assign a score to the input.
-
input: array of object { content, role, type }The input messages evaluated by the grader. Supports text, output text, input image, and input audio content blocks, and may include template strings.
-
content: string or ResponseInputText or object { text, type } or 3 moreInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
Text input: stringA text input to the model.
-
response_input_text: object { text, type }A text input to the model.
-
text: stringThe text input to the model.
-
type: "input_text"The type of the input item. Always
input_text.
-
-
Output text: object { text, type }A text output from the model.
-
text: stringThe text output from the model.
-
type: "output_text"The type of the output text. Always
output_text.
-
-
Input image: object { image_url, type, detail }An image input block used within EvalItem content arrays.
-
image_url: stringThe URL of the image input.
-
type: "input_image"The type of the image input. Always
input_image. -
detail: optional stringThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
response_input_audio: object { input_audio, type }An audio input to the model.
-
input_audio: object { data, format }-
data: stringBase64-encoded audio data.
-
format: "mp3" or "wav"The format of the audio data. Currently supported formats are
mp3andwav.-
"mp3" -
"wav"
-
-
-
type: "input_audio"The type of the input item. Always
input_audio.
-
-
grader_inputs: array of string or ResponseInputText or object { text, type } or 2 moreA list of inputs, each of which may be either an input text, output text, input image, or input audio object.
-
Text input: stringA text input to the model.
-
response_input_text: object { text, type }A text input to the model.
-
Output text: object { text, type }A text output from the model.
-
text: stringThe text output from the model.
-
type: "output_text"The type of the output text. Always
output_text.
-
-
Input image: object { image_url, type, detail }An image input block used within EvalItem content arrays.
-
image_url: stringThe URL of the image input.
-
type: "input_image"The type of the image input. Always
input_image. -
detail: optional stringThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
response_input_audio: object { input_audio, type }An audio input to the model.
-
-
-
role: "user" or "assistant" or "system" or "developer"The role of the message input. One of
user,assistant,system, ordeveloper.-
"user" -
"assistant" -
"system" -
"developer"
-
-
type: optional "message"The type of the message input. Always
message."message"
-
-
model: stringThe model to use for the evaluation.
-
name: stringThe name of the grader.
-
type: "score_model"The object type, which is always
score_model. -
range: optional array of numberThe range of the score. Defaults to
[0, 1]. -
sampling_params: optional object { max_completions_tokens, reasoning_effort, seed, 2 more }The sampling parameters for the model.
-
max_completions_tokens: optional numberThe maximum number of tokens the grader model may generate in its response.
-
reasoning_effort: optional "none" or "minimal" or "low" or 3 moreConstrains effort on reasoning for reasoning models. Currently supported values are
none,minimal,low,medium,high, andxhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.-
gpt-5.1defaults tonone, which does not perform reasoning. The supported reasoning values forgpt-5.1arenone,low,medium, andhigh. Tool calls are supported for all reasoning values in gpt-5.1. -
All models before
gpt-5.1default tomediumreasoning effort, and do not supportnone. -
The
gpt-5-promodel defaults to (and only supports)highreasoning effort. -
xhighis supported for all models aftergpt-5.1-codex-max. -
"none" -
"minimal" -
"low" -
"medium" -
"high" -
"xhigh"
-
-
seed: optional numberA seed value to initialize the randomness, during sampling.
-
temperature: optional numberA higher temperature increases randomness in the outputs.
-
top_p: optional numberAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
multi_grader: object { calculate_output, graders, name, type }A MultiGrader object combines the output of multiple graders to produce a single score.
-
calculate_output: stringA formula to calculate the output based on grader results.
-
graders: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 moreA StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
string_check_grader: object { input, name, operation, 2 more }A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
text_similarity_grader: object { evaluation_metric, input, name, 2 more }A TextSimilarityGrader object which grades text based on similarity metrics.
-
python_grader: object { name, source, type, image_tag }A PythonGrader object that runs a python script on the input.
-
score_model_grader: object { input, model, name, 3 more }A ScoreModelGrader object that uses a model to assign a score to the input.
-
label_model_grader: object { input, labels, model, 3 more }A LabelModelGrader object which uses a model to assign labels to each item in the evaluation.
-
input: array of object { content, role, type }-
content: string or ResponseInputText or object { text, type } or 3 moreInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
Text input: stringA text input to the model.
-
response_input_text: object { text, type }A text input to the model.
-
Output text: object { text, type }A text output from the model.
-
text: stringThe text output from the model.
-
type: "output_text"The type of the output text. Always
output_text.
-
-
Input image: object { image_url, type, detail }An image input block used within EvalItem content arrays.
-
image_url: stringThe URL of the image input.
-
type: "input_image"The type of the image input. Always
input_image. -
detail: optional stringThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
response_input_audio: object { input_audio, type }An audio input to the model.
-
grader_inputs: array of string or ResponseInputText or object { text, type } or 2 moreA list of inputs, each of which may be either an input text, output text, input image, or input audio object.
-
-
role: "user" or "assistant" or "system" or "developer"The role of the message input. One of
user,assistant,system, ordeveloper.-
"user" -
"assistant" -
"system" -
"developer"
-
-
type: optional "message"The type of the message input. Always
message."message"
-
-
labels: array of stringThe labels to assign to each item in the evaluation.
-
model: stringThe model to use for the evaluation. Must support structured outputs.
-
name: stringThe name of the grader.
-
passing_labels: array of stringThe labels that indicate a passing result. Must be a subset of labels.
-
type: "label_model"The object type, which is always
label_model.
-
-
-
name: stringThe name of the grader.
-
type: "multi"The object type, which is always
multi.
-
-
-
hyperparameters: optional object { batch_size, compute_multiplier, eval_interval, 4 more }The hyperparameters used for the reinforcement fine-tuning job.
-
batch_size: optional "auto" or numberNumber of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
union_member_0: "auto" -
union_member_1: number
-
-
compute_multiplier: optional "auto" or numberMultiplier on amount of compute used for exploring search space during training.
-
union_member_0: "auto" -
union_member_1: number
-
-
eval_interval: optional "auto" or numberThe number of training steps between evaluation runs.
-
union_member_0: "auto" -
union_member_1: number
-
-
eval_samples: optional "auto" or numberNumber of evaluation samples to generate per training step.
-
union_member_0: "auto" -
union_member_1: number
-
-
learning_rate_multiplier: optional "auto" or numberScaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
union_member_0: "auto" -
union_member_1: number
-
-
n_epochs: optional "auto" or numberThe number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
-
union_member_0: "auto" -
union_member_1: number
-
-
reasoning_effort: optional "default" or "low" or "medium" or "high"Level of reasoning effort.
-
"default" -
"low" -
"medium" -
"high"
-
-
-
Supervised Hyperparameters
-
supervised_hyperparameters: object { batch_size, learning_rate_multiplier, n_epochs }The hyperparameters used for the fine-tuning job.
-
batch_size: optional "auto" or numberNumber of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
union_member_0: "auto" -
union_member_1: number
-
-
learning_rate_multiplier: optional "auto" or numberScaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
union_member_0: "auto" -
union_member_1: number
-
-
n_epochs: optional "auto" or numberThe number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
-
union_member_0: "auto" -
union_member_1: number
-
-
Supervised Method
-
supervised_method: object { hyperparameters }Configuration for the supervised fine-tuning method.
-
hyperparameters: optional object { batch_size, learning_rate_multiplier, n_epochs }The hyperparameters used for the fine-tuning job.
-
batch_size: optional "auto" or numberNumber of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
-
union_member_0: "auto" -
union_member_1: number
-
-
learning_rate_multiplier: optional "auto" or numberScaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
-
union_member_0: "auto" -
union_member_1: number
-
-
n_epochs: optional "auto" or numberThe number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
-
union_member_0: "auto" -
union_member_1: number
-
-
-