Evals
List evals
EvalListPage evals().list(EvalListParams params = EvalListParams.none(), RequestOptions requestOptions = RequestOptions.none())
get /evals
List evaluations for a project.
Parameters
-
EvalListParams params
-
Optional<String> after: Identifier for the last eval from the previous pagination request.
-
Optional<Long> limit: Number of evals to retrieve.
-
Optional<Order> order: Sort order for evals by timestamp. Use asc for ascending order or desc for descending order.
-
ASC("asc")
-
DESC("desc")
-
-
Optional<OrderBy> orderBy: Evals can be ordered by creation time or last updated time. Use created_at for creation time or updated_at for last updated time.
-
CREATED_AT("created_at")
-
UPDATED_AT("updated_at")
-
-
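As a rough illustration of how the list parameters above could map onto the GET /evals request, the sketch below builds a request path from them. It does not use the SDK; the class name EvalListQuery is hypothetical, and the query names after, limit, order, and order_by are assumptions (only the enum values asc, desc, created_at, and updated_at come from the reference above).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;

final class EvalListQuery {
    private EvalListQuery() {}

    // Values taken from the Order and OrderBy enums documented above.
    private static final Set<String> ORDERS = Set.of("asc", "desc");
    private static final Set<String> ORDER_BYS = Set.of("created_at", "updated_at");

    // Builds the request path for GET /evals. A null argument means "omit this parameter".
    // Assumption: the query names are "after", "limit", "order" and "order_by".
    static String build(String after, Long limit, String order, String orderBy) {
        if (order != null && !ORDERS.contains(order)) {
            throw new IllegalArgumentException("order must be asc or desc: " + order);
        }
        if (orderBy != null && !ORDER_BYS.contains(orderBy)) {
            throw new IllegalArgumentException("orderBy must be created_at or updated_at: " + orderBy);
        }

        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> query = new LinkedHashMap<>();
        if (after != null) query.put("after", after);
        if (limit != null) query.put("limit", Long.toString(limit));
        if (order != null) query.put("order", order);
        if (orderBy != null) query.put("order_by", orderBy);

        if (query.isEmpty()) {
            return "/evals";
        }
        StringJoiner joined = new StringJoiner("&", "/evals?", "");
        for (Map.Entry<String, String> entry : query.entrySet()) {
            joined.add(entry.getKey() + "=" + URLEncoder.encode(entry.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }
}
```

In practice the SDK's EvalListParams builder takes care of this; the sketch is only meant to make the role of each optional parameter concrete.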
Returns
-
class EvalListResponse: An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration, such as:
-
Improve the quality of my chatbot
-
See how well my chatbot handles customer support
-
Check if o4-mini is better at my use case than gpt-4o
-
String id: Unique identifier for the evaluation.
-
long createdAt: The Unix timestamp (in seconds) for when the eval was created.
-
DataSourceConfig dataSourceConfig: Configuration of data sources used in runs of the evaluation.
-
class EvalCustomDataSourceConfig: A CustomDataSourceConfig which specifies the schema of your item and, optionally, sample namespaces. The response schema defines the shape of the data that will be:
-
Used to define your testing criteria and
-
Required when creating a run
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "custom" (constant): The type of data source. Always custom.
CUSTOM("custom")
-
-
class Logs: A LogsDataSourceConfig which specifies the metadata property of your logs query. This is usually metadata like usecase=chatbot or prompt-version=v2, etc. The schema returned by this data source config is used to define what variables are available in your evals. item and sample are both defined when using this data source config.
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "logs" (constant): The type of data source. Always logs.
LOGS("logs")
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
class EvalStoredCompletionsDataSourceConfig: Deprecated in favor of LogsDataSourceConfig.
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "stored_completions" (constant): The type of data source. Always stored_completions.
STORED_COMPLETIONS("stored_completions")
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
String name: The name of the evaluation.
-
JsonValue object_ "eval" (constant): The object type.
EVAL("eval")
-
List<TestingCriterion> testingCriteria: A list of testing criteria.
-
class LabelModelGrader: A LabelModelGrader object which uses a model to assign labels to each item in the evaluation.
-
List<Input> input
-
Content content: Inputs to the model. Can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String
-
class ResponseInputText: A text input to the model.
-
String text: The text input to the model.
-
JsonValue type "input_text" (constant): The type of the input item. Always input_text.
INPUT_TEXT("input_text")
-
-
class OutputText: A text output from the model.
-
String text: The text output from the model.
-
JsonValue type "output_text" (constant): The type of the output text. Always output_text.
OUTPUT_TEXT("output_text")
-
-
class InputImage: An image input block used within EvalItem content arrays.
-
String imageUrl: The URL of the image input.
-
JsonValue type "input_image" (constant): The type of the image input. Always input_image.
INPUT_IMAGE("input_image")
-
Optional<String> detail: The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio: An audio input to the model.
-
InputAudio inputAudio
-
String data: Base64-encoded audio data.
-
Format format: The format of the audio data. Currently supported formats are mp3 and wav.
-
MP3("mp3")
-
WAV("wav")
-
-
-
JsonValue type "input_audio" (constant): The type of the input item. Always input_audio.
INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>
-
String
-
class ResponseInputText: A text input to the model.
-
OutputText
-
String text: The text output from the model.
-
JsonValue type "output_text" (constant): The type of the output text. Always output_text.
OUTPUT_TEXT("output_text")
-
-
InputImage
-
String imageUrl: The URL of the image input.
-
JsonValue type "input_image" (constant): The type of the image input. Always input_image.
INPUT_IMAGE("input_image")
-
Optional<String> detail: The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio: An audio input to the model.
-
-
-
Role role: The role of the message input. One of user, assistant, system, or developer.
-
USER("user")
-
ASSISTANT("assistant")
-
SYSTEM("system")
-
DEVELOPER("developer")
-
-
Optional<Type> type: The type of the message input. Always message.
MESSAGE("message")
-
-
List<String> labels: The labels to assign to each item in the evaluation.
-
String model: The model to use for the evaluation. Must support structured outputs.
-
String name: The name of the grader.
-
List<String> passingLabels: The labels that indicate a passing result. Must be a subset of labels.
-
JsonValue type "label_model" (constant): The object type, which is always label_model.
LABEL_MODEL("label_model")
-
-
class StringCheckGrader: A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
String input: The input text. This may include template strings.
-
String name: The name of the grader.
-
Operation operation: The string check operation to perform. One of eq, ne, like, or ilike.
-
EQ("eq")
-
NE("ne")
-
LIKE("like")
-
ILIKE("ilike")
-
-
String reference: The reference text. This may include template strings.
-
JsonValue type "string_check" (constant): The object type, which is always string_check.
STRING_CHECK("string_check")
-
-
class EvalGraderTextSimilarity: A TextSimilarityGrader object which grades text based on similarity metrics.
-
double passThreshold: The threshold for the score.
-
-
class EvalGraderPython: A PythonGrader object that runs a Python script on the input.
-
Optional<Double> passThreshold: The threshold for the score.
-
-
class EvalGraderScoreModel: A ScoreModelGrader object that uses a model to assign a score to the input.
-
Optional<Double> passThreshold: The threshold for the score.
-
-
-
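Since createdAt is documented above as a Unix timestamp in seconds, a small conversion helper may be useful when displaying evals. The following is a minimal sketch using only java.time; the class name EvalTimestamps is hypothetical and not part of the SDK.

```java
import java.time.Instant;

final class EvalTimestamps {
    private EvalTimestamps() {}

    // createdAt is expressed in seconds, so ofEpochSecond (not ofEpochMilli) is the right factory.
    static Instant toInstant(long createdAtSeconds) {
        return Instant.ofEpochSecond(createdAtSeconds);
    }

    public static void main(String[] args) {
        // 0 is the placeholder value used in the example response for this endpoint.
        System.out.println(toInstant(0L)); // prints 1970-01-01T00:00:00Z
    }
}
```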
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.EvalListPage;
import com.openai.models.evals.EvalListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        // Configure the client from environment variables.
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        // Fetch the first page of evals using the default list parameters.
        EvalListPage page = client.evals().list();
    }
}
Response
{
  "data": [
    {
      "id": "id",
      "created_at": 0,
      "data_source_config": {
        "schema": {
          "foo": "bar"
        },
        "type": "custom"
      },
      "metadata": {
        "foo": "string"
      },
      "name": "Chatbot effectiveness Evaluation",
      "object": "eval",
      "testing_criteria": [
        {
          "input": [
            {
              "content": "string",
              "role": "user",
              "type": "message"
            }
          ],
          "labels": [
            "string"
          ],
          "model": "model",
          "name": "name",
          "passing_labels": [
            "string"
          ],
          "type": "label_model"
        }
      ]
    }
  ],
  "first_id": "first_id",
  "has_more": true,
  "last_id": "last_id",
  "object": "list"
}
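The after parameter together with the has_more and last_id fields in the response above suggests cursor-style pagination. The sketch below shows one way such a loop could look, under stated assumptions: it does not call the SDK, the Page class and fakeFetch method are hypothetical in-memory stand-ins for the list call, and passing last_id as the next after value is an inference from the field descriptions rather than something this reference states outright.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class EvalPager {
    private EvalPager() {}

    // Simplified view of the list envelope: eval ids from "data", plus "has_more" and "last_id".
    static final class Page {
        final List<String> ids;
        final boolean hasMore;
        final String lastId;

        Page(List<String> ids, boolean hasMore, String lastId) {
            this.ids = ids;
            this.hasMore = hasMore;
            this.lastId = lastId;
        }
    }

    // Calls fetchPage with the current cursor (null for the first page)
    // until has_more is false, and returns every id in the order received.
    static List<String> collectAll(Function<String, Page> fetchPage) {
        List<String> all = new ArrayList<>();
        String after = null;
        while (true) {
            Page page = fetchPage.apply(after);
            all.addAll(page.ids);
            if (!page.hasMore || page.lastId == null) {
                return all;
            }
            after = page.lastId;
        }
    }

    // Hypothetical in-memory stand-in for GET /evals, for illustration only.
    static Page fakeFetch(List<String> store, String after, int limit) {
        int start = after == null ? 0 : store.indexOf(after) + 1;
        int end = Math.min(start + limit, store.size());
        List<String> ids = new ArrayList<>(store.subList(start, end));
        String lastId = ids.isEmpty() ? null : ids.get(ids.size() - 1);
        return new Page(ids, end < store.size(), lastId);
    }
}
```

A caller would replace fakeFetch with a function that issues the real list request and maps the response into a Page.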
Create eval
EvalCreateResponse evals().create(EvalCreateParams params, RequestOptions requestOptions = RequestOptions.none())
post /evals
Create the structure of an evaluation that can be used to test a model's performance. An evaluation is a set of testing criteria and the config for a data source, which dictates the schema of the data used in the evaluation. After creating an evaluation, you can run it on different models and model parameters. Several types of graders and data sources are supported. For more information, see the Evals guide.
Parameters
-
EvalCreateParams params
-
DataSourceConfig dataSourceConfig: The configuration for the data source used for the evaluation runs. Dictates the schema of the data used in the evaluation.
-
class Custom: A CustomDataSourceConfig object that defines the schema for the data source used for the evaluation runs. This schema is used to define the shape of the data that will be:
-
Used to define your testing criteria and
-
Required when creating a run
-
ItemSchema itemSchema: The JSON schema for each row in the data source.
-
JsonValue type "custom" (constant): The type of data source. Always custom.
CUSTOM("custom")
-
Optional<Boolean> includeSampleSchema: Whether the eval should expect you to populate the sample namespace (i.e., by generating responses from your data source).
-
-
class Logs: A data source config which specifies the metadata property of your logs query. This is usually metadata like usecase=chatbot or prompt-version=v2, etc.
-
JsonValue type "logs" (constant): The type of data source. Always logs.
LOGS("logs")
-
Optional<Metadata> metadata: Metadata filters for the logs data source.
-
-
class StoredCompletions: Deprecated in favor of LogsDataSourceConfig.
-
JsonValue type "stored_completions" (constant): The type of data source. Always stored_completions.
STORED_COMPLETIONS("stored_completions")
-
Optional<Metadata> metadata: Metadata filters for the stored completions data source.
-
-
-
List<TestingCriterion> testingCriteria: A list of graders for all eval runs in this group. Graders can reference variables in the data source using double curly braces notation, like {{item.variable_name}}. To reference the model's output, use the sample namespace (i.e., {{sample.output_text}}).
-
class LabelModel: A LabelModelGrader object which uses a model to assign labels to each item in the evaluation.
-
List<Input> input: A list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g. {{item.name}}.
-
class SimpleInputMessage:
-
String content: The content of the message.
-
String role: The role of the message (e.g. "system", "assistant", "user").
-
-
class EvalItem: A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content content: Inputs to the model. Can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String
-
class ResponseInputText: A text input to the model.
-
String text: The text input to the model.
-
JsonValue type "input_text" (constant): The type of the input item. Always input_text.
INPUT_TEXT("input_text")
-
-
class OutputText: A text output from the model.
-
String text: The text output from the model.
-
JsonValue type "output_text" (constant): The type of the output text. Always output_text.
OUTPUT_TEXT("output_text")
-
-
class InputImage: An image input block used within EvalItem content arrays.
-
String imageUrl: The URL of the image input.
-
JsonValue type "input_image" (constant): The type of the image input. Always input_image.
INPUT_IMAGE("input_image")
-
Optional<String> detail: The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio: An audio input to the model.
-
InputAudio inputAudio
-
String data: Base64-encoded audio data.
-
Format format: The format of the audio data. Currently supported formats are mp3 and wav.
-
MP3("mp3")
-
WAV("wav")
-
-
-
JsonValue type "input_audio" (constant): The type of the input item. Always input_audio.
INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>
-
String
-
class ResponseInputText: A text input to the model.
-
OutputText
-
String text: The text output from the model.
-
JsonValue type "output_text" (constant): The type of the output text. Always output_text.
OUTPUT_TEXT("output_text")
-
-
InputImage
-
String imageUrl: The URL of the image input.
-
JsonValue type "input_image" (constant): The type of the image input. Always input_image.
INPUT_IMAGE("input_image")
-
Optional<String> detail: The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio: An audio input to the model.
-
-
-
Role role: The role of the message input. One of user, assistant, system, or developer.
-
USER("user")
-
ASSISTANT("assistant")
-
SYSTEM("system")
-
DEVELOPER("developer")
-
-
Optional<Type> type: The type of the message input. Always message.
MESSAGE("message")
-
-
-
List<String> labels: The labels to assign to each item in the evaluation.
-
String model: The model to use for the evaluation. Must support structured outputs.
-
String name: The name of the grader.
-
List<String> passingLabels: The labels that indicate a passing result. Must be a subset of labels.
-
JsonValue type "label_model" (constant): The object type, which is always label_model.
LABEL_MODEL("label_model")
-
-
class StringCheckGrader: A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
String input: The input text. This may include template strings.
-
String name: The name of the grader.
-
Operation operation: The string check operation to perform. One of eq, ne, like, or ilike.
-
EQ("eq")
-
NE("ne")
-
LIKE("like")
-
ILIKE("ilike")
-
-
String reference: The reference text. This may include template strings.
-
JsonValue type "string_check" (constant): The object type, which is always string_check.
STRING_CHECK("string_check")
-
-
class TextSimilarity: A TextSimilarityGrader object which grades text based on similarity metrics.
-
double passThreshold: The threshold for the score.
-
-
class Python: A PythonGrader object that runs a Python script on the input.
-
Optional<Double> passThreshold: The threshold for the score.
-
-
class ScoreModel: A ScoreModelGrader object that uses a model to assign a score to the input.
-
Optional<Double> passThreshold: The threshold for the score.
-
-
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> name: The name of the evaluation.
-
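The metadata limits stated above (16 key-value pairs, keys of at most 64 characters, values of at most 512 characters) can be checked on the client before sending a request. The following is a minimal sketch, not SDK code: the class name MetadataLimits is hypothetical, "16" is read here as an upper bound, and String.length() counts UTF-16 code units, which may differ from how the service counts characters.

```java
import java.util.Map;

final class MetadataLimits {
    private MetadataLimits() {}

    static final int MAX_PAIRS = 16;          // assumption: 16 is a maximum, not an exact count
    static final int MAX_KEY_LENGTH = 64;
    static final int MAX_VALUE_LENGTH = 512;

    // Returns true when the map stays within the documented limits.
    static boolean isValid(Map<String, String> metadata) {
        if (metadata.size() > MAX_PAIRS) {
            return false;
        }
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            if (entry.getKey().length() > MAX_KEY_LENGTH) {
                return false;
            }
            if (entry.getValue().length() > MAX_VALUE_LENGTH) {
                return false;
            }
        }
        return true;
    }
}
```

For example, MetadataLimits.isValid(Map.of("usecase", "chatbot")) passes, while a map containing a 65-character key does not.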
Returns
-
class EvalCreateResponse: An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration, such as:
-
Improve the quality of my chatbot
-
See how well my chatbot handles customer support
-
Check if o4-mini is better at my use case than gpt-4o
-
String id: Unique identifier for the evaluation.
-
long createdAt: The Unix timestamp (in seconds) for when the eval was created.
-
DataSourceConfig dataSourceConfig: Configuration of data sources used in runs of the evaluation.
-
class EvalCustomDataSourceConfig: A CustomDataSourceConfig which specifies the schema of your item and, optionally, sample namespaces. The response schema defines the shape of the data that will be:
-
Used to define your testing criteria and
-
Required when creating a run
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "custom" (constant): The type of data source. Always custom.
CUSTOM("custom")
-
-
class Logs: A LogsDataSourceConfig which specifies the metadata property of your logs query. This is usually metadata like usecase=chatbot or prompt-version=v2, etc. The schema returned by this data source config is used to define what variables are available in your evals. item and sample are both defined when using this data source config.
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "logs" (constant): The type of data source. Always logs.
LOGS("logs")
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
class EvalStoredCompletionsDataSourceConfig: Deprecated in favor of LogsDataSourceConfig.
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "stored_completions" (constant): The type of data source. Always stored_completions.
STORED_COMPLETIONS("stored_completions")
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
String name: The name of the evaluation.
-
JsonValue object_ "eval" (constant): The object type.
EVAL("eval")
-
List<TestingCriterion> testingCriteria: A list of testing criteria.
-
class LabelModelGrader:A LabelModelGrader object which uses a model to assign labels to each item in the evaluation.
-
List<Input> input-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are
mp3andwav.-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role role: The role of the message input. One of user, assistant, system, or developer.
-
USER("user")
-
ASSISTANT("assistant")
-
SYSTEM("system")
-
DEVELOPER("developer")
-
-
Optional<Type> type: The type of the message input. Always message.
MESSAGE("message")
-
-
List<String> labels: The labels to assign to each item in the evaluation.
-
String model: The model to use for the evaluation. Must support structured outputs.
-
String name: The name of the grader.
-
List<String> passingLabels: The labels that indicate a passing result. Must be a subset of labels.
-
JsonValue type "label_model" (constant): The object type, which is always label_model.
LABEL_MODEL("label_model")
-
-
class StringCheckGrader: A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
String input: The input text. This may include template strings.
-
String name: The name of the grader.
-
Operation operation: The string check operation to perform. One of eq, ne, like, or ilike.
-
EQ("eq")
-
NE("ne")
-
LIKE("like")
-
ILIKE("ilike")
-
-
String reference: The reference text. This may include template strings.
-
JsonValue type "string_check" (constant): The object type, which is always string_check.
STRING_CHECK("string_check")
-
-
class EvalGraderTextSimilarity:A TextSimilarityGrader object which grades text based on similarity metrics.
-
double passThresholdThe threshold for the score.
-
-
class EvalGraderPython:A PythonGrader object that runs a python script on the input.
-
Optional<Double> passThresholdThe threshold for the score.
-
-
class EvalGraderScoreModel:A ScoreModelGrader object that uses a model to assign a score to the input.
-
Optional<Double> passThresholdThe threshold for the score.
-
-
-
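The StringCheckGrader above names four operations (eq, ne, like, ilike) without spelling out their semantics. The sketch below is one plausible reading, offered as an assumption rather than the service's actual behavior: eq and ne are treated as exact, case-sensitive comparisons, like as a case-sensitive "input contains reference" test, and ilike as its case-insensitive variant. The class StringCheckSketch is hypothetical.

```java
import java.util.Locale;

final class StringCheckSketch {
    private StringCheckSketch() {}

    enum Operation { EQ, NE, LIKE, ILIKE }

    // Compares input against reference under the ASSUMED semantics described above.
    static boolean check(String input, String reference, Operation operation) {
        switch (operation) {
            case EQ:
                return input.equals(reference);
            case NE:
                return !input.equals(reference);
            case LIKE:
                // Assumption: substring match, case-sensitive.
                return input.contains(reference);
            case ILIKE:
                // Assumption: substring match, ignoring case.
                return input.toLowerCase(Locale.ROOT).contains(reference.toLowerCase(Locale.ROOT));
            default:
                throw new IllegalArgumentException("Unknown operation: " + operation);
        }
    }
}
```

Under this reading, check("The capital is Paris", "paris", Operation.LIKE) is false while the same call with Operation.ILIKE is true.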
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.JsonValue;
import com.openai.models.evals.EvalCreateParams;
import com.openai.models.evals.EvalCreateResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        // A custom data source schema plus a single label-model grader; all values are placeholders.
        EvalCreateParams params = EvalCreateParams.builder()
            .customDataSourceConfig(EvalCreateParams.DataSourceConfig.Custom.ItemSchema.builder()
                .putAdditionalProperty("foo", JsonValue.from("bar"))
                .build())
            .addTestingCriterion(EvalCreateParams.TestingCriterion.LabelModel.builder()
                .addInput(EvalCreateParams.TestingCriterion.LabelModel.Input.SimpleInputMessage.builder()
                    .content("content")
                    .role("role")
                    .build())
                .addLabel("string")
                .model("model")
                .name("name")
                .addPassingLabel("string")
                .build())
            .build();

        EvalCreateResponse eval = client.evals().create(params);
    }
}
Response
{
  "id": "id",
  "created_at": 0,
  "data_source_config": {
    "schema": {
      "foo": "bar"
    },
    "type": "custom"
  },
  "metadata": {
    "foo": "string"
  },
  "name": "Chatbot effectiveness Evaluation",
  "object": "eval",
  "testing_criteria": [
    {
      "input": [
        {
          "content": "string",
          "role": "user",
          "type": "message"
        }
      ],
      "labels": [
        "string"
      ],
      "model": "model",
      "name": "name",
      "passing_labels": [
        "string"
      ],
      "type": "label_model"
    }
  ]
}
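The testingCriteria parameter for this endpoint describes a double curly braces notation such as {{item.variable_name}} and {{sample.output_text}}. To make that notation concrete, here is a minimal local sketch of placeholder substitution. It is an illustration only: the class TemplateSketch is hypothetical, a flat map with dotted keys stands in for the item and sample namespaces, and how the service actually resolves templates is not specified in this reference.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSketch {
    private TemplateSketch() {}

    // Matches {{name}} where name consists of letters, digits, underscores and dots.
    private static final Pattern PLACEHOLDER =
        Pattern.compile("\\{\\{\\s*([A-Za-z0-9_.]+)\\s*\\}\\}");

    // Replaces each known placeholder with its value; unknown placeholders are left untouched.
    static String render(String template, Map<String, String> values) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String key = matcher.group(1);
            String replacement = values.getOrDefault(key, matcher.group());
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Leaving unknown placeholders in place is a design choice made for this sketch so that a missing variable stays visible in the rendered text.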
Get an eval
EvalRetrieveResponse evals().retrieve(EvalRetrieveParams params = EvalRetrieveParams.none(), RequestOptions requestOptions = RequestOptions.none())
get /evals/{eval_id}
Get an evaluation by ID.
Parameters
-
EvalRetrieveParams params
-
Optional<String> evalId
Returns
-
class EvalRetrieveResponse: An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration, such as:
-
Improve the quality of my chatbot
-
See how well my chatbot handles customer support
-
Check if o4-mini is better at my use case than gpt-4o
-
String id: Unique identifier for the evaluation.
-
long createdAt: The Unix timestamp (in seconds) for when the eval was created.
-
DataSourceConfig dataSourceConfig: Configuration of data sources used in runs of the evaluation.
-
class EvalCustomDataSourceConfig: A CustomDataSourceConfig which specifies the schema of your item and, optionally, sample namespaces. The response schema defines the shape of the data that will be:
-
Used to define your testing criteria and
-
Required when creating a run
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "custom" (constant): The type of data source. Always custom.
CUSTOM("custom")
-
-
class Logs: A LogsDataSourceConfig which specifies the metadata property of your logs query. This is usually metadata like usecase=chatbot or prompt-version=v2, etc. The schema returned by this data source config is used to define what variables are available in your evals. item and sample are both defined when using this data source config.
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "logs" (constant): The type of data source. Always logs.
LOGS("logs")
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
class EvalStoredCompletionsDataSourceConfig: Deprecated in favor of LogsDataSourceConfig.
-
Schema schema: The JSON schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue type "stored_completions" (constant): The type of data source. Always stored_completions.
STORED_COMPLETIONS("stored_completions")
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
-
Optional<Metadata> metadata: Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
String name: The name of the evaluation.
-
JsonValue object_ "eval" (constant): The object type.
EVAL("eval")
-
List<TestingCriterion> testingCriteria: A list of testing criteria.
-
class LabelModelGrader:A LabelModelGrader object which uses a model to assign labels to each item in the evaluation.
-
List<Input> input-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are
mp3andwav.-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role role: The role of the message input. One of user, assistant, system, or developer.
-
USER("user")
-
ASSISTANT("assistant")
-
SYSTEM("system")
-
DEVELOPER("developer")
-
-
Optional<Type> type: The type of the message input. Always message.
MESSAGE("message")
-
-
List<String> labels: The labels to assign to each item in the evaluation.
-
String model: The model to use for the evaluation. Must support structured outputs.
-
String name: The name of the grader.
-
List<String> passingLabels: The labels that indicate a passing result. Must be a subset of labels.
-
JsonValue type "label_model" (constant): The object type, which is always label_model.
LABEL_MODEL("label_model")
-
-
class StringCheckGrader: A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
String input: The input text. This may include template strings.
-
String name: The name of the grader.
-
Operation operation: The string check operation to perform. One of eq, ne, like, or ilike.
-
EQ("eq")
-
NE("ne")
-
LIKE("like")
-
ILIKE("ilike")
-
-
String reference: The reference text. This may include template strings.
-
JsonValue type "string_check" (constant): The object type, which is always string_check.
STRING_CHECK("string_check")
-
-
class EvalGraderTextSimilarity:A TextSimilarityGrader object which grades text based on similarity metrics.
-
double passThresholdThe threshold for the score.
-
-
class EvalGraderPython:A PythonGrader object that runs a python script on the input.
-
Optional<Double> passThresholdThe threshold for the score.
-
-
class EvalGraderScoreModel:A ScoreModelGrader object that uses a model to assign a score to the input.
-
Optional<Double> passThresholdThe threshold for the score.
-
-
-
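The ResponseInputAudio type above expects Base64-encoded data and a format of mp3 or wav. A minimal sketch of preparing those two values with the Java standard library follows. The class AudioInputSketch is hypothetical, and the use of the standard (not URL-safe) Base64 alphabet is an assumption, since the reference does not say which variant is expected.

```java
import java.util.Base64;
import java.util.Set;

final class AudioInputSketch {
    private AudioInputSketch() {}

    // Formats listed in the Format enum above.
    private static final Set<String> FORMATS = Set.of("mp3", "wav");

    // Null-safe membership check against the documented formats.
    static boolean isSupportedFormat(String format) {
        return format != null && FORMATS.contains(format);
    }

    // Assumption: the standard Base64 alphabet with padding is what the API expects.
    static String encode(byte[] audioBytes) {
        return Base64.getEncoder().encodeToString(audioBytes);
    }
}
```

In real use the byte array would typically come from reading an audio file, for example with java.nio.file.Files.readAllBytes.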
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.EvalRetrieveParams;
import com.openai.models.evals.EvalRetrieveResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        // "eval_id" is a placeholder for the ID of an existing eval.
        EvalRetrieveResponse eval = client.evals().retrieve("eval_id");
    }
}
Response
{
  "id": "id",
  "created_at": 0,
  "data_source_config": {
    "schema": {
      "foo": "bar"
    },
    "type": "custom"
  },
  "metadata": {
    "foo": "string"
  },
  "name": "Chatbot effectiveness Evaluation",
  "object": "eval",
  "testing_criteria": [
    {
      "input": [
        {
          "content": "string",
          "role": "user",
          "type": "message"
        }
      ],
      "labels": [
        "string"
      ],
      "model": "model",
      "name": "name",
      "passing_labels": [
        "string"
      ],
      "type": "label_model"
    }
  ]
}
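In the response above, the label_model criterion carries both labels and passing_labels, and the reference states that passing labels must be a subset of labels. The sketch below shows how that constraint and a pass decision could be expressed locally. It is an assumption-laden illustration: the class LabelGraderSketch is hypothetical, and treating a result as passing when the assigned label appears in passing_labels is inferred from the field description rather than confirmed by this reference.

```java
import java.util.HashSet;
import java.util.List;

final class LabelGraderSketch {
    private LabelGraderSketch() {}

    // Checks the documented constraint: every passing label must also appear in labels.
    static boolean isValidConfig(List<String> labels, List<String> passingLabels) {
        return new HashSet<>(labels).containsAll(passingLabels);
    }

    // Assumption: an item passes when the label the model assigned is one of the passing labels.
    static boolean passes(String assignedLabel, List<String> passingLabels) {
        return new HashSet<>(passingLabels).contains(assignedLabel);
    }
}
```

For example, labels of positive, neutral, and negative with a passing label of positive form a valid config, whereas a passing label of excellent would not.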
Update an eval
EvalUpdateResponse evals().update(EvalUpdateParams params = EvalUpdateParams.none(), RequestOptions requestOptions = RequestOptions.none())
post /evals/{eval_id}
Update certain properties of an evaluation.
Parameters
-
EvalUpdateParams params-
Optional<String> evalId -
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> nameRename the evaluation.
-
Returns
-
class EvalUpdateResponse:An Eval object with a data source config and testing criteria. An Eval represents a task to be done for your LLM integration. Like:
-
Improve the quality of my chatbot
-
See how well my chatbot handles customer support
-
Check if o4-mini is better at my usecase than gpt-4o
-
String idUnique identifier for the evaluation.
-
long createdAtThe Unix timestamp (in seconds) for when the eval was created.
-
DataSourceConfig dataSourceConfigConfiguration of data sources used in runs of the evaluation.
-
class EvalCustomDataSourceConfig:A CustomDataSourceConfig which specifies the schema of your item and optionally sample namespaces. The response schema defines the shape of the data that will be:
-
Used to define your testing criteria
-
Required when creating a run
-
Schema schemaThe json schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue; type "custom"constantThe type of data source. Always
custom.CUSTOM("custom")
-
-
class Logs:A LogsDataSourceConfig which specifies the metadata property of your logs query. This is usually metadata like usecase=chatbot or prompt-version=v2, etc. The schema returned by this data source config is used to define what variables are available in your evals. item and sample are both defined when using this data source config.
-
Schema schemaThe json schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue; type "logs"constantThe type of data source. Always
logs.LOGS("logs")
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
class EvalStoredCompletionsDataSourceConfig:Deprecated in favor of LogsDataSourceConfig.
-
Schema schemaThe json schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue; type "stored_completions"constantThe type of data source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
-
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
String nameThe name of the evaluation.
-
JsonValue; object_ "eval"constantThe object type.
EVAL("eval")
-
List<TestingCriterion> testingCriteriaA list of testing criteria.
-
class LabelModelGrader:A LabelModelGrader object which uses a model to assign labels to each item in the evaluation.
-
List<Input> input-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are mp3 and wav.
-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
List<String> labelsThe labels to assign to each item in the evaluation.
-
String modelThe model to use for the evaluation. Must support structured outputs.
-
String nameThe name of the grader.
-
List<String> passingLabelsThe labels that indicate a passing result. Must be a subset of labels.
-
JsonValue; type "label_model"constantThe object type, which is always
label_model.LABEL_MODEL("label_model")
-
-
class StringCheckGrader:A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
-
String inputThe input text. This may include template strings.
-
String nameThe name of the grader.
-
Operation operationThe string check operation to perform. One of eq, ne, like, or ilike.
-
EQ("eq") -
NE("ne") -
LIKE("like") -
ILIKE("ilike")
-
-
String referenceThe reference text. This may include template strings.
-
JsonValue; type "string_check"constantThe object type, which is always
string_check.STRING_CHECK("string_check")
-
-
class EvalGraderTextSimilarity:A TextSimilarityGrader object which grades text based on similarity metrics.
-
double passThresholdThe threshold for the score.
-
-
class EvalGraderPython:A PythonGrader object that runs a python script on the input.
-
Optional<Double> passThresholdThe threshold for the score.
-
-
class EvalGraderScoreModel:A ScoreModelGrader object that uses a model to assign a score to the input.
-
Optional<Double> passThresholdThe threshold for the score.
-
-
-
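The LabelModelGrader description above states that passingLabels must be a subset of labels. A minimal sketch of checking that rule locally before sending a request follows; LabelCheckSketch and missingLabels are hypothetical names, and the check is an illustration rather than the validation the API itself performs.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Local illustration only; not part of the OpenAI Java SDK. */
final class LabelCheckSketch {
    /** Returns the passing labels that do not appear in labels; empty when the subset rule holds. */
    static Set<String> missingLabels(List<String> labels, List<String> passingLabels) {
        Set<String> missing = new HashSet<>(passingLabels);
        missing.removeAll(new HashSet<>(labels));
        return missing;
    }
}
```

An empty result means every passing label is also a declared label.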
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.EvalUpdateParams;
import com.openai.models.evals.EvalUpdateResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();
        EvalUpdateResponse eval = client.evals().update("eval_id");
    }
}
Response
{
"id": "id",
"created_at": 0,
"data_source_config": {
"schema": {
"foo": "bar"
},
"type": "custom"
},
"metadata": {
"foo": "string"
},
"name": "Chatbot effectiveness Evaluation",
"object": "eval",
"testing_criteria": [
{
"input": [
{
"content": "string",
"role": "user",
"type": "message"
}
],
"labels": [
"string"
],
"model": "model",
"name": "name",
"passing_labels": [
"string"
],
"type": "label_model"
}
]
}
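The metadata limits described for this endpoint (16 key-value pairs, keys of at most 64 characters, values of at most 512 characters) can be checked locally before calling update. This is a minimal sketch under stated assumptions: MetadataLimitsSketch and violations are hypothetical names, 16 is treated as an upper bound, and length is measured with String.length(), which may not match how the API counts characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Local illustration only; not part of the OpenAI Java SDK. */
final class MetadataLimitsSketch {
    static final int MAX_PAIRS = 16;          // assumption: treated as an upper bound
    static final int MAX_KEY_LENGTH = 64;
    static final int MAX_VALUE_LENGTH = 512;

    /** Returns a description of each limit the map exceeds; empty when it fits. */
    static List<String> violations(Map<String, String> metadata) {
        List<String> problems = new ArrayList<>();
        if (metadata.size() > MAX_PAIRS) {
            problems.add("too many pairs: " + metadata.size());
        }
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            // String.length() counts UTF-16 code units (an assumption about how limits apply).
            if (entry.getKey().length() > MAX_KEY_LENGTH) {
                problems.add("key too long: " + entry.getKey());
            }
            if (entry.getValue().length() > MAX_VALUE_LENGTH) {
                problems.add("value too long for key: " + entry.getKey());
            }
        }
        return problems;
    }
}
```

An empty list means the map fits within the documented limits as interpreted here.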
Delete an eval
EvalDeleteResponse evals().delete(EvalDeleteParamsparams = EvalDeleteParams.none(), RequestOptionsrequestOptions = RequestOptions.none())
delete /evals/{eval_id}
Delete an evaluation.
Parameters
-
EvalDeleteParams params-
Optional<String> evalId
Returns
-
class EvalDeleteResponse:-
boolean deleted -
String evalId -
String object_
-
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.EvalDeleteParams;
import com.openai.models.evals.EvalDeleteResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();
        EvalDeleteResponse eval = client.evals().delete("eval_id");
    }
}
Response
{
"deleted": true,
"eval_id": "eval_abc123",
"object": "eval.deleted"
}
Domain Types
Eval Custom Data Source Config
-
class EvalCustomDataSourceConfig:A CustomDataSourceConfig which specifies the schema of your item and optionally sample namespaces. The response schema defines the shape of the data that will be:
-
Used to define your testing criteria
-
Required when creating a run
-
Schema schemaThe json schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue; type "custom"constantThe type of data source. Always
custom.CUSTOM("custom")
-
Eval Stored Completions Data Source Config
-
class EvalStoredCompletionsDataSourceConfig:Deprecated in favor of LogsDataSourceConfig.
-
Schema schemaThe json schema for the run data source items. Learn how to build JSON schemas here.
-
JsonValue; type "stored_completions"constantThe type of data source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Runs
Get eval runs
RunListPage evals().runs().list(RunListParamsparams = RunListParams.none(), RequestOptionsrequestOptions = RequestOptions.none())
get /evals/{eval_id}/runs
Get a list of runs for an evaluation.
Parameters
-
RunListParams params-
Optional<String> evalId -
Optional<String> afterIdentifier for the last run from the previous pagination request.
-
Optional<Long> limitNumber of runs to retrieve.
-
Optional<Order> orderSort order for runs by timestamp. Use asc for ascending order or desc for descending order. Defaults to asc.
-
ASC("asc") -
DESC("desc")
-
-
Optional<Status> statusFilter runs by status. One of queued, in_progress, failed, completed, or canceled.
-
QUEUED("queued") -
IN_PROGRESS("in_progress") -
COMPLETED("completed") -
CANCELED("canceled") -
FAILED("failed")
-
-
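The after and limit parameters above describe cursor-style pagination: each request passes the identifier of the last run from the previous page. The sketch below simulates that pattern over an in-memory list so the loop can be followed without network access. CursorPagingSketch, page, and collectAll are hypothetical names, and the exact server-side cursor behavior is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Local simulation only; not part of the OpenAI Java SDK. */
final class CursorPagingSketch {
    /** Returns up to limit ids that follow the after id; starts at the beginning when after is null. */
    static List<String> page(List<String> orderedIds, String after, int limit) {
        int start = 0;
        if (after != null) {
            int index = orderedIds.indexOf(after);
            if (index < 0) {
                throw new IllegalArgumentException("Unknown cursor: " + after);
            }
            start = index + 1;
        }
        int end = Math.min(orderedIds.size(), start + limit);
        return new ArrayList<>(orderedIds.subList(start, end));
    }

    /** Walks every page, feeding the last id of each page back in as the next cursor. */
    static List<String> collectAll(List<String> orderedIds, int limit) {
        List<String> all = new ArrayList<>();
        String after = null;
        while (true) {
            List<String> batch = page(orderedIds, after, limit);
            if (batch.isEmpty()) {
                break;
            }
            all.addAll(batch);
            after = batch.get(batch.size() - 1);
        }
        return all;
    }
}
```

With five run ids and a limit of 2, the loop fetches pages of 2, 2, and 1 items, then stops on the empty page.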
Returns
-
class RunListResponse:A schema representing an evaluation run.
-
String idUnique identifier for the evaluation run.
-
long createdAtUnix timestamp (in seconds) when the evaluation run was created.
-
DataSource dataSourceInformation about the run's data source.
-
class CreateEvalJsonlRunDataSource:A JsonlRunDataSource object that specifies a JSONL file that matches the eval
-
Source sourceDetermines what populates the item namespace in the data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
-
JsonValue; type "jsonl"constantThe type of data source. Always
jsonl.JSONL("jsonl")
-
-
class CreateEvalCompletionsRunDataSource:A CompletionsRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the item namespace in this run's data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class StoredCompletions:A StoredCompletionsRunDataSource configuration describing a set of filters
-
JsonValue; type "stored_completions"constantThe type of source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Long> createdAfterAn optional Unix timestamp to filter items created after this time.
-
Optional<Long> createdBeforeAn optional Unix timestamp to filter items created before this time.
-
Optional<Long> limitAn optional maximum number of items to return.
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> modelAn optional model to filter by (e.g., 'gpt-4o').
-
-
-
Type typeThe type of run data source. Always
completions.COMPLETIONS("completions")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can be either a reference to a prebuilt trajectory (e.g., item.input_trajectory) or a template with variable references to the item namespace.
-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g., {{item.name}}.
-
class EasyInputMessage:A message input to the model with a role indicating instruction-following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentText, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses.
-
String -
List<ResponseInputContent>-
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class ResponseInputImage:An image input to the model. Learn about image inputs.
-
Detail detailThe detail level of the image to be sent to the model. One of high, low, auto, or original. Defaults to auto.
-
LOW("low") -
HIGH("high") -
AUTO("auto") -
ORIGINAL("original")
-
-
JsonValue; type "input_image"constantThe type of the input item. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> imageUrlThe URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
-
-
class ResponseInputFile:A file input to the model.
-
JsonValue; type "input_file"constantThe type of the input item. Always
input_file.INPUT_FILE("input_file")
-
Optional<Detail> detailThe detail level of the file to be sent to the model. Use low for the default rendering behavior, or high to render the file at higher quality. Defaults to low.
-
LOW("low") -
HIGH("high")
-
-
Optional<String> fileDataThe content of the file to be sent to the model.
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> fileUrlThe URL of the file to be sent to the model.
-
Optional<String> filenameThe name of the file to be sent to the model.
-
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Phase> phaseLabels an assistant message as intermediate commentary (commentary) or the final answer (final_answer). For models like gpt-5.3-codex and beyond, when sending follow-up requests, preserve and resend phase on all assistant messages; dropping it can degrade performance. Not used for user messages.
-
COMMENTARY("commentary") -
FINAL_ANSWER("final_answer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
class EvalItem:A message input to the model with a role indicating instruction-following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are mp3 and wav.
-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the item namespace, e.g., "item.input_trajectory".
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
NONE("none") -
MINIMAL("minimal") -
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
XHIGH("xhigh")
-
-
Optional<ResponseFormat> responseFormatAn object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
-
class ResponseFormatText:Default response format. Used to generate text responses.
-
JsonValue; type "text"constantThe type of response format being defined. Always
text.TEXT("text")
-
-
class ResponseFormatJsonSchema:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
JsonSchema jsonSchemaStructured Outputs configuration options, including a JSON Schema.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Schema> schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
JsonValue; type "json_object"constantThe type of response format being defined. Always
json_object.JSON_OBJECT("json_object")
-
-
-
Optional<Long> seedA seed value to initialize the randomness, during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<List<ChatCompletionFunctionTool>> toolsA list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
-
FunctionDefinition function-
String nameThe name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the function does, used by the model to choose when and how to call the function.
-
Optional<FunctionParameters> parametersThe parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
Omitting parameters defines a function with an empty parameter list.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
-
-
JsonValue; type "function"constantThe type of the tool. Currently, only function is supported.
FUNCTION("function")
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
class Responses:A ResponsesRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the item namespace in this run's data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class InnerResponses:An EvalResponsesSource object describing a run data source configuration.
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<Long> createdAfterOnly include items created after this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<Long> createdBeforeOnly include items created before this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<String> instructionsSearchOptional string to search the 'instructions' field. This is a query parameter used to select responses.
-
Optional<JsonValue> metadataMetadata filter for the responses. This is a query parameter used to select responses.
-
Optional<String> modelThe name of the model to find responses for. This is a query parameter used to select responses.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Double> temperatureSampling temperature. This is a query parameter used to select responses.
-
Optional<List<String>> toolsList of tool names. This is a query parameter used to select responses.
-
Optional<Double> topPNucleus sampling parameter. This is a query parameter used to select responses.
-
Optional<List<String>> usersList of user identifiers. This is a query parameter used to select responses.
-
-
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can be either a reference to a prebuilt trajectory (e.g., item.input_trajectory) or a template with variable references to the item namespace.
-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g., {{item.name}}.
-
class ChatMessage:-
String contentThe content of the message.
-
String roleThe role of the message (e.g. "system", "assistant", "user").
-
-
class EvalItem:A message input to the model with a role indicating instruction-following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText -
InputImage -
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the item namespace, e.g., "item.name".
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Long> seedA seed value to initialize the randomness, during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<Text> textConfiguration options for a text response from the model. Can be plain text or structured JSON data. Learn more:
-
Optional<ResponseFormatTextConfig> formatAn object specifying the format that the model must output.
Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
The default format is { "type": "text" } with no additional options.
Not recommended for gpt-4o and newer models:
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
-
class ResponseFormatText:Default response format. Used to generate text responses.
-
class ResponseFormatTextJsonSchemaConfig:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Schema schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
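The naming rule for the response format can be checked on the client before a request is sent. The following is a minimal sketch, assuming the rule means "only letters, digits, underscores, and dashes, 1 to 64 characters"; the class and method names are hypothetical and not part of the SDK.

```java
import java.util.regex.Pattern;

final class ResponseFormatNames {
    // Assumed reading of the documented rule: a-z, A-Z, 0-9, underscores,
    // and dashes only, with a length of 1 to 64 characters.
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    /** Returns true when the candidate name satisfies the assumed rule. */
    static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("weather_report-v2")); // true
        System.out.println(isValidName("weather report"));    // false: contains a space
    }
}
```

The API remains the authority on what it accepts; a local check like this only catches obvious mistakes early.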
class ResponseFormatJsonObject: JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
-
Optional<List<Tool>> tools An array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter. The two categories of tools you can provide the model are:
-
Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities, like web search or file search. Learn more about built-in tools.
-
Function calls (custom tools): Functions that are defined by you, enabling the model to call your own code. Learn more about function calling.
-
class FunctionTool:Defines a function in your own code the model can choose to call. Learn more about function calling.
-
String nameThe name of the function to call.
-
Optional<Parameters> parametersA JSON schema object describing the parameters of the function.
-
Optional<Boolean> strict Whether to enforce strict parameter validation. Default true.
-
JsonValue; type "function"constantThe type of the function tool. Always
function.FUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function is deferred and loaded via tool search.
-
Optional<String> descriptionA description of the function. Used by the model to determine whether or not to call the function.
-
-
class FileSearchTool:A tool that searches for relevant content from uploaded files. Learn more about the file search tool.
-
JsonValue; type "file_search"constantThe type of the file search tool. Always
file_search.FILE_SEARCH("file_search")
-
List<String> vectorStoreIdsThe IDs of the vector stores to search.
-
Optional<Filters> filtersA filter to apply.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
String keyThe key to compare against the value.
-
Type type Specifies the comparison operator: eq, ne, gt, gte, lt, lte, in, nin.
-
eq: equals -
ne: not equal -
gt: greater than -
gte: greater than or equal -
lt: less than -
lte: less than or equal -
in: in -
nin: not in -
EQ("eq") -
NE("ne") -
GT("gt") -
GTE("gte") -
LT("lt") -
LTE("lte") -
IN("in") -
NIN("nin")
-
-
Value valueThe value to compare against the attribute key; supports string, number, or boolean types.
-
String -
double -
boolean -
List<ComparisonFilterValueItem>-
String -
double
-
-
-
-
class CompoundFilter: Combine multiple filters using "and" or "or".
-
List<Filter> filters Array of filters to combine. Items can be ComparisonFilter or CompoundFilter.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
JsonValue
-
-
Type type Type of operation: "and" or "or".
-
AND("and") -
OR("or")
-
-
-
-
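To make the semantics of the comparison and compound filters concrete, here is a small client-side sketch that evaluates them against a map of file attributes. It is illustrative only: the class and method names are hypothetical, the service applies filters on its side, and details such as how missing keys or mixed numeric types are handled are assumptions.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

final class AttributeFilters {
    /** Builds a predicate for one comparison; ordering operators compare numbers as doubles. */
    static Predicate<Map<String, Object>> comparison(String key, String type, Object value) {
        return attrs -> {
            Object actual = attrs.get(key);
            switch (type) {
                case "eq":  return value.equals(actual);
                case "ne":  return !value.equals(actual);
                // Assumption: a missing attribute is never "in" and always "nin".
                case "in":  return actual != null && ((List<?>) value).contains(actual);
                case "nin": return actual == null || !((List<?>) value).contains(actual);
                default:
                    if (!(actual instanceof Number) || !(value instanceof Number)) return false;
                    double a = ((Number) actual).doubleValue();
                    double v = ((Number) value).doubleValue();
                    switch (type) {
                        case "gt":  return a > v;
                        case "gte": return a >= v;
                        case "lt":  return a < v;
                        case "lte": return a <= v;
                        default: throw new IllegalArgumentException("Unknown operator: " + type);
                    }
            }
        };
    }

    /** Combines filters with "and" (all must match) or "or" (any may match). */
    static Predicate<Map<String, Object>> compound(
            String type, List<Predicate<Map<String, Object>>> filters) {
        if ("and".equals(type)) return attrs -> filters.stream().allMatch(f -> f.test(attrs));
        if ("or".equals(type)) return attrs -> filters.stream().anyMatch(f -> f.test(attrs));
        throw new IllegalArgumentException("Unknown compound type: " + type);
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = Map.of("region", "us", "year", 2024.0);
        Predicate<Map<String, Object>> filter = compound("and", List.of(
                comparison("region", "eq", "us"),
                comparison("year", "gte", 2023.0)));
        System.out.println(filter.test(attrs)); // true
    }
}
```

Because compound filters accept other compound filters as items, the same `compound` call can be nested to express conditions such as "region is us and (year >= 2023 or tag in [a, b])".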
Optional<Long> maxNumResultsThe maximum number of results to return. This number should be between 1 and 50 inclusive.
-
Optional<RankingOptions> rankingOptionsRanking options for search.
-
Optional<HybridSearch> hybridSearchWeights that control how reciprocal rank fusion balances semantic embedding matches versus sparse keyword matches when hybrid search is enabled.
-
double embeddingWeightThe weight of the embedding in the reciprocal ranking fusion.
-
double textWeightThe weight of the text in the reciprocal ranking fusion.
-
-
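The reference does not spell out the exact fusion formula, so the following is only a sketch of generic weighted reciprocal rank fusion, to show how the two weights shift results. It assumes each document scores `weight / (k + rank)` per ranking with the commonly used constant k = 60; the class name, the constant, and the inputs are all assumptions rather than the service's implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WeightedRrf {
    // Assumption: the smoothing constant commonly used with reciprocal rank fusion.
    static final double K = 60.0;

    /** Fuses two rankings (best first) into a score per document ID. */
    static Map<String, Double> fuse(List<String> embeddingRanking, List<String> textRanking,
                                    double embeddingWeight, double textWeight) {
        Map<String, Double> scores = new HashMap<>();
        accumulate(scores, embeddingRanking, embeddingWeight);
        accumulate(scores, textRanking, textWeight);
        return scores;
    }

    private static void accumulate(Map<String, Double> scores, List<String> ranking, double weight) {
        for (int i = 0; i < ranking.size(); i++) {
            // Ranks are 1-based, so the top result contributes weight / (K + 1).
            scores.merge(ranking.get(i), weight / (K + i + 1), Double::sum);
        }
    }

    public static void main(String[] args) {
        // Hypothetical rankings: semantic search prefers "a", keyword search prefers "b".
        Map<String, Double> scores = fuse(List.of("a", "b"), List.of("b", "a"), 2.0, 1.0);
        System.out.println(scores.get("a") > scores.get("b")); // true: the embedding weight dominates
    }
}
```

With equal weights the two documents in this example tie; raising one weight favors whichever ranking it belongs to.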
Optional<Ranker> rankerThe ranker to use for the file search.
-
AUTO("auto") -
DEFAULT_2024_11_15("default-2024-11-15")
-
-
Optional<Double> scoreThresholdThe score threshold for the file search, a number between 0 and 1. Numbers closer to 1 will attempt to return only the most relevant results, but may return fewer results.
-
-
-
class ComputerTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
JsonValue; type "computer"constantThe type of the computer tool. Always
computer.COMPUTER("computer")
-
-
class ComputerUsePreviewTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
long displayHeightThe height of the computer display.
-
long displayWidthThe width of the computer display.
-
Environment environmentThe type of computer environment to control.
-
WINDOWS("windows") -
MAC("mac") -
LINUX("linux") -
UBUNTU("ubuntu") -
BROWSER("browser")
-
-
JsonValue; type "computer_use_preview"constantThe type of the computer use tool. Always
computer_use_preview.COMPUTER_USE_PREVIEW("computer_use_preview")
-
-
class WebSearchTool:Search the Internet for sources related to the prompt. Learn more about the web search tool.
-
Type type The type of the web search tool. One of web_search or web_search_2025_08_26.
-
WEB_SEARCH("web_search") -
WEB_SEARCH_2025_08_26("web_search_2025_08_26")
-
-
Optional<Filters> filtersFilters for the search.
-
Optional<List<String>> allowedDomainsAllowed domains for the search. If not provided, all domains are allowed. Subdomains of the provided domains are allowed as well.
Example:
["pubmed.ncbi.nlm.nih.gov"]
-
-
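The allowed-domains rule (an empty list allows everything, and subdomains of a listed domain are allowed too) can be sketched as a simple host check. This is an illustration of the documented behavior, not the service's implementation; the class name is hypothetical and case-insensitive matching is an assumption.

```java
import java.util.List;
import java.util.Locale;

final class DomainAllowlist {
    /** Returns true when the host equals an allowed domain or is a subdomain of one. */
    static boolean isAllowed(String host, List<String> allowedDomains) {
        // When no domains are provided, all domains are allowed.
        if (allowedDomains == null || allowedDomains.isEmpty()) return true;
        String h = host.toLowerCase(Locale.ROOT);
        for (String domain : allowedDomains) {
            String d = domain.toLowerCase(Locale.ROOT);
            // The leading dot prevents "evilexample.com" from matching "example.com".
            if (h.equals(d) || h.endsWith("." + d)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> allowed = List.of("pubmed.ncbi.nlm.nih.gov");
        System.out.println(isAllowed("pubmed.ncbi.nlm.nih.gov", allowed)); // true
        System.out.println(isAllowed("example.com", allowed));             // false
    }
}
```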
Optional<SearchContextSize> searchContextSize High-level guidance for the amount of context window space to use for the search. One of low, medium, or high. The default is medium.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe approximate location of the user.
-
Optional<String> city Free text input for the city of the user, e.g. San Francisco.
-
Optional<String> country The two-letter ISO country code of the user, e.g. US.
-
Optional<String> region Free text input for the region of the user, e.g. California.
-
Optional<String> timezone The IANA timezone of the user, e.g. America/Los_Angeles.
-
Optional<Type> type The type of location approximation. Always approximate.
APPROXIMATE("approximate")
-
-
-
Mcp-
String serverLabelA label for this MCP server, used to identify it in tool calls.
-
JsonValue; type "mcp"constantThe type of the MCP tool. Always
mcp.MCP("mcp")
-
Optional<AllowedTools> allowedToolsList of allowed tool names or a filter object.
-
List<String> -
class McpToolFilter:A filter object to specify which tools are allowed.
-
Optional<Boolean> readOnly Indicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with readOnlyHint, it will match this filter.
-
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
Optional<String> authorizationAn OAuth access token that can be used with a remote MCP server, either with a custom MCP server URL or a service connector. Your application must handle the OAuth authorization flow and provide the token here.
-
Optional<ConnectorId> connectorId Identifier for service connectors, like those available in ChatGPT. One of server_url, connector_id, or tunnel_id must be provided. Learn more about service connectors here. Currently supported connector_id values are:
-
Dropbox: connector_dropbox
-
Gmail: connector_gmail
-
Google Calendar: connector_googlecalendar
-
Google Drive: connector_googledrive
-
Microsoft Teams: connector_microsoftteams
-
Outlook Calendar: connector_outlookcalendar
-
Outlook Email: connector_outlookemail
-
SharePoint: connector_sharepoint
-
CONNECTOR_DROPBOX("connector_dropbox") -
CONNECTOR_GMAIL("connector_gmail") -
CONNECTOR_GOOGLECALENDAR("connector_googlecalendar") -
CONNECTOR_GOOGLEDRIVE("connector_googledrive") -
CONNECTOR_MICROSOFTTEAMS("connector_microsoftteams") -
CONNECTOR_OUTLOOKCALENDAR("connector_outlookcalendar") -
CONNECTOR_OUTLOOKEMAIL("connector_outlookemail") -
CONNECTOR_SHAREPOINT("connector_sharepoint")
-
-
Optional<Boolean> deferLoadingWhether this MCP tool is deferred and discovered via tool search.
-
Optional<Headers> headersOptional HTTP headers to send to the MCP server. Use for authentication or other purposes.
-
Optional<RequireApproval> requireApprovalSpecify which of the MCP server's tools require approval.
-
class McpToolApprovalFilter: Specify which of the MCP server's tools require approval. Can be always, never, or a filter object associated with tools that require approval.
-
Optional<Always> alwaysA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
Optional<Never> neverA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
enum McpToolApprovalSetting: Specify a single approval policy for all tools. One of always or never. When set to always, all tools will require approval. When set to never, no tools will require approval.
-
ALWAYS("always") -
NEVER("never")
-
-
-
Optional<String> serverDescriptionOptional description of the MCP server, used to provide more context.
-
Optional<String> serverUrl The URL for the MCP server. One of server_url, connector_id, or tunnel_id must be provided.
-
Optional<String> tunnelId The Secure MCP Tunnel ID to use instead of a direct server URL. One of server_url, connector_id, or tunnel_id must be provided.
-
-
CodeInterpreter-
Container container The code interpreter container. Can be a container ID or an object that specifies uploaded file IDs to make available to your code, along with an optional memory_limit setting.
-
String -
class CodeInterpreterToolAuto:Configuration for a code interpreter container. Optionally specify the IDs of the files to run the code on.
-
JsonValue; type "auto"constantAlways
auto.AUTO("auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the code interpreter container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled:-
JsonValue; type "disabled"constantDisable outbound network access. Always
disabled.DISABLED("disabled")
-
-
class ContainerNetworkPolicyAllowlist:-
List<String> allowedDomainsA list of allowed domains when type is
allowlist. -
JsonValue; type "allowlist"constantAllow outbound network access only to specified domains. Always
allowlist.ALLOWLIST("allowlist")
-
Optional<List<ContainerNetworkPolicyDomainSecret>> domainSecretsOptional domain-scoped secrets for allowlisted domains.
-
String domainThe domain associated with the secret.
-
String nameThe name of the secret to inject for the domain.
-
String valueThe secret value to inject for the domain.
-
-
-
-
-
-
JsonValue; type "code_interpreter"constantThe type of the code interpreter tool. Always
code_interpreter.CODE_INTERPRETER("code_interpreter")
-
-
ImageGeneration-
JsonValue; type "image_generation"constantThe type of the image generation tool. Always
image_generation.IMAGE_GENERATION("image_generation")
-
Optional<Action> actionWhether to generate a new image or edit an existing image. Default:
auto.-
GENERATE("generate") -
EDIT("edit") -
AUTO("auto")
-
-
Optional<Background> background Sets the transparency of the background of the generated image(s). This parameter is only supported for GPT image models that support transparent backgrounds. Must be one of transparent, opaque, or auto (default value). When auto is used, the model will automatically determine the best background for the image. gpt-image-2 and gpt-image-2-2026-04-21 do not support transparent backgrounds. Requests with background set to transparent will return an error for these models; use opaque or auto instead. If transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp.
-
TRANSPARENT("transparent") -
OPAQUE("opaque") -
AUTO("auto")
-
-
Optional<InputFidelity> inputFidelity Controls how much effort the model will exert to match the style and features, especially facial features, of input images. This parameter is only supported for gpt-image-1 and for gpt-image-1.5 and later models; it is not supported for gpt-image-1-mini. Supports high and low. Defaults to low.
-
HIGH("high") -
LOW("low")
-
-
Optional<InputImageMask> inputImageMask Optional mask for inpainting. Contains image_url (string, optional) and file_id (string, optional).
-
Optional<String> fileIdFile ID for the mask image.
-
Optional<String> imageUrlBase64-encoded mask image.
-
-
Optional<Model> modelThe image generation model to use. Default:
gpt-image-1.-
GPT_IMAGE_1("gpt-image-1") -
GPT_IMAGE_1_MINI("gpt-image-1-mini") -
GPT_IMAGE_2("gpt-image-2") -
GPT_IMAGE_2_2026_04_21("gpt-image-2-2026-04-21") -
GPT_IMAGE_1_5("gpt-image-1.5") -
CHATGPT_IMAGE_LATEST("chatgpt-image-latest")
-
-
Optional<Moderation> moderationModeration level for the generated image. Default:
auto.-
AUTO("auto") -
LOW("low")
-
-
Optional<Long> outputCompressionCompression level for the output image. Default: 100.
-
Optional<OutputFormat> outputFormat The output format of the generated image. One of png, webp, or jpeg. Default: png.
-
PNG("png") -
WEBP("webp") -
JPEG("jpeg")
-
-
Optional<Long> partialImagesNumber of partial images to generate in streaming mode, from 0 (default value) to 3.
-
Optional<Quality> quality The quality of the generated image. One of low, medium, high, or auto. Default: auto.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
AUTO("auto")
-
-
Optional<Size> size The size of the generated images. For gpt-image-2 and gpt-image-2-2026-04-21, arbitrary resolutions are supported as WIDTHxHEIGHT strings, for example 1536x864. Width and height must both be divisible by 16 and the requested aspect ratio must be between 1:3 and 3:1. Resolutions above 2560x1440 are experimental, and the maximum supported resolution is 3840x2160. The requested size must also satisfy the model's current pixel and edge limits. The standard sizes 1024x1024, 1536x1024, and 1024x1536 are supported by the GPT image models; auto is supported for models that allow automatic sizing. For dall-e-2, use one of 256x256, 512x512, or 1024x1024. For dall-e-3, use one of 1024x1024, 1792x1024, or 1024x1792.
-
_1024X1024("1024x1024") -
_1024X1536("1024x1536") -
_1536X1024("1536x1024") -
AUTO("auto")
-
-
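The constraints on arbitrary WIDTHxHEIGHT sizes can be pre-checked locally. The sketch below is a rough client-side check under stated assumptions: it treats the 3840x2160 maximum as a cap on total pixel count, it does not know the model's "current pixel and edge limits", and it handles neither the `auto` value nor the model-specific standard sizes. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImageSizeCheck {
    private static final Pattern SIZE = Pattern.compile("(\\d{1,5})x(\\d{1,5})");

    /** Returns true when a WIDTHxHEIGHT string passes the documented divisibility and ratio rules. */
    static boolean isPlausibleSize(String size) {
        Matcher m = SIZE.matcher(size);
        if (!m.matches()) return false;
        long width = Long.parseLong(m.group(1));
        long height = Long.parseLong(m.group(2));
        if (width == 0 || height == 0) return false;
        // Width and height must both be divisible by 16.
        if (width % 16 != 0 || height % 16 != 0) return false;
        // Aspect ratio must lie between 1:3 and 3:1 (inclusive).
        if (width > 3 * height || height > 3 * width) return false;
        // Assumption: the 3840x2160 maximum is interpreted as a pixel-count cap.
        return width * height <= 3840L * 2160L;
    }

    public static void main(String[] args) {
        System.out.println(isPlausibleSize("1536x864"));  // true
        System.out.println(isPlausibleSize("1000x1000")); // false: not divisible by 16
    }
}
```

A size that passes this check can still be rejected by the API, since the model-specific limits are not encoded here.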
-
JsonValue;-
JsonValue; type "local_shell"constantThe type of the local shell tool. Always
local_shell.LOCAL_SHELL("local_shell")
-
-
class FunctionShellTool:A tool that allows the model to execute shell commands.
-
JsonValue; type "shell"constantThe type of the shell tool. Always
shell.SHELL("shell")
-
Optional<Environment> environment-
class ContainerAuto:-
JsonValue; type "container_auto"constantAutomatically creates a container for this request
CONTAINER_AUTO("container_auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled: -
class ContainerNetworkPolicyAllowlist:
-
-
Optional<List<Skill>> skillsAn optional list of skills referenced by id or inline data.
-
class SkillReference:-
String skillIdThe ID of the referenced skill.
-
JsonValue; type "skill_reference"constantReferences a skill created with the /v1/skills endpoint.
SKILL_REFERENCE("skill_reference")
-
Optional<String> versionOptional skill version. Use a positive integer or 'latest'. Omit for default.
-
-
class InlineSkill:-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
InlineSkillSource sourceInline skill payload
-
String dataBase64-encoded skill zip bundle.
-
JsonValue; mediaType "application/zip"constantThe media type of the inline skill payload. Must be
application/zip.APPLICATION_ZIP("application/zip")
-
JsonValue; type "base64"constantThe type of the inline skill source. Must be
base64.BASE64("base64")
-
-
JsonValue; type "inline"constantDefines an inline skill for this request.
INLINE("inline")
-
-
-
-
class LocalEnvironment:-
JsonValue; type "local"constantUse a local computer environment.
LOCAL("local")
-
Optional<List<LocalSkill>> skillsAn optional list of skills.
-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
String pathThe path to the directory containing the skill.
-
-
-
class ContainerReference:-
String containerIdThe ID of the referenced container.
-
JsonValue; type "container_reference"constantReferences a container created with the /v1/containers endpoint
CONTAINER_REFERENCE("container_reference")
-
-
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
String nameThe name of the custom tool, used to identify it in tool calls.
-
JsonValue; type "custom"constantThe type of the custom tool. Always
custom.CUSTOM("custom")
-
Optional<Boolean> deferLoadingWhether this tool should be deferred and discovered via tool search.
-
Optional<String> descriptionOptional description of the custom tool, used to provide more context.
-
Optional<CustomToolInputFormat> formatThe input format for the custom tool. Default is unconstrained text.
-
JsonValue;-
JsonValue; type "text"constantUnconstrained text format. Always
text.TEXT("text")
-
-
Grammar-
String definitionThe grammar definition.
-
Syntax syntaxThe syntax of the grammar definition. One of
larkorregex.-
LARK("lark") -
REGEX("regex")
-
-
JsonValue; type "grammar"constantGrammar format. Always
grammar.GRAMMAR("grammar")
-
-
-
-
class NamespaceTool:Groups function/custom tools under a shared namespace.
-
String descriptionA description of the namespace shown to the model.
-
String nameThe namespace name used in tool calls (for example,
crm). -
List<Tool> toolsThe function/custom tools available inside this namespace.
-
class Function:-
String name -
JsonValue; type "function"constantFUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function should be deferred and discovered via tool search.
-
Optional<String> description -
Optional<JsonValue> parameters -
Optional<Boolean> strict
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
-
JsonValue; type "namespace"constantThe type of the tool. Always
namespace.NAMESPACE("namespace")
-
-
class ToolSearchTool:Hosted or BYOT tool search configuration for deferred tools.
-
JsonValue; type "tool_search"constantThe type of the tool. Always
tool_search.TOOL_SEARCH("tool_search")
-
Optional<String> descriptionDescription shown to the model for a client-executed tool search tool.
-
Optional<Execution> executionWhether tool search is executed by the server or by the client.
-
SERVER("server") -
CLIENT("client")
-
-
Optional<JsonValue> parametersParameter schema for a client-executed tool search tool.
-
-
class WebSearchPreviewTool:This tool searches the web for relevant results to use in a response. Learn more about the web search tool.
-
Type type The type of the web search tool. One of web_search_preview or web_search_preview_2025_03_11.
-
WEB_SEARCH_PREVIEW("web_search_preview") -
WEB_SEARCH_PREVIEW_2025_03_11("web_search_preview_2025_03_11")
-
-
Optional<List<SearchContentType>> searchContentTypes-
TEXT("text") -
IMAGE("image")
-
-
Optional<SearchContextSize> searchContextSize High-level guidance for the amount of context window space to use for the search. One of low, medium, or high. The default is medium.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe user's location.
-
JsonValue; type "approximate"constantThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
Optional<String> cityFree text input for the city of the user, e.g.
San Francisco. -
Optional<String> countryThe two-letter ISO country code of the user, e.g.
US. -
Optional<String> regionFree text input for the region of the user, e.g.
California. -
Optional<String> timezoneThe IANA timezone of the user, e.g.
America/Los_Angeles.
-
-
-
class ApplyPatchTool:Allows the assistant to create, delete, or update files using unified diffs.
-
JsonValue; type "apply_patch"constantThe type of the tool. Always
apply_patch.APPLY_PATCH("apply_patch")
-
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
-
EvalApiError errorAn object representing an error response from the Eval API.
-
String codeThe error code.
-
String messageThe error message.
-
-
String evalIdThe identifier of the associated evaluation.
-
Optional<Metadata> metadata Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
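The metadata limits can be enforced before a request is built. This is a minimal sketch, assuming the limits are 16 pairs at most, keys of at most 64 characters, and values of at most 512 characters, with length measured by `String.length()`; the class name is hypothetical.

```java
import java.util.Map;

final class MetadataCheck {
    /** Throws IllegalArgumentException when the map exceeds the documented limits. */
    static void validate(Map<String, String> metadata) {
        if (metadata.size() > 16) {
            throw new IllegalArgumentException("Metadata may hold at most 16 key-value pairs");
        }
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            if (entry.getKey().length() > 64) {
                throw new IllegalArgumentException("Key longer than 64 characters: " + entry.getKey());
            }
            if (entry.getValue().length() > 512) {
                throw new IllegalArgumentException("Value longer than 512 characters for key: " + entry.getKey());
            }
        }
    }

    public static void main(String[] args) {
        // Hypothetical metadata in the style used elsewhere in this reference.
        validate(Map.of("usecase", "chatbot", "prompt-version", "v2"));
        System.out.println("metadata ok");
    }
}
```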
-
String modelThe model that is evaluated, if applicable.
-
String nameThe name of the evaluation run.
-
JsonValue; object_ "eval.run"constantThe type of the object. Always "eval.run".
EVAL_RUN("eval.run")
-
List<PerModelUsage> perModelUsageUsage statistics for each model during the evaluation run.
-
long cachedTokensThe number of tokens retrieved from cache.
-
long completionTokensThe number of completion tokens generated.
-
long invocationCountThe number of invocations.
-
String modelNameThe name of the model.
-
long promptTokensThe number of prompt tokens used.
-
long totalTokensThe total number of tokens used.
-
-
List<PerTestingCriteriaResult> perTestingCriteriaResultsResults per testing criteria applied during the evaluation run.
-
long failed Number of tests failed for this criterion.
-
long passed Number of tests passed for this criterion.
-
String testingCriteria A description of the testing criteria.
-
-
String reportUrlThe URL to the rendered evaluation run report on the UI dashboard.
-
ResultCounts resultCountsCounters summarizing the outcomes of the evaluation run.
-
long erroredNumber of output items that resulted in an error.
-
long failedNumber of output items that failed to pass the evaluation.
-
long passedNumber of output items that passed the evaluation.
-
long totalTotal number of executed output items.
-
-
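A common use of the result counters is to derive a pass rate for a run. The sketch below assumes that `total` covers passed, failed, and errored items, which the reference does not state explicitly; the class name and the sample numbers are hypothetical.

```java
final class RunSummary {
    /** Fraction of executed output items that passed; 0.0 when nothing was executed. */
    static double passRate(long passed, long total) {
        return total == 0 ? 0.0 : (double) passed / total;
    }

    public static void main(String[] args) {
        // Hypothetical counts: 42 passed out of 50 executed output items.
        System.out.println(passRate(42, 50)); // 0.84
    }
}
```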
String statusThe status of the evaluation run.
-
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.runs.RunListPage;
import com.openai.models.evals.runs.RunListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        // Configures the client from environment variables such as the API key.
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();
        // Fetches the first page of runs for the given eval ID.
        RunListPage page = client.evals().runs().list("eval_id");
    }
}
Response
{
  "data": [
    {
      "id": "id",
      "created_at": 0,
      "data_source": {
        "source": {
          "content": [
            {
              "item": {
                "foo": "bar"
              },
              "sample": {
                "foo": "bar"
              }
            }
          ],
          "type": "file_content"
        },
        "type": "jsonl"
      },
      "error": {
        "code": "code",
        "message": "message"
      },
      "eval_id": "eval_id",
      "metadata": {
        "foo": "string"
      },
      "model": "model",
      "name": "name",
      "object": "eval.run",
      "per_model_usage": [
        {
          "cached_tokens": 0,
          "completion_tokens": 0,
          "invocation_count": 0,
          "model_name": "model_name",
          "prompt_tokens": 0,
          "total_tokens": 0
        }
      ],
      "per_testing_criteria_results": [
        {
          "failed": 0,
          "passed": 0,
          "testing_criteria": "testing_criteria"
        }
      ],
      "report_url": "https://example.com",
      "result_counts": {
        "errored": 0,
        "failed": 0,
        "passed": 0,
        "total": 0
      },
      "status": "status"
    }
  ],
  "first_id": "first_id",
  "has_more": true,
  "last_id": "last_id",
  "object": "list"
}
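The `has_more` and `last_id` fields in the response above suggest cursor-style pagination. The sketch below shows the general loop with an in-memory stand-in for the API, so it runs without network access; `Page`, `collectAll`, and `fakeFetcher` are hypothetical names, and passing `last_id` as the next cursor is an assumption based on the `after` parameter described for list requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

final class CursorPaging {
    /** Minimal stand-in for the list envelope: item IDs, has_more, and last_id. */
    static final class Page {
        final List<String> ids;
        final boolean hasMore;
        final String lastId;

        Page(List<String> ids, boolean hasMore, String lastId) {
            this.ids = ids;
            this.hasMore = hasMore;
            this.lastId = lastId;
        }
    }

    /** Follows last_id cursors until a page reports has_more = false. */
    static List<String> collectAll(Function<String, Page> fetchAfter) {
        List<String> all = new ArrayList<>();
        String after = null; // null requests the first page
        while (true) {
            Page page = fetchAfter.apply(after);
            all.addAll(page.ids);
            if (!page.hasMore) return all;
            after = page.lastId;
        }
    }

    /** Fake fetcher that pages through an in-memory list instead of calling the API. */
    static Function<String, Page> fakeFetcher(List<String> source, int pageSize) {
        return after -> {
            int start = after == null ? 0 : source.indexOf(after) + 1;
            int end = Math.min(start + pageSize, source.size());
            List<String> slice = source.subList(start, end);
            String lastId = slice.isEmpty() ? null : slice.get(slice.size() - 1);
            return new Page(slice, end < source.size(), lastId);
        };
    }

    public static void main(String[] args) {
        List<String> runs = List.of("run_1", "run_2", "run_3");
        System.out.println(collectAll(fakeFetcher(runs, 2))); // [run_1, run_2, run_3]
    }
}
```

In real code the fetcher would wrap the SDK's list call; the SDK's page objects may already offer their own helpers for this, so check them before writing a manual loop.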
Create eval run
RunCreateResponse evals().runs().create(RunCreateParamsparams, RequestOptionsrequestOptions = RequestOptions.none())
post /evals/{eval_id}/runs
Kicks off a new run for a given evaluation, specifying the data source and the model configuration to test. The data source will be validated against the schema specified in the config of the evaluation.
Parameters
-
RunCreateParams params-
Optional<String> evalId -
DataSource dataSourceDetails about the run's data source.
-
class CreateEvalJsonlRunDataSource: A JsonlRunDataSource object that specifies a JSONL file that matches the eval.
-
Source source Determines what populates the item namespace in the data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
-
JsonValue; type "jsonl"constantThe type of data source. Always
jsonl.JSONL("jsonl")
-
-
class CreateEvalCompletionsRunDataSource:A CompletionsRunDataSource object describing a model sampling configuration.
-
Source source Determines what populates the item namespace in this run's data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class StoredCompletions:A StoredCompletionsRunDataSource configuration describing a set of filters
-
JsonValue; type "stored_completions"constantThe type of source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Long> createdAfterAn optional Unix timestamp to filter items created after this time.
-
Optional<Long> createdBeforeAn optional Unix timestamp to filter items created before this time.
-
Optional<Long> limitAn optional maximum number of items to return.
-
Optional<Metadata> metadata Set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> modelAn optional model to filter by (e.g., 'gpt-4o').
-
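The createdAfter and createdBefore filters take Unix timestamps. A small sketch of producing them from ISO-8601 instants follows; it assumes the values are in seconds, as stated for createdAt elsewhere in this reference, and the class and method names are hypothetical.

```java
import java.time.Instant;

final class CreatedWindow {
    /** Converts an ISO-8601 instant to Unix seconds (assumed unit for the filters). */
    static long toUnixSeconds(String isoInstant) {
        return Instant.parse(isoInstant).getEpochSecond();
    }

    public static void main(String[] args) {
        // Hypothetical window covering January 2024 (UTC).
        long createdAfter = toUnixSeconds("2024-01-01T00:00:00Z");
        long createdBefore = toUnixSeconds("2024-02-01T00:00:00Z");
        System.out.println(createdAfter + " .. " + createdBefore);
    }
}
```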
-
-
Type typeThe type of run data source. Always
completions.COMPLETIONS("completions")
-
Optional<InputMessages> inputMessages Used when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (i.e., item.input_trajectory), or a template with variable references to the item namespace.
-
class Template:
-
List<InnerTemplate> template A list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g. {{item.name}}.
-
class EasyInputMessage: A message input to the model with a role indicating instruction-following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentText, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses.
-
String -
List<ResponseInputContent>-
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class ResponseInputImage:An image input to the model. Learn about image inputs.
-
Detail detail The detail level of the image to be sent to the model. One of high, low, auto, or original. Defaults to auto.
-
LOW("low") -
HIGH("high") -
AUTO("auto") -
ORIGINAL("original")
-
-
JsonValue; type "input_image"constantThe type of the input item. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> imageUrlThe URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
-
-
class ResponseInputFile:A file input to the model.
-
JsonValue; type "input_file"constantThe type of the input item. Always
input_file.INPUT_FILE("input_file")
-
Optional<Detail> detail The detail level of the file to be sent to the model. Use low for the default rendering behavior, or high to render the file at higher quality. Defaults to low.
-
LOW("low") -
HIGH("high")
-
-
Optional<String> fileDataThe content of the file to be sent to the model.
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> fileUrlThe URL of the file to be sent to the model.
-
Optional<String> filenameThe name of the file to be sent to the model.
-
-
-
-
Role role The role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Phase> phase Labels an assistant message as intermediate commentary (commentary) or the final answer (final_answer). For models like gpt-5.3-codex and beyond, when sending follow-up requests, preserve and resend phase on all assistant messages; dropping it can degrade performance. Not used for user messages.
-
COMMENTARY("commentary") -
FINAL_ANSWER("final_answer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
class EvalItem: A message input to the model with a role indicating instruction-following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detail The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format format The format of the audio data. Currently supported formats are mp3 and wav.
-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detail The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role role The role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReference A reference to a variable in the item namespace, for example "item.input_trajectory".
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
NONE("none") -
MINIMAL("minimal") -
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
XHIGH("xhigh")
-
-
Optional<ResponseFormat> responseFormatAn object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
-
class ResponseFormatText:Default response format. Used to generate text responses.
-
JsonValue; type "text"constantThe type of response format being defined. Always
text.TEXT("text")
-
-
class ResponseFormatJsonSchema:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
JsonSchema jsonSchemaStructured Outputs configuration options, including a JSON Schema.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Schema> schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
JsonValue; type "json_object"constantThe type of response format being defined. Always
json_object.JSON_OBJECT("json_object")
-
-
-
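The naming rule stated for the response format name (a-z, A-Z, 0-9, underscores and dashes, at most 64 characters) can be checked on the client before a request is sent. The following is a minimal sketch and not part of the SDK: the class and method names are hypothetical, and it assumes that an empty name is rejected.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not part of the SDK) that checks a response format name locally. */
final class FormatNameCheck {
    // Letters, digits, underscores, and dashes; between 1 and 64 characters.
    // Assumption: an empty name is not acceptable.
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    private FormatNameCheck() {}

    /** Returns true when the name satisfies the documented character and length rule. */
    static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("math_response")); // true
        System.out.println(isValidName("has space"));     // false
    }
}
```

A local check like this only mirrors the documented rule; the API remains the authority on what it accepts.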
Optional<Long> seedA seed value to initialize the randomness, during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<List<ChatCompletionFunctionTool>> toolsA list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
-
FunctionDefinition function-
String nameThe name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the function does, used by the model to choose when and how to call the function.
-
Optional<FunctionParameters> parametersThe parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
Omitting parameters defines a function with an empty parameter list.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
-
-
JsonValue; type "function"constantThe type of the tool. Currently, only
function is supported.FUNCTION("function")
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
class CreateEvalResponsesRunDataSource:A ResponsesRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the item namespace in this run's data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class Responses:An EvalResponsesSource object describing a run data source configuration.
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<Long> createdAfterOnly include items created after this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<Long> createdBeforeOnly include items created before this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<String> instructionsSearchOptional string to search the 'instructions' field. This is a query parameter used to select responses.
-
Optional<JsonValue> metadataMetadata filter for the responses. This is a query parameter used to select responses.
-
Optional<String> modelThe name of the model to find responses for. This is a query parameter used to select responses.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Double> temperatureSampling temperature. This is a query parameter used to select responses.
-
Optional<List<String>> toolsList of tool names. This is a query parameter used to select responses.
-
Optional<Double> topPNucleus sampling parameter. This is a query parameter used to select responses.
-
Optional<List<String>> usersList of user identifiers. This is a query parameter used to select responses.
-
-
-
Type typeThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (e.g. item.input_trajectory), or a template with variable references to the item namespace.
-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g. {{item.name}}.
-
class ChatMessage:-
String contentThe content of the message.
-
String roleThe role of the message (e.g. "system", "assistant", "user").
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText -
InputImage -
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the item namespace, e.g. "item.name".
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
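Template messages may carry variable references such as {{item.name}}, which are resolved against the item namespace when the run is executed. To preview what a rendered message might look like, the substitution can be approximated locally. This is only a sketch under stated assumptions: the class name is hypothetical, the item is modelled as a flat map of strings, the reference syntax is matched with a simple regular expression, and unresolved references are left untouched. The service's actual rendering rules may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical local preview of template rendering; real substitution happens server-side. */
final class TemplatePreview {
    // Matches {{item.<key>}} with optional whitespace inside the braces (assumed syntax).
    private static final Pattern REF =
            Pattern.compile("\\{\\{\\s*item\\.([A-Za-z0-9_]+)\\s*\\}\\}");

    private TemplatePreview() {}

    /** Replaces each {{item.key}} with its value; unknown keys are left as written. */
    static String render(String template, Map<String, String> item) {
        Matcher m = REF.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = item.get(m.group(1));
            String replacement = (value != null) ? value : m.group();
            // quoteReplacement keeps '$' and '\' in values from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> item = new HashMap<>();
        item.put("name", "Ada");
        System.out.println(render("Greet {{item.name}} politely.", item)); // Greet Ada politely.
    }
}
```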
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Long> seedA seed value to initialize the randomness, during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<Text> textConfiguration options for a text response from the model. Can be plain text or structured JSON data.
-
Optional<ResponseFormatTextConfig> formatAn object specifying the format that the model must output.
Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
The default format is { "type": "text" } with no additional options.
Not recommended for gpt-4o and newer models: setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
-
class ResponseFormatText:Default response format. Used to generate text responses.
-
class ResponseFormatTextJsonSchemaConfig:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Schema schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
-
Optional<List<Tool>> toolsAn array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter.
The two categories of tools you can provide the model are:
-
Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities, like web search or file search. Learn more about built-in tools.
-
Function calls (custom tools): Functions that are defined by you, enabling the model to call your own code. Learn more about function calling.
-
class FunctionTool:Defines a function in your own code the model can choose to call. Learn more about function calling.
-
String nameThe name of the function to call.
-
Optional<Parameters> parametersA JSON schema object describing the parameters of the function.
-
Optional<Boolean> strictWhether to enforce strict parameter validation. Default
true. -
JsonValue; type "function"constantThe type of the function tool. Always
function.FUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function is deferred and loaded via tool search.
-
Optional<String> descriptionA description of the function. Used by the model to determine whether or not to call the function.
-
-
class FileSearchTool:A tool that searches for relevant content from uploaded files. Learn more about the file search tool.
-
JsonValue; type "file_search"constantThe type of the file search tool. Always
file_search.FILE_SEARCH("file_search")
-
List<String> vectorStoreIdsThe IDs of the vector stores to search.
-
Optional<Filters> filtersA filter to apply.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
String keyThe key to compare against the value.
-
Type typeSpecifies the comparison operator: eq, ne, gt, gte, lt, lte, in, nin.
-
eq: equals -
ne: not equal -
gt: greater than -
gte: greater than or equal -
lt: less than -
lte: less than or equal -
in: in -
nin: not in -
EQ("eq") -
NE("ne") -
GT("gt") -
GTE("gte") -
LT("lt") -
LTE("lte") -
IN("in") -
NIN("nin")
-
-
Value valueThe value to compare against the attribute key; supports string, number, or boolean types.
-
String -
double -
boolean -
List<ComparisonFilterValueItem>-
String -
double
-
-
-
-
class CompoundFilter:Combine multiple filters using and or or.
-
List<Filter> filtersArray of filters to combine. Items can be ComparisonFilter or CompoundFilter.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
JsonValue
-
-
Type typeType of operation: and or or.
-
AND("and") -
OR("or")
-
-
-
-
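The comparison operators (eq, ne, gt, gte, lt, lte, in, nin) and the and/or compound filter described above can be modelled locally to reason about which attribute sets a filter would select. The sketch below is illustrative only and does not use the SDK's ComparisonFilter or CompoundFilter classes: all names are hypothetical, and it assumes that a missing attribute never matches, that numbers compare by numeric value, and that the ordering operators apply to numbers only.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical client-side model of comparison and compound filter semantics. */
final class FilterSketch {
    private FilterSketch() {}

    interface Filter {
        boolean matches(Map<String, Object> attributes);
    }

    /** Builds a comparison filter; type is one of eq, ne, gt, gte, lt, lte, in, nin. */
    static Filter comparison(String key, String type, Object value) {
        return attributes -> {
            Object actual = attributes.get(key);
            if (actual == null) {
                return false; // Assumption: a missing attribute never matches.
            }
            switch (type) {
                case "eq":  return same(actual, value);
                case "ne":  return !same(actual, value);
                case "gt":  return order(actual, value) > 0;
                case "gte": return order(actual, value) >= 0;
                case "lt":  return order(actual, value) < 0;
                case "lte": return order(actual, value) <= 0;
                case "in":  return contains(value, actual);
                case "nin": return !contains(value, actual);
                default: throw new IllegalArgumentException("unknown operator: " + type);
            }
        };
    }

    /** Builds a compound filter; type is "and" or "or". */
    static Filter compound(String type, List<Filter> filters) {
        if (!"and".equals(type) && !"or".equals(type)) {
            throw new IllegalArgumentException("unknown compound type: " + type);
        }
        boolean isAnd = "and".equals(type);
        return attributes -> {
            for (Filter f : filters) {
                boolean hit = f.matches(attributes);
                if (isAnd && !hit) return false; // "and" fails on the first miss
                if (!isAnd && hit) return true;  // "or" succeeds on the first hit
            }
            return isAnd;
        };
    }

    // Numbers are compared by value so that 5 and 5.0 are treated as equal.
    private static boolean same(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            return ((Number) a).doubleValue() == ((Number) b).doubleValue();
        }
        return a.equals(b);
    }

    private static int order(Object a, Object b) {
        if (!(a instanceof Number) || !(b instanceof Number)) {
            throw new IllegalArgumentException("ordering operators need numeric operands");
        }
        return Double.compare(((Number) a).doubleValue(), ((Number) b).doubleValue());
    }

    private static boolean contains(Object list, Object actual) {
        for (Object candidate : (List<?>) list) {
            if (candidate != null && same(candidate, actual)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, Object> attrs = new HashMap<>();
        attrs.put("year", 2024);
        attrs.put("lang", "en");
        Filter f = compound("and", Arrays.asList(
                comparison("year", "gte", 2020),
                comparison("lang", "in", Arrays.asList("en", "de"))));
        System.out.println(f.matches(attrs)); // true
    }
}
```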
Optional<Long> maxNumResultsThe maximum number of results to return. This number should be between 1 and 50 inclusive.
-
Optional<RankingOptions> rankingOptionsRanking options for search.
-
Optional<HybridSearch> hybridSearchWeights that control how reciprocal rank fusion balances semantic embedding matches versus sparse keyword matches when hybrid search is enabled.
-
double embeddingWeightThe weight of the embedding in the reciprocal ranking fusion.
-
double textWeightThe weight of the text in the reciprocal ranking fusion.
-
-
Optional<Ranker> rankerThe ranker to use for the file search.
-
AUTO("auto") -
DEFAULT_2024_11_15("default-2024-11-15")
-
-
Optional<Double> scoreThresholdThe score threshold for the file search, a number between 0 and 1. Numbers closer to 1 will attempt to return only the most relevant results, but may return fewer results.
-
-
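The embeddingWeight and textWeight fields are described as weights in reciprocal rank fusion. As a rough illustration of that technique, the sketch below fuses two ranked lists of document IDs with a weighted form of the standard formula, score(d) = sum over lists of weight / (k + rank). It is not the service's implementation: the class name is hypothetical, the constant k = 60 is the value conventionally used for reciprocal rank fusion and is an assumption here, and how the service actually applies the weights is not specified in this reference.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of weighted reciprocal rank fusion over two rankings. */
final class HybridRankSketch {
    // Assumption: conventional RRF smoothing constant; the service's value is not documented here.
    static final double K = 60.0;

    private HybridRankSketch() {}

    /** Returns a fused score per document ID; higher means more relevant. */
    static Map<String, Double> fuse(List<String> embeddingRanking, List<String> textRanking,
                                    double embeddingWeight, double textWeight) {
        Map<String, Double> scores = new HashMap<>();
        accumulate(scores, embeddingRanking, embeddingWeight);
        accumulate(scores, textRanking, textWeight);
        return scores;
    }

    private static void accumulate(Map<String, Double> scores, List<String> ranking, double weight) {
        for (int i = 0; i < ranking.size(); i++) {
            double contribution = weight / (K + i + 1); // list index i corresponds to rank i + 1
            scores.merge(ranking.get(i), contribution, Double::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> scores = fuse(
                Arrays.asList("a", "b"), Arrays.asList("b", "a"), 2.0, 1.0);
        // With the embedding list weighted higher, "a" (its top hit) outscores "b".
        System.out.println(scores.get("a") > scores.get("b")); // true
    }
}
```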
-
class ComputerTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
JsonValue; type "computer"constantThe type of the computer tool. Always
computer.COMPUTER("computer")
-
-
class ComputerUsePreviewTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
long displayHeightThe height of the computer display.
-
long displayWidthThe width of the computer display.
-
Environment environmentThe type of computer environment to control.
-
WINDOWS("windows") -
MAC("mac") -
LINUX("linux") -
UBUNTU("ubuntu") -
BROWSER("browser")
-
-
JsonValue; type "computer_use_preview"constantThe type of the computer use tool. Always
computer_use_preview.COMPUTER_USE_PREVIEW("computer_use_preview")
-
-
class WebSearchTool:Search the Internet for sources related to the prompt. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of web_search or web_search_2025_08_26.
-
WEB_SEARCH("web_search") -
WEB_SEARCH_2025_08_26("web_search_2025_08_26")
-
-
Optional<Filters> filtersFilters for the search.
-
Optional<List<String>> allowedDomainsAllowed domains for the search. If not provided, all domains are allowed. Subdomains of the provided domains are allowed as well.
Example:
["pubmed.ncbi.nlm.nih.gov"]
-
-
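The allowedDomains rule above states that subdomains of a listed domain are allowed as well, and that every domain is allowed when no list is provided. A local check along those lines might look like the sketch below. The class and method names are hypothetical, an empty list is treated the same as a missing one (an assumption), and matching is done on the host name only, case-insensitively.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical local model of the allowedDomains rule for web search filters. */
final class AllowedDomainsSketch {
    private AllowedDomainsSketch() {}

    /** Returns true when host equals an allowed domain or is a subdomain of one. */
    static boolean isAllowed(String host, List<String> allowedDomains) {
        if (allowedDomains == null || allowedDomains.isEmpty()) {
            return true; // No filter provided: all domains are allowed.
        }
        String h = host.toLowerCase(Locale.ROOT);
        for (String domain : allowedDomains) {
            String d = domain.toLowerCase(Locale.ROOT);
            // The leading dot prevents "evilexample.com" from matching "example.com".
            if (h.equals(d) || h.endsWith("." + d)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> allowed = Arrays.asList("pubmed.ncbi.nlm.nih.gov");
        System.out.println(isAllowed("pubmed.ncbi.nlm.nih.gov", allowed)); // true
        System.out.println(isAllowed("ncbi.nlm.nih.gov", allowed));        // false
    }
}
```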
Optional<SearchContextSize> searchContextSizeHigh level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe approximate location of the user.
-
Optional<String> cityFree text input for the city of the user, e.g.
San Francisco. -
Optional<String> countryThe two-letter ISO country code of the user, e.g.
US. -
Optional<String> regionFree text input for the region of the user, e.g.
California. -
Optional<String> timezoneThe IANA timezone of the user, e.g.
America/Los_Angeles. -
Optional<Type> typeThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
-
-
Mcp-
String serverLabelA label for this MCP server, used to identify it in tool calls.
-
JsonValue; type "mcp"constantThe type of the MCP tool. Always
mcp.MCP("mcp")
-
Optional<AllowedTools> allowedToolsList of allowed tool names or a filter object.
-
List<String> -
class McpToolFilter:A filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
Optional<String> authorizationAn OAuth access token that can be used with a remote MCP server, either with a custom MCP server URL or a service connector. Your application must handle the OAuth authorization flow and provide the token here.
-
Optional<ConnectorId> connectorIdIdentifier for service connectors, like those available in ChatGPT. One of server_url, connector_id, or tunnel_id must be provided. Learn more about service connectors here.
Currently supported connector_id values are:
-
Dropbox: connector_dropbox
-
Gmail: connector_gmail
-
Google Calendar: connector_googlecalendar
-
Google Drive: connector_googledrive
-
Microsoft Teams: connector_microsoftteams
-
Outlook Calendar: connector_outlookcalendar
-
Outlook Email: connector_outlookemail
-
SharePoint: connector_sharepoint
-
CONNECTOR_DROPBOX("connector_dropbox") -
CONNECTOR_GMAIL("connector_gmail") -
CONNECTOR_GOOGLECALENDAR("connector_googlecalendar") -
CONNECTOR_GOOGLEDRIVE("connector_googledrive") -
CONNECTOR_MICROSOFTTEAMS("connector_microsoftteams") -
CONNECTOR_OUTLOOKCALENDAR("connector_outlookcalendar") -
CONNECTOR_OUTLOOKEMAIL("connector_outlookemail") -
CONNECTOR_SHAREPOINT("connector_sharepoint")
-
-
Optional<Boolean> deferLoadingWhether this MCP tool is deferred and discovered via tool search.
-
Optional<Headers> headersOptional HTTP headers to send to the MCP server. Use for authentication or other purposes.
-
Optional<RequireApproval> requireApprovalSpecify which of the MCP server's tools require approval.
-
class McpToolApprovalFilter:Specify which of the MCP server's tools require approval. Can be always, never, or a filter object associated with tools that require approval.
-
Optional<Always> alwaysA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
Optional<Never> neverA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
enum McpToolApprovalSetting:Specify a single approval policy for all tools. One of always or never. When set to always, all tools will require approval. When set to never, no tools will require approval.
-
ALWAYS("always") -
NEVER("never")
-
-
-
Optional<String> serverDescriptionOptional description of the MCP server, used to provide more context.
-
Optional<String> serverUrlThe URL for the MCP server. One of server_url, connector_id, or tunnel_id must be provided.
-
Optional<String> tunnelIdThe Secure MCP Tunnel ID to use instead of a direct server URL. One of server_url, connector_id, or tunnel_id must be provided.
-
-
CodeInterpreter-
Container containerThe code interpreter container. Can be a container ID or an object that specifies uploaded file IDs to make available to your code, along with an optional memory_limit setting.
-
String -
class CodeInterpreterToolAuto:Configuration for a code interpreter container. Optionally specify the IDs of the files to run the code on.
-
JsonValue; type "auto"constantAlways
auto.AUTO("auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the code interpreter container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled:-
JsonValue; type "disabled"constantDisable outbound network access. Always
disabled.DISABLED("disabled")
-
-
class ContainerNetworkPolicyAllowlist:-
List<String> allowedDomainsA list of allowed domains when type is
allowlist. -
JsonValue; type "allowlist"constantAllow outbound network access only to specified domains. Always
allowlist.ALLOWLIST("allowlist")
-
Optional<List<ContainerNetworkPolicyDomainSecret>> domainSecretsOptional domain-scoped secrets for allowlisted domains.
-
String domainThe domain associated with the secret.
-
String nameThe name of the secret to inject for the domain.
-
String valueThe secret value to inject for the domain.
-
-
-
-
-
-
JsonValue; type "code_interpreter"constantThe type of the code interpreter tool. Always
code_interpreter.CODE_INTERPRETER("code_interpreter")
-
-
ImageGeneration-
JsonValue; type "image_generation"constantThe type of the image generation tool. Always
image_generation.IMAGE_GENERATION("image_generation")
-
Optional<Action> actionWhether to generate a new image or edit an existing image. Default:
auto.-
GENERATE("generate") -
EDIT("edit") -
AUTO("auto")
-
-
Optional<Background> backgroundSets the transparency of the background of the generated image(s). This parameter is only supported for GPT image models that support transparent backgrounds. Must be one of transparent, opaque, or auto (default value). When auto is used, the model will automatically determine the best background for the image.
gpt-image-2 and gpt-image-2-2026-04-21 do not support transparent backgrounds. Requests with background set to transparent will return an error for these models; use opaque or auto instead.
If transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp.
-
TRANSPARENT("transparent") -
OPAQUE("opaque") -
AUTO("auto")
-
-
Optional<InputFidelity> inputFidelityControl how much effort the model will exert to match the style and features, especially facial features, of input images. This parameter is only supported for gpt-image-1, gpt-image-1.5, and later models; it is unsupported for gpt-image-1-mini. Supports high and low. Defaults to low.
-
HIGH("high") -
LOW("low")
-
-
Optional<InputImageMask> inputImageMaskOptional mask for inpainting. Contains image_url (string, optional) and file_id (string, optional).
-
Optional<String> fileIdFile ID for the mask image.
-
Optional<String> imageUrlBase64-encoded mask image.
-
-
Optional<Model> modelThe image generation model to use. Default:
gpt-image-1.-
GPT_IMAGE_1("gpt-image-1") -
GPT_IMAGE_1_MINI("gpt-image-1-mini") -
GPT_IMAGE_2("gpt-image-2") -
GPT_IMAGE_2_2026_04_21("gpt-image-2-2026-04-21") -
GPT_IMAGE_1_5("gpt-image-1.5") -
CHATGPT_IMAGE_LATEST("chatgpt-image-latest")
-
-
Optional<Moderation> moderationModeration level for the generated image. Default:
auto.-
AUTO("auto") -
LOW("low")
-
-
Optional<Long> outputCompressionCompression level for the output image. Default: 100.
-
Optional<OutputFormat> outputFormatThe output format of the generated image. One of png, webp, or jpeg. Default: png.
-
PNG("png") -
WEBP("webp") -
JPEG("jpeg")
-
-
Optional<Long> partialImagesNumber of partial images to generate in streaming mode, from 0 (default value) to 3.
-
Optional<Quality> qualityThe quality of the generated image. One of low, medium, high, or auto. Default: auto.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
AUTO("auto")
-
-
Optional<Size> sizeThe size of the generated images. For gpt-image-2 and gpt-image-2-2026-04-21, arbitrary resolutions are supported as WIDTHxHEIGHT strings, for example 1536x864. Width and height must both be divisible by 16 and the requested aspect ratio must be between 1:3 and 3:1. Resolutions above 2560x1440 are experimental, and the maximum supported resolution is 3840x2160. The requested size must also satisfy the model's current pixel and edge limits. The standard sizes 1024x1024, 1536x1024, and 1024x1536 are supported by the GPT image models; auto is supported for models that allow automatic sizing. For dall-e-2, use one of 256x256, 512x512, or 1024x1024. For dall-e-3, use one of 1024x1024, 1792x1024, or 1024x1792.
-
_1024X1024("1024x1024") -
_1024X1536("1024x1536") -
_1536X1024("1536x1024") -
AUTO("auto")
-
-
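The constraints listed for arbitrary WIDTHxHEIGHT sizes on gpt-image-2 (both dimensions divisible by 16, aspect ratio between 1:3 and 3:1, maximum resolution 3840x2160) lend themselves to a quick pre-flight check. The sketch below is a hypothetical helper, not an SDK method. It assumes the ratio bounds are inclusive and that the maximum applies in either orientation (long edge at most 3840, short edge at most 2160). It does not cover the model's additional pixel and edge limits, and it does not handle the auto value.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical pre-flight check for WIDTHxHEIGHT size strings. */
final class ImageSizeCheck {
    private static final Pattern SIZE = Pattern.compile("(\\d{1,5})x(\\d{1,5})");

    private ImageSizeCheck() {}

    /** Returns true when the size string satisfies the documented divisibility, ratio, and maximum rules. */
    static boolean isPlausibleSize(String size) {
        Matcher m = SIZE.matcher(size);
        if (!m.matches()) {
            return false;
        }
        int width = Integer.parseInt(m.group(1));
        int height = Integer.parseInt(m.group(2));
        if (width == 0 || height == 0) {
            return false;
        }
        boolean divisibleBy16 = width % 16 == 0 && height % 16 == 0;
        // Aspect ratio between 1:3 and 3:1; bounds assumed inclusive.
        boolean ratioOk = width <= 3 * height && height <= 3 * width;
        // Assumption: the 3840x2160 maximum applies in either orientation.
        boolean withinMax = Math.max(width, height) <= 3840 && Math.min(width, height) <= 2160;
        return divisibleBy16 && ratioOk && withinMax;
    }

    public static void main(String[] args) {
        System.out.println(isPlausibleSize("1536x864"));  // true
        System.out.println(isPlausibleSize("1000x1000")); // false: not divisible by 16
        System.out.println(isPlausibleSize("3200x1024")); // false: ratio wider than 3:1
    }
}
```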
-
JsonValue;-
JsonValue; type "local_shell"constantThe type of the local shell tool. Always
local_shell.LOCAL_SHELL("local_shell")
-
-
class FunctionShellTool:A tool that allows the model to execute shell commands.
-
JsonValue; type "shell"constantThe type of the shell tool. Always
shell.SHELL("shell")
-
Optional<Environment> environment-
class ContainerAuto:-
JsonValue; type "container_auto"constantAutomatically creates a container for this request
CONTAINER_AUTO("container_auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled: -
class ContainerNetworkPolicyAllowlist:
-
-
Optional<List<Skill>> skillsAn optional list of skills referenced by id or inline data.
-
class SkillReference:-
String skillIdThe ID of the referenced skill.
-
JsonValue; type "skill_reference"constantReferences a skill created with the /v1/skills endpoint.
SKILL_REFERENCE("skill_reference")
-
Optional<String> versionOptional skill version. Use a positive integer or 'latest'. Omit for default.
-
-
class InlineSkill:-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
InlineSkillSource sourceInline skill payload
-
String dataBase64-encoded skill zip bundle.
-
JsonValue; mediaType "application/zip"constantThe media type of the inline skill payload. Must be
application/zip.APPLICATION_ZIP("application/zip")
-
JsonValue; type "base64"constantThe type of the inline skill source. Must be
base64.BASE64("base64")
-
-
JsonValue; type "inline"constantDefines an inline skill for this request.
INLINE("inline")
-
-
-
-
class LocalEnvironment:-
JsonValue; type "local"constantUse a local computer environment.
LOCAL("local")
-
Optional<List<LocalSkill>> skillsAn optional list of skills.
-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
String pathThe path to the directory containing the skill.
-
-
-
class ContainerReference:-
String containerIdThe ID of the referenced container.
-
JsonValue; type "container_reference"constantReferences a container created with the /v1/containers endpoint
CONTAINER_REFERENCE("container_reference")
-
-
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
String nameThe name of the custom tool, used to identify it in tool calls.
-
JsonValue; type "custom"constantThe type of the custom tool. Always
custom.CUSTOM("custom")
-
Optional<Boolean> deferLoadingWhether this tool should be deferred and discovered via tool search.
-
Optional<String> descriptionOptional description of the custom tool, used to provide more context.
-
Optional<CustomToolInputFormat> formatThe input format for the custom tool. Default is unconstrained text.
-
JsonValue;-
JsonValue; type "text"constantUnconstrained text format. Always
text.TEXT("text")
-
-
Grammar-
String definitionThe grammar definition.
-
Syntax syntaxThe syntax of the grammar definition. One of lark or regex.
-
LARK("lark") -
REGEX("regex")
-
-
JsonValue; type "grammar"constantGrammar format. Always
grammar.GRAMMAR("grammar")
-
-
-
-
class NamespaceTool:Groups function/custom tools under a shared namespace.
-
String descriptionA description of the namespace shown to the model.
-
String nameThe namespace name used in tool calls (for example,
crm). -
List<Tool> toolsThe function/custom tools available inside this namespace.
-
class Function:-
String name -
JsonValue; type "function"constantFUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function should be deferred and discovered via tool search.
-
Optional<String> description -
Optional<JsonValue> parameters -
Optional<Boolean> strict
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
-
JsonValue; type "namespace"constantThe type of the tool. Always
namespace.NAMESPACE("namespace")
-
-
class ToolSearchTool:Hosted or BYOT tool search configuration for deferred tools.
-
JsonValue; type "tool_search"constantThe type of the tool. Always
tool_search.TOOL_SEARCH("tool_search")
-
Optional<String> descriptionDescription shown to the model for a client-executed tool search tool.
-
Optional<Execution> executionWhether tool search is executed by the server or by the client.
-
SERVER("server") -
CLIENT("client")
-
-
Optional<JsonValue> parametersParameter schema for a client-executed tool search tool.
-
-
class WebSearchPreviewTool:This tool searches the web for relevant results to use in a response. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of web_search_preview or web_search_preview_2025_03_11.
-
WEB_SEARCH_PREVIEW("web_search_preview") -
WEB_SEARCH_PREVIEW_2025_03_11("web_search_preview_2025_03_11")
-
-
Optional<List<SearchContentType>> searchContentTypes-
TEXT("text") -
IMAGE("image")
-
-
Optional<SearchContextSize> searchContextSizeHigh level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe user's location.
-
JsonValue; type "approximate"constantThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
Optional<String> cityFree text input for the city of the user, e.g. San Francisco.
-
Optional<String> countryThe two-letter ISO country code of the user, e.g. US.
-
Optional<String> regionFree text input for the region of the user, e.g. California.
-
Optional<String> timezoneThe IANA timezone of the user, e.g. America/Los_Angeles.
-
-
-
class ApplyPatchTool:Allows the assistant to create, delete, or update files using unified diffs.
-
JsonValue; type "apply_patch"constantThe type of the tool. Always
apply_patch.APPLY_PATCH("apply_patch")
-
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> nameThe name of the run.
-
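The metadata limits described above (16 pairs, 64-character keys, 512-character values) can be checked on the client before a run is created. The following is a minimal sketch, not part of any SDK: the class name `MetadataLimits` and the method `firstViolation` are hypothetical, the sketch reads "set of 16 key-value pairs" as "at most 16", and it counts length with `String.length()` (UTF-16 code units), which may differ from how the service counts characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of any SDK: checks the documented metadata limits client-side.
final class MetadataLimits {
    static final int MAX_PAIRS = 16;         // assumed to be an upper bound
    static final int MAX_KEY_LENGTH = 64;    // from the reference text
    static final int MAX_VALUE_LENGTH = 512; // from the reference text

    /** Returns null when the map is within the limits, otherwise a description of the first problem. */
    static String firstViolation(Map<String, String> metadata) {
        if (metadata.size() > MAX_PAIRS) {
            return "too many pairs: " + metadata.size();
        }
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            if (entry.getKey().length() > MAX_KEY_LENGTH) {
                return "key too long: " + entry.getKey();
            }
            if (entry.getValue().length() > MAX_VALUE_LENGTH) {
                return "value too long for key: " + entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("usecase", "chatbot");
        metadata.put("prompt-version", "v2");
        String problem = firstViolation(metadata);
        System.out.println(problem == null ? "ok" : problem);
    }
}
```

Running a check like this before the request keeps validation failures local; the service remains the authority on what it accepts.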
Returns
-
class RunCreateResponse:A schema representing an evaluation run.
-
String idUnique identifier for the evaluation run.
-
long createdAtUnix timestamp (in seconds) when the evaluation run was created.
-
DataSource dataSourceInformation about the run's data source.
-
class CreateEvalJsonlRunDataSource:A JsonlRunDataSource object that specifies a JSONL file that matches the eval
-
Source sourceDetermines what populates the item namespace in the data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
-
JsonValue; type "jsonl"constantThe type of data source. Always
jsonl.JSONL("jsonl")
-
-
class CreateEvalCompletionsRunDataSource:A CompletionsRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the item namespace in this run's data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class StoredCompletions:A StoredCompletionsRunDataSource configuration describing a set of filters
-
JsonValue; type "stored_completions"constantThe type of source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Long> createdAfterAn optional Unix timestamp to filter items created after this time.
-
Optional<Long> createdBeforeAn optional Unix timestamp to filter items created before this time.
-
Optional<Long> limitAn optional maximum number of items to return.
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> modelAn optional model to filter by (e.g., 'gpt-4o').
-
-
-
Type typeThe type of run data source. Always
completions.COMPLETIONS("completions")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (i.e., item.input_trajectory), or a template with variable references to the item namespace.
-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g. {{item.name}}.
-
class EasyInputMessage:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentText, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses.
-
String -
List<ResponseInputContent>-
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class ResponseInputImage:An image input to the model. Learn about image inputs.
-
Detail detailThe detail level of the image to be sent to the model. One of high, low, auto, or original. Defaults to auto.
-
LOW("low") -
HIGH("high") -
AUTO("auto") -
ORIGINAL("original")
-
-
JsonValue; type "input_image"constantThe type of the input item. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> imageUrlThe URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
-
-
class ResponseInputFile:A file input to the model.
-
JsonValue; type "input_file"constantThe type of the input item. Always
input_file.INPUT_FILE("input_file")
-
Optional<Detail> detailThe detail level of the file to be sent to the model. Use low for the default rendering behavior, or high to render the file at higher quality. Defaults to low.
-
LOW("low") -
HIGH("high")
-
-
Optional<String> fileDataThe content of the file to be sent to the model.
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> fileUrlThe URL of the file to be sent to the model.
-
Optional<String> filenameThe name of the file to be sent to the model.
-
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Phase> phaseLabels an assistant message as intermediate commentary (commentary) or the final answer (final_answer). For models like gpt-5.3-codex and beyond, when sending follow-up requests, preserve and resend phase on all assistant messages; dropping it can degrade performance. Not used for user messages.
-
COMMENTARY("commentary") -
FINAL_ANSWER("final_answer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are mp3 and wav.
-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the item namespace, e.g. "item.input_trajectory".
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
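The template form above embeds references such as {{item.name}} that are resolved against each data source item. The sketch below mimics that substitution locally to make the idea concrete. It is illustrative only: the class `ItemTemplateSketch`, the `render` method, the flat string-valued item map, and the choice to leave unknown fields untouched are all assumptions, and the service's own templating rules may differ.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: the service resolves templates itself; this mimics the idea locally.
final class ItemTemplateSketch {
    // Matches {{item.<field>}} with optional whitespace inside the braces.
    private static final Pattern ITEM_REF =
            Pattern.compile("\\{\\{\\s*item\\.([A-Za-z0-9_]+)\\s*\\}\\}");

    /** Replaces each {{item.field}} with the matching value; unknown fields are left as written. */
    static String render(String template, Map<String, String> item) {
        Matcher matcher = ITEM_REF.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String value = item.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            // quoteReplacement keeps "$" and "\" in values from being read as group references.
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Summarize the ticket from {{item.name}}.", Map.of("name", "Ada")));
    }
}
```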
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
NONE("none") -
MINIMAL("minimal") -
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
XHIGH("xhigh")
-
-
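The per-model rules for reasoning effort listed above can be captured in a small lookup so that an unsupported combination is caught before a run is submitted. This is a sketch under stated assumptions: it covers only the two models whose supported values are spelled out explicitly (gpt-5.1 and gpt-5-pro), and the class `ReasoningEffortSketch` and its methods are hypothetical names, not SDK types.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Sketch: encodes only the two models whose supported values are spelled out in the reference.
final class ReasoningEffortSketch {
    // Supported values per model, copied from the description above.
    private static final Map<String, List<String>> SUPPORTED = Map.of(
            "gpt-5.1", List.of("none", "low", "medium", "high"),
            "gpt-5-pro", List.of("high"));

    // Default value per model, copied from the description above.
    private static final Map<String, String> DEFAULTS = Map.of(
            "gpt-5.1", "none",
            "gpt-5-pro", "high");

    /** Empty when the model is not covered by this sketch; otherwise whether the effort is listed. */
    static Optional<Boolean> isSupported(String model, String effort) {
        return Optional.ofNullable(SUPPORTED.get(model)).map(values -> values.contains(effort));
    }

    /** Empty when the model is not covered by this sketch. */
    static Optional<String> defaultEffort(String model) {
        return Optional.ofNullable(DEFAULTS.get(model));
    }

    public static void main(String[] args) {
        System.out.println(isSupported("gpt-5.1", "minimal")); // minimal is not in the gpt-5.1 list
        System.out.println(defaultEffort("gpt-5-pro"));
    }
}
```

Returning an empty Optional for unlisted models avoids guessing at rules the reference states only in relative terms ("before" or "after" a given model).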
Optional<ResponseFormat> responseFormatAn object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
-
class ResponseFormatText:Default response format. Used to generate text responses.
-
JsonValue; type "text"constantThe type of response format being defined. Always
text.TEXT("text")
-
-
class ResponseFormatJsonSchema:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
JsonSchema jsonSchemaStructured Outputs configuration options, including a JSON Schema.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Schema> schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
JsonValue; type "json_object"constantThe type of response format being defined. Always
json_object.JSON_OBJECT("json_object")
-
-
-
Optional<Long> seedA seed value to initialize the randomness during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<List<ChatCompletionFunctionTool>> toolsA list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
-
FunctionDefinition function-
String nameThe name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the function does, used by the model to choose when and how to call the function.
-
Optional<FunctionParameters> parametersThe parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
Omitting parameters defines a function with an empty parameter list.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
-
-
JsonValue; type "function"constantThe type of the tool. Currently, only
functionis supported.FUNCTION("function")
-
-
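The naming rule for functions above (a-z, A-Z, 0-9, underscores and dashes, at most 64 characters) and the cap of 128 functions lend themselves to a quick client-side check. The sketch below is one way to express them; `FunctionToolLimits` and its methods are hypothetical helpers, and requiring at least one character in a name is an assumption the reference does not state.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: mirrors the documented limits on function tool names and count.
final class FunctionToolLimits {
    static final int MAX_FUNCTIONS = 128; // from the reference text

    // Letters, digits, underscores and dashes; 1 to 64 characters (the lower bound is assumed).
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    /** True when the list is within the documented size cap and every name is well formed. */
    static boolean isValidToolList(List<String> functionNames) {
        return functionNames.size() <= MAX_FUNCTIONS
                && functionNames.stream().allMatch(FunctionToolLimits::isValidName);
    }

    public static void main(String[] args) {
        System.out.println(isValidName("get_weather"));
        System.out.println(isValidToolList(List.of("get_weather", "lookup order")));
    }
}
```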
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
class Responses:A ResponsesRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the item namespace in this run's data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class InnerResponses:An EvalResponsesSource object describing a run data source configuration.
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<Long> createdAfterOnly include items created after this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<Long> createdBeforeOnly include items created before this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<String> instructionsSearchOptional string to search the 'instructions' field. This is a query parameter used to select responses.
-
Optional<JsonValue> metadataMetadata filter for the responses. This is a query parameter used to select responses.
-
Optional<String> modelThe name of the model to find responses for. This is a query parameter used to select responses.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Double> temperatureSampling temperature. This is a query parameter used to select responses.
-
Optional<List<String>> toolsList of tool names. This is a query parameter used to select responses.
-
Optional<Double> topPNucleus sampling parameter. This is a query parameter used to select responses.
-
Optional<List<String>> usersList of user identifiers. This is a query parameter used to select responses.
-
-
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (i.e., item.input_trajectory), or a template with variable references to the item namespace.
-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g. {{item.name}}.
-
class ChatMessage:-
String contentThe content of the message.
-
String roleThe role of the message (e.g. "system", "assistant", "user").
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText -
InputImage -
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the item namespace, e.g. "item.name".
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Long> seedA seed value to initialize the randomness during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<Text> textConfiguration options for a text response from the model. Can be plain text or structured JSON data. Learn more:
-
Optional<ResponseFormatTextConfig> formatAn object specifying the format that the model must output.
Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
The default format is { "type": "text" } with no additional options.
Not recommended for gpt-4o and newer models:
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
-
class ResponseFormatText:Default response format. Used to generate text responses.
-
class ResponseFormatTextJsonSchemaConfig:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Schema schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
-
Optional<List<Tool>> toolsAn array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter.
The two categories of tools you can provide the model are:
-
Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities, like web search or file search. Learn more about built-in tools.
-
Function calls (custom tools): Functions that are defined by you, enabling the model to call your own code. Learn more about function calling.
-
class FunctionTool:Defines a function in your own code the model can choose to call. Learn more about function calling.
-
String nameThe name of the function to call.
-
Optional<Parameters> parametersA JSON schema object describing the parameters of the function.
-
Optional<Boolean> strictWhether to enforce strict parameter validation. Default true.
-
JsonValue; type "function"constantThe type of the function tool. Always
function.FUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function is deferred and loaded via tool search.
-
Optional<String> descriptionA description of the function. Used by the model to determine whether or not to call the function.
-
-
class FileSearchTool:A tool that searches for relevant content from uploaded files. Learn more about the file search tool.
-
JsonValue; type "file_search"constantThe type of the file search tool. Always
file_search.FILE_SEARCH("file_search")
-
List<String> vectorStoreIdsThe IDs of the vector stores to search.
-
Optional<Filters> filtersA filter to apply.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
String keyThe key to compare against the value.
-
Type typeSpecifies the comparison operator: eq, ne, gt, gte, lt, lte, in, nin.
-
eq: equals -
ne: not equal -
gt: greater than -
gte: greater than or equal -
lt: less than -
lte: less than or equal -
in: in -
nin: not in -
EQ("eq") -
NE("ne") -
GT("gt") -
GTE("gte") -
LT("lt") -
LTE("lte") -
IN("in") -
NIN("nin")
-
-
Value valueThe value to compare against the attribute key; supports string, number, or boolean types.
-
String -
double -
boolean -
List<ComparisonFilterValueItem>-
String -
double
-
-
-
-
class CompoundFilter:Combine multiple filters using "and" or "or".
-
List<Filter> filtersArray of filters to combine. Items can be ComparisonFilter or CompoundFilter.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
JsonValue
-
-
Type typeType of operation: "and" or "or".
-
AND("and") -
OR("or")
-
-
-
-
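To make the semantics of the comparison and compound filters above concrete, the sketch below evaluates them locally against a map of file attributes. It is not the service's implementation: the classes `Comparison` and `Compound` are hypothetical stand-ins for the SDK types, ordering operators are handled for numbers only, and how the service treats missing attributes or mixed numeric types is an assumption here.

```java
import java.util.List;
import java.util.Map;

// Illustrative local evaluator; the real filtering happens server-side.
final class FileSearchFilterSketch {
    interface Filter {
        boolean matches(Map<String, Object> attributes);
    }

    /** One comparison on a single attribute key. */
    static final class Comparison implements Filter {
        private final String key;
        private final String type;
        private final Object value;

        Comparison(String key, String type, Object value) {
            this.key = key;
            this.type = type;
            this.value = value;
        }

        @Override
        public boolean matches(Map<String, Object> attributes) {
            Object actual = attributes.get(key);
            switch (type) {
                case "eq": return value.equals(actual);
                case "ne": return !value.equals(actual);
                case "in": return actual != null && ((List<?>) value).contains(actual);
                case "nin": return actual == null || !((List<?>) value).contains(actual);
                case "gt": case "gte": case "lt": case "lte": break;
                default: throw new IllegalArgumentException("unknown operator: " + type);
            }
            // Ordering operators: this sketch compares numbers only.
            if (!(actual instanceof Number) || !(value instanceof Number)) {
                return false;
            }
            int cmp = Double.compare(((Number) actual).doubleValue(), ((Number) value).doubleValue());
            switch (type) {
                case "gt": return cmp > 0;
                case "gte": return cmp >= 0;
                case "lt": return cmp < 0;
                default: return cmp <= 0; // "lte"
            }
        }
    }

    /** Combines nested filters with "and" or "or". */
    static final class Compound implements Filter {
        private final String type;
        private final List<Filter> filters;

        Compound(String type, List<Filter> filters) {
            this.type = type;
            this.filters = filters;
        }

        @Override
        public boolean matches(Map<String, Object> attributes) {
            if (type.equals("and")) return filters.stream().allMatch(f -> f.matches(attributes));
            if (type.equals("or")) return filters.stream().anyMatch(f -> f.matches(attributes));
            throw new IllegalArgumentException("unknown compound type: " + type);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> attributes = Map.of("region", "us", "year", 2024);
        Filter filter = new Compound("and", List.of(
                new Comparison("region", "eq", "us"),
                new Comparison("year", "gte", 2023)));
        System.out.println(filter.matches(attributes));
    }
}
```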
Optional<Long> maxNumResultsThe maximum number of results to return. This number should be between 1 and 50 inclusive.
-
Optional<RankingOptions> rankingOptionsRanking options for search.
-
Optional<HybridSearch> hybridSearchWeights that control how reciprocal rank fusion balances semantic embedding matches versus sparse keyword matches when hybrid search is enabled.
-
double embeddingWeightThe weight of the embedding in the reciprocal ranking fusion.
-
double textWeightThe weight of the text in the reciprocal ranking fusion.
-
-
Optional<Ranker> rankerThe ranker to use for the file search.
-
AUTO("auto") -
DEFAULT_2024_11_15("default-2024-11-15")
-
-
Optional<Double> scoreThresholdThe score threshold for the file search, a number between 0 and 1. Numbers closer to 1 will attempt to return only the most relevant results, but may return fewer results.
-
-
-
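The hybridSearch weights above are described as controlling how reciprocal rank fusion balances embedding matches against keyword matches. A common textbook form of weighted reciprocal rank fusion is sketched below for intuition. Everything specific in it is an assumption rather than the service's documented formula: the smoothing constant K = 60, the 1-based ranks, and the class `WeightedRrfSketch` itself.

```java
// Textbook-style weighted reciprocal rank fusion; not the service's documented formula.
final class WeightedRrfSketch {
    // Assumed smoothing constant, a value commonly used with RRF; not taken from the reference.
    static final int K = 60;

    /**
     * Fused score for one document. Ranks are 1-based; pass 0 when the document
     * was not returned by that retriever, in which case it contributes nothing.
     */
    static double score(int embeddingRank, int textRank, double embeddingWeight, double textWeight) {
        double total = 0.0;
        if (embeddingRank > 0) {
            total += embeddingWeight / (K + embeddingRank);
        }
        if (textRank > 0) {
            total += textWeight / (K + textRank);
        }
        return total;
    }

    public static void main(String[] args) {
        // With a heavier embedding weight, a top semantic hit outranks a top keyword hit.
        double semanticFirst = score(1, 5, 0.9, 0.1);
        double keywordFirst = score(5, 1, 0.9, 0.1);
        System.out.println(semanticFirst > keywordFirst);
    }
}
```

Under this form, raising embeddingWeight relative to textWeight shifts the fused ordering toward semantic matches, which is the trade-off the two fields expose.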
class ComputerTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
JsonValue; type "computer"constantThe type of the computer tool. Always
computer.COMPUTER("computer")
-
-
class ComputerUsePreviewTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
long displayHeightThe height of the computer display.
-
long displayWidthThe width of the computer display.
-
Environment environmentThe type of computer environment to control.
-
WINDOWS("windows") -
MAC("mac") -
LINUX("linux") -
UBUNTU("ubuntu") -
BROWSER("browser")
-
-
JsonValue; type "computer_use_preview"constantThe type of the computer use tool. Always
computer_use_preview.COMPUTER_USE_PREVIEW("computer_use_preview")
-
-
class WebSearchTool:Search the Internet for sources related to the prompt. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of web_search or web_search_2025_08_26.
-
WEB_SEARCH("web_search") -
WEB_SEARCH_2025_08_26("web_search_2025_08_26")
-
-
Optional<Filters> filtersFilters for the search.
-
Optional<List<String>> allowedDomainsAllowed domains for the search. If not provided, all domains are allowed. Subdomains of the provided domains are allowed as well.
Example: ["pubmed.ncbi.nlm.nih.gov"]
-
-
Optional<SearchContextSize> searchContextSizeHigh level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe approximate location of the user.
-
Optional<String> cityFree text input for the city of the user, e.g. San Francisco.
-
Optional<String> countryThe two-letter ISO country code of the user, e.g. US.
-
Optional<String> regionFree text input for the region of the user, e.g. California.
-
Optional<String> timezoneThe IANA timezone of the user, e.g. America/Los_Angeles.
-
Optional<Type> typeThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
-
-
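The allowedDomains rule for the web search tool above (no list means every domain is allowed, and subdomains of a listed domain are allowed too) can be pictured with a short host check. This is a sketch of that rule as worded, not the service's matching logic: the class `AllowedDomainsSketch` is hypothetical, and treating an empty list like a missing one and comparing case-insensitively are assumptions.

```java
import java.util.List;
import java.util.Locale;

// Illustrative host check for the allowedDomains rule; matching details are assumed.
final class AllowedDomainsSketch {
    /** True when the host equals a listed domain or is a subdomain of one. */
    static boolean isAllowed(String host, List<String> allowedDomains) {
        // No list (or, by assumption, an empty list) means no restriction.
        if (allowedDomains == null || allowedDomains.isEmpty()) {
            return true;
        }
        String normalizedHost = host.toLowerCase(Locale.ROOT);
        for (String domain : allowedDomains) {
            String normalizedDomain = domain.toLowerCase(Locale.ROOT);
            // The leading dot keeps "evilexample.com" from matching "example.com".
            if (normalizedHost.equals(normalizedDomain)
                    || normalizedHost.endsWith("." + normalizedDomain)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> allowed = List.of("pubmed.ncbi.nlm.nih.gov");
        System.out.println(isAllowed("pubmed.ncbi.nlm.nih.gov", allowed));
        System.out.println(isAllowed("ncbi.nlm.nih.gov", allowed));
    }
}
```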
Mcp-
String serverLabelA label for this MCP server, used to identify it in tool calls.
-
JsonValue; type "mcp"constantThe type of the MCP tool. Always
mcp.MCP("mcp")
-
Optional<AllowedTools> allowedToolsList of allowed tool names or a filter object.
-
List<String> -
class McpToolFilter:A filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with readOnlyHint, it will match this filter.
-
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
Optional<String> authorizationAn OAuth access token that can be used with a remote MCP server, either with a custom MCP server URL or a service connector. Your application must handle the OAuth authorization flow and provide the token here.
-
Optional<ConnectorId> connectorIdIdentifier for service connectors, like those available in ChatGPT. One of server_url, connector_id, or tunnel_id must be provided. Learn more about service connectors here.
Currently supported connector_id values are:
-
Dropbox: connector_dropbox
-
Gmail: connector_gmail
-
Google Calendar: connector_googlecalendar
-
Google Drive: connector_googledrive
-
Microsoft Teams: connector_microsoftteams
-
Outlook Calendar: connector_outlookcalendar
-
Outlook Email: connector_outlookemail
-
SharePoint: connector_sharepoint
-
CONNECTOR_DROPBOX("connector_dropbox") -
CONNECTOR_GMAIL("connector_gmail") -
CONNECTOR_GOOGLECALENDAR("connector_googlecalendar") -
CONNECTOR_GOOGLEDRIVE("connector_googledrive") -
CONNECTOR_MICROSOFTTEAMS("connector_microsoftteams") -
CONNECTOR_OUTLOOKCALENDAR("connector_outlookcalendar") -
CONNECTOR_OUTLOOKEMAIL("connector_outlookemail") -
CONNECTOR_SHAREPOINT("connector_sharepoint")
-
-
Optional<Boolean> deferLoadingWhether this MCP tool is deferred and discovered via tool search.
-
Optional<Headers> headersOptional HTTP headers to send to the MCP server. Use for authentication or other purposes.
-
Optional<RequireApproval> requireApprovalSpecify which of the MCP server's tools require approval.
-
class McpToolApprovalFilter:Specify which of the MCP server's tools require approval. Can be always, never, or a filter object associated with tools that require approval.
-
Optional<Always> alwaysA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with readOnlyHint, it will match this filter.
-
Optional<List<String>> toolNamesList of allowed tool names.
-
-
Optional<Never> neverA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with readOnlyHint, it will match this filter.
-
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
enum McpToolApprovalSetting:Specify a single approval policy for all tools. One of always or never. When set to always, all tools will require approval. When set to never, no tools will require approval.
-
ALWAYS("always") -
NEVER("never")
-
-
-
Optional<String> serverDescriptionOptional description of the MCP server, used to provide more context.
-
Optional<String> serverUrlThe URL for the MCP server. One of server_url, connector_id, or tunnel_id must be provided.
-
Optional<String> tunnelIdThe Secure MCP Tunnel ID to use instead of a direct server URL. One of server_url, connector_id, or tunnel_id must be provided.
-
-
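The MCP tool fields above repeat one requirement: one of server_url, connector_id, or tunnel_id must be provided. A minimal pre-flight check for that requirement might look like the sketch below. The class `McpTargetCheck` is a hypothetical helper, and reading the rule as "at least one" (rather than "exactly one") and treating blank strings as missing are assumptions.

```java
// Hypothetical pre-flight check for the "one of server_url, connector_id, or tunnel_id" rule.
final class McpTargetCheck {
    /** True when at least one of the three ways to locate the MCP server is set. */
    static boolean hasTarget(String serverUrl, String connectorId, String tunnelId) {
        return isSet(serverUrl) || isSet(connectorId) || isSet(tunnelId);
    }

    // Blank strings are treated as missing (an assumption of this sketch).
    private static boolean isSet(String value) {
        return value != null && !value.isBlank();
    }

    public static void main(String[] args) {
        System.out.println(hasTarget(null, "connector_gmail", null));
        System.out.println(hasTarget(null, null, " "));
    }
}
```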
CodeInterpreter-
Container containerThe code interpreter container. Can be a container ID or an object that specifies uploaded file IDs to make available to your code, along with an optional memory_limit setting.
-
String -
class CodeInterpreterToolAuto:Configuration for a code interpreter container. Optionally specify the IDs of the files to run the code on.
-
JsonValue; type "auto"constantAlways
auto.AUTO("auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the code interpreter container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled:-
JsonValue; type "disabled"constantDisable outbound network access. Always
disabled.DISABLED("disabled")
-
-
class ContainerNetworkPolicyAllowlist:-
List<String> allowedDomainsA list of allowed domains when type is
allowlist. -
JsonValue; type "allowlist"constantAllow outbound network access only to specified domains. Always
allowlist.ALLOWLIST("allowlist")
-
Optional<List<ContainerNetworkPolicyDomainSecret>> domainSecretsOptional domain-scoped secrets for allowlisted domains.
-
String domainThe domain associated with the secret.
-
String nameThe name of the secret to inject for the domain.
-
String valueThe secret value to inject for the domain.
-
-
-
-
-
-
JsonValue; type "code_interpreter"constantThe type of the code interpreter tool. Always
code_interpreter.CODE_INTERPRETER("code_interpreter")
-
-
ImageGeneration-
JsonValue; type "image_generation"constantThe type of the image generation tool. Always
image_generation.IMAGE_GENERATION("image_generation")
-
Optional<Action> actionWhether to generate a new image or edit an existing image. Default:
auto.-
GENERATE("generate") -
EDIT("edit") -
AUTO("auto")
-
-
Optional<Background> backgroundSets the transparency of the background of the generated image(s). This parameter is only supported for GPT image models that support transparent backgrounds. Must be one of
transparent, opaque, or auto (default value). When auto is used, the model automatically determines the best background for the image. gpt-image-2 and gpt-image-2-2026-04-21 do not support transparent backgrounds. Requests with background set to transparent will return an error for these models; use opaque or auto instead. If
transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp.-
TRANSPARENT("transparent") -
OPAQUE("opaque") -
AUTO("auto")
-
-
Optional<InputFidelity> inputFidelityControl how much effort the model will exert to match the style and features, especially facial features, of input images. This parameter is only supported for
gpt-image-1 and gpt-image-1.5 and later models; it is unsupported for gpt-image-1-mini. Supports high and low. Defaults to low.-
HIGH("high") -
LOW("low")
-
-
Optional<InputImageMask> inputImageMaskOptional mask for inpainting. Contains
image_url (string, optional) and file_id (string, optional).-
Optional<String> fileIdFile ID for the mask image.
-
Optional<String> imageUrlBase64-encoded mask image.
-
-
Optional<Model> modelThe image generation model to use. Default:
gpt-image-1.-
GPT_IMAGE_1("gpt-image-1") -
GPT_IMAGE_1_MINI("gpt-image-1-mini") -
GPT_IMAGE_2("gpt-image-2") -
GPT_IMAGE_2_2026_04_21("gpt-image-2-2026-04-21") -
GPT_IMAGE_1_5("gpt-image-1.5") -
CHATGPT_IMAGE_LATEST("chatgpt-image-latest")
-
-
Optional<Moderation> moderationModeration level for the generated image. Default:
auto.-
AUTO("auto") -
LOW("low")
-
-
Optional<Long> outputCompressionCompression level for the output image. Default: 100.
-
Optional<OutputFormat> outputFormatThe output format of the generated image. One of
png, webp, or jpeg. Default: png.-
PNG("png") -
WEBP("webp") -
JPEG("jpeg")
-
-
Optional<Long> partialImagesNumber of partial images to generate in streaming mode, from 0 (default value) to 3.
-
Optional<Quality> qualityThe quality of the generated image. One of
low, medium, high, or auto. Default: auto.-
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
AUTO("auto")
-
-
Optional<Size> sizeThe size of the generated images. For
gpt-image-2 and gpt-image-2-2026-04-21, arbitrary resolutions are supported as WIDTHxHEIGHT strings, for example 1536x864. Width and height must both be divisible by 16, and the requested aspect ratio must be between 1:3 and 3:1. Resolutions above 2560x1440 are experimental, and the maximum supported resolution is 3840x2160. The requested size must also satisfy the model's current pixel and edge limits. The standard sizes 1024x1024, 1536x1024, and 1024x1536 are supported by the GPT image models; auto is supported for models that allow automatic sizing. For dall-e-2, use one of 256x256, 512x512, or 1024x1024. For dall-e-3, use one of 1024x1024, 1792x1024, or 1024x1792.-
_1024X1024("1024x1024") -
_1024X1536("1024x1536") -
_1536X1024("1536x1024") -
AUTO("auto")
-
-
-
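The size rules stated for gpt-image-2 can be checked on the client before a request is sent. The following is a minimal sketch and not part of the SDK: the class name ImageSizeCheck and the method isPlausibleSize are hypothetical, and reading the 3840x2160 maximum as a cap on total pixel count is an assumption. The model's current pixel and edge limits are not specified here, so the sketch does not cover them.

```java
final class ImageSizeCheck {
    private ImageSizeCheck() {}

    // Assumption: the documented 3840x2160 maximum is treated as a pixel-count cap.
    private static final long MAX_PIXELS = 3840L * 2160L;

    /** Returns true if a WIDTHxHEIGHT string passes the documented divisibility and aspect-ratio rules. */
    static boolean isPlausibleSize(String size) {
        if (size == null) {
            return false;
        }
        String[] parts = size.split("x");
        if (parts.length != 2) {
            return false;
        }
        int width;
        int height;
        try {
            width = Integer.parseInt(parts[0]);
            height = Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            return false;
        }
        if (width <= 0 || height <= 0) {
            return false;
        }
        // Width and height must both be divisible by 16.
        if (width % 16 != 0 || height % 16 != 0) {
            return false;
        }
        // Aspect ratio must be between 1:3 and 3:1.
        if (width > 3L * height || height > 3L * width) {
            return false;
        }
        return (long) width * height <= MAX_PIXELS;
    }

    public static void main(String[] args) {
        System.out.println(isPlausibleSize("1536x864"));
        System.out.println(isPlausibleSize("4096x1024"));
    }
}
```

A check like this only catches obviously malformed sizes early; the API remains the authority on which sizes a given model accepts.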
JsonValue;-
JsonValue; type "local_shell"constantThe type of the local shell tool. Always
local_shell.LOCAL_SHELL("local_shell")
-
-
class FunctionShellTool:A tool that allows the model to execute shell commands.
-
JsonValue; type "shell"constantThe type of the shell tool. Always
shell.SHELL("shell")
-
Optional<Environment> environment-
class ContainerAuto:-
JsonValue; type "container_auto"constantAutomatically creates a container for this request
CONTAINER_AUTO("container_auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled: -
class ContainerNetworkPolicyAllowlist:
-
-
Optional<List<Skill>> skillsAn optional list of skills referenced by id or inline data.
-
class SkillReference:-
String skillIdThe ID of the referenced skill.
-
JsonValue; type "skill_reference"constantReferences a skill created with the /v1/skills endpoint.
SKILL_REFERENCE("skill_reference")
-
Optional<String> versionOptional skill version. Use a positive integer or 'latest'. Omit for default.
-
-
class InlineSkill:-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
InlineSkillSource sourceInline skill payload
-
String dataBase64-encoded skill zip bundle.
-
JsonValue; mediaType "application/zip"constantThe media type of the inline skill payload. Must be
application/zip.APPLICATION_ZIP("application/zip")
-
JsonValue; type "base64"constantThe type of the inline skill source. Must be
base64.BASE64("base64")
-
-
JsonValue; type "inline"constantDefines an inline skill for this request.
INLINE("inline")
-
-
-
-
class LocalEnvironment:-
JsonValue; type "local"constantUse a local computer environment.
LOCAL("local")
-
Optional<List<LocalSkill>> skillsAn optional list of skills.
-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
String pathThe path to the directory containing the skill.
-
-
-
class ContainerReference:-
String containerIdThe ID of the referenced container.
-
JsonValue; type "container_reference"constantReferences a container created with the /v1/containers endpoint
CONTAINER_REFERENCE("container_reference")
-
-
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
String nameThe name of the custom tool, used to identify it in tool calls.
-
JsonValue; type "custom"constantThe type of the custom tool. Always
custom.CUSTOM("custom")
-
Optional<Boolean> deferLoadingWhether this tool should be deferred and discovered via tool search.
-
Optional<String> descriptionOptional description of the custom tool, used to provide more context.
-
Optional<CustomToolInputFormat> formatThe input format for the custom tool. Default is unconstrained text.
-
JsonValue;-
JsonValue; type "text"constantUnconstrained text format. Always
text.TEXT("text")
-
-
Grammar-
String definitionThe grammar definition.
-
Syntax syntaxThe syntax of the grammar definition. One of
larkorregex.-
LARK("lark") -
REGEX("regex")
-
-
JsonValue; type "grammar"constantGrammar format. Always
grammar.GRAMMAR("grammar")
-
-
-
-
class NamespaceTool:Groups function/custom tools under a shared namespace.
-
String descriptionA description of the namespace shown to the model.
-
String nameThe namespace name used in tool calls (for example,
crm). -
List<Tool> toolsThe function/custom tools available inside this namespace.
-
class Function:-
String name -
JsonValue; type "function"constantFUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function should be deferred and discovered via tool search.
-
Optional<String> description -
Optional<JsonValue> parameters -
Optional<Boolean> strict
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
-
JsonValue; type "namespace"constantThe type of the tool. Always
namespace.NAMESPACE("namespace")
-
-
class ToolSearchTool:Hosted or BYOT tool search configuration for deferred tools.
-
JsonValue; type "tool_search"constantThe type of the tool. Always
tool_search.TOOL_SEARCH("tool_search")
-
Optional<String> descriptionDescription shown to the model for a client-executed tool search tool.
-
Optional<Execution> executionWhether tool search is executed by the server or by the client.
-
SERVER("server") -
CLIENT("client")
-
-
Optional<JsonValue> parametersParameter schema for a client-executed tool search tool.
-
-
class WebSearchPreviewTool:This tool searches the web for relevant results to use in a response. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of
web_search_previeworweb_search_preview_2025_03_11.-
WEB_SEARCH_PREVIEW("web_search_preview") -
WEB_SEARCH_PREVIEW_2025_03_11("web_search_preview_2025_03_11")
-
-
Optional<List<SearchContentType>> searchContentTypes-
TEXT("text") -
IMAGE("image")
-
-
Optional<SearchContextSize> searchContextSizeHigh level guidance for the amount of context window space to use for the search. One of
low, medium, or high. medium is the default.-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe user's location.
-
JsonValue; type "approximate"constantThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
Optional<String> cityFree text input for the city of the user, e.g.
San Francisco. -
Optional<String> countryThe two-letter ISO country code of the user, e.g.
US. -
Optional<String> regionFree text input for the region of the user, e.g.
California. -
Optional<String> timezoneThe IANA timezone of the user, e.g.
America/Los_Angeles.
-
-
-
class ApplyPatchTool:Allows the assistant to create, delete, or update files using unified diffs.
-
JsonValue; type "apply_patch"constantThe type of the tool. Always
apply_patch.APPLY_PATCH("apply_patch")
-
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
-
EvalApiError errorAn object representing an error response from the Eval API.
-
String codeThe error code.
-
String messageThe error message.
-
-
String evalIdThe identifier of the associated evaluation.
-
Optional<Metadata> metadataA set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and for querying objects via the API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
String modelThe model that is evaluated, if applicable.
-
String nameThe name of the evaluation run.
-
JsonValue; object_ "eval.run"constantThe type of the object. Always "eval.run".
EVAL_RUN("eval.run")
-
List<PerModelUsage> perModelUsageUsage statistics for each model during the evaluation run.
-
long cachedTokensThe number of tokens retrieved from cache.
-
long completionTokensThe number of completion tokens generated.
-
long invocationCountThe number of invocations.
-
String modelNameThe name of the model.
-
long promptTokensThe number of prompt tokens used.
-
long totalTokensThe total number of tokens used.
-
-
List<PerTestingCriteriaResult> perTestingCriteriaResultsResults per testing criteria applied during the evaluation run.
-
long failedNumber of tests failed for this criteria.
-
long passedNumber of tests passed for this criteria.
-
String testingCriteriaA description of the testing criteria.
-
-
String reportUrlThe URL to the rendered evaluation run report on the UI dashboard.
-
ResultCounts resultCountsCounters summarizing the outcomes of the evaluation run.
-
long erroredNumber of output items that resulted in an error.
-
long failedNumber of output items that failed to pass the evaluation.
-
long passedNumber of output items that passed the evaluation.
-
long totalTotal number of executed output items.
-
-
String statusThe status of the evaluation run.
-
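The resultCounts fields above lend themselves to a simple summary figure. The sketch below derives a pass rate from passed and total. It is illustrative only: the class RunSummary and the method passRate are hypothetical names, and the definition (passed divided by total, with 0.0 for an empty run) is an assumption, not something the API reports.

```java
final class RunSummary {
    private RunSummary() {}

    /** Fraction of executed output items that passed; returns 0.0 when nothing was executed. */
    static double passRate(long passed, long total) {
        if (total <= 0) {
            return 0.0;
        }
        return (double) passed / total;
    }

    public static void main(String[] args) {
        // Values as they might be read from resultCounts.passed and resultCounts.total.
        System.out.println(passRate(3, 4));
    }
}
```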
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.JsonValue;
import com.openai.models.evals.runs.CreateEvalJsonlRunDataSource;
import com.openai.models.evals.runs.RunCreateParams;
import com.openai.models.evals.runs.RunCreateResponse;
import java.util.List;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        // Configure the client from environment variables.
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        // Create a run whose JSONL data source is a single inline item.
        RunCreateParams params = RunCreateParams.builder()
            .evalId("eval_id")
            .dataSource(CreateEvalJsonlRunDataSource.builder()
                .fileContentSource(List.of(CreateEvalJsonlRunDataSource.Source.FileContent.Content.builder()
                    .item(CreateEvalJsonlRunDataSource.Source.FileContent.Content.Item.builder()
                        .putAdditionalProperty("foo", JsonValue.from("bar"))
                        .build())
                    .build()))
                .build())
            .build();

        RunCreateResponse run = client.evals().runs().create(params);
    }
}
Response
{
  "id": "id",
  "created_at": 0,
  "data_source": {
    "source": {
      "content": [
        {
          "item": {
            "foo": "bar"
          },
          "sample": {
            "foo": "bar"
          }
        }
      ],
      "type": "file_content"
    },
    "type": "jsonl"
  },
  "error": {
    "code": "code",
    "message": "message"
  },
  "eval_id": "eval_id",
  "metadata": {
    "foo": "string"
  },
  "model": "model",
  "name": "name",
  "object": "eval.run",
  "per_model_usage": [
    {
      "cached_tokens": 0,
      "completion_tokens": 0,
      "invocation_count": 0,
      "model_name": "model_name",
      "prompt_tokens": 0,
      "total_tokens": 0
    }
  ],
  "per_testing_criteria_results": [
    {
      "failed": 0,
      "passed": 0,
      "testing_criteria": "testing_criteria"
    }
  ],
  "report_url": "https://example.com",
  "result_counts": {
    "errored": 0,
    "failed": 0,
    "passed": 0,
    "total": 0
  },
  "status": "status"
}
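The metadata limits described for the run object (16 pairs, 64-character keys, 512-character values) can be verified locally before a request is made. This is a minimal sketch under stated assumptions: MetadataCheck and isValid are hypothetical names, not SDK members, and length is measured with String.length(), which counts UTF-16 code units and may differ from how the service counts characters.

```java
import java.util.Map;

final class MetadataCheck {
    private MetadataCheck() {}

    private static final int MAX_PAIRS = 16;
    private static final int MAX_KEY_LENGTH = 64;
    private static final int MAX_VALUE_LENGTH = 512;

    /** Returns true if the map respects the documented pair-count and length limits. */
    static boolean isValid(Map<String, String> metadata) {
        if (metadata.size() > MAX_PAIRS) {
            return false;
        }
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            // Assumption: length is counted in UTF-16 code units.
            if (entry.getKey().length() > MAX_KEY_LENGTH) {
                return false;
            }
            if (entry.getValue().length() > MAX_VALUE_LENGTH) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isValid(Map.of("foo", "string")));
    }
}
```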
Get an eval run
RunRetrieveResponse evals().runs().retrieve(RunRetrieveParams params, RequestOptions requestOptions = RequestOptions.none())
get /evals/{eval_id}/runs/{run_id}
Get an evaluation run by ID.
Parameters
-
RunRetrieveParams params-
String evalId -
Optional<String> runId
-
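For orientation, the HTTP request behind this method can be sketched with the JDK's java.net.http types. The sketch assumes the standard base URL https://api.openai.com/v1 and bearer-token authentication. EvalRunRequest and build are hypothetical names, the IDs are assumed to need no URL encoding, and the request is only constructed here, not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class EvalRunRequest {
    private EvalRunRequest() {}

    // Assumption: the default public API base URL.
    private static final String BASE_URL = "https://api.openai.com/v1";

    /** Builds (but does not send) a GET request for /evals/{eval_id}/runs/{run_id}. */
    static HttpRequest build(String evalId, String runId, String apiKey) {
        URI uri = URI.create(BASE_URL + "/evals/" + evalId + "/runs/" + runId);
        return HttpRequest.newBuilder(uri)
                .header("Authorization", "Bearer " + apiKey)
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("eval_123", "run_456", "sk-placeholder");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

In practice the SDK's retrieve method handles this request, along with retries and response parsing, so a hand-built request is mainly useful for debugging.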
Returns
-
class RunRetrieveResponse:A schema representing an evaluation run.
-
String idUnique identifier for the evaluation run.
-
long createdAtUnix timestamp (in seconds) when the evaluation run was created.
-
DataSource dataSourceInformation about the run's data source.
-
class CreateEvalJsonlRunDataSource:A JsonlRunDataSource object that specifies a JSONL file that matches the eval
-
Source sourceDetermines what populates the
itemnamespace in the data source.-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
-
JsonValue; type "jsonl"constantThe type of data source. Always
jsonl.JSONL("jsonl")
-
-
class CreateEvalCompletionsRunDataSource:A CompletionsRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the
itemnamespace in this run's data source.-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class StoredCompletions:A StoredCompletionsRunDataSource configuration describing a set of filters
-
JsonValue; type "stored_completions"constantThe type of source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Long> createdAfterAn optional Unix timestamp to filter items created after this time.
-
Optional<Long> createdBeforeAn optional Unix timestamp to filter items created before this time.
-
Optional<Long> limitAn optional maximum number of items to return.
-
Optional<Metadata> metadataA set of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and for querying objects via the API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> modelAn optional model to filter by (e.g., 'gpt-4o').
-
-
-
Type typeThe type of run data source. Always
completions.COMPLETIONS("completions")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (i.e.,
item.input_trajectory), or a template with variable references to the item namespace.-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the
item namespace, e.g. {{item.name}}.-
class EasyInputMessage:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the
developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.-
Content contentText, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses.
-
String -
List<ResponseInputContent>-
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class ResponseInputImage:An image input to the model. Learn about image inputs.
-
Detail detailThe detail level of the image to be sent to the model. One of
high,low,auto, ororiginal. Defaults toauto.-
LOW("low") -
HIGH("high") -
AUTO("auto") -
ORIGINAL("original")
-
-
JsonValue; type "input_image"constantThe type of the input item. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> imageUrlThe URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
-
-
class ResponseInputFile:A file input to the model.
-
JsonValue; type "input_file"constantThe type of the input item. Always
input_file.INPUT_FILE("input_file")
-
Optional<Detail> detailThe detail level of the file to be sent to the model. Use
lowfor the default rendering behavior, orhighto render the file at higher quality. Defaults tolow.-
LOW("low") -
HIGH("high")
-
-
Optional<String> fileDataThe content of the file to be sent to the model.
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> fileUrlThe URL of the file to be sent to the model.
-
Optional<String> filenameThe name of the file to be sent to the model.
-
-
-
-
Role roleThe role of the message input. One of
user,assistant,system, ordeveloper.-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Phase> phaseLabels an
assistant message as intermediate commentary (commentary) or the final answer (final_answer). For models like gpt-5.3-codex and beyond, when sending follow-up requests, preserve and resend phase on all assistant messages, because dropping it can degrade performance. Not used for user messages.-
COMMENTARY("commentary") -
FINAL_ANSWER("final_answer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the
developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are
mp3andwav.-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of
user,assistant,system, ordeveloper.-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the
item namespace, e.g. "item.input_trajectory" -
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
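To make the idea of template variable references concrete, the sketch below substitutes {{item.<name>}} placeholders in a message string from a map of item fields. It only illustrates the concept: the actual resolution is performed by the Evals service, and the placeholder grammar used here (optional whitespace, flat field names, unknown references left unchanged) is an assumption. TemplateSketch and render are hypothetical names.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TemplateSketch {
    private TemplateSketch() {}

    // Assumption: references look like {{item.field}} with a flat field name.
    private static final Pattern ITEM_REF =
            Pattern.compile("\\{\\{\\s*item\\.([A-Za-z0-9_]+)\\s*\\}\\}");

    /** Replaces each {{item.field}} with the matching map value; unknown fields are left as written. */
    static String render(String template, Map<String, String> item) {
        Matcher matcher = ITEM_REF.matcher(template);
        return matcher.replaceAll(result -> {
            String value = item.get(result.group(1));
            // quoteReplacement keeps '$' and '\' in values from being treated as group references.
            return Matcher.quoteReplacement(value != null ? value : result.group());
        });
    }

    public static void main(String[] args) {
        System.out.println(render("Summarize: {{item.name}}", Map.of("name", "quarterly report")));
    }
}
```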
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are
none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1. -
All models before gpt-5.1 default to medium reasoning effort, and do not support none. -
The gpt-5-pro model defaults to (and only supports) high reasoning effort. -
xhigh is supported for all models after gpt-5.1-codex-max. -
NONE("none") -
MINIMAL("minimal") -
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
XHIGH("xhigh")
-
-
Optional<ResponseFormat> responseFormatAn object specifying the format that the model must output.
Setting to
{ "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide. Setting to
{ "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.-
class ResponseFormatText:Default response format. Used to generate text responses.
-
JsonValue; type "text"constantThe type of response format being defined. Always
text.TEXT("text")
-
-
class ResponseFormatJsonSchema:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
JsonSchema jsonSchemaStructured Outputs configuration options, including a JSON Schema.
-
String nameThe name of the response format. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Schema> schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the
schemafield. Only a subset of JSON Schema is supported whenstrictistrue. To learn more, read the Structured Outputs guide.
-
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using
json_schemais recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.-
JsonValue; type "json_object"constantThe type of response format being defined. Always
json_object.JSON_OBJECT("json_object")
-
-
-
Optional<Long> seedA seed value to initialize the randomness during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<List<ChatCompletionFunctionTool>> toolsA list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
-
FunctionDefinition function-
String nameThe name of the function to be called. May contain only a-z, A-Z, 0-9, underscores, and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the function does, used by the model to choose when and how to call the function.
-
Optional<FunctionParameters> parametersThe parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
Omitting
parameters defines a function with an empty parameter list. -
Optional<Boolean> strictWhether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the
parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
-
-
JsonValue; type "function"constantThe type of the tool. Currently, only
functionis supported.FUNCTION("function")
-
-
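The naming rule for function tools (a-z, A-Z, 0-9, underscores, and dashes, at most 64 characters) maps directly onto a regular expression. The following sketch shows one way to check a name before building a request. ToolNameCheck and isValidName are hypothetical names, and rejecting null and empty names is an assumption, since the reference does not state a minimum length.

```java
import java.util.regex.Pattern;

final class ToolNameCheck {
    private ToolNameCheck() {}

    // Letters, digits, underscores, and dashes; 1 to 64 characters (the lower bound is an assumption).
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    /** Returns true if the name uses only the documented characters and is at most 64 long. */
    static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidName("get_weather"));
        System.out.println(isValidName("get weather"));
    }
}
```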
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
class Responses:A ResponsesRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the
itemnamespace in this run's data source.-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class InnerResponses:An EvalResponsesSource object describing a run data source configuration.
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<Long> createdAfterOnly include items created after this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<Long> createdBeforeOnly include items created before this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<String> instructionsSearchOptional string to search the 'instructions' field. This is a query parameter used to select responses.
-
Optional<JsonValue> metadataMetadata filter for the responses. This is a query parameter used to select responses.
-
Optional<String> modelThe name of the model to find responses for. This is a query parameter used to select responses.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are
none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
- gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
- All models before gpt-5.1 default to medium reasoning effort, and do not support none.
- The gpt-5-pro model defaults to (and only supports) high reasoning effort.
- xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Double> temperatureSampling temperature. This is a query parameter used to select responses.
-
Optional<List<String>> toolsList of tool names. This is a query parameter used to select responses.
-
Optional<Double> topPNucleus sampling parameter. This is a query parameter used to select responses.
-
Optional<List<String>> usersList of user identifiers. This is a query parameter used to select responses.
-
-
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (i.e.,
item.input_trajectory), or a template with variable references to the item namespace.-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the
item namespace, e.g. {{item.name}}.-
class ChatMessage:-
String contentThe content of the message.
-
String roleThe role of the message (e.g. "system", "assistant", "user").
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the
developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high,low, orauto. Defaults toauto.
-
-
class ResponseInputAudio:An audio input to the model.
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText -
InputImage -
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the item namespace, e.g. "item.name".
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
- gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
- All models before gpt-5.1 default to medium reasoning effort, and do not support none.
- The gpt-5-pro model defaults to (and only supports) high reasoning effort.
- xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Long> seedA seed value to initialize the randomness during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<Text> textConfiguration options for a text response from the model. Can be plain text or structured JSON data. Learn more:
-
Optional<ResponseFormatTextConfig> formatAn object specifying the format that the model must output.
Configuring { "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
The default format is { "type": "text" } with no additional options.
Not recommended for gpt-4o and newer models: Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.-
class ResponseFormatText:Default response format. Used to generate text responses.
-
class ResponseFormatTextJsonSchemaConfig:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Schema schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
-
Optional<List<Tool>> toolsAn array of tools the model may call while generating a response. You can specify which tool to use by setting the tool_choice parameter.
The two categories of tools you can provide the model are:
-
Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities, like web search or file search. Learn more about built-in tools.
-
Function calls (custom tools): Functions that are defined by you, enabling the model to call your own code. Learn more about function calling.
-
class FunctionTool:Defines a function in your own code the model can choose to call. Learn more about function calling.
-
String nameThe name of the function to call.
-
Optional<Parameters> parametersA JSON schema object describing the parameters of the function.
-
Optional<Boolean> strictWhether to enforce strict parameter validation. Default true.
-
JsonValue; type "function"constantThe type of the function tool. Always
function.FUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function is deferred and loaded via tool search.
-
Optional<String> descriptionA description of the function. Used by the model to determine whether or not to call the function.
-
-
class FileSearchTool:A tool that searches for relevant content from uploaded files. Learn more about the file search tool.
-
JsonValue; type "file_search"constantThe type of the file search tool. Always
file_search.FILE_SEARCH("file_search")
-
List<String> vectorStoreIdsThe IDs of the vector stores to search.
-
Optional<Filters> filtersA filter to apply.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
String keyThe key to compare against the value.
-
Type typeSpecifies the comparison operator: eq, ne, gt, gte, lt, lte, in, nin.-
eq: equals -
ne: not equal -
gt: greater than -
gte: greater than or equal -
lt: less than -
lte: less than or equal -
in: in -
nin: not in -
EQ("eq") -
NE("ne") -
GT("gt") -
GTE("gte") -
LT("lt") -
LTE("lte") -
IN("in") -
NIN("nin")
-
-
Value valueThe value to compare against the attribute key; supports string, number, or boolean types.
-
String -
double -
boolean -
List<ComparisonFilterValueItem>-
String -
double
-
-
-
-
class CompoundFilter:Combine multiple filters using "and" or "or".-
List<Filter> filtersArray of filters to combine. Items can be ComparisonFilter or CompoundFilter.-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
JsonValue
-
-
Type typeType of operation: "and" or "or".-
AND("and") -
OR("or")
-
-
-
-
Optional<Long> maxNumResultsThe maximum number of results to return. This number should be between 1 and 50 inclusive.
-
Optional<RankingOptions> rankingOptionsRanking options for search.
-
Optional<HybridSearch> hybridSearchWeights that control how reciprocal rank fusion balances semantic embedding matches versus sparse keyword matches when hybrid search is enabled.
-
double embeddingWeightThe weight of the embedding in the reciprocal ranking fusion.
-
double textWeightThe weight of the text in the reciprocal ranking fusion.
-
-
Optional<Ranker> rankerThe ranker to use for the file search.
-
AUTO("auto") -
DEFAULT_2024_11_15("default-2024-11-15")
-
-
Optional<Double> scoreThresholdThe score threshold for the file search, a number between 0 and 1. Numbers closer to 1 will attempt to return only the most relevant results, but may return fewer results.
-
-
-
class ComputerTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
JsonValue; type "computer"constantThe type of the computer tool. Always
computer.COMPUTER("computer")
-
-
class ComputerUsePreviewTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
long displayHeightThe height of the computer display.
-
long displayWidthThe width of the computer display.
-
Environment environmentThe type of computer environment to control.
-
WINDOWS("windows") -
MAC("mac") -
LINUX("linux") -
UBUNTU("ubuntu") -
BROWSER("browser")
-
-
JsonValue; type "computer_use_preview"constantThe type of the computer use tool. Always
computer_use_preview.COMPUTER_USE_PREVIEW("computer_use_preview")
-
-
class WebSearchTool:Search the Internet for sources related to the prompt. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of web_search or web_search_2025_08_26.-
WEB_SEARCH("web_search") -
WEB_SEARCH_2025_08_26("web_search_2025_08_26")
-
-
Optional<Filters> filtersFilters for the search.
-
Optional<List<String>> allowedDomainsAllowed domains for the search. If not provided, all domains are allowed. Subdomains of the provided domains are allowed as well.
Example: ["pubmed.ncbi.nlm.nih.gov"]
-
-
Optional<SearchContextSize> searchContextSizeHigh-level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe approximate location of the user.
-
Optional<String> cityFree text input for the city of the user, e.g. San Francisco.
-
Optional<String> countryThe two-letter ISO country code of the user, e.g. US.
-
Optional<String> regionFree text input for the region of the user, e.g. California.
-
Optional<String> timezoneThe IANA timezone of the user, e.g. America/Los_Angeles.
-
Optional<Type> typeThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
-
-
Mcp-
String serverLabelA label for this MCP server, used to identify it in tool calls.
-
JsonValue; type "mcp"constantThe type of the MCP tool. Always
mcp.MCP("mcp")
-
Optional<AllowedTools> allowedToolsList of allowed tool names or a filter object.
-
List<String> -
class McpToolFilter:A filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with readOnlyHint, it will match this filter.
-
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
Optional<String> authorizationAn OAuth access token that can be used with a remote MCP server, either with a custom MCP server URL or a service connector. Your application must handle the OAuth authorization flow and provide the token here.
-
Optional<ConnectorId> connectorIdIdentifier for service connectors, like those available in ChatGPT. One of server_url, connector_id, or tunnel_id must be provided. Learn more about service connectors here.
Currently supported connector_id values are:
-
Dropbox: connector_dropbox
-
Gmail: connector_gmail
-
Google Calendar: connector_googlecalendar
-
Google Drive: connector_googledrive
-
Microsoft Teams: connector_microsoftteams
-
Outlook Calendar: connector_outlookcalendar
-
Outlook Email: connector_outlookemail
-
SharePoint: connector_sharepoint
-
CONNECTOR_DROPBOX("connector_dropbox") -
CONNECTOR_GMAIL("connector_gmail") -
CONNECTOR_GOOGLECALENDAR("connector_googlecalendar") -
CONNECTOR_GOOGLEDRIVE("connector_googledrive") -
CONNECTOR_MICROSOFTTEAMS("connector_microsoftteams") -
CONNECTOR_OUTLOOKCALENDAR("connector_outlookcalendar") -
CONNECTOR_OUTLOOKEMAIL("connector_outlookemail") -
CONNECTOR_SHAREPOINT("connector_sharepoint")
-
-
Optional<Boolean> deferLoadingWhether this MCP tool is deferred and discovered via tool search.
-
Optional<Headers> headersOptional HTTP headers to send to the MCP server. Use for authentication or other purposes.
-
Optional<RequireApproval> requireApprovalSpecify which of the MCP server's tools require approval.
-
class McpToolApprovalFilter:Specify which of the MCP server's tools require approval. Can be always, never, or a filter object associated with tools that require approval.-
Optional<Always> alwaysA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with readOnlyHint, it will match this filter.
-
Optional<List<String>> toolNamesList of allowed tool names.
-
-
Optional<Never> neverA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with readOnlyHint, it will match this filter.
-
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
enum McpToolApprovalSetting:Specify a single approval policy for all tools. One of always or never. When set to always, all tools will require approval. When set to never, no tools will require approval.-
ALWAYS("always") -
NEVER("never")
-
-
-
Optional<String> serverDescriptionOptional description of the MCP server, used to provide more context.
-
Optional<String> serverUrlThe URL for the MCP server. One of server_url, connector_id, or tunnel_id must be provided.
-
Optional<String> tunnelIdThe Secure MCP Tunnel ID to use instead of a direct server URL. One of server_url, connector_id, or tunnel_id must be provided.
-
-
CodeInterpreter-
Container containerThe code interpreter container. Can be a container ID or an object that specifies uploaded file IDs to make available to your code, along with an optional memory_limit setting.-
String -
class CodeInterpreterToolAuto:Configuration for a code interpreter container. Optionally specify the IDs of the files to run the code on.
-
JsonValue; type "auto"constantAlways
auto.AUTO("auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the code interpreter container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled:-
JsonValue; type "disabled"constantDisable outbound network access. Always
disabled.DISABLED("disabled")
-
-
class ContainerNetworkPolicyAllowlist:-
List<String> allowedDomainsA list of allowed domains when type is allowlist.
-
JsonValue; type "allowlist"constantAllow outbound network access only to specified domains. Always
allowlist.ALLOWLIST("allowlist")
-
Optional<List<ContainerNetworkPolicyDomainSecret>> domainSecretsOptional domain-scoped secrets for allowlisted domains.
-
String domainThe domain associated with the secret.
-
String nameThe name of the secret to inject for the domain.
-
String valueThe secret value to inject for the domain.
-
-
-
-
-
-
JsonValue; type "code_interpreter"constantThe type of the code interpreter tool. Always
code_interpreter.CODE_INTERPRETER("code_interpreter")
-
-
ImageGeneration-
JsonValue; type "image_generation"constantThe type of the image generation tool. Always
image_generation.IMAGE_GENERATION("image_generation")
-
Optional<Action> actionWhether to generate a new image or edit an existing image. Default: auto.-
GENERATE("generate") -
EDIT("edit") -
AUTO("auto")
-
-
Optional<Background> backgroundSets the transparency of the background of the generated image(s). This parameter is only supported for GPT image models that support transparent backgrounds. Must be one of transparent, opaque, or auto (default value). When auto is used, the model will automatically determine the best background for the image.
gpt-image-2 and gpt-image-2-2026-04-21 do not support transparent backgrounds. Requests with background set to transparent will return an error for these models; use opaque or auto instead.
If transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp.-
TRANSPARENT("transparent") -
OPAQUE("opaque") -
AUTO("auto")
-
-
Optional<InputFidelity> inputFidelityControls how much effort the model will exert to match the style and features, especially facial features, of input images. This parameter is supported only for gpt-image-1, gpt-image-1.5, and later models; it is not supported for gpt-image-1-mini. Supports high and low. Defaults to low.-
HIGH("high") -
LOW("low")
-
-
Optional<InputImageMask> inputImageMaskOptional mask for inpainting. Contains image_url (string, optional) and file_id (string, optional).-
Optional<String> fileIdFile ID for the mask image.
-
Optional<String> imageUrlBase64-encoded mask image.
-
-
Optional<Model> modelThe image generation model to use. Default: gpt-image-1.-
GPT_IMAGE_1("gpt-image-1") -
GPT_IMAGE_1_MINI("gpt-image-1-mini") -
GPT_IMAGE_2("gpt-image-2") -
GPT_IMAGE_2_2026_04_21("gpt-image-2-2026-04-21") -
GPT_IMAGE_1_5("gpt-image-1.5") -
CHATGPT_IMAGE_LATEST("chatgpt-image-latest")
-
-
Optional<Moderation> moderationModeration level for the generated image. Default: auto.-
AUTO("auto") -
LOW("low")
-
-
Optional<Long> outputCompressionCompression level for the output image. Default: 100.
-
Optional<OutputFormat> outputFormatThe output format of the generated image. One of png, webp, or jpeg. Default: png.-
PNG("png") -
WEBP("webp") -
JPEG("jpeg")
-
-
Optional<Long> partialImagesNumber of partial images to generate in streaming mode, from 0 (default value) to 3.
-
Optional<Quality> qualityThe quality of the generated image. One of low, medium, high, or auto. Default: auto.-
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
AUTO("auto")
-
-
Optional<Size> sizeThe size of the generated images. For gpt-image-2 and gpt-image-2-2026-04-21, arbitrary resolutions are supported as WIDTHxHEIGHT strings, for example 1536x864. Width and height must both be divisible by 16 and the requested aspect ratio must be between 1:3 and 3:1. Resolutions above 2560x1440 are experimental, and the maximum supported resolution is 3840x2160. The requested size must also satisfy the model's current pixel and edge limits. The standard sizes 1024x1024, 1536x1024, and 1024x1536 are supported by the GPT image models; auto is supported for models that allow automatic sizing. For dall-e-2, use one of 256x256, 512x512, or 1024x1024. For dall-e-3, use one of 1024x1024, 1792x1024, or 1024x1792.-
_1024X1024("1024x1024") -
_1024X1536("1024x1536") -
_1536X1024("1536x1024") -
AUTO("auto")
-
-
-
JsonValue;-
JsonValue; type "local_shell"constantThe type of the local shell tool. Always
local_shell.LOCAL_SHELL("local_shell")
-
-
class FunctionShellTool:A tool that allows the model to execute shell commands.
-
JsonValue; type "shell"constantThe type of the shell tool. Always
shell.SHELL("shell")
-
Optional<Environment> environment-
class ContainerAuto:-
JsonValue; type "container_auto"constantAutomatically creates a container for this request
CONTAINER_AUTO("container_auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled: -
class ContainerNetworkPolicyAllowlist:
-
-
Optional<List<Skill>> skillsAn optional list of skills referenced by id or inline data.
-
class SkillReference:-
String skillIdThe ID of the referenced skill.
-
JsonValue; type "skill_reference"constantReferences a skill created with the /v1/skills endpoint.
SKILL_REFERENCE("skill_reference")
-
Optional<String> versionOptional skill version. Use a positive integer or 'latest'. Omit for default.
-
-
class InlineSkill:-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
InlineSkillSource sourceInline skill payload
-
String dataBase64-encoded skill zip bundle.
-
JsonValue; mediaType "application/zip"constantThe media type of the inline skill payload. Must be
application/zip.APPLICATION_ZIP("application/zip")
-
JsonValue; type "base64"constantThe type of the inline skill source. Must be
base64.BASE64("base64")
-
-
JsonValue; type "inline"constantDefines an inline skill for this request.
INLINE("inline")
-
-
-
-
class LocalEnvironment:-
JsonValue; type "local"constantUse a local computer environment.
LOCAL("local")
-
Optional<List<LocalSkill>> skillsAn optional list of skills.
-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
String pathThe path to the directory containing the skill.
-
-
-
class ContainerReference:-
String containerIdThe ID of the referenced container.
-
JsonValue; type "container_reference"constantReferences a container created with the /v1/containers endpoint
CONTAINER_REFERENCE("container_reference")
-
-
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools.
-
String nameThe name of the custom tool, used to identify it in tool calls.
-
JsonValue; type "custom"constantThe type of the custom tool. Always
custom.CUSTOM("custom")
-
Optional<Boolean> deferLoadingWhether this tool should be deferred and discovered via tool search.
-
Optional<String> descriptionOptional description of the custom tool, used to provide more context.
-
Optional<CustomToolInputFormat> formatThe input format for the custom tool. Default is unconstrained text.
-
JsonValue;-
JsonValue; type "text"constantUnconstrained text format. Always
text.TEXT("text")
-
-
Grammar-
String definitionThe grammar definition.
-
Syntax syntaxThe syntax of the grammar definition. One of lark or regex.-
LARK("lark") -
REGEX("regex")
-
-
JsonValue; type "grammar"constantGrammar format. Always
grammar.GRAMMAR("grammar")
-
-
-
-
class NamespaceTool:Groups function/custom tools under a shared namespace.
-
String descriptionA description of the namespace shown to the model.
-
String nameThe namespace name used in tool calls (for example, crm).
-
List<Tool> toolsThe function/custom tools available inside this namespace.
-
class Function:-
String name -
JsonValue; type "function"constantFUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function should be deferred and discovered via tool search.
-
Optional<String> description -
Optional<JsonValue> parameters -
Optional<Boolean> strict
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools.
-
-
JsonValue; type "namespace"constantThe type of the tool. Always
namespace.NAMESPACE("namespace")
-
-
class ToolSearchTool:Hosted or BYOT tool search configuration for deferred tools.
-
JsonValue; type "tool_search"constantThe type of the tool. Always
tool_search.TOOL_SEARCH("tool_search")
-
Optional<String> descriptionDescription shown to the model for a client-executed tool search tool.
-
Optional<Execution> executionWhether tool search is executed by the server or by the client.
-
SERVER("server") -
CLIENT("client")
-
-
Optional<JsonValue> parametersParameter schema for a client-executed tool search tool.
-
-
class WebSearchPreviewTool:This tool searches the web for relevant results to use in a response. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of web_search_preview or web_search_preview_2025_03_11.-
WEB_SEARCH_PREVIEW("web_search_preview") -
WEB_SEARCH_PREVIEW_2025_03_11("web_search_preview_2025_03_11")
-
-
Optional<List<SearchContentType>> searchContentTypes-
TEXT("text") -
IMAGE("image")
-
-
Optional<SearchContextSize> searchContextSizeHigh-level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe user's location.
-
JsonValue; type "approximate"constantThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
Optional<String> cityFree text input for the city of the user, e.g. San Francisco.
-
Optional<String> countryThe two-letter ISO country code of the user, e.g. US.
-
Optional<String> regionFree text input for the region of the user, e.g. California.
-
Optional<String> timezoneThe IANA timezone of the user, e.g. America/Los_Angeles.
-
-
-
class ApplyPatchTool:Allows the assistant to create, delete, or update files using unified diffs.
-
JsonValue; type "apply_patch"constantThe type of the tool. Always
apply_patch.APPLY_PATCH("apply_patch")
-
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
-
EvalApiError errorAn object representing an error response from the Eval API.
-
String codeThe error code.
-
String messageThe error message.
-
-
String evalIdThe identifier of the associated evaluation.
-
Optional<Metadata> metadataSet of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
String modelThe model that is evaluated, if applicable.
-
String nameThe name of the evaluation run.
-
JsonValue; object_ "eval.run"constantThe type of the object. Always "eval.run".
EVAL_RUN("eval.run")
-
List<PerModelUsage> perModelUsageUsage statistics for each model during the evaluation run.
-
long cachedTokensThe number of tokens retrieved from cache.
-
long completionTokensThe number of completion tokens generated.
-
long invocationCountThe number of invocations.
-
String modelNameThe name of the model.
-
long promptTokensThe number of prompt tokens used.
-
long totalTokensThe total number of tokens used.
-
-
List<PerTestingCriteriaResult> perTestingCriteriaResultsResults per testing criteria applied during the evaluation run.
-
long failedNumber of tests failed for this criteria.
-
long passedNumber of tests passed for this criteria.
-
String testingCriteriaA description of the testing criteria.
-
-
String reportUrlThe URL to the rendered evaluation run report on the UI dashboard.
-
ResultCounts resultCountsCounters summarizing the outcomes of the evaluation run.
-
long erroredNumber of output items that resulted in an error.
-
long failedNumber of output items that failed to pass the evaluation.
-
long passedNumber of output items that passed the evaluation.
-
long totalTotal number of executed output items.
-
-
String statusThe status of the evaluation run.
-
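The metadata limits stated in the schema above (16 key-value pairs, keys of at most 64 characters, values of at most 512 characters) can be checked on the client before a request is sent. The following is a minimal sketch, not part of the SDK: the class MetadataLimits and its validate method are hypothetical names, the sketch assumes that 16 is an upper bound on the number of pairs, and it measures length with String.length(), which counts UTF-16 code units and may differ from how the API counts characters.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the OpenAI Java SDK.
public final class MetadataLimits {
    // Limits taken from the metadata field description.
    // Treating 16 as a maximum is an assumption.
    static final int MAX_PAIRS = 16;
    static final int MAX_KEY_LENGTH = 64;
    static final int MAX_VALUE_LENGTH = 512;

    private MetadataLimits() {}

    /** Returns a list of problems; an empty list means the map fits the limits. */
    public static List<String> validate(Map<String, String> metadata) {
        List<String> problems = new ArrayList<>();
        if (metadata.size() > MAX_PAIRS) {
            problems.add("too many pairs: " + metadata.size());
        }
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            // String.length() counts UTF-16 code units, an approximation of "characters".
            if (entry.getKey().length() > MAX_KEY_LENGTH) {
                problems.add("key too long: " + entry.getKey());
            }
            if (entry.getValue().length() > MAX_VALUE_LENGTH) {
                problems.add("value too long for key: " + entry.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("usecase", "chatbot");
        metadata.put("prompt-version", "v2");

        List<String> problems = validate(metadata);
        System.out.println(problems.isEmpty() ? "metadata fits the limits" : problems.toString());
    }
}
```

Returning a list of problems instead of throwing on the first one lets a caller report every violation at once; the server remains the authority on what is accepted.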
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.runs.RunRetrieveParams;
import com.openai.models.evals.runs.RunRetrieveResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        // Build the client from environment variables (for example OPENAI_API_KEY).
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        // Identify the run by its parent eval ID and its own run ID.
        RunRetrieveParams params = RunRetrieveParams.builder()
            .evalId("eval_id")
            .runId("run_id")
            .build();

        RunRetrieveResponse run = client.evals().runs().retrieve(params);
    }
}
Response
{
  "id": "id",
  "created_at": 0,
  "data_source": {
    "source": {
      "content": [
        {
          "item": {
            "foo": "bar"
          },
          "sample": {
            "foo": "bar"
          }
        }
      ],
      "type": "file_content"
    },
    "type": "jsonl"
  },
  "error": {
    "code": "code",
    "message": "message"
  },
  "eval_id": "eval_id",
  "metadata": {
    "foo": "string"
  },
  "model": "model",
  "name": "name",
  "object": "eval.run",
  "per_model_usage": [
    {
      "cached_tokens": 0,
      "completion_tokens": 0,
      "invocation_count": 0,
      "model_name": "model_name",
      "prompt_tokens": 0,
      "total_tokens": 0
    }
  ],
  "per_testing_criteria_results": [
    {
      "failed": 0,
      "passed": 0,
      "testing_criteria": "testing_criteria"
    }
  ],
  "report_url": "https://example.com",
  "result_counts": {
    "errored": 0,
    "failed": 0,
    "passed": 0,
    "total": 0
  },
  "status": "status"
}
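The result_counts object in the response above is often reduced to a single pass rate. The following is a minimal sketch of that arithmetic, independent of the SDK: the class ResultCountsSummary and its passRate method are hypothetical names, and the choice to divide by total (instead of, say, passed plus failed) and to return 0.0 for an empty run are assumptions, since the reference does not define a pass rate.

```java
// Hypothetical helper, not part of the OpenAI Java SDK.
public final class ResultCountsSummary {
    private ResultCountsSummary() {}

    /**
     * Fraction of executed output items that passed.
     * Returns 0.0 when total is zero, as in the placeholder response,
     * to avoid dividing by zero.
     */
    public static double passRate(long passed, long total) {
        if (total <= 0) {
            return 0.0;
        }
        return (double) passed / total;
    }

    public static void main(String[] args) {
        // Illustrative values; a real run would supply its result_counts
        // fields (passed and total) here.
        long passed = 7;
        long total = 10;
        System.out.println("pass rate: " + passRate(passed, total));
    }
}
```

Errored items count against the rate under this definition because they are included in total; a caller who wants to exclude them could pass passed plus failed as the denominator instead.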
Cancel eval run
RunCancelResponse evals().runs().cancel(RunCancelParams params, RequestOptions requestOptions = RequestOptions.none())
post /evals/{eval_id}/runs/{run_id}
Cancel an ongoing evaluation run.
Parameters
-
RunCancelParams params-
String evalId -
Optional<String> runId
-
Returns
-
class RunCancelResponse:A schema representing an evaluation run.
-
String idUnique identifier for the evaluation run.
-
long createdAtUnix timestamp (in seconds) when the evaluation run was created.
-
DataSource dataSourceInformation about the run's data source.
-
class CreateEvalJsonlRunDataSource:A JsonlRunDataSource object that specifies a JSONL file that matches the eval.
-
Source sourceDetermines what populates the item namespace in the data source.-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
-
JsonValue; type "jsonl"constantThe type of data source. Always
jsonl.JSONL("jsonl")
-
-
class CreateEvalCompletionsRunDataSource:A CompletionsRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the item namespace in this run's data source.-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class StoredCompletions:A StoredCompletionsRunDataSource configuration describing a set of filters
-
JsonValue; type "stored_completions"constantThe type of source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Long> createdAfterAn optional Unix timestamp to filter items created after this time.
-
Optional<Long> createdBeforeAn optional Unix timestamp to filter items created before this time.
-
Optional<Long> limitAn optional maximum number of items to return.
-
Optional<Metadata> metadataSet of up to 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> modelAn optional model to filter by (e.g., 'gpt-4o').
-
-
-
Type typeThe type of run data source. Always
completions.COMPLETIONS("completions")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can be either a reference to a prebuilt trajectory (i.e., item.input_trajectory) or a template with variable references to the item namespace.-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g. {{item.name}}.-
class EasyInputMessage:A message input to the model with a role indicating instruction-following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.-
Content contentText, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses.
-
String -
List<ResponseInputContent>-
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class ResponseInputImage:An image input to the model. Learn about image inputs.
-
Detail detailThe detail level of the image to be sent to the model. One of high, low, auto, or original. Defaults to auto.-
LOW("low") -
HIGH("high") -
AUTO("auto") -
ORIGINAL("original")
-
-
JsonValue; type "input_image"constantThe type of the input item. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> imageUrlThe URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
-
-
class ResponseInputFile:A file input to the model.
-
JsonValue; type "input_file"constantThe type of the input item. Always
input_file.INPUT_FILE("input_file")
-
Optional<Detail> detailThe detail level of the file to be sent to the model. Use low for the default rendering behavior, or high to render the file at higher quality. Defaults to low.-
LOW("low") -
HIGH("high")
-
-
Optional<String> fileDataThe content of the file to be sent to the model.
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> fileUrlThe URL of the file to be sent to the model.
-
Optional<String> filenameThe name of the file to be sent to the model.
-
-
-
-
Role roleThe role of the message input. One of
user, assistant, system, or developer.-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Phase> phaseLabels an
assistant message as intermediate commentary (commentary) or the final answer (final_answer). For models like gpt-5.3-codex and beyond, when sending follow-up requests, preserve and resend phase on all assistant messages, since dropping it can degrade performance. Not used for user messages.-
COMMENTARY("commentary") -
FINAL_ANSWER("final_answer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the
developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are
mp3 and wav.-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of
user, assistant, system, or developer.-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the
item namespace. For example, "item.input_trajectory" -
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
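The template form of inputMessages relies on variable references such as {{item.name}} being filled in from each row of the data source. The exact substitution rules applied by the service are not described here, so the following is only a minimal local sketch of the idea, useful for previewing a template before creating a run. The class name, the regular expression, and the choice to leave unknown keys untouched are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ItemTemplateSketch {
    // Matches {{item.<key>}}, allowing optional spaces inside the braces.
    private static final Pattern VAR =
            Pattern.compile("\\{\\{\\s*item\\.([A-Za-z0-9_]+)\\s*\\}\\}");

    // Replaces each {{item.key}} with the matching value; unknown keys are left as written.
    static String render(String template, Map<String, String> item) {
        Matcher m = VAR.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String value = item.getOrDefault(m.group(1), m.group());
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> item = Map.of("name", "Ada");
        System.out.println(render("Hello, {{item.name}}!", item)); // prints "Hello, Ada!"
    }
}
```

Only flat keys are handled in this sketch; nested references would need a richer lookup.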
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are
none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1. -
All models before gpt-5.1 default to medium reasoning effort, and do not support none. -
The gpt-5-pro model defaults to (and only supports) high reasoning effort. -
xhigh is supported for all models after gpt-5.1-codex-max. -
NONE("none") -
MINIMAL("minimal") -
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
XHIGH("xhigh")
-
-
Optional<ResponseFormat> responseFormatAn object specifying the format that the model must output.
Setting to
{ "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
Setting to
{ "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.-
class ResponseFormatText:Default response format. Used to generate text responses.
-
JsonValue; type "text"constantThe type of response format being defined. Always
text.TEXT("text")
-
-
class ResponseFormatJsonSchema:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
JsonSchema jsonSchemaStructured Outputs configuration options, including a JSON Schema.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Schema> schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the
schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using
json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.-
JsonValue; type "json_object"constantThe type of response format being defined. Always
json_object.JSON_OBJECT("json_object")
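As an illustration of the json_schema response format described above, the sketch below checks a format name against the stated character set and 64-character limit, then assembles the { "type": "json_schema", "json_schema": {...} } payload as a plain string. It uses no SDK classes; the class and method names are hypothetical, the minimum length of one character is an assumption, and the schema is passed in as already-serialized JSON.

```java
import java.util.regex.Pattern;

final class ResponseFormatSketch {
    // Letters, digits, underscores, and dashes; at most 64 characters (minimum of 1 assumed).
    private static final Pattern NAME = Pattern.compile("[A-Za-z0-9_-]{1,64}");

    static boolean isValidName(String name) {
        return name != null && NAME.matcher(name).matches();
    }

    // Builds the json_schema response format payload around an already-serialized schema.
    static String jsonSchemaFormat(String name, String schemaJson, boolean strict) {
        if (!isValidName(name)) {
            throw new IllegalArgumentException("invalid response format name: " + name);
        }
        // The validated name contains no characters that need JSON escaping.
        return "{\"type\":\"json_schema\",\"json_schema\":{\"name\":\"" + name
                + "\",\"strict\":" + strict + ",\"schema\":" + schemaJson + "}}";
    }

    public static void main(String[] args) {
        String schema = "{\"type\":\"object\",\"properties\":{\"answer\":{\"type\":\"string\"}},"
                + "\"required\":[\"answer\"],\"additionalProperties\":false}";
        System.out.println(jsonSchemaFormat("answer_format", schema, true));
    }
}
```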
-
-
-
Optional<Long> seedA seed value used to initialize randomness during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<List<ChatCompletionFunctionTool>> toolsA list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
-
FunctionDefinition function-
String nameThe name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the function does, used by the model to choose when and how to call the function.
-
Optional<FunctionParameters> parametersThe parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
Omitting
parameters defines a function with an empty parameter list. -
Optional<Boolean> strictWhether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the
parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
-
-
JsonValue; type "function"constantThe type of the tool. Currently, only
function is supported. FUNCTION("function")
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
class Responses:A ResponsesRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the
item namespace in this run's data source.-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class InnerResponses:An EvalResponsesSource object describing a run data source configuration.
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<Long> createdAfterOnly include items created after this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<Long> createdBeforeOnly include items created before this timestamp (inclusive). This is a query parameter used to select responses.
-
Optional<String> instructionsSearchOptional string to search the 'instructions' field. This is a query parameter used to select responses.
-
Optional<JsonValue> metadataMetadata filter for the responses. This is a query parameter used to select responses.
-
Optional<String> modelThe name of the model to find responses for. This is a query parameter used to select responses.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are
none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1. -
All models before gpt-5.1 default to medium reasoning effort, and do not support none. -
The gpt-5-pro model defaults to (and only supports) high reasoning effort. -
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Double> temperatureSampling temperature. This is a query parameter used to select responses.
-
Optional<List<String>> toolsList of tool names. This is a query parameter used to select responses.
-
Optional<Double> topPNucleus sampling parameter. This is a query parameter used to select responses.
-
Optional<List<String>> usersList of user identifiers. This is a query parameter used to select responses.
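The createdAfter and createdBefore filters are both described as inclusive. A small sketch of how a caller might compute such a window follows. It assumes the timestamps are Unix seconds, matching the createdAt field documented for evals; the unit is not restated for these two parameters, so treat that as an assumption. The class and method names are hypothetical.

```java
import java.time.Instant;

final class ResponseWindowSketch {
    // Converts an ISO-8601 instant to Unix seconds (assumed unit for createdAfter/createdBefore).
    static long toUnixSeconds(String isoInstant) {
        return Instant.parse(isoInstant).getEpochSecond();
    }

    // Both bounds are inclusive, matching the parameter descriptions.
    static boolean inWindow(long createdAt, long createdAfter, long createdBefore) {
        return createdAt >= createdAfter && createdAt <= createdBefore;
    }

    public static void main(String[] args) {
        long after = toUnixSeconds("2025-01-01T00:00:00Z");
        long before = toUnixSeconds("2025-01-31T23:59:59Z");
        System.out.println(after + " .. " + before);
        System.out.println(inWindow(after, after, before)); // prints "true": the lower bound is inclusive
    }
}
```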
-
-
-
JsonValue; type "responses"constantThe type of run data source. Always
responses.RESPONSES("responses")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (i.e.,
item.input_trajectory), or a template with variable references to the item namespace.-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the
item namespace, e.g. {{item.name}}.-
class ChatMessage:-
String contentThe content of the message.
-
String roleThe role of the message (e.g. "system", "assistant", "user").
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the
developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of
high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText -
InputImage -
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of
user, assistant, system, or developer.-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the
item namespace. For example, "item.name" -
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are
none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1. -
All models before gpt-5.1 default to medium reasoning effort, and do not support none. -
The gpt-5-pro model defaults to (and only supports) high reasoning effort. -
xhigh is supported for all models after gpt-5.1-codex-max.
-
Optional<Long> seedA seed value used to initialize randomness during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<Text> textConfiguration options for a text response from the model. Can be plain text or structured JSON data.
-
Optional<ResponseFormatTextConfig> formatAn object specifying the format that the model must output.
Configuring
{ "type": "json_schema" } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
The default format is
{ "type": "text" } with no additional options.
Not recommended for gpt-4o and newer models:
Setting to
{ "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.-
class ResponseFormatText:Default response format. Used to generate text responses.
-
class ResponseFormatTextJsonSchemaConfig:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Schema schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the
schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using
json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
-
Optional<List<Tool>> toolsAn array of tools the model may call while generating a response. You can specify which tool to use by setting the
tool_choice parameter.
The two categories of tools you can provide the model are:
-
Built-in tools: Tools that are provided by OpenAI that extend the model's capabilities, like web search or file search. Learn more about built-in tools.
-
Function calls (custom tools): Functions that are defined by you, enabling the model to call your own code. Learn more about function calling.
-
class FunctionTool:Defines a function in your own code the model can choose to call. Learn more about function calling.
-
String nameThe name of the function to call.
-
Optional<Parameters> parametersA JSON schema object describing the parameters of the function.
-
Optional<Boolean> strictWhether to enforce strict parameter validation. Default
true. -
JsonValue; type "function"constantThe type of the function tool. Always
function.FUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function is deferred and loaded via tool search.
-
Optional<String> descriptionA description of the function. Used by the model to determine whether or not to call the function.
-
-
class FileSearchTool:A tool that searches for relevant content from uploaded files. Learn more about the file search tool.
-
JsonValue; type "file_search"constantThe type of the file search tool. Always
file_search.FILE_SEARCH("file_search")
-
List<String> vectorStoreIdsThe IDs of the vector stores to search.
-
Optional<Filters> filtersA filter to apply.
-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
String keyThe key to compare against the value.
-
Type typeSpecifies the comparison operator:
eq, ne, gt, gte, lt, lte, in, nin.-
eq: equals -
ne: not equal -
gt: greater than -
gte: greater than or equal -
lt: less than -
lte: less than or equal -
in: in -
nin: not in -
EQ("eq") -
NE("ne") -
GT("gt") -
GTE("gte") -
LT("lt") -
LTE("lte") -
IN("in") -
NIN("nin")
-
-
Value valueThe value to compare against the attribute key; supports string, number, or boolean types.
-
String -
double -
boolean -
List<ComparisonFilterValueItem>-
String -
double
-
-
-
-
class CompoundFilter:Combine multiple filters using
and or or.-
List<Filter> filtersArray of filters to combine. Items can be
ComparisonFilter or CompoundFilter.-
class ComparisonFilter:A filter used to compare a specified attribute key to a given value using a defined comparison operation.
-
JsonValue
-
-
Type typeType of operation:
and or or.-
AND("and") -
OR("or")
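To make the semantics of ComparisonFilter and CompoundFilter more concrete, here is a small local evaluator written against plain maps of attributes. It is not the SDK's implementation and does not reflect how the service evaluates filters internally. The record names, the treatment of missing attributes as non-matching, and the numeric comparison by value are assumptions made for this sketch. It needs Java 16 or later for records.

```java
import java.util.List;
import java.util.Map;

final class FileSearchFilterSketch {
    interface Filter {
        boolean matches(Map<String, Object> attributes);
    }

    // Mirrors ComparisonFilter: key, operator type, and value.
    record Comparison(String key, String type, Object value) implements Filter {
        @Override
        public boolean matches(Map<String, Object> attributes) {
            Object actual = attributes.get(key);
            if (actual == null) {
                return false; // assumption: a missing attribute never matches
            }
            switch (type) {
                case "eq": return same(actual, value);
                case "ne": return !same(actual, value);
                case "in": return contains((List<?>) value, actual);
                case "nin": return !contains((List<?>) value, actual);
                case "gt": return number(actual) > number(value);
                case "gte": return number(actual) >= number(value);
                case "lt": return number(actual) < number(value);
                case "lte": return number(actual) <= number(value);
                default: throw new IllegalArgumentException("unknown operator: " + type);
            }
        }
    }

    // Mirrors CompoundFilter: "and" requires every child to match, "or" requires at least one.
    record Compound(String type, List<Filter> filters) implements Filter {
        @Override
        public boolean matches(Map<String, Object> attributes) {
            if (type.equals("and")) {
                return filters.stream().allMatch(f -> f.matches(attributes));
            }
            if (type.equals("or")) {
                return filters.stream().anyMatch(f -> f.matches(attributes));
            }
            throw new IllegalArgumentException("unknown operation: " + type);
        }
    }

    private static double number(Object o) {
        return ((Number) o).doubleValue();
    }

    // Numbers compare by value so that 2024 and 2024.0 are treated as equal.
    private static boolean same(Object a, Object b) {
        if (a instanceof Number && b instanceof Number) {
            return number(a) == number(b);
        }
        return a.equals(b);
    }

    private static boolean contains(List<?> candidates, Object actual) {
        return candidates.stream().anyMatch(c -> same(actual, c));
    }

    public static void main(String[] args) {
        Filter filter = new Compound("and", List.of(
                new Comparison("category", "eq", "blog"),
                new Comparison("year", "gte", 2024)));
        System.out.println(filter.matches(Map.of("category", "blog", "year", 2025))); // prints "true"
    }
}
```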
-
-
-
-
Optional<Long> maxNumResultsThe maximum number of results to return. This number should be between 1 and 50 inclusive.
-
Optional<RankingOptions> rankingOptionsRanking options for search.
-
Optional<HybridSearch> hybridSearchWeights that control how reciprocal rank fusion balances semantic embedding matches versus sparse keyword matches when hybrid search is enabled.
-
double embeddingWeightThe weight of the embedding in the reciprocal ranking fusion.
-
double textWeightThe weight of the text in the reciprocal ranking fusion.
-
-
Optional<Ranker> rankerThe ranker to use for the file search.
-
AUTO("auto") -
DEFAULT_2024_11_15("default-2024-11-15")
-
-
Optional<Double> scoreThresholdThe score threshold for the file search, a number between 0 and 1. Numbers closer to 1 will attempt to return only the most relevant results, but may return fewer results.
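The hybridSearch weights above control how reciprocal rank fusion balances embedding matches against keyword matches. The exact formula used by the service is not given here; the sketch below uses the commonly cited weighted form, score = embeddingWeight / (k + embeddingRank) + textWeight / (k + textRank), with an assumed smoothing constant k = 60. It is meant only to build intuition for how the two weights shift the ranking.

```java
final class HybridRrfSketch {
    // Smoothing constant from the usual RRF formulation; 60 is a common choice, assumed here.
    static final double K = 60.0;

    // Weighted reciprocal rank fusion of two 1-based ranks.
    static double fuse(double embeddingWeight, double textWeight, int embeddingRank, int textRank) {
        return embeddingWeight / (K + embeddingRank) + textWeight / (K + textRank);
    }

    public static void main(String[] args) {
        double semanticHit = fuse(0.7, 0.3, 1, 10); // ranked 1st by embeddings, 10th by keywords
        double keywordHit = fuse(0.7, 0.3, 10, 1);  // ranked 10th by embeddings, 1st by keywords
        System.out.println(semanticHit > keywordHit); // prints "true": embeddings carry more weight
    }
}
```

Swapping the two weights reverses the outcome, which is the lever these parameters expose.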
-
-
-
class ComputerTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
JsonValue; type "computer"constantThe type of the computer tool. Always
computer.COMPUTER("computer")
-
-
class ComputerUsePreviewTool:A tool that controls a virtual computer. Learn more about the computer tool.
-
long displayHeightThe height of the computer display.
-
long displayWidthThe width of the computer display.
-
Environment environmentThe type of computer environment to control.
-
WINDOWS("windows") -
MAC("mac") -
LINUX("linux") -
UBUNTU("ubuntu") -
BROWSER("browser")
-
-
JsonValue; type "computer_use_preview"constantThe type of the computer use tool. Always
computer_use_preview.COMPUTER_USE_PREVIEW("computer_use_preview")
-
-
class WebSearchTool:Search the Internet for sources related to the prompt. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of
web_search or web_search_2025_08_26.-
WEB_SEARCH("web_search") -
WEB_SEARCH_2025_08_26("web_search_2025_08_26")
-
-
Optional<Filters> filtersFilters for the search.
-
Optional<List<String>> allowedDomainsAllowed domains for the search. If not provided, all domains are allowed. Subdomains of the provided domains are allowed as well.
Example:
["pubmed.ncbi.nlm.nih.gov"]
-
-
Optional<SearchContextSize> searchContextSizeHigh level guidance for the amount of context window space to use for the search. One of
low, medium, or high. medium is the default.-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe approximate location of the user.
-
Optional<String> cityFree text input for the city of the user, e.g.
San Francisco. -
Optional<String> countryThe two-letter ISO country code of the user, e.g.
US. -
Optional<String> regionFree text input for the region of the user, e.g.
California. -
Optional<String> timezoneThe IANA timezone of the user, e.g.
America/Los_Angeles. -
Optional<Type> typeThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
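The allowedDomains filter of the web search tool states that all domains are allowed when the list is not provided, and that subdomains of listed domains are allowed as well. A minimal sketch of that matching rule, as a caller might use to pre-check a host, is shown below. The class name and the case-insensitive comparison are assumptions; the service's own matching may differ in details.

```java
import java.util.List;
import java.util.Locale;

final class AllowedDomainsSketch {
    // An empty allow-list permits every host; otherwise the host must equal a listed
    // domain or be one of its subdomains.
    static boolean isAllowed(String host, List<String> allowedDomains) {
        if (allowedDomains.isEmpty()) {
            return true;
        }
        String h = host.toLowerCase(Locale.ROOT);
        for (String domain : allowedDomains) {
            String d = domain.toLowerCase(Locale.ROOT);
            if (h.equals(d) || h.endsWith("." + d)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> allowed = List.of("pubmed.ncbi.nlm.nih.gov");
        System.out.println(isAllowed("pubmed.ncbi.nlm.nih.gov", allowed));     // prints "true"
        System.out.println(isAllowed("www.pubmed.ncbi.nlm.nih.gov", allowed)); // prints "true"
        System.out.println(isAllowed("ncbi.nlm.nih.gov", allowed));            // prints "false"
    }
}
```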
-
-
-
Mcp-
String serverLabelA label for this MCP server, used to identify it in tool calls.
-
JsonValue; type "mcp"constantThe type of the MCP tool. Always
mcp.MCP("mcp")
-
Optional<AllowedTools> allowedToolsList of allowed tool names or a filter object.
-
List<String> -
class McpToolFilter:A filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
Optional<String> authorizationAn OAuth access token that can be used with a remote MCP server, either with a custom MCP server URL or a service connector. Your application must handle the OAuth authorization flow and provide the token here.
-
Optional<ConnectorId> connectorIdIdentifier for service connectors, like those available in ChatGPT. One of
server_url, connector_id, or tunnel_id must be provided. Learn more about service connectors here.
Currently supported
connector_id values are:-
Dropbox:
connector_dropbox -
Gmail:
connector_gmail -
Google Calendar:
connector_googlecalendar -
Google Drive:
connector_googledrive -
Microsoft Teams:
connector_microsoftteams -
Outlook Calendar:
connector_outlookcalendar -
Outlook Email:
connector_outlookemail -
SharePoint:
connector_sharepoint -
CONNECTOR_DROPBOX("connector_dropbox") -
CONNECTOR_GMAIL("connector_gmail") -
CONNECTOR_GOOGLECALENDAR("connector_googlecalendar") -
CONNECTOR_GOOGLEDRIVE("connector_googledrive") -
CONNECTOR_MICROSOFTTEAMS("connector_microsoftteams") -
CONNECTOR_OUTLOOKCALENDAR("connector_outlookcalendar") -
CONNECTOR_OUTLOOKEMAIL("connector_outlookemail") -
CONNECTOR_SHAREPOINT("connector_sharepoint")
-
-
Optional<Boolean> deferLoadingWhether this MCP tool is deferred and discovered via tool search.
-
Optional<Headers> headersOptional HTTP headers to send to the MCP server. Use for authentication or other purposes.
-
Optional<RequireApproval> requireApprovalSpecify which of the MCP server's tools require approval.
-
class McpToolApprovalFilter:Specify which of the MCP server's tools require approval. Can be
always, never, or a filter object associated with tools that require approval.-
Optional<Always> alwaysA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
Optional<Never> neverA filter object to specify which tools are allowed.
-
Optional<Boolean> readOnlyIndicates whether or not a tool modifies data or is read-only. If an MCP server is annotated with
readOnlyHint, it will match this filter. -
Optional<List<String>> toolNamesList of allowed tool names.
-
-
-
enum McpToolApprovalSetting:Specify a single approval policy for all tools. One of
always or never. When set to always, all tools will require approval. When set to never, no tools will require approval.-
ALWAYS("always") -
NEVER("never")
-
-
-
Optional<String> serverDescriptionOptional description of the MCP server, used to provide more context.
-
Optional<String> serverUrlThe URL for the MCP server. One of
server_url, connector_id, or tunnel_id must be provided. -
Optional<String> tunnelIdThe Secure MCP Tunnel ID to use instead of a direct server URL. One of
server_url, connector_id, or tunnel_id must be provided.
-
-
CodeInterpreter-
Container containerThe code interpreter container. Can be a container ID or an object that specifies uploaded file IDs to make available to your code, along with an optional
memory_limit setting.-
String -
class CodeInterpreterToolAuto:Configuration for a code interpreter container. Optionally specify the IDs of the files to run the code on.
-
JsonValue; type "auto"constantAlways
auto.AUTO("auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the code interpreter container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled:-
JsonValue; type "disabled"constantDisable outbound network access. Always
disabled.DISABLED("disabled")
-
-
class ContainerNetworkPolicyAllowlist:-
List<String> allowedDomainsA list of allowed domains when type is
allowlist. -
JsonValue; type "allowlist"constantAllow outbound network access only to specified domains. Always
allowlist.ALLOWLIST("allowlist")
-
Optional<List<ContainerNetworkPolicyDomainSecret>> domainSecretsOptional domain-scoped secrets for allowlisted domains.
-
String domainThe domain associated with the secret.
-
String nameThe name of the secret to inject for the domain.
-
String valueThe secret value to inject for the domain.
-
-
-
-
-
-
JsonValue; type "code_interpreter"constantThe type of the code interpreter tool. Always
code_interpreter.CODE_INTERPRETER("code_interpreter")
-
-
ImageGeneration-
JsonValue; type "image_generation"constantThe type of the image generation tool. Always
image_generation.IMAGE_GENERATION("image_generation")
-
Optional<Action> actionWhether to generate a new image or edit an existing image. Default:
auto.-
GENERATE("generate") -
EDIT("edit") -
AUTO("auto")
-
-
Optional<Background> backgroundSets the transparency of the background of the generated image(s). This parameter is only supported for GPT image models that support transparent backgrounds. Must be one of
transparent, opaque, or auto (default value). When auto is used, the model will automatically determine the best background for the image.
gpt-image-2 and gpt-image-2-2026-04-21 do not support transparent backgrounds. Requests with background set to transparent will return an error for these models; use opaque or auto instead.
If
transparent, the output format needs to support transparency, so it should be set to either png (default value) or webp.-
TRANSPARENT("transparent") -
OPAQUE("opaque") -
AUTO("auto")
-
-
Optional<InputFidelity> inputFidelityControl how much effort the model will exert to match the style and features, especially facial features, of input images. This parameter is only supported for
gpt-image-1 and gpt-image-1.5 and later models, unsupported for gpt-image-1-mini. Supports high and low. Defaults to low.-
HIGH("high") -
LOW("low")
-
-
Optional<InputImageMask> inputImageMaskOptional mask for inpainting. Contains
image_url (string, optional) and file_id (string, optional).-
Optional<String> fileIdFile ID for the mask image.
-
Optional<String> imageUrlBase64-encoded mask image.
-
-
Optional<Model> modelThe image generation model to use. Default:
gpt-image-1.-
GPT_IMAGE_1("gpt-image-1") -
GPT_IMAGE_1_MINI("gpt-image-1-mini") -
GPT_IMAGE_2("gpt-image-2") -
GPT_IMAGE_2_2026_04_21("gpt-image-2-2026-04-21") -
GPT_IMAGE_1_5("gpt-image-1.5") -
CHATGPT_IMAGE_LATEST("chatgpt-image-latest")
-
-
Optional<Moderation> moderationModeration level for the generated image. Default:
auto.-
AUTO("auto") -
LOW("low")
-
-
Optional<Long> outputCompressionCompression level for the output image. Default: 100.
-
Optional<OutputFormat> outputFormatThe output format of the generated image. One of
png, webp, or jpeg. Default: png.-
PNG("png") -
WEBP("webp") -
JPEG("jpeg")
-
-
Optional<Long> partialImagesNumber of partial images to generate in streaming mode, from 0 (default value) to 3.
-
Optional<Quality> qualityThe quality of the generated image. One of
low, medium, high, or auto. Default: auto.-
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
AUTO("auto")
-
-
Optional<Size> sizeThe size of the generated images. For
gpt-image-2 and gpt-image-2-2026-04-21, arbitrary resolutions are supported as WIDTHxHEIGHT strings, for example 1536x864. Width and height must both be divisible by 16 and the requested aspect ratio must be between 1:3 and 3:1. Resolutions above 2560x1440 are experimental, and the maximum supported resolution is 3840x2160. The requested size must also satisfy the model's current pixel and edge limits. The standard sizes 1024x1024, 1536x1024, and 1024x1536 are supported by the GPT image models; auto is supported for models that allow automatic sizing. For dall-e-2, use one of 256x256, 512x512, or 1024x1024. For dall-e-3, use one of 1024x1024, 1792x1024, or 1024x1792.-
_1024X1024("1024x1024") -
_1024X1536("1024x1536") -
_1536X1024("1536x1024") -
AUTO("auto")
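The arbitrary-resolution rules for gpt-image-2 given in the size description (both edges divisible by 16, aspect ratio between 1:3 and 3:1, maximum 3840x2160) can be pre-checked on the client. The sketch below encodes only those stated rules. It assumes the 3840x2160 maximum applies in either orientation, and it does not model the unspecified "current pixel and edge limits", so a size that passes here may still be rejected by the API.

```java
import java.util.regex.Pattern;

final class ImageSizeSketch {
    private static final Pattern SIZE = Pattern.compile("\\d{1,5}x\\d{1,5}");

    // Checks a WIDTHxHEIGHT string against the documented gpt-image-2 constraints.
    static boolean isValidCustomSize(String size) {
        if (size == null || !SIZE.matcher(size).matches()) {
            return false;
        }
        String[] parts = size.split("x");
        int width = Integer.parseInt(parts[0]);
        int height = Integer.parseInt(parts[1]);
        if (width == 0 || height == 0) {
            return false;
        }
        // Both edges must be divisible by 16.
        if (width % 16 != 0 || height % 16 != 0) {
            return false;
        }
        // Aspect ratio must lie between 1:3 and 3:1.
        if (width > 3 * height || height > 3 * width) {
            return false;
        }
        // Assumption: the 3840x2160 maximum applies to the long and short edge in either orientation.
        int longEdge = Math.max(width, height);
        int shortEdge = Math.min(width, height);
        return longEdge <= 3840 && shortEdge <= 2160;
    }

    public static void main(String[] args) {
        System.out.println(isValidCustomSize("1536x864"));  // prints "true"
        System.out.println(isValidCustomSize("1000x1000")); // prints "false": not divisible by 16
        System.out.println(isValidCustomSize("4096x1024")); // prints "false": aspect ratio is 4:1
    }
}
```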
-
-
-
JsonValue;-
JsonValue; type "local_shell"constantThe type of the local shell tool. Always
local_shell.LOCAL_SHELL("local_shell")
-
-
class FunctionShellTool:A tool that allows the model to execute shell commands.
-
JsonValue; type "shell"constantThe type of the shell tool. Always
shell.SHELL("shell")
-
Optional<Environment> environment-
class ContainerAuto:-
JsonValue; type "container_auto"constantAutomatically creates a container for this request
CONTAINER_AUTO("container_auto")
-
Optional<List<String>> fileIdsAn optional list of uploaded files to make available to your code.
-
Optional<MemoryLimit> memoryLimitThe memory limit for the container.
-
_1G("1g") -
_4G("4g") -
_16G("16g") -
_64G("64g")
-
-
Optional<NetworkPolicy> networkPolicyNetwork access policy for the container.
-
class ContainerNetworkPolicyDisabled: -
class ContainerNetworkPolicyAllowlist:
-
-
Optional<List<Skill>> skillsAn optional list of skills referenced by id or inline data.
-
class SkillReference:-
String skillIdThe ID of the referenced skill.
-
JsonValue; type "skill_reference"constantReferences a skill created with the /v1/skills endpoint.
SKILL_REFERENCE("skill_reference")
-
Optional<String> versionOptional skill version. Use a positive integer or 'latest'. Omit for default.
-
-
class InlineSkill:-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
InlineSkillSource sourceInline skill payload
-
String dataBase64-encoded skill zip bundle.
-
JsonValue; mediaType "application/zip"constantThe media type of the inline skill payload. Must be
application/zip.APPLICATION_ZIP("application/zip")
-
JsonValue; type "base64"constantThe type of the inline skill source. Must be
base64.BASE64("base64")
-
-
JsonValue; type "inline"constantDefines an inline skill for this request.
INLINE("inline")
-
-
-
-
class LocalEnvironment:-
JsonValue; type "local"constantUse a local computer environment.
LOCAL("local")
-
Optional<List<LocalSkill>> skillsAn optional list of skills.
-
String descriptionThe description of the skill.
-
String nameThe name of the skill.
-
String pathThe path to the directory containing the skill.
-
-
-
class ContainerReference:-
String containerIdThe ID of the referenced container.
-
JsonValue; type "container_reference"constantReferences a container created with the /v1/containers endpoint
CONTAINER_REFERENCE("container_reference")
-
-
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
String nameThe name of the custom tool, used to identify it in tool calls.
-
JsonValue; type "custom"constantThe type of the custom tool. Always
custom.CUSTOM("custom")
-
Optional<Boolean> deferLoadingWhether this tool should be deferred and discovered via tool search.
-
Optional<String> descriptionOptional description of the custom tool, used to provide more context.
-
Optional<CustomToolInputFormat> formatThe input format for the custom tool. Default is unconstrained text.
-
JsonValue;-
JsonValue; type "text"constantUnconstrained text format. Always
text.TEXT("text")
-
-
Grammar-
String definitionThe grammar definition.
-
Syntax syntaxThe syntax of the grammar definition. One of lark or regex.
-
LARK("lark") -
REGEX("regex")
-
-
JsonValue; type "grammar"constantGrammar format. Always
grammar.GRAMMAR("grammar")
-
-
-
-
class NamespaceTool:Groups function/custom tools under a shared namespace.
-
String descriptionA description of the namespace shown to the model.
-
String nameThe namespace name used in tool calls (for example, crm).
-
List<Tool> toolsThe function/custom tools available inside this namespace.
-
class Function:-
String name -
JsonValue; type "function"constantFUNCTION("function")
-
Optional<Boolean> deferLoadingWhether this function should be deferred and discovered via tool search.
-
Optional<String> description -
Optional<JsonValue> parameters -
Optional<Boolean> strict
-
-
class CustomTool:A custom tool that processes input using a specified format. Learn more about custom tools
-
-
JsonValue; type "namespace"constantThe type of the tool. Always
namespace.NAMESPACE("namespace")
-
-
class ToolSearchTool:Hosted or BYOT tool search configuration for deferred tools.
-
JsonValue; type "tool_search"constantThe type of the tool. Always
tool_search.TOOL_SEARCH("tool_search")
-
Optional<String> descriptionDescription shown to the model for a client-executed tool search tool.
-
Optional<Execution> executionWhether tool search is executed by the server or by the client.
-
SERVER("server") -
CLIENT("client")
-
-
Optional<JsonValue> parametersParameter schema for a client-executed tool search tool.
-
-
class WebSearchPreviewTool:This tool searches the web for relevant results to use in a response. Learn more about the web search tool.
-
Type typeThe type of the web search tool. One of web_search_preview or web_search_preview_2025_03_11.
-
WEB_SEARCH_PREVIEW("web_search_preview") -
WEB_SEARCH_PREVIEW_2025_03_11("web_search_preview_2025_03_11")
-
-
Optional<List<SearchContentType>> searchContentTypes-
TEXT("text") -
IMAGE("image")
-
-
Optional<SearchContextSize> searchContextSizeHigh level guidance for the amount of context window space to use for the search. One of low, medium, or high. medium is the default.
-
LOW("low") -
MEDIUM("medium") -
HIGH("high")
-
-
Optional<UserLocation> userLocationThe user's location.
-
JsonValue; type "approximate"constantThe type of location approximation. Always
approximate.APPROXIMATE("approximate")
-
Optional<String> cityFree text input for the city of the user, e.g.
San Francisco. -
Optional<String> countryThe two-letter ISO country code of the user, e.g.
US. -
Optional<String> regionFree text input for the region of the user, e.g.
California. -
Optional<String> timezoneThe IANA timezone of the user, e.g.
America/Los_Angeles.
-
-
-
class ApplyPatchTool:Allows the assistant to create, delete, or update files using unified diffs.
-
JsonValue; type "apply_patch"constantThe type of the tool. Always
apply_patch.APPLY_PATCH("apply_patch")
-
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
-
-
EvalApiError errorAn object representing an error response from the Eval API.
-
String codeThe error code.
-
String messageThe error message.
-
-
String evalIdThe identifier of the associated evaluation.
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
String modelThe model that is evaluated, if applicable.
-
String nameThe name of the evaluation run.
-
JsonValue; object_ "eval.run"constantThe type of the object. Always "eval.run".
EVAL_RUN("eval.run")
-
List<PerModelUsage> perModelUsageUsage statistics for each model during the evaluation run.
-
long cachedTokensThe number of tokens retrieved from cache.
-
long completionTokensThe number of completion tokens generated.
-
long invocationCountThe number of invocations.
-
String modelNameThe name of the model.
-
long promptTokensThe number of prompt tokens used.
-
long totalTokensThe total number of tokens used.
-
-
List<PerTestingCriteriaResult> perTestingCriteriaResultsResults per testing criteria applied during the evaluation run.
-
long failedNumber of tests failed for this criteria.
-
long passedNumber of tests passed for this criteria.
-
String testingCriteriaA description of the testing criteria.
-
-
String reportUrlThe URL to the rendered evaluation run report on the UI dashboard.
-
ResultCounts resultCountsCounters summarizing the outcomes of the evaluation run.
-
long erroredNumber of output items that resulted in an error.
-
long failedNumber of output items that failed to pass the evaluation.
-
long passedNumber of output items that passed the evaluation.
-
long totalTotal number of executed output items.
-
-
String statusThe status of the evaluation run.
-
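The metadata limits described above (at most 16 key-value pairs, keys of up to 64 characters, values of up to 512 characters) can be checked locally before a request is sent. The following is a minimal sketch under stated assumptions: `MetadataCheck` and `violations` are hypothetical names that are not part of the SDK, and the API remains the authority on what it accepts.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SDK: checks a metadata map against
// the documented limits (16 pairs, 64-character keys, 512-character values).
final class MetadataCheck {
    static final int MAX_PAIRS = 16;
    static final int MAX_KEY_LENGTH = 64;
    static final int MAX_VALUE_LENGTH = 512;

    private MetadataCheck() {}

    // Returns human-readable violations; an empty list means the map fits the limits.
    // Assumption: length is measured with String.length() (UTF-16 code units).
    static List<String> violations(Map<String, String> metadata) {
        List<String> problems = new ArrayList<>();
        if (metadata.size() > MAX_PAIRS) {
            problems.add("too many pairs: " + metadata.size());
        }
        for (Map.Entry<String, String> entry : metadata.entrySet()) {
            if (entry.getKey().length() > MAX_KEY_LENGTH) {
                problems.add("key too long: " + entry.getKey());
            }
            if (entry.getValue().length() > MAX_VALUE_LENGTH) {
                problems.add("value too long for key: " + entry.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("usecase", "chatbot");
        metadata.put("prompt-version", "v2");
        System.out.println(violations(metadata)); // prints []
    }
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every violation at once.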
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.runs.RunCancelParams;
import com.openai.models.evals.runs.RunCancelResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        RunCancelParams params = RunCancelParams.builder()
            .evalId("eval_id")
            .runId("run_id")
            .build();
        RunCancelResponse response = client.evals().runs().cancel(params);
    }
}
Response
{
  "id": "id",
  "created_at": 0,
  "data_source": {
    "source": {
      "content": [
        {
          "item": {
            "foo": "bar"
          },
          "sample": {
            "foo": "bar"
          }
        }
      ],
      "type": "file_content"
    },
    "type": "jsonl"
  },
  "error": {
    "code": "code",
    "message": "message"
  },
  "eval_id": "eval_id",
  "metadata": {
    "foo": "string"
  },
  "model": "model",
  "name": "name",
  "object": "eval.run",
  "per_model_usage": [
    {
      "cached_tokens": 0,
      "completion_tokens": 0,
      "invocation_count": 0,
      "model_name": "model_name",
      "prompt_tokens": 0,
      "total_tokens": 0
    }
  ],
  "per_testing_criteria_results": [
    {
      "failed": 0,
      "passed": 0,
      "testing_criteria": "testing_criteria"
    }
  ],
  "report_url": "https://example.com",
  "result_counts": {
    "errored": 0,
    "failed": 0,
    "passed": 0,
    "total": 0
  },
  "status": "status"
}
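The `result_counts` object in the response above carries four counters (`errored`, `failed`, `passed`, `total`). As a rough illustration of how they might be summarized, the sketch below derives a pass rate. `ResultCountsSummary` is a hypothetical helper that takes plain `long` values instead of the SDK's response type, so it compiles without the SDK; dividing by `total` is an assumption about which denominator is most useful.

```java
// Hypothetical helper, not part of the SDK: derives simple summaries from
// the four counters documented under resultCounts.
final class ResultCountsSummary {
    private ResultCountsSummary() {}

    // Fraction of executed output items that passed; 0.0 when nothing has run.
    static double passRate(long passed, long total) {
        return total == 0 ? 0.0 : (double) passed / total;
    }

    // True when at least one item ran and none failed or errored.
    static boolean allPassed(long errored, long failed, long passed, long total) {
        return total > 0 && errored == 0 && failed == 0 && passed == total;
    }

    public static void main(String[] args) {
        long errored = 1, failed = 2, passed = 7, total = 10;
        System.out.println(passRate(passed, total));                   // prints 0.7
        System.out.println(allPassed(errored, failed, passed, total)); // prints false
    }
}
```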
Delete eval run
RunDeleteResponse evals().runs().delete(RunDeleteParams params, RequestOptions requestOptions = RequestOptions.none())
delete /evals/{eval_id}/runs/{run_id}
Delete an eval run.
Parameters
-
RunDeleteParams params-
String evalId -
Optional<String> runId
-
Returns
-
class RunDeleteResponse:-
Optional<Boolean> deleted -
Optional<String> object_ -
Optional<String> runId
-
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.runs.RunDeleteParams;
import com.openai.models.evals.runs.RunDeleteResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        RunDeleteParams params = RunDeleteParams.builder()
            .evalId("eval_id")
            .runId("run_id")
            .build();
        RunDeleteResponse run = client.evals().runs().delete(params);
    }
}
Response
{
  "deleted": true,
  "object": "eval.run.deleted",
  "run_id": "evalrun_677469f564d48190807532a852da3afb"
}
Domain Types
Create Eval Completions Run Data Source
-
class CreateEvalCompletionsRunDataSource:A CompletionsRunDataSource object describing a model sampling configuration.
-
Source sourceDetermines what populates the item namespace in this run's data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
class StoredCompletions:A StoredCompletionsRunDataSource configuration describing a set of filters
-
JsonValue; type "stored_completions"constantThe type of source. Always
stored_completions.STORED_COMPLETIONS("stored_completions")
-
Optional<Long> createdAfterAn optional Unix timestamp to filter items created after this time.
-
Optional<Long> createdBeforeAn optional Unix timestamp to filter items created before this time.
-
Optional<Long> limitAn optional maximum number of items to return.
-
Optional<Metadata> metadataSet of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format, and querying for objects via API or the dashboard.
Keys are strings with a maximum length of 64 characters. Values are strings with a maximum length of 512 characters.
-
Optional<String> modelAn optional model to filter by (e.g., 'gpt-4o').
-
-
-
Type typeThe type of run data source. Always
completions.COMPLETIONS("completions")
-
Optional<InputMessages> inputMessagesUsed when sampling from a model. Dictates the structure of the messages passed into the model. Can either be a reference to a prebuilt trajectory (i.e., item.input_trajectory), or a template with variable references to the item namespace.
-
class Template:-
List<InnerTemplate> templateA list of chat messages forming the prompt or context. May include variable references to the item namespace, e.g. {{item.name}}.
-
class EasyInputMessage:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentText, image, or audio input to the model, used to generate a response. Can also contain previous assistant responses.
-
String -
List<ResponseInputContent>-
class ResponseInputText:A text input to the model.
-
String textThe text input to the model.
-
JsonValue; type "input_text"constantThe type of the input item. Always
input_text.INPUT_TEXT("input_text")
-
-
class ResponseInputImage:An image input to the model. Learn about image inputs.
-
Detail detailThe detail level of the image to be sent to the model. One of high, low, auto, or original. Defaults to auto.
-
LOW("low") -
HIGH("high") -
AUTO("auto") -
ORIGINAL("original")
-
-
JsonValue; type "input_image"constantThe type of the input item. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> imageUrlThe URL of the image to be sent to the model. A fully qualified URL or base64 encoded image in a data URL.
-
-
class ResponseInputFile:A file input to the model.
-
JsonValue; type "input_file"constantThe type of the input item. Always
input_file.INPUT_FILE("input_file")
-
Optional<Detail> detailThe detail level of the file to be sent to the model. Use low for the default rendering behavior, or high to render the file at higher quality. Defaults to low.
-
LOW("low") -
HIGH("high")
-
-
Optional<String> fileDataThe content of the file to be sent to the model.
-
Optional<String> fileIdThe ID of the file to be sent to the model.
-
Optional<String> fileUrlThe URL of the file to be sent to the model.
-
Optional<String> filenameThe name of the file to be sent to the model.
-
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Phase> phaseLabels an assistant message as intermediate commentary (commentary) or the final answer (final_answer). For models like gpt-5.3-codex and beyond, when sending follow-up requests, preserve and resend phase on all assistant messages; dropping it can degrade performance. Not used for user messages.
-
COMMENTARY("commentary") -
FINAL_ANSWER("final_answer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
class EvalItem:A message input to the model with a role indicating instruction following hierarchy. Instructions given with the developer or system role take precedence over instructions given with the user role. Messages with the assistant role are presumed to have been generated by the model in previous interactions.
-
Content contentInputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
-
String -
class ResponseInputText:A text input to the model.
-
class OutputText:A text output from the model.
-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
class InputImage:An image input block used within EvalItem content arrays.
-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
InputAudio inputAudio-
String dataBase64-encoded audio data.
-
Format formatThe format of the audio data. Currently supported formats are mp3 and wav.
-
MP3("mp3") -
WAV("wav")
-
-
-
JsonValue; type "input_audio"constantThe type of the input item. Always
input_audio.INPUT_AUDIO("input_audio")
-
-
List<EvalContentItem>-
String -
class ResponseInputText:A text input to the model.
-
OutputText-
String textThe text output from the model.
-
JsonValue; type "output_text"constantThe type of the output text. Always
output_text.OUTPUT_TEXT("output_text")
-
-
InputImage-
String imageUrlThe URL of the image input.
-
JsonValue; type "input_image"constantThe type of the image input. Always
input_image.INPUT_IMAGE("input_image")
-
Optional<String> detailThe detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
-
-
class ResponseInputAudio:An audio input to the model.
-
-
-
Role roleThe role of the message input. One of user, assistant, system, or developer.
-
USER("user") -
ASSISTANT("assistant") -
SYSTEM("system") -
DEVELOPER("developer")
-
-
Optional<Type> typeThe type of the message input. Always
message.MESSAGE("message")
-
-
-
JsonValue; type "template"constantThe type of input messages. Always
template.TEMPLATE("template")
-
-
class ItemReference:-
String itemReferenceA reference to a variable in the item namespace, e.g. "item.input_trajectory"
-
JsonValue; type "item_reference"constantThe type of input messages. Always
item_reference.ITEM_REFERENCE("item_reference")
-
-
-
Optional<String> modelThe name of the model to use for generating completions (e.g. "o3-mini").
-
Optional<SamplingParams> samplingParams-
Optional<Long> maxCompletionTokensThe maximum number of tokens in the generated output.
-
Optional<ReasoningEffort> reasoningEffortConstrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
-
gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
-
All models before gpt-5.1 default to medium reasoning effort, and do not support none.
-
The gpt-5-pro model defaults to (and only supports) high reasoning effort.
-
xhigh is supported for all models after gpt-5.1-codex-max.
-
NONE("none") -
MINIMAL("minimal") -
LOW("low") -
MEDIUM("medium") -
HIGH("high") -
XHIGH("xhigh")
-
-
Optional<ResponseFormat> responseFormatAn object specifying the format that the model must output.
Setting to { "type": "json_schema", "json_schema": {...} } enables Structured Outputs, which ensures the model will match your supplied JSON schema. Learn more in the Structured Outputs guide.
Setting to { "type": "json_object" } enables the older JSON mode, which ensures the message the model generates is valid JSON. Using json_schema is preferred for models that support it.
-
class ResponseFormatText:Default response format. Used to generate text responses.
-
JsonValue; type "text"constantThe type of response format being defined. Always
text.TEXT("text")
-
-
class ResponseFormatJsonSchema:JSON Schema response format. Used to generate structured JSON responses. Learn more about Structured Outputs.
-
JsonSchema jsonSchemaStructured Outputs configuration options, including a JSON Schema.
-
String nameThe name of the response format. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the response format is for, used by the model to determine how to respond in the format.
-
Optional<Schema> schemaThe schema for the response format, described as a JSON Schema object. Learn how to build JSON schemas here.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the output. If set to true, the model will always follow the exact schema defined in the schema field. Only a subset of JSON Schema is supported when strict is true. To learn more, read the Structured Outputs guide.
-
-
JsonValue; type "json_schema"constantThe type of response format being defined. Always
json_schema.JSON_SCHEMA("json_schema")
-
-
class ResponseFormatJsonObject:JSON object response format. An older method of generating JSON responses. Using json_schema is recommended for models that support it. Note that the model will not generate JSON without a system or user message instructing it to do so.
-
JsonValue; type "json_object"constantThe type of response format being defined. Always
json_object.JSON_OBJECT("json_object")
-
-
-
Optional<Long> seedA seed value to initialize the randomness during sampling.
-
Optional<Double> temperatureA higher temperature increases randomness in the outputs.
-
Optional<List<ChatCompletionFunctionTool>> toolsA list of tools the model may call. Currently, only functions are supported as a tool. Use this to provide a list of functions the model may generate JSON inputs for. A max of 128 functions are supported.
-
FunctionDefinition function-
String nameThe name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
-
Optional<String> descriptionA description of what the function does, used by the model to choose when and how to call the function.
-
Optional<FunctionParameters> parametersThe parameters the function accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
Omitting parameters defines a function with an empty parameter list.
-
Optional<Boolean> strictWhether to enable strict schema adherence when generating the function call. If set to true, the model will follow the exact schema defined in the parameters field. Only a subset of JSON Schema is supported when strict is true. Learn more about Structured Outputs in the function calling guide.
-
-
JsonValue; type "function"constantThe type of the tool. Currently, only
functionis supported.FUNCTION("function")
-
-
Optional<Double> topPAn alternative to temperature for nucleus sampling; 1.0 includes all tokens.
-
-
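The template messages described above may contain variable references to the `item` namespace, such as `{{item.name}}`. This reference does not spell out how the service resolves them, so the following is only a local illustration of the idea: a hypothetical `ItemTemplate.render` helper (not part of the SDK) that substitutes flat string values from a map and, by assumption, leaves unknown references untouched.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the SDK: a local illustration of
// substituting {{item.<key>}} references in a message template.
final class ItemTemplate {
    // Matches {{item.<key>}} with optional whitespace inside the braces.
    private static final Pattern REFERENCE =
            Pattern.compile("\\{\\{\\s*item\\.([A-Za-z0-9_]+)\\s*\\}\\}");

    private ItemTemplate() {}

    static String render(String template, Map<String, String> item) {
        Matcher matcher = REFERENCE.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            // Assumption: unknown keys are left as written rather than failing.
            String value = item.containsKey(key) ? item.get(key) : matcher.group();
            // quoteReplacement keeps '$' and '\' in values from being treated specially.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> item = Map.of("name", "Ada");
        System.out.println(render("Write a greeting for {{item.name}}.", item));
        // prints: Write a greeting for Ada.
    }
}
```

Rendering a template locally like this can be a convenient way to preview what a data source row would produce before creating a run.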
Create Eval JSONL Run Data Source
-
class CreateEvalJsonlRunDataSource:A JsonlRunDataSource object that specifies a JSONL file that matches the eval
-
Source sourceDetermines what populates the item namespace in the data source.
-
class FileContent:-
List<Content> contentThe content of the jsonl file.
-
Item item -
Optional<Sample> sample
-
-
JsonValue; type "file_content"constantThe type of jsonl source. Always
file_content.FILE_CONTENT("file_content")
-
-
class FileId:-
String idThe identifier of the file.
-
JsonValue; type "file_id"constantThe type of jsonl source. Always
file_id.FILE_ID("file_id")
-
-
-
JsonValue; type "jsonl"constantThe type of data source. Always
jsonl.JSONL("jsonl")
-
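To make the shape of a `jsonl` data source with a `file_content` source more concrete, the sketch below assembles the corresponding JSON by hand. It is an illustration only: `JsonlSourceSketch` is a hypothetical helper, item values are assumed to be flat strings, and in practice the SDK's builders or a JSON library would be the more robust choice.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the SDK: builds the JSON for a "jsonl"
// data source whose source type is "file_content".
final class JsonlSourceSketch {
    private JsonlSourceSketch() {}

    // Minimal JSON string escaping, sufficient for this sketch.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // Serializes a flat string map as a JSON object.
    static String object(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    // Wraps each row in {"item": ...} and nests the rows under source.content.
    static String dataSource(List<Map<String, String>> items) {
        String content = items.stream()
                .map(item -> "{\"item\":" + object(item) + "}")
                .collect(Collectors.joining(",", "[", "]"));
        return "{\"source\":{\"content\":" + content
                + ",\"type\":\"file_content\"},\"type\":\"jsonl\"}";
    }

    public static void main(String[] args) {
        Map<String, String> item = new LinkedHashMap<>();
        item.put("foo", "bar");
        System.out.println(dataSource(List.of(item)));
        // prints: {"source":{"content":[{"item":{"foo":"bar"}}],"type":"file_content"},"type":"jsonl"}
    }
}
```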
Eval API Error
-
class EvalApiError:An object representing an error response from the Eval API.
-
String codeThe error code.
-
String messageThe error message.
-
Output Items
Get eval run output items
OutputItemListPage evals().runs().outputItems().list(OutputItemListParams params, RequestOptions requestOptions = RequestOptions.none())
get /evals/{eval_id}/runs/{run_id}/output_items
Get a list of output items for an evaluation run.
Parameters
-
OutputItemListParams params-
String evalId -
Optional<String> runId -
Optional<String> afterIdentifier for the last output item from the previous pagination request.
-
Optional<Long> limitNumber of output items to retrieve.
-
Optional<Order> orderSort order for output items by timestamp. Use asc for ascending order or desc for descending order. Defaults to asc.
-
ASC("asc") -
DESC("desc")
-
-
Optional<Status> statusFilter output items by status. Use fail to filter by failed output items or pass to filter by passed output items.
-
FAIL("fail") -
PASS("pass")
-
-
Returns
-
class OutputItemListResponse:A schema representing an evaluation run output item.
-
String idUnique identifier for the evaluation run output item.
-
long createdAtUnix timestamp (in seconds) when the evaluation run was created.
-
DatasourceItem datasourceItemDetails of the input data source item.
-
long datasourceItemIdThe identifier for the data source item.
-
String evalIdThe identifier of the evaluation group.
-
JsonValue; object_ "eval.run.output_item"constantThe type of the object. Always "eval.run.output_item".
EVAL_RUN_OUTPUT_ITEM("eval.run.output_item")
-
List<Result> resultsA list of grader results for this output item.
-
String nameThe name of the grader.
-
boolean passedWhether the grader considered the output a pass.
-
double scoreThe numeric score produced by the grader.
-
Optional<Sample> sampleOptional sample or intermediate data produced by the grader.
-
Optional<String> typeThe grader type (for example, "string-check-grader").
-
-
String runIdThe identifier of the evaluation run associated with this output item.
-
Sample sampleA sample containing the input and output of the evaluation run.
-
EvalApiError errorAn object representing an error response from the Eval API.
-
String codeThe error code.
-
String messageThe error message.
-
-
String finishReasonThe reason why the sample generation was finished.
-
List<Input> inputAn array of input messages.
-
String contentThe content of the message.
-
String roleThe role of the message sender (e.g., system, user, developer).
-
-
long maxCompletionTokensThe maximum number of tokens allowed for completion.
-
String modelThe model used for generating the sample.
-
List<Output> outputAn array of output messages.
-
Optional<String> contentThe content of the message.
-
Optional<String> roleThe role of the message (e.g. "system", "assistant", "user").
-
-
long seedThe seed used for generating the sample.
-
double temperatureThe sampling temperature used.
-
double topPThe top_p value used for sampling.
-
Usage usageToken usage details for the sample.
-
long cachedTokensThe number of tokens retrieved from cache.
-
long completionTokensThe number of completion tokens generated.
-
long promptTokensThe number of prompt tokens used.
-
long totalTokensThe total number of tokens used.
-
-
-
String statusThe status of the evaluation run.
-
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.runs.outputitems.OutputItemListPage;
import com.openai.models.evals.runs.outputitems.OutputItemListParams;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        OutputItemListParams params = OutputItemListParams.builder()
            .evalId("eval_id")
            .runId("run_id")
            .build();
        OutputItemListPage page = client.evals().runs().outputItems().list(params);
    }
}
Response
{
  "data": [
    {
      "id": "id",
      "created_at": 0,
      "datasource_item": {
        "foo": "bar"
      },
      "datasource_item_id": 0,
      "eval_id": "eval_id",
      "object": "eval.run.output_item",
      "results": [
        {
          "name": "name",
          "passed": true,
          "score": 0,
          "sample": {
            "foo": "bar"
          },
          "type": "type"
        }
      ],
      "run_id": "run_id",
      "sample": {
        "error": {
          "code": "code",
          "message": "message"
        },
        "finish_reason": "finish_reason",
        "input": [
          {
            "content": "content",
            "role": "role"
          }
        ],
        "max_completion_tokens": 0,
        "model": "model",
        "output": [
          {
            "content": "content",
            "role": "role"
          }
        ],
        "seed": 0,
        "temperature": 0,
        "top_p": 0,
        "usage": {
          "cached_tokens": 0,
          "completion_tokens": 0,
          "prompt_tokens": 0,
          "total_tokens": 0
        }
      },
      "status": "status"
    }
  ],
  "first_id": "first_id",
  "has_more": true,
  "last_id": "last_id",
  "object": "list"
}
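The list response above exposes `has_more` and `last_id`, and the request accepts an `after` cursor, which together suggest a simple cursor loop. The sketch below shows that loop against an in-memory stand-in rather than the real API: `CursorPagination`, `Page`, and `fakeSource` are hypothetical names used only for illustration, and the SDK's page object may offer its own iteration helpers that are preferable in practice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not part of the SDK: walks a cursor-paginated list
// by passing each page's last ID as the next request's "after" value.
final class CursorPagination {
    // Stand-in for one list response: item IDs plus the cursor fields.
    static final class Page {
        final List<String> ids;
        final boolean hasMore;
        final String lastId;

        Page(List<String> ids, boolean hasMore, String lastId) {
            this.ids = ids;
            this.hasMore = hasMore;
            this.lastId = lastId;
        }
    }

    private CursorPagination() {}

    // fetch receives the "after" cursor (null for the first page).
    static List<String> collectAll(Function<String, Page> fetch) {
        List<String> all = new ArrayList<>();
        String after = null;
        while (true) {
            Page page = fetch.apply(after);
            all.addAll(page.ids);
            if (!page.hasMore || page.lastId == null) {
                break;
            }
            after = page.lastId;
        }
        return all;
    }

    // In-memory fake used only to exercise the loop; a real fetcher would call the API.
    static Function<String, Page> fakeSource(List<String> ids, int pageSize) {
        return after -> {
            int start = after == null ? 0 : ids.indexOf(after) + 1;
            int end = Math.min(start + pageSize, ids.size());
            List<String> slice = ids.subList(start, end);
            String lastId = slice.isEmpty() ? null : slice.get(slice.size() - 1);
            return new Page(slice, end < ids.size(), lastId);
        };
    }

    public static void main(String[] args) {
        List<String> ids = List.of("item_1", "item_2", "item_3");
        System.out.println(collectAll(fakeSource(ids, 2))); // prints [item_1, item_2, item_3]
    }
}
```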
Get an output item of an eval run
OutputItemRetrieveResponse evals().runs().outputItems().retrieve(OutputItemRetrieveParams params, RequestOptions requestOptions = RequestOptions.none())
get /evals/{eval_id}/runs/{run_id}/output_items/{output_item_id}
Get an evaluation run output item by ID.
Parameters
-
OutputItemRetrieveParams params-
String evalId -
String runId -
Optional<String> outputItemId
-
Returns
-
class OutputItemRetrieveResponse:A schema representing an evaluation run output item.
-
String idUnique identifier for the evaluation run output item.
-
long createdAtUnix timestamp (in seconds) when the evaluation run was created.
-
DatasourceItem datasourceItemDetails of the input data source item.
-
long datasourceItemIdThe identifier for the data source item.
-
String evalIdThe identifier of the evaluation group.
-
JsonValue; object_ "eval.run.output_item"constantThe type of the object. Always "eval.run.output_item".
EVAL_RUN_OUTPUT_ITEM("eval.run.output_item")
-
List<Result> resultsA list of grader results for this output item.
-
String nameThe name of the grader.
-
boolean passedWhether the grader considered the output a pass.
-
double scoreThe numeric score produced by the grader.
-
Optional<Sample> sampleOptional sample or intermediate data produced by the grader.
-
Optional<String> typeThe grader type (for example, "string-check-grader").
-
-
String runIdThe identifier of the evaluation run associated with this output item.
-
Sample sampleA sample containing the input and output of the evaluation run.
-
EvalApiError errorAn object representing an error response from the Eval API.
-
String codeThe error code.
-
String messageThe error message.
-
-
String finishReasonThe reason why the sample generation was finished.
-
List<Input> inputAn array of input messages.
-
String contentThe content of the message.
-
String roleThe role of the message sender (e.g., system, user, developer).
-
-
long maxCompletionTokensThe maximum number of tokens allowed for completion.
-
String modelThe model used for generating the sample.
-
List<Output> outputAn array of output messages.
-
Optional<String> contentThe content of the message.
-
Optional<String> roleThe role of the message (e.g. "system", "assistant", "user").
-
-
long seedThe seed used for generating the sample.
-
double temperatureThe sampling temperature used.
-
double topPThe top_p value used for sampling.
-
Usage usageToken usage details for the sample.
-
long cachedTokensThe number of tokens retrieved from cache.
-
long completionTokensThe number of completion tokens generated.
-
long promptTokensThe number of prompt tokens used.
-
long totalTokensThe total number of tokens used.
-
-
-
String statusThe status of the evaluation run.
-
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.evals.runs.outputitems.OutputItemRetrieveParams;
import com.openai.models.evals.runs.outputitems.OutputItemRetrieveResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        OutputItemRetrieveParams params = OutputItemRetrieveParams.builder()
            .evalId("eval_id")
            .runId("run_id")
            .outputItemId("output_item_id")
            .build();
        OutputItemRetrieveResponse outputItem = client.evals().runs().outputItems().retrieve(params);
    }
}
Response
{
  "id": "id",
  "created_at": 0,
  "datasource_item": {
    "foo": "bar"
  },
  "datasource_item_id": 0,
  "eval_id": "eval_id",
  "object": "eval.run.output_item",
  "results": [
    {
      "name": "name",
      "passed": true,
      "score": 0,
      "sample": {
        "foo": "bar"
      },
      "type": "type"
    }
  ],
  "run_id": "run_id",
  "sample": {
    "error": {
      "code": "code",
      "message": "message"
    },
    "finish_reason": "finish_reason",
    "input": [
      {
        "content": "content",
        "role": "role"
      }
    ],
    "max_completion_tokens": 0,
    "model": "model",
    "output": [
      {
        "content": "content",
        "role": "role"
      }
    ],
    "seed": 0,
    "temperature": 0,
    "top_p": 0,
    "usage": {
      "cached_tokens": 0,
      "completion_tokens": 0,
      "prompt_tokens": 0,
      "total_tokens": 0
    }
  },
  "status": "status"
}
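Each output item carries a `results` list with one entry per grader (`name`, `passed`, `score`). A small sketch of how those entries might be summarized follows. `GraderSummary` and `GraderResult` are hypothetical stand-ins for the SDK's types so that the example compiles on its own, and the grader names in `main` are made up.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the SDK: summarizes the grader results
// attached to a single output item.
final class GraderSummary {
    // Stand-in for one entry of the documented results list.
    static final class GraderResult {
        final String name;
        final boolean passed;
        final double score;

        GraderResult(String name, boolean passed, double score) {
            this.name = name;
            this.passed = passed;
            this.score = score;
        }
    }

    private GraderSummary() {}

    // Names of graders that did not consider the output a pass.
    static List<String> failedGraders(List<GraderResult> results) {
        return results.stream()
                .filter(r -> !r.passed)
                .map(r -> r.name)
                .collect(Collectors.toList());
    }

    // Mean score across graders; 0.0 when there are no results.
    static double meanScore(List<GraderResult> results) {
        return results.stream().mapToDouble(r -> r.score).average().orElse(0.0);
    }

    public static void main(String[] args) {
        List<GraderResult> results = List.of(
                new GraderResult("exact_match", true, 1.0),
                new GraderResult("tone", false, 0.5));
        System.out.println(failedGraders(results)); // prints [tone]
        System.out.println(meanScore(results));     // prints 0.75
    }
}
```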