Reference Changes, 2026-05-18 22:01 UTC to 2026-05-19 06:34 UTC
cli/resources/fine_tuning/subresources/alpha/index.md

0 added, 611 removed.

This document has no rendered page for this history range.

cli/resources/fine_tuning/subresources/alpha/index.md +0 −611 deleted


# Alpha

# Graders

## Run grader

`$ openai fine-tuning:alpha:graders run`

**post** `/fine_tuning/alpha/graders/run`

Run a grader.

### Parameters

- `--grader: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more`

  The grader used for the fine-tuning job.

- `--model-sample: string`

  The model sample to be evaluated. This value will be used to populate
  the `sample` namespace. See [the guide](https://platform.openai.com/docs/guides/graders) for more details.
  The `output_json` variable will be populated if the model sample is a
  valid JSON string.

- `--item: optional unknown`

  The dataset item provided to the grader. This will be used to populate
  the `item` namespace. See [the guide](https://platform.openai.com/docs/guides/graders) for more details.
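The `--model-sample` description says `output_json` is populated only when the sample is a valid JSON string. A minimal Python sketch of that condition follows. The helper name `try_output_json` is hypothetical, and how the service itself parses the sample is not specified in this reference.

```python
import json


def try_output_json(model_sample: str):
    """Return the parsed JSON value if `model_sample` is valid JSON, else None.

    Note: a sample that is the JSON literal `null` also yields None here,
    which is a limitation of this sketch.
    """
    try:
        return json.loads(model_sample)
    except json.JSONDecodeError:
        # Not valid JSON, so there is nothing to expose as `output_json`.
        return None


parsed = try_output_json('{"answer": 42}')
plain = try_output_json("just some text")
print(parsed, plain)
```

In this sketch, a plain-text sample simply has no JSON counterpart, so a grader that reads `output_json` would need a sample that parses.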

### Returns

- `FineTuningAlphaGraderRunResponse: object { metadata, model_grader_token_usage_per_model, reward, sub_rewards }`

  - `metadata: object { errors, execution_time, name, 4 more }`

    - `errors: object { formula_parse_error, invalid_variable_error, model_grader_parse_error, 11 more }`

      - `formula_parse_error: boolean`

      - `invalid_variable_error: boolean`

      - `model_grader_parse_error: boolean`

      - `model_grader_refusal_error: boolean`

      - `model_grader_server_error: boolean`

      - `model_grader_server_error_details: string`

      - `other_error: boolean`

      - `python_grader_runtime_error: boolean`

      - `python_grader_runtime_error_details: string`

      - `python_grader_server_error: boolean`

      - `python_grader_server_error_type: string`

      - `sample_parse_error: boolean`

      - `truncated_observation_error: boolean`

      - `unresponsive_reward_error: boolean`

    - `execution_time: number`

    - `name: string`

    - `sampled_model_name: string`

    - `scores: map[unknown]`

    - `token_usage: number`

    - `type: string`

  - `model_grader_token_usage_per_model: map[unknown]`

  - `reward: number`

  - `sub_rewards: map[unknown]`

### Example

```cli
openai fine-tuning:alpha:graders run \
  --api-key 'My API Key' \
  --grader '{input: input, name: name, operation: eq, reference: reference, type: string_check}' \
  --model-sample model_sample
```

#### Response

```json
{
  "metadata": {
    "errors": {
      "formula_parse_error": true,
      "invalid_variable_error": true,
      "model_grader_parse_error": true,
      "model_grader_refusal_error": true,
      "model_grader_server_error": true,
      "model_grader_server_error_details": "model_grader_server_error_details",
      "other_error": true,
      "python_grader_runtime_error": true,
      "python_grader_runtime_error_details": "python_grader_runtime_error_details",
      "python_grader_server_error": true,
      "python_grader_server_error_type": "python_grader_server_error_type",
      "sample_parse_error": true,
      "truncated_observation_error": true,
      "unresponsive_reward_error": true
    },
    "execution_time": 0,
    "name": "name",
    "sampled_model_name": "sampled_model_name",
    "scores": {
      "foo": "bar"
    },
    "token_usage": 0,
    "type": "type"
  },
  "model_grader_token_usage_per_model": {
    "foo": "bar"
  },
  "reward": 0,
  "sub_rewards": {
    "foo": "bar"
  }
}
```
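The `metadata.errors` object in the response above mixes boolean flags with string detail fields. A minimal Python sketch of listing which flags are set follows, assuming the response has already been saved or captured as JSON text. The helper name `raised_error_flags` and the sample values are hypothetical and are not part of the CLI.

```python
import json


def raised_error_flags(response: dict) -> list[str]:
    """Return the names of the boolean error flags set to true, sorted by name."""
    errors = response.get("metadata", {}).get("errors", {})
    # Detail fields such as `model_grader_server_error_details` hold strings,
    # so only values that are exactly True count as raised flags.
    return sorted(name for name, value in errors.items() if value is True)


# Hypothetical response fragment that follows the documented schema.
sample = json.loads("""
{
  "metadata": {
    "errors": {
      "formula_parse_error": false,
      "python_grader_runtime_error": true,
      "python_grader_runtime_error_details": "ZeroDivisionError",
      "sample_parse_error": true
    }
  },
  "reward": 0
}
""")

flags = raised_error_flags(sample)
print(flags)  # ['python_grader_runtime_error', 'sample_parse_error']
```

Checking the flags before reading `reward` is one way to avoid treating a failed grading run as a genuine score of 0.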

## Validate grader

`$ openai fine-tuning:alpha:graders validate`

**post** `/fine_tuning/alpha/graders/validate`

Validate a grader.

### Parameters

- `--grader: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more`

  The grader used for the fine-tuning job.

### Returns

- `FineTuningAlphaGraderValidateResponse: object { grader }`

  - `grader: optional StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more`

    The grader used for the fine-tuning job.

    - `string_check_grader: object { input, name, operation, 2 more }`

      A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.

      - `input: string`

        The input text. This may include template strings.

      - `name: string`

        The name of the grader.

      - `operation: "eq" or "ne" or "like" or "ilike"`

        The string check operation to perform. One of `eq`, `ne`, `like`, or `ilike`.

        - `"eq"`

        - `"ne"`

        - `"like"`

        - `"ilike"`

      - `reference: string`

        The reference text. This may include template strings.

      - `type: "string_check"`

        The object type, which is always `string_check`.

    - `text_similarity_grader: object { evaluation_metric, input, name, 2 more }`

      A TextSimilarityGrader object which grades text based on similarity metrics.

      - `evaluation_metric: "cosine" or "fuzzy_match" or "bleu" or 8 more`

        The evaluation metric to use. One of `cosine`, `fuzzy_match`, `bleu`,
        `gleu`, `meteor`, `rouge_1`, `rouge_2`, `rouge_3`, `rouge_4`, `rouge_5`,
        or `rouge_l`.

        - `"cosine"`

        - `"fuzzy_match"`

        - `"bleu"`

        - `"gleu"`

        - `"meteor"`

        - `"rouge_1"`

        - `"rouge_2"`

        - `"rouge_3"`

        - `"rouge_4"`

        - `"rouge_5"`

        - `"rouge_l"`

      - `input: string`

        The text being graded.

      - `name: string`

        The name of the grader.

      - `reference: string`

        The text being graded against.

      - `type: "text_similarity"`

        The type of grader.

    - `python_grader: object { name, source, type, image_tag }`

      A PythonGrader object that runs a Python script on the input.

      - `name: string`

        The name of the grader.

      - `source: string`

        The source code of the Python script.

      - `type: "python"`

        The object type, which is always `python`.

      - `image_tag: optional string`

        The image tag to use for the Python script.

    - `score_model_grader: object { input, model, name, 3 more }`

      A ScoreModelGrader object that uses a model to assign a score to the input.

      - `input: array of object { content, role, type }`

        The input messages evaluated by the grader. Supports text, output text, input image, and input audio content blocks, and may include template strings.

        - `content: string or ResponseInputText or object { text, type } or 3 more`

          Inputs to the model, which can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.

          - `Text input: string`

            A text input to the model.

          - `response_input_text: object { text, type }`

            A text input to the model.

            - `text: string`

              The text input to the model.

            - `type: "input_text"`

              The type of the input item. Always `input_text`.

          - `Output text: object { text, type }`

            A text output from the model.

            - `text: string`

              The text output from the model.

            - `type: "output_text"`

              The type of the output text. Always `output_text`.

          - `Input image: object { image_url, type, detail }`

            An image input block used within EvalItem content arrays.

            - `image_url: string`

              The URL of the image input.

            - `type: "input_image"`

              The type of the image input. Always `input_image`.

            - `detail: optional string`

              The detail level of the image to be sent to the model. One of `high`, `low`, or `auto`. Defaults to `auto`.

          - `response_input_audio: object { input_audio, type }`

            An audio input to the model.

            - `input_audio: object { data, format }`

              - `data: string`

                Base64-encoded audio data.

              - `format: "mp3" or "wav"`

                The format of the audio data. Currently supported formats are `mp3` and
                `wav`.

                - `"mp3"`

                - `"wav"`

            - `type: "input_audio"`

              The type of the input item. Always `input_audio`.

          - `grader_inputs: array of string or ResponseInputText or object { text, type } or 2 more`

            A list of inputs, each of which may be either an input text, output text, input
            image, or input audio object.

            - `Text input: string`

              A text input to the model.

            - `response_input_text: object { text, type }`

              A text input to the model.

            - `Output text: object { text, type }`

              A text output from the model.

              - `text: string`

                The text output from the model.

              - `type: "output_text"`

                The type of the output text. Always `output_text`.

            - `Input image: object { image_url, type, detail }`

              An image input block used within EvalItem content arrays.

              - `image_url: string`

                The URL of the image input.

              - `type: "input_image"`

                The type of the image input. Always `input_image`.

              - `detail: optional string`

                The detail level of the image to be sent to the model. One of `high`, `low`, or `auto`. Defaults to `auto`.

            - `response_input_audio: object { input_audio, type }`

              An audio input to the model.

        - `role: "user" or "assistant" or "system" or "developer"`

          The role of the message input. One of `user`, `assistant`, `system`, or
          `developer`.

          - `"user"`

          - `"assistant"`

          - `"system"`

          - `"developer"`

        - `type: optional "message"`

          The type of the message input. Always `message`.

          - `"message"`

      - `model: string`

        The model to use for the evaluation.

      - `name: string`

        The name of the grader.

      - `type: "score_model"`

        The object type, which is always `score_model`.

      - `range: optional array of number`

        The range of the score. Defaults to `[0, 1]`.

      - `sampling_params: optional object { max_completions_tokens, reasoning_effort, seed, 2 more }`

        The sampling parameters for the model.

        - `max_completions_tokens: optional number`

          The maximum number of tokens the grader model may generate in its response.

        - `reasoning_effort: optional "none" or "minimal" or "low" or 3 more`

          Constrains effort on reasoning for
          [reasoning models](https://platform.openai.com/docs/guides/reasoning).
          Currently supported values are `none`, `minimal`, `low`, `medium`, `high`, and `xhigh`. Reducing
          reasoning effort can result in faster responses and fewer tokens used
          on reasoning in a response.

          - `gpt-5.1` defaults to `none`, which does not perform reasoning. The supported reasoning values for `gpt-5.1` are `none`, `low`, `medium`, and `high`. Tool calls are supported for all reasoning values in `gpt-5.1`.
          - All models before `gpt-5.1` default to `medium` reasoning effort, and do not support `none`.
          - The `gpt-5-pro` model defaults to (and only supports) `high` reasoning effort.
          - `xhigh` is supported for all models after `gpt-5.1-codex-max`.

          - `"none"`

          - `"minimal"`

          - `"low"`

          - `"medium"`

          - `"high"`

          - `"xhigh"`

        - `seed: optional number`

          A seed value to initialize the randomness during sampling.

        - `temperature: optional number`

          A higher temperature increases randomness in the outputs.

        - `top_p: optional number`

          An alternative to temperature for nucleus sampling; 1.0 includes all tokens.

    - `multi_grader: object { calculate_output, graders, name, type }`

      A MultiGrader object combines the output of multiple graders to produce a single score.

      - `calculate_output: string`

        A formula to calculate the output based on grader results.

      - `graders: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more`

        A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.

        - `string_check_grader: object { input, name, operation, 2 more }`

          A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.

        - `text_similarity_grader: object { evaluation_metric, input, name, 2 more }`

          A TextSimilarityGrader object which grades text based on similarity metrics.

        - `python_grader: object { name, source, type, image_tag }`

          A PythonGrader object that runs a Python script on the input.

        - `score_model_grader: object { input, model, name, 3 more }`

          A ScoreModelGrader object that uses a model to assign a score to the input.

        - `label_model_grader: object { input, labels, model, 3 more }`

          A LabelModelGrader object which uses a model to assign labels to each item
          in the evaluation.

          - `input: array of object { content, role, type }`

            - `content: string or ResponseInputText or object { text, type } or 3 more`

              Inputs to the model, which can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.

              - `Text input: string`

                A text input to the model.

              - `response_input_text: object { text, type }`

                A text input to the model.

              - `Output text: object { text, type }`

                A text output from the model.

                - `text: string`

                  The text output from the model.

                - `type: "output_text"`

                  The type of the output text. Always `output_text`.

              - `Input image: object { image_url, type, detail }`

                An image input block used within EvalItem content arrays.

                - `image_url: string`

                  The URL of the image input.

                - `type: "input_image"`

                  The type of the image input. Always `input_image`.

                - `detail: optional string`

                  The detail level of the image to be sent to the model. One of `high`, `low`, or `auto`. Defaults to `auto`.

              - `response_input_audio: object { input_audio, type }`

                An audio input to the model.

              - `grader_inputs: array of string or ResponseInputText or object { text, type } or 2 more`

                A list of inputs, each of which may be either an input text, output text, input
                image, or input audio object.

            - `role: "user" or "assistant" or "system" or "developer"`

              The role of the message input. One of `user`, `assistant`, `system`, or
              `developer`.

              - `"user"`

              - `"assistant"`

              - `"system"`

              - `"developer"`

            - `type: optional "message"`

              The type of the message input. Always `message`.

              - `"message"`

          - `labels: array of string`

            The labels to assign to each item in the evaluation.

          - `model: string`

            The model to use for the evaluation. Must support structured outputs.

          - `name: string`

            The name of the grader.

          - `passing_labels: array of string`

            The labels that indicate a passing result. Must be a subset of labels.

          - `type: "label_model"`

            The object type, which is always `label_model`.

      - `name: string`

        The name of the grader.

      - `type: "multi"`

        The object type, which is always `multi`.
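Some constraints stated in the grader schema above can be checked locally before calling the endpoint. The following Python sketch covers only two of them: the allowed `operation` values for a `string_check` grader, and the rule that `passing_labels` must be a subset of `labels` for a `label_model` grader. The function `check_grader` is a hypothetical helper, not part of the CLI, and it does not replace the `validate` call.

```python
STRING_CHECK_OPERATIONS = {"eq", "ne", "like", "ilike"}


def check_grader(grader: dict) -> list[str]:
    """Return a list of problems found in a grader dict (empty if none found)."""
    problems = []
    kind = grader.get("type")
    if kind == "string_check":
        # All four documented fields besides `type` are strings.
        for field in ("input", "name", "operation", "reference"):
            if not isinstance(grader.get(field), str):
                problems.append(f"{field} must be a string")
        if grader.get("operation") not in STRING_CHECK_OPERATIONS:
            problems.append("operation must be one of eq, ne, like, ilike")
    elif kind == "label_model":
        labels = grader.get("labels", [])
        passing = grader.get("passing_labels", [])
        # Documented rule: passing_labels must be a subset of labels.
        extra = sorted(set(passing) - set(labels))
        if extra:
            problems.append(f"passing_labels not in labels: {extra}")
    else:
        # Other grader types are outside the scope of this sketch.
        problems.append(f"unhandled grader type: {kind!r}")
    return problems


ok = check_grader({
    "type": "string_check",
    "input": "input",
    "name": "name",
    "operation": "eq",
    "reference": "reference",
})
bad = check_grader({
    "type": "label_model",
    "labels": ["good", "bad"],
    "passing_labels": ["good", "great"],
})
print(ok, bad)
```

A pre-check like this only catches shape errors early; the service remains the authority on whether a grader is valid.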

### Example

```cli
openai fine-tuning:alpha:graders validate \
  --api-key 'My API Key' \
  --grader '{input: input, name: name, operation: eq, reference: reference, type: string_check}'
```

#### Response

```json
{
  "grader": {
    "input": "input",
    "name": "name",
    "operation": "eq",
    "reference": "reference",
    "type": "string_check"
  }
}
```
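Because `grader` is documented as optional in `FineTuningAlphaGraderValidateResponse`, code that reads the response should tolerate its absence. A minimal Python sketch follows, assuming the response text has been captured as JSON. The helper name `validated_grader` is hypothetical, and whether the service always returns the grader unchanged is an assumption drawn only from the sample response above.

```python
import json


def validated_grader(response_text: str):
    """Parse a validate response and return its `grader` object, or None."""
    response = json.loads(response_text)
    # `grader` is optional in the documented schema, so it may be missing.
    return response.get("grader")


submitted = {
    "input": "input",
    "name": "name",
    "operation": "eq",
    "reference": "reference",
    "type": "string_check",
}

# Stand-in for captured CLI output, shaped like the sample response.
response_text = json.dumps({"grader": submitted})

echoed = validated_grader(response_text)
missing = validated_grader("{}")
print(echoed == submitted, missing)  # True None
```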