Alpha

Graders

Run grader

$ openai fine-tuning:alpha:graders run

post /fine_tuning/alpha/graders/run

Run a grader.

Parameters

--grader: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more

The grader used for the fine-tuning job.

--model-sample: string

The model sample to be evaluated. This value populates the sample namespace. If the model sample is a valid JSON string, the output_json variable is populated as well. See the graders guide (https://platform.openai.com/docs/guides/graders) for more details.

--item: optional unknown

The dataset item provided to the grader. This value populates the item namespace. See the graders guide (https://platform.openai.com/docs/guides/graders) for more details.
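
As a rough illustration of how the three parameters fit together, the sketch below assembles a JSON body for the run endpoint in Python. It rests on several assumptions: the body field names `grader`, `model_sample`, and `item` are assumed to mirror the CLI flags, `build_run_payload` is a hypothetical helper, the `{{...}}` template syntax is assumed from the guide, and `answer` is a made-up item field. Nothing is sent over the network.

```python
import json


def build_run_payload(grader, model_sample, item=None):
    """Assemble a request body for POST /fine_tuning/alpha/graders/run.

    Field names are assumed to mirror the CLI flags (--grader,
    --model-sample, --item). `item` is optional and left out when None.
    """
    if not isinstance(model_sample, str):
        raise TypeError("model_sample must be a string")
    payload = {"grader": grader, "model_sample": model_sample}
    if item is not None:
        payload["item"] = item
    return payload


grader = {
    "type": "string_check",
    "name": "exact_match",                # hypothetical grader name
    "operation": "eq",
    "input": "{{sample.output_text}}",    # assumed template syntax (sample namespace)
    "reference": "{{item.answer}}",       # assumed template syntax; `answer` is hypothetical
}

payload = build_run_payload(grader, "Paris", item={"answer": "Paris"})
print(json.dumps(payload, indent=2))
```

Leaving `item` out of the payload when it is not supplied matches its `optional` marker in the parameter list.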

Returns

FineTuningAlphaGraderRunResponse: object { metadata, model_grader_token_usage_per_model, reward, sub_rewards }
- metadata: object { errors, execution_time, name, 4 more }
  - errors: object { formula_parse_error, invalid_variable_error, model_grader_parse_error, 11 more }
    - formula_parse_error: boolean
    - invalid_variable_error: boolean
    - model_grader_parse_error: boolean
    - model_grader_refusal_error: boolean
    - model_grader_server_error: boolean
    - model_grader_server_error_details: string
    - other_error: boolean
    - python_grader_runtime_error: boolean
    - python_grader_runtime_error_details: string
    - python_grader_server_error: boolean
    - python_grader_server_error_type: string
    - sample_parse_error: boolean
    - truncated_observation_error: boolean
    - unresponsive_reward_error: boolean
  - execution_time: number
  - name: string
  - sampled_model_name: string
  - scores: map[unknown]
  - token_usage: number
  - type: string
- model_grader_token_usage_per_model: map[unknown]
- reward: number
- sub_rewards: map[unknown]

Example

openai fine-tuning:alpha:graders run \
  --api-key 'My API Key' \
  --grader '{input: input, name: name, operation: eq, reference: reference, type: string_check}' \
  --model-sample model_sample

Response

{
  "metadata": {
    "errors": {
      "formula_parse_error": true,
      "invalid_variable_error": true,
      "model_grader_parse_error": true,
      "model_grader_refusal_error": true,
      "model_grader_server_error": true,
      "model_grader_server_error_details": "model_grader_server_error_details",
      "other_error": true,
      "python_grader_runtime_error": true,
      "python_grader_runtime_error_details": "python_grader_runtime_error_details",
      "python_grader_server_error": true,
      "python_grader_server_error_type": "python_grader_server_error_type",
      "sample_parse_error": true,
      "truncated_observation_error": true,
      "unresponsive_reward_error": true
    },
    "execution_time": 0,
    "name": "name",
    "sampled_model_name": "sampled_model_name",
    "scores": {
      "foo": "bar"
    },
    "token_usage": 0,
    "type": "type"
  },
  "model_grader_token_usage_per_model": {
    "foo": "bar"
  },
  "reward": 0,
  "sub_rewards": {
    "foo": "bar"
  }
}
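
The example response above appears to be generated placeholder data, since every error flag is set to true. When reading a real response, it can help to reduce `metadata.errors` to just the flags that fired. The following is a minimal sketch under that assumption; `active_errors` is a hypothetical helper and the sample values are illustrative only.

```python
def active_errors(response):
    """Return the names of boolean error flags that are True in metadata.errors.

    String-valued fields such as *_details and *_type are skipped because
    only values that are exactly True are kept.
    """
    errors = response.get("metadata", {}).get("errors", {})
    return sorted(name for name, value in errors.items() if value is True)


# Illustrative response fragment, not real API output.
sample = {
    "metadata": {
        "errors": {
            "formula_parse_error": False,
            "python_grader_runtime_error": True,
            "python_grader_runtime_error_details": "ZeroDivisionError",
            "sample_parse_error": False,
        }
    },
    "reward": 0,
}

print(active_errors(sample))
```

A `reward` of 0 accompanied by a non-empty result from this helper most likely points to a failed grading run rather than a genuinely low score, though that interpretation is an assumption and not something the schema states.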

Validate grader

$ openai fine-tuning:alpha:graders validate

post /fine_tuning/alpha/graders/validate

Validate a grader.

Parameters

--grader: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more

The grader used for the fine-tuning job.

Returns

FineTuningAlphaGraderValidateResponse: object { grader }
- grader: optional StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more
  
  The grader used for the fine-tuning job.
  - string_check_grader: object { input, name, operation, 2 more }
    
    A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
    - input: string
      
      The input text. This may include template strings.
    - name: string
      
      The name of the grader.
    - operation: "eq" or "ne" or "like" or "ilike"
      
      The string check operation to perform. One of eq, ne, like, or ilike.
      - "eq"
      - "ne"
      - "like"
      - "ilike"
    - reference: string
      
      The reference text. This may include template strings.
    - type: "string_check"
      
      The object type, which is always string_check.
  - text_similarity_grader: object { evaluation_metric, input, name, 2 more }
    
    A TextSimilarityGrader object which grades text based on similarity metrics.
    - evaluation_metric: "cosine" or "fuzzy_match" or "bleu" or 8 more
      
      The evaluation metric to use. One of cosine, fuzzy_match, bleu, gleu, meteor, rouge_1, rouge_2, rouge_3, rouge_4, rouge_5, or rouge_l.
      - "cosine"
      - "fuzzy_match"
      - "bleu"
      - "gleu"
      - "meteor"
      - "rouge_1"
      - "rouge_2"
      - "rouge_3"
      - "rouge_4"
      - "rouge_5"
      - "rouge_l"
    - input: string
      
      The text being graded.
    - name: string
      
      The name of the grader.
    - reference: string
      
      The text being graded against.
    - type: "text_similarity"
      
      The type of grader.
  - python_grader: object { name, source, type, image_tag }
    
    A PythonGrader object that runs a Python script on the input.
    - name: string
      
      The name of the grader.
    - source: string
      
      The source code of the Python script.
    - type: "python"
      
      The object type, which is always python.
    - image_tag: optional string
      
      The image tag to use for the Python script.
  - score_model_grader: object { input, model, name, 3 more }
    
    A ScoreModelGrader object that uses a model to assign a score to the input.
    - input: array of object { content, role, type }
      
      The input messages evaluated by the grader. Supports text, output text, input image, and input audio content blocks, and may include template strings.
      - content: string or ResponseInputText or object { text, type } or 3 more
        
        Inputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
        
        - Text input: string

          A text input to the model.
        - response_input_text: object { text, type }

          A text input to the model.
          - text: string

            The text input to the model.
          - type: "input_text"

            The type of the input item. Always input_text.
        - Output text: object { text, type }

          A text output from the model.
          - text: string

            The text output from the model.
          - type: "output_text"

            The type of the output text. Always output_text.
        - Input image: object { image_url, type, detail }

          An image input block used within EvalItem content arrays.
          - image_url: string

            The URL of the image input.
          - type: "input_image"

            The type of the image input. Always input_image.
          - detail: optional string

            The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
        - response_input_audio: object { input_audio, type }

          An audio input to the model.
          - input_audio: object { data, format }
            - data: string

              Base64-encoded audio data.
            - format: "mp3" or "wav"

              The format of the audio data. Currently supported formats are mp3 and wav.
              - "mp3"
              - "wav"
          - type: "input_audio"

            The type of the input item. Always input_audio.
        - grader_inputs: array of string or ResponseInputText or object { text, type } or 2 more

          A list of inputs, each of which may be either an input text, output text, input image, or input audio object.
          - Text input: string

            A text input to the model.
          - response_input_text: object { text, type }

            A text input to the model.
          - Output text: object { text, type }

            A text output from the model.
            - text: string

              The text output from the model.
            - type: "output_text"

              The type of the output text. Always output_text.
          - Input image: object { image_url, type, detail }

            An image input block used within EvalItem content arrays.
            - image_url: string

              The URL of the image input.
            - type: "input_image"

              The type of the image input. Always input_image.
            - detail: optional string

              The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
          - response_input_audio: object { input_audio, type }

            An audio input to the model.
      - role: "user" or "assistant" or "system" or "developer"

        The role of the message input. One of user, assistant, system, or developer.
        - "user"
        - "assistant"
        - "system"
        - "developer"
      - type: optional "message"

        The type of the message input. Always message.
        - "message"
    - model: string
      
      The model to use for the evaluation.
    - name: string
      
      The name of the grader.
    - type: "score_model"
      
      The object type, which is always score_model.
    - range: optional array of number
      
      The range of the score. Defaults to [0, 1].
    - sampling_params: optional object { max_completions_tokens, reasoning_effort, seed, 2 more }
      
      The sampling parameters for the model.
      - max_completions_tokens: optional number
        
        The maximum number of tokens the grader model may generate in its response.
      - reasoning_effort: optional "none" or "minimal" or "low" or 3 more
        
        Constrains effort on reasoning for reasoning models. Currently supported values are none, minimal, low, medium, high, and xhigh. Reducing reasoning effort can result in faster responses and fewer tokens used on reasoning in a response.
        
        gpt-5.1 defaults to none, which does not perform reasoning. The supported reasoning values for gpt-5.1 are none, low, medium, and high. Tool calls are supported for all reasoning values in gpt-5.1.
        
        All models before gpt-5.1 default to medium reasoning effort, and do not support none.
        
        The gpt-5-pro model defaults to (and only supports) high reasoning effort.
        
        xhigh is supported for all models after gpt-5.1-codex-max.
        
        "none"
        
        "minimal"
        
        "low"
        
        "medium"
        
        "high"
        
        "xhigh"
      - seed: optional number
        
        A seed value that initializes the randomness during sampling.
      - temperature: optional number
        
        A higher temperature increases randomness in the outputs.
      - top_p: optional number
        
        An alternative to temperature for nucleus sampling; 1.0 includes all tokens.
  - multi_grader: object { calculate_output, graders, name, type }
    
    A MultiGrader object that combines the outputs of multiple graders to produce a single score.
    - calculate_output: string
      
      A formula to calculate the output based on grader results.
    - graders: StringCheckGrader or TextSimilarityGrader or PythonGrader or 2 more
      
      A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
      - string_check_grader: object { input, name, operation, 2 more }
        
        A StringCheckGrader object that performs a string comparison between input and reference using a specified operation.
      - text_similarity_grader: object { evaluation_metric, input, name, 2 more }
        
        A TextSimilarityGrader object which grades text based on similarity metrics.
      - python_grader: object { name, source, type, image_tag }
        
        A PythonGrader object that runs a Python script on the input.
      - score_model_grader: object { input, model, name, 3 more }
        
        A ScoreModelGrader object that uses a model to assign a score to the input.
      - label_model_grader: object { input, labels, model, 3 more }
        
        A LabelModelGrader object which uses a model to assign labels to each item in the evaluation.
        
        - input: array of object { content, role, type }
          - content: string or ResponseInputText or object { text, type } or 3 more

            Inputs to the model - can contain template strings. Supports text, output text, input images, and input audio, either as a single item or an array of items.
            - Text input: string

              A text input to the model.
            - response_input_text: object { text, type }

              A text input to the model.
            - Output text: object { text, type }

              A text output from the model.
              - text: string

                The text output from the model.
              - type: "output_text"

                The type of the output text. Always output_text.
            - Input image: object { image_url, type, detail }

              An image input block used within EvalItem content arrays.
              - image_url: string

                The URL of the image input.
              - type: "input_image"

                The type of the image input. Always input_image.
              - detail: optional string

                The detail level of the image to be sent to the model. One of high, low, or auto. Defaults to auto.
            - response_input_audio: object { input_audio, type }

              An audio input to the model.
            - grader_inputs: array of string or ResponseInputText or object { text, type } or 2 more

              A list of inputs, each of which may be either an input text, output text, input image, or input audio object.
          - role: "user" or "assistant" or "system" or "developer"

            The role of the message input. One of user, assistant, system, or developer.
            - "user"
            - "assistant"
            - "system"
            - "developer"
          - type: optional "message"

            The type of the message input. Always message.
            - "message"
        - labels: array of string

          The labels to assign to each item in the evaluation.
        - model: string

          The model to use for the evaluation. Must support structured outputs.
        - name: string

          The name of the grader.
        - passing_labels: array of string

          The labels that indicate a passing result. Must be a subset of labels.
        - type: "label_model"

          The object type, which is always label_model.
    - name: string
      
      The name of the grader.
    - type: "multi"
      
      The object type, which is always multi.
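
Before calling the validate endpoint, a grader definition can be given a quick local sanity check against the non-optional fields listed in the schema above. The sketch below is only that: `REQUIRED_FIELDS` and `missing_fields` are hypothetical names, the table is transcribed from the field lists in this section, and the server-side validation remains the authority. Note that `label_model` is listed above only as a sub-grader of `multi`.

```python
# Non-optional fields per grader type, transcribed from the schema above.
REQUIRED_FIELDS = {
    "string_check": {"input", "name", "operation", "reference", "type"},
    "text_similarity": {"evaluation_metric", "input", "name", "reference", "type"},
    "python": {"name", "source", "type"},
    "score_model": {"input", "model", "name", "type"},
    "label_model": {"input", "labels", "model", "name", "passing_labels", "type"},
    "multi": {"calculate_output", "graders", "name", "type"},
}


def missing_fields(grader):
    """Return the sorted names of required fields absent from a grader dict."""
    grader_type = grader.get("type")
    if grader_type not in REQUIRED_FIELDS:
        raise ValueError(f"unknown grader type: {grader_type!r}")
    return sorted(REQUIRED_FIELDS[grader_type] - set(grader))


# A string_check grader that is missing two of its required fields.
draft = {"type": "string_check", "name": "exact_match", "input": "some input"}
print(missing_fields(draft))
```

This check covers field presence only. It does not verify enum values such as `operation` or `evaluation_metric`, nor does it recurse into the sub-graders of a `multi` grader.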

Example

openai fine-tuning:alpha:graders validate \
  --api-key 'My API Key' \
  --grader '{input: input, name: name, operation: eq, reference: reference, type: string_check}'

Response

{
  "grader": {
    "input": "input",
    "name": "name",
    "operation": "eq",
    "reference": "reference",
    "type": "string_check"
  }
}
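
The grader echoed in the response above uses the `eq` operation, but this page lists the four operations without describing their semantics. The sketch below shows one plausible reading, under the assumption that `eq` and `ne` are exact, case-sensitive comparisons, that `like` is a case-sensitive substring test, and that `ilike` is its case-insensitive counterpart. These semantics and the 1.0/0.0 scores are assumptions to confirm against the graders guide, and `string_check` is a hypothetical local helper, not part of the CLI.

```python
def string_check(operation, input_text, reference):
    """Score input_text against reference, returning 1.0 on a pass and 0.0 otherwise.

    Assumed semantics (not confirmed by this page):
      eq    - exact, case-sensitive equality
      ne    - exact, case-sensitive inequality
      like  - input contains reference, case-sensitive
      ilike - input contains reference, case-insensitive
    """
    if operation == "eq":
        passed = input_text == reference
    elif operation == "ne":
        passed = input_text != reference
    elif operation == "like":
        passed = reference in input_text
    elif operation == "ilike":
        passed = reference.lower() in input_text.lower()
    else:
        raise ValueError(f"unsupported operation: {operation!r}")
    return 1.0 if passed else 0.0


for op in ("eq", "ne", "like", "ilike"):
    print(op, string_check(op, "the capital is paris", "Paris"))
```

Running a grader definition through a local stand-in like this can make it easier to choose between `like` and `ilike` before spending a call on the run endpoint.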

cli/resources/fine_tuning/subresources/alpha/index.md +611 −0 created

cli/resources/fine_tuning/subresources/alpha/index.md 2026-05-05 23:00 UTC to 2026-05-07 21:57 UTC
