Create moderation
ModerationCreateResponse moderations().create(ModerationCreateParams params, RequestOptions requestOptions = RequestOptions.none())
POST /moderations
Classifies whether text and/or image inputs are potentially harmful. Learn more in the moderation guide.
Parameters
- ModerationCreateParams params
  - Input input: Input (or inputs) to classify. Can be a single string, an array of strings, or an array of multi-modal input objects similar to other models.
    - String
    - List<String>
    - List<ModerationMultiModalInput>
      - class ModerationImageUrlInput: An object describing an image to classify.
        - ImageUrl imageUrl: Contains either an image URL or a data URL for a base64 encoded image.
          - String url: Either a URL of the image or the base64 encoded image data.
        - JsonValue type: Constant "image_url". Always image_url.
          - IMAGE_URL("image_url")
      - class ModerationTextInput: An object describing text to classify.
        - String text: A string of text to classify.
        - JsonValue type: Constant "text". Always text.
          - TEXT("text")
  - Optional<ModerationModel> model: The content moderation model you would like to use. Learn more in the moderation guide, and learn about available models here.
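To make the shape of a multi-modal input concrete, the following is a minimal sketch of the JSON body that a request to /moderations might carry when it mixes a text part and an image part. It uses only the Java standard library. The class ModerationRequestSketch, its helper methods, and the example.com URL are hypothetical, and the JSON key names (input, type, text, image_url, url) are assumed to mirror the type constants listed above. In normal use the SDK builds this body from ModerationCreateParams.

```java
import java.util.List;

// Hypothetical helper, not part of the SDK: assembles a request body by hand.
final class ModerationRequestSketch {

    // Escapes characters that must not appear raw inside a JSON string.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // One text part: type is always "text".
    static String textPart(String text) {
        return "{\"type\":\"text\",\"text\":\"" + escape(text) + "\"}";
    }

    // One image part: type is always "image_url"; url may be a URL or a data URL.
    static String imagePart(String url) {
        return "{\"type\":\"image_url\",\"image_url\":{\"url\":\"" + escape(url) + "\"}}";
    }

    // Wraps already-serialized parts in the top-level "input" array.
    static String body(List<String> parts) {
        return "{\"input\":[" + String.join(",", parts) + "]}";
    }

    public static void main(String[] args) {
        String json = body(List.of(
                textPart("I want to kill them."),
                imagePart("https://example.com/image.png"))); // placeholder URL
        System.out.println(json);
    }
}
```

Hand-built strings are used here only to keep the sketch dependency-free; a JSON library or the SDK's own builders would be the more robust choice in real code.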
Returns
- class ModerationCreateResponse: Represents whether a given text input is potentially harmful.
  - String id: The unique identifier for the moderation request.
  - String model: The model used to generate the moderation results.
  - List<Moderation> results: A list of moderation objects.
    - Categories categories: A list of the categories, and whether they are flagged or not.
      - boolean harassment: Content that expresses, incites, or promotes harassing language towards any target.
      - boolean harassmentThreatening: Harassment content that also includes violence or serious harm towards any target.
      - boolean hate: Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. Hateful content aimed at non-protected groups (e.g., chess players) is harassment.
      - boolean hateThreatening: Hateful content that also includes violence or serious harm towards the targeted group based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.
      - Optional<Boolean> illicit: Content that includes instructions or advice that facilitate the planning or execution of wrongdoing, or that gives advice or instruction on how to commit illicit acts. For example, "how to shoplift" would fit this category.
      - Optional<Boolean> illicitViolent: Content that includes instructions or advice that facilitate the planning or execution of wrongdoing that also includes violence, or that gives advice or instruction on the procurement of any weapon.
      - boolean selfHarm: Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.
      - boolean selfHarmInstructions: Content that encourages performing acts of self-harm, such as suicide, cutting, and eating disorders, or that gives instructions or advice on how to commit such acts.
      - boolean selfHarmIntent: Content where the speaker expresses that they are engaging or intend to engage in acts of self-harm, such as suicide, cutting, and eating disorders.
      - boolean sexual: Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).
      - boolean sexualMinors: Sexual content that includes an individual who is under 18 years old.
      - boolean violence: Content that depicts death, violence, or physical injury.
      - boolean violenceGraphic: Content that depicts death, violence, or physical injury in graphic detail.
    - CategoryAppliedInputTypes categoryAppliedInputTypes: A list of the categories along with the input type(s) that the score applies to.
      - List<Harassment> harassment: The applied input type(s) for the category 'harassment'.
        - TEXT("text")
      - List<HarassmentThreatening> harassmentThreatening: The applied input type(s) for the category 'harassment/threatening'.
        - TEXT("text")
      - List<Hate> hate: The applied input type(s) for the category 'hate'.
        - TEXT("text")
      - List<HateThreatening> hateThreatening: The applied input type(s) for the category 'hate/threatening'.
        - TEXT("text")
      - List<Illicit> illicit: The applied input type(s) for the category 'illicit'.
        - TEXT("text")
      - List<IllicitViolent> illicitViolent: The applied input type(s) for the category 'illicit/violent'.
        - TEXT("text")
      - List<SelfHarm> selfHarm: The applied input type(s) for the category 'self-harm'.
        - TEXT("text")
        - IMAGE("image")
      - List<SelfHarmInstruction> selfHarmInstructions: The applied input type(s) for the category 'self-harm/instructions'.
        - TEXT("text")
        - IMAGE("image")
      - List<SelfHarmIntent> selfHarmIntent: The applied input type(s) for the category 'self-harm/intent'.
        - TEXT("text")
        - IMAGE("image")
      - List<Sexual> sexual: The applied input type(s) for the category 'sexual'.
        - TEXT("text")
        - IMAGE("image")
      - List<SexualMinor> sexualMinors: The applied input type(s) for the category 'sexual/minors'.
        - TEXT("text")
      - List<Violence> violence: The applied input type(s) for the category 'violence'.
        - TEXT("text")
        - IMAGE("image")
      - List<ViolenceGraphic> violenceGraphic: The applied input type(s) for the category 'violence/graphic'.
        - TEXT("text")
        - IMAGE("image")
    - CategoryScores categoryScores: A list of the categories along with their scores as predicted by the model.
      - double harassment: The score for the category 'harassment'.
      - double harassmentThreatening: The score for the category 'harassment/threatening'.
      - double hate: The score for the category 'hate'.
      - double hateThreatening: The score for the category 'hate/threatening'.
      - double illicit: The score for the category 'illicit'.
      - double illicitViolent: The score for the category 'illicit/violent'.
      - double selfHarm: The score for the category 'self-harm'.
      - double selfHarmInstructions: The score for the category 'self-harm/instructions'.
      - double selfHarmIntent: The score for the category 'self-harm/intent'.
      - double sexual: The score for the category 'sexual'.
      - double sexualMinors: The score for the category 'sexual/minors'.
      - double violence: The score for the category 'violence'.
      - double violenceGraphic: The score for the category 'violence/graphic'.
    - boolean flagged: Whether any of the categories are flagged.
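Because categoryScores exposes a numeric score per category alongside the boolean flags, an application may want to apply its own cutoff. The following is a minimal sketch of that idea, assuming the scores have already been copied into a plain map keyed by category name and that a higher score means higher model confidence. The class ScoreThresholdSketch, the method categoriesAtOrAbove, the sample scores, and the 0.5 cutoff are all hypothetical and are not part of the SDK or a recommended setting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SDK: applies a custom cutoff to category scores.
final class ScoreThresholdSketch {

    // Returns the category names whose score is at or above the threshold,
    // in the iteration order of the input map.
    static List<String> categoriesAtOrAbove(Map<String, Double> scores, double threshold) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Double> entry : scores.entrySet()) {
            if (entry.getValue() >= threshold) {
                hits.add(entry.getKey());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up scores for illustration only; real values come from categoryScores.
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("harassment", 0.62);
        scores.put("hate", 0.03);
        scores.put("violence", 0.91);

        System.out.println(categoriesAtOrAbove(scores, 0.5));
    }
}
```

A per-category threshold map would be a natural extension if some categories need stricter handling than others.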
Example
package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.models.moderations.ModerationCreateParams;
import com.openai.models.moderations.ModerationCreateResponse;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        // Configure the client from environment variables.
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

        // Build a request with a single text input.
        ModerationCreateParams params = ModerationCreateParams.builder()
            .input("I want to kill them.")
            .build();

        // Send the request and receive the moderation result.
        ModerationCreateResponse moderation = client.moderations().create(params);
    }
}
Response
{
  "id": "id",
  "model": "model",
  "results": [
    {
      "categories": {
        "harassment": true,
        "harassment/threatening": true,
        "hate": true,
        "hate/threatening": true,
        "illicit": true,
        "illicit/violent": true,
        "self-harm": true,
        "self-harm/instructions": true,
        "self-harm/intent": true,
        "sexual": true,
        "sexual/minors": true,
        "violence": true,
        "violence/graphic": true
      },
      "category_applied_input_types": {
        "harassment": ["text"],
        "harassment/threatening": ["text"],
        "hate": ["text"],
        "hate/threatening": ["text"],
        "illicit": ["text"],
        "illicit/violent": ["text"],
        "self-harm": ["text"],
        "self-harm/instructions": ["text"],
        "self-harm/intent": ["text"],
        "sexual": ["text"],
        "sexual/minors": ["text"],
        "violence": ["text"],
        "violence/graphic": ["text"]
      },
      "category_scores": {
        "harassment": 0,
        "harassment/threatening": 0,
        "hate": 0,
        "hate/threatening": 0,
        "illicit": 0,
        "illicit/violent": 0,
        "self-harm": 0,
        "self-harm/instructions": 0,
        "self-harm/intent": 0,
        "sexual": 0,
        "sexual/minors": 0,
        "violence": 0,
        "violence/graphic": 0
      },
      "flagged": true
    }
  ]
}
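The categories and category_applied_input_types objects in the response share the same keys, so they can be read together. The following is a minimal sketch of pairing each flagged category with the input types it applied to, assuming both objects have already been converted into plain Java maps. The class FlaggedSummarySketch, the method flaggedWithInputTypes, and the sample data are hypothetical and are not part of the SDK. Null flags are treated as not flagged, since illicit and illicitViolent are optional.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the SDK: joins flags with their applied input types.
final class FlaggedSummarySketch {

    // Keeps only categories whose flag is true and maps each one to its input types.
    // A missing entry in appliedInputTypes yields an empty list.
    static Map<String, List<String>> flaggedWithInputTypes(
            Map<String, Boolean> categories,
            Map<String, List<String>> appliedInputTypes) {
        Map<String, List<String>> summary = new LinkedHashMap<>();
        for (Map.Entry<String, Boolean> entry : categories.entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                summary.put(entry.getKey(),
                        appliedInputTypes.getOrDefault(entry.getKey(), List.of()));
            }
        }
        return summary;
    }

    public static void main(String[] args) {
        // Made-up values for illustration; keys follow the response shown above.
        Map<String, Boolean> categories = new LinkedHashMap<>();
        categories.put("harassment", false);
        categories.put("violence", true);

        Map<String, List<String>> appliedInputTypes = new LinkedHashMap<>();
        appliedInputTypes.put("harassment", List.of("text"));
        appliedInputTypes.put("violence", List.of("text", "image"));

        System.out.println(flaggedWithInputTypes(categories, appliedInputTypes));
    }
}
```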