java/resources/audio/subresources/speech/index.md 2026-05-02 05:57 UTC to 2026-05-05 23:00 UTC


Speech

Create speech

HttpResponse audio().speech().create(SpeechCreateParams params, RequestOptions requestOptions = RequestOptions.none())

post /audio/speech

Generates audio from the input text.

Returns the audio file content, or a stream of audio events.
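To make the mapping between the SDK method and the REST route concrete, the following is a minimal sketch that builds (without sending) the equivalent POST request using only the JDK's java.net.http package. The class name SpeechRequestSketch, the placeholder API key, and the base URL constant are illustrative assumptions, not part of the SDK; the snake_case JSON field names follow the REST API rather than the Java builder names.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper, not part of the SDK: shows the raw request behind
// audio().speech().create(...).
final class SpeechRequestSketch {
    // Assumed public API base URL; adjust for proxies or other deployments.
    static final String BASE_URL = "https://api.openai.com/v1";

    /** Builds, but does not send, a POST request to /audio/speech. */
    static HttpRequest build(String apiKey, String jsonBody) {
        return HttpRequest.newBuilder()
            .uri(URI.create(BASE_URL + "/audio/speech"))
            .header("Authorization", "Bearer " + apiKey)
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }

    public static void main(String[] args) {
        // Minimal body with the three required fields.
        String body = "{\"model\":\"tts-1\",\"input\":\"Hello\",\"voice\":\"alloy\"}";
        HttpRequest request = build("sk-placeholder", body);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

In practice the SDK client handles authentication, retries, and serialization, so a hand-built request like this is mainly useful for debugging or for understanding what goes over the wire.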

Parameters

  • SpeechCreateParams params

    • String input

      The text to generate audio for. The maximum length is 4096 characters.

    • SpeechModel model

      One of the available TTS models: tts-1, tts-1-hd, gpt-4o-mini-tts, or gpt-4o-mini-tts-2025-12-15.

      • TTS_1("tts-1")

      • TTS_1_HD("tts-1-hd")

      • GPT_4O_MINI_TTS("gpt-4o-mini-tts")

      • GPT_4O_MINI_TTS_2025_12_15("gpt-4o-mini-tts-2025-12-15")

    • Voice voice

      The voice to use when generating the audio. Supported built-in voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, and cedar. You may also provide a custom voice object with an id, for example { "id": "voice_1234" }. Previews of the voices are available in the Text to speech guide.

      • String

      • enum UnionMember1:

        • ALLOY("alloy")

        • ASH("ash")

        • BALLAD("ballad")

        • CORAL("coral")

        • ECHO("echo")

        • SAGE("sage")

        • SHIMMER("shimmer")

        • VERSE("verse")

        • MARIN("marin")

        • CEDAR("cedar")

      • class Id:

        Custom voice reference.

        • String id

          The custom voice ID, e.g. voice_1234.

    • Optional<String> instructions

      Control the voice of your generated audio with additional instructions. Does not work with tts-1 or tts-1-hd.

    • Optional<ResponseFormat> responseFormat

      The format of the generated audio. Supported formats are mp3, opus, aac, flac, wav, and pcm.

      • MP3("mp3")

      • OPUS("opus")

      • AAC("aac")

      • FLAC("flac")

      • WAV("wav")

      • PCM("pcm")

    • Optional<Double> speed

      The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.

    • Optional<StreamFormat> streamFormat

      The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd.

      • SSE("sse")

      • AUDIO("audio")
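The constraints listed for these parameters (input length, speed range, and the features that tts-1 and tts-1-hd do not support) can be checked on the client before a request is sent. The sketch below is one possible way to do that with plain strings; SpeechParamChecks and its method names are hypothetical and not part of the SDK. It also assumes that the 4096-character limit can be approximated with String.length(), which counts UTF-16 code units; the server may count characters differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pre-flight checks based on the parameter descriptions above.
final class SpeechParamChecks {
    static final int MAX_INPUT_CHARS = 4096;
    static final double MIN_SPEED = 0.25;
    static final double MAX_SPEED = 4.0;

    /** True for the two models documented as not supporting instructions or sse. */
    static boolean isLegacyModel(String model) {
        return model.equals("tts-1") || model.equals("tts-1-hd");
    }

    /**
     * Returns human-readable problems; an empty list means no documented
     * constraint was violated. Optional parameters may be passed as null.
     */
    static List<String> check(String input, String model, Double speed,
                              String instructions, String streamFormat) {
        List<String> problems = new ArrayList<>();
        if (input.length() > MAX_INPUT_CHARS) {
            problems.add("input exceeds " + MAX_INPUT_CHARS + " characters");
        }
        if (speed != null && (speed < MIN_SPEED || speed > MAX_SPEED)) {
            problems.add("speed must be between 0.25 and 4.0");
        }
        if (instructions != null && isLegacyModel(model)) {
            problems.add("instructions do not work with " + model);
        }
        if ("sse".equals(streamFormat) && isLegacyModel(model)) {
            problems.add("sse is not supported for " + model);
        }
        return problems;
    }

    public static void main(String[] args) {
        // Out-of-range speed plus two features that tts-1 does not support.
        System.out.println(check("Hello", "tts-1", 5.0, "Speak calmly", "sse"));
    }
}
```

Returning a list of problems rather than throwing on the first one lets a caller report every issue at once; the server remains the authority on what is actually accepted.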

Example

package com.openai.example;

import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.HttpResponse;
import com.openai.models.audio.speech.SpeechCreateParams;
import com.openai.models.audio.speech.SpeechModel;

public final class Main {
    private Main() {}

    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.fromEnv();

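        // input, model, and voice are required; the values here are placeholders.
        SpeechCreateParams params = SpeechCreateParams.builder()
            .input("input")
            .model(SpeechModel.TTS_1)
            .voice("alloy")
            .build();
        // The response wraps the raw audio stream, so close it when done.
        try (HttpResponse speech = client.audio().speech().create(params)) {
            // Read the audio bytes from speech.body().
        }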
    }
}
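The example above obtains an HttpResponse but does not show what to do with the audio. A minimal sketch of saving the stream to disk follows, assuming the response exposes its payload as a java.io.InputStream through body(). AudioSaver is a hypothetical helper that relies only on the JDK, so it works with any InputStream.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the SDK.
final class AudioSaver {
    /**
     * Copies the audio stream to target, replacing any existing file,
     * closes the stream, and returns the number of bytes written.
     */
    static long save(InputStream audio, Path target) throws IOException {
        try (InputStream in = audio) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Assumed usage with the SDK (requires the openai-java dependency):
    //
    //   try (HttpResponse speech = client.audio().speech().create(params)) {
    //       AudioSaver.save(speech.body(), Path.of("speech.mp3"));
    //   }
}
```

The file extension should match the requested responseFormat; mp3 is used here only as an illustration.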

Domain Types

Speech Model

  • enum SpeechModel:

    • TTS_1("tts-1")

    • TTS_1_HD("tts-1-hd")

    • GPT_4O_MINI_TTS("gpt-4o-mini-tts")

    • GPT_4O_MINI_TTS_2025_12_15("gpt-4o-mini-tts-2025-12-15")