
ruby/resources/audio/subresources/speech/index.md 2026-06-10 15:48 UTC to 2026-06-12 00:01 UTC


Speech

Create speech

audio.speech.create(**kwargs) -> StringIO

post /audio/speech

Generates audio from the input text.

Returns the audio file content, or a stream of audio events.

Parameters

  • input: String

    The text to generate audio for. The maximum length is 4096 characters.

  • model: String | SpeechModel

    One of the available TTS models: tts-1, tts-1-hd, gpt-4o-mini-tts, or gpt-4o-mini-tts-2025-12-15.

    • String = String

    • SpeechModel = :"tts-1" | :"tts-1-hd" | :"gpt-4o-mini-tts" | :"gpt-4o-mini-tts-2025-12-15"

      • :"tts-1"

      • :"tts-1-hd"

      • :"gpt-4o-mini-tts"

      • :"gpt-4o-mini-tts-2025-12-15"

  • voice: String | :alloy | :ash | :ballad | 7 more | ID { id }

    The voice to use when generating the audio. Supported built-in voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, and cedar. You may also provide a custom voice object with an id, for example { "id": "voice_1234" }. Previews of the voices are available in the Text to speech guide.

    • String = String

    • Voice = :alloy | :ash | :ballad | 7 more

      • :alloy

      • :ash

      • :ballad

      • :coral

      • :echo

      • :sage

      • :shimmer

      • :verse

      • :marin

      • :cedar

    • class ID

      Custom voice reference.

      • id: String

        The custom voice ID, e.g. voice_1234.

  • instructions: String

    Control the voice of your generated audio with additional instructions. Does not work with tts-1 or tts-1-hd.

  • response_format: :mp3 | :opus | :aac | 3 more

    The format of the generated audio. Supported formats are mp3, opus, aac, flac, wav, and pcm.

    • :mp3

    • :opus

    • :aac

    • :flac

    • :wav

    • :pcm

  • speed: Float

    The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.

  • stream_format: :sse | :audio

    The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd.

    • :sse

    • :audio
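The constraints listed above (the 4096-character input limit, the 0.25 to 4.0 speed range, and the options that do not work with tts-1 or tts-1-hd) can be checked locally before a request is sent. The following is a minimal sketch, not part of the openai gem: `speech_param_errors` and its constants are hypothetical names, and it assumes the character limit corresponds to Ruby's `String#length`.

```ruby
# Hypothetical pre-flight check; none of these names come from the openai gem.
LEGACY_TTS_MODELS = %w[tts-1 tts-1-hd].freeze
MAX_INPUT_CHARS = 4096
SPEED_RANGE = (0.25..4.0)

# Returns an array of human-readable problems; an empty array means the
# parameters satisfy the constraints documented above.
def speech_param_errors(input:, model:, instructions: nil, speed: nil, stream_format: nil, **_other)
  errors = []
  # Assumption: the documented limit is counted the same way as String#length.
  errors << "input exceeds #{MAX_INPUT_CHARS} characters" if input.length > MAX_INPUT_CHARS
  errors << "speed must be between 0.25 and 4.0" if speed && !SPEED_RANGE.cover?(speed)

  if LEGACY_TTS_MODELS.include?(model.to_s)
    errors << "instructions does not work with #{model}" if instructions
    errors << "stream_format sse is not supported for #{model}" if stream_format.to_s == "sse"
  end

  errors
end
```

Calling `speech_param_errors(input: "Hello", model: :"tts-1", voice: :alloy)` returns an empty array, while adding `instructions:` or `stream_format: :sse` for the same model returns one message per violated constraint.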

Returns

  • StringIO
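Because the return value is a StringIO, the audio bytes can be read from it and written to disk in binary mode. The sketch below shows one way to do that; `save_audio` is a hypothetical helper, not an SDK method, and the file name in the usage comment is only illustrative.

```ruby
require "stringio"

# Hypothetical helper (not part of the openai gem): writes the bytes held by
# an IO-like object to `path` and returns the number of bytes written.
def save_audio(io, path)
  io.rewind                     # start from the first byte in case the IO was already read
  File.binwrite(path, io.read)  # binary mode avoids newline and encoding translation
end

# Usage, assuming `speech` is the StringIO returned by audio.speech.create:
#   save_audio(speech, "speech.mp3")
```

The file extension should match the requested response_format; the SDK call itself is left in a comment so the helper can be tried without network access.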

Example

require "openai"

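# Replace the placeholder with a real API key.
openai = OpenAI::Client.new(api_key: "My API Key")

# Returns a StringIO containing the generated audio.
speech = openai.audio.speech.create(input: "input", model: :"tts-1", voice: :alloy)

# Prints the StringIO object itself; call speech.read to get the audio bytes.
puts(speech)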
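Since input is limited to 4096 characters, longer text has to be split across several requests. The following is a rough sketch of one way to do the splitting, preferring to break at whitespace; `split_for_speech` and `SPEECH_INPUT_LIMIT` are hypothetical names, and how the resulting audio parts are joined afterwards is left open.

```ruby
# Assumption: the documented 4096-character limit matches String#length.
SPEECH_INPUT_LIMIT = 4096

# Splits text into pieces of at most `limit` characters. Breaks at the last
# whitespace within the limit when there is one, otherwise cuts at the limit.
def split_for_speech(text, limit: SPEECH_INPUT_LIMIT)
  chunks = []
  rest = text.strip
  until rest.empty?
    if rest.length <= limit
      chunks << rest
      break
    end
    cut = rest.rindex(/\s/, limit) || limit  # last whitespace at or before `limit`
    cut = limit if cut.zero?                 # guard against an empty chunk
    chunks << rest[0, cut].rstrip
    rest = rest[cut..].lstrip
  end
  chunks
end

# With a configured client as in the example above, each chunk could be sent separately:
#   split_for_speech(long_text).each_with_index do |chunk, i|
#     audio = openai.audio.speech.create(input: chunk, model: :"tts-1", voice: :alloy)
#     File.binwrite("part-#{i}.mp3", audio.read)
#   end
```

Breaking at whitespace keeps words intact; splitting at sentence boundaries may give more natural pauses but needs more logic than this sketch includes.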

Domain Types

Speech Model

  • SpeechModel = :"tts-1" | :"tts-1-hd" | :"gpt-4o-mini-tts" | :"gpt-4o-mini-tts-2025-12-15"

    • :"tts-1"

    • :"tts-1-hd"

    • :"gpt-4o-mini-tts"

    • :"gpt-4o-mini-tts-2025-12-15"