Speech

Create speech

audio.speech.create(**kwargs) -> StringIO

post /audio/speech

Generates audio from the input text.

Returns the audio file content, or a stream of audio events.

Parameters

input: String

The text to generate audio for. The maximum length is 4096 characters.
model: String | SpeechModel

One of the available TTS models: tts-1, tts-1-hd, gpt-4o-mini-tts, or gpt-4o-mini-tts-2025-12-15.
- String = String
- SpeechModel = :"tts-1" | :"tts-1-hd" | :"gpt-4o-mini-tts" | :"gpt-4o-mini-tts-2025-12-15"
  - :"tts-1"
  - :"tts-1-hd"
  - :"gpt-4o-mini-tts"
  - :"gpt-4o-mini-tts-2025-12-15"
voice: String | :alloy | :ash | :ballad | 7 more | ID{ id}

The voice to use when generating the audio. Supported built-in voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse, marin, and cedar. You may also provide a custom voice object with an id, for example { "id": "voice_1234" }. Previews of the voices are available in the Text to speech guide.
- String = String
- Voice = :alloy | :ash | :ballad | 7 more
  - :alloy
  - :ash
  - :ballad
  - :coral
  - :echo
  - :sage
  - :shimmer
  - :verse
  - :marin
  - :cedar
- class ID
  
  Custom voice reference.
  - id: String
    
    The custom voice ID, e.g. voice_1234.
instructions: String

Control the voice of your generated audio with additional instructions. Does not work with tts-1 or tts-1-hd.
response_format: :mp3 | :opus | :aac | 3 more

The format to return the audio in. Supported formats are mp3, opus, aac, flac, wav, and pcm.
- :mp3
- :opus
- :aac
- :flac
- :wav
- :pcm
speed: Float

The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.
stream_format: :sse | :audio

The format to stream the audio in. Supported formats are sse and audio. sse is not supported for tts-1 or tts-1-hd.
- :sse
- :audio
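
Because input is limited to 4096 characters, longer text has to be split before it is sent. The following is a minimal sketch of one way to do that, not something the SDK provides: split_for_speech and MAX_INPUT_CHARS are hypothetical names, and breaking at whitespace is an assumption about what sounds acceptable, not a documented requirement.

```ruby
# Maximum input length stated in the parameter description above.
MAX_INPUT_CHARS = 4096

# Hypothetical helper (not part of the SDK): split text into chunks of at
# most `limit` characters, preferring to break at whitespace.
def split_for_speech(text, limit: MAX_INPUT_CHARS)
  chunks = []
  rest = text.strip
  until rest.empty?
    if rest.length <= limit
      chunks << rest
      break
    end
    # Look for the last whitespace inside the first `limit` characters.
    cut = rest[0, limit].rindex(/\s/)
    # Fall back to a hard cut when there is no usable whitespace.
    cut = limit if cut.nil? || cut.zero?
    chunks << rest[0, cut].rstrip
    rest = rest[cut..].lstrip
  end
  chunks
end
```

Each returned chunk can then be passed as input to its own create call, and the resulting audio segments joined by the caller.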

Returns

StringIO

Example

require "openai"

openai = OpenAI::Client.new(api_key: "My API Key")

speech = openai.audio.speech.create(input: "input", model: :"tts-1", voice: :alloy)

puts(speech)
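
Assuming the return value is the StringIO documented above, puts prints the object's default string representation, not playable audio. A minimal sketch of writing the audio to disk instead follows; save_speech is a hypothetical helper, not part of the SDK, and the file name and environment variable in the usage comment are placeholders.

```ruby
# Hypothetical helper (not part of the SDK): request speech and write the
# returned bytes to `path`. `client` is expected to respond to
# `audio.speech.create`, as in the example above.
def save_speech(client, path, **params)
  audio = client.audio.speech.create(**params)
  # The documented return type is StringIO; read it and write in binary
  # mode so the audio bytes are not altered.
  File.binwrite(path, audio.read)
  path
end

# Usage sketch (requires the openai gem and a valid API key):
#   client = OpenAI::Client.new(api_key: ENV.fetch("OPENAI_API_KEY"))
#   save_speech(client, "hello.mp3",
#               input: "Hello!", model: :"gpt-4o-mini-tts",
#               voice: :alloy, response_format: :mp3)
```

Passing the client in as an argument keeps the helper independent of how the client is configured and makes it easy to exercise with a stand-in object.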

Domain Types

Speech Model

SpeechModel = :"tts-1" | :"tts-1-hd" | :"gpt-4o-mini-tts" | :"gpt-4o-mini-tts-2025-12-15"
- :"tts-1"
- :"tts-1-hd"
- :"gpt-4o-mini-tts"
- :"gpt-4o-mini-tts-2025-12-15"
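
The parameter descriptions note that instructions and the sse stream format are not available with tts-1 or tts-1-hd. As a rough sketch, a caller could check for this before sending a request. LEGACY_TTS_MODELS and unsupported_options are hypothetical names, not part of the SDK, and how the API itself responds to such requests is not covered here.

```ruby
# Models that, per the parameter descriptions, do not support
# `instructions` or the `sse` stream format.
LEGACY_TTS_MODELS = %w[tts-1 tts-1-hd].freeze

# Hypothetical helper (not part of the SDK): return the names of the
# options in a request that the chosen model is documented not to support.
# Other keyword arguments (input, voice, ...) are accepted and ignored.
def unsupported_options(model:, instructions: nil, stream_format: nil, **)
  return [] unless LEGACY_TTS_MODELS.include?(model.to_s)

  unsupported = []
  unsupported << :instructions unless instructions.nil?
  unsupported << :stream_format if stream_format.to_s == "sse"
  unsupported
end
```

An empty array means no documented conflict was found for the given model.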

ruby/resources/audio/subresources/speech/index.md +131 −0 created

# Speech

## Create speech

`audio.speech.create(**kwargs) -> StringIO`

**post** `/audio/speech`

Generates audio from the input text.

Returns the audio file content, or a stream of audio events.

### Parameters

- `input: String`

  The text to generate audio for. The maximum length is 4096 characters.

- `model: String | SpeechModel`

  One of the available [TTS models](https://platform.openai.com/docs/models#tts): `tts-1`, `tts-1-hd`, `gpt-4o-mini-tts`, or `gpt-4o-mini-tts-2025-12-15`.

  - `String = String`

  - `SpeechModel = :"tts-1" | :"tts-1-hd" | :"gpt-4o-mini-tts" | :"gpt-4o-mini-tts-2025-12-15"`

    - `:"tts-1"`

    - `:"tts-1-hd"`

    - `:"gpt-4o-mini-tts"`

    - `:"gpt-4o-mini-tts-2025-12-15"`

- `voice: String | :alloy | :ash | :ballad | 7 more | ID{ id}`

  The voice to use when generating the audio. Supported built-in voices are `alloy`, `ash`, `ballad`, `coral`, `echo`, `fable`, `onyx`, `nova`, `sage`, `shimmer`, `verse`, `marin`, and `cedar`. You may also provide a custom voice object with an `id`, for example `{ "id": "voice_1234" }`. Previews of the voices are available in the [Text to speech guide](https://platform.openai.com/docs/guides/text-to-speech#voice-options).

  - `String = String`

  - `Voice = :alloy | :ash | :ballad | 7 more`

    - `:alloy`

    - `:ash`

    - `:ballad`

    - `:coral`

    - `:echo`

    - `:sage`

    - `:shimmer`

    - `:verse`

    - `:marin`

    - `:cedar`

  - `class ID`

    Custom voice reference.

    - `id: String`

      The custom voice ID, e.g. `voice_1234`.

- `instructions: String`

  Control the voice of your generated audio with additional instructions. Does not work with `tts-1` or `tts-1-hd`.

- `response_format: :mp3 | :opus | :aac | 3 more`

  The format to return the audio in. Supported formats are `mp3`, `opus`, `aac`, `flac`, `wav`, and `pcm`.

  - `:mp3`

  - `:opus`

  - `:aac`

  - `:flac`

  - `:wav`

  - `:pcm`

- `speed: Float`

  The speed of the generated audio. Select a value from `0.25` to `4.0`. `1.0` is the default.

- `stream_format: :sse | :audio`

  The format to stream the audio in. Supported formats are `sse` and `audio`. `sse` is not supported for `tts-1` or `tts-1-hd`.

  - `:sse`

  - `:audio`

### Returns

- `StringIO`

### Example

```ruby
require "openai"

openai = OpenAI::Client.new(api_key: "My API Key")

speech = openai.audio.speech.create(input: "input", model: :"tts-1", voice: :alloy)

puts(speech)
```

## Domain Types

### Speech Model

- `SpeechModel = :"tts-1" | :"tts-1-hd" | :"gpt-4o-mini-tts" | :"gpt-4o-mini-tts-2025-12-15"`

  - `:"tts-1"`

  - `:"tts-1-hd"`

  - `:"gpt-4o-mini-tts"`

  - `:"gpt-4o-mini-tts-2025-12-15"`