## Create completion

`completions.create(**kwargs: CompletionCreateParams) -> Completion`

**post** `/completions`

Creates a completion for the provided prompt and parameters.

Returns a completion object, or a sequence of completion objects if the request is streamed.

### Parameters

- `model: Union[str, Literal["gpt-3.5-turbo-instruct", "davinci-002", "babbage-002"]]`

  ID of the model to use. You can use the [List models](https://platform.openai.com/docs/api-reference/models/list) API to see all of your available models, or see our [Model overview](https://platform.openai.com/docs/models) for descriptions of them.

  - `str`

  - `Literal["gpt-3.5-turbo-instruct", "davinci-002", "babbage-002"]`

    - `"gpt-3.5-turbo-instruct"`

    - `"davinci-002"`

    - `"babbage-002"`

- `prompt: Union[str, Sequence[str], Iterable[int], 2 more]`

  The prompt(s) to generate completions for, encoded as a string, array of strings, array of tokens, or array of token arrays.

  Note that `<|endoftext|>` is the document separator that the model sees during training, so if a prompt is not specified the model will generate as if from the beginning of a new document.

  - `str`

  - `Sequence[str]`

  - `Iterable[int]`

  - `Iterable[Iterable[int]]`

- `best_of: Optional[int]`

  Generates `best_of` completions server-side and returns the "best" (the one with the highest log probability per token). Results cannot be streamed.

  When used with `n`, `best_of` controls the number of candidate completions and `n` specifies how many to return; `best_of` must be greater than `n`.

  **Note:** Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.

- `echo: Optional[bool]`

  Echo back the prompt in addition to the completion.

- `frequency_penalty: Optional[float]`

  Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

  [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/text-generation)

- `logit_bias: Optional[Dict[str, int]]`

  Modify the likelihood of specified tokens appearing in the completion.

  Accepts a JSON object that maps tokens (specified by their token ID in the GPT tokenizer) to an associated bias value from -100 to 100. You can use this [tokenizer tool](/tokenizer?view=bpe) to convert text to token IDs. Mathematically, the bias is added to the logits generated by the model prior to sampling. The exact effect will vary per model, but values between -1 and 1 should decrease or increase likelihood of selection; values like -100 or 100 should result in a ban or exclusive selection of the relevant token.

  As an example, you can pass `{"50256": -100}` to prevent the `<|endoftext|>` token from being generated.

- `logprobs: Optional[int]`

  Include the log probabilities on the `logprobs` most likely output tokens, as well as the chosen tokens. For example, if `logprobs` is 5, the API will return a list of the 5 most likely tokens. The API will always return the `logprob` of the sampled token, so there may be up to `logprobs+1` elements in the response.

  The maximum value for `logprobs` is 5.

- `max_tokens: Optional[int]`

  The maximum number of [tokens](/tokenizer) that can be generated in the completion.

  The token count of your prompt plus `max_tokens` cannot exceed the model's context length. [Example Python code](https://cookbook.openai.com/examples/how_to_count_tokens_with_tiktoken) for counting tokens.

- `n: Optional[int]`

  How many completions to generate for each prompt.

  **Note:** Because this parameter generates many completions, it can quickly consume your token quota. Use carefully and ensure that you have reasonable settings for `max_tokens` and `stop`.

- `presence_penalty: Optional[float]`

  Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

  [See more information about frequency and presence penalties.](https://platform.openai.com/docs/guides/text-generation)

- `seed: Optional[int]`

  If specified, our system will make a best effort to sample deterministically, such that repeated requests with the same `seed` and parameters should return the same result.

  Determinism is not guaranteed, and you should refer to the `system_fingerprint` response parameter to monitor changes in the backend.

- `stop: Optional[Union[Optional[str], Sequence[str], None]]`

  Not supported with latest reasoning models `o3` and `o4-mini`.

  Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence.

  - `Optional[str]`

  - `Sequence[str]`

- `stream: Optional[Literal[False]]`

  Whether to stream back partial progress. If set, tokens will be sent as data-only [server-sent events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events#Event_stream_format) as they become available, with the stream terminated by a `data: [DONE]` message. [Example Python code](https://cookbook.openai.com/examples/how_to_stream_completions).

  - `False`

- `stream_options: Optional[ChatCompletionStreamOptionsParam]`

  Options for streaming response. Only set this when you set `stream=True`.

  - `include_obfuscation: Optional[bool]`

    When true, stream obfuscation will be enabled. Stream obfuscation adds random characters to an `obfuscation` field on streaming delta events to normalize payload sizes as a mitigation to certain side-channel attacks. These obfuscation fields are included by default, but add a small amount of overhead to the data stream. You can set `include_obfuscation` to `False` to optimize for bandwidth if you trust the network links between your application and the OpenAI API.

  - `include_usage: Optional[bool]`

    If set, an additional chunk will be streamed before the `data: [DONE]` message. The `usage` field on this chunk shows the token usage statistics for the entire request, and the `choices` field will always be an empty array.

    All other chunks will also include a `usage` field, but with a null value. **NOTE:** If the stream is interrupted, you may not receive the final usage chunk which contains the total token usage for the request.

- `suffix: Optional[str]`

  The suffix that comes after a completion of inserted text.

  This parameter is only supported for `gpt-3.5-turbo-instruct`.

- `temperature: Optional[float]`

  What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

  We generally recommend altering this or `top_p` but not both.

- `top_p: Optional[float]`

  An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with `top_p` probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

  We generally recommend altering this or `temperature` but not both.

- `user: Optional[str]`

  A unique identifier representing your end-user, which can help OpenAI to monitor and detect abuse. [Learn more](https://platform.openai.com/docs/guides/safety-best-practices#end-user-ids).

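
The descriptions of `logit_bias`, `temperature`, and `top_p` above can be pictured with a small offline sketch. It is a toy model, not the service's implementation: the helper name `next_token_distribution`, the toy logits, the order in which the three steps are applied, and the treatment of `temperature=0` as a greedy pick are all assumptions made for illustration.

```python
import math


def next_token_distribution(logits, logit_bias=None, temperature=1.0, top_p=1.0):
    """Return {token_id: probability} after bias, temperature, and top_p.

    Hypothetical helper for illustration; `logits` maps token IDs to raw scores.
    """
    logit_bias = logit_bias or {}
    # logit_bias is keyed by token ID as a string, e.g. {"50256": -100},
    # and is added to the logits before sampling.
    biased = {tok: value + logit_bias.get(str(tok), 0) for tok, value in logits.items()}

    # Assumption: temperature 0 is modeled as always picking the top token.
    if temperature == 0:
        best = max(biased, key=biased.get)
        return {best: 1.0}

    # Dividing by the temperature sharpens (<1) or flattens (>1) the softmax.
    scaled = {tok: value / temperature for tok, value in biased.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Nucleus filtering: keep the most likely tokens until their cumulative
    # probability reaches top_p, then renormalize what is left.
    kept, mass = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda item: item[1], reverse=True):
        kept[tok] = p
        mass += p
        if mass >= top_p:
            break
    return {tok: p / mass for tok, p in kept.items()}


toy_logits = {10: 2.0, 11: 1.0, 50256: 0.0}  # made-up scores for three token IDs
print(next_token_distribution(toy_logits, logit_bias={"50256": -100}))
print(next_token_distribution(toy_logits, top_p=0.5))
```

In this toy setup a bias of -100 drives the probability of token 50256 to effectively zero, and `top_p=0.5` leaves only the single most likely token, which is consistent with the "ban" and "top probability mass" wording above.
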
### Returns

- `class Completion: …`

  Represents a completion response from the API. Note: both the streamed and non-streamed response objects share the same shape (unlike the chat endpoint).

  - `id: str`

    A unique identifier for the completion.

  - `choices: List[CompletionChoice]`

    The list of completion choices the model generated for the input prompt.

    - `finish_reason: Literal["stop", "length", "content_filter"]`

      The reason the model stopped generating tokens. This will be `stop` if the model hit a natural stop point or a provided stop sequence, `length` if the maximum number of tokens specified in the request was reached, or `content_filter` if content was omitted due to a flag from our content filters.

      - `"stop"`

      - `"length"`

      - `"content_filter"`

    - `index: int`

    - `logprobs: Optional[Logprobs]`

      - `text_offset: Optional[List[int]]`

      - `token_logprobs: Optional[List[float]]`

      - `tokens: Optional[List[str]]`

      - `top_logprobs: Optional[List[Dict[str, float]]]`

    - `text: str`

  - `created: int`

    The Unix timestamp (in seconds) of when the completion was created.

  - `model: str`

    The model used for completion.

  - `object: Literal["text_completion"]`

    The object type, which is always `"text_completion"`.

    - `"text_completion"`

  - `system_fingerprint: Optional[str]`

    This fingerprint represents the backend configuration that the model runs with.

    Can be used in conjunction with the `seed` request parameter to understand when backend changes have been made that might impact determinism.

  - `usage: Optional[CompletionUsage]`

    Usage statistics for the completion request.

    - `completion_tokens: int`

      Number of tokens in the generated completion.

    - `prompt_tokens: int`

      Number of tokens in the prompt.

    - `total_tokens: int`

      Total number of tokens used in the request (prompt + completion).

    - `completion_tokens_details: Optional[CompletionTokensDetails]`

      Breakdown of tokens used in a completion.

      - `accepted_prediction_tokens: Optional[int]`

        When using Predicted Outputs, the number of tokens in the prediction that appeared in the completion.

      - `audio_tokens: Optional[int]`

        Audio input tokens generated by the model.

      - `reasoning_tokens: Optional[int]`

        Tokens generated by the model for reasoning.

      - `rejected_prediction_tokens: Optional[int]`

        When using Predicted Outputs, the number of tokens in the prediction that did not appear in the completion. However, like reasoning tokens, these tokens are still counted in the total completion tokens for purposes of billing, output, and context window limits.

    - `prompt_tokens_details: Optional[PromptTokensDetails]`

      Breakdown of tokens used in the prompt.

      - `audio_tokens: Optional[int]`

        Audio input tokens present in the prompt.

      - `cached_tokens: Optional[int]`

        Cached tokens present in the prompt.

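
The `logprobs` fields listed above are parallel lists, so `tokens[i]` and `token_logprobs[i]` describe the same token. The sketch below pairs them up and computes a total and a per-token mean, a score of the same kind that `best_of` is described as ranking by. It works on a plain dict shaped like one entry of `choices` so it can run offline; the helper name `summarize_logprobs` and the sample values are made up for illustration.

```python
def summarize_logprobs(choice):
    """Summarize the log probabilities of one choice, or return None if absent.

    Hypothetical helper; `choice` is a dict shaped like an entry of `choices`.
    """
    logprobs = choice.get("logprobs") or {}
    tokens = logprobs.get("tokens") or []
    token_logprobs = logprobs.get("token_logprobs") or []

    # Pair each token with its log probability; defensively skip missing values.
    scored = [(tok, lp) for tok, lp in zip(tokens, token_logprobs) if lp is not None]
    if not scored:
        return None

    total = sum(lp for _, lp in scored)
    return {
        "total_logprob": total,
        "mean_logprob": total / len(scored),
        "least_likely_token": min(scored, key=lambda pair: pair[1])[0],
    }


# Made-up values, only to show the shape of the data.
sample_choice = {
    "text": "This is a",
    "index": 0,
    "finish_reason": "length",
    "logprobs": {
        "tokens": ["This", " is", " a"],
        "token_logprobs": [-0.5, -0.25, -1.25],
        "text_offset": [0, 4, 7],
        "top_logprobs": [{"This": -0.5}, {" is": -0.25}, {" a": -1.25}],
    },
}
print(summarize_logprobs(sample_choice))
```

When the request does not set `logprobs`, the field is null in the response and the helper returns `None`.
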
### Example

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),  # This is the default and can be omitted
)
# Without stream=True the call returns a single Completion object.
completion = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt="This is a test.",
)
print(completion)
```

#### Response

```json
{
  "id": "id",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": {
        "text_offset": [
          0
        ],
        "token_logprobs": [
          0
        ],
        "tokens": [
          "string"
        ],
        "top_logprobs": [
          {
            "foo": 0
          }
        ]
      },
      "text": "text"
    }
  ],
  "created": 0,
  "model": "model",
  "object": "text_completion",
  "system_fingerprint": "system_fingerprint",
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0,
    "completion_tokens_details": {
      "accepted_prediction_tokens": 0,
      "audio_tokens": 0,
      "reasoning_tokens": 0,
      "rejected_prediction_tokens": 0
    },
    "prompt_tokens_details": {
      "audio_tokens": 0,
      "cached_tokens": 0
    }
  }
}
```

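
Several of the limits stated under Parameters can be checked before a request is sent. The following is a minimal client-side sketch of such a check, not something the SDK provides: the function name `check_completion_params` and the `prompt_tokens` and `context_length` arguments are hypothetical, and the server remains the authority on what it accepts.

```python
def check_completion_params(params, prompt_tokens=None, context_length=None):
    """Return a list of problems found in a dict of `completions.create` kwargs.

    Hypothetical helper; it only mirrors limits stated in the parameter list.
    """
    problems = []

    def out_of_range(name, low, high):
        value = params.get(name)
        if value is not None and not low <= value <= high:
            problems.append(f"{name} must be between {low} and {high}")

    out_of_range("frequency_penalty", -2.0, 2.0)
    out_of_range("presence_penalty", -2.0, 2.0)
    out_of_range("temperature", 0, 2)

    n, best_of = params.get("n"), params.get("best_of")
    if best_of is not None:
        if n is not None and best_of <= n:
            problems.append("best_of must be greater than n")
        if params.get("stream"):
            problems.append("best_of results cannot be streamed")

    logprobs = params.get("logprobs")
    if logprobs is not None and logprobs > 5:
        problems.append("logprobs cannot exceed 5")

    stop = params.get("stop")
    if isinstance(stop, (list, tuple)) and len(stop) > 4:
        problems.append("stop accepts at most 4 sequences")

    for token_id, bias in (params.get("logit_bias") or {}).items():
        if not -100 <= bias <= 100:
            problems.append(f"logit_bias for token {token_id} must be between -100 and 100")

    if params.get("stream_options") is not None and not params.get("stream"):
        problems.append("stream_options requires stream=True")

    # The prompt's token count plus max_tokens must fit in the context length.
    max_tokens = params.get("max_tokens")
    if None not in (max_tokens, prompt_tokens, context_length):
        if prompt_tokens + max_tokens > context_length:
            problems.append("prompt tokens plus max_tokens exceed the context length")

    return problems


print(check_completion_params({"n": 2, "best_of": 2, "logprobs": 6}))
```

An empty list means none of these particular checks failed; it does not guarantee that the request will be accepted.
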
### No streaming

```python
from openai import OpenAI
client = OpenAI()

completion = client.completions.create(
    model="VAR_completion_model_id",
    prompt="Say this is a test",
    max_tokens=7,
    temperature=0
)
print(completion.choices[0].text)
```

#### Response

```json
{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1589478378,
  "model": "VAR_completion_model_id",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "text": "\n\nThis is indeed a test",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 7,
    "total_tokens": 12
  }
}
```

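
In the response above, `finish_reason` is `"length"` because the request capped output at `max_tokens=7`, so the text may be cut off mid-thought. A small sketch of how a caller might surface that, working on the raw JSON rather than the SDK's typed `Completion` object so it runs offline; the helper name `summarize_response` is hypothetical.

```python
import json

# The non-streaming sample response, as raw JSON text.
SAMPLE_RESPONSE = r"""
{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1589478378,
  "model": "VAR_completion_model_id",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "text": "\n\nThis is indeed a test",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12}
}
"""


def summarize_response(response):
    """Collect the texts, the indexes of truncated choices, and the token total."""
    usage = response.get("usage") or {}
    return {
        "texts": [choice["text"] for choice in response["choices"]],
        # "length" means generation stopped at the max_tokens limit.
        "truncated": [
            choice["index"]
            for choice in response["choices"]
            if choice["finish_reason"] == "length"
        ],
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_response(json.loads(SAMPLE_RESPONSE))
print(summary)
```

With the SDK, the same fields are available as attributes, for example `completion.choices[0].finish_reason`.
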
### Streaming

```python
from openai import OpenAI
client = OpenAI()

for chunk in client.completions.create(
    model="VAR_completion_model_id",
    prompt="Say this is a test",
    max_tokens=7,
    temperature=0,
    stream=True
):
    print(chunk.choices[0].text)
```

#### Response

```json
{
  "id": "cmpl-7iA7iJjj8V2zOkCGvWF2hAkDWBQZe",
  "object": "text_completion",
  "created": 1690759702,
  "choices": [
    {
      "text": "This",
      "index": 0,
      "logprobs": null,
      "finish_reason": null
    }
  ],
  "model": "gpt-3.5-turbo-instruct",
  "system_fingerprint": "fp_44709d6fcb"
}
```
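
Each streamed chunk carries only a fragment of text, so callers usually stitch the fragments back together per choice `index`. The sketch below does that over plain dicts shaped like the chunk above, and also picks up the final usage chunk that `include_usage` adds (the one with an empty `choices` array). The helper name `assemble_stream` and the sample chunks are made up for illustration; with the SDK the chunks are objects with attribute access instead.

```python
def assemble_stream(chunks):
    """Rebuild per-choice text from streamed chunks.

    Returns (texts, finish_reasons, usage); usage stays None if no chunk had one.
    """
    texts, finish_reasons, usage = {}, {}, None
    for chunk in chunks:
        # Only the final chunk has a non-null usage, and only with include_usage.
        if chunk.get("usage") is not None:
            usage = chunk["usage"]
        # The usage chunk has an empty choices list, so this loop is skipped for it.
        for choice in chunk.get("choices", []):
            index = choice["index"]
            texts[index] = texts.get(index, "") + choice["text"]
            if choice.get("finish_reason") is not None:
                finish_reasons[index] = choice["finish_reason"]
    return texts, finish_reasons, usage


# Made-up chunks following the documented shape.
sample_chunks = [
    {"choices": [{"text": "This", "index": 0, "logprobs": None, "finish_reason": None}], "usage": None},
    {"choices": [{"text": " is", "index": 0, "logprobs": None, "finish_reason": None}], "usage": None},
    {"choices": [{"text": " a test", "index": 0, "logprobs": None, "finish_reason": "length"}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12}},
]
texts, finish_reasons, usage = assemble_stream(sample_chunks)
print(texts[0], finish_reasons[0], usage)
```

If the stream is interrupted before the last chunk arrives, `usage` remains `None`, which matches the note under `include_usage`.
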