
The [input token count endpoint](https://developers.openai.com/api/reference/python/resources/responses/subresources/input_tokens/methods/count) accepts the same input format as the [Responses API](https://developers.openai.com/api/docs/api-reference/responses/create). Pass text, messages, images, files, tools, or conversations, and the API returns the exact count the model will receive.

The count includes formatting tokens used to represent request structure, such as message roles and boundaries. These tokens might not appear in the text or fields you tokenize locally.

## Why use the token counting API?

Local tokenizers like [tiktoken](https://github.com/openai/tiktoken) work for plain text, but they have limitations:

[File inputs](https://developers.openai.com/api/docs/guides/pdf-files), currently PDFs, are supported. Pass `file_id`, `file_url`, or `file_data` as you would for `responses.create`. The token count reflects the model's full processed input.

## Understand output token counts

Reported output token usage includes all tokens generated by the model, not only the text visible in a response. The Responses API reports this total as `output_tokens`, while the Chat Completions API reports it as `completion_tokens`.

Some models, including GPT-5 models, generate tokens used to format or delimit response channels, tool calls, and other message structure. These formatting tokens don't appear in message content or `logprobs`, and they aren't necessarily itemized separately in usage. As a result, the reported output or completion token count can be higher than the number of visible tokens or tokens included in `logprobs`, even when the reported `reasoning_tokens` value is `0`.

The `max_output_tokens` and `max_completion_tokens` parameters limit all tokens generated by the model, including non-visible tokens. The number of non-visible tokens varies by model and response shape, so don't assume a fixed difference between reported usage and visible output. Leave headroom in these limits when you need a specific amount of visible output.
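
As a rough illustration of the headroom advice, the sketch below estimates how many reported output tokens are neither visible nor itemized as reasoning, and pads a visible-output target before it is used as a limit. The helper names (`unitemized_tokens`, `limit_with_headroom`), the 25% margin, and the 64-token floor are assumptions made for this example, not part of the API. Measure the actual overhead for your model and response shape.

```python
import math


def unitemized_tokens(output_tokens: int, visible_tokens: int,
                      reasoning_tokens: int = 0) -> int:
    """Estimate tokens counted in usage that are neither visible nor reasoning.

    output_tokens: reported usage (`output_tokens` or `completion_tokens`).
    visible_tokens: tokens you counted yourself, for example from `logprobs`.
    reasoning_tokens: the reported `reasoning_tokens` value, if any.
    """
    if min(output_tokens, visible_tokens, reasoning_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    # Clamp at zero in case the local visible count overestimates.
    return max(output_tokens - reasoning_tokens - visible_tokens, 0)


def limit_with_headroom(visible_target: int, margin: float = 0.25,
                        floor: int = 64) -> int:
    """Pad a visible-output target before using it as a token limit.

    `margin` and `floor` are hypothetical defaults, not documented values.
    """
    if visible_target <= 0:
        raise ValueError("visible_target must be positive")
    padding = max(math.ceil(visible_target * margin), floor)
    return visible_target + padding


# Made-up numbers: usage reports 120 output tokens and 0 reasoning tokens,
# while 112 tokens are visible in the message content.
print(unitemized_tokens(120, 112))  # 8
print(limit_with_headroom(500))     # 625
```

The padded value could then be passed as `max_output_tokens` (Responses API) or `max_completion_tokens` (Chat Completions API). Because the gap between reported and visible tokens is not fixed, treat the margin as a starting point and adjust it against observed usage.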

## API reference

For full parameters and response shape, see the [Count input tokens API reference](https://developers.openai.com/api/reference/python/resources/responses/subresources/input_tokens/methods/count). The endpoint is: