Chat Completion

POST /v1/chat/completions

Send a chat completion request to a selected model. This endpoint is OpenAI-compatible and supports streaming.

Request & Response Example

curl https://api.a4f.co/v1/chat/completions \
  -H "Authorization: Bearer YOUR_A4F_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "provider-1/chatgpt-4o-latest",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 50,
    "stream": false
  }'
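The same request can be made from Python. This is a minimal sketch using only the standard library; the payload mirrors the curl example above, and `YOUR_A4F_API_KEY` is a placeholder for your real key:

```python
import json
import urllib.request

API_KEY = "YOUR_A4F_API_KEY"  # placeholder; replace with your A4F API key

# Payload mirroring the curl example above
payload = {
    "model": "provider-1/chatgpt-4o-latest",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
    "temperature": 0.7,
    "max_tokens": 50,
    "stream": False,
}

def chat_completion(payload: dict) -> dict:
    """POST the payload to the endpoint and return the parsed JSON body."""
    req = urllib.request.Request(
        "https://api.a4f.co/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `chat_completion(payload)` returns the parsed response object described under Response Body below.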

Headers

Authorization

string
Required
Bearer token for authentication, using your A4F API key. Example: Bearer ddc-a4f-xxxxxxxx. See Authentication.

Content-Type

string
Required
The content type of the request body.

Default: application/json

Request Body

This endpoint expects a JSON object in the request body with the following fields:

model

string
Required
ID of the model to use. Must include provider prefix, e.g., provider-1/chatgpt-4o-latest. See Models and Provider Routing.

messages

array of objects
Required
A list of messages comprising the conversation so far.

role

string
Required
The role of the message author. One of 'system', 'user', 'assistant', or 'tool'.

content

string | array
Required
The content of the message. Can be a string or an array of content parts (for multimodal).
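For multimodal requests, content can be an array of typed parts instead of a plain string. A sketch following the OpenAI content-part convention this endpoint is compatible with (the image URL is a placeholder):

```python
# One user message mixing a text part and an image part. The part shapes
# follow the OpenAI-compatible convention; the URL is a placeholder.
multimodal_message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What is in this picture?"},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/photo.jpg"},
        },
    ],
}
```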

name

string
Optional. The name of the author of this message if role is 'user' or 'assistant'. Can be used to identify speakers.

tool_calls

array of objects
Optional. The tool calls generated by the model, if any.

tool_call_id

string
Required if role is 'tool'. The ID of the tool call being responded to.
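Taken together, a tool-use round trip might look like the messages array below. This is an illustrative sketch following the OpenAI message conventions the endpoint is compatible with; the tool call ID and the get_weather function are invented for the example:

```python
# Hypothetical tool-use conversation: the assistant emits a tool call,
# and the following 'tool' message answers it via tool_call_id.
messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,  # content may be null when tool_calls are present
        "tool_calls": [
            {
                "id": "call_abc123",        # invented example ID
                "type": "function",
                "function": {
                    "name": "get_weather",  # invented example function
                    "arguments": '{"city": "Paris"}',
                },
            }
        ],
    },
    {
        "role": "tool",
        "tool_call_id": "call_abc123",  # must match the id above
        "content": '{"temp_c": 18, "sky": "clear"}',
    },
]
```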

stream

boolean
If true, partial message deltas are sent as server-sent events as tokens are generated. See Streaming.

Default: false
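When stream is true, the response arrives as server-sent events: each event is a data: line carrying a JSON chunk, and the stream ends with data: [DONE], following the OpenAI streaming convention. A minimal parser sketch, with the chunk shapes assumed to match that convention (the sample lines are illustrative, not captured output):

```python
import json

def accumulate_stream(lines):
    """Join content deltas from SSE 'data:' lines into the full reply."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # sentinel marking the end of the stream
            break
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if delta.get("content"):
            parts.append(delta["content"])
    return "".join(parts)

# Example SSE lines as they might appear on the wire (illustrative)
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "The capital "}}]}',
    'data: {"choices": [{"delta": {"content": "is Paris."}}]}',
    'data: [DONE]',
]
print(accumulate_stream(sample))  # -> The capital is Paris.
```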

temperature

number
Sampling temperature, between 0 and 2. Higher values produce more random output; lower values are more deterministic.

Default: 1

max_tokens

integer
Maximum number of tokens to generate. Defaults to the model's maximum if not set.

For a full list of all supported OpenAI-compatible parameters (like top_p, n, stop, tools, etc.), please refer to the Parameters documentation.

Response Body (200 OK)

A successful request returns a JSON object with the following structure:

id

string
A unique identifier for the chat completion.

object

string
The object type, which is always 'chat.completion'.

created

integer
The Unix timestamp (in seconds) of when the chat completion was created.

model

string
The model ID used for the chat completion (e.g., 'provider-1/chatgpt-4o-latest').

choices

array of objects
A list of chat completion choices.

index

integer
The index of the choice in the list.

message

object
A chat completion message object.

role

string
The role of the author of this message (usually 'assistant').

content

string | null
The content of the message.

tool_calls

array of objects | null
The tool calls generated by the model, if any.

finish_reason

string
The reason the model stopped generating tokens (e.g., 'stop', 'length', 'tool_calls').

usage

object
Usage statistics for the completion request.

prompt_tokens

integer
Number of tokens in the prompt.

completion_tokens

integer
Number of tokens in the generated completion.

total_tokens

integer
Total number of tokens used in the request (prompt + completion).

system_fingerprint

string | null
This fingerprint represents the backend configuration that the model runs with. Can be used to track changes in the backend.
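Putting the fields together, extracting the reply and checking finish_reason from a parsed 200 OK body might look like this. The response dict is an invented sample with the structure documented above, not actual API output:

```python
# Illustrative sample of a parsed 200 OK body (all values invented)
response = {
    "id": "chatcmpl-example123",
    "object": "chat.completion",
    "created": 1700000000,
    "model": "provider-1/chatgpt-4o-latest",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "The capital of France is Paris.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 24, "completion_tokens": 8, "total_tokens": 32},
    "system_fingerprint": None,
}

choice = response["choices"][0]
if choice["finish_reason"] == "length":
    # Reply was truncated by max_tokens; consider raising the limit.
    pass

reply = choice["message"]["content"]
spent = response["usage"]["total_tokens"]
print(f"{reply} ({spent} tokens)")  # -> The capital of France is Paris. (32 tokens)
```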
