Audio Speech (TTS)

POST /v1/audio/speech

Generate audio from input text using a variety of Text-to-Speech (TTS) models.

Request Example

curl https://api.a4f.co/v1/audio/speech \
  -H "Authorization: Bearer YOUR_A4F_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "provider-3/tts-1",
    "input": "Hello, world! This is a test of the A4F text-to-speech API.",
    "voice": "alloy"
  }' \
  --output speech.mp3
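
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names `build_speech_request` and `synthesize_to_file` are hypothetical, and the URL, headers, and body fields are copied from the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.a4f.co/v1/audio/speech"


def build_speech_request(api_key: str, text: str,
                         model: str = "provider-3/tts-1",
                         voice: str = "alloy") -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    body = json.dumps({"model": model, "input": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def synthesize_to_file(api_key: str, text: str, path: str = "speech.mp3") -> None:
    """Send the request and write the raw audio bytes to `path`."""
    request = build_speech_request(api_key, text)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
```

Calling `synthesize_to_file("YOUR_A4F_API_KEY", "Hello, world!")` would then be roughly equivalent to the curl command with `--output speech.mp3`.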

Headers

Authorization

string

Required

Bearer token for authentication.

Content-Type

string

Required

The content type of the request body.

Default: application/json

Request Body

model

string

Required

ID of the TTS model to use. Must be of type audio/speech.

input

string

Required

The text to synthesize into speech. The maximum length (in characters or tokens) is determined by your subscription plan. Requests that exceed the limit return an error.

voice

string

Required

The voice to use for the audio generation. Available voices are specific to the chosen model and are listed on the Models page.

response_format

string

The format of the audio output. Supported formats: mp3, opus, aac, flac, wav, pcm.

Default: mp3

speed

number

The speed of the generated audio. Must be between 0.25 and 4.0.

Default: 1

instructions

string

Optional provider-specific instructions on how the speech should be delivered.
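
The constraints listed above can be checked on the client before a request is sent. The sketch below is one possible way to do that: the function `build_speech_payload` is a hypothetical helper, and the server remains the authority on which values are accepted (for example, which voices a given model supports is not checked here).

```python
from typing import Optional

# Formats listed under response_format above.
SUPPORTED_FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}


def build_speech_payload(model: str, text: str, voice: str,
                         response_format: str = "mp3",
                         speed: float = 1.0,
                         instructions: Optional[str] = None) -> dict:
    """Return a request body dict, raising ValueError on documented violations."""
    # model, input, and voice are the three required fields.
    if not model or not text or not voice:
        raise ValueError("model, input, and voice are required")
    if response_format not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")

    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": response_format,
        "speed": speed,
    }
    # instructions is optional, so it is only sent when provided.
    if instructions is not None:
        payload["instructions"] = instructions
    return payload
```

Validating locally avoids a round trip for obvious mistakes such as `speed=5`, but it does not replace handling errors returned by the API.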

Response Body (200 OK)

Raw Audio File

A successful request returns the raw audio bytes directly in the response body, not a JSON object. The response's `Content-Type` header matches the `response_format` you requested (e.g., `audio/mpeg` for mp3).
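
Because the body is binary audio rather than JSON, client code should write it to disk as bytes instead of parsing it. The following sketch shows one way to do that. The helper `save_audio_response` is hypothetical, and treating a JSON content type as a sign of a non-audio payload is an assumption, not documented behavior.

```python
from pathlib import Path


def save_audio_response(body: bytes, content_type: str, stem: str,
                        response_format: str = "mp3") -> Path:
    """Write raw audio bytes to '<stem>.<response_format>' and return the path."""
    # Strip parameters such as '; charset=utf-8' before comparing.
    media_type = content_type.split(";")[0].strip().lower()
    # Assumption: a JSON body means the response is not audio.
    if media_type == "application/json":
        raise ValueError("expected raw audio, got a JSON response")
    if not body:
        raise ValueError("response body is empty")

    path = Path(f"{stem}.{response_format}")
    path.write_bytes(body)  # binary write; never decode audio as text
    return path
```

With a `urllib` response object, the inputs would typically come from `response.read()` and `response.headers.get("Content-Type", "")`.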
