Images & PDFs

How to send images and PDFs to A4F API Services.

A4F API Services primarily focuses on providing affordable and unified access to a wide range of text-based Large Language Models. While some underlying providers offer multimodal capabilities (like image understanding), direct support for rich media inputs such as images and PDFs through the standard A4F API /v1/chat/completions endpoint is currently limited and depends heavily on the chosen provider and model.

Image Inputs

Requests with images to multimodal models are typically sent via the standard A4F /v1/chat/completions API, using a multi-part messages parameter as per OpenAI's specification. The image content can be either a URL or a base64-encoded image.

Note that multiple images can be sent in separate content array entries. The number of images allowed per request varies by provider and model. Because of how the content is parsed, we recommend sending the text prompt first, followed by the images. If the images must come first, we recommend placing them in the system prompt.
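As a sketch, a multi-image request body following the recommendation above (text entry first, then one entry per image) can be assembled like this; the image URLs here are placeholders:

```python
# Build a multi-part user message: the text prompt first, then one
# "image_url" entry per image, per the OpenAI-style content format.
image_urls = [
    "https://example.com/photo-1.jpg",  # placeholder URLs
    "https://example.com/photo-2.jpg",
]

content = [{"type": "text", "text": "Compare these two images."}]
content += [
    {"type": "image_url", "image_url": {"url": url}} for url in image_urls
]

messages = [{"role": "user", "content": content}]
```

The resulting `messages` list can be passed directly to `client.chat.completions.create(...)` as in the examples below; check your chosen provider's limit on images per request.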

Using Image URLs

Here's how to send an image using a URL:

import os
from openai import OpenAI

A4F_API_KEY = os.getenv("A4F_API_KEY")  # or paste your key directly
A4F_BASE_URL = "https://api.a4f.co/v1"
MODEL_ID = "provider-5/gpt-4o"

client = OpenAI(
    api_key=A4F_API_KEY,
    base_url=A4F_BASE_URL
)

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Using Base64 Encoded Images

For locally stored images, you can send them using base64 encoding. Here's how to do it:

import os
import base64
from openai import OpenAI

A4F_API_KEY = os.getenv("A4F_API_KEY")
A4F_BASE_URL = "https://api.a4f.co/v1"
MODEL_ID = "provider-5/gpt-4o"

client = OpenAI(
    api_key=A4F_API_KEY,
    base_url=A4F_BASE_URL
)

def encode_image_to_base64(image_path: str):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

image_path = "path/to/your/image.jpg"

try:
    base64_image = encode_image_to_base64(image_path)
    data_url = f"data:image/jpeg;base64,{base64_image}"
    response = client.chat.completions.create(
        model=MODEL_ID,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this locally stored image."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": data_url
                        }
                    }
                ]
            }
        ]
    )
    print(response.choices[0].message.content)
except FileNotFoundError:
    print(f"Error: Image file not found at {image_path}")
except Exception as e:
    print(f"An error occurred: {e}")

Supported image content types (ensure your base64 data URL matches):

  • image/png
  • image/jpeg
  • image/webp
  • image/gif (non-animated)
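To keep the data URL prefix consistent with the file you are encoding, the MIME type can be derived from the file extension. A minimal helper sketch, validating against the supported-type list above (the function name is our own):

```python
import mimetypes

# MIME types accepted for base64 data URLs, per the list above.
SUPPORTED_IMAGE_TYPES = {"image/png", "image/jpeg", "image/webp", "image/gif"}

def data_url_prefix(image_path: str) -> str:
    """Return the 'data:<mime>;base64,' prefix for a supported image file."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(f"Unsupported image type: {mime!r} for {image_path}")
    return f"data:{mime};base64,"
```

This avoids, for example, labeling a PNG file as `image/jpeg`, which some providers may reject.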

PDF Support

Because direct PDF input through the standard /v1/chat/completions endpoint is limited (see above), the most reliable approach is to extract the text from a PDF client-side and include it in your prompt.

Processing Extracted PDF Text

If you have already extracted text from a PDF, you can send it to A4F as part of your prompt. Here's a conceptual example:

import os
from openai import OpenAI

A4F_API_KEY = os.getenv("A4F_API_KEY")
A4F_BASE_URL = "https://api.a4f.co/v1"
MODEL_ID = "provider-1/some-text-model"

client = OpenAI(
    api_key=A4F_API_KEY,
    base_url=A4F_BASE_URL
)

pdf_text_content = """
Page 1: Introduction to A4F Services. A4F provides unified access...
Page 2: Key Features. Our platform offers affordability and uptime...
... (rest of your extracted PDF text) ...
"""

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": f"Summarize the following document content:\n\n{pdf_text_content}"
        }
    ]
)

print(response.choices[0].message.content)

Note that multiple PDF text segments (e.g., page by page, or chunked content) can be sent in separate messages or combined, depending on token limits and your prompting strategy.
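One simple chunking strategy is sketched below: split the extracted text on paragraph boundaries into fixed-size character chunks, then summarize each chunk in its own request. Character count is only a rough proxy for tokens, and the default size is an arbitrary placeholder to tune for your model's context window:

```python
def chunk_text(text: str, max_chars: int = 8000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on paragraph
    boundaries. Oversized single paragraphs are kept whole rather than split.
    max_chars is a rough character-based proxy for a token limit."""
    paragraphs = text.split("\n\n")
    chunks, current = [], ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent in its own chat-completion request, or chunk summaries can be combined in a final "summary of summaries" request.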

Response Format

The API will return a response in the following format (OpenAI compatible):

{
  "id": "gen-1234567890",
  "provider": "a4f-provider-alias",
  "model": "provider-X/actual-model-id",
  "object": "chat.completion",
  "created": 1234567890,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The document discusses..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1000,
    "completion_tokens": 100,
    "total_tokens": 1100
  }
}
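If you work with the raw JSON rather than the OpenAI SDK's response objects, the fields above are plain nested keys. A minimal sketch, using a hard-coded payload matching the format shown:

```python
import json

# Hard-coded example payload in the response format shown above.
raw = """{
  "id": "gen-1234567890",
  "provider": "a4f-provider-alias",
  "model": "provider-X/actual-model-id",
  "object": "chat.completion",
  "created": 1234567890,
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The document discusses..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 1000, "completion_tokens": 100, "total_tokens": 1100}
}"""

response = json.loads(raw)

# The assistant's reply and the token accounting are plain nested lookups.
answer = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
```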
