Images & PDFs

How to send images and PDFs to A4F API Services.

A4F API Services primarily focuses on providing affordable and unified access to a wide range of text-based Large Language Models. While some underlying providers offer multimodal capabilities (like image understanding), direct support for rich media inputs such as images and PDFs through the standard A4F API /v1/chat/completions endpoint is currently limited and depends heavily on the chosen provider and model.

Image Inputs

Requests with images to multimodal models are typically sent via the standard A4F /v1/chat/completions API, using a multi-part messages parameter as per OpenAI's specification. The image content can be either a URL or a base64-encoded image.

Note that multiple images can be sent in separate content array entries. The number of images allowed per request varies by provider and model. Because of how the content is parsed, we recommend sending the text prompt first, followed by the images. If the images must come first, we recommend placing them in the system prompt.
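As a sketch, a multi-image request body following the recommendation above (text entry first, then one entry per image) can be assembled like this; the image URLs here are placeholders:

```python
# Build a multi-part user message: the text prompt first, then one
# "image_url" entry per image, per the OpenAI-style content format.
image_urls = [
    "https://example.com/photo-1.jpg",  # placeholder URLs
    "https://example.com/photo-2.jpg",
]

content = [{"type": "text", "text": "Compare these two images."}]
content += [
    {"type": "image_url", "image_url": {"url": url}} for url in image_urls
]

messages = [{"role": "user", "content": content}]
```

The resulting `messages` list can be passed directly to `client.chat.completions.create(...)` as in the examples below; check your chosen provider's limit on images per request.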

Using Image URLs

Here's how to send an image using a URL:

import os
from openai import OpenAI

A4F_API_KEY = os.getenv("A4F_API_KEY")  # or paste your key directly
A4F_BASE_URL = "https://api.a4f.co/v1"
MODEL_ID = "provider-5/gpt-4o"

client = OpenAI(
    api_key=A4F_API_KEY,
    base_url=A4F_BASE_URL
)

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What's in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)

Using Base64 Encoded Images

For locally stored images, you can send them using base64 encoding. Here's how to do it:

import os
import base64
from openai import OpenAI

A4F_API_KEY = os.getenv("A4F_API_KEY")
A4F_BASE_URL = "https://api.a4f.co/v1"
MODEL_ID = "provider-5/gpt-4o"

client = OpenAI(
    api_key=A4F_API_KEY,
    base_url=A4F_BASE_URL
)

def encode_image_to_base64(image_path: str):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

image_path = "path/to/your/image.jpg"

try:
    base64_image = encode_image_to_base64(image_path)
    data_url = f"data:image/jpeg;base64,{base64_image}"
    response = client.chat.completions.create(
        model=MODEL_ID,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": "Describe this locally stored image."
                    },
                    {
                        "type": "image_url",
                        "image_url": {
                            "url": data_url
                        }
                    }
                ]
            }
        ]
    )
    print(response.choices[0].message.content)
except FileNotFoundError:
    print(f"Error: Image file not found at {image_path}")
except Exception as e:
    print(f"An error occurred: {e}")

Supported image content types (ensure your base64 data URL matches):

  • image/png
  • image/jpeg
  • image/webp
  • image/gif (non-animated)
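To keep the data URL prefix consistent with the file you are encoding, the MIME type can be derived from the file extension. A minimal helper sketch, validating against the supported-type list above (the function name is our own):

```python
import mimetypes

# MIME types accepted for base64 data URLs, per the list above.
SUPPORTED_IMAGE_TYPES = {"image/png", "image/jpeg", "image/webp", "image/gif"}

def data_url_prefix(image_path: str) -> str:
    """Return the 'data:<mime>;base64,' prefix for a supported image file."""
    mime, _ = mimetypes.guess_type(image_path)
    if mime not in SUPPORTED_IMAGE_TYPES:
        raise ValueError(f"Unsupported image type: {mime!r} for {image_path}")
    return f"data:{mime};base64,"
```

This avoids, for example, labeling a PNG file as `image/jpeg`, which some providers may reject.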

PDF Support

Because direct PDF input through the standard /v1/chat/completions endpoint is limited (see above), the most reliable approach is to extract the text from a PDF client-side and include it in your prompt.

Processing Extracted PDF Text

If you have already extracted text from a PDF, you can send it to A4F as part of your prompt. Here's a conceptual example:

import os
from openai import OpenAI

A4F_API_KEY = os.getenv("A4F_API_KEY")
A4F_BASE_URL = "https://api.a4f.co/v1"
MODEL_ID = "provider-1/some-text-model"

client = OpenAI(
    api_key=A4F_API_KEY,
    base_url=A4F_BASE_URL
)

pdf_text_content = """
Page 1: Introduction to A4F Services. A4F provides unified access...
Page 2: Key Features. Our platform offers affordability and uptime...
... (rest of your extracted PDF text) ...
"""

response = client.chat.completions.create(
    model=MODEL_ID,
    messages=[
        {
            "role": "user",
            "content": f"Summarize the following document content:\n\n{pdf_text_content}"
        }
    ]
)

print(response.choices[0].message.content)

Note that multiple PDF text segments (e.g., page by page, or chunked content) can be sent in separate messages or combined, depending on token limits and your prompting strategy.
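One simple chunking strategy is sketched below: split the extracted text on paragraph boundaries into fixed-size character chunks, then summarize each chunk in its own request. Character count is only a rough proxy for tokens, and the default size is an arbitrary placeholder to tune for your model's context window:

```python
def chunk_text(text: str, max_chars: int = 8000) -> list[str]:
    """Split text into chunks of at most max_chars, breaking on paragraph
    boundaries. Oversized single paragraphs are kept whole rather than split.
    max_chars is a rough character-based proxy for a token limit."""
    paragraphs = text.split("\n\n")
    chunks, current = [], ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent in its own chat-completion request, or chunk summaries can be combined in a final "summary of summaries" request.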

Response Format

The API will return a response in the following format (OpenAI compatible):

{
  "id": "gen-1234567890",
  "provider": "a4f-provider-alias",
  "model": "provider-X/actual-model-id",
  "object": "chat.completion",
  "created": 1234567890,
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The document discusses..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 1000,
    "completion_tokens": 100,
    "total_tokens": 1100
  }
}
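If you work with the raw JSON rather than the OpenAI SDK's response objects, the fields above are plain nested keys. A minimal sketch, using a hard-coded payload matching the format shown:

```python
import json

# Hard-coded example payload in the response format shown above.
raw = """{
  "id": "gen-1234567890",
  "provider": "a4f-provider-alias",
  "model": "provider-X/actual-model-id",
  "object": "chat.completion",
  "created": 1234567890,
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The document discusses..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 1000, "completion_tokens": 100, "total_tokens": 1100}
}"""

response = json.loads(raw)

# The assistant's reply and the token accounting are plain nested lookups.
answer = response["choices"][0]["message"]["content"]
total_tokens = response["usage"]["total_tokens"]
```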
