Vision (image input)

Send images via the OpenAI-standard image_url format to models that support image input. Image support, size limits, and billing depend on the selected model.

Choose a model with image input

Filter for vision or multimodal capability in /models, then test with representative images before launch.
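The filtering step could be sketched as below. The schema of the /models response is not described on this page, so the `input_modalities` field and the sample entries are hypothetical placeholders; substitute whatever capability field the actual response exposes.

```python
def vision_models(models):
    """Return the IDs of models whose metadata lists image input.

    ASSUMPTION: each entry is a dict with an "id" and a hypothetical
    "input_modalities" list. The real /models schema may differ.
    """
    return [
        m["id"]
        for m in models
        if "image" in m.get("input_modalities", [])
    ]


# Made-up sample of a /models payload, for illustration only.
sample = [
    {"id": "text-only-model", "input_modalities": ["text"]},
    {"id": "vision-model", "input_modalities": ["text", "image"]},
    {"id": "no-metadata-model"},
]

print(vision_models(sample))
```

Entries without the capability field are treated as text-only here, which errs on the side of not sending images to a model that may reject them.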

Example

Images can be passed either as a URL or as a base64 data URL. Use base64 for local or private images, but keep in mind that base64 encoding inflates the payload by roughly one third, so large images can hit route or model request-size limits.

Python

import base64
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://test.sealink.io/v1",
    api_key=os.environ["SEALINK_API_KEY"],
)

# From a URL:
resp = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
)
print(resp.choices[0].message.content)

# From a local file (base64 data URL; the MIME type should match the file format):
with open("invoice.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

resp = client.chat.completions.create(
    model="qwen3.6-plus",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract the total amount and date."},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{b64}"},
                },
            ],
        }
    ],
)
print(resp.choices[0].message.content)
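The base64 example above hard-codes `image/png` and does not check the payload size. A small helper along the following lines could derive the MIME type from the file extension and reject oversized payloads before the request is sent. The helper name and the 5 MiB cap are assumptions made for illustration; the real limit depends on the route and the model.

```python
import base64
import mimetypes
from pathlib import Path

# ASSUMPTION: placeholder cap, not a documented limit.
MAX_DATA_URL_BYTES = 5 * 1024 * 1024


def image_to_data_url(path, max_bytes=MAX_DATA_URL_BYTES):
    """Encode a local image file as a base64 data URL (hypothetical helper)."""
    path = Path(path)

    # Guess the MIME type from the extension (.png -> image/png, etc.).
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path.name}")

    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    url = f"data:{mime};base64,{encoded}"

    # Base64 output is about 4/3 the size of the raw bytes, so check
    # the encoded length rather than the size of the file on disk.
    if len(url) > max_bytes:
        raise ValueError(
            f"data URL is {len(url)} bytes, above the {max_bytes}-byte cap"
        )
    return url
```

The result can be passed directly as the `url` value of an `image_url` content part, for example `{"type": "image_url", "image_url": {"url": image_to_data_url("invoice.png")}}`.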

Typical use cases

Invoice / receipt OCR with field extraction
E-commerce product attribute extraction (color / material / style)
UI screenshot description / automated test assertions
Student homework / exam grading
Screenshare + agent control

Token billing

Each model counts image tokens differently: for example, Claude models charge by image size, while OpenAI models charge by tile count. See the per-model rules on each model's detail page.
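Because the rules vary by model, one practical way to gauge the image cost is to compare the reported prompt tokens of the same request with and without the image. The sketch below assumes the response carries an OpenAI-style `usage.prompt_tokens` field; the function name is hypothetical, and the stand-in responses use made-up numbers rather than measured ones.

```python
from types import SimpleNamespace


def image_token_overhead(with_image, text_only):
    """Extra prompt tokens attributable to the image (hypothetical helper).

    ASSUMPTION: both responses expose OpenAI-style usage.prompt_tokens.
    """
    return with_image.usage.prompt_tokens - text_only.usage.prompt_tokens


# Stand-ins for two chat completion responses; the token counts are
# invented for illustration and do not reflect any real model.
with_image = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=1200))
text_only = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=15))

print(image_token_overhead(with_image, text_only))  # 1185
```

Running this comparison on a few representative images per model gives a rough per-image budget to check against the rules on the model's detail page.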