Send an Image to Claude for Visual Analysis
Claude 3.5 Sonnet and Claude 3 Opus are natively multimodal: they can process images alongside text in the same messages array, enabling tasks like document OCR, diagram explanation, screenshot analysis, and visual question answering. This example shows how to include a base64-encoded image in a Messages API request and structure the accompanying text prompt for reliable visual analysis.

Images can be provided in two ways: as base64-encoded data with a media type (image/jpeg, image/png, image/gif, image/webp), or as a URL that Claude fetches at inference time. The base64 approach is more reliable for production use because it eliminates latency from external fetches and avoids failures caused by URL authentication or expiry. The image is included as an image content block within the user message, alongside a text content block with the question.

Image token counts are estimated from the pixel dimensions: for standard images, tokens ≈ (width × height) / 750, with a minimum of 300 tokens per image. A typical 1024×768 screenshot consumes about 1,049 tokens (786,432 / 750), more than many text prompts. Factor image tokens into your context window and cost calculations, especially when processing many images in batch jobs.
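The token arithmetic described here can be sketched as a small helper. This is a rough estimator only, assuming the (width × height) / 750 formula and the 300-token floor quoted in this text; the name `estimate_image_tokens` is hypothetical, and the sketch ignores any resizing applied to large images before processing.

```python
import math

PIXELS_PER_TOKEN = 750   # divisor from the formula quoted in the text
MIN_IMAGE_TOKENS = 300   # per-image floor quoted in the text (an assumption here)


def estimate_image_tokens(width: int, height: int) -> int:
    """Estimate the token cost of one image from its pixel dimensions."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Round up so the estimate never undercounts a partial token.
    return max(MIN_IMAGE_TOKENS, math.ceil(width * height / PIXELS_PER_TOKEN))


if __name__ == "__main__":
    print(estimate_image_tokens(1024, 768))  # 1049
```

Multiplying the per-image estimate by the number of images gives a first approximation of the image share of a batch job's input tokens.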
{
  "model": "claude-3-5-sonnet-20241022",
  "max_tokens": 1024,
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "image",
          "source": {
            "type": "base64",
            "media_type": "image/png",
            "data": "<base64-encoded-image-data>"
          }
        },
        {
          "type": "text",
          "text": "Please extract all text from this screenshot and identify any UI elements or buttons visible in the interface."
        }
      ]
    }
  ]
}
FAQ
- How many tokens does an image use?
- Claude uses approximately (width × height) / 750 tokens per image, with a minimum of 300. By that formula a 1920×1080 screenshot would come to about 2,765 tokens, but high-resolution images are resized before processing, so you rarely exceed 1,600 tokens per image regardless of source resolution.
- What image formats does Claude support?
- Claude accepts JPEG, PNG, GIF, and WebP formats. For documents, PDFs can also be included as base64-encoded content. The maximum image file size is 5 MB per image.
- Can I include multiple images in one request?
- Yes. Include multiple image content blocks in the content array of a message. Each image counts toward your context window. You can mix images and text freely within and across messages.
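As an illustration of the multi-image answer, here is a minimal sketch that assembles a request body with several base64 image blocks followed by one text block. It uses only the Python standard library and sends nothing over the network; `image_block` and `build_request` are hypothetical helper names, and the placeholder bytes stand in for real image files.

```python
import base64
import json

# Media types listed in the FAQ above.
SUPPORTED_MEDIA_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block(image_bytes: bytes, media_type: str) -> dict:
    """Wrap raw image bytes in a base64 image content block."""
    if media_type not in SUPPORTED_MEDIA_TYPES:
        raise ValueError(f"unsupported media type: {media_type}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
    }


def build_request(images, question, model="claude-3-5-sonnet-20241022", max_tokens=1024):
    """Build a request body with one user message: all images, then the question."""
    content = [image_block(data, media_type) for data, media_type in images]
    content.append({"type": "text", "text": question})
    return {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": content}],
    }


if __name__ == "__main__":
    # Placeholder bytes; in practice read each file with open(path, "rb").read().
    body = build_request(
        [(b"first-image-bytes", "image/png"), (b"second-image-bytes", "image/jpeg")],
        "What differs between these two screenshots?",
    )
    print(json.dumps(body, indent=2))
```

Placing the images before the question mirrors the single-image request shown earlier; the resulting dictionary can be serialized with `json.dumps` and sent as the request body by whatever HTTP client you already use.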