Describe Image
Generate a plain-English description of any image. ~100 MB model downloads on first use, then works offline.
About Describe Image
Describe Image generates a plain-English description of any photo or graphic, entirely on your device. A roughly 100 MB model downloads the first time you use it, then everything runs offline in your browser with no uploads. It's a quick way to draft alt-text or get a one-line summary of what an image shows.
- Category: export
- Input: image/jpeg, image/png, image/webp, or image/bmp
- Output: text/plain
- Cost: Free, runs in your browser
- Memory: medium
- Install group: vision-llm
Common uses
- Draft starter alt-text for images on a website to improve accessibility
- Generate captions for a batch of product photos as a first pass before editing
- Get a quick description of a screenshot when you can't view it yourself
- Add descriptive text to images in documentation or a knowledge base
- Caption photos for a personal archive so they're easier to search later
- Produce a baseline image description offline, with no account or upload required
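For the alt-text use case, the generated caption usually ends up inside an HTML attribute, so it needs escaping first. The sketch below shows one way to do that; `escapeAttribute` and `imgWithAlt` are hypothetical helper names, not part of the tool, and the caption is assumed to arrive as a plain string.

```typescript
// Hypothetical helpers for illustration; the tool itself only outputs plain text.

// Escapes characters that could break out of a double-quoted HTML attribute.
function escapeAttribute(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Builds an <img> tag that uses the caption as starter alt text.
function imgWithAlt(src: string, caption: string): string {
  const alt = caption.trim();
  return `<img src="${escapeAttribute(src)}" alt="${escapeAttribute(alt)}">`;
}
```

A generated caption is a starting point. Review and edit it before publishing, since good alt text depends on the context the image appears in.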
Frequently asked questions
Does my image get uploaded?
No. After the model downloads once, captioning runs entirely in your browser. Because the model runs locally, the image never leaves your device and the tool keeps working offline.
What formats can I caption?
JPEG, PNG, WebP, and BMP images.
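If you are preparing files in your own script before captioning, a small check like the one below can filter out unsupported formats. It is a sketch only: `isSupportedImageType` is a hypothetical name, and the MIME types are copied from the accepted-input list on this page.

```typescript
// Hypothetical helper, not part of the tool.
// MIME types taken from the accepted inputs listed on this page.
const SUPPORTED_TYPES: ReadonlySet<string> = new Set([
  "image/jpeg",
  "image/png",
  "image/webp",
  "image/bmp",
]);

// Returns true when a MIME type (for example, from File.type) is accepted.
// Any parameters after ";" are dropped and letter case is ignored.
function isSupportedImageType(mimeType: string): boolean {
  const [essence = ""] = mimeType.split(";");
  return SUPPORTED_TYPES.has(essence.trim().toLowerCase());
}
```

For example, `isSupportedImageType("image/png")` is true, while `isSupportedImageType("image/gif")` is false.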
Why is there a download the first time?
The vision model (~100 MB) is fetched once and cached, so the first run is slower. After that it loads from cache and works without a network connection.
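The download-once behavior can be illustrated with a generic "check the cache, otherwise fetch and store" pattern. This is a minimal sketch under stated assumptions, not the tool's actual code: `ModelStore`, `Fetcher`, `loadModelOnce`, and `MemoryStore` are hypothetical names, and a real browser implementation would likely back the store with persistent storage such as the Cache API or IndexedDB.

```typescript
// Hypothetical storage interface; a browser version might wrap the Cache API or IndexedDB.
interface ModelStore {
  get(key: string): Promise<ArrayBuffer | undefined>;
  put(key: string, data: ArrayBuffer): Promise<void>;
}

// Hypothetical download function, injected so the logic stays independent of the network layer.
type Fetcher = (url: string) => Promise<ArrayBuffer>;

// Returns the model bytes, downloading only when the store has no copy yet.
async function loadModelOnce(
  url: string,
  store: ModelStore,
  fetchModel: Fetcher,
): Promise<ArrayBuffer> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return cached; // later runs: served from the store, no network needed
  }
  const data = await fetchModel(url); // first run: the slow download
  await store.put(url, data);
  return data;
}

// In-memory store for demonstration only; it does not persist across page loads.
class MemoryStore implements ModelStore {
  private items = new Map<string, ArrayBuffer>();

  async get(key: string): Promise<ArrayBuffer | undefined> {
    return this.items.get(key);
  }

  async put(key: string, data: ArrayBuffer): Promise<void> {
    this.items.set(key, data);
  }
}
```

Calling `loadModelOnce` twice with the same URL and store triggers the fetcher only on the first call, which mirrors the slower first run followed by cached loads.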
How detailed are the captions?
This is the standard captioner: it produces short, plain descriptions. For richer, more accurate descriptions, use Describe Image (Detailed), which runs a larger BLIP-base model.
Is it free?
Yes, completely free. It runs on your hardware, so there are no credits or per-run costs.
Keywords
- caption
- describe
- image
- alt-text
- accessibility
- vlm
- vision