PDF Extract Images
Pull embedded raster images out of a PDF and pack them into a ZIP. Unlike pdf-to-image, which renders whole pages, this tool extracts the original photos and figures. Coverage varies by PDF: pdfjs preserves some image objects after rendering and not others. The ZIP always contains a _report.json listing which images were found, extracted, and skipped, so you can see which pages need a follow-up pass.
About PDF Extract Images
PDF Extract Images pulls the embedded raster photos and figures out of a PDF and packs them into a ZIP, preserving the originals rather than re-rendering pages. It differs from PDF to Image: that tool snapshots whole pages, while this one recovers the source images themselves. Coverage depends on how the PDF was built, so every ZIP includes a _report.json listing what was found, extracted, and skipped. It all runs in your browser, so the document is never uploaded.
- Category
- Input: application/pdf
- Output: application/zip
- Cost: Free, runs in your browser
- Memory: high
Common uses
- Recover the original product photos from a marketing PDF instead of screenshotting pages
- Pull figures and charts out of a research paper at their embedded resolution
- Extract logos or diagrams from a slide deck exported as PDF
- Salvage scanned page images from a document for re-processing elsewhere
- Read the _report.json to see exactly which pages need a follow-up extraction pass
- Batch the embedded images of a brochure into a single ZIP for an asset handoff
Frequently asked questions
How is this different from PDF to Image?
PDF to Image renders each whole page as a new image; this tool extracts the original embedded photos and figures rather than re-rendering anything.
Will it extract every image in any PDF?
Not always. Coverage varies because pdfjs preserves some image objects after rendering and not others. The included _report.json tells you what was found, extracted, and skipped.
What is the _report.json file for?
It lists, per page, which images were found, extracted, or skipped, so you can identify pages that need a follow-up pass.
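As a rough illustration of how such a report could be used, the Python sketch below flags pages where fewer images were extracted than were found. The report schema is not documented here, so the "pages", "page", "found", "extracted", and "skipped" keys are hypothetical; adjust them to match the actual _report.json.

```python
# Hypothetical report shape: the real _report.json keys may differ.
def pages_needing_follow_up(report: dict) -> list:
    """Return page numbers where fewer images were extracted than found."""
    flagged = []
    for entry in report.get("pages", []):
        # Missing counts are treated as zero.
        if entry.get("extracted", 0) < entry.get("found", 0):
            flagged.append(entry["page"])
    return sorted(flagged)


# Made-up sample data, for illustration only.
sample = {
    "pages": [
        {"page": 1, "found": 2, "extracted": 2, "skipped": 0},
        {"page": 2, "found": 3, "extracted": 1, "skipped": 2},
    ]
}
print(pages_needing_follow_up(sample))  # [2]
```

Comparing the two counts, rather than reading a skipped field directly, keeps the check working even if the report omits that field.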
Is my PDF uploaded?
No. Extraction runs entirely in your browser; the PDF and its images never leave your device.
What format is the output?
A single ZIP archive containing the extracted images plus the _report.json summary.
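Once downloaded, the archive can be inspected with any ZIP reader. The sketch below uses Python's standard zipfile module to separate the image entries from the report. It assumes _report.json sits at the root of the ZIP, which this page does not confirm; split_archive is a hypothetical helper name.

```python
import io
import json
import zipfile

# File name taken from the text above; its location at the ZIP root is an assumption.
REPORT_NAME = "_report.json"


def split_archive(zip_bytes: bytes):
    """Return (image_names, report) for an extracted-images ZIP held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
        # Every entry that is neither the report nor a directory is treated as an image.
        image_names = [n for n in names if n != REPORT_NAME and not n.endswith("/")]
        # report is None when the archive has no report at the assumed location.
        report = json.loads(archive.read(REPORT_NAME)) if REPORT_NAME in names else None
    return image_names, report
```

For a file on disk, pass its contents, for example split_archive(open("images.zip", "rb").read()), where images.zip stands in for whatever name the download was saved under.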
Keywords
- extract
- images
- photos
- figures
- embedded
- zip