
PDF Suspicious

Scan a PDF for prompt-injection, homoglyph, and invisible-character attacks. The tool extracts the text of each page and runs text-suspicious over the whole document, and optionally over each page individually. Useful when an LLM is about to ingest a PDF: it catches phishing strings that imitate Latin letters, zero-width prompt injections, and confusable-script substitutions before the model sees them.


About PDF Suspicious

PDF Suspicious scans a PDF for prompt-injection, homoglyph, and invisible-character attacks before an LLM reads it. It extracts the text page by page and runs a suspicious-text analysis over the whole document, and optionally over each page on its own, flagging phishing strings that imitate Latin letters, zero-width injections, and confusable-script swaps. It runs in your browser, so a document you don't yet trust is never sent to a server.
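To make the two most mechanical checks concrete, here is a minimal sketch of how invisible characters and mixed-script words could be detected in extracted text. It is not the tool's implementation: the `INVISIBLE` set, the name-based `script_of` heuristic, and both function names are assumptions chosen for illustration, and a production scanner would more likely rely on the Unicode confusables data.

```python
import unicodedata

# Assumed short list of invisible code points; a real scanner would cover more.
INVISIBLE = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE
}


def script_of(ch: str) -> str:
    """Rough script guess taken from the start of the Unicode character name."""
    name = unicodedata.name(ch, "")
    for script in ("LATIN", "CYRILLIC", "GREEK"):
        if name.startswith(script):
            return script
    return "OTHER"


def find_invisible(text: str) -> list[tuple[int, str]]:
    """Return (offset, code point) for every invisible character found."""
    return [(i, f"U+{ord(ch):04X}") for i, ch in enumerate(text) if ch in INVISIBLE]


def find_mixed_script_words(text: str) -> list[str]:
    """Return whitespace-separated words that mix Latin, Cyrillic, or Greek letters."""
    flagged = []
    for word in text.split():
        scripts = {script_of(ch) for ch in word if ch.isalpha()}
        scripts.discard("OTHER")
        if len(scripts) > 1:
            flagged.append(word)
    return flagged
```

For example, `find_mixed_script_words("visit p\u0430ypal.com")` flags the second word because its second letter is Cyrillic, while a word written entirely in one script passes.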

Category: inspect
Input: application/pdf
Output: application/json
Cost: Free, runs in your browser
Memory: medium
Privacy: PDF Suspicious runs entirely on your device. Files you provide never leave your browser — no uploads, no server, no tracking. The page works offline once loaded.

Common uses

  • Vet a PDF resume or invoice for hidden prompt-injection text before pasting it into an AI assistant
  • Catch zero-width or invisible characters smuggled into a contract draft from an outside party
  • Flag homoglyph phishing — Cyrillic or Greek letters disguised as Latin in a brand name or URL
  • Pre-screen documents in an LLM ingestion pipeline so confusable-script payloads are caught per page
  • Audit a batch of vendor PDFs for confusable-script substitutions that bypass keyword filters
  • Confirm a scanned-then-OCR'd PDF didn't pick up suspicious Unicode before downstream summarization

Frequently asked questions

What does the output look like?

It returns JSON describing what was flagged — the suspicious strings, the category of issue (e.g. confusable script, invisible character, injection-style phrasing), and where in the document they appear.
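The exact schema is not documented here, so the following is only a hypothetical shape for such a report, built from the three pieces of information named above (flagged string, category, location). Every field name, the `Finding` class, and `to_report` are assumptions for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Finding:
    # All field names are hypothetical, not the tool's documented schema.
    category: str  # e.g. "invisible-character", "confusable-script"
    text: str      # the flagged string as extracted
    page: int      # 1-based page number
    offset: int    # character offset within that page's text


def to_report(findings: list[Finding]) -> str:
    """Serialize findings as JSON; non-ASCII is escaped so hidden characters show up."""
    payload = {
        "suspicious": bool(findings),
        "findings": [asdict(f) for f in findings],
    }
    return json.dumps(payload, indent=2, ensure_ascii=True)
```

Escaping non-ASCII characters (the `json.dumps` default) is a deliberate choice in this sketch: a zero-width space appears as `\u200b` in the report instead of staying invisible to whoever reads it.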

Can it check each page separately?

Yes. By default it analyzes the whole document's extracted text, and it can optionally run the same check on each page individually for finer-grained results.
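The difference between the two modes can be sketched as follows, assuming the page texts have already been extracted into a list of strings. The `analyze` function is a stand-in that only counts zero-width characters, and the `per_page` flag and result keys are hypothetical names, not the tool's actual options.

```python
# Assumed short list of zero-width code points, for illustration only.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def analyze(text: str) -> dict:
    """Stand-in check: count zero-width characters in a block of text."""
    return {"invisible_count": sum(ch in ZERO_WIDTH for ch in text)}


def scan(pages: list[str], per_page: bool = False) -> dict:
    """Always analyze the joined document; optionally repeat the check per page."""
    report = {"document": analyze("\n".join(pages))}
    if per_page:
        report["pages"] = [
            {"page": number, **analyze(text)}
            for number, text in enumerate(pages, start=1)
        ]
    return report
```

The whole-document pass answers "is anything wrong?", while the per-page pass tells you which page to open when something is.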

Does it work on scanned PDFs?

It analyzes extractable text, so image-only scans with no text layer won't yield much. Run OCR first to produce a text layer, then scan that.
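A pipeline can guard against this case by checking whether extraction produced any text at all before trusting a clean result. The sketch below assumes the per-page text is already available as a list of strings; the function name and the `min_chars` threshold are hypothetical.

```python
def pages_without_text(pages: list[str], min_chars: int = 1) -> list[int]:
    """Return 1-based numbers of pages whose extracted text is effectively empty."""
    return [
        number
        for number, text in enumerate(pages, start=1)
        if len(text.strip()) < min_chars
    ]
```

If this returns every page number, the PDF is probably image-only, and a report with no findings means "nothing could be checked" rather than "nothing suspicious was found".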

Is the PDF uploaded for scanning?

No. Text extraction and the suspicious-text analysis both run locally in your browser, which is the point — you can inspect an untrusted file without exposing it.

Does it remove the suspicious content?

No, it's a read-only inspection that reports findings as JSON. Use it to decide whether to trust, clean, or reject the document before further processing.
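One way to act on the findings downstream is a small triage step. The policy below is only an example under assumed category names: it rejects on injection-style phrasing or confusable scripts, cleans when the only issue is invisible characters, and trusts otherwise. The cleaning helper works on extracted text, not on the PDF file itself.

```python
# Hypothetical category names and policy; adjust to the categories actually reported.
REJECT_ON = {"injection-phrasing", "confusable-script"}
CLEAN_ON = {"invisible-character"}

# Assumed short list of zero-width code points to strip.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def triage(categories: set[str]) -> str:
    """Map the set of reported categories to 'reject', 'clean', or 'trust'."""
    if categories & REJECT_ON:
        return "reject"
    if categories & CLEAN_ON:
        return "clean"
    return "trust"


def strip_zero_width(text: str) -> str:
    """Remove the listed zero-width characters from extracted text."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)
```

Stripping is blunt: zero-width joiners are legitimate in some scripts and emoji sequences, so this kind of cleaning suits text that is expected to be plain Latin prose.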

Keywords

  • pdf
  • security
  • prompt-injection
  • confusable
  • audit
  • inspect
