HTML Clean

Strip HTML tags and decode entities to produce readable plain text. Drops <script>, <style>, and comments entirely. Unlike HTML to Markdown, this returns flat prose, not formatted Markdown.

About HTML Clean

HTML Clean strips every tag from an HTML document and decodes entities like &amp; and &#39;, leaving clean, readable prose. It drops <script>, <style>, and comment blocks entirely, contents included, so no hidden code or styling gets pasted into your destination. Unlike HTML to Markdown, it does not preserve formatting: the output is flat text. Everything runs in your browser, and nothing is uploaded.
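The behavior described above can be approximated in a few lines. This is a minimal sketch in Python using the standard library's `html.parser`, not the tool's actual in-browser implementation. The names `TextExtractor` and `html_clean`, the set of block-level tags, and the whitespace collapsing are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text and skips <script>/<style> contents."""

    SKIPPED = {"script", "style"}
    # Assumed set of tags that should separate words in the output.
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; and &#39;
        # before the text reaches handle_data.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1
        elif tag in self.BLOCK:
            self.parts.append(" ")

    def handle_data(self, data):
        # Text inside <script> or <style> is ignored.
        if self.skip_depth == 0:
            self.parts.append(data)

    # handle_comment keeps its default no-op, so comments are dropped.


def html_clean(markup: str) -> str:
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    # Assumption: collapse all whitespace runs (including non-breaking
    # spaces) into single spaces.
    return " ".join("".join(parser.parts).split())


if __name__ == "__main__":
    sample = (
        '<p>Tom &amp; Jerry&#39;s <a href="/x">show</a></p>'
        "<script>alert(1)</script><!-- note -->"
        "<style>p{color:red}</style><p>Done</p>"
    )
    print(html_clean(sample))  # Tom & Jerry's show Done
```

Attributes such as href are never read, so anchor text survives while the link target does not, which matches the flat-text output described here.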

Category: text
Input: text/html or text/plain
Output: text/plain
Cost: Free, runs in your browser
Memory: low
Privacy: HTML Clean runs entirely on your device. Files you provide never leave your browser — no uploads, no server, no tracking. The page works offline once loaded.

Common uses

  • Pull the readable copy out of an email's HTML body before pasting it into a notes app
  • Convert a chunk of CMS-exported HTML into plain text for a word-count or character limit check
  • Sanitize HTML you scraped or copied so no leftover <script> or inline styles travel with it
  • Turn an HTML product description into clean prose for a marketplace that rejects markup
  • Decode entity-encoded text (&quot;, &#8217;, &nbsp;) back into normal punctuation in one pass
  • Strip markup from an exported help-doc page so you can re-paste it into a plain-text ticket

Frequently asked questions

What's the difference between this and HTML to Markdown?

HTML Clean returns flat prose with no formatting markers. HTML to Markdown keeps structure — headings, links, lists — as Markdown syntax. Use this when you want text only.

Does it keep links or images?

No. Tags are removed entirely, so anchor text stays but the href and any image references are dropped. You get readable prose, not the markup behind it.

Is my HTML uploaded anywhere?

No. The stripping and entity decoding run entirely in your browser. Nothing is sent to a server, which makes it safe for internal emails or proprietary page source.

What inputs does it accept?

HTML (text/html) and plain text. If you paste text that already has no tags, you'll get it back with any HTML entities decoded.
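For tag-free input, the only visible effect is entity decoding. As a rough illustration, Python's standard `html.unescape` performs the same kind of decoding. The wrapper name `decode_entities` is hypothetical, and whether the tool turns a non-breaking space into a regular space is not stated, so this sketch leaves it as U+00A0.

```python
import html


def decode_entities(text: str) -> str:
    """Hypothetical wrapper: decode named and numeric HTML entities."""
    return html.unescape(text)


if __name__ == "__main__":
    print(decode_entities("&quot;Fish &amp; chips&quot;"))  # "Fish & chips"
    # &#8217; becomes a right single quotation mark (U+2019).
    print(decode_entities("it&#8217;s") == "it\u2019s")  # True
    # &nbsp; becomes a non-breaking space (U+00A0), not a plain space.
    print(decode_entities("a&nbsp;b") == "a\u00a0b")  # True
```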

Will script and style content leak into the output?

No. <script>, <style>, and HTML comments are removed wholesale, including their contents, so no JavaScript or CSS ends up in your text.

Keywords

  • html
  • clean
  • strip
  • plain
  • text
  • sanitize
  • extract
  • tags
