HTML Clean
Strip HTML tags and decode entities to produce readable plain text. Drops <script>, <style>, and comments entirely. Different from html-to-markdown — this returns prose, not formatted markdown.
About HTML Clean
HTML Clean strips every tag from an HTML document and decodes entities like &amp;amp; and &amp;#39; (which become & and ') so you're left with clean, readable prose. It drops <script>, <style>, and comment blocks entirely, so you never paste hidden code or styling into your destination. Unlike HTML to Markdown, this doesn't preserve formatting: it gives you flat text. Everything happens in your browser, with nothing uploaded.
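The behavior described above can be sketched with Python's standard-library HTMLParser. This is an illustration only, not the tool's actual implementation (which runs in the browser). The names TextExtractor and clean_html, the set of block-level tags, and the whitespace collapsing are assumptions made for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: keeps text nodes, drops tags, scripts, styles, comments."""

    SKIPPED = {"script", "style"}                      # contents are discarded
    BLOCK = {"p", "div", "br", "li", "h1", "h2", "h3"}  # assumed word boundaries

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities in text nodes.
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self._chunks.append(" ")  # keep adjacent blocks from fusing

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK:
            self._chunks.append(" ")

    def handle_data(self, data):
        # Text inside <script>/<style> arrives here too, so filter it out.
        if self._skip_depth == 0:
            self._chunks.append(data)

    # handle_comment keeps its default no-op, so comments never reach the output.

    def text(self):
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(self._chunks).split())


def clean_html(source: str) -> str:
    parser = TextExtractor()
    parser.feed(source)
    parser.close()
    return parser.text()
```

Calling clean_html on a fragment that mixes a paragraph, a script block, and a comment returns only the paragraph text with its entities decoded. Tracking a skip depth rather than deleting script blocks with a regular expression keeps the sketch tolerant of attributes and unusual spacing inside the tags.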
- Category: text
- Input: text/html or text/plain
- Output: text/plain
- Cost: free, runs in your browser
- Memory: low
Common uses
- Pull the readable copy out of an email's HTML body before pasting it into a notes app
- Convert a chunk of CMS-exported HTML into plain text for a word-count or character limit check
- Sanitize HTML you scraped or copied so no leftover <script> or inline styles travel with it
- Turn an HTML product description into clean prose for a marketplace that rejects markup
- Decode entity-encoded text (&amp;quot;, &amp;rsquo;, &amp;nbsp;) back into normal punctuation and spacing in one pass
- Strip markup from an exported help-doc page so you can re-paste it into a plain-text ticket
Frequently asked questions
What's the difference between this and HTML to Markdown?
HTML Clean returns flat prose with no formatting markers. HTML to Markdown keeps structure — headings, links, lists — as Markdown syntax. Use this when you want text only.
Does it keep links or images?
No. Tags are removed entirely, so anchor text stays but the href and any image references are dropped. You get readable prose, not the markup behind it.
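To see why the anchor text survives while the href and image references do not, here is a minimal Python sketch that collects only text nodes. It is an assumption-laden illustration, not the tool's code: the class TextNodesOnly, the function strip_tags, and the sample markup are all hypothetical.

```python
from html.parser import HTMLParser


class TextNodesOnly(HTMLParser):
    """Hypothetical parser that records text nodes and never reads attributes."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        # Only text between tags arrives here; href, src, and alt are
        # attributes, so they are never collected.
        self.parts.append(data)


def strip_tags(source: str) -> str:
    parser = TextNodesOnly()
    parser.feed(source)
    parser.close()
    return "".join(parser.parts)


sample = (
    'Read the <a href="https://example.com/docs">docs</a> '
    '<img src="logo.png" alt="Logo">today.'
)
result = strip_tags(sample)
print(result)
```

The link contributes its visible text ("docs") and nothing else. The image has no text node at all, so it leaves no trace in the result.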
Is my HTML uploaded anywhere?
No. The stripping and entity decoding run entirely in your browser. Nothing is sent to a server, which makes it safe for internal emails or proprietary page source.
What inputs does it accept?
HTML (text/html) and plain text. If you paste text that already has no tags, you'll get it back with any HTML entities decoded.
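For tag-free input, the effect is roughly what Python's standard html.unescape does. This is a sketch of the idea rather than the tool's own code, and the sample strings are made up for illustration.

```python
import html

# Plain text with no tags, but with entity-encoded punctuation.
encoded = "&quot;5 &lt; 7&quot; &amp; it&rsquo;s true"
decoded = html.unescape(encoded)
print(decoded)

# Text with no entities passes through unchanged.
untouched = html.unescape("Nothing to decode here.")
```

Named references (such as &amp;quot;) and numeric references (such as &amp;#39;) are handled by the same call.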
Will script and style content leak into the output?
No. <script>, <style>, and HTML comments are removed wholesale, including their contents, so no JavaScript or CSS ends up in your text.
Keywords
- html
- clean
- strip
- plain
- text
- sanitize
- extract
- tags