Unicode Normalize

Normalize Unicode text to NFC, NFD, NFKC, or NFKD — removes invisible differences between identical-looking strings.

About Unicode Normalize

Unicode Normalize converts text to one of the four standard normalization forms — NFC, NFD, NFKC, or NFKD — so that strings which look identical on screen become byte-for-byte identical underneath. It is the fix for the maddening bug where two names or filenames appear the same but compare as unequal. The conversion happens locally in your browser, so your text is never sent to a server.
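The mismatch described above can be reproduced in a few lines. This is a minimal sketch that uses Python's standard `unicodedata` module purely for illustration (the tool itself relies on the browser's engine); the variable names are hypothetical.

```python
import unicodedata

# Two spellings of the same word that render identically on screen.
composed = "caf\u00e9"     # é as one code point (U+00E9)
decomposed = "cafe\u0301"  # e followed by U+0301 COMBINING ACUTE ACCENT

print(composed == decomposed)  # False: the code point sequences differ

# After converting both to the same form, the strings compare equal.
nfc_a = unicodedata.normalize("NFC", composed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)  # True

# With the same encoding, the bytes are now identical as well.
print(nfc_a.encode("utf-8") == nfc_b.encode("utf-8"))  # True
```

Either form works for comparison as long as both sides use the same one.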

Category: text
Input: text/plain
Output: text/plain
Cost: free, runs in your browser
Memory: low
Privacy: Unicode Normalize runs entirely on your device. Files you provide never leave your browser — no uploads, no server, no tracking. The page works offline once loaded.

Common uses

  • Reconcile an accented name such as "café" stored with a single precomposed é (U+00E9) against the same name stored as e plus a combining acute accent (U+0301), so the records finally match
  • Clean user-submitted data before inserting it into a database that treats canonically equivalent strings as distinct
  • Collapse full-width and half-width characters with NFKC before running a search or deduplication pass
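  • Normalize filenames copied between macOS (whose older HFS+ file system stores names in a decomposed, NFD-like form) and Windows or Linux (which store names as given, usually NFC) to stop phantom duplicates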
  • Prepare text for hashing or signing so that visually identical inputs produce the same digest
  • Fold ligatures and stylistic variants (such as ﬁ, superscripts, or full-width letters) into their plain equivalents with the compatibility (NFKC/NFKD) forms; note that accents are kept, not removed
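Two of the uses above, hashing and deduplication, might look roughly like the following. This is only a sketch under stated assumptions: `stable_digest` and `dedupe` are hypothetical helper names, SHA-256 over UTF-8 is an arbitrary choice of digest and encoding, and Python's `unicodedata` stands in for the browser's engine.

```python
import hashlib
import unicodedata


def stable_digest(text: str, form: str = "NFC") -> str:
    """Hash text after normalizing it, so equivalent inputs share a digest."""
    normalized = unicodedata.normalize(form, text)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(names, form: str = "NFKC"):
    """Keep the first spelling seen for each normalized key, in input order."""
    seen = {}
    for name in names:
        key = unicodedata.normalize(form, name)
        seen.setdefault(key, name)
    return list(seen.values())


# Precomposed and decomposed "café" hash to the same value.
print(stable_digest("caf\u00e9") == stable_digest("cafe\u0301"))  # True

# Full-width "ＡＢＣ" and plain "ABC" collapse to one entry under NFKC.
print(len(dedupe(["\uff21\uff22\uff23", "ABC", "abc"])))  # 2
```

Normalization does not change letter case, so "ABC" and "abc" remain distinct; case folding would be a separate step.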

Frequently asked questions

Which normalization form should I pick?

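NFC is the most common choice for storage and display: it combines each base character and its combining marks into a single precomposed code point wherever one exists. NFKC additionally folds compatibility variants (full-width, ligatures, superscripts) into plain equivalents, which is useful for search and matching but loses some formatting distinctions. NFD and NFKD are the decomposed counterparts, mainly useful when you need to process base letters and marks separately.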

What is the difference between the C and D forms?

C (composed) merges base characters and combining marks into single codepoints where possible; D (decomposed) splits them apart into base plus combining marks. The K variants (NFKC/NFKD) add compatibility folding on top.
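To make the four forms concrete, the sketch below prints the code points each one produces for a two-character sample. It uses Python's `unicodedata` for illustration; `code_points` and `sample` are hypothetical names.

```python
import unicodedata


def code_points(text: str):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# A precomposed é (U+00E9) followed by the ﬁ ligature (U+FB01).
sample = "\u00e9\ufb01"

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, code_points(unicodedata.normalize(form, sample)))

# NFC  ['U+00E9', 'U+FB01']                      composed, ligature kept
# NFD  ['U+0065', 'U+0301', 'U+FB01']            é split, ligature kept
# NFKC ['U+00E9', 'U+0066', 'U+0069']            composed, ligature folded to f + i
# NFKD ['U+0065', 'U+0301', 'U+0066', 'U+0069']  é split, ligature folded
```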

Will normalizing change how my text looks?

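Canonical forms (NFC/NFD) are designed to preserve appearance: canonically equivalent sequences should render identically, although some fonts draw combining marks less cleanly than precomposed characters. Compatibility forms (NFKC/NFKD) may change appearance, for example turning a ligature or full-width digit into its plain equivalent, because they prioritize matching over visual fidelity.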
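One way to preview which characters a compatibility form would visibly alter is to compare NFC and NFKC character by character. The helper below is a rough sketch with a hypothetical name, `compatibility_changes`; checking characters in isolation is a simplification, since real normalization also considers neighbouring combining marks.

```python
import unicodedata


def compatibility_changes(text: str):
    """List (original, folded) pairs for characters that NFKC alters but NFC leaves alone."""
    changes = []
    for ch in text:
        unchanged_by_nfc = unicodedata.normalize("NFC", ch) == ch
        folded = unicodedata.normalize("NFKC", ch)
        if unchanged_by_nfc and folded != ch:
            changes.append((ch, folded))
    return changes


# "ﬁle x² １": ligature ﬁ, superscript two, full-width digit one.
for original, folded in compatibility_changes("\ufb01le x\u00b2 \uff11"):
    print(f"U+{ord(original):04X} -> {folded!r}")
# U+FB01 -> 'fi'
# U+00B2 -> '2'
# U+FF11 -> '1'
```

An empty result suggests that, for this input, NFKC and NFC would look the same.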

Does my text leave my device?

No. Normalization uses your browser's built-in Unicode engine and runs entirely client-side. Nothing is uploaded.

Is there a size limit on the input?

There is no practical limit for ordinary text. Normalization is a fast, roughly linear pass, so even large documents process quickly in the browser; very large inputs are bounded only by the memory available to the page.

Keywords

  • unicode
  • normalize
  • nfc
  • nfd
  • nfkc
  • nfkd
  • encoding
  • text
