Unicode Normalize
Normalize Unicode text to NFC, NFD, NFKC, or NFKD — removes invisible differences between identical-looking strings.
About Unicode Normalize
Unicode Normalize converts text to one of the four standard normalization forms — NFC, NFD, NFKC, or NFKD — so that strings which look identical on screen become byte-for-byte identical underneath. It is the fix for the maddening bug where two names or filenames appear the same but compare as unequal. The conversion happens locally in your browser, so your text is never sent to a server.
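The mismatch described above can be reproduced in a few lines. This is a minimal sketch that uses Python's standard `unicodedata` module as a stand-in for the browser's engine (an assumption made for illustration; the tool itself runs in the browser). The variable names are hypothetical.

```python
import unicodedata

# Two spellings of "café" that render identically:
composed = "caf\u00e9"      # é as one code point, U+00E9
decomposed = "cafe\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# Before normalization the strings differ in length and content.
print(composed == decomposed)            # False
print(len(composed), len(decomposed))    # 4 5

# After converting both to the same form they are identical,
# down to their UTF-8 bytes.
nfc_a = unicodedata.normalize("NFC", composed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)                                    # True
print(nfc_a.encode("utf-8") == nfc_b.encode("utf-8"))    # True
```

Any of the four forms would make the two strings equal; what matters is that both sides of a comparison use the same one.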
- Category: text
- Input: text/plain
- Output: text/plain
- Cost: free, runs in your browser
- Memory: low
Common uses
- Reconcile records where an accented name such as "café" is stored with a single precomposed é in one place and an e plus a combining accent in another, so the two finally match
- Clean user-submitted data before inserting it into a database that treats canonically equivalent strings as distinct
- Collapse full-width and half-width characters with NFKC before running a search or deduplication pass
- Normalize filenames copied between macOS, which has historically stored names in a decomposed (NFD-style) form, and Windows or Linux, where names are usually NFC, to stop phantom duplicates
- Prepare text for hashing or signing so that visually identical inputs produce the same digest
- Fold ligatures and stylistic variants (for example ﬁ, ①, or full-width ２) into their plain equivalents with the compatibility (NFKx) modes; accented letters and other non-ASCII text are kept
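Two of the uses above, hashing and deduplication, can be sketched as follows. This is an illustration only, written with Python's standard `unicodedata` and `hashlib` modules rather than the browser engine the tool uses; the helper names `stable_digest` and `dedupe` are hypothetical, and SHA-256 is an arbitrary choice of hash.

```python
import hashlib
import unicodedata
from typing import Iterable, List


def stable_digest(text: str, form: str = "NFC") -> str:
    """Hex SHA-256 of the text after normalizing it to `form`."""
    normalized = unicodedata.normalize(form, text)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe(names: Iterable[str], form: str = "NFC") -> List[str]:
    """Keep the first occurrence of each name, comparing normalized forms."""
    seen = set()
    unique = []
    for name in names:
        key = unicodedata.normalize(form, name)
        if key not in seen:
            seen.add(key)
            unique.append(name)
    return unique


# "résumé.txt" typed two ways: precomposed é versus e + combining accent.
composed = "r\u00e9sum\u00e9.txt"
decomposed = "re\u0301sume\u0301.txt"

print(stable_digest(composed) == stable_digest(decomposed))  # True
print(len(dedupe([composed, decomposed, "notes.txt"])))      # 2
```

Passing `form="NFKC"` to either helper would additionally treat full-width and half-width variants as the same text.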
Frequently asked questions
Which normalization form should I pick?
NFC is the most common choice for storage and display: it composes base characters and combining marks into a single precomposed code point wherever one exists. NFKC additionally folds compatibility variants (full-width, ligatures, superscripts) into plain equivalents, which is useful for search and matching but loses some formatting distinctions.
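One way to act on this advice is to keep an NFC copy for storage and derive a separate NFKC key for matching. The sketch below assumes Python's standard `unicodedata` module; the function names `for_storage` and `for_matching` are hypothetical.

```python
import unicodedata


def for_storage(text: str) -> str:
    """Canonical composed form: safe for storage and display."""
    return unicodedata.normalize("NFC", text)


def for_matching(text: str) -> str:
    """Compatibility composed form: also folds full-width, ligatures, etc."""
    return unicodedata.normalize("NFKC", text)


# Full-width "FILE", a space, then "fix" written with the fi ligature (U+FB01).
query = "\uff26\uff29\uff2c\uff25 \ufb01x"

print(for_storage(query) == query)   # True: NFC leaves compatibility characters alone
print(for_matching(query))           # FILE fix
```

Keeping the two values separate means the formatting distinctions that NFKC discards are still available in the stored text.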
What is the difference between the C and D forms?
C (composed) merges base characters and combining marks into single codepoints where possible; D (decomposed) splits them apart into base plus combining marks. The K variants (NFKC/NFKD) add compatibility folding on top.
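The four forms can be compared side by side on a single word. This is a minimal sketch using Python's standard `unicodedata` module (an assumption for illustration); `code_points` is a hypothetical helper.

```python
import unicodedata


def code_points(text: str) -> str:
    """Render a string as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# "fiancé" written with the fi ligature (U+FB01) and a precomposed é (U+00E9).
sample = "\ufb01anc\u00e9"

forms = {form: unicodedata.normalize(form, sample)
         for form in ("NFC", "NFD", "NFKC", "NFKD")}

for form, text in forms.items():
    print(f"{form:4} {len(text)} code points: {code_points(text)}")
```

For this sample, NFC keeps 5 code points, NFD splits é into 6, NFKC expands the ligature but keeps é for 6, and NFKD does both for 7.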
Will normalizing change how my text looks?
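Canonical forms (NFC/NFD) are designed to preserve appearance: canonically equivalent sequences should render identically, although a font with weak combining-mark support may draw the decomposed form slightly differently. Compatibility forms (NFKC/NFKD) may change appearance, for example turning a ligature or full-width digit into its plain equivalent, because they prioritize matching over visual fidelity.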
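To preview which characters a given form would visibly alter, each character can be normalized on its own and compared with the original. The sketch below assumes Python's standard `unicodedata` module, and `changed_characters` is a hypothetical helper. Because it checks one character at a time, it does not report a base letter and combining mark being merged under NFC, which changes code points but not appearance.

```python
import unicodedata


def changed_characters(text: str, form: str) -> list:
    """List (original, normalized) pairs for characters the form would alter."""
    changes = []
    for ch in text:
        normalized = unicodedata.normalize(form, ch)
        if normalized != ch:
            changes.append((ch, normalized))
    return changes


# fi ligature, full-width digit two, and superscript two.
sample = "\ufb01le \uff12 x\u00b2"

print(changed_characters(sample, "NFC"))    # []
print(changed_characters(sample, "NFKC"))   # [('ﬁ', 'fi'), ('２', '2'), ('²', '2')]
```

An empty list for NFC and three entries for NFKC matches the distinction drawn above between canonical and compatibility forms.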
Does my text leave my device?
No. Normalization uses your browser's built-in Unicode engine and runs entirely client-side. Nothing is uploaded.
Is there a size limit on the input?
No practical limit for ordinary text. Normalization is a fast linear pass, so even large documents process quickly in the browser.
Keywords
- unicode
- normalize
- nfc
- nfd
- nfkc
- nfkd
- encoding
- text