Fuzzy Match Score Calculator
Free fuzzy match score calculator: compare two product strings and get Levenshtein, Jaro-Winkler, and token similarity scores in your browser.
This fuzzy match score calculator compares two product strings — names, descriptions, manufacturer part numbers, or supplier SKUs — and returns a similarity score so you can decide whether they refer to the same item. Paste a pair or a column of pairs and see how closely they match before you merge, cross-reference, or onboard them.
The interactive version of this tool is coming soon. It will run entirely in your browser — no login, no upload limits.
What it checks
For each pair of strings you enter, the calculator computes and reports:
- Levenshtein (edit distance) similarity: how many single-character insertions, deletions, or substitutions separate the two strings, normalized to a 0–100 score. Good for catching typos like "Loctite 243" vs "Loctite 234".
- Jaro-Winkler similarity: a 0–1 score that rewards matches at the start of a string, which suits part numbers and brand-led titles such as "SKF 6204-2RS" vs "SKF 6204 2RS".
- Token (set/sort) similarity: splits each string into words and compares the sets, so reordering and extra words score well. Useful when a CPG title reads "Organic Almond Butter 16oz" on one feed and "Almond Butter, Organic — 16 oz" on another.
- Normalized comparison: an optional pass that lowercases, strips punctuation, and collapses whitespace before scoring, so cosmetic formatting differences do not drag the score down.
- A blended verdict: a plain-language likely match / review / no match label based on the combined scores, with the contributing metric shown so you understand why.
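As a rough illustration of the normalization pass and the Levenshtein score described above, here is a minimal Python sketch. It is not the calculator's actual implementation: the function names are hypothetical, punctuation is replaced with a space rather than deleted (an assumption), and the 0–100 score divides the edit distance by the length of the longer string, which is one common convention among several.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, turn punctuation into spaces, and collapse whitespace.

    Assumption: punctuation becomes a space so that "6204-2RS" and
    "6204 2RS" normalize to the same value.
    """
    text = re.sub(r"[\W_]+", " ", text.lower())
    return " ".join(text.split())


def levenshtein_distance(a: str, b: str) -> int:
    """Count single-character insertions, deletions, and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or exact match)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: str, b: str) -> float:
    """Scale the edit distance to 0-100 using the longer string's length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - levenshtein_distance(a, b) / longest)
```

Under these assumptions, "Loctite 243" and "Loctite 234" are two edits apart over 11 characters, so the similarity is about 81.8, while "SKF 6204-2RS" and "SKF 6204 2RS" become identical once normalized.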
How fuzzy match scoring works
Fuzzy matching estimates how similar two pieces of text are when an exact, character-for-character match will not work. Catalog data is full of near-misses: a furniture supplier writes Walnut Veneer Side Table - 45cm, your ERP stores Side Table, Walnut Veneer, 450mm, and a marketplace feed lists WLNT-SIDE-TBL-45. None of these match on a string equality check, yet all describe one product.
Each algorithm measures similarity differently. Edit-distance methods count the character operations needed to turn one string into the other. Jaro-Winkler weighs matching and transposed characters and gives a bonus for a shared prefix. Token methods ignore word order and focus on shared vocabulary. Because each is strong on different error types — typos, abbreviations, reordering, added units — this calculator surfaces all of them so you can pick a threshold that fits your data rather than trusting a single number.
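The differences between these methods are easier to see in code. The sketch below is a plain-Python illustration, not the calculator's own code: it uses the commonly cited Jaro-Winkler defaults (prefix scale 0.1, prefix capped at 4 characters) and, as an assumption, scores token similarity as Jaccard overlap of word sets, which is only one of several ways to compare tokens. All function names are hypothetical.

```python
import re


def jaro(a: str, b: str) -> float:
    """Jaro similarity: matching characters within a window, minus transpositions."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_matched = [False] * len(a)
    b_matched = [False] * len(b)
    matches = 0
    for i, char in enumerate(a):
        start = max(0, i - window)
        end = min(len(b), i + window + 1)
        for j in range(start, end):
            if not b_matched[j] and b[j] == char:
                a_matched[i] = b_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as transpositions.
    a_seq = [c for c, m in zip(a, a_matched) if m]
    b_seq = [c for c, m in zip(b, b_matched) if m]
    transpositions = sum(x != y for x, y in zip(a_seq, b_seq)) // 2
    return (
        matches / len(a)
        + matches / len(b)
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(a: str, b: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Boost the Jaro score for a shared prefix of up to max_prefix characters."""
    score = jaro(a, b)
    prefix = 0
    for x, y in zip(a, b):
        if x != y or prefix == max_prefix:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def token_set_similarity(a: str, b: str) -> float:
    """Jaccard overlap of lowercase alphanumeric tokens; word order is ignored."""
    tokens_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    tokens_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
```

With this sketch, a reordered title such as "Side Table, Walnut Veneer" scores 1.0 against "Walnut Veneer Side Table" on the token metric, while "16oz" versus "16 oz" still counts as different tokens. That is the kind of unit-spacing gap that no single metric closes on its own.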
A score is a signal, not a decision. The right cutoff depends on your tolerance for false merges. For MRO and industrial distribution, where merging two distinct fasteners is costly, teams often hold a high bar and route borderline pairs to human review. For loosely structured CPG titles, a lower token-similarity threshold may be acceptable. The guidance below explains how to set those thresholds and why naive scripts struggle once volume grows.
Related resources
- Glossary: What Is Fuzzy Matching? The concepts behind approximate string matching and where it fits in catalog work.
- Glossary: Confidence Scores in Data Matching. How similarity scores become merge, review, and reject decisions.
- Tool: Levenshtein / Jaro-Winkler Calculator. Drill into a single algorithm and inspect the raw distance between two strings.
- Playbook: Match Supplier Catalogs to Inventory. A step-by-step process for reconciling incoming supplier data against your own SKUs.
- Guide: Why Fuzzy-Match Scripts Break at Scale. The failure modes of homegrown matching and what to do instead.
- Claro: Automated Catalog Matching. See how Claro resolves matches across millions of records with confidence scoring and provenance.
FAQ
What is a good fuzzy match score?
There is no universal cutoff — it depends on your data and your cost of error. As a starting point, blended similarity above roughly 90 usually indicates the same product, 75–90 warrants human review, and below 75 is likely a different item. For high-risk catalogs like industrial parts, raise the auto-merge bar and review more pairs manually.
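To make the starting-point cutoffs concrete, a minimal Python sketch might map a blended 0–100 score to a label as follows. The function name and parameter names are hypothetical, and the 90 and 75 defaults are only the rough starting points quoted above, not tuned values.

```python
def verdict(blended_score: float,
            auto_merge_above: float = 90.0,
            review_from: float = 75.0) -> str:
    """Map a blended 0-100 similarity score to a plain-language label."""
    if blended_score > auto_merge_above:
        return "likely match"
    if blended_score >= review_from:
        return "review"
    return "no match"


# A high-risk catalog can raise the auto-merge bar so more pairs go to review.
print(verdict(92.0))                         # uses the default cutoffs
print(verdict(92.0, auto_merge_above=95.0))  # stricter auto-merge bar
```

Keeping the cutoffs as parameters rather than constants makes it straightforward to tighten them per catalog or per field.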
Which is better, Levenshtein or Jaro-Winkler?
Neither is universally better. Levenshtein (edit distance) is intuitive for typos and short differences. Jaro-Winkler favors strings that share a prefix, which helps with part numbers and brand-led titles. This calculator shows both plus a token score so you can choose the metric that best fits the kind of variation in your fields.
Can I use this to match SKUs or part numbers?
Yes. Paste the two identifiers and read the Jaro-Winkler and normalized scores, which handle spacing and punctuation differences like 6204-2RS vs 6204 2RS well. For structured identifiers, also confirm with a deterministic check on the cleaned value, since a high fuzzy score alone can pair similar-but-distinct part numbers.
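One way to add that deterministic check is to strip separators and compare the cleaned values exactly. The Python sketch below assumes that spaces, hyphens, and other punctuation carry no meaning in your identifiers and that case is not significant. Both assumptions should be verified against your own catalog, and the function names are hypothetical.

```python
import re


def clean_identifier(value: str) -> str:
    """Drop everything except letters and digits, then uppercase."""
    return re.sub(r"[^A-Za-z0-9]", "", value).upper()


def same_identifier(a: str, b: str) -> bool:
    """Exact comparison on the cleaned values; no fuzzy tolerance."""
    return clean_identifier(a) == clean_identifier(b)


print(same_identifier("6204-2RS", "6204 2RS"))   # formatting-only difference
print(same_identifier("6204-2RS", "6204-2RZ"))   # one character differs
```

The second pair would still earn a high fuzzy score, which is why an exact check on the cleaned value is a useful guard for structured identifiers.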
Is my data sent anywhere when I use this tool?
No. The fuzzy match score calculator is designed to run entirely in your browser. Nothing is uploaded, stored, or transmitted, so you can safely test it with real supplier SKUs, pricing files, or product descriptions.
Why do my fuzzy-match scripts work in testing but fail in production?
Small samples hide the long tail of edge cases (multilingual titles, embedded units, transposed tokens, and near-duplicate part numbers), and the number of pairwise comparisons grows quadratically with catalog size, so a script that is fast on a few hundred rows can stall on a full catalog. The guide on why fuzzy-match scripts break at scale covers blocking, normalization, and confidence thresholds that keep accuracy stable as volume increases.
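Blocking, mentioned above, can be sketched in a few lines of Python: group records by a cheap key and only compare pairs that share it. The key used here (the first three characters of the first token) is purely hypothetical and chosen for illustration. A real key should be picked from your own data, since any pair that lands in different blocks is never compared.

```python
import re
from collections import defaultdict
from itertools import combinations


def blocking_key(title: str) -> str:
    """Hypothetical key: first three characters of the first alphanumeric token."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return tokens[0][:3] if tokens else ""


def candidate_pairs(records):
    """Yield only the pairs whose records share a blocking key."""
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)
    for block in blocks.values():
        yield from combinations(block, 2)


records = [
    "SKF 6204-2RS",
    "skf 6204 2RS",
    "Loctite 243",
    "Loctite 234",
    "Walnut Veneer Side Table",
]
pairs = list(candidate_pairs(records))
all_pairs = len(records) * (len(records) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

The trade-off is recall: a coarser key compares more pairs and misses fewer matches, while a finer key runs faster but can separate records that belong together.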