Reconcile Supplier Catalogs: A Practical Guide for Distributors

How to reconcile supplier catalogs into one clean inventory: collapse duplicates, match SKUs across sources, and keep a canonical record that holds its shape.

You have fifty supplier files open and none of them agree. The same hex bolt appears as M8X40-A2, M8 x 40mm A2-70, and BOLT,HEX,M8,40. One CPG vendor ships UPCs, an MRO supplier ships internal part numbers, and a furniture line ships nothing but a free-text description and a price. Reconciling supplier catalogs into one inventory is not a matter of merging spreadsheets: for every row, you are deciding whether two records describe the same physical product. Done by hand across fifty sources, that decision repeats hundreds of thousands of times, and the errors compound silently into duplicate SKUs, broken pricing, and an inventory count nobody trusts.

Claro is built for exactly this problem. It resolves product identity across supplier feeds, enriches missing attributes, validates the result against your schema, and writes the clean canonical record back into your existing PIM or ERP, so the reconciled state holds even as suppliers keep sending new files.

This guide breaks the work into the decisions that actually matter, so you can run the reconciliation as a repeatable pipeline rather than a one-off cleanup that decays the moment the next price file lands.

The before/after: messy feeds vs. trusted inventory

Before reconciliation	After reconciliation with Claro
Same product spans 3–5 rows with different identifiers	One canonical record per product with a full attribute set
Conflicting prices and stock levels per duplicate	Single source of truth written back to PIM/ERP
Buyers manually cross-reference part numbers	Deterministic and fuzzy matching handled automatically
New supplier file creates fresh duplicates	Each file matched against existing golden records on ingest
Missing specs block listings or require manual lookup	Attributes filled from any matched source, gaps flagged for review
No audit trail — wrong merges are invisible	Every match scored, every merge reversible with source provenance

Separate identity from attributes

The single most common mistake is treating reconciliation as a formatting problem. Normalizing units and fixing encodings are necessary steps, but they do not tell you which records are the same product. Split the work in two:

Identity resolution decides whether record A and record B are the same product. This is where matching lives.
Attribute reconciliation decides, once you know two records match, which field values win — the supplier’s IP rating or yours, the longer description or the structured one.

Conflating these is why naive scripts fail. A VLOOKUP on part number assumes every supplier uses the same identifier. In practice, one source has a clean GTIN, another has a manufacturer part number with a different delimiter, and a third has only a description. You need a layered match strategy.

Signal	Match type	When it is reliable
GTIN / UPC / EAN	Deterministic	Present and valid on both records
Manufacturer part number (MPN) + brand	Deterministic after normalization	Both sides carry MPN and brand can be resolved
Description + key specs	Probabilistic (fuzzy)	Identifiers are missing or partial
Cross-reference table	Deterministic	You have a curated SKU-to-SKU map
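The "present and valid" condition for GTIN, UPC, and EAN values can be checked mechanically, because all of them end in a GS1 check digit. The sketch below is one way to do that in Python. The function names `is_valid_gtin` and `canonical_gtin` are illustrative, not part of any Claro API, and padding to 14 digits is an assumption about how you might store identifiers so that a 12-digit UPC and its 13-digit EAN form compare equal.

```python
from typing import Optional


def is_valid_gtin(code: str) -> bool:
    """Return True for an 8, 12, 13, or 14 digit code with a correct GS1 check digit."""
    digits = code.strip()
    if not (digits.isascii() and digits.isdigit()) or len(digits) not in (8, 12, 13, 14):
        return False
    body, check = digits[:-1], int(digits[-1])
    # Weights alternate 3, 1, 3, ... starting from the digit next to the check digit.
    total = sum(int(d) * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


def canonical_gtin(code: str) -> Optional[str]:
    """Left-pad a valid code to 14 digits (hypothetical storage format); None if invalid."""
    return code.strip().zfill(14) if is_valid_gtin(code) else None


# A 12-digit UPC and the same number written as a 13-digit EAN normalize identically.
print(canonical_gtin("036000291452") == canonical_gtin("0036000291452"))
```

Rejecting invalid codes before matching keeps a mistyped barcode from being treated as a deterministic signal.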

Run the high-confidence deterministic signals first and remove those rows from the pool. Only the leftovers — usually the messy long tail — go to fuzzy matching, which is slower and needs human review. Claro runs this layered pass automatically, so the expensive review queue stays as small as possible.
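As a rough illustration of that layered pass, here is a minimal Python sketch. It is not Claro's implementation: the record fields (`id`, `gtin`, `mpn`, `brand`, `description`), the `match_record` function, and the 0.6 floor are all assumptions chosen for the example, and the standard library's `difflib.SequenceMatcher` stands in for whatever fuzzy scorer you would actually use.

```python
import re
from difflib import SequenceMatcher


def normalize_mpn(mpn):
    """Uppercase and strip delimiters so 'm8x40 a2' and 'M8X40-A2' compare equal."""
    return re.sub(r"[^A-Z0-9]", "", mpn.upper())


def similarity(a, b):
    """Character-level similarity in [0, 1]; a stand-in for a real fuzzy scorer."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def match_record(incoming, golden, fuzzy_floor=0.6):
    """Return (golden_id, match_type, score), or None if nothing clears the floor."""
    # Layer 1: exact GTIN, the strongest signal.
    if incoming.get("gtin"):
        for g in golden:
            if g.get("gtin") == incoming["gtin"]:
                return g["id"], "gtin", 1.0

    # Layer 2: normalized MPN plus brand.
    if incoming.get("mpn") and incoming.get("brand"):
        key = (normalize_mpn(incoming["mpn"]), incoming["brand"].casefold())
        for g in golden:
            if g.get("mpn") and g.get("brand"):
                if (normalize_mpn(g["mpn"]), g["brand"].casefold()) == key:
                    return g["id"], "mpn_brand", 1.0

    # Layer 3: fuzzy description match, reached only by the leftovers.
    scored = [
        (g["id"], "fuzzy", similarity(incoming.get("description", ""), g.get("description", "")))
        for g in golden
    ]
    best = max(scored, key=lambda m: m[2], default=None)
    return best if best and best[2] >= fuzzy_floor else None


golden = [
    {"id": "G1", "gtin": "04006381333931", "mpn": "M8X40-A2", "brand": "Acme",
     "description": "Hex bolt M8x40 A2 stainless steel"},
    {"id": "G2", "mpn": "CH-100", "brand": "Sitwell",
     "description": "Office chair, mesh back, black"},
]
print(match_record({"mpn": "m8x40 a2", "brand": "ACME"}, golden))
```

Because each layer returns as soon as it finds a match, a record with a valid GTIN never reaches the slower fuzzy comparison, which is the point of ordering the signals by confidence.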

Build one canonical record per product, not per source

Reconciliation should produce a single source of truth: a canonical record (often called a golden record) that survives no matter which supplier last sent a file. For each matched cluster of records, decide field-by-field which value is authoritative.

Critically, keep every merge reversible. Store the original supplier values and the reason each survivor was chosen, so when a buyer disputes a spec you can trace it back to the file it came from. A reconciliation you cannot unwind is a reconciliation you cannot trust. Claro attaches source provenance to every field value in the canonical record, so the audit trail is built in rather than bolted on.
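One way to picture a reversible, field-by-field merge is a record where every surviving value carries its source, the reason it won, and the values it beat. The Python sketch below assumes a simple source-priority survivorship rule; the `FieldValue` class, the `build_canonical` function, and the supplier names are hypothetical and do not describe Claro's internal data model.

```python
from dataclasses import dataclass, field


@dataclass
class FieldValue:
    value: object
    source: str                                        # supplier file the value came from
    reason: str                                        # why this value survived
    alternatives: list = field(default_factory=list)   # (source, value) pairs that lost


def build_canonical(cluster, source_priority):
    """Merge matched records into one canonical record, keeping provenance.

    cluster: list of {"source": str, "fields": dict} for one product.
    source_priority: source names, most trusted first (an assumed survivorship rule).
    """
    rank = {name: i for i, name in enumerate(source_priority)}
    ordered = sorted(cluster, key=lambda rec: rank.get(rec["source"], len(rank)))
    canonical = {}
    for rec in ordered:
        for name, value in rec["fields"].items():
            if value in (None, ""):
                continue  # an empty value can never win
            if name not in canonical:
                canonical[name] = FieldValue(value, rec["source"], "highest-priority source with a value")
            else:
                # Keep the losing value so the merge can be audited or undone.
                canonical[name].alternatives.append((rec["source"], value))
    return canonical


cluster = [
    {"source": "supplier_a", "fields": {"weight_kg": 0.020, "ip_rating": ""}},
    {"source": "supplier_b", "fields": {"weight_kg": 0.021, "ip_rating": "IP67"}},
]
record = build_canonical(cluster, source_priority=["supplier_b", "supplier_a"])
print(record["weight_kg"])
```

When a buyer disputes the weight, the `alternatives` list shows which supplier disagreed and with what value, so the choice can be revisited without re-ingesting the files.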

Score every match instead of accepting yes/no

Fuzzy matching does not return “same” or “different” — it returns a similarity score. The discipline is choosing thresholds and acting on the bands consistently.

Auto-merge above a high threshold where false positives are rare (for example, exact GTIN plus matching brand).
Route a middle band to human review, where a description matches but a key spec differs.
Auto-reject below a low threshold so reviewers are not buried in noise.
Log the score and the matched fields for every decision so thresholds can be tuned against real outcomes.

The right thresholds depend on your blast radius. An industrial distributor where a wrong match means shipping the incorrect breaker should review aggressively; a furniture catalog where the cost of a duplicate is a redundant listing can auto-merge more freely. Tune by category, not globally.
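The banding rules above, with per-category thresholds, could be sketched in Python as follows. The threshold numbers, category names, and the `decide` function are placeholders for illustration only; real values would come from tuning against your own review outcomes.

```python
# Hypothetical thresholds: stricter where a wrong match is expensive.
THRESHOLDS = {
    "electrical": {"merge": 0.97, "reject": 0.60},
    "furniture": {"merge": 0.88, "reject": 0.50},
}
DEFAULT_THRESHOLDS = {"merge": 0.95, "reject": 0.55}


def decide(score, category, matched_fields):
    """Map a similarity score to an action and return a loggable decision record."""
    t = THRESHOLDS.get(category, DEFAULT_THRESHOLDS)
    if score >= t["merge"]:
        action = "auto_merge"
    elif score < t["reject"]:
        action = "auto_reject"
    else:
        action = "review"
    # Keeping the score and matched fields lets thresholds be tuned later.
    return {
        "action": action,
        "score": score,
        "category": category,
        "matched_fields": list(matched_fields),
    }


# The same score lands in different bands depending on the category.
print(decide(0.90, "electrical", ["description"])["action"])
print(decide(0.90, "furniture", ["description"])["action"])
```

Returning the whole decision record rather than just the action means every merge, rejection, and review can be written to a log and compared against what reviewers later decided.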

Fill attribute gaps after identity is resolved

Knowing two records are the same product is not the same as having a complete record. Once Claro identifies a cluster, it cross-populates attributes: if Supplier A has the weight and Supplier B has the IP rating, the canonical record gets both. Fields still missing after the cross-population step are flagged against your required-attribute schema so a buyer or enrichment workflow can fill them before a half-complete record silently reaches your PIM.
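The cross-population step can be illustrated with a short Python sketch. It assumes a first-non-empty-value-wins rule and a made-up required-attribute list; `REQUIRED_ATTRIBUTES` and `cross_populate` are hypothetical names, not Claro's schema or API.

```python
# Hypothetical required-attribute schema for one product category.
REQUIRED_ATTRIBUTES = ("weight_kg", "ip_rating", "material")


def cross_populate(cluster, required=REQUIRED_ATTRIBUTES):
    """Fill each attribute from the first matched record that has it.

    Returns (merged_record, missing), where missing lists the required
    attributes that no record in the cluster could supply.
    """
    merged = {}
    for rec in cluster:
        for name, value in rec.items():
            if value not in (None, "") and name not in merged:
                merged[name] = value
    missing = [name for name in required if name not in merged]
    return merged, missing


supplier_a = {"weight_kg": 0.020, "ip_rating": None}
supplier_b = {"weight_kg": None, "ip_rating": "IP67"}
merged, missing = cross_populate([supplier_a, supplier_b])
print(merged)   # weight from Supplier A, IP rating from Supplier B
print(missing)  # attributes to route to a buyer or enrichment workflow
```

The `missing` list is what would feed the review or enrichment queue, so incomplete records are held back explicitly instead of being written through.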

Make it a pipeline, not a project

Fifty catalogs reconciled once will be fifty-one next quarter, and every existing supplier will resend updated files. Treat reconciliation as a standing process: ingest, normalize, match against the canonical set, merge or queue for review, and write back the resolved record. Each new file matches against your golden records rather than restarting from zero. Claro writes clean records back into your existing PIM or ERP on each cycle, so the system of record stays current without a manual export step. This is the difference between a cleanup you repeat forever and a canonical product-data layer that holds its shape as suppliers come and go.

FAQ

How do I match products when suppliers use different identifiers?

Layer your signals. Resolve exact identifiers first (GTIN, UPC, EAN), then normalized manufacturer part number plus brand, then fall back to probabilistic matching on description and key specs for the remainder. A curated SKU-to-SKU cross-reference handles known equivalences that no automated signal can infer. Claro runs this layered match automatically and flags every low-confidence result for human review rather than silently merging it.

What is a canonical or golden product record?

It is the single authoritative record for a product, assembled from the best field values across all matched supplier records using survivorship rules you define. New supplier files match against it rather than creating new rows, which is what keeps a reconciled inventory from re-fragmenting the moment the next price file lands.

Should I auto-merge matched records or review them?

Both, by confidence band. Auto-merge where the score is high and the cost of an error is low, route a middle band to human review, and auto-reject clear non-matches. Set thresholds per product category, since the cost of a wrong match varies widely between electrical components and home goods. Claro surfaces the right band automatically so reviewers only see the cases that actually need a decision.

How do I keep reconciliation from breaking when suppliers send new files?

Run it as a standing pipeline rather than a one-time cleanup. Each incoming file is normalized and matched against your existing canonical records, so updates merge into the right product instead of spawning duplicates. Claro writes the resolved record back into your PIM or ERP so the clean state persists without a manual export step.

How long does it take to reconcile dozens of supplier catalogs?

Most of the volume resolves automatically through deterministic matching; the time cost lives in the fuzzy long tail and the review queue. The biggest lever is reducing how much lands in review, which comes from clean normalization and well-tuned thresholds rather than raw catalog count. Teams using Claro typically clear their initial backlog in days, not weeks.

What happens to missing attributes after catalogs are reconciled?

Reconciliation tells you which records are the same product. Enrichment fills the gaps in the winning record. Once Claro resolves identity, it checks each canonical record against your required attribute schema, sources missing values from any matched supplier record that has them, and flags what still needs attention — so your PIM receives a complete record, not just a deduplicated one.