
Product Deduplication
One product. One record. All the history preserved.
Create a single source of truth across suppliers, systems, and channels.
Product deduplication is the process of identifying and merging duplicate product records — the same product entered into the catalog multiple times under different IDs, names, or supplier references — into a single canonical record. Claro performs deduplication as part of its canonical entity layer: every merge carries a confidence score, links back to all source records, and is reversible. Source records are preserved as evidence; the canonical entity becomes the trusted reference used by downstream systems for pricing, search, analytics, and AI workflows.
Duplicates accumulate quietly and corrupt everything downstream.
How it works
From duplicate records to canonical entities — reversibly.
Detect, score, merge, preserve. Every merge is reviewable and reversible. Source records are kept as evidence.
1. Detect duplicate candidates
Claro scans incoming and existing records using attribute similarity, semantic comparison, and document-grounded evidence.
2. Score every candidate
Each duplicate candidate gets a confidence score. High-confidence duplicates merge automatically per your threshold. Lower-confidence cases route to review with side-by-side evidence.
3. Merge with attribute-level provenance
The canonical record inherits the best value for each attribute, with provenance preserved per field. Source records are kept as evidence and are never lost.
4. Reverse if needed
If a merge is wrong, it's reversible. Audit history makes rollback safe. Your downstream systems keep using the IDs they already use.
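To make the merge and reverse steps concrete, here is a minimal Python sketch of attribute-level provenance. It is an illustration under stated assumptions, not Claro's implementation: the names `SourceRecord`, `CanonicalRecord`, `merge`, and `unmerge` are hypothetical, and the "best value" rule (prefer the longest non-empty value) is a placeholder for whatever selection logic a real system applies.

```python
from dataclasses import dataclass, field


@dataclass
class SourceRecord:
    source_id: str      # ID in the originating system, never rewritten
    attributes: dict    # raw attribute values as supplied


@dataclass
class CanonicalRecord:
    canonical_id: str
    values: dict = field(default_factory=dict)      # attribute -> chosen value
    provenance: dict = field(default_factory=dict)  # attribute -> winning source_id
    sources: list = field(default_factory=list)     # preserved source records


def merge(canonical_id, records):
    """Build a canonical record from source records.

    Assumption: "best value" means the longest non-empty value.
    This is a placeholder rule for illustration only.
    """
    canonical = CanonicalRecord(canonical_id, sources=list(records))
    for rec in records:
        for attr, value in rec.attributes.items():
            if not value:
                continue  # skip empty values; they never win
            current = canonical.values.get(attr)
            if current is None or len(str(value)) > len(str(current)):
                canonical.values[attr] = value
                canonical.provenance[attr] = rec.source_id
    return canonical


def unmerge(canonical):
    """Reverse a merge by returning the untouched source records."""
    return list(canonical.sources)
```

Because the source records are stored unchanged alongside the canonical values, reversing a merge in this sketch needs no reconstruction: the original records are simply handed back.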


“Without Claro, it would have taken 5x longer to develop the new pricing system.”
James B.
CPO, Eventim
Hours of Work, Done in Minutes.
Production-grade deduplication with reversibility, provenance, and review built in.
Attribute-Level Provenance
The canonical record inherits the best value per attribute. Source records are preserved as evidence.
Reversible Merges
Every merge is reviewable and reversible. Audit history makes rollback safe.
Semantic + Document-Grounded Detection
Goes beyond fuzzy text matching. Uses graph relationships, embeddings, and document evidence.
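As a rough illustration of combining fuzzy text matching with embedding similarity, the Python sketch below blends the two signals into one candidate score. Everything here is an assumption made for the example: the function names, the weights, and the toy vectors are hypothetical, graph and document evidence are left out, and a real system would obtain embeddings from a model rather than by hand.

```python
import math
from difflib import SequenceMatcher


def text_similarity(a, b):
    """Fuzzy string similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def candidate_score(name_a, name_b, emb_a, emb_b, w_text=0.4, w_sem=0.6):
    """Weighted blend of text and semantic similarity (weights are illustrative)."""
    return w_text * text_similarity(name_a, name_b) + w_sem * cosine(emb_a, emb_b)
```

The point of the blend is that two records with different wording but close embeddings can still surface as duplicate candidates, which pure string matching would miss.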
Confidence + Review Thresholds
Configure thresholds per category. High-confidence auto-merges; ambiguous cases route to review with side-by-side evidence.
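Per-category thresholds could be expressed along the following lines. This is a hedged sketch, not Claro's configuration format: the category names, the threshold values, the `route` function, and the lower `REVIEW_FLOOR` cutoff (below which a pair is treated as not a duplicate) are all assumptions introduced for illustration.

```python
# Hypothetical per-category auto-merge thresholds.
AUTO_MERGE_THRESHOLDS = {"electronics": 0.95, "apparel": 0.90}
DEFAULT_THRESHOLD = 0.97  # used for categories without an explicit entry
REVIEW_FLOOR = 0.60       # assumed cutoff below which no review is needed


def route(category, confidence):
    """Decide what happens to a duplicate candidate with the given confidence."""
    threshold = AUTO_MERGE_THRESHOLDS.get(category, DEFAULT_THRESHOLD)
    if confidence >= threshold:
        return "auto_merge"
    if confidence >= REVIEW_FLOOR:
        return "review"
    return "keep_separate"
```

With these example values, a 0.96 score auto-merges in electronics but routes to review in a category that falls back to the stricter default.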
Downstream-Safe
Claro maintains the mapping between canonical IDs and your existing IDs. Downstream systems keep using the IDs they already use.
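One way to picture the ID mapping is a lookup table from existing IDs to canonical IDs, sketched below in Python. The `IdMap` class and its method names are hypothetical and chosen only for this example; the source does not describe how Claro stores or exposes the mapping.

```python
class IdMap:
    """Maps existing (source-system) IDs to canonical IDs."""

    def __init__(self):
        self._to_canonical = {}

    def link(self, existing_id, canonical_id):
        """Record that an existing ID now points at a canonical entity."""
        self._to_canonical[existing_id] = canonical_id

    def resolve(self, existing_id):
        """Return the canonical ID for an existing ID, or None if unmapped."""
        return self._to_canonical.get(existing_id)

    def unlink(self, existing_id):
        """Remove a mapping, for example when a merge is reversed."""
        self._to_canonical.pop(existing_id, None)
```

Downstream systems in this sketch keep sending the IDs they already know; only the lookup changes when records are merged or a merge is rolled back.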
FAQ
Frequently asked questions
Are merges reversible?
How does Claro decide which attribute wins when merging?
Can we tune the confidence threshold for auto-merge?
Will deduplication break our existing SKU references in downstream systems?
How long does a deduplication pilot take?




