What Is a Data Pool? GS1, GDSN, and Synchronized Product Data

A data pool is a GS1-certified repository that syncs product records between trading partners over GDSN — but most catalogs hold far more than pools cover.

Supplier onboarding teams know the scenario: a new trading partner sends their item file, and half the GTINs are missing, descriptions are inconsistent across feeds, and the data that does arrive through GS1 covers only a fraction of the catalog. A GS1 data pool solves the exchange-and-synchronization problem for the slice of the catalog that flows through it — but most real catalogs are a patchwork of pooled, spreadsheet, and marketplace data, each arriving with different quality and structure. Claro treats pool-sourced attributes as high-trust signals and extends the same identity resolution, enrichment, and write-back to every source your PIM or ERP touches.

Definition

In GS1 terms, a data pool is a GS1-certified service that acts as the on-ramp and off-ramp for synchronized product data. Suppliers publish their item attributes (identifiers, descriptions, dimensions, packaging hierarchies, nutrition or hazard data) into a data pool. Retailers, distributors, and marketplaces subscribe through their own data pool, and the network keeps both sides in sync whenever the source record changes. Each item is keyed on its Global Trade Item Number (GTIN) and the supplier’s Global Location Number (GLN), so a record published once can be consumed by many recipients without re-keying.
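As a rough illustration of what "keyed on GTIN and GLN" means in practice, the sketch below validates both identifiers with the standard GS1 mod-10 check digit and combines them into a composite key. The `ItemKey` class and function name are hypothetical, and the sample numbers are illustrative only; this is not a data pool API.

```python
from dataclasses import dataclass


def gs1_check_digit_ok(code: str) -> bool:
    """Return True if a GS1 key (GTIN-8/12/13/14 or a 13-digit GLN) has a valid check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in code]
    body, check = digits[:-1], digits[-1]
    # Moving right to left over the body, weights alternate 3, 1, 3, 1, ...
    total = sum(d * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10 == check


@dataclass(frozen=True)
class ItemKey:
    """Hypothetical composite key: the item's GTIN plus the supplier's GLN."""
    gtin: str
    gln: str

    def __post_init__(self) -> None:
        if not gs1_check_digit_ok(self.gtin):
            raise ValueError(f"invalid GTIN: {self.gtin}")
        if len(self.gln) != 13 or not gs1_check_digit_ok(self.gln):
            raise ValueError(f"invalid GLN: {self.gln}")


# Illustrative identifiers only.
key = ItemKey(gtin="4006381333931", gln="0614141000012")
```

Because the key is a frozen dataclass, it is hashable and can serve directly as a dictionary key when deduplicating incoming records.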

There is no single global data pool. The Global Data Synchronisation Network (GDSN) is a federation of dozens of certified pools that all use the same data model and message format, so a supplier connected to one pool can reach a retailer connected to another. The pool is the infrastructure; the synchronization network is the routing layer that connects pools to each other. This distinction matters during onboarding, because choosing a pool is a one-time integration decision, while the data flowing through it is continuous.

What a data pool does — and does not do

A data pool guarantees a consistent, machine-readable version of an item across every partner who subscribes to it. When a CPG supplier corrects a case-pack quantity or an allergen flag, every subscribed retailer receives the same change automatically. That eliminates the manual re-keying and version-mismatch problems that plague spreadsheet-based onboarding.
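The propagation behavior described above is essentially publish/subscribe. The toy model below is an in-memory stand-in, assuming a hypothetical `PoolNetwork` class and made-up retailer names; it does not represent real GDSN messages or any pool's client library.

```python
from collections import defaultdict
from typing import Dict, List, Optional


class PoolNetwork:
    """Toy in-memory model of publish/subscribe synchronization (not a real GDSN client)."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(set)   # gtin -> set of recipient names
        self._inboxes = defaultdict(dict)      # recipient -> {gtin: latest record}

    def subscribe(self, recipient: str, gtin: str) -> None:
        self._subscribers[gtin].add(recipient)

    def publish(self, gtin: str, record: Dict) -> List[str]:
        """Deliver a copy of the record to every subscriber and return who received it."""
        delivered = sorted(self._subscribers[gtin])
        for recipient in delivered:
            self._inboxes[recipient][gtin] = dict(record)
        return delivered

    def latest(self, recipient: str, gtin: str) -> Optional[Dict]:
        """Return the most recent record a recipient holds for this GTIN, if any."""
        return self._inboxes[recipient].get(gtin)


network = PoolNetwork()
network.subscribe("retailer_a", "4006381333931")
network.subscribe("retailer_b", "4006381333931")

network.publish("4006381333931", {"case_pack_qty": 12})
# The supplier corrects the case-pack quantity; both subscribers see the same change.
network.publish("4006381333931", {"case_pack_qty": 24})
```

Each subscriber receives its own copy of the record, so a later local edit by one retailer cannot leak into another retailer's view.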

But a data pool only standardizes the records that flow through it. Most real catalogs are a mix: GDSN-synced grocery and consumer items alongside MRO parts, furniture SKUs, and industrial components that were never published to any pool. A home-improvement retailer might receive paint and fasteners through GDSN while loading patio furniture and power tools from supplier spreadsheets. The pool gives you clean delivery for the synced slice; everything else still needs identity resolution, attribute enrichment, and classification before it can be matched against your master catalog or surfaced in AI-driven product search.

Incoming channel	Typical categories	Data quality on arrival
GDSN data pool	Grocery, CPG, health and beauty	Standardized, GTIN-keyed, auto-synced
Supplier spreadsheets	MRO, furniture, industrial	Inconsistent field names, free-text descriptions
Marketplace feeds	Long-tail third-party items	Variable completeness, often missing GTINs
Direct EDI or API	Electronics, apparel	Structured but schema varies by partner

Before and after: pooled data in a trusted catalog

Even records that arrive from a certified data pool can carry empty attributes, outdated specs, or taxonomy codes that do not match your internal classification. The table below shows the practical difference between raw pooled data landing in a PIM and the same records after a canonical product-data layer validates and enriches them.

Before — raw pool record	After — Claro-resolved record
GTIN present, brand name in all-caps	GTIN retained; brand normalized to title case
Case quantity populated, inner-pack empty	Inner-pack filled from secondary source, confidence flagged
Product category is supplier taxonomy code	Mapped to your internal taxonomy with audit trail
Allergen flags present, ingredient list missing	Ingredient list sourced from published spec sheet
Record last updated 18 months ago	Staleness alert surfaced; re-sync triggered automatically

Claro ingests both pool-sourced and non-pooled records, resolves them to a single canonical product record, validates against your attribute schema, and writes clean, complete data back into your existing PIM or ERP — so your full catalog is consistent and AI-citable, not just the synchronized portion.
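Two of the rows in the table above, brand normalization and staleness detection, are simple enough to sketch. The function names and the 12-month threshold are assumptions chosen for illustration, not a description of how Claro implements these checks.

```python
from datetime import date


def normalize_brand(brand: str) -> str:
    """Title-case a brand that arrived in all caps; leave mixed-case brands untouched."""
    cleaned = " ".join(brand.split())  # collapse stray whitespace
    return cleaned.title() if cleaned.isupper() else cleaned


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from earlier to later, ignoring the day of month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def is_stale(last_updated: date, today: date, max_age_months: int = 12) -> bool:
    """Flag a record whose last update is older than the assumed threshold."""
    return months_between(last_updated, today) > max_age_months


brand = normalize_brand("ACME  FOODS")
stale = is_stale(last_updated=date(2023, 1, 15), today=date(2024, 7, 15))
```

Plain title-casing is a blunt tool: it mishandles apostrophes and intentionally stylized brand names, so a production rule would likely need an exception list.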

How pool data fits into a broader data pipeline

A data pool handles publication and delivery. The steps before and after it determine whether that data actually improves your catalog:

Supplier publishes to their certified pool

The supplier loads item attributes into their pool. The GDSN routes the record to every subscribing retailer’s pool automatically.

Retailer's pool delivers the record

Your pool receives the publication and surfaces it as a candidate item for your catalog. At this point the record meets GDSN structural standards but may still be incomplete or misclassified against your internal model.

Identity resolution and deduplication

The incoming GTIN is checked against your existing canonical product records. Deterministic matching on GTIN handles clean cases; probabilistic matching on name, brand, and dimensions handles the rest. Duplicate candidates are flagged before they reach the PIM.
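A minimal sketch of this two-pass approach follows, assuming records are plain dictionaries. The field names, the 0.6/0.4 weights, and the 0.85 threshold are illustrative assumptions. Dimensions are left out for brevity, and `difflib.SequenceMatcher` from the standard library stands in for whatever similarity model a production system would use.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def resolve(incoming: Dict, catalog: List[Dict],
            threshold: float = 0.85) -> Tuple[Optional[Dict], str]:
    """Return (matched record or None, method), where method is 'gtin', 'fuzzy', or 'new'."""
    # Pass 1: deterministic match on GTIN.
    gtin = incoming.get("gtin")
    if gtin:
        for record in catalog:
            if record.get("gtin") == gtin:
                return record, "gtin"

    # Pass 2: probabilistic match on name and brand (weights are assumptions).
    best, best_score = None, 0.0
    for record in catalog:
        score = (0.6 * similarity(incoming.get("name", ""), record.get("name", ""))
                 + 0.4 * similarity(incoming.get("brand", ""), record.get("brand", "")))
        if score > best_score:
            best, best_score = record, score

    if best is not None and best_score >= threshold:
        return best, "fuzzy"  # duplicate candidate: flag for review before the PIM
    return None, "new"


catalog = [{"id": "P1", "gtin": "4006381333931",
            "name": "Exterior Wood Screw 4x40", "brand": "Acme"}]

by_gtin = resolve({"gtin": "4006381333931", "name": "Screw"}, catalog)
by_name = resolve({"name": " exterior wood screw 4X40", "brand": "ACME"}, catalog)
unmatched = resolve({"name": "Patio Chair", "brand": "Zeta"}, catalog)
```

Returning the match method alongside the record lets downstream steps treat a GTIN match as trusted while routing fuzzy matches to review.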

Attribute validation and enrichment

Required fields missing from the pool record are flagged. Where secondary sources exist — spec sheets, open datasets, other supplier feeds — Claro fills the gap with a source citation so you know exactly where each attribute came from.
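The flag-then-fill behavior can be sketched as follows. The required-field list, the source names, and the priority-ordered list of secondary sources are all hypothetical; the point is only that every filled attribute carries a citation back to where it came from.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical attribute schema for illustration.
REQUIRED: Tuple[str, ...] = ("gtin", "brand", "case_qty", "inner_pack_qty")


def missing_fields(record: Dict, required: Sequence[str] = REQUIRED) -> List[str]:
    """List required fields that are absent or empty in the record."""
    return [field for field in required if record.get(field) in (None, "")]


def enrich(record: Dict, secondary_sources: List[Tuple[str, Dict]],
           required: Sequence[str] = REQUIRED) -> Tuple[Dict, Dict[str, str]]:
    """Fill gaps from secondary sources in priority order.

    Returns (enriched record, provenance), where provenance maps each
    populated required field to the source that supplied it.
    """
    gaps = missing_fields(record, required)
    enriched = dict(record)
    provenance = {field: "pool" for field in required if field not in gaps}
    for field in gaps:
        for source_name, data in secondary_sources:
            value = data.get(field)
            if value not in (None, ""):
                enriched[field] = value
                provenance[field] = source_name  # the source citation
                break
    return enriched, provenance


pool_record = {"gtin": "4006381333931", "brand": "Acme",
               "case_qty": 12, "inner_pack_qty": None}
sources = [("spec_sheet", {"inner_pack_qty": 6}),
           ("other_supplier_feed", {"inner_pack_qty": 4})]

enriched, provenance = enrich(pool_record, sources)
```

Fields that no secondary source can supply stay empty and remain visible through `missing_fields`, so gaps are reported instead of silently guessed.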

Write-back to PIM or ERP

The validated, enriched record is written back to your system of record with full provenance. Future pool updates trigger the same pipeline automatically, so no manual re-keying is needed when a supplier changes a pack size or an allergen flag.

Glossary

What Is GDSN?

The synchronization network that links certified data pools and routes published records to subscribers.

What Is Product Content Syndication?

How rich product content moves from suppliers to retail and marketplace channels beyond core GDSN attributes.

What Is Schema Mapping?

Aligning incoming pool and feed fields to your internal product model during onboarding.

Canonical Product Record

The single golden record that unifies synced and unsynced sources into one trusted item.

Comparison

GDSN vs Direct Feed

When a certified data pool makes sense versus a direct supplier integration.

Tool

GTIN Validator

Check the identifiers that key every record published into a GS1 data pool.

FAQ

What is a data pool in GS1 and GDSN?

In GS1 terms, a data pool is a certified electronic repository where suppliers publish standardized product data and recipients subscribe to receive it. The GDSN connects these certified pools into one network, so a record published to any pool can reach subscribers connected to any other pool. Each item is identified by its GTIN and the supplier’s GLN.

Is there only one global data pool?

No. The GDSN is a federation of dozens of GS1-certified data pools. They all use the same data model and message standards, so partners do not need to use the same pool to exchange data. You integrate with one pool and can reach trading partners connected to any other certified pool in the network.

What is the difference between a data pool and a PIM?

A PIM (Product Information Management system) is where you author, govern, and store your product content internally. A data pool is the external exchange layer that publishes or receives that content over GDSN. Many companies connect their PIM to a data pool so governed records flow out to trading partners automatically. The two are complementary, not interchangeable.

Do non-grocery products go through data pools?

Some do, but adoption is heaviest in grocery and CPG. Many MRO, furniture, and industrial catalogs never publish to a pool and arrive as spreadsheets or marketplace feeds instead. Most retailers therefore need a layer that can match, resolve, and enrich both pooled and non-pooled data so the full catalog is consistent.

How do I choose a data pool?

Choosing a pool is a one-time onboarding decision driven by where your trading partners already are, the categories and regions you support, and how the pool integrates with your PIM or ERP. Because all certified pools interoperate over GDSN, the choice affects integration effort more than reach.

What happens to data quality after records leave a data pool?

A data pool standardizes structure and delivery, but it does not validate completeness against your own attribute requirements, catch taxonomy drift, or enrich missing attributes. Records that arrive clean at the pool level can still contain empty fields, misclassified categories, or outdated specs. A canonical product-data layer validates incoming pool records against your own master, flags attribute gaps, and writes corrected data back to your PIM or ERP so downstream systems stay accurate.
