登录查看更多内容

A data model for product information management

John Quillinan

Experienced Hands-On Results-Oriented Strategy & Analytics Thought Leader

发布日期: 2025年3月1日

Introduction

It is important to have a robust data model in place for effective product information management (PIM). A well-structured data model can streamline product data across various channels, improve data quality, and enhance overall efficiency.

In this article, I highlight key components of a Product Information Management (PIM) data model, including taxonomy, attributes, and relationships, and provide insights into best practices for designing and implementing a successful PIM system. I review data models from a few PIM solutions, and present a custom data model that could be implemented in a build. The final data model will also help you identify the product data elements that you need to master in a solution that you can buy and configure.

Build vs Buy

A grocery retailer recently shared that it is considering build their own in-house MDM solution. Personally, I advocate buy a COGS solution given how many solutions are out there and how mature the MDM space is. It is more expensive and takes more time to build an in-house solution. In addition to taking years to rollout an in-house solution, the core team of developer and architects have often left by the time the solution goes live. Without the institutional knowledge, it makes your inhouse solution that much more expensive. For a Product MDM or Product Information Management (PIM) solution, there are a number of COGS solutions that will give you 90% of the features that you need. For the other 10% of features that you think that you need, you should carefully evaluate their usage (i.e., how often are they used), and what is the return on investment (ROI) to customize your solution to provide those features. If there is no quantifiable ROI, then you probably don't need the other 10% of the features.

PIM Defined

Gartner defines the product information management (PIM) market as the packaged solutions that enable product, commerce and marketing teams to create and maintain an approved shareable version of rich product content.

PIM makes a single, trusted source of product information available for multichannel commerce and data exchange. PIM solutions now support complex use cases, including product data syndication (PDS), product experience management (PXM), product information effectiveness analytics, digital shelf analytics and product data contextualization. They lay the foundation for delivering personalization, product discovery and digital experience platforms (DXPs). PIM is available as hosted cloud-native, SaaS, private cloud and on-premises solutions.

Why Should I Implement a PIM?

You should implement a Product Information Management (PIM) system if you want to centralize and streamline your product data across multiple sales channels, ensuring consistent and accurate information for customers, leading to improved customer experiences, faster time to market, and better overall efficiency in managing product details, especially when dealing with a large number of SKUs or selling across various online platforms.?

Key benefits of implementing a PIM:

Improved data quality:

A single source of truth for all product information, eliminating inconsistencies and errors across different channels.?

Faster time to market:

Efficiently manage product launches by quickly updating and distributing product data across platforms.?

Enhanced customer experience:

Accurate and comprehensive product information on all sales channels leads to better customer decision-making.?

Omnichannel selling:

Seamlessly distribute product information to various online marketplaces, websites, and catalogs.?

Increased operational efficiency:

Reduce time spent managing product data by centralizing updates in one system.?

Better collaboration:

Facilitate collaboration between different teams involved in product information management (marketing, sales, product development).?

Data Models Matter

The underlying data model often comes up in the first phase of an organization undertaking a PIM solution. While most vendors do not expose their data models, I will review data models from a few vendors here. I will conclude with a custom product data model, that is suitable for a retail grocer. A well laid out data model is critical to the success of the Product Information Model application for a business.

Informatica

The Informatica data model revolves around the concept of a Entity Item. The term "entity item" is used to identify a single business object which can be represented by different models. An entity item is always a concrete real-world object which maps to a physical record in the database. (E. g. a product or structure group object.)

The following data model was constructed from the Informatica Product 360 SaaS data dictionary.

Figure 2. Informatica Product 360 Item and Category business entities.

Stibo STEP

The Stibo data model is highly extendable, but the fully data model is not exposed. Stibo employs an object model.

STEP employs Super Object Data Types and hierarchical structuring.

Object Types and Structures are defined in STEP workbench.

The Stibo STEP data modeling process consists of the following steps.

Define Dimensions, Dimension Points and Contexts
Define Product Object Types & Structures
Create (missing) Units
Create LOVs
Create Attribute Groups

6. Create Attributes and make them valid for the required Object Types

7. Instantiate the hierarchy category levels

8. Link in Specification Attributes

SAP Material Master

The material master contains information on all the materials that a company procures or produces, stores, and sells. It is the company's central source for retrieving material-specific data. This information is stored in individual material master records.

The data in the material master is organized into different views for easy maintenance.

The Basic Data view stores basic information like Base Unit of Measure, Material Description, etc that is common for the entire organization. The Classification view stores the material attributes, and is common across all organizational units. The Sales organizational view stores the sales relevant information, and is at plant & sales area level. The Purchasing view stores the purchase relevant information, and is specific to the plant. The MRP views store Planning relevant data, and is specific to the plant; a few storage location specific values are also stored here. Forecast relevant information like forecast model, etc are stored at the Forecasting view, and is at plant level. All manufacturing related information like Production scheduler, etc. are stored in the Work Scheduling view, and is at plant level. The quality testing related information is stored in the Quality view, and is at plant level. The Storage Location information like Physical Inventory Indicator is stored in the Storage Location views, and is specific to the plant & storage location. The Warehouse related information like picking storage type is stored in the warehouse view, and is specific to the plant & warehouse is stored in these views. Accounting related information like valuation class, etc are stored in the Accounting view, and is at Valuation Level (Usually Plant). Costing related information for a material is stored in the Costing view, and is plant specific.

Materials with the same attributes are grouped together and assigned to a material type. Material type controls important functions for materials.

Figure 11. Materials with the same attributes are grouped together and assigned a material type.

Figure 12. Material type codes and descriptions..

Organizational views enable downstream system Material Master Data for all processes.

Figure 13. Key Fields by Organizational View.

Some material master data is valid everywhere, but some only on respective organizational level. For instance, a material may be shipped differently in different plants. The different data on the respective organizational levels are stored in different database tables, approximately following the principle of?relational database management. The data structures are:

Proposed data model

Having reviewing data models from Informatica, Stibo, and SAP, I conclude with the following normalized data model for a Product Information Management. Kindly note that the model includes some customizations such as allergen and nutrients that would be found in a PIM for a retail grocer.

The PRODUCT_MASTER table is the core or central data structure. The PRODUCT_MASTER contains basic information such as the product id, UPC/EAN, vendor product number, manufacturer global location number (GLN), manufacturer product number.
PRODUCT_TYPE table: A product is classified using a product type and the assigned product purpose. The primary key for this table is the PRODUCT_TYPE field, which is defined as a foreign key in the PRODUCT_MASTER table.
PRODUCT_BRAND table: This table is a lookup table for Brand codes. The primary key is the BRAND_CODE field, which is defined as a foreign key in PRODUCT_MASTER table.
PRODUCT_STATUS table: Status codes are defined in this lookup table. The primary key is STATUS_CODE, which is defined as a foreign key in PRODUCT_MASTER table.
PRODUCT_DESCRIPTION table: This table contains description in different languages for the product. This table also supports different descriptions based on the distribution channels. The primary key is PRODUCT_ID, CHANNEL_ID (distribution channel identifier), and LANGUAGE.
DISTRIBUTION_CHANNEL table: Examples of Distribution Channels are in-store, ecommerce (i.e., company-branded web site), and third-party distribution channel (e.g. UberEats, DoorDash). Primary key is CHANNEL_ID.
PRODUCT_MEASURES table: The purpose of this table is to capture dimensional data such as weight and volume, as well as length, width, height and depth. The primary key consists of the PRODUCT_ID, UOM_ID and MEASURE_TYPE fields.
UOM (Unit of Measure) table: UOM includes a code and description for a dimensional unit of measure. This table includes a UOM Group. The primary key is the UOM_ID field.
UOM_GROUP table: UOM_GROUP is a grouping or category of dimensional units of measure. The primary key is the UOM_GROUP_ID field, which is defined as a foreign key in the UOM table.
MEASURE_TYPE table: According to the federal label regulations on the net quantity of contents, the principal display panel of a food in package form shall bear a declaration of the net quantity of contents. This shall be expressed in the terms of weight, measure, numerical count, or a combination of numerical count and weight or measure. If the food is solid, semisolid, or viscous, it should be expressed in terms of weight. If the food is a liquid, it should be expressed in a fluid measure (e.g., fl oz). For non-food products, such as lotions or dish detergents, fluid ounces should be listed on the label. Physical dimensions is another important measure type, but generally you won't see this on principal display panel. This table is a lookup table for the measure type (e.g. net weight, fluid ounces, physical dimensions). The primary key is the MEASURE_TYPE field, which is defined as part of composite primary key in the PRODUCT_MEASURES table.

PRODUCT_PACKAGING table: Products may be available in more than one type of Packaging. This table lists the possible packagings for a product, and is cross-referenced to the packaging hierarchy. The primary key consists of the PACKAGING_LEVEL_ID and PRODUCT_ID fields.
PACKAGING_HIERARCHY table: This table defines the packaging level and their parents. Packaging level definitions can be found here. The primary key is the PACKAGING_LEVEL_ID field.
PRODUCT_HIERARCHY_ASSIGNMENT table: Products are typically assigned to the lowest level in the product hierarchy. The product hierarchy may have as many of 6 levels as demonstrated in the hierarchy of a discount retail chain. The product hierarchy will also vary based on the distribution channel or hierarchy type. In the case of the discount retail chain, there is a store hierarchy, which follows a traditional merchandising hierarchy, an ecommerce hierarchy and also an accounting hierarchy. The product assignment to product hierarchy is captured in this table. The PRODUCT HIERARCHY typically is organized in a fashion similar to those shown in Figure 15. The primary key is the PRODUCT_ID field, with the HIERARCHY_ID field as a foreign key to the PRODUCT_HIERARHCY table.
PRODUCT_HIERARCHY table: This table captures the product hierarchy level along with a description, and its parent, if one exists. The primary key is the HIERARCHY_ID field.
HIERARCHY_TYPE table: This table is a lookup for product hierarchy type. The primary key is the HIERARCHY_TYPE field. See figure 16 for sample product hierarchies and how they might vary between retailers, and even within a retailer's distribution channels. In this example, Scintilla?Customer Perception (an offering by Walmart Data Ventures) illustrates how an end user might navigate the Walmart's omni-channel Product Hierarchy to view customer perception analytics.

ENTITY_TYPE table: Entity type is similar to material type in the SAP Material Master, and defines which products are finished products (sellable items), versus raw materials such as ingredients for products produced in a store. The primary key is the ENTITY_TYPE field.
PRODUCT_RELATIONSHIP table: A product could be both a raw material as well as a finished product.We relate products / entity type to each other via relationship (RELATIONSHIP_ID). The primary key is the ENTITY_TYPE_ID, PRODUCT_ID, and RELATIONSHIP_ID fields.
RELATIONSHIP table: This table is a lookup table for relationships. The primary key is the RELATIONSHIP_ID field, which a part of a composite key in the PRODUCT_RELATIONSHIP table.
PRODUCT_ATTRIBUTE table: Products are related to attributes and their values can be found in this table. The primary key is the PRODUCT_ATTRIBUTE_ID field, which is a surrogate key. The ATTRIBUTE_ID field is a foreign key to the attribute in the ATTRIBUTE table. The LOV_ID field is a foreign key to the attribute value in the LOV table.
ATTRIBUTE table: This table serves as a lookup table for all attributes. The table assigns related attributes to Attribute Groups. The primary key is the ATTRIBUTE_ID field.

LOV table: This tables serves as a lookup table for valid attribute values. The primary key is the LOV_ID which is surrogate key. The ATTRIBUTE_ID field is a foreign key to the attribute in the ATTRIBUTE table. The PRODUCT_ID field is a foreign key to the product in the PRODUCT_MASTER.
ATTRIBUTE_GROUP table: Attribute groups are assigned codes and descriptions in this table. The primary key is the ATTRIBUTE_GROUP_ID field, which is a foreign key in the ATTRIBUTE table. An attribute is assigned to one and only attribute group.
PRODUCT_ATTRIBUTE_GROUP table: In our Stibo example, in figure 7, attributes are organized into groups. For Meat and Seafood Characteristics group, we have the product attributes: Antibiotic Free, Boned vs Deboned, Fresh Caught Or Farmed, Latin Name For Species, Recommended Sauce, Skinned vs Skinless, Type Of Casing and Yield. We control which attributes can be associated with a product by assigning the product to an attribute group in this table. If properly implemented in the presentation layer, this avoids the merchant or steward having to populate all of the attributes, rather only populating the applicable attributes. We can also use product attribute group to control who can view and edit attributes. The primary key is the PRODUCT_ATTRIBUTE_GROUP_ID field. The ATTRIBUTE_GROUP_ID field is a foreign key to the attribute group in the ATTRIBUTE_GROUP table. The PRODUCT_ID field is a foreign key to the product in the PRODUCT_MASTER.
ALLERGEN table: The Food Allergen Labeling and Consumer Protection Act (FALCPA) requires manufacturers of food products in the United States to clearly label their products with information about the presence of major food allergens. The nine major food allergens that must be labeled are: Milk, Eggs, Fish, Crustacean shellfish (e.g., crab, lobster), Tree nuts (e.g., walnuts, almonds), Peanuts, Wheat, Soy, and Sesame (added in 2021). This table serves as a lookup table for allergens. The primary key is the CODE field.

Figure 18. Composition with common food allergens including egg, milk, soya, nuts, fish, seafood, wheat flour, mustard, dried apricots and celery (courtesy of Shutterstock).

PRODUCT_ALLERGEN table: Associated allergens are flagged (Yes/No) for product in this table. The source of the information is the 1) ingredients label (where in parentheses following the name of the ingredient, the name of the food source of a major food allergen must appear); examples: “lecithin (soy),” “flour (wheat),” and “whey (milk)”. 2) This information may also appear immediately after or next to the list of ingredients in a “contains” statement; example: “Contains wheat, milk, and soy.” The primary key is the PRODUCT_ALLERGEN_ID field. The PRODUCT_ID field is a foreign key to the product in the PRODUCT_MASTER, while the (allergen) CODE field is a foreign key to the allergen in the ALLERGEN table.

Figure 19. Ingredients and "contains" statement highlighting wheat and soy ingredients.

PRODUCT_NUTRIENT table: Nutritional information is captured at the product level in this table. Refer to section 3 in the sample label shown in figure 19. It shows some key nutrients that impact your health. You can use the label to support your personal dietary needs. The primary key is the PRODUCT_NUTRIENT_ID, which is a surrogate key. The PRODUCT_ID field is a foreign key to the product in the PRODUCT_MASTER, while the (nutrient) CODE field is a foreign key to the nutrient in the NUTRIENT table. The UOM_ID field is a foreign key to lookup the UOM. The associated value from the nutrients section is populated in this table.
NUTRIENT table: This table serves as a lookup table for all nutrients: their codes and descriptions. The primary key is the (nutrient) CODE field.

Figure 20. Sample label courtesy of FDA.

PRODUCT_FINANCIAL table: The last set of tables in the proposed data model is related to financials. Product financials are stored in this table. The primary key is the PRODUCT_ID, FINANCIAL_ID and CURRENCY_ID fields.
FINANCIAL_TYPE table: This table serves as a lookup table for the types of financial information. The primary key is the FINANCIAL_ID field. Think of the FINANCIAL_ID as the valuation type in the SAP MBEW (Material Valuation) table.
CURRENCY table: This table is a lookup table for all currencies. The primary key is the CURRENCY_ID field.

This data model is a work-in-progress. Whether you are planning to build or buy a solution, the data model should help you identify the data elements that you need to master product data. The model is highly extensible. Feedback and suggestions for improvements are welcomed.

#PIM #ProductInformationManagement #ProductMDM #MDM

要查看或添加评论，请登录

John Quillinan的更多文章

EV Trucks Pros and Cons

2024年12月26日

EV Trucks Pros and Cons

I test drove a Chevrolet Silverado EV RST on Saturday, December 21. I was lured into the dealership via a targeted…
What is AI Governance?

2024年9月6日

What is AI Governance?

AI governance refers to the frameworks, policies, and practices that guide the responsible development, deployment, and…

1 条评论
Best Practices in MDM

2024年8月23日

Best Practices in MDM

There are six best practices to keep in mind for an MDM initiative to be successful. Align with company goals Start…
The Quest for Quality Product Data and Images

2024年8月13日

The Quest for Quality Product Data and Images

GS1 Data Pools Just about anyone who works with product data in the retail world has heard of GS1. GS1 is an…
What is the difference between a PIM and Product MDM?

2024年8月8日

What is the difference between a PIM and Product MDM?

Do you know what the difference is between a PIM and Product MDM? This article will explain the difference and when to…

1 条评论
Hierarchy Management

2024年8月2日

Hierarchy Management

Hierarchy management plays a crucial role in master data management (MDM) by organizing and maintaining the…
Packaging information

2024年7月29日

Packaging information

Packaging information and levels is a critical attribute for products. While retailers might sell a unit at the case…
List of Values (LOVs)

2024年7月24日

List of Values (LOVs)

A List of Values (LOVs) is a defined set of values that can be selected for an attribute with the validation base type…
Product hierarchies

2024年7月24日

Product hierarchies

A Product MDM solution should support the association of multiple product hierarchies to a product. Primary Hierarchy…
Dietary certifications and allergens

2024年7月23日

Dietary certifications and allergens

While some dietary certifications are directly regulated by government bodies, others are managed by independent…

See all articles

Introduction

Build vs Buy

PIM Defined

Why Should I Implement a PIM?

Key benefits of implementing a PIM:

Data Models Matter

Informatica

Stibo STEP

SAP Material Master

Proposed data model

John Quillinan的更多文章

EV Trucks Pros and Cons

What is AI Governance?

Best Practices in MDM

The Quest for Quality Product Data and Images

What is the difference between a PIM and Product MDM?

Hierarchy Management

Packaging information

List of Values (LOVs)

Product hierarchies

Dietary certifications and allergens