Optimizing Product Catalogs: How MongoDB Fits the Bill
Harihar Mohapatra
Deloitte | Driving Digital Transformations with Engineering, AI & Data | Technology Leader
MongoDB is a natural fit for a product catalog because products map so well to documents. Almost every other system will want to use the catalog instead of keeping its own copy, so a low-latency, scalable, geo-distributed catalog service is typically the ideal solution. A document-oriented representation of product data means fewer entities (a handful of collections vs. dozens of tables), better query performance (few or no server-side joins), and structures that fit each product precisely. There is no longer any need to design a master schema that accounts for every conceivable product.
A product has at least the following information:
Item: the overall product info
Variant: a specific variant of an item, which typically has its own SKU / UPC
Price: price information, which may vary by store, variant, etc.
Hierarchy: the item taxonomy
Facet: facets to search products by
Vendors: a given SKU may be available through different vendors if the site is a marketplace
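As a rough illustration of how the pieces above might sit together in one document, here is a minimal sketch using a plain Python dict in place of a MongoDB document. Every field name (item, hierarchy, facets, variants, prices, vendors, amount_cents, store_id) and the helper price_for are assumptions made for this example, not a prescribed schema; a real catalog may well split prices or vendors into their own collections.

```python
# Hypothetical product document; all field names are illustrative assumptions.
product = {
    "_id": "item-1001",
    "item": {"name": "Trail Running Shoe", "brand": "Acme"},
    "hierarchy": ["Footwear", "Running", "Trail"],      # item taxonomy, root first
    "facets": {"gender": "unisex", "material": "mesh"},  # fields to search by
    "variants": [
        {
            "sku": "SKU-1001-BLK-42",
            "attributes": {"color": "black", "size": 42},
            # Prices may differ per store; "default" is the fallback entry.
            "prices": [
                {"store_id": "default", "amount_cents": 8999, "currency": "USD"},
                {"store_id": "store-7", "amount_cents": 8499, "currency": "USD"},
            ],
            # Marketplace case: the same SKU offered by several vendors.
            "vendors": [
                {"vendor_id": "v-1", "name": "Acme Direct"},
                {"vendor_id": "v-2", "name": "Shoe Depot"},
            ],
        }
    ],
}


def price_for(doc, sku, store_id):
    """Return the price entry for a SKU in a store, falling back to 'default'.

    Returns None when the SKU is not a variant of this product.
    """
    for variant in doc["variants"]:
        if variant["sku"] == sku:
            by_store = {p["store_id"]: p for p in variant["prices"]}
            return by_store.get(store_id, by_store.get("default"))
    return None
```

Calling price_for(product, "SKU-1001-BLK-42", "store-7") returns the store-specific entry, while an unknown store falls back to the default price. Because the variant, price, and vendor data travel with the item, this lookup needs no join.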
Schema design patterns help determine how to structure data effectively, aligning it with application requirements while avoiding unnecessary redundancy and preserving data integrity. A well-defined data model also makes it easier to scale as data volumes grow, so the database can accommodate higher loads without significant restructuring.
MongoDB documents a number of design patterns for building a highly performant schema, and they can save a considerable amount of CPU cycles, memory, time, and cost. You can use whichever of them suit your business requirements.
Inheritance pattern - Group similar items together in one collection and use a discriminator field to differentiate them. For example, a “products” collection with a ‘type’ field that distinguishes the different types of products.
Approximation pattern - Store an approximate value for fields where precision is not too important, instead of doing an expensive calculation or write every time the value changes.
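Attribute pattern - Documents within the same collection can have different attributes in addition to the common ones. Moving these varying attributes into an array of key-value subdocuments lets a single index cover all of them. For example, different types of products can have different attributes.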
Bucket pattern - Group related data into a single document (a bucket) based on a predetermined criterion, such as a time range, instead of storing one document per data point.
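Schema versioning pattern - Add a schema version field to each document so that documents written under the current and previous schemas can live side by side in the same collection, with the application handling each version accordingly.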
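Document versioning pattern - Track changes to a document over time by giving each document a version field and retaining earlier versions. A common arrangement keeps the current version in the main collection and the earlier revisions in a single history collection, rather than having a separate collection for each version.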
Computed pattern - If a value needs to be computed many times in the application, the computed pattern can be used to precompute it and store the result in the document, saving CPU time on every read.
Outlier pattern - Design the schema for the typical document, and flag the few documents that do not match the rest of the collection so the application can handle them separately. For example, a product with far more reviews than usual can be flagged and its overflow stored in additional documents.
Extended reference pattern - Copy the frequently accessed fields of a referenced document into the referencing document, so that common queries can be served from a single collection without the equivalent of a relational join.
Preallocation pattern - When you know the document structure in advance, create it up front with empty values and simply fill in the data later.
Tree and graph pattern - When hierarchical data (data with parent-child relationships) is queried frequently, it can be stored in the same hierarchical form, for example with a parent reference or a list of ancestors on each node, for easy retrieval and readability.
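To make a few of these patterns more concrete, here is a minimal sketch of the inheritance, attribute, computed, and extended reference patterns, again using plain Python dicts as stand-ins for MongoDB documents. All field names (type, attrs, k, v, review_stats) and helper functions (find_by_attr, add_review, make_offer) are hypothetical, and the in-memory filtering only imitates what a database query would do.

```python
# Inheritance + attribute patterns: one "products" collection, a "type"
# discriminator field, and varying attributes kept as key/value pairs.
products = [
    {"_id": 1, "type": "book", "name": "MongoDB Basics",
     "attrs": [{"k": "pages", "v": 320}, {"k": "format", "v": "paperback"}]},
    {"_id": 2, "type": "shoe", "name": "Trail Runner",
     "attrs": [{"k": "size", "v": 42}, {"k": "color", "v": "black"}]},
]


def find_by_attr(docs, key, value):
    """In-memory stand-in for a query that matches one key/value attribute."""
    return [
        d for d in docs
        if any(a["k"] == key and a["v"] == value for a in d["attrs"])
    ]


def add_review(product, rating):
    """Computed pattern: update stored aggregates at write time,
    so reads never have to scan all reviews to get the average."""
    stats = product.setdefault("review_stats", {"count": 0, "sum": 0, "avg": None})
    stats["count"] += 1
    stats["sum"] += rating
    stats["avg"] = stats["sum"] / stats["count"]


# Extended reference pattern: the full vendor record lives elsewhere...
vendors = {
    "v-1": {"_id": "v-1", "name": "Acme Direct",
            "address": "1 Example Street", "tax_id": "TAX-0001"},
}


def make_offer(sku, vendor):
    """...and only the frequently read fields are copied into the offer."""
    return {"sku": sku, "vendor": {"_id": vendor["_id"], "name": vendor["name"]}}
```

For instance, find_by_attr(products, "color", "black") matches only the shoe, two calls to add_review with ratings 4 and 5 leave a stored average of 4.5, and make_offer("SKU-2-BLK-42", vendors["v-1"]) carries the vendor name without the address or tax ID. The trade-off in the last case is that copied fields must be updated if the vendor's name changes, which is why the pattern suits fields that change rarely.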