The three types of data products need different emphases to manage them well
There are different types of data products and, unsurprisingly, organizations often use their own descriptions and terminology for variations on the same concepts. Some example classifications of data products in pharma companies are:
To bring those concepts together into a common language, we think that Zhamak Dehghani’s classification of data products is clear and easily understood:
Generally, source-aligned data products are combined into aggregate data products, which can be used to create consumer data products. However, the categories aren’t always strictly followed in practice, e.g., a consumer data product such as a dashboard might be created directly from a source-aligned data product.
Do you recognize your types of data products in this wording? Or are you at the beginning of establishing data products and haven’t classified them yet?
“In the pharma industry, there are enormous benefits to creating data products,” said Ronak Vyas, Senior Principal – CMO Data Integration and Analysis at Vertex Pharmaceuticals. “This requires defining standards for data normalization and transformation with consumers in mind. As we collect data from our CMOs at Vertex, we create data products with business users in mind.”
Here, we’ll use Dehghani’s classification.
1 Source-aligned data products
Source-aligned data products are created from operational data sources without much transformation or aggregation (see Figure 1). They typically retain the original structure, format, and granularity of the data sources (which may vary by site or function) and they may have an extra function of assuring data integrity. Data moves from the operational plane to the analytical plane, which allows a change in purpose from meeting immediate business needs to developing analysis and insight.
Source-aligned data products can preserve the provenance, quality, and completeness of the data and are the foundation for all the other types of data products – but they may pose some challenges, such as inconsistency, redundancy, or complexity.
It’s rare for the owners of source operational systems to accept ownership of a source-aligned data product in the data lake – yet their input is essential because they understand the meaning of the data. Also, the extra responsibilities of data product ownership are often less well understood and get less attention than the demands of an operational system.
Examples of operational systems in pharmaceutical companies that are often used to create source-aligned data products include Enterprise Resource Planning, Manufacturing Execution, and Lab Information Management systems.
2 Aggregate data products
Aggregate data products are created by transforming or aggregating source-aligned data products (see Figure 2). They bring together data from multiple sources into a coherent data structure that eliminates the differences between sources while adding context to raw operational data. They may also filter, group, sort, or summarize the data to reduce its size and complexity.
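As a rough illustration of that harmonization step, the sketch below maps records from two sites onto one shared structure and then summarizes them. Everything in it is an assumption made for illustration: the site names, field names, units, product name, and helper functions are hypothetical and do not describe any particular company’s systems.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical source-aligned records. Each site keeps its own field
# names and units, as the original operational systems would.
SITE_A = [
    {"batch": "A-001", "product": "mAb-1", "yield_g": 1200.0},
    {"batch": "A-002", "product": "mAb-1", "yield_g": 1300.0},
]
SITE_B = [
    {"BatchID": "B-17", "Material": "mAb-1", "YieldKg": 1.1},
]


def normalize_site_a(rec):
    """Map a Site A record onto the shared schema (grams -> kilograms)."""
    return {
        "site": "A",
        "batch_id": rec["batch"],
        "product": rec["product"],
        "yield_kg": rec["yield_g"] / 1000.0,
    }


def normalize_site_b(rec):
    """Map a Site B record onto the shared schema (already in kilograms)."""
    return {
        "site": "B",
        "batch_id": rec["BatchID"],
        "product": rec["Material"],
        "yield_kg": rec["YieldKg"],
    }


def build_aggregate(site_a, site_b):
    """Combine both sources into one list with a single, coherent structure."""
    return [normalize_site_a(r) for r in site_a] + [normalize_site_b(r) for r in site_b]


def mean_yield_by_product(rows):
    """Summarize the harmonized rows: mean yield in kg per product."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["product"]].append(row["yield_kg"])
    return {product: mean(values) for product, values in grouped.items()}


if __name__ == "__main__":
    rows = build_aggregate(SITE_A, SITE_B)
    print(rows)
    print(mean_yield_by_product(rows))
```

Adding the `site` field is one simple way of attaching context that the raw records did not carry. In practice the mapping rules would be agreed with the source system owners, because they understand the meaning of the data.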
Aggregate data products enhance the performance, usability, and relevance of the data, as well as enable standardization and consistency for data consumers. They can be used for formal data solutions such as data reporting and data visualization, for ad hoc analysis (e.g., as part of an investigation), or as input to another aggregate data product.
Challenges for the aggregate data product team include ensuring that data brought together from different sources is truly comparable, that the data is maintained at an appropriate quality, that there are controls over who has access to the data, and that the team has visibility of actual data usage downstream.
Examples of aggregate data products for the pharmaceutical industry and what they contain include:
3 Consumer data products
Consumer data products deliver value for end users and should be produced by business experts without needing deep technology expertise. They include formal data solutions, ad-hoc data analytics, and application-level analytics (see Figure 3).
Formal data solutions should be based on aggregate data products so that effort is only invested once for global benefit. Dashboards, visualizations, and AI systems commonly provide insight and prediction, enabling recommendations or even prescriptive analytics.
Ad-hoc data analytics are made easier by aggregate data products, e.g., an investigation can be quicker because less effort is needed to interpret and contextualize the data that’s being interrogated. Data lineage tools can provide a trace to the source.
Application-level data solutions are provided alongside business applications and allow the analysis of data from that system. For example, the Quality Management System, Historian, and Laboratory Information Management System may each have its own reporting and analysis suite.
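To make the idea of tracing a consumer data product back to its sources more concrete, here is a minimal sketch of a lineage lookup. It assumes a simple in-memory catalogue in which each data product records its type and its direct inputs. The product names and the `trace_to_sources` helper are hypothetical, and real data lineage tools work quite differently under the hood.

```python
# Hypothetical catalogue: each data product lists its kind and its direct inputs.
CATALOGUE = {
    "mes_batch_records": {"kind": "source-aligned", "inputs": []},
    "lims_test_results": {"kind": "source-aligned", "inputs": []},
    "batch_quality_summary": {
        "kind": "aggregate",
        "inputs": ["mes_batch_records", "lims_test_results"],
    },
    "release_dashboard": {"kind": "consumer", "inputs": ["batch_quality_summary"]},
    # A consumer product built directly on a source-aligned product,
    # which the classification allows in practice.
    "site_throughput_dashboard": {"kind": "consumer", "inputs": ["mes_batch_records"]},
}


def trace_to_sources(name, catalogue):
    """Return the sorted names of source-aligned products that `name` depends on."""
    sources = set()
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited via another path
        seen.add(current)
        entry = catalogue[current]
        if entry["kind"] == "source-aligned":
            sources.add(current)
        stack.extend(entry["inputs"])
    return sorted(sources)


if __name__ == "__main__":
    for product in ("release_dashboard", "site_throughput_dashboard"):
        print(product, "->", trace_to_sources(product, CATALOGUE))
```

The same structure also shows the usual flow from source-aligned to aggregate to consumer data products, along with the exception where a dashboard reads a source-aligned product directly.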
Implications for managing data products
Understanding the different types of data products shows that the emphasis of the roles that manage a data product varies slightly by data product type.
In the next article, we’ll explore the implications of forming teams to manage data as a product.
The previous article in this series was called: How do we deliver well-managed data products?
Connecting the biopharm industry | Coordinating collaboration | Accelerating change