Operating Models for Data and Analytics
Patrick Bangert
Data and its related analytics, charts, and reports are increasingly important for every enterprise, and they are accessed by an ever larger and more diverse group of people across the organization. This leads to two sets of challenges. First, managing the data itself: its format, storage, governance, documentation, quality, and correctness. Second, managing the analytical reports derived from the data: keeping them up to date, managing access, creating and customizing new reports, handling one-off analytical requests, and helping to interpret analytics that will be used as a basis for decision-making.
Centralization
Conflicting tendencies emerge from these complexities. The tendency to centralize comes from a desire to manage all of this in one place, with one coherent team that can impose a single set of rules and tools at a reasonable expense. However, this comes at the cost of a reduced ability to respond to short-term requests and a team that is out of touch with the other locations and groups in the company.
Audit and regulatory-compliance requirements also call for centralized models, at least for the parts of the company's data and its governance that they affect.
Federalization
The tendency to federalize comes from a desire to respond quickly to local needs in a distributed enterprise. Each location or department may have its own small analytics team or suitably trained and enabled staff. This comes at the cost of an expanding set of tools that must be maintained and a growing landscape of scattered pockets of data and reports. Different departments will develop and adhere to different sets of rules and regulations, leading to confusion and difficulties in interpreting the resulting analytics.
Decentralized models tend to produce datasets confined to files (spreadsheets or CSV files, for example) that are stored on individual drives or emailed around. Such datasets quickly become stale, are usually undocumented, and cannot be quality-controlled or governed. Conclusions based on them must be treated as preliminary at best. All in all, this approach leads to increased financial costs.
Compromise: Hub and Spoke
Enterprises have observed a pendulum swing from one extreme to the other every five to seven years, as management or company ownership changes and the focus shifts from cost cutting to growth and back again. Each change of operating model itself represents a great cost in effort, money, and data uncertainty. Just as the return on investment is achieved, the next operating-model change is heralded as a panacea, and the wheel keeps turning.
There is widespread agreement that a compromise is the answer. This is known as the hub-and-spoke model, in which some aspects are centralized and others are federated. In essence, it is best to centralize the management of the data and to federalize the analytics and reporting.
One might say that no one really wants to be fully federalized, and everyone finds that being fully centralized is practically impossible, making the hub-and-spoke model the only future-proof approach anyway. In this way, risk is centralized and speed is federalized, giving the best of both worlds.
Why and How it Works
The various locations and departments in an enterprise do not really need their own copies of the data, nor do they need to rearrange how the data is stored. All they need is the ability to obtain the data required for a particular analytical task. If the data is stored centrally, governed, and documented properly, and can be queried flexibly, there is no reason why these requests cannot be met by on-the-fly queries rather than by creating local copies that will go out of date. With the right cloud storage, cyber-security, and access controls, this is possible, and the central data system can be maintained at a relatively low cost in both staff and fees.
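To make this concrete, here is a minimal sketch of the idea, assuming a SQL-accessible central store; SQLite stands in for the enterprise warehouse, and the table, columns, and values are purely illustrative. A local question is answered with an on-the-fly query rather than by keeping a local copy of the data.

```python
import sqlite3

# Stand-in for the centrally governed store; in practice this would be a
# connection to the enterprise data warehouse (schema and values are assumed).
central = sqlite3.connect(":memory:")
central.execute("CREATE TABLE sales (region TEXT, amount REAL, sale_date TEXT)")
central.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("EMEA", 1200.0, "2023-10-01"), ("APAC", 950.0, "2023-10-02")],
)

def regional_sales(conn, region):
    """Answer a local, ad hoc question with an on-the-fly query
    instead of keeping a local copy of the data."""
    cur = conn.execute("SELECT SUM(amount) FROM sales WHERE region = ?", (region,))
    return cur.fetchone()[0]

print(regional_sales(central, "EMEA"))  # 1200.0
```

The point is that the local team only holds the answer, not a copy of the data, so there is nothing to go stale.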
What the local teams do need is customized reports and analytics, which are usually just aggregations, computations, or charts based on the centrally managed data. These can be built by staff in the local offices and groups who are trained and enabled to do so according to centrally governed policies, allowing them to respond to last-minute and custom requests at a reasonable additional cost in staff effort. If everything is properly managed in the cloud, with centrally governed tooling and all analytics referring to a single source of truth, this should not generate additional costs in fees.
Most companies have several systems that each store important data: customer relationship management (CRM), enterprise resource planning (ERP), e-commerce systems, and so on. They are all part of this single source of truth and should be connected through master data management for further streamlining.
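As a small illustration of such a locally built report, the sketch below aggregates the result of an on-the-fly query into a per-region, per-month summary. It assumes the pandas library is available, and the column names and values are made up for the example.

```python
import pandas as pd

# Assumed: `rows` is the result of an on-the-fly query against the central
# store; the column names and values are invented for the example.
rows = [
    {"region": "EMEA", "month": "2023-10", "amount": 1200.0},
    {"region": "EMEA", "month": "2023-11", "amount": 800.0},
    {"region": "APAC", "month": "2023-10", "amount": 950.0},
]

# A locally built report: a simple aggregation of centrally managed data,
# computed by the local team under centrally governed definitions.
report = (
    pd.DataFrame(rows)
    .groupby(["region", "month"], as_index=False)["amount"]
    .sum()
)
print(report)
```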
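The sketch below illustrates the master-data idea in miniature: the same customer appears in the CRM and the ERP under different local identifiers, and a master index links them so that a single consolidated view can be assembled. All system names, identifiers, and fields here are assumptions for illustration only.

```python
# Records for the same customer live in several systems under different local
# identifiers; a master index maintained by master data management links them.
crm = {"C-100": {"name": "Acme GmbH", "contact": "a.schmidt@acme.example"}}
erp = {"E-77": {"credit_limit": 50000, "payment_terms": "NET30"}}

master_index = {"MASTER-1": {"crm_id": "C-100", "erp_id": "E-77"}}

def golden_record(master_id):
    """Assemble one consolidated view of a customer from the systems that
    each hold part of the record, keyed by the master identifier."""
    link = master_index[master_id]
    record = {"master_id": master_id}
    record.update(crm[link["crm_id"]])
    record.update(erp[link["erp_id"]])
    return record

print(golden_record("MASTER-1"))
```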
Conclusion
As always, we need to identify what we are solving for before coming up with a solution. Where are the bottlenecks? Who is making the decisions? Some general principles are almost always present. Everyone must know what the numbers mean, so the data needs documentation. The data is only useful if it is correct and current, which requires data quality checks and a single source of truth. Not everyone can have access to all the data, so we must have security and access controls. Many people will want their own views of the data or need to answer ad hoc questions, and so they require flexible access to it.
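As a hedged illustration of what such checks and controls might look like in practice, the sketch below flags incomplete or stale records and enforces a simple role-based access policy. The thresholds, roles, and field names are invented for the example and are not a standard.

```python
from datetime import date, timedelta

def quality_issues(records, max_age_days=30):
    """Flag rows that are incomplete or stale (threshold is illustrative)."""
    issues = []
    for i, row in enumerate(records):
        if any(value is None for value in row.values()):
            issues.append((i, "missing value"))
        if date.today() - row["updated"] > timedelta(days=max_age_days):
            issues.append((i, "stale record"))
    return issues

# Central access policy: not everyone sees all the data (roles are assumed).
ROLE_PERMISSIONS = {"finance": {"sales", "costs"}, "marketing": {"sales"}}

def can_read(role, dataset):
    """Check whether a role may read a given dataset."""
    return dataset in ROLE_PERMISSIONS.get(role, set())

records = [
    {"amount": 1200.0, "updated": date.today()},
    {"amount": None, "updated": date.today() - timedelta(days=90)},
]
print(quality_issues(records))          # [(1, 'missing value'), (1, 'stale record')]
print(can_read("marketing", "costs"))   # False
```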
This article emerged from a discussion that I moderated at a boardroom meeting at the Evanta (a Gartner company) CDAO Summit in San Francisco on November 7, 2023.