Why is a Cloud-Native Master Data Management platform important?
I was recently speaking with an analyst in the master data management industry, and he told me that "companies are not the ones asking for cloud native-ness with Master Data Management, it is the Vendors pulling them in this direction."
I could speculate on this, but would rather discuss what we have seen here at CluedIn. Achieving a successful Master Data Management implementation is really hard. It is fraught with possible failure, no matter the company, no matter the people, no matter the technology or data. Hence, I believe this is at least one of the reasons why cloud might not be a top priority for companies running modern Master Data Management projects.
So it's now my job to attempt to convince you why Cloud should be on your checklist when you are building your MDM platform. It is worth mentioning that you can retrofit a non-cloud-native Master Data Management platform to achieve some of the items below, but a truly cloud-native platform will benefit from them more naturally.
Why should your Data Management Platform be Cloud Native?
Minimal Upfront Cost
If the Master Data Management platform has been designed properly for the cloud, then an installation with little to no data in it should have close to zero upfront cost. This is important because it lowers the risk and gives customers a clearer view of their return on investment.
Unlimited Cloud Scale.
If you have been working with the cloud for some time, you will know that most cloud providers cap the resources available in your cloud tenant and ask you to submit a ticket before you can scale beyond that cap. This is very responsible of the cloud providers, as it is tempting to test that unlimited scalability only to be left with a large bill. When we look at the possible hurdles and restrictions in implementing MDM, the LAST thing we want blocking us is something as trivial as scale.
Access to Lots and Lots of Services
What I think is most important for Master Data Management in the cloud is the access to an immense number of services. Whether it is cost management, business intelligence, machine learning, security or data processing - this is where all the investment from these companies is going, and hence this is where you want your data and infrastructure to be as well, to limit the number of hurdles when it is time to use these services.
Built for proper cost optimization.
It was an eye-opening and game-changing moment for me when I realized that the business model of the cloud is aimed at running workloads faster AND cheaper. Yes, you heard that right. If we have a data processing job to do, then it will always be cheaper and faster to scale to the largest boxes available, billed to the closest 1 hour resolution. Naturally, for the cloud providers to get even better, they will need to start offering per-second resolution - but for right now, 1 hour is still good. What this means is that the cloud model makes it possible to run your environments at a lower cost than managing your Master Data Management platform yourself.
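As a rough illustration of the pricing argument above, here is a minimal Python sketch. It assumes a perfectly parallel job, a price that grows linearly with the number of cores, and billing rounded up to the whole hour. The box sizes and prices are made up for the example and are not real provider rates.

```python
import math


def job_cost(core_hours, cores, price_per_hour):
    """Return (wall_clock_hours, billed_cost) for one job on one box.

    Assumptions (hypothetical, for illustration only):
    - the job parallelizes perfectly across all cores,
    - usage is billed in whole hours, rounded up.
    """
    wall_clock_hours = core_hours / cores
    billed_hours = math.ceil(wall_clock_hours)
    return wall_clock_hours, billed_hours * price_per_hour


# A deduplication job that needs 64 core-hours of work (made-up figure).
small_hours, small_cost = job_cost(64, cores=4, price_per_hour=0.20)
large_hours, large_cost = job_cost(64, cores=64, price_per_hour=3.20)

print(f"small box: {small_hours:.1f} h, cost {small_cost:.2f}")
print(f"large box: {large_hours:.1f} h, cost {large_cost:.2f}")
```

Under these assumptions both runs are billed the same amount, but the large box finishes in 1 hour instead of 16. The hourly rounding cuts both ways: a job that needs only a few minutes on the large box is still billed for the full hour, which is why finer billing resolution matters.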
Because that is where all the investment is going.
Microsoft, Amazon and Google are not focusing on on-premise, full stop. It is important to remember that although these Vendors will typically have a solution for an on-premise or PaaS Master Data Management platform, if the platform is not cloud-native and built FOR the cloud, then the setup will always be subpar and not what the cloud providers intend.
Less need for IT, more focus on the problem and solution.
Without a doubt, IT will still need to be involved in managing your cloud environment, but the cloud does simplify some of the work that is typically harder elsewhere, such as stitching the different pieces of your Master Data Management environment together.
Much faster to market with ideas.
The cloud is designed for innovation and for prototyping ideas. Imagine a situation where you have your Master Data and you want to use it to generate insights. Then imagine that someone in your group has an idea to use Microsoft Cognitive Services to automate some insights on that data. In the cloud, this is literally something that can be spun up, integrated and running within a day. If the experiment fails, there are no high residual costs left lying around.
It lets you spin up services to run micro jobs.
Have you ever had those moments where you could have run a deduplication faster, or a merging process more efficiently, but the internal red tape and forms to fill out are so much work that it is easier to just wait for your job to finish? The cloud infrastructure removes this. Knowing that the cloud has built its pricing model to make these large jobs ALWAYS (if the data is predictable) cheaper to run on large boxes for a shorter amount of time means that this is a thing of the past.
Cost effective backup and restore.
In an effort to make things simpler across the board, the cloud provides some foundational pieces of data infrastructure that are simply expected, such as Disaster Recovery.
More ability to enforce best practices when you don't have these dedicated internal resources e.g. Policies.
You can probably imagine the immense amount of investment that the cloud providers have made to instill best practices into their environments. I have lost track of the number of times that a cloud service has been available for something critical to your data foundation that I would not have instinctively thought of. You are in good company with these cloud providers. They are genuinely trying to make it safer and more robust to host your applications.
What are some pitfalls of cloud-native Master Data Management platforms?
It is really, really hard to know what things will cost. I have lost track of the times that our engineering team has accidentally overspent.
I can already hear all the cloud providers yelling at me for this, but it is a niggling issue that has surprised us on many occasions. The cloud IS unpredictable with its pricing, and I am the first person to say that the solution is not obvious. Half of the beauty of the cloud is how simple the providers have made it (once you know what you are doing) to spin up infrastructure - but the double-edged sword is that you are often unaware of the cost impact.
In defense of the cloud, we have only had a handful of cases like this, and the providers were all very good about giving us credits in the following month to cover the overspend.
The majority of platforms are not built and designed for the cloud...still.
I cannot overstate how much the cloud needs to be front of mind for your entire product engineering team for the pay-off to be realized. It is amazing how, if even one part of your platform is not cloud-native, it can balloon the costs of operation. Speaking directly, your Master Data Management platform needs to be built with abstraction and a scale-out backbone. As you move your platform from cloud provider to cloud provider, you need to realize that the backbones will change, e.g. moving from Azure Service Bus to Amazon SQS. It should not matter what the underlying backbone is: your Master Data Management platform should be able to work no matter the technology underneath.
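One way to picture the abstraction described above is a small interface that the platform codes against, with one adapter per backbone. The sketch below is hypothetical: `MessageBus`, `InMemoryBus` and `enqueue_for_merge` are names invented for this example, and the in-memory adapter merely stands in for real adapters to services such as Azure Service Bus or Amazon SQS, which are not shown here.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Dict, Iterable, Optional


class MessageBus(ABC):
    """Hypothetical interface the platform depends on, not a real SDK."""

    @abstractmethod
    def send(self, queue: str, body: str) -> None:
        """Put one message on the named queue."""

    @abstractmethod
    def receive(self, queue: str) -> Optional[str]:
        """Return the next message, or None if the queue is empty."""


class InMemoryBus(MessageBus):
    """Stand-in adapter; a cloud adapter would implement the same methods."""

    def __init__(self) -> None:
        self._queues: Dict[str, deque] = {}

    def send(self, queue: str, body: str) -> None:
        self._queues.setdefault(queue, deque()).append(body)

    def receive(self, queue: str) -> Optional[str]:
        pending = self._queues.get(queue)
        return pending.popleft() if pending else None


def enqueue_for_merge(bus: MessageBus, record_ids: Iterable[str]) -> int:
    """Platform logic: it only knows the interface, not the backbone."""
    count = 0
    for record_id in record_ids:
        bus.send("merge-requests", record_id)
        count += 1
    return count


bus = InMemoryBus()
queued = enqueue_for_merge(bus, ["customer-1", "customer-2"])
print(queued, bus.receive("merge-requests"))
```

Because `enqueue_for_merge` depends only on the interface, moving to a different backbone would mean writing a new adapter rather than touching the platform logic.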
There really is no predictability in price.
The bottom line is that gaining some of the benefits of real-time, streaming and free-flowing data brings the possibility of unpredictable data - typically in the form of unpredictable workloads. As a simple example, imagine you are receiving data from a source, but have no idea when the data will stop flowing. In this example, it is very hard to know whether scaling up is the right thing to do, or whether the data will stop flowing as soon as you do. The good thing is that most cloud providers will give you "credit", to the closest hour resolution, for the time that was lost, which you can use the next time you need to spin up.
Most cloud providers are not fast enough at keeping up with PaaS offerings
The technology landscape is a fast one. You blink and there are 15 new possible technology choices that you could make. I do, however, believe that some critical and foundational PaaS offerings are missing from the cloud vendors, which restricts our ability to make truly cloud-native platforms. Today this is mainly a challenge because of the number of Vendors that need to be involved when spinning up a cloud-native data platform. For example, there are 15 different types of database technologies available today, yet most cloud providers are still not providing fit-for-purpose answers for them. What this results in is a mess of Data Processing Agreements and the need for Vendors (like us) to provide that data backbone as part of the product. This, in my mind, is not cloud native. Now I know that the cloud providers should not be responsible for implementing every platform under the sun, but they should provide a native solution for the core foundational development tools, e.g. a native Graph solution or a native Time-Series solution.
It is close to impossible to keep up with the innovation in the cloud, i.e. there is a good chance that something you rely on will be deprecated without you knowing it.
Having so much innovation is great, but the downside is that there is no way to keep up with it, let alone do so across the different cloud providers. One of the "benefits" of the cloud is that the providers are always trying to move to the latest versions of everything, e.g. Kubernetes. Unfortunately, the side effect is that you can blink and suddenly your data infrastructure is outdated. There are still many parts of the cloud that don't auto-update, e.g. the version of Kubernetes.
The Cloud Consoles are very hard to navigate as they are typically so large.
They say the hardest thing in programming is deciding what to name your variables. Although the saying is slightly facetious, it doesn't take away from the fact that cloud-specific terminology makes it very hard to know how to wield all of these environments.
In summary, a Master Data Management platform that is cloud-native is an investment that will continue to pay off in many areas, even outside the Master Data Management scope itself. It will streamline future ambitions with data and fast-track the ability for Master Data to be proliferated throughout your business. Although CluedIn can run on-premise, the clear insight from our team is that you will get a better ROI if you are running CluedIn in your cloud environment.